DETAILED ACTION
This action is written in response to the remarks and amendments dated 2/17/26. This action is made final. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
The Examiner is persuaded by the Applicant’s arguments pertaining to the outstanding rejection under §101. Accordingly, and in view of the Applicant’s claim amendments, these rejections are withdrawn.
The Applicant argues that the previous art of record does not anticipate or render obvious the claims as currently amended. The Examiner provides updated prior art rejections below, necessitated by the current amendments.
Claim Objections
Claim 1 is objected to because it seems grammatically incorrect or incomplete. Appropriate correction is required. The Examiner suggests a possible correction below:
“wherein the training operations comprise the machine learning model generating one or more inference training outputs responsive to training data;”
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the manner in which the invention was made.
The following are the references relied upon in the rejections below:
Friedlander (US 2020/0364597 A1)
Kawthekar (Kawthekar, Prasad, and Christian Kästner. "Sensitivity analysis for building evolving and adaptive robotic software." Proceedings of the IJCAI Workshop on Autonomous Mobile Service Robots (WSR). Vol. 7. 2016.)
Naik (Naik, Dayakar L., and Ravi Kiran. "A novel sensitivity-based method for feature selection." Journal of Big Data 8, no. 1, 128. 2021.)
Russell (S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, 2nd Ed., 2003, chapt 18-21, pp. 649-789.)
Utans (Utans J, Moody J, Rehfuss S, Siegelmann H. Input variable selection for neural networks: Application to predicting the US business cycle. In Proceedings of 1995 Conference on Computational Intelligence for Financial Engineering (CIFEr) 1995 Apr 9 (pp. 118-122). IEEE.)
Yeung (Yeung, Daniel S., Ian Cloete, Daming Shi, and Wing W. Y. Ng. Sensitivity analysis for neural networks. Springer, 2010.)
Claims 1-3, 5-6, 8-10, 12-13, 15-17 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Russell, Friedlander, Yeung and Kawthekar.
Regarding claims 1, 8 and 15, Russell discloses a method (and a related system and computer-readable storage medium) comprising:
performing model-training supervision operations on multiple iterations of training operations being performed on a machine learning model, the multiple iterations of the training operations being sufficient to enable the machine learning model to generate, responsive to … data, real-time inferences … ;
P. 742, fig. 20.21 (reproduced below).
[Image: Russell, p. 742, fig. 20.21]
‘multiple iterations’ :: “for each I in examples do”
“sufficient to enable the machine learning model to generate… real time inferences” :: The examiner notes that this is an intended result. Nevertheless, Russell discloses such useful results at p. 743, fig. 20.22.
P. 763, sec. 1, “sensory input”.
wherein the training operations comprise the machine learning model generating one or more inference training outputs responsive training data;
Id.
outputs :: “g(in)”
wherein a volume of the training data over the multiple iterations is sufficient to train the machine learning model to generate the real-time inferences about the dynamic environment;
Id.
The Examiner notes that this is an intended result only.
wherein performing the model-training supervision operations comprise, for each of the multiple iterations of the training operations:
determining, using a processor, a first value of a first output parameter associated with an instance of the one or more inference training outputs; and
Id. “g(in)”
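For illustration only, the "repeat ... for each ... in examples" structure mapped above to the claimed multiple iterations can be sketched as follows. This is a minimal sketch, not a reproduction of Russell's figure: it assumes a sigmoid activation for g and a fixed learning rate, and the names `g` and `train_perceptron`, the epoch cap, and the error tolerance are hypothetical.

```python
import math

def g(x):
    """Sigmoid activation (an assumed choice for g)."""
    return 1.0 / (1.0 + math.exp(-x))

def train_perceptron(examples, n_inputs, alpha=0.5, max_epochs=1000, tol=0.05):
    """Gradient-descent perceptron learning (illustrative sketch).

    examples: list of (inputs, target) pairs; inputs excludes the bias term.
    Returns (weights, epochs_run); weights[0] is the bias weight.
    """
    w = [0.0] * (n_inputs + 1)
    for epoch in range(1, max_epochs + 1):       # "repeat ..."
        worst = 0.0
        for x, y in examples:                    # "for each ... in examples do"
            xs = [1.0] + list(x)
            weighted_in = sum(wj * xj for wj, xj in zip(w, xs))
            out = g(weighted_in)                 # the output mapped to "g(in)"
            err = y - out
            worst = max(worst, abs(err))
            grad = out * (1.0 - out)             # derivative of the sigmoid
            for j in range(len(w)):
                w[j] += alpha * err * grad * xs[j]
        if worst < tol:                          # "... until" a stopping criterion
            return w, epoch
    return w, max_epochs
```

The stopping test here (largest per-example error below a tolerance) is only one possible criterion, chosen to keep the sketch short.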
Friedlander discloses the following further limitations which Russell does not disclose:
determining, using the processor, a first sensitivity metric based on the first value of the first output parameter and one or more values of a first input parameter associated with an instance of the training data; …
[0078] “The methods for stochastic optimization of robust inference problems may include iteratively performing one or more steps in each iteration of the iterative optimization process until at least one stopping criterion is met. Such stopping criterion may comprise a set of rules containing one or more rules for determining one or more of an accuracy, sensitivity, or specificity of a solution to the robust inference problem. The stopping criterion may be based at least in part on a magnitude of a distance between the current continuous vector in one iteration of the optimization process and the updated current continuous vectors in the same iteration or a different iteration, e.g., a previous or subsequent iteration.” (Emphasis added.)
determining, using the processor and based on the first sensitivity metric, whether to continue the training operations.
Id.
At the time of filing, it would have been obvious to a person of ordinary skill to combine the technique disclosed by Friedlander for using sensitivity as a stopping condition with the perceptron learning system (i.e., a neural network learning system) disclosed by Russell because it would balance the additional computing power (and time) needed for further training against the resulting gains in model predictive power. Both disclosures pertain to machine learning.
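For illustration only, the distance-based stopping criterion quoted from Friedlander at [0078] (a magnitude of a distance between the current vector and the updated vector) can be sketched as follows. This is a minimal sketch under stated assumptions: the Euclidean norm, the tolerance, the iteration cap, and the names `distance` and `iterate_until_stable` are hypothetical and are not taken from Friedlander.

```python
import math

def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def iterate_until_stable(step, start, tol=1e-6, max_iters=10_000):
    """Apply `step` repeatedly; stop once successive vectors differ by < tol.

    Returns (vector, iterations_used).
    """
    current = list(start)
    for i in range(1, max_iters + 1):
        updated = step(current)
        if distance(current, updated) < tol:     # stopping criterion met
            return updated, i
        current = updated
    return current, max_iters
```

Any update rule can be passed as `step`, for example one gradient-descent update of a weight vector.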
Yeung discloses the following further limitations which Russell/Friedlander do not disclose:
wherein the values of the first input parameter are provided to the machine learning model wherein the first sensitivity metric is indicative of a rate of change of the first output parameter relative to the first input parameter; and
P. 19, eqn. 2.3 (reproduced below) and description thereof: “Fig. 2.2 illustrates geometrical interpretation of Eq. (2.2) in space ℜ^K. Point o(x(n)) represents the nominal response of the neural network for the nth element of the training set x(n). The disturbance Δx of the input vector causes the perturbed response at o(x(n) + Δx). This response can be expressed as a combination of three vectors as indicated in Eq. (2.2).”
[Image: Yeung, p. 19, eqn. 2.3]
At the time of filing, it would have been obvious to a person of ordinary skill to combine neural network sensitivity analysis (as taught by Yeung) with the combined system of Russell/Friedlander because the former is a decades-old technique for analyzing the operation, performance, and robustness to noise of neural networks. As set forth in Yeung:
“Sensitivity refers to how a neural network output is influenced by its input and/or weight perturbations. Sensitivity analysis dates back to the 1960s, when Widrow investigated the probability of misclassification caused by weight perturbations, which are caused by machine imprecision and noisy input (Widrow and Hoff, 1960). In network hardware realization, such perturbations must be analyzed prior to its design, since they significantly affect network training and generalization. The initial idea of sensitivity analysis has been extended to the optimization of neural networks, such as through sample reduction, feature selection, and critical vector learning.” (P. 17.)
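For illustration only, a sensitivity metric indicative of a rate of change of an output relative to an input can be sketched for a single sigmoid unit, where the partial derivative of the output o with respect to input x_i is o(1 - o)w_i. This is a minimal sketch and does not reproduce Yeung's eqn. 2.3; the single-unit model and the names `unit_output` and `input_sensitivities` are assumptions made for the example.

```python
import math

def unit_output(weights, bias, x):
    """Output of one sigmoid unit: g(bias + sum_i w_i * x_i)."""
    s = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-s))

def input_sensitivities(weights, bias, x):
    """Rate of change of the output with respect to each input.

    For a sigmoid unit, d(o)/d(x_i) = o * (1 - o) * w_i.
    """
    o = unit_output(weights, bias, x)
    return [o * (1.0 - o) * w for w in weights]
```

An input with a larger-magnitude entry in the returned list has a larger local influence on the output at the given operating point.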
Kawthekar discloses the following further limitations which Russell/Friedlander/Yeung do not disclose:
the multiple iterations of the training operations being sufficient to enable the machine learning model to generate, responsive to real-time and varied sensor data, real-time inferences about a dynamic environment;
P. 2, second col., “It is also worth noting that software design choices can control some aspects of hardware design as well, since the software implements the interface of a robot’s algorithms to its sensors and actuators. For instance, a robot’s hardware may contain multiple (possibly redundant) sensors that are used for localization (such as Kinect, Lidar etc.). Under different environment conditions, such as changing sensor noise, the robot can increase or decrease the number of its active sensors to ensure high localization accuracy, while simultaneously trying to minimize the power consumption by the sensors. This kind of adaptation can be made possible by including the (boolean) choice of activating a sensor in the robot’s influence model.”
At the time of filing, it would have been obvious to a person of ordinary skill to apply the combined machine learning system of Russell/Friedlander/Yeung to robotic control tasks (as addressed by Kawthekar) because the former can help automate the latter. In other words, sophisticated AI algorithms (such as those taught by Russell/Friedlander/Yeung) are necessary to create robots with useful behavior in real-world tasks.
Regarding independent claim 8, Friedlander also discloses its further limitation comprising one or more processors; and at least one computer-readable storage medium.
[0040] “one or more computer processors”.
[0041] “non-transitory computer readable medium (e.g., computer memory)”.
Regarding independent claim 15, Friedlander also discloses its further limitation comprising a non-transitory computer readable medium.
[0041] “non-transitory computer readable medium (e.g., computer memory)”.
Regarding claims 2, 9 and 16, Friedlander discloses the further limitation wherein the model-training supervision operations further comprise:
determining that the first sensitivity metric is greater than a threshold sensitivity value; and
[0078] “The methods for stochastic optimization of robust inference problems may include iteratively performing one or more steps in each iteration of the iterative optimization process until at least one stopping criterion is met. Such stopping criterion may comprise a set of rules containing one or more rules for determining one or more of an accuracy, sensitivity, or specificity of a solution to the robust inference problem. The stopping criterion may be based at least in part on a magnitude of a distance between the current continuous vector in one iteration of the optimization process and the updated current continuous vectors in the same iteration or a different iteration, e.g., a previous or subsequent iteration.” (Emphasis added.)
in response to determining that the first sensitivity metric is greater than the threshold sensitive value, determining to end the training of the machine learning model.
Id. ‘stopping criterion’.
[As noted supra in the mapping for claim 1, Yeung teaches calculation of a sensitivity metric.]
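For illustration only, the threshold comparison recited in claims 2, 9 and 16 can be sketched as a supervision loop that checks a sensitivity metric after each training iteration. This is a minimal sketch; the callback interface, the iteration cap, and the name `supervise_training` are hypothetical and are not drawn from any cited reference.

```python
def supervise_training(train_step, sensitivity_metric, threshold, max_iters=100):
    """Run one training step per iteration and end training as soon as the
    sensitivity metric exceeds the threshold.

    train_step: callable performing one training iteration (no arguments).
    sensitivity_metric: callable returning the current metric as a float.
    Returns (iterations_run, last_metric).
    """
    metric = None
    for i in range(1, max_iters + 1):
        train_step()
        metric = sensitivity_metric()
        if metric > threshold:       # threshold exceeded: end training
            return i, metric
    return max_iters, metric
```

With a one-weight linear model y = w * x, for example, the metric callback could simply return w, since dy/dx = w.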
Regarding claims 3, 10 and 17, their further limitations are an obvious variation of claims 1/8/15. Their further limitations recite merely repeating the functionality of claims 1/8/15 a second time. The Examiner notes that perceptrons (and indeed all neural networks) are necessarily trained by iterating through each labeled training example, often multiple times (i.e., over multiple epochs). See Russell, p. 742, fig. 20.21 (reproduced supra): ‘repeat … until’. The purpose of a stopping criterion (as disclosed by Friedlander) is to define a point at which iterative training should stop.
Regarding claims 5, 12 and 19, Yeung discloses the following further limitation wherein determining the first sensitivity metric comprises:
calculating a Jacobian matrix based on the first value of the first output parameter and the one or more values of a first input parameter.
P. 19, eqn. 2.3 (reproduced below) and description thereof: “Fig. 2.2 illustrates geometrical interpretation of Eq. (2.2) in space ℜ^K. Point o(x(n)) represents the nominal response of the neural network for the nth element of the training set x(n). The disturbance Δx of the input vector causes the perturbed response at o(x(n) + Δx). This response can be expressed as a combination of three vectors as indicated in Eq. (2.2).”
[Image: Yeung, p. 19, eqn. 2.3]
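For illustration only, a Jacobian matrix relating output parameters to input parameters can be approximated numerically as sketched below. This is a minimal sketch using central differences, a generic numerical technique substituted here for illustration; it is not the formulation of Yeung's eqn. 2.3, and the name `jacobian` and the step size are assumptions.

```python
def jacobian(f, x, delta=1e-6):
    """Approximate the Jacobian of a vector-valued function f at point x.

    J[k][i] estimates d f_k / d x_i by central differences.
    """
    n_out = len(f(x))
    J = [[0.0] * len(x) for _ in range(n_out)]
    for i in range(len(x)):
        up = list(x)
        up[i] += delta
        down = list(x)
        down[i] -= delta
        f_up, f_down = f(up), f(down)
        for k in range(n_out):
            J[k][i] = (f_up[k] - f_down[k]) / (2.0 * delta)
    return J
```

Each row corresponds to one output parameter and each column to one input parameter, so a single entry is one output-to-input sensitivity.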
Regarding claims 6, 13 and 20, Yeung discloses the following further limitation wherein:
performing the model-training supervision operations result in a plurality of sensitivity metrics that are each indicative of a rate of change of an output parameter relative to an input parameter; and
P. 19, eqn. 2.3 (reproduced below) and description thereof: “Fig. 2.2 illustrates geometrical interpretation of Eq. (2.2) in space ℜ^K. Point o(x(n)) represents the nominal response of the neural network for the nth element of the training set x(n). The disturbance Δx of the input vector causes the perturbed response at o(x(n) + Δx). This response can be expressed as a combination of three vectors as indicated in Eq. (2.2).”
[Image: Yeung, p. 19, eqn. 2.3]
the method further includes identifying, based on the plurality of sensitivity metrics, one or more critical input parameters.
P. 71, “The sensitivity matrices for a trained neural network can be evaluated for both training and testing data sets; the norms of MSA sensitivity matrix columns can be used for ranking inputs according to their significance and for reducing the size of the network accordingly through pruning less relevant inputs.”
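For illustration only, the ranking of inputs by the norms of sensitivity matrix columns described in the passage quoted from Yeung at p. 71 can be sketched as follows. This is a minimal sketch assuming the Euclidean norm and a matrix laid out with one row per output and one column per input; the name `rank_inputs` is hypothetical.

```python
import math

def rank_inputs(sensitivity_matrix):
    """Rank inputs by the Euclidean norm of their sensitivity-matrix column.

    sensitivity_matrix: list of rows (one per output), each a list of
    per-input sensitivities. Returns (order, norms), where `order` lists
    input indices from most to least significant.
    """
    n_inputs = len(sensitivity_matrix[0])
    norms = [math.sqrt(sum(row[i] ** 2 for row in sensitivity_matrix))
             for i in range(n_inputs)]
    order = sorted(range(n_inputs), key=lambda i: norms[i], reverse=True)
    return order, norms
```

Inputs at the head of `order` would correspond to the claimed critical input parameters; those at the tail are candidates for pruning.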
Claims 4, 11 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Russell, Friedlander, Yeung, Kawthekar and Utans.
Regarding claims 4, 11 and 18, Utans discloses the following further limitations, which none of Russell/Friedlander/Yeung/Kawthekar discloses:
determining that a deviation between the first sensitivity metric and the second sensitivity metric is greater than a threshold deviation value; and
P. 121, second col. “Local optima for the number of inputs are found at 15 on the FPE curve and 13 on the NCV curve. Due to the variability in the FPE and NCV estimates (readily apparent in figure 5 for NCV), we favor choosing the first good local minimum for these curves rather than a slightly better global minimum. This local minimum for NCV corresponds to a global minimum for the test error. Choosing it leads to a reduction of 35 in the number of input series and a reduction in the number of network weights from 151 to 46. Inclusion of additional input variables, while decreasing the training error, does not improve the test-set performance.”
in response, determining to end the training operations.
Id.
At the time of filing, it would have been obvious to a person of ordinary skill to combine the feature selection technique disclosed by Utans with the Russell/Friedlander/Yeung/Kawthekar system because it could provide for simpler, more interpretable models—which are also faster to train—by eliminating features with low importance (as measured by sensitivity to input perturbations).
Claims 7 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Russell, Friedlander, Yeung, Kawthekar and Naik.
Regarding claims 7 and 14, Naik discloses the following further limitations, which none of Russell/Friedlander/Yeung/Kawthekar discloses:
identifying, based on the plurality of sensitivity metrics, at least one negligible input; and
p. 4, “Higher the change in the magnitude of the output variable y∈R of the FFNN with respect to the input feature xk ∈ R, higher is the importance of the feature xk.”
See also eqn. 5.
simplifying the machine learning model by removing the at least one negligible input.
P. 2, discussing using “’sensitivity-based-pruning (SBP)’ to remove irrelevant input features from a nonlinear regression model”.
At the time of filing, it would have been obvious to a person of ordinary skill to combine the feature selection technique disclosed by Naik with the Russell/Friedlander/Yeung/Kawthekar system because it could provide for simpler, more interpretable models by eliminating features with low importance (as measured by sensitivity to input perturbations).
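For illustration only, identifying negligible inputs from sensitivity metrics and removing them can be sketched as follows. This is a minimal sketch, not Naik's SBP procedure: the fixed-threshold rule, the row-per-sample data layout, and the name `prune_negligible` are assumptions made for the example.

```python
def prune_negligible(rows, sensitivities, threshold):
    """Remove input columns whose absolute sensitivity is below threshold.

    rows: list of input vectors (one per sample).
    sensitivities: one sensitivity value per input column.
    Returns (kept_indices, pruned_rows).
    """
    kept = [i for i, s in enumerate(sensitivities) if abs(s) >= threshold]
    pruned = [[row[i] for i in kept] for row in rows]
    return kept, pruned
```

A model retrained on `pruned_rows` would have fewer inputs, which is the simplification that the claim language describes.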
Additional Relevant Prior Art
The following references were identified by the Examiner as being relevant to the disclosed invention, but are not relied upon in any particular prior art rejection:
Srinath (US 2024/0370776 A1) discloses techniques for modeling machine learning performance, including, e.g., feature selection (see [0043]) and sensitivity analysis (see [0049]).
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Vincent Gonzales whose telephone number is (571) 270-3837. The examiner can normally be reached on Monday-Friday 7 a.m. to 4 p.m. MT. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached at (571) 270-7092.
Information regarding the status of an application may be obtained from the USPTO Patent Center.
/Vincent Gonzales/
Primary Examiner, Art Unit 2124