Prosecution Insights
Last updated: April 19, 2026
Application No. 18/054,216

Method for Generating Training Data for Training a Machine Learning Algorithm

Final Rejection: §101, §103
Filed: Nov 10, 2022
Examiner: MORALES, PEDRO JESUS
Art Unit: 2124
Tech Center: 2100 — Computer Architecture & Software
Assignee: Robert Bosch GmbH
OA Round: 2 (Final)
Grant Probability: 67% (Favorable)
Predicted OA Rounds: 3-4
Predicted Time to Grant: 3y 11m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 67% (6 granted / 9 resolved; +11.7% vs TC avg), above average
Interview Lift: +50.0% (strong), measured over resolved cases with interview
Avg Prosecution: 3y 11m (typical timeline); 20 applications currently pending
Total Applications: 29 across all art units (career history)

Statute-Specific Performance

§101: 26.9% (-13.1% vs TC avg)
§103: 40.4% (+0.4% vs TC avg)
§102: 13.5% (-26.5% vs TC avg)
§112: 17.1% (-22.9% vs TC avg)
Tech Center averages are estimates; based on career data from 9 resolved cases.

Office Action

§101 §103
DETAILED ACTION

This action is responsive to Applicant's reply filed 20 November 2025. This action is made final.

Status of the Claims

Claims 1-4, 7-11 and 14 are currently amended. Claims 6 and 13 are canceled. Claims 1-5, 7-12 and 14 are pending and under examination, of which claims 1 and 8 are independent.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

Applicant's amendments to the Drawings and Claims have overcome each objection and 112(b) rejection previously set forth in the Non-Final Office Action mailed August 27, 2025. The claims as amended are no longer interpreted under 35 U.S.C. 112(f). Applicant's arguments regarding the art rejections are moot in view of the new grounds of rejection necessitated by Applicant's amendment.

Regarding the rejection of claims 1-5, 7-12 and 14 as being directed to an abstract idea without significantly more, Applicant argues the claims are not directed to a judicial exception but rather to an improvement addressing the technical problem of training a machine learning algorithm to perform a task (see Applicant's response, page 13). On pages 11-12, Applicant argues the amended claims do not recite a mental process because the "training the machine learning algorithm ..." step cannot be practically performed in the human mind or with the aid of pen and paper. Applicant's argument is not persuasive because the Examiner did not point to the "training" step as being directed to mental processes in the previous Office Action. The Examiner pointed to the "training" step as amounting to no more than mere instructions to "apply" the judicial exception on a computer and nothing more than an attempt to generally link the use of the judicial exception to the technological environment of computers.
On pages 13-15, Applicant argues the claims provide a technical solution to the problem of "training machine learning algorithm to perform a task by solving a common problem with poorly distributed training data." Applicant's argument is not persuasive because the asserted improvement of not having to train a machine learning algorithm with poorly distributed training data is not reflected in the claims. Thus, the rejections of claims 1-5, 7-12 and 14 as being directed to an abstract idea without significantly more are maintained.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-5, 7-12 and 14 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Independent Claims 1 and 8

Step 2A Prong One: Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Yes. Independent claim 1, under the broadest reasonable interpretation, recites the following limitations that are abstract ideas:

- approximating a manifold in which at least one part of the data points of the first training data is located by determining, for each respective data point from the first training data, nearest neighbors of the respective data point within the data points of the first training data (mental process);
- determining a structure of the at least one part of the data points of the first training data in the manifold using principal component analysis based on the nearest neighbors of each respective data point (mathematical concept, mental process); and
- generating additional training data based on the determined structure of the at least one part of the data points of the first training data in the manifold (mental process).

The "approximating" step involves identifying a manifold based on the nearest neighbors of each data point, which amounts to no more than observations, evaluations, and judgments that can be performed in the human mind or with the use of a physical aid (e.g., pen and paper). The claim recites the step of approximating a manifold at a high degree of generality, thus the step is not required to have any specific level of complexity that would preclude the step from being a mental process. Therefore, the "approximating" step is considered to be a mental process; see MPEP § 2106.04(a)(2)(III).

The "determining" step involves identifying the structure (relationships) of data points by performing principal component analysis, which represents a mathematical calculation and amounts to no more than evaluations, observations, and judgments that can be performed in the human mind or with the use of a physical aid (e.g., pen and paper).
The claim recites the step of determining a structure of the data points at a high degree of generality, thus the step is not required to have any specific level of complexity that would preclude the step from being a mental process. Therefore, the "determining" step is considered to be a mathematical concept, see MPEP § 2106.04(a)(2)(I), and a mental process, see MPEP § 2106.04(a)(2)(III).

The "generating" step involves identifying new data samples based on the identified structure of data points, which amounts to no more than observations, evaluations, and judgments that can be performed in the human mind or with the use of a physical aid (e.g., pen and paper). The claim recites the step of generating additional training data at a high degree of generality, thus the step is not required to have any specific level of complexity that would preclude the step from being a mental process. Therefore, the "generating" step is considered to be a mental process, see MPEP § 2106.04(a)(2)(III).

Therefore, the independent claims recite a judicial exception. Independent claim 8 recites similar limitations corresponding to claim 1; therefore, the same subject matter eligibility analysis is applied.

Step 2A Prong Two: Does the claim recite additional elements that integrate the judicial exception into a practical application?

No, the judicial exception recited above is not integrated into a practical application.
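For orientation, the sequence of limitations identified in the claim (nearest-neighbor manifold approximation followed by a principal component analysis of each local neighborhood) can be sketched in a few lines of code. This is a minimal NumPy illustration under the author's own naming; it is not the application's implementation, and the eigendecomposition of the neighborhood covariance stands in for a full PCA routine.

```python
import numpy as np

def local_pca_structure(X, k=5):
    """Approximate the manifold around each point of X via its k nearest
    neighbors, then capture each neighborhood's local structure with a
    principal component analysis (eigendecomposition of the covariance)."""
    n = len(X)
    # Pairwise Euclidean distances -> k nearest neighbors per point
    # (column 0 of the argsort is the point itself and is skipped).
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    nbrs = np.argsort(dists, axis=1)[:, 1:k + 1]
    structures = []
    for i in range(n):
        nb = X[nbrs[i]]
        centered = nb - nb.mean(axis=0)
        cov = centered.T @ centered / k
        eigvals, eigvecs = np.linalg.eigh(cov)   # ascending eigenvalues
        order = np.argsort(eigvals)[::-1]        # sort by decreasing variance
        structures.append((nb.mean(axis=0), eigvals[order], eigvecs[:, order]))
    return nbrs, structures
```

Each tuple in `structures` (neighborhood mean, component variances, principal directions) is one concrete reading of the claimed "structure of the at least one part of the data points ... in the manifold."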
The claims recite the following additional elements, but these additional elements are not sufficient to integrate the judicial exception into a practical application:

- providing first training data for training the machine learning algorithm (MPEP § 2106.05(g): necessary data gathering and insignificant extra-solution activity appended to the judicial exception);
- training the machine learning algorithm to perform a task based on the first training data and the additional training data (MPEP § 2106.05(f): mere instructions to implement an abstract idea on a computer, or generally linking the exception to a technological environment);
- a control device for generating training data for training a machine learning algorithm, the training data respectively comprise a data point and a data value associated with the data point, the control device comprising (claim 8) (MPEP § 2106.05(f): mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea);
- a memory configured to store first training data and program code (claim 8) (MPEP § 2106.05(f): mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea); and
- a processor operably connected to the memory and configured to execute the program code (claim 8) (MPEP § 2106.05(f): mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea).

The "providing" step amounts to mere data gathering and is recited at a high level of generality, thus adding insignificant extra-solution activity to the judicial exception; see MPEP § 2106.05(g). Under MPEP § 2106.05(d), such additional elements have been found by the courts not to integrate a judicial exception into a practical application.
The "training" step is recited at a high level of generality such that the limitation amounts to no more than mere instructions to "apply" the judicial exception on a computer. It can also be viewed as nothing more than an attempt to generally link the use of the judicial exception to the technological environment of computers; see MPEP § 2106.05(f).

The remaining additional elements are recited at a high level of generality such that they amount to no more than mere instructions to "apply" an exception using a generic component. Adding the words "apply it" (or an equivalent) to the judicial exception, providing mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea does not integrate the exception into a practical application; see MPEP § 2106.05(f). Therefore, the above limitations do not integrate the judicial exception into a practical application.

Step 2B: Does the claim recite additional elements that amount to significantly more than the judicial exception?

No. The claims do not include additional elements that are sufficient for the claims to amount to significantly more than the judicial exception.

In regards to the "providing" step, this step adds insignificant extra-solution activity. The extra-solution activity is a well-understood, routine, and conventional (WURC) activity; per MPEP § 2106.05(d)(II), "the courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity. i. Receiving or transmitting data over a network, e.g., using the Internet to gather data." The "providing" step does not integrate the judicial exception into a practical application and does not amount to significantly more.
In regards to the "training" step, the limitation is recited so generically that it amounts to no more than mere instructions to "apply" the judicial exception on a computer using generic computer components. Mere instructions to apply a judicial exception cannot provide an inventive concept; see MPEP § 2106.05(f). The same applies to the remaining additional elements: they are recited so generically that they amount to no more than mere instructions to "apply" the judicial exception on a computer using generic computer components, and therefore cannot provide an inventive concept; see MPEP § 2106.05(f). Therefore, independent claims 1 and 8 are not patent eligible.

Dependent Claims 2-5, 7, 9-12 and 14

The remaining dependent claims do not recite additional elements, whether considered individually or in combination, that are sufficient to integrate the judicial exception into a practical application or to amount to significantly more than a judicial exception.

Dependent claim 2 recites the further limitation "the generating the additional training data includes varying coefficients of the principal component analysis to determine the additional training data." The step is recited at a high level of generality such that the limitation amounts to no more than mere instructions to "apply" the judicial exception on a computer. It can also be viewed as nothing more than an attempt to generally link the use of the judicial exception to the technological environment of computers; see MPEP § 2106.05(f). The step does not integrate the judicial exception into a practical application and does not amount to significantly more.
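One concrete reading of claim 2's "varying coefficients of the principal component analysis" is to perturb the PCA scores (coefficients) of existing points and map the varied scores back through the principal components. The sketch below is a hypothetical NumPy illustration under that assumption, not the application's method; all parameter names are the author's.

```python
import numpy as np

def vary_pca_coefficients(X, n_new=10, scale=0.1, seed=0):
    """Generate additional data by varying PCA coefficients of existing
    points: compute scores, add component-scaled noise, reconstruct."""
    rng = np.random.default_rng(seed)
    mean = X.mean(axis=0)
    centered = X - mean
    # Principal components via SVD of the centered data matrix.
    _, s, Vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ Vt.T                 # PCA coefficients of each point
    idx = rng.integers(0, len(X), size=n_new)
    # Vary the coefficients by noise scaled to each component's spread.
    noise = rng.normal(scale=scale, size=(n_new, len(s))) * (s / np.sqrt(len(X)))
    new_scores = scores[idx] + noise
    return new_scores @ Vt + mean            # back to the original space
```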
Dependent claim 3 recites the further limitation "wherein, for each data point from the first training data, the nearest neighbors are determined based on a Euclidean norm." The step involves calculating nearest neighbors by using a Euclidean norm, which represents a mathematical calculation and amounts to no more than evaluations, observations, and judgments that can be performed in the human mind or with the use of a physical aid (e.g., pen and paper). The claim recites the step at a high degree of generality, thus the step is not required to have any specific level of complexity that would preclude the step from being a mental process. Therefore, the step is considered to be an abstract idea of a mathematical concept, see MPEP § 2106.04(a)(2)(I), and a mental process, see MPEP § 2106.04(a)(2)(III). This claim does not recite any non-abstract additional elements.

Dependent claim 4 recites the further limitation "respectively determining, for each data point in the additional training data, a data value for the respective data point based on data values associated with the nearest neighbors of the respective data point." The "determining" step involves identifying a data value for each data point based on the data values of its nearest neighbors, which amounts to no more than observations, evaluations, and judgments that can be performed in the human mind or with the use of a physical aid (e.g., pen and paper). The claim recites the step of determining a data value at a high degree of generality, thus the step is not required to have any specific level of complexity that would preclude the step from being a mental process. Therefore, the "determining" step is considered to be a mental process, see MPEP § 2106.04(a)(2)(III). This claim does not recite any non-abstract additional elements.
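The operations recited in claims 3 and 4 (Euclidean-norm nearest neighbors, and a data value derived from the neighbors' values) reduce to a few lines of code. This is a minimal sketch under the assumption that the new point's data value is the mean of its neighbors' values; the claims do not fix the combination rule.

```python
import numpy as np

def knn_value(points, values, query, k=3):
    """Find the k nearest neighbors of `query` under the Euclidean norm
    (claim 3) and derive a data value for it from the neighbors' values
    (claim 4). Averaging is an illustrative choice, not from the claims."""
    dists = np.linalg.norm(points - query, axis=1)  # Euclidean distances
    nearest = np.argsort(dists)[:k]                 # indices of k closest
    return float(values[nearest].mean())
```

For example, with points (0,0), (1,0), (10,10) carrying values 1, 2, 100, a query near the origin with k=2 takes the mean of 1 and 2.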
Dependent claim 5 recites the further limitation "wherein the first training data comprise sensor data." This limitation represents mere necessary data gathering and is recited at a high level of generality, thus adding insignificant extra-solution activity to the judicial exception; see MPEP § 2106.05(g). The extra-solution activity is a well-understood, routine, and conventional (WURC) activity; per MPEP § 2106.05(d)(II), "the courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity. i. Receiving or transmitting data over a network, e.g., using the Internet to gather data." The limitation does not integrate the judicial exception into a practical application and does not amount to significantly more.

Dependent claim 7 recites the following limitations:

- providing a machine learning algorithm for controlling the at least one function of the controllable system (MPEP § 2106.05(f): mere instructions to implement an abstract idea on a computer, or generally linking the exception to a technological environment); and
- controlling the at least one function of the controllable system based on the trained machine learning algorithm (MPEP § 2106.05(f): mere instructions to implement an abstract idea on a computer, or generally linking the exception to a technological environment).

The "providing" and "controlling" steps are recited at a high level of generality such that the limitations amount to no more than mere instructions to "apply" the judicial exception on a computer. They can also be viewed as nothing more than an attempt to generally link the use of the judicial exception to the technological environment of computers; see MPEP § 2106.05(f).
These limitations do not integrate the judicial exception into a practical application and do not amount to significantly more.

Dependent claim 14 recites the further limitation "wherein the processor further executes the program code to control at least one function of a controllable system using the trained machine learning algorithm." The step is recited at a high level of generality such that the limitation amounts to no more than mere instructions to "apply" the judicial exception on a computer. It can also be viewed as nothing more than an attempt to generally link the use of the judicial exception to the technological environment of computers; see MPEP § 2106.05(f). The step does not integrate the judicial exception into a practical application and does not amount to significantly more.

Dependent claims 9, 10, 11 and 12 recite limitations similar to those of claims 2, 3, 4 and 5, respectively; therefore, the same subject matter eligibility analysis is applied to each.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C.
102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2 and 4 are rejected under 35 U.S.C. 103 as being unpatentable over DeVries et al. ("Dataset augmentation in feature space"), hereinafter DeVries, in view of Beigi et al. (US 20230060848 A1), hereinafter Beigi.

With respect to claim 1, DeVries teaches: a method for generating training data for training a machine learning algorithm (DeVries discloses "Dataset augmentation, the practice of applying a wide array of domain-specific transformations to synthetically expand a training set, is a standard tool in supervised learning … we adopt a simpler, domain-agnostic approach to dataset augmentation. We start with existing data points and apply simple transformations such as adding noise, interpolating, or extrapolating between them" (P.
1, Abstract).), the training data respectively comprise a data point and a data value associated with the data point, the method comprising (DeVries discloses “in order to augment a dataset, each example is projected into feature space by feeding it through the sequence encoder, extracting the resulting context vector, and then applying a transformation in feature space (Figure 1b). … We include a γ parameter to globally scale the noise … where i indexes the elements of a context vector which corresponds to data points from the training set. … For each sample in the dataset, we find its K nearest neighbours in feature space which share its class label” (P. 4, Sec. 3.2, First Paragraph).): providing first training data for training the machine learning algorithm (DeVries discloses “we demonstrate that extrapolating between samples in feature space can be used to augment datasets and improve the performance of supervised learning algorithms … We show that models trained on datasets that have been augmented using our technique outperform models trained only on data from the original dataset” (P. 2, Sec. 1, Last Paragraph).); approximating a manifold in which at least one part of the data points of the first training data is located by determining, for each respective data point from the first training data, nearest neighbors of the respective data point within the data points of the first training data (The Examiner interprets “manifold” according to its broadest reasonable interpretation (in view of the Applicant’s specification at Paragraph 0012) as encompassing a feature space as disclosed by DeVries. DeVries discloses “we consider augmentation not by a domain-specific transformation, but by perturbing, interpolating, or extrapolating between existing examples. However, we choose to operate not in input space, but in a learned feature space. Bengio et al. 
(2013) and Ozair & Bengio (2014) claimed that higher level representations expand the relative volume of plausible data points within the feature space, conversely shrinking the space allocated for unlikely data points. As such, when traversing along the manifold it is more likely to encounter realistic samples in feature space than compared to input space” (P. 1, Sec. 1, Second Paragraph). DeVries further discloses “our dataset augmentation technique works by first learning a data representation and then applying transformations to samples mapped to that representation. Our hypothesis is that, due to manifold unfolding in feature space, simple transformations applied to encoded rather than raw inputs will result in more plausible synthetic data. … we use a sequence autoencoder to construct a feature space” (P. 2-3, Sec. 3, First Paragraph). DeVries discloses “for each sample in the dataset, we find its K nearest neighbours in feature space which share its class label. For each pair of neighbouring context vectors, a new context vector can then be generated using interpolation” (P. 4, Sec. 3.2, First Paragraph). DeVries further discloses “For each sample in the dataset we found the 10 nearest in-class neighbours by searching in feature space. We then interpolated or extrapolated between each neighbour and the original sample to produce a synthetic example which was added to the augmented dataset” (P. 4, Sec. 4, Last Paragraph).); determining a structure of the at least one part of the data points of the first training data in the manifold (The Examiner interprets “structure” according to its broadest reasonable interpretation (in view of the Applicant’s specification at Paragraph 0013) as encompassing proximity relationships between neighboring context vectors (‘data points’) in a feature space (‘manifold’). DeVries discloses “for each sample in the dataset, we find its K nearest neighbours in feature space which share its class label. 
For each pair of neighbouring context vectors, a new context vector can then be generated using interpolation: c′ = (c_j − c_i)λ + c_i, where c′ is the synthetic context vector, c_i and c_j are neighbouring context vectors, and λ is a variable in the range {0, 1} that controls the degree of interpolation. In our experiments, we use λ = 0.5 so that the new sample balances properties of both original samples" (P. 4, Sec. 3.2, First Paragraph).); generating additional training data based on the determined structure of the at least one part of the data points of the first training data in the manifold (DeVries discloses "for each sample in the dataset, we find its K nearest neighbours in feature space which share its class label. For each pair of neighbouring context vectors, a new context vector can then be generated using interpolation … where c′ is the synthetic context vector, c_i and c_j are neighbouring context vectors, and λ is a variable in the range {0, 1} that controls the degree of interpolation. In our experiments, we use λ = 0.5 so that the new sample balances properties of both original samples. In a similar fashion, extrapolation can also be applied to the context vectors … Once new context vectors have been created, they can either be used directly as input for a learning task" (P. 4, Sec. 3.2).); and training the machine learning algorithm to perform a task based on the first training data and the additional training data (DeVries discloses "We conduct four different test scenarios on the MNIST dataset. To control for the representation, as a baseline we trained the classifier only on context vectors from the original images (i.e. SA with no augmentation).
We then compare this to training with various kinds of dataset augmentation: traditional affine image transformations in input space (shifting, rotation, scaling), extrapolation between nearest neighbours in input space, and extrapolation between nearest neighbours in representational space. For both extrapolation experiments we use three nearest neighbours per sample and γ = 0.5 when generating new data" (P. 8, Sec. 4.6, ¶4). DeVries discloses Table 4 on P. 8 depicting test errors for models trained with original and augmented data. A baseline model is trained using only original training data ('first training data'). The baseline model is then further trained with an augmented dataset ('additional training data') consisting of samples obtained through feature space extrapolation. [Table 4, DeVries P. 8: test errors for each training scenario.]). However, DeVries does not teach determining a structure of the at least one part of the data points of the first training data in the manifold using principal component analysis based on the nearest neighbors of each respective data point, which is taught by Beigi: determining a structure of the at least one part of the data points of the first training data in the manifold using principal component analysis based on the nearest neighbors of each respective data point (The Examiner interprets "manifold" according to its broadest reasonable interpretation (BRI) in view of the Applicant's specification (at Paragraph [0012]) as encompassing a low-dimensional feature space as disclosed by Beigi. Beigi discloses "the records are embedded in low-dimensional space. This embedding comprises mapping of the records to a p-dimensional feature space V, where p is between 0 and m. Preferably, p is small, e.g., two or three, resulting in a low-dimensional feature space, which makes the subsequent k-nearest neighbor clustering operation work better. The embedding helps determine which records are similar to each other. FIG.
2B shows a space in which p=2. Embedding may be accomplished using t-stochastic neighbor embedding (t-SNE), uniform manifold approximation and projection (UMAP), or principal component analysis (PCA). … PCA computes the principal components (the dimensions in the low-dimensional space) of a set of records and uses the principal components to perform a change of basis on the data, sometimes using only the first few principal components and ignoring the rest. FIG. 2C shows the differences between using these three types of dimensionality reduction. The black dots show the original data; the gray dots show how each method generates a slightly different synthetic dataset" [0037]. See Figure 2C depicting the synthetic dataset generated by using a low-dimensional feature space ('manifold') obtained from performing principal component analysis. The Examiner interprets "structure" according to its BRI in view of the Applicant's specification (at Paragraph [0013]) as encompassing distance relationships between embedded points in a low-dimensional feature space ('manifold') as disclosed by Beigi below. Beigi discloses pseudo-code for generating synthetic data in [0019-0034]. An original dataset ('first training data') is embedded into a low-dimensional feature space ('manifold') by using Principal Component Analysis. The embedded original dataset is represented by embedded points rs ('data points of the first training data'). A random embedded point is chosen and then its k nearest neighbors in the feature space are selected. The nearest neighbors of an embedded point represent distance relationships between embedded points in a feature space, and therefore are a "structure of the at least one part of the data points of the first training data in the manifold". [Pseudo-code for generating synthetic data, Beigi ¶¶ 0019-0034.] Beigi discloses "Once the records are embedded in low-dimensional space, a seed record rs is selected at random in operation 120.
Operation 125 then identifies the k nearest neighbors to the seed record. The value of k is selected heuristically based on the trade-off between fidelity and privacy and on the type of application" [0038].);

Beigi teaches that using principal component analysis to embed an original dataset in a low-dimensional feature space is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the method of DeVries with the principal component analysis technique disclosed by Beigi to create a low-dimensional feature space. By using principal component analysis to create a low-dimensional feature space, data points can be transformed into lower dimensions, making it easier to perform nearest neighbor clustering, thereby reducing computational cost while retaining important information.

With respect to claim 2, the combination of DeVries in view of Beigi teaches: the method according to Claim 1, wherein: the generating the additional training data includes varying coefficients of the principal component analysis to determine the additional training data (Beigi discloses using principal component analysis (PCA) to obtain a low-dimensional feature space (see [0037]). PCA is used to transform an original dataset to low-dimension embedded points. To transform the original dataset to low-dimension embeddings, the coefficients used in PCA must be adjusted; therefore, "varying coefficients of the principal component analysis" is implied. Beigi discloses that nearest neighbors are used to generate synthetic records ('additional training data'): "in operation 130, a new, synthetic record, rs′, is generated. For each record in the low-dimensional space, the method generates one or more synthetic records by permuting the features of its k nearby neighbors within a certain radius/distance and within the same cluster" [0039].
See also the pseudo-code for generating synthetic data in [0019-0034], which describes how nearest neighbor clustering is performed in a low-dimensional feature space (obtained from performing PCA) to generate synthetic records.).

Beigi teaches that using principal component analysis and nearest neighbor clustering to generate synthetic data is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the method of DeVries with the principal component analysis technique disclosed by Beigi to create a low-dimensional feature space. By using principal component analysis to create a low-dimensional feature space, data points can be transformed into lower dimensions, making it easier to perform nearest neighbor clustering, thereby reducing computational cost while retaining important information.

With respect to claim 4, the combination of DeVries in view of Beigi teaches: the method according to Claim 1, further comprising: respectively determining, for each data point in the additional training data, a data value for the respective data point based on data values associated with the nearest neighbors of the respective data point (DeVries discloses "For each pair of neighbouring context vectors, a new context vector can then be generated using interpolation: c′ = (c_j − c_i)λ + c_i, where c′ is the synthetic context vector, c_i and c_j are neighbouring context vectors, and λ is a variable in the range {0, 1} that controls the degree of interpolation. In our experiments, we use λ = 0.5 so that the new sample balances properties of both original samples" (P. 4, Sec. 3.2, First Paragraph).).

Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over DeVries in view of Beigi, in further view of Villalonga et al.
(“Industrial cyber-physical system for condition-based monitoring in manufacturing processes”), hereinafter Villalonga. With respect to claim 3, the combination of DeVries in view of Beigi teaches the method according to claim 1; however, the combination does not teach determining nearest neighbors based on a Euclidean norm, which Villalonga does: wherein, for each data point from the first training data, the nearest neighbors are determined based on a Euclidean norm (Villalonga discloses “the procedure chosen to obtain the local model in incremental hybrid modeling is the Fuzzy k-Nearest Neighbors (F-kNN) approach. kNN consists of averaging the value of the points closest to the objective point. The kNN algorithm assumes, therefore, that nearby points have similar values. To calculate the proximity, the Euclidean norm was applied” (P. 641, Sec. 3C, First Paragraph).). Villalonga teaches that determining a nearest neighbor based on a Euclidean norm is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the method of DeVries with the technique disclosed by Villalonga to use fewer computing resources. Calculating a Euclidean norm is computationally efficient because it relies on simple arithmetic operations (addition, squaring, square root) to calculate a distance. Computers are equipped with processors that are optimized to handle simple arithmetic operations; therefore, these calculations would not take long to compute or consume many computational resources.

Claims 5, 7-9, 11-12 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over DeVries in view of Beigi, in further view of Javed et al. (“Design and implementation of a cloud enabled random neural network-based …”), hereinafter Javed. 
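As an illustrative aside (not part of the Office Action text): the kNN procedure quoted from Villalonga above, averaging the values of the k points closest to a query under the Euclidean norm, can be sketched in a few lines of Python. This is a minimal reconstruction under stated assumptions; the function names are invented here, and Villalonga's fuzzy (distance-weighted) variant is omitted for brevity.

```python
import math

def euclidean(a, b):
    # Euclidean norm of the difference vector: sqrt of the summed squared deltas.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(points, values, query, k=3):
    """Plain kNN regression as Villalonga describes it: assume nearby points
    have similar values, so average the values of the k points closest to
    the query under the Euclidean norm."""
    ranked = sorted(range(len(points)), key=lambda i: euclidean(points[i], query))
    return sum(values[i] for i in ranked[:k]) / k
```

With k = 1 this degenerates to a nearest-neighbor lookup; the computational argument in the rejection rests on `euclidean` using only addition, squaring, and a square root.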
With respect to claim 5, the combination of DeVries in view of Beigi teaches the method according to Claim 1; however, the combination does not teach using training data comprised of sensor data, which Javed does: wherein the first training data comprise sensor data (Javed discloses “the trained RNN model is implemented on an indoor environment sensor node. The inputs for the model are: HVAC inlet air temperature, HVAC inlet air CO2 concentrations, inlet air temperature of the environment chamber, and CO2 concentration inside the environment chamber. The output of the model is the number of occupants inside the environment chamber. The training data set for the RNN model is downloaded from the Web portal” (P. 397, Sec. 4B, First Paragraph). Javed further discloses “for each sensor node, the Web portal displays node ID, upload time in milli seconds (the time sensor node is powered), light intensity, CO2 concentrations, temperature, humidity, dewpoint temperature, data receiving time, motion sensor, heating setpoint, cooling setpoint, heating output for HVAC, cooling output for HVAC, ventilation output for HVAC, and number of occupants in the room” (P. 397, Sec. 3F).). Javed teaches that using training data gathered from sensors to train a machine learning model is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the method of DeVries with the technique disclosed by Javed to train an accurate machine learning model. Sensors provide accurate, specific, and real-time data about an environment, which can be used to train a machine learning model that captures contextualized relationships and patterns. By using a trained model that has learned complex patterns and relationships, the model can predict accurate outcomes, which can be used to make well-informed decisions. 
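As an illustrative aside (not part of the Office Action text): the generation scheme the rejections attribute to Beigi, namely embedding records with PCA, picking a seed record, finding its k nearest neighbors in the embedded space, and synthesizing a record from those neighbors' features, can be sketched as follows. This is a minimal reconstruction under stated assumptions: the function names, the choice of p = 2, and the feature-wise "donor" permutation are illustrative, and Beigi's clustering and radius constraints are omitted.

```python
import numpy as np

def pca_embed(X, p=2):
    """Project records onto the first p principal components
    (a low-dimensional feature space, p typically 2 or 3)."""
    Xc = X - X.mean(axis=0)
    # SVD of the centered data; rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:p].T

def synthesize(X, k=3, rng=None):
    """Pick a random seed record, find its k nearest neighbors in the
    embedded space, and build a synthetic record by copying each feature
    from a randomly chosen neighbor (a simple feature permutation)."""
    rng = np.random.default_rng(rng)
    Z = pca_embed(X)
    seed = rng.integers(len(X))
    # Euclidean distances in the embedded space; index 0 of the sort is
    # the seed itself, so skip it.
    d = np.linalg.norm(Z - Z[seed], axis=1)
    neighbors = np.argsort(d)[1:k + 1]
    donor = rng.choice(neighbors, size=X.shape[1])
    return X[donor, np.arange(X.shape[1])]
```

Every value in the synthetic record comes from a real neighbor, which is why proximity in the embedded space (the "structure" the rejection points to) controls how plausible the output is.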
With respect to claim 7, the combination of DeVries in view of Beigi teaches: the machine learning algorithm having been trained according to the method of Claim 1. However, the combination does not teach controlling at least one function of a controllable system based on a trained machine learning algorithm, which Javed does: a method for controlling at least one function of a controllable system, comprising (Javed discloses “the base station is embedded with RNN models to control the HVAC on the basis of setpoints for heating and cooling. The HVAC of the environment chamber consumes 27.12% less energy with smart controller as compared to simple rule-based controllers” (P. 393, Abstract).): providing a machine learning algorithm for controlling the at least one function of the controllable system (Javed discloses a Random Neural Network (RNN) model, “the estimated setpoints from the RNN model will be used by RNN HVAC control model for controlling the HVAC. … The RNN HVAC controller controls the HVAC on the basis of the setpoints estimated by the RNN PMV-based setpoint estimator … The RNN HVAC controller model is trained by the dataset collected from the environment chamber” (P. 399, Sec. 4E, Last Two Paragraphs).), and controlling the at least one function of the controllable system based on the trained machine learning algorithm (Javed discloses “The RNN HVAC controller controls the HVAC on the basis of the setpoints estimated by the RNN PMV-based setpoint estimator or user defined setpoints for heating and cooling … The outputs of the RNN model are: 1) heating output for turning on the HVAC heating; 2) cooling output for turning on the HVAC cooling; and 3) ventilation for the zone” (P. 399, Sec. 4E, Last Paragraph).). Javed teaches that controlling an HVAC system (‘controllable system’) by using a controller embedded with a trained RNN model is a known method in the art. 
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the method of DeVries with the technique disclosed by Javed to automate tasks. In a system’s environment, there are often many variables to consider, which makes it difficult for a human operator or a simple controller to make quick, complex decisions in real time. A machine learning model can be trained to learn complex patterns and relationships between data, which can then be used to make accurate predictions on unseen, real-time data. Therefore, a trained machine learning model can be used to save time and resources by automating decision-making tasks that control a system.

With respect to claim 8, DeVries teaches: … generating training data for training a machine learning algorithm (DeVries discloses “Dataset augmentation, the practice of applying a wide array of domain-specific transformations to synthetically expand a training set, is a standard tool in supervised learning … we adopt a simpler, domain-agnostic approach to dataset augmentation. We start with existing data points and apply simple transformations such as adding noise, interpolating, or extrapolating between them” (P. 1, Abstract).), the training data respectively comprise a data point and a data value associated with the data point (DeVries discloses “in order to augment a dataset, each example is projected into feature space by feeding it through the sequence encoder, extracting the resulting context vector, and then applying a transformation in feature space (Figure 1b). … We include a γ parameter to globally scale the noise … where i indexes the elements of a context vector which corresponds to data points from the training set. … For each sample in the dataset, we find its K nearest neighbours in feature space which share its class label” (P. 4, Sec. 
3.2, First Paragraph).), … comprising: a memory configured to store first training data and program code (DeVries discloses “In all experiments, we trained a LSTM-based sequence autoencoder in order to learn a feature space from the available training examples” (P. 4, Sec. 4, First Paragraph). A computer is implied by training an autoencoder with available training examples, which further implies a memory storing training examples (‘first training data’) and programming instructions.); and a processor operably connected to the memory and configured to execute the program code to (A computer is implied by training an autoencoder with available training examples, which further implies a memory storing training examples and a processor executing programming instructions.): approximate a manifold in which at least one part of the data points of the first training data is located by determining, for each respective data point from the first training data, nearest neighbors of the respective data point within the data points of the first training data (The Examiner interprets “manifold” according to its broadest reasonable interpretation (in view of the Applicant’s specification at Paragraph 0012) as encompassing a feature space as disclosed by DeVries. DeVries discloses “we consider augmentation not by a domain-specific transformation, but by perturbing, interpolating, or extrapolating between existing examples. However, we choose to operate not in input space, but in a learned feature space. Bengio et al. (2013) and Ozair & Bengio (2014) claimed that higher level representations expand the relative volume of plausible data points within the feature space, conversely shrinking the space allocated for unlikely data points. As such, when traversing along the manifold it is more likely to encounter realistic samples in feature space than compared to input space” (P. 1, Sec. 1, Second Paragraph). 
DeVries further discloses “our dataset augmentation technique works by first learning a data representation and then applying transformations to samples mapped to that representation. Our hypothesis is that, due to manifold unfolding in feature space, simple transformations applied to encoded rather than raw inputs will result in more plausible synthetic data. … we use a sequence autoencoder to construct a feature space” (P. 2-3, Sec. 3, First Paragraph). DeVries discloses “for each sample in the dataset, we find its K nearest neighbours in feature space which share its class label. For each pair of neighbouring context vectors, a new context vector can then be generated using interpolation” (P. 4, Sec. 3.2, First Paragraph). DeVries further discloses “For each sample in the dataset we found the 10 nearest in-class neighbours by searching in feature space. We then interpolated or extrapolated between each neighbour and the original sample to produce a synthetic example which was added to the augmented dataset” (P. 4, Sec. 4, Last Paragraph).); determine a structure of the at least one part of the data points of the first training data in the manifold … (The Examiner interprets “structure” according to its broadest reasonable interpretation (in view of the Applicant’s specification at Paragraph 0013) as encompassing proximity relationships between neighboring context vectors (‘data points’) in a feature space (‘manifold’). DeVries discloses “for each sample in the dataset, we find its K nearest neighbours in feature space which share its class label. For each pair of neighbouring context vectors, a new context vector can then be generated using interpolation: c′ = (c_j - c_i)λ + c_i, where c′ is the synthetic context vector, c_i and c_j are neighbouring context vectors, and λ is a variable in the range {0, 1} that controls the degree of interpolation. 
In our experiments, we use λ = 0.5 so that the new sample balances properties of both original samples” (P. 4, Sec. 3.2, First Paragraph).); generate additional training data based on the determined structure of the at least one part of the data points of the first training data in the manifold (DeVries discloses “for each sample in the dataset, we find its K nearest neighbours in feature space which share its class label. For each pair of neighbouring context vectors, a new context vector can then be generated using interpolation … where c′ is the synthetic context vector, c_i and c_j are neighbouring context vectors, and λ is a variable in the range {0, 1} that controls the degree of interpolation. In our experiments, we use λ = 0.5 so that the new sample balances properties of both original samples. In a similar fashion, extrapolation can also be applied to the context vectors … Once new context vectors have been created, they can either be used directly as input for a learning task” (P. 4, Sec. 3.2).); and train the machine learning algorithm to perform a task based on the first training data and the additional training data (DeVries discloses “We conduct four different test scenarios on the MNIST dataset. To control for the representation, as a baseline we trained the classifier only on context vectors from the original images (i.e. SA with no augmentation). We then compare this to training with various kinds of dataset augmentation: traditional affine image transformations in input space (shifting, rotation, scaling), extrapolation between nearest neighbours in input space, and extrapolation between nearest neighbours in representational space. For both extrapolation experiments we use three nearest neighbours per sample and γ = 0.5 when generating new data” (P. 8, Sec. 4.6, ¶4). DeVries discloses Table 4 on P. 8 (reproduced above) depicting test errors for models trained with original and augmented data. 
A baseline model is trained using only original training data (‘first training data’). The baseline model is then further trained with an augmented dataset (‘additional training data’) consisting of samples obtained through feature space extrapolation.). However, DeVries does not teach determining a structure of the at least one part of the data points of the first training data in the manifold using principal component analysis based on the nearest neighbors of each respective data point, which is taught by Beigi: determine a structure of the at least one part of the data points of the first training data in the manifold using principal component analysis based on the nearest neighbors of each respective data point (The Examiner interprets “manifold” according to its broadest reasonable interpretation (BRI) in view of the Applicant’s specification (at Paragraph [0012]) as encompassing a low-dimensional feature space as disclosed by Beigi. Beigi discloses “the records are embedded in low-dimensional space. This embedding comprises mapping of the records to a p-dimensional feature space V, where p is between 0 and m. Preferably, p is small, e.g., two or three, resulting in a low-dimensional feature space, which makes the subsequent k-nearest neighbor clustering operation work better. The embedding helps determine which records are similar to each other. FIG. 2B shows a space in which p=2. Embedding may be accomplished using t-stochastic neighbor embedding (t-SNE), uniform manifold approximation and projection (UMAP), or principal component analysis (PCA). … PCA computes the principal components (the dimensions in the low-dimensional space) of a set of records and uses the principal components to perform a change of basis on the data, sometimes using only the first few principal components and ignoring the rest. FIG. 2C shows the differences between using these three types of dimensionality reduction. 
The black dots show the original data; the gray dots show how each method generates a slightly different synthetic dataset” [0037]. See Figure 2C depicting the synthetic dataset generated by using a low-dimensional feature space (‘manifold’) obtained from performing principal component analysis. The Examiner interprets “structure” according to its BRI in view of the Applicant’s specification (at Paragraph [0013]) as encompassing distance relationships between embedded points in a low-dimensional feature space (‘manifold’) as disclosed by Beigi below. Beigi discloses pseudo-code for generating synthetic data in [0019-0034] (reproduced above). An original dataset (‘first training data’) is embedded into a low-dimensional feature space (‘manifold’) by using Principal Component Analysis. The embedded original dataset is represented by embedded points rs (‘data points of the first training data’). A random embedded point is chosen and then its k nearest neighbors in the feature space are selected. The nearest neighbors of an embedded point represent distance relationships between embedded points in a feature space, and therefore are a “structure of the at least one part of the data points of the first training data in the manifold”. Beigi discloses “Once the records are embedded in low-dimensional space, a seed record rs is selected at random in operation 120. Operation 125 then identifies the k nearest neighbors to the seed record. The value of k is selected heuristically based on the trade-off between fidelity and privacy and on the type of application” [0038].); Beigi teaches that using principal component analysis to embed an original dataset in a low-dimensional feature space is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the method of DeVries with the principal component analysis technique disclosed by Beigi to create a low-dimensional feature space. 
By using principal component analysis to create a low-dimensional feature space, data points can be transformed into lower dimensions, making it easier to perform nearest neighbor clustering, thereby reducing computational cost while retaining important information.

Furthermore, the combination of DeVries in view of Beigi does not teach a control device, which Javed does: a control device for generating training data for training a machine learning algorithm … (Javed discloses “it is now possible to implement the base station and sensor nodes with a low power ATmel 328P microcontroller-based Moteino [23] board (2 kB RAM). The environment sensor node estimates the number of occupants and sends this information to the base station which controls the HVAC for maintaining a comfortable indoor environment in the room. The base station is integrated with a gateway to upload the data on a Web portal” (P. 394, Sec. 1, Last Paragraph). Javed further discloses “the trained RNN model is implemented on an indoor environment sensor node. The inputs for the model are: HVAC inlet air temperature, HVAC inlet air CO2 concentrations, inlet air temperature of the environment chamber, and CO2 concentration inside the environment chamber. The output of the model is the number of occupants inside the environment chamber. The training data set for the RNN model is downloaded from the Web portal” (P. 397, Sec. 4B, First Paragraph). Javed discloses Figure 19 on P. 401 depicting data that is collected from sensor nodes.) Javed teaches that controlling an HVAC system by using a controller (‘control device’) is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the method of DeVries and the principal component analysis technique of Beigi with the technique disclosed by Javed to automate tasks. 
In a system’s environment, there are often many real-time changes that occur at high volume, which a human operator cannot feasibly resolve. A controller can be used to automate repetitive, high-volume tasks with precision and real-time control. Therefore, using a controller can be cost-effective since tasks are automated and controllers are less expensive to implement and maintain than human labor.

With respect to claim 9, the combination of DeVries in view of Beigi and in further view of Javed teaches: the control device according to Claim 8, wherein the processor further executes the program code to vary coefficients of the principal component analysis to determine the additional training data (Beigi discloses using principal component analysis (PCA) to obtain a low-dimensional feature space (see [0037]). PCA is used to transform an original dataset to low-dimension embedded points. To transform the original dataset to low-dimension embeddings, the coefficients used in PCA must be adjusted; therefore, “varying coefficients of the principal component analysis” is implied. Beigi discloses that nearest neighbors are used to generate synthetic records (‘additional training data’): “in operation 130, a new, synthetic record, rs′, is generated. For each record in the low-dimensional space, the method generates one or more synthetic records by permuting the features of its k nearby neighbors within a certain radius/distance and within the same cluster” [0039]. See also pseudo-code for generating synthetic data in [0019-0034] (reproduced above) that describes how nearest neighbor clustering is performed in a low-dimensional feature space (obtained from performing PCA) to generate synthetic records.). Beigi teaches that using principal component analysis and nearest neighbor clustering to generate synthetic data is a known method in the art. 
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the method of DeVries with the principal component analysis technique disclosed by Beigi to create a low-dimensional feature space. By using principal component analysis to create a low-dimensional feature space, data points can be transformed into lower dimensions, making it easier to perform nearest neighbor clustering, thereby reducing computational cost while retaining important information.

With respect to claim 11, the combination of DeVries in view of Beigi and in further view of Javed teaches: the control device according to Claim 8, wherein the processor further executes the program code to respectively determine, for each data point in the additional training data, a data value for the respective data point based on data values associated with the nearest neighbors of the respective data point (DeVries discloses “For each pair of neighbouring context vectors, a new context vector can then be generated using interpolation: c′ = (c_j - c_i)λ + c_i, where c′ is the synthetic context vector, c_i and c_j are neighbouring context vectors, and λ is a variable in the range {0, 1} that controls the degree of interpolation. In our experiments, we use λ = 0.5 so that the new sample balances properties of both original samples” (P. 4, Sec. 3.2, First Paragraph).).

With respect to claim 12, the combination of DeVries in view of Beigi and in further view of Javed teaches: the control device according to Claim 8, wherein the first training data comprise sensor data (Javed discloses “the trained RNN model is implemented on an indoor environment sensor node. The inputs for the model are: HVAC inlet air temperature, HVAC inlet air CO2 concentrations, inlet air temperature of the environment chamber, and CO2 concentration inside the environment chamber. 
The output of the model is the number of occupants inside the environment chamber. The training data set for the RNN model is downloaded from the Web portal” (P. 397, Sec. 4B, First Paragraph). Javed further discloses “for each sensor node, the Web portal displays node ID, upload time in milli seconds (the time sensor node is powered), light intensity, CO2 concentrations, temperature, humidity, dewpoint temperature, data receiving time, motion sensor, heating setpoint, cooling setpoint, heating output for HVAC, cooling output for HVAC, ventilation output for HVAC, and number of occupants in the room” (P. 397, Sec. 3F).). Javed teaches that using training data gathered from sensors to train a machine learning model is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the method of DeVries with the technique disclosed by Javed to train an accurate machine learning model. Sensors provide accurate, specific, and real-time data about an environment, which can be used to train a machine learning model that captures contextualized relationships and patterns. By using a trained model that has learned complex patterns and relationships, the model can predict accurate outcomes, which can be used to make well-informed decisions.

With respect to claim 14, the combination of DeVries in view of Beigi and in further view of Javed teaches: The control device according to Claim 8, wherein the processor further executes the program code to control at least one function of a controllable system using the trained machine learning algorithm (Javed discloses “the trained RNN model is implemented on an indoor environment sensor node. The inputs for the model are: HVAC inlet air temperature, HVAC inlet air CO2 concentrations, inlet air temperature of the environment chamber, and CO2 concentration inside the environment chamber. 
The output of the model is the number of occupants inside the environment chamber. The training data set for the RNN model is downloaded from the Web portal” (P. 397, Sec. 4B, First Paragraph). Javed discloses “The RNN HVAC controller controls the HVAC on the basis of the setpoints estimated by the RNN PMV-based setpoint estimator or user defined setpoints for heating and cooling … The outputs of the RNN model are: 1) heating output for turning on the HVAC heating; 2) cooling output for turning on the HVAC cooling; and 3) ventilation for the zone” (P. 399, Sec. 4E, Last Paragraph).).

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over DeVries in view of Beigi, in further view of Javed and Villalonga.

With respect to claim 10, the combination of DeVries in view of Beigi and in further view of Javed teaches the control device according to Claim 8; however, the combination does not teach determining nearest neighbors based on a Euclidean norm, which Villalonga does: wherein the processor further executes the program code to respectively determine, for each data point from the first training data, the nearest neighbors based on a Euclidean norm (Villalonga discloses “the procedure chosen to obtain the local model in incremental hybrid modeling is the Fuzzy k-Nearest Neighbors (F-kNN) approach. kNN consists of averaging the value of the points closest to the objective point. The kNN algorithm assumes, therefore, that nearby points have similar values. To calculate the proximity, the Euclidean norm was applied” (P. 641, Sec. 3C, First Paragraph).). Villalonga teaches that determining a nearest neighbor based on a Euclidean norm is a known method in the art. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine the method of DeVries with the technique disclosed by Villalonga to use fewer computing resources. 
Calculating a Euclidean norm is computationally efficient because it relies on simple arithmetic operations (addition, squaring, square root) to calculate a distance. Computers are equipped with processors that are optimized to handle simple arithmetic operations; therefore, these calculations would not take long to compute or consume many computational resources.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to PEDRO J MORALES whose telephone number is (571)272-6106. The examiner can normally be reached 8:30 AM - 6:00 PM. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, MIRANDA M HUANG can be reached at (571)270-7092. 
The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/PEDRO J MORALES/
Examiner, Art Unit 2124

/MIRANDA M HUANG/
Supervisory Patent Examiner, Art Unit 2124
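As an illustrative aside (not part of the Office Action text): the feature-space interpolation quoted from DeVries in the rejections above, with λ = 0.5 yielding a midpoint that "balances properties of both original samples", can be sketched directly. The interpolation formula c′ = (c_j - c_i)λ + c_i is reconstructed from the quoted variable definitions; the extrapolation form is inferred from the paper's description rather than quoted in the action, so treat it as an assumption.

```python
def interpolate(ci, cj, lam=0.5):
    """Synthetic context vector between two neighbours:
    c' = (c_j - c_i) * lambda + c_i.
    With lam = 0.5 this is the midpoint of the pair."""
    return [(y - x) * lam + x for x, y in zip(ci, cj)]

def extrapolate(ci, cj, lam=0.5):
    """Push past c_j, away from c_i, instead of staying between the pair:
    c' = (c_j - c_i) * lambda + c_j (assumed form, not quoted in the action)."""
    return [(y - x) * lam + y for x, y in zip(ci, cj)]
```

In DeVries's pipeline these operate on encoder context vectors, not raw inputs; the resulting vectors are either decoded back to input space or used directly as training features.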

Prosecution Timeline

Nov 10, 2022
Application Filed
Aug 22, 2025
Non-Final Rejection — §101, §103
Nov 20, 2025
Response Filed
Jan 22, 2026
Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12591803
SYSTEMS AND METHODS FOR APPLYING MACHINE LEARNING BASED ANOMALY DETECTION IN A CONSTRAINED NETWORK
Granted Mar 31, 2026 (2y 5m to grant)
Patent 12530412
SEARCH-QUERY SUGGESTIONS USING REINFORCEMENT LEARNING
Granted Jan 20, 2026 (2y 5m to grant)
Patent 12524673
MULTITASK DISTRIBUTED LEARNING SYSTEM AND METHOD BASED ON LOTTERY TICKET NEURAL NETWORK
Granted Jan 13, 2026 (2y 5m to grant)
Study what changed to get past this examiner. Based on 3 most recent grants.


Prosecution Projections

3-4
Expected OA Rounds
67%
Grant Probability
99%
With Interview (+50.0%)
3y 11m
Median Time to Grant
Moderate
PTA Risk
Based on 9 resolved cases by this examiner. Grant probability derived from career allow rate.
