Prosecution Insights
Last updated: April 19, 2026
Application No. 17/893,050

DEVICE AND METHOD FOR TRAINING A NEURAL NETWORK FOR IMAGE ANALYSIS

Final Rejection: §101, §103
Filed: Aug 22, 2022
Examiner: VAUGHN, RYAN C
Art Unit: 2125
Tech Center: 2100 — Computer Architecture & Software
Assignee: Robert Bosch GmbH
OA Round: 2 (Final)

Grant Probability: 62% (Moderate)
Estimated OA Rounds: 3-4
Estimated Time to Grant: 3y 9m
Grant Probability with Interview: 81%

Examiner Intelligence

Career Allow Rate: 62% (145 granted / 235 resolved; +6.7% vs TC avg)
Interview Lift: +19.4% (strong; allow rate in resolved cases with vs. without interview)
Avg Prosecution: 3y 9m (typical timeline)
Currently Pending: 45
Total Applications: 280 (career history, across all art units)

Statute-Specific Performance

§101: 23.9% (-16.1% vs TC avg)
§103: 40.1% (+0.1% vs TC avg)
§102: 7.6% (-32.4% vs TC avg)
§112: 21.9% (-18.1% vs TC avg)

Tech Center averages are estimates. Based on career data from 235 resolved cases.
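The headline examiner rate can be sanity-checked against the raw counts above (a minimal sketch; the dashboard's rounding convention is an assumption):

```python
# Career allow rate from the raw counts: 145 granted of 235 resolved cases.
granted, resolved = 145, 235
allow_rate = 100 * granted / resolved    # 61.70...%, displayed as 62%
tc_delta = 6.7                           # reported gap vs Tech Center average
tc_avg_estimate = allow_rate - tc_delta  # implied TC average allow rate (~55%)
```

The per-statute deltas in the table above follow the same convention: the examiner's rate minus the estimated Tech Center average.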

Office Action

Grounds of rejection: §101, §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. Claims 1-13 are presented for examination.

Response to Amendment

Applicant’s submission of replacement drawings has obviated the drawing objections. Therefore, those objections are withdrawn. However, Applicant has submitted no amendments to either the specification or the claims. Therefore, the remaining objections are maintained, and Examiner will assume for purposes of examination that the originally filed claims are the ones currently under examination.

Priority

Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Specification

Examiner objects to the specification for containing various grammatical informalities. Examiner has attached a marked-up copy of the specification indicating where errors have occurred. To the extent that the markings are not self-explanatory and are not corrected, Examiner will enumerate the remaining objections in a subsequent Office Action.

The abstract of the disclosure is objected to because “vectors; training” should be “vectors; and training”. A corrected abstract of the disclosure is required and must be presented on a separate sheet, apart from any other text. See MPEP § 608.01(b).

Claim Objections

Claim 8 is objected to because of the following informalities: “plurality of first loss value” should be “plurality of first loss values”. Appropriate correction is required.

Claim Rejections - 35 USC § 101

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Claim 12 is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.
The claim does not fall within at least one of the four categories of patent eligible subject matter because, under its broadest reasonable interpretation in light of the specification, it is directed to software per se. The claim is directed to a “training system configured to train a neural network”; however, the claim does not recite that the “system” comprises hardware such as a processor or a memory. Therefore, the “system” may be construed to include a system that includes only software. Examiner recommends amending the claim to recite the hardware explicitly.

Claims 1-13 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The analysis of the claims will follow the 2019 Revised Patent Subject Matter Eligibility Guidance, 84 Fed. Reg. 50 (“2019 PEG”).

Claim 1

Step 1: The claim recites a method; therefore, it is directed to the statutory category of processes.

Step 2A Prong 1: The claim recites, inter alia:

[D]etermining a first feature map … based on a first transformed image, wherein the first transformed image is determined based on a first transformation of a training image: This limitation could encompass mentally mapping out features of a transformed image.

[D]etermining a second feature map … based on a second transformed image, wherein the second transformed image is determined based on a second transformation of the training image: This limitation could encompass mentally mapping out features of a second transformed image.

[D]etermining a first loss value characterizing a metric between a first feature vector of the first feature map and a weighted sum of second feature vectors of the second feature map, wherein weights of the weighted sum are determined according to overlaps of a part of the training image characterized by the first feature vector with respect to parts of the training image characterized by the respective second feature vectors: This limitation could encompass mentally determining the loss value by taking a weighted sum of vectors of the map and calculating a distance between the weighted sum and a first vector. Additionally or alternatively, this limitation represents a mathematical concept.

Step 2A Prong 2: This judicial exception is not integrated into a practical application. The claim further recites that the feature maps are determined by a neural network. However, this is a mere instruction to apply the judicial exception using a generic computer programmed with a generic class of computer algorithm. MPEP § 2106.05(f). The claim further recites “training the neural network based on the first loss value”. However, this limitation merely restricts the above-mentioned abstract ideas to the technological environment of neural network training. MPEP § 2106.05(h).

Step 2B: The claim does not contain significantly more than the judicial exception. The analysis at this step mirrors that of step 2A, prong 2. As an ordered whole, the claim is directed to a mentally performable process of determining feature maps of transformed images and determining a metric between vectors corresponding to these maps based thereon. Nothing in the claim recites significantly more than this. As such, the claim is not patent eligible.

Claim 2

Step 1: A process, as above.
Step 2A Prong 1: The claim recites that “the first transformation and/or the second transformation characterizes an augmentation of the training image.” Determining a feature map based on an image so transformed is mentally performable.

Step 2A Prong 2: This judicial exception is not integrated into a practical application. See claim 1 analysis.

Step 2B: The claim does not contain significantly more than the judicial exception. See claim 1 analysis.

Claim 3

Step 1: A process, as above.

Step 2A Prong 1: The claim recites that “each weight of the weighted sum characterizes an intersection over union of the part of the training image characterized by the first feature vector and a part of the training image characterized by a second feature vector of the second feature vectors.” Determining the weights of the weighted sum remains mentally performable under these additional assumptions.

Step 2A Prong 2: This judicial exception is not integrated into a practical application. See claim 1 analysis.

Step 2B: The claim does not contain significantly more than the judicial exception. See claim 1 analysis.

Claim 4

Step 1: A process, as above.

Step 2A Prong 1: The claim recites that “the first loss value is set to zero when a sum of overlaps of the part of the training image characterized by the first feature vector with respect to the parts of the training image characterized by the respective second feature vectors is less than or equal to a predefined threshold.” This limitation could encompass mentally calculating a sum of overlaps, comparing it to a threshold, and setting a first loss value to zero if the threshold is not reached. This limitation also recites a mathematical concept.

Step 2A Prong 2: This judicial exception is not integrated into a practical application. See claim 1 analysis.

Step 2B: The claim does not contain significantly more than the judicial exception. See claim 1 analysis.

Claim 5

Step 1: A process, as above.
Step 2A Prong 1: The claim recites the same judicial exceptions as in claim 1.

Step 2A Prong 2: This judicial exception is not integrated into a practical application. The claim further recites that “the neural network includes an encoder and a predictor”. This limitation amounts to a mere instruction to apply the judicial exception using a generic computer programmed with a generic class of computer algorithm. MPEP § 2106.05(f). The claim further recites that “the second feature map is a second output of the encoder for the second transformed image and the first feature map is an output of the predictor determined for a first output of the encoder for the first transformed image.” This limitation recites the insignificant extra-solution activity of mere data gathering and output. MPEP § 2106.05(g).

Step 2B: The claim does not contain significantly more than the judicial exception. The analysis at this step is the same as in step 2A, prong 2, except insofar as the outputting limitations recite the well-understood, routine, and conventional activity of receiving or transmitting data over a network. MPEP § 2106.05(d)(II); OIP Techs., Inc., v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1093 (Fed. Cir. 2015) (sending messages over a network).

Claim 6

Step 1: A process, as above.

Step 2A Prong 1: The claim recites that “the metric characterizes a cosine similarity.” Determining a cosine similarity is a mathematical concept and could be calculated mentally given sufficiently simple data.

Step 2A Prong 2: This judicial exception is not integrated into a practical application. See claim 1 analysis.

Step 2B: The claim does not contain significantly more than the judicial exception. See claim 1 analysis.

Claim 7

Step 1: A process, as above.
Step 2A Prong 1: The claim recites that “for each first feature vector from a plurality of first feature vectors of the first feature map, a respective first loss value is determined, to determine a plurality of first loss values.” This limitation could encompass mentally determining the loss values.

Step 2A Prong 2: This judicial exception is not integrated into a practical application. See claim 1 analysis.

Step 2B: The claim does not contain significantly more than the judicial exception. See claim 1 analysis.

Claim 8

Step 1: A process, as above.

Step 2A Prong 1: The claim recites that “the neural network is trained based on the first loss or a sum of the plurality of first loss values or a mean of the plurality of first loss value, by means of a gradient descent algorithm, wherein gradients of parameters of the neural network are determined with respect to the first loss value or with respect to the sum of the plurality of first loss values or with respect to the mean of the plurality of first loss values.” This limitation recites a mathematical concept of training a neural network with a gradient descent algorithm.

Step 2A Prong 2: This judicial exception is not integrated into a practical application. See claim 1 analysis.

Step 2B: The claim does not contain significantly more than the judicial exception. See claim 1 analysis.

Claim 9

Step 1: A process, as above.

Step 2A Prong 1: The claim recites that “each gradient of the first loss value with respect to a second feature vector or a gradient of the sum of the plurality of first loss values with respect to a second feature vector or a gradient of the mean of the plurality of first loss values with respect to a second feature vector, is not backpropagated through the neural network.” This claim recites the mathematical concept of selectively backpropagating gradients using a gradient descent algorithm.

Step 2A Prong 2: This judicial exception is not integrated into a practical application. See claim 1 analysis.

Step 2B: The claim does not contain significantly more than the judicial exception. See claim 1 analysis.

Claim 10

Step 1: The claim recites a method; therefore, it is directed to the statutory category of processes.

Step 2A Prong 1: The claim recites the same judicial exceptions as in claim 1, with the exception that claim 10 additionally recites “determining the control signal based on an output signal of a neural network; wherein the neural network includes at least one layer and wherein parameters of the at least one layer have been trained”. This limitation encompasses mentally determining the control signal by visually inspecting the network output and determining what control signal should be sent based thereon.

Step 2A Prong 2: This judicial exception is not integrated into a practical application. The analysis at this step mirrors that of claim 1, with the exception that this claim additionally recites that the method is “for determining a control signal of an actuator”. However, this limitation merely limits the judicial exception to the field of use of controlling devices via an actuator. MPEP § 2106.05(h).

Step 2B: The claim does not contain significantly more than the judicial exception. The analysis at this step mirrors that of claim 1, with the exception that this claim additionally recites that the method is “for determining a control signal of an actuator”. However, this limitation merely limits the judicial exception to the field of use of controlling devices via an actuator. MPEP § 2106.05(h).

Claim 11

Step 1: A process, as above.

Step 2A Prong 1: The claim recites that “the actuator is part of: (i) a robot or (ii) a manufacturing machine or (iii) an automated personal assistant or (iv) an access control system or (v) a surveillance system or (vi) an imaging system.” Determining the control signal for such an actuator (as opposed to actuating the system using the control system) is mentally performable.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. See claim 10 analysis.

Step 2B: The claim does not contain significantly more than the judicial exception. See claim 10 analysis.

Claim 12

Step 1: As noted above, the claim is directed to software per se; however, for purposes of this rejection, it will be assumed that the claim is directed to the statutory category of machines.

Step 2A Prong 1: The claim recites the same judicial exceptions as in claim 1.

Step 2A Prong 2: This judicial exception is not integrated into a practical application. The analysis at this step mirrors that of claim 1, with the exception that claim 12 additionally recites a “training system configured to train a neural network, wherein the neural network is configured for image analysis”. However, this is a mere instruction to apply the judicial exception using a generic computer. MPEP § 2106.05(f).

Step 2B: The claim does not contain significantly more than the judicial exception. The analysis at this step mirrors that of claim 1, with the exception that claim 12 additionally recites a “training system configured to train a neural network, wherein the neural network is configured for image analysis”. However, this is a mere instruction to apply the judicial exception using a generic computer. MPEP § 2106.05(f).

Claim 13

Step 1: The claim recites a non-transitory machine-readable storage medium; therefore, it is directed to the statutory category of articles of manufacture.

Step 2A Prong 1: The claim recites the same judicial exceptions as in claim 1.

Step 2A Prong 2: This judicial exception is not integrated into a practical application. The analysis at this step mirrors that of claim 1, with the exception that claim 13 additionally recites a “non-transitory machine-readable storage medium on which is stored a computer program for training a neural network, wherein the neural network is configured for image analysis, the computer program, when executed by a computer, causing the computer to perform the [method]”. However, this is a mere instruction to apply the judicial exception using a generic computer. MPEP § 2106.05(f).

Step 2B: The claim does not contain significantly more than the judicial exception. The analysis at this step mirrors that of claim 1, with the exception that claim 13 additionally recites a “non-transitory machine-readable storage medium on which is stored a computer program for training a neural network, wherein the neural network is configured for image analysis, the computer program, when executed by a computer, causing the computer to perform the [method]”. However, this is a mere instruction to apply the judicial exception using a generic computer. MPEP § 2106.05(f).

Claim Rejections - 35 USC § 103

Claims 1-2, 5-6, 8-9, and 12-13 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al., “Exploring Simple Siamese Representation Learning,” arXiv preprint arXiv:2011.10566 (2020) (“Chen”) in view of Xu (US 20210319564) (“Xu”).

Regarding claim 1, Chen discloses “[a] computer-implemented method for training a neural network, wherein the neural network is configured for image analysis (Siamese networks are weight-sharing neural networks applied on two or more inputs; they are natural tools for comparing entities; recent methods define the inputs as two augmentations of one image – Chen, sec. 1, first paragraph), the training comprising the following steps: determining a first feature map by the neural network based on a first transformed image, wherein the first transformed image is determined based on a first transformation of a training image (architecture takes as input two randomly augmented views x1 and x2 [transformed images] from an image x; the two views are processed by an encoder network f consisting of a backbone and a projection MLP head [neural network]; a prediction MLP head h transforms the output of one view and matches it to the other view – Chen, sec. 3, first paragraph [output of prediction MLP head = first feature map]; Siamese networks can model invariance with respect to more complicated transformations (e.g., augmentations) – id. at sec. 1, last paragraph); determining a second feature map by the neural network based on a second transformed image, wherein the second transformed image is determined based on a second transformation of the training image (architecture takes as input two randomly augmented views x1 and x2 [transformed images] from an image x [i.e., there are multiple transformations of the same image]; the two views are processed by an encoder network f consisting of a backbone and a projection MLP head [neural network]; a prediction MLP head h transforms the output of one view and matches it to the other view – Chen, sec. 3, first paragraph [output of encoder on second image = second feature map]; Siamese networks can model invariance with respect to more complicated transformations (e.g., augmentations) – id. at sec. 1, last paragraph); determining a first loss value characterizing a metric between a first feature vector of the first feature map and a … second feature vector[] of the second feature map (denoting the two output vectors [feature vectors] as p1 = h(f(x1)) [first feature vector] and z2 = f(x2) [second feature vector], their negative cosine similarity D(p1, z2) [metric] is minimized; a symmetrized loss L [first loss value] is defined as 0.5D(p1, z2) + 0.5D(p2, z1) – Chen, sec. 3, first paragraph) …; and training the neural network based on the first loss value (Chen Algorithm 1 shows that the loss L is backpropagated and the projection and prediction networks are updated via stochastic gradient descent).”

Chen appears not to disclose explicitly the further limitations of the claim. However, Xu discloses “determining a … weighted sum of second feature vectors of the second feature map, wherein weights of the weighted sum are determined according to overlaps of a part of the training image characterized by the first feature vector with respect to parts of the training image characterized by the respective second feature vectors (generating a matte for an entire image may include stitching together matte patches by determining alpha values for pixels within overlapping regions; each predicted alpha value within an overlapping region may be the weighted sum of the alpha values [feature vectors] of pixels in the overlapping region where weights in the overlapping region are negatively proportional to the distance to the nearest non-overlapping neighbor [union of patches containing the overlapping region = part of training image characterized by the second feature vectors of the second feature map; patch containing the nearest non-overlapping region = part of training image characterized by first feature vector of the first feature map, so that the weights are determined according to overlaps insofar as they are determined by distance to non-overlapping neighbors] – Xu, paragraph 93) ….”

Xu and the instant application both relate to image processing using machine learning and are analogous. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Chen to determine a weighted sum of feature vectors based on weights determined based on overlaps between regions, as disclosed by Xu, and an ordinary artisan could reasonably expect to have done so successfully. Doing so would increase the reliability of the network by utilizing information from multiple areas of the image in the training thereof. See Xu, paragraphs 2-4.

Claim 12 is a system claim corresponding to method claim 1 and is rejected for the same reasons as given in the rejection of that claim. Similarly, claim 13 is a non-transitory machine-readable storage medium claim corresponding to method claim 1 and is rejected for the same reasons as given in the rejection of that claim.

Regarding claim 2, Chen, as modified by Xu, discloses that “the first transformation and/or the second transformation characterizes an augmentation of the training image (architecture takes as input two randomly augmented images x1 and x2 from an image x – Chen, sec. 3, first paragraph).”

Regarding claim 5, Chen, as modified by Xu, discloses that “the neural network includes an encoder and a predictor, wherein the second feature map is a second output of the encoder for the second transformed image and the first feature map is an output of the predictor determined for a first output of the encoder for the first transformed image (Chen Fig. 1 discloses a SimSiam architecture that applies an encoder f and then a predictor h for a first augmented image x1 [i.e., the first feature map is the output of the predictor that takes as input the first output of the encoder] and an encoder for a second augmented image x2 [i.e., the second feature map is the second output of the encoder for the second transformed image]).”

Regarding claim 6, Chen, as modified by Xu, discloses that “the metric characterizes a cosine similarity (negative cosine similarity of the output vectors p1 = h(f(x1)) and z2 = f(x2) is minimized – Chen, sec. 3, first paragraph).”

Regarding claim 8, Chen, as modified by Xu, discloses that “the neural network is trained based on the first loss or a sum of the plurality of first loss values or a mean of the plurality of first loss value[s], by means of a gradient descent algorithm, wherein gradients of parameters of the neural network are determined with respect to the first loss value or with respect to the sum of the plurality of first loss values or with respect to the mean of the plurality of first loss values (important component for the method to work is a stop-gradient operation; it is implemented by modifying the loss function such that the encoder on x2 receives no gradient from z2 but receives gradients from p2 (and vice versa for x1) [i.e., at least gradients with respect to the loss value are received] – Chen, sec. 3, second paragraph; Algorithm 1 shows that the updating of the networks occurs using SGD (stochastic gradient descent)).”

Regarding claim 9, Chen, as modified by Xu, discloses that “each gradient of the first loss value with respect to a second feature vector or a gradient of the sum of the plurality of first loss values with respect to a second feature vector or a gradient of the mean of the plurality of first loss values with respect to a second feature vector, is not backpropagated through the neural network (important component for the method to work is a stop-gradient operation; it is implemented by modifying the loss function such that the encoder on x2 receives no gradient [of a first loss value] from z2 [second feature vector] but receives gradients from p2 (and vice versa for x1) [note that the lack of receipt of a gradient implies that the gradient is not backpropagated through the network] – Chen, sec. 3, second paragraph).”

Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Chen in view of Xu and further in view of Zatloukal et al. (WO 2020123553) (“Zatloukal”).

Regarding claim 3, the rejection of claim 1 is incorporated. Xu further discloses a “weight of [a] weighted sum”, as shown above in the rejection of claim 1. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Chen to employ weighted sums, as disclosed by Xu, for substantially the same reasons as given in the rejection of claim 1. Neither Chen nor Xu appears to disclose explicitly the further limitations of the claim.
However, Zatloukal discloses that “each [data point] characterizes an intersection over union of the part of the training image characterized by the first feature vector and a part of the training image characterized by a second feature vector of the second feature vectors (object comprising a video or an image may be represented in a vector space as a vector, referred to as a feature vector; execution system may calculate a similarity metric of vectors in the vector space; a similarity metric may be a Jaccard similarity coefficient [intersection over union] – Zatloukal, paragraphs 42-43).”

Zatloukal and the instant application both relate to machine learning and are analogous. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Chen and Xu to determine the similarity of the two portions of the image using an intersection over union, as disclosed by Zatloukal, and an ordinary artisan could reasonably expect to have done so successfully. Doing so would provide the system with a simple and widely used way to determine whether the vectors are similar, thereby avoiding the necessity of developing expensive custom techniques for determining similarity. See Zatloukal, paragraphs 42-43.

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Chen in view of Xu and further in view of Agarwal et al. (US 20220144303) (“Agarwal”).

Regarding claim 4, neither Chen nor Xu appears to disclose explicitly the further limitations of the claim. However, Agarwal discloses that “the first loss value is set to zero when a sum of overlaps of the part of the training image characterized by the first feature vector with respect to the parts of the training image characterized by the respective second feature vectors is less than or equal to a predefined threshold (system may modify a face detector by adding a separate head for estimating pedestrian attention in parallel with existing box classification and regression branches; for any training anchor i, the system may minimize a multi-task loss function one of whose terms is a loss for an attention head that is a function of a predicted probability that is nonzero if the overlap [consisting of one term, so the sum of overlaps is the overlap] of anchor i [part of training image characterized by a first feature vector] with a ground truth face box [part of training image characterized by second feature vectors] is above a threshold [i.e., the probability, and hence the loss when the loss function passes through the origin, is zero if the threshold is not exceeded] – Agarwal, paragraphs 59-60; see also paragraph 30 (disclosing that the objects are encoded into feature vectors)).”

Agarwal and the instant application both relate to computer vision based on machine learning and are analogous. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Chen and Xu to set the loss function to zero if the overlap between two image-related feature vectors does not reach a threshold, as disclosed by Agarwal, and an ordinary artisan could reasonably expect to have done so successfully. Doing so would discard unnecessary information and ensure that the system only focuses on vectors that have sufficient overlap. See Agarwal, paragraphs 59-60.

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Chen in view of Xu and further in view of Cai et al. (US 20200279156) (“Cai”).

Regarding claim 7, neither Chen nor Xu appears to disclose explicitly the further limitations of the claim. However, Cai discloses that “for each first feature vector from a plurality of first feature vectors of the first feature map, a respective first loss value is determined, to determine a plurality of first loss values (in a feedforward phase of training, sample data are input, the models determine feature vectors, the feature vectors are passed through abstraction layers, and an output vector is determined; a backpropagation phase may include calculating the gradient of a cost function determined based on a plurality of loss functions; one loss function may be determined for each set of sample data [i.e., there is a loss per feature vector] – Cai, paragraph 20; see also paragraph 41 (disclosing that the sample data comprise feature maps)).”

Cai and the instant application both relate to determining loss values in machine learning and are analogous. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Chen and Xu to calculate multiple loss values, as disclosed by Cai, and an ordinary artisan could reasonably expect to have done so successfully. Doing so would allow the system to determine an overall cost function using more information than if a single loss function were used in calculating the cost function, thereby enhancing the training of the network and increasing the accuracy of the trained network. See Cai, paragraph 20.

Claims 10-11 are rejected under 35 U.S.C. 103 as being unpatentable over Chen in view of Xu and further in view of Li et al. (US 10748057) (“Li”).
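The per-vector loss aggregation mapped to Chen and Cai in the rejections of claims 7 and 8 can be sketched as follows (the vector values and the one-to-one pairing of first and second vectors are illustrative assumptions, not taken from the cited references):

```python
import math

def neg_cosine(p, z):
    # Negative cosine similarity D(p, z), the metric Chen minimizes (sec. 3).
    dot = sum(a * b for a, b in zip(p, z))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in z))
    return -dot / norm

# One first loss value per first feature vector (claim 7); pairing each first
# vector with a single second vector is a simplification for illustration.
first_map = [(1.0, 0.0), (0.0, 1.0)]
second_map = [(1.0, 0.0), (1.0, 1.0)]
losses = [neg_cosine(p, z) for p, z in zip(first_map, second_map)]

loss_sum = sum(losses)               # sum of the plurality of first loss values
loss_mean = loss_sum / len(losses)   # mean, the alternative recited in claim 8
```

Under claim 9 as mapped to Chen's stop-gradient operation, gradients of these losses with respect to the second feature vectors would not be backpropagated through the network.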
Regarding claim 10, Chen discloses “[a] computer-implemented method …, the method comprising: … [using a] neural network [that] includes at least one layer[,] … wherein parameters of the at least one layer have been trained (Chen Algorithm 1 shows that a loss L is backpropagated through several multi-layer perceptrons [neural networks containing at least one layer] and the projection and prediction networks are updated [i.e., the parameters are trained] via stochastic gradient descent) by: determining a first feature map by the neural network based on a first transformed image, wherein the first transformed image is determined based on a first transformation of a training image (architecture takes as input two randomly augmented views x1 and x2 [transformed images] from an image x; the two views are processed by an encoder network f consisting of a backbone and a projection MLP head [neural network]; a prediction MLP head h transforms the output of one view and matches it to the other view – Chen, sec. 3, first paragraph [output of prediction MLP head = first feature map]; Siamese networks can model invariance with respect to more complicated transformations (e.g., augmentations) – id. at sec. 1, last paragraph), determining a second feature map by the neural network based on a second transformed image, wherein the second transformed image is determined based on a second transformation of the training image (architecture takes as input two randomly augmented views x1 and x2 [transformed images] from an image x [i.e., there are multiple transformations of the same image]; the two views are processed by an encoder network f consisting of a backbone and a projection MLP head [neural network]; a prediction MLP head h transforms the output of one view and matches it to the other view – Chen, sec. 
3, first paragraph [output of encoder on second image = second feature map]; Siamese networks can model invariance with respect to more complicated transformations (e.g., augmentations) – id. at sec. 1, last paragraph), determining a first loss value characterizing a metric between a first feature vector of the first feature map and a … second feature vector[] of the second feature map (denoting the two output vectors [feature vectors] as p1 = h(f(x1)) [first feature vector] and z2 = f(x2) [second feature vector], their negative cosine similarity D(p1, z2) [metric] is minimized; a symmetrized loss L [first loss value] is defined as 0.5D(p1, z2) + 0.5D(p2, z1) – Chen, sec. 3, first paragraph) …, and training the neural network based on the first loss value (Chen Algorithm 1 shows that the loss L is backpropagated and the projection and prediction networks are updated via stochastic gradient descent).” Chen appears not to disclose explicitly the further limitations of the claim. However, Xu discloses “determining a … weighted sum of second feature vectors of the second feature map, wherein weights of the weighted sum are determined according to overlaps of a part of the training image characterized by the first feature vector with respect to parts of the training image characterized by the respective second feature vectors (generating a matte for an entire image may include stitching together matte patches by determining alpha values for pixels within overlapping regions; each predicted alpha value within an overlapping region may be the weighted sum of the alpha values [feature vectors] of pixels in the overlapping region where weights in the overlapping region are negatively proportional to the distance to the nearest non-overlapping neighbor [union of patches containing the overlapping region = part of training image characterized by the second feature vectors of the second feature map; patch containing the nearest non-overlapping region = part of training image 
characterized by first feature vector of the first feature map, so that the weights are determined according to overlaps insofar as they are determined by distance to non-overlapping neighbors] – Xu, paragraph 93) ….” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Chen to determine a weighted sum of feature vectors based on weights determined based on overlaps between regions, as disclosed by Xu, and an ordinary artisan could reasonably expect to have done so successfully. Doing so would increase the reliability of the network by utilizing information from multiple areas of the image in the training thereof. See Xu, paragraphs 2-4. Neither Chen nor Xu appears to disclose explicitly the further limitations of the claim. However, Li discloses “determining a control signal of an actuator, … comprising: determining the control signal based on an output signal of a neural network (using a combined neural network includes applying input data as input to the combined neural network model; generating output that is based on applying the input to the combined neural network model; and using the output to control one or more actuators of a robot – Li, col. 3, ll. 26-33) ….” Li and the instant application both relate to the use of neural networks to control actuators and are analogous. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Chen and Xu to use the output of a neural network to control an actuator, as disclosed by Li, and an ordinary artisan could reasonably expect to have done so successfully. Doing so would automate the actuation of the system, thereby reducing the need for human input to perform the actuation. See Li, col. 3, ll. 26-33. 
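The combination the rejection proposes, Chen's symmetrized negative cosine similarity applied between a first feature vector and an overlap-weighted sum of second feature vectors per Xu, can be sketched as follows. This is not either reference's actual code; the feature dimensions, the overlap values, and all names are hypothetical, and the weighting is simplified to fixed overlap fractions rather than Xu's distance-based scheme:

```python
import numpy as np

def neg_cosine(p, z):
    # D(p, z) in Chen: negative cosine similarity between two vectors.
    p = p / np.linalg.norm(p)
    z = z / np.linalg.norm(z)
    return -np.dot(p, z)

rng = np.random.default_rng(1)

# Hypothetical feature maps from two augmented views of one training image:
# 3 feature vectors of dimension 16 each.
first_map = rng.normal(size=(3, 16))
second_map = rng.normal(size=(3, 16))

# Xu-style weights: assumed proportional to the overlap of the image part
# behind the first feature vector with the parts behind the second vectors.
overlaps = np.array([0.6, 0.3, 0.1])
weights = overlaps / overlaps.sum()

# Weighted sums of feature vectors (the claimed comparison targets).
weighted_second = weights @ second_map
weighted_first = weights @ first_map

# Chen's symmetrized loss, L = 0.5*D(p1, z2) + 0.5*D(p2, z1), with the
# weighted sums standing in for the plain vectors of the other view.
p1, z2 = first_map[0], weighted_second
p2, z1 = second_map[0], weighted_first
loss = 0.5 * neg_cosine(p1, z2) + 0.5 * neg_cosine(p2, z1)
```

Because each negative cosine term lies in [-1, 1], the symmetrized loss does as well; minimizing it pulls the first feature vector toward the weighted combination of the second view's vectors.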
Regarding claim 11, Chen, as modified by Xu and Li, discloses that “the actuator is part of: (i) a robot or (ii) a manufacturing machine or (iii) an automated personal assistant or (iv) an access control system or (v) a surveillance system or (vi) an imaging system (using a combined neural network includes applying input data as input to the combined neural network model; generating output that is based on applying the input to the combined neural network model; and using the output to control one or more actuators of a robot – Li, col. 3, ll. 26-33).” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Chen and Xu to use the output of a neural network to control an actuator of a robot, as disclosed by Li, and an ordinary artisan could reasonably expect to have done so successfully. Doing so would automate the actuation of the robot, thereby reducing the need for human input to perform the actuation. See Li, col. 3, ll. 26-33. Response to Arguments Applicant's arguments filed December 5, 2025 (“Remarks”) have been fully considered but they are not persuasive. Applicant first argues that the claims are eligible under 35 USC § 101 because (a) Examiner has allegedly failed to address statements in the specification that link the claimed invention to technological improvements, (b) Examiner’s position is allegedly untenable in light of the Director’s recent precedential decision in Ex parte Desjardins, and (c) Examiner has allegedly failed to conduct the step 2B analysis properly because he did not provide Berkheimer evidence that the additional elements are well-understood, routine, and conventional. Remarks at 4-7. Regarding (a), Examiner would first note that, by Applicant’s own admission, the Park and Mixer decisions cited by Applicant are nonprecedential and do not bind Examiner in the instant case. 
Second, none of the MPEP sections Applicant cites indicates, as Applicant suggests, that a rejection under § 101 must affirmatively address the specification’s statements of purported improvements to technology. At most, the cited sections indicate that, when considering whether additional elements integrate a judicial exception into a practical application, the examiner must evaluate such statements. Examiner performed this evaluation and concluded that the sections of the specification relied upon by Applicant do not have a sufficient nexus to the claim language itself to suggest to the ordinary artisan that the claims are to an improvement in technology. The independent claims themselves recite determining a loss value between a feature vector of a first feature map and a weighted sum of feature vectors of a second feature map. They do not state that those feature maps are based solely on parts of the original images instead of the entire images. Therefore, an ordinary artisan would not read the claims as providing a system that “determine[s] more fine-grained similarities in the training image than simply the global image.” Moreover, the purported improvement cited by Applicant relates to limitations that form part of the judicial exception itself, namely, determining the feature maps and loss values. Examiner reminds Applicant that the judicial exception itself cannot provide the inventive concept. MPEP § 2106.05(I). Regarding (b), Desjardins is distinguishable from the instant application. As an initial matter, it is worth noting that Desjardins does not stand for the proposition that all claims directed to methods of training neural networks are per se eligible. 
Second, the claims at issue in Desjardins explicitly linked the steps recited to an improvement in training by reciting “optimiz[ing] performance of the machine learning model on the second machine learning task while protecting performance of the machine learning model on the first machine learning task”. As noted above, no such nexus exists in the instant claims. To the extent that training is recited at all, it is recited in the preamble in a location that is not entitled to patentable weight and at the very end of the claim at such a high level of generality that it amounts to a mere recitation of the field of use or technological environment in which the judicial exception is practiced. MPEP § 2106.05(h). Regarding (c), Applicant is misrepresenting the analysis at step 2B. Examiners are not required to prove that every additional element analyzed at step 2A, prong 2 is well-understood, routine, and conventional, but rather only those limitations that the examiner has identified as insignificant extra-solution activity. Here, since Examiner did not analyze any limitation of the independent claims as insignificant extra-solution activity, Berkheimer evidence is not needed. For limitations where such a showing is required (e.g., claim 5), the appropriate evidence was provided. Regarding the art rejection, Applicant argues that the Chen/Xu combination does not render the independent claims unpatentable because (d) Examiner has allegedly construed the claims unreasonably broadly by equating the alpha value of Xu to the claimed feature vectors, and (e) Examiner has allegedly not sufficiently explained how to alter the negative cosine similarity of Chen to incorporate the weighted sum of alpha values of Xu. Remarks at 8-11. Regarding (d), as an initial matter, the claim does not limit the dimensionality of the feature vector, so the term may fairly encompass a scalar such as an alpha value, which is merely a one-dimensional vector.
Second, Applicant points neither to the specification nor to a commonly accepted definition in the art to show what the broadest reasonable interpretation of “feature vector” is. Since the specification does not redefine it, Examiner has used the most commonly accepted definition in the art, i.e., a vector of values representing features of an input. Since, by Applicant’s own admission, the alpha values of Xu represent opacity of the image, they represent features of the input image and therefore qualify as feature vectors. Regarding (e), the test for obviousness is not whether the features of a secondary reference may be bodily incorporated into the structure of the primary reference; nor is it that the claimed invention must be expressly suggested in any one or all of the references. Rather, the test is what the combined teachings of the references would have suggested to those of ordinary skill in the art. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981). Here, the relevant question is why an ordinary artisan would have been motivated to modify Chen to determine a weighted sum of feature vectors whose weights are determined according to overlaps in the training image, as disclosed by Xu. As explained in the rejection itself, the motivation for doing so is to increase the reliability of the network by utilizing information from multiple areas of the input images during training. Applicant does not dispute this reasoning, but instead demands an explanation of how Xu can be bodily incorporated into Chen that is not required by the law of obviousness. Examiner declines the invitation to present this unnecessary showing. Conclusion THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. 
In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. Any inquiry concerning this communication or earlier communications from the examiner should be directed to RYAN C VAUGHN whose telephone number is (571)272-4849. The examiner can normally be reached M-R 7:00a-5:00p ET. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at 571-272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 
If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /RYAN C VAUGHN/ Primary Examiner, Art Unit 2125

Prosecution Timeline

Aug 22, 2022
Application Filed
Jun 03, 2025
Non-Final Rejection — §101, §103
Dec 05, 2025
Response Filed
Dec 15, 2025
Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602448
PROGRESSIVE NEURAL ORDINARY DIFFERENTIAL EQUATIONS
2y 5m to grant Granted Apr 14, 2026
Patent 12602610
CLASSIFICATION BASED ON IMBALANCED DATASET
2y 5m to grant Granted Apr 14, 2026
Patent 12561583
Systems and Methods for Machine Learning in Hyperbolic Space
2y 5m to grant Granted Feb 24, 2026
Patent 12541703
MULTITASKING SCHEME FOR QUANTUM COMPUTERS
2y 5m to grant Granted Feb 03, 2026
Patent 12511526
METHOD FOR PREDICTING A MOLECULAR STRUCTURE
2y 5m to grant Granted Dec 30, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

3-4
Expected OA Rounds
62%
Grant Probability
81%
With Interview (+19.4%)
3y 9m
Median Time to Grant
Moderate
PTA Risk
Based on 235 resolved cases by this examiner. Grant probability derived from career allow rate.
