Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Remarks
Claim Rejections – 35 U.S.C. 103
Applicant’s prior art arguments have been fully considered but they are not persuasive.
Applicant argues (pgs. 9-11) that Elhoushi does not teach the following limitation: “wherein at least the iterative block of the ANN is executed on at least one processor using machine commands that exclusively access an internal memory of the processor.”
Applicant argues that Examiner’s statement of “Elhoushi teaches that the convolution to calculate the parameters is modulated by bitwise-shift convolution rather than regular multiplication” does not explain how the bitwise operation of Elhoushi discloses the exclusive access.
Examiner disagrees. While the variables containing the values to be bitwise-shifted may be loaded from memory external to the processor, such as RAM, the bitwise operation itself is carried out using internal memory of the processor. These bitwise operations, which map to the claimed “machine commands,” are processed in the processor’s ALU, and their results are immediately stored in the CPU’s registers or cache, which are internal memories of the processor. In fact, Elhoushi states, “replacing multiplication with bitwise shift reduces energy. A common optimization in C++ compilers is to detect an integer multiplication with a (constant that is known at compilation time to be) a power of 2 and replace it with a bitwise shift.” (Page 7, Paragraph 1). Elhoushi thus describes the advantage of using a bitwise shift instead of multiplication in C++: bitwise operations such as AND, OR, and shifts are performed on the immediate memory (registers/cache) internal to the processor. Therefore, Elhoushi does indeed teach this limitation.
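For illustration only, and not as a characterization of code in Elhoushi, the following minimal C++ sketch shows the equivalence the above reasoning relies on: multiplying by a compile-time power of two and applying an explicit bitwise shift produce the same result, and the shift operates on a value held in a CPU register, i.e., memory internal to the processor.

    #include <cstdint>
    #include <iostream>

    // Multiplying by a constant power of two; a C++ compiler may emit a
    // single shift instruction for this expression.
    int32_t scale_by_multiply(int32_t x) { return x * 8; }

    // The explicit bitwise-shift form; the operand is shifted in a CPU
    // register, i.e., memory internal to the processor.
    int32_t scale_by_shift(int32_t x) { return x << 3; }

    int main() {
        std::cout << scale_by_multiply(5) << " " << scale_by_shift(5) << "\n";  // prints "40 40"
        return 0;
    }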
The foregoing applies to all independent claims and their dependent claims.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as "configured to" or "so that"; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step for”) in a claim with functional language creates a rebuttable presumption that the claim element is to be treated in accordance with 35 U.S.C. 112(f) (pre-AIA 35 U.S.C. 112, sixth paragraph). The presumption that 35 U.S.C. 112(f) (pre-AIA 35 U.S.C. 112, sixth paragraph) is invoked is rebutted when the function is recited with sufficient structure, material, or acts within the claim itself to entirely perform the recited function.
Absence of the word “means” (or “step for”) in a claim creates a rebuttable presumption that the claim element is not to be treated in accordance with 35 U.S.C. 112(f) (pre-AIA 35 U.S.C. 112, sixth paragraph). The presumption that 35 U.S.C. 112(f) (pre-AIA 35 U.S.C. 112, sixth paragraph) is not invoked is rebutted when the claim element recites function but fails to recite sufficiently definite structure, material or acts to perform that function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “control device” in claim 16.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
Claim Rejections – 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-10 and 12-15 are rejected under 35 U.S.C. 103 as being unpatentable over Kopuklu et al. (“Convolutional Neural Networks With Layer Reuse”), hereinafter Kopuklu, in view of Cha et al. (“Hierarchical Auxiliary Learning”), hereinafter Cha, and further in view of Elhoushi et al. (“DeepShift: Towards Multiplication-Less Neural Networks”), hereinafter Elhoushi.
Regarding independent claim 1, Kopuklu teaches:
A method for operating an artificial neural network (ANN), which processes inputs in a sequence of layers, to give outputs, the method comprising the following steps: defining, within the ANN, to be executed multiple times, at least one iterative block including one or more layers; (Kopuklu [Page 1, Column 2, Paragraph 2]: “Instead of stacking convolutional layers one after another, as in Fig. 1 (a), we feed the output of the convolutional blocks to itself for a given N times before passing the output to the next layer” Kopuklu teaches a convolutional block whose output is fed back into itself. This convolutional block is iterated N times and includes at least one layer of the neural network.)
defining number J of iterations, which is a maximum number of times the iterative block is to be executed; (Kopuklu [Page 2, Column 1, Paragraph 3]: “In contrast, we reuse the layers multiple times, and as we increase the number-of-reuse (N), convolutional filters also get more gradient updates” Kopuklu teaches that the layers that comprise the block are reused multiple times, namely N times. This is the maximum number of times that the block can be executed.)
mapping an input of the iterative block onto an output, by the iterative block; (Kopuklu [Page 3, Column 1, Table 1]: Kopuklu teaches that the MaxPool layers are after the output of the iterative blocks. Therefore, the output of the iterative block becomes the input to the MaxPool layers.)
supplying the output of the iterative block to the iterative block again as the input of the iterative block, which is, in turn, mapped onto a new output by the iterative block; (Kopuklu [Page 1, Column 2, Paragraph 2]: “Instead of stacking convolutional layers one after another, as in Fig. 1 (a), we feed the output of the convolutional blocks to itself for a given N times before passing the output to the next layer” Kopuklu teaches a convolutional block whose output is fed back into itself as its input. Kopuklu [Page 3, Column 1, Table 1]: Kopuklu teaches that the MaxPool layers are after the output of the iterative blocks. Therefore, the output of the iterative block becomes the input to the MaxPool layers.)
supplying by the iterative block, once the iterations of the iterative block have been completed, the output delivered by the iterative block, as an input, to a layer of the ANN succeeding the iterative block or as the output of the ANN; (Kopuklu [Page 3, Column 1, Table 1]: Kopuklu teaches that the MaxPool layers are after the output of the iterative blocks. Therefore, the output of the iterative block becomes the input to the MaxPool layers.)
…
wherein parameters that characterize a behavior of the layers in the iterative block are defined or modulated … (Kopuklu [Page 4, Column 2, Paragraph 3]: “This paper proposes a parameter reuse strategy, Layer Reuse (LRU), where convolutional layers of a CNN architecture (LruNet) are used repeatedly.” Kopuklu teaches that the parameters of the layers are reused repeatedly. This shows that the parameters are both defined and modulated: they are reused while the iterations of the block are repeated and are not reused otherwise.)
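For illustration only, the execution pattern that Kopuklu is relied upon for (an iterative block whose output is fed back as its own input J times before being passed to the succeeding layer) can be sketched as follows; this minimal C++ sketch is hypothetical, and iterative_block and next_layer are placeholder functions, not code taken from Kopuklu.

    #include <iostream>
    #include <vector>

    using Tensor = std::vector<float>;

    // Hypothetical stand-in for the one or more layers inside the iterative
    // block; the same weights are reused on every pass.
    Tensor iterative_block(const Tensor& in) {
        Tensor out;
        out.reserve(in.size());
        for (float v : in) {
            out.push_back(0.5f * v + 1.0f);  // toy "layer" computation
        }
        return out;
    }

    // Hypothetical stand-in for the layer succeeding the block (e.g., MaxPool).
    Tensor next_layer(const Tensor& in) { return in; }

    // The block is executed J times, its output fed back as its own input,
    // before the final output is supplied to the succeeding layer.
    Tensor run_with_layer_reuse(Tensor x, int J) {
        for (int i = 0; i < J; ++i) {
            x = iterative_block(x);
        }
        return next_layer(x);
    }

    int main() {
        Tensor result = run_with_layer_reuse(Tensor{1.0f, 2.0f}, 3);
        std::cout << result[0] << " " << result[1] << "\n";
        return 0;
    }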
Kopuklu does not explicitly teach:
… based on output of an auxiliary ANN that receives the following as inputs: the input of the iterative block, and/or a processing product that is formed in a layer of the ANN which is downstream of the iterative block, and/or the parameters that characterize the behavior of the layers of the iterative block, and/or a continuous index of the iteration currently being executed by the iterative block.
However, Cha teaches:
… based on output of an auxiliary ANN that receives the following as inputs: the input of the iterative block, and/or a processing product that is formed in a layer of the ANN which is downstream of the iterative block, and/or the parameters that characterize the behavior of the layers of the iterative block, and/or a continuous index of the iteration currently being executed by the iterative block. (Cha [Page 3, Paragraph 4]: “What the auxiliary block does is to take the output from any layer and to calculate an auxiliary score based on the output and superclass information; the auxiliary score then modifies the original output from the layer, which becomes a new input to the next layer.” Cha teaches that the auxiliary block, which, taken by itself, is a smaller ANN, receives the output from any layer. This shows that the input to the iterative block, which is an output of some layer, is received as the input to the auxiliary block. The output of the auxiliary block becomes the new input to the next layer (the iterative block), which shows that the parameters of the iterative block are modulated by the auxiliary block through the auxiliary score.)
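For illustration only, the modulation mechanism that Cha is relied upon for can be sketched as follows; this minimal C++ sketch is hypothetical (auxiliary_ann and the modulation rule are placeholders, not code taken from Cha): the auxiliary network receives the iterative block's input, and its score modulates the values that the block receives on the next pass.

    #include <iostream>
    #include <vector>

    using Tensor = std::vector<float>;

    // Hypothetical auxiliary ANN: maps the iterative block's input to an auxiliary score.
    float auxiliary_ann(const Tensor& block_input) {
        float s = 0.0f;
        for (float v : block_input) s += v;
        return s / static_cast<float>(block_input.size() + 1);  // toy score
    }

    // Hypothetical modulation: the auxiliary score scales the values supplied
    // to the iterative block on the next iteration.
    Tensor modulate_with_auxiliary_score(const Tensor& block_input) {
        float score = auxiliary_ann(block_input);
        Tensor modulated;
        modulated.reserve(block_input.size());
        for (float v : block_input) modulated.push_back(v * score);
        return modulated;
    }

    int main() {
        Tensor modulated = modulate_with_auxiliary_score(Tensor{1.0f, 2.0f, 3.0f});
        for (float v : modulated) std::cout << v << " ";  // each value scaled by the auxiliary score
        std::cout << "\n";
        return 0;
    }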
Kopuklu and Cha are in the same field of endeavor as the present invention, as the references are directed to reusing a block in a neural network and to using an auxiliary neural network to modulate parameters, respectively. It would have been obvious, before the effective filing date of the claimed invention, to a person of ordinary skill in the art to combine the reusable block executed over several iterations taught by Kopuklu with the auxiliary neural network that controls the parameters of the network taught by Cha. One of ordinary skill in the art would have been motivated to modify the teachings of Kopuklu to include the teachings of Cha because the combination reduces the space needed to store the neural network, since some blocks are iterative and reuse their layers, and has the potential benefit of speeding up training by delegating the modulation of the parameters to the auxiliary neural network.
Kopuklu and Cha do not explicitly teach:
wherein at least the iterative block of the ANN is executed on at least one processor using machine commands that exclusively access an internal memory of the processor.
However, Elhoushi teaches:
wherein at least the iterative block of the ANN is executed on at least one processor using machine commands that exclusively access an internal memory of the processor. (Elhoushi [Page 7, Column 2, Paragraph 1]: “weights between different layers of memory of a GPU or CPU.” Elhoushi teaches that the convolution to calculate the parameters is performed by bitwise-shift operations rather than regular multiplication. As explained in the Response to Remarks above, the fact that the operations are bitwise and primitive shows that they necessarily operate on the internal memory (registers/cache) of the processor.)
Elhoushi is in the same field as the present invention, since it is directed to using machine commands, namely bit shifts, to calculate the parameters of a neural network. It would have been obvious, before the effective filing date of the claimed invention, to a person of ordinary skill in the art to combine the reusable, iterative block in a neural network taught by Kopuklu as modified by Cha with the bit-wise shift calculations for the parameters taught by Elhoushi. One of ordinary skill in the art would have been motivated to modify the teachings of Kopuklu as modified by Cha to include the teachings of Elhoushi because the combination allows the parameters to be calculated much faster, as bit shifting is faster than floating-point multiplication. This has the potential benefit of speeding up the calculation of the parameters of the neural network, since the relatively time-intensive floating-point calculations can be replaced with much faster bit-wise shifts.
Regarding dependent claim 2, Kopuklu and Cha teach:
The method as recited in claim 1,
Cha teaches:
wherein new values of the parameters that characterize the behavior of the layers in the iterative block, and/or updates to the parameters that characterize the behavior of the layers in the iterative block are defined using a differentiable function of the output of the auxiliary ANN. (Cha [Page 4, Paragraph 2]: “Let … a be … the auxiliary score” Cha teaches that the auxiliary score is represented by a in the equation that follows. Cha [Page 4, Equation 4]: Cha teaches that the loss function is differentiable with respect to the auxiliary score, as the differential is calculated in Equation 4. This characterizes the behavior of the iterative block as it is based on the auxiliary score, which controls the behavior.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 3, Kopuklu and Cha teach:
The method as recited in claim 1,
Elhoushi teaches:
wherein at least one of the parameters that characterize the behavior of the layers in the iterative block is modulated by executing a machine command that is quicker to execute on a hardware platform used for this purpose than it would be to set the parameter to any desired value. (Elhoushi [Page 5, Column 2, Paragraph 3]: “Floating point (FP) multiplication consist of more steps such as adding the exponents, and normalizing. In terms of CPU cycles, taking 32-bit Intel Atom instruction processor as an example, integer and FP multiplication instruction take 5 to 6 clock cycles, while a bit-wise shift instruction takes only 1 clock cycle” Elhoushi teaches that setting the parameter to an arbitrary value requires an integer or floating-point multiplication taking 5 to 6 clock cycles, whereas the bit-wise shift instruction, which is a machine command, takes only 1 clock cycle and is therefore quicker to execute.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 4, Kopuklu, Cha, and Elhoushi teach:
The method as recited in claim 3,
Elhoushi teaches:
wherein the at least one of the parameters that characterize the behavior of the layers in the iterative block is modulated by incrementing, or decrementing, or bitwise offset. (Elhoushi [Page 2, Column 1, Paragraph 2]: “reduce computation and power budget of CNNs by replacing regular multiplication-based convolution and linear operations (a.k.a fully-connected layer or matrix multiplication) with bitwise-shift-based convolution and linear operations respectively.” Elhoushi teaches that multiplication-based convolution is replaced with bitwise-shift-based convolution; shifting the bits of a value in this manner is a bitwise offset.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 5, Kopuklu and Cha teach:
The method as recited in claim 1,
Kopuklu teaches:
wherein only a portion of the parameters that characterize the behavior of the layers in the iterative block are changed at the time of switching between the iterations for which the iterative block is executed. (Kopuklu [Page 3, Column 2, Table 4]: Kopuklu teaches that the total parameters that make up the neural network with iterative blocks (when new layers are added) are much greater in number than the parameters that are originally there. This shows that the parameters that are changed are only a portion of the parameters of the iterative block.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 6, Kopuklu and Cha teach:
The method as recited in claim 5,
Kopuklu teaches:
wherein, starting from at least one iteration, a proportion of between 1% and 20% of the parameters that characterize the behavior of the layers in the iterative block are changed at the time of switching to the next iteration. (Kopuklu [Page 3, Column 2, Table 4]: Kopuklu teaches that the total parameters that make up the neural network with iterative blocks (when new layers are added) are much greater in number than the parameters that are originally there. This shows that the parameters that are changed are only a portion of the parameters of the iterative block. The table shows that in the 115-depth network, 206k parameters are changed while there are 1562k total parameters, a proportion of approximately 13%.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 7, Kopuklu and Cha teach:
The method as recited in claim 6,
Kopuklu teaches:
wherein the proportion is between 1% and 15%. (Kopuklu [Page 3, Column 2, Table 4]: Kopuklu teaches that the total parameters that make up the neural network with iterative blocks (when new layers are added) are much greater in number than the parameters that are originally there. This shows that the parameters that are changed are only a portion of the parameters of the iterative block. The table shows that in the 115-depth network, 206k parameters are changed while there are 1562k total parameters, a proportion of approximately 13%.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 8, Kopuklu and Cha teach:
The method as recited in claim 5,
Kopuklu teaches:
wherein, at the time of a first switch between iterations, a first portion of the parameters that characterize the behavior of the layers in the iterative block is changed, and at the time of a second switch between iterations a second portion of the parameters that characterize the behavior of the layers in the iterative block is changed, the second portion not being congruent with the first portion. (Kopuklu [Page 3, Column 2, Table 4]: Kopuklu teaches that the total parameters that make up the neural network with iterative blocks (when new layers are added) are much greater in number than the parameters that are originally there. This shows that the parameters that are changed are only a portion of the parameters of the iterative block. The second portion, which is not part of these parameters, is different and is thus not congruent with the first portion.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 9, Kopuklu and Cha teach:
The method as recited in claim 1,
Kopuklu teaches:
wherein the ANN is selected which at first processes inputs using a plurality of convolution layers, and which, from a result obtained, using at least one further layer determines at least one classification score relating to a specified classification as the output, the iterative block being defined such that the iterative block includes at least a portion of the convolution layers. (Kopuklu [Page 4, Column 1, Paragraph 2]: “We again analyzed the effect of LRU on the performance, and 14-LruNet-2x achieves 5.14% better classification accuracy than 1-LruNet-2x.” Kopuklu teaches that the convolution is performed to determine a classification score (as demonstrated by the percent increase in accuracy) and the classification as output. Kopuklu [Page 1, Column 2, Paragraph 2]: “Instead of stacking convolutional layers one after another, as in Fig. 1 (a), we feed the output of the convolutional blocks to itself for a given N times before passing the output to the next layer” Kopuklu teaches a convolutional block whose output is fed back into itself. This convolutional block is iterated N times and includes at least one layer of the neural network.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 10, Kopuklu and Cha teach:
The method as recited in claim 9,
Kopuklu teaches:
wherein image data and/or time series data are selected as the inputs of the ANN. (Kopuklu [Page 4, Column 1, Paragraph 3]: “There are 50k training and 10k testing images in grayscale for 10 classes with image resolution of 28x28.” Kopuklu teaches that image data is used for training and testing the ANN.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 12, Kopuklu, Cha, and Elhoushi teach:
The method as recited in claim 11,
Elhoushi teaches:
wherein the processor is a CPU or an FPGA, and wherein the internal memory is a register memory and/or a cache memory. (Elhoushi [Page 5, Column 2, Paragraph 3]: “In terms of CPU cycles, taking 32-bit Intel Atom instruction processor as an example, integer and FP multiplication instruction take 5 to 6 clock cycles, while a bit-wise shift instruction takes only 1 clock cycle” Elhoushi teaches that the processor is an Intel CPU. Elhoushi [Page 7, Column 2, Paragraph 1]: “weights between different layers of memory of a GPU or CPU.” Elhoushi teaches that the memory is of a GPU or CPU, which may be cache memory.)
The reasons to combine are substantially similar to those of claim 1.
Claim 13 is substantially similar to claim 1, but has the following additional elements:
Regarding independent claim 13, Kopuklu and Cha teach:
… the method for training comprising: providing learning inputs, and associated learning outputs onto which the ANN is to map the respective learning inputs; (Kopuklu [Page 4, Column 1, Paragraph 3]: “There are 50k training and 10k testing images in grayscale for 10 classes with image resolution of 28x28.” Kopuklu teaches that image data is used for training and testing the ANN, as inputs. Kopuklu [Page 4, Column 1, Paragraph 2]: “We again analyzed the effect of LRU on the performance, and 14-LruNet-2x achieves 5.14% better classification accuracy than 1-LruNet-2x.” Kopuklu teaches that the convolution is performed to determine a classification score (as demonstrated by the percent increase in accuracy) and the classification as output. This is mapped to the input, as the classification prediction corresponds to the image.)
mapping the learning inputs onto outputs by the ANN; (Kopuklu [Page 4, Column 1, Paragraph 3]: “There are 50k training and 10k testing images in grayscale for 10 classes with image resolution of 28x28.” Kopuklu teaches that image data is used for training and testing the ANN, as inputs. Kopuklu [Page 4, Column 1, Paragraph 2]: “We again analyzed the effect of LRU on the performance, and 14-LruNet-2x achieves 5.14% better classification accuracy than 1-LruNet-2x.” Kopuklu teaches that the convolution is performed to determine a classification score (as demonstrated by the percent increase in accuracy) and the classification as output. This is mapped to the input, as the classification prediction corresponds to the image.)
assessing a discrepancy between the outputs and the learning outputs, using a specified cost function; (Kopuklu [Page 4, Column 1, Paragraph 2]: “We again analyzed the effect of LRU on the performance, and 14-LruNet-2x achieves 5.14% better classification accuracy than 1-LruNet-2x.” Kopuklu teaches that the convolution is performed to determine a classification score (as demonstrated by the percent increase in accuracy) and the classification as output. That there is a classification accuracy percentage shows that the discrepancy is quantified – and since it is determined using training, it necessarily involves a cost function.)
and optimizing the parameters that characterize the behavior of the layers in the iterative block, including their changes at the time of switching between iterations, … such that, on further processing of learning inputs by the ANN, the assessment by the cost function is expected to have improved. (Kopuklu [Page 4, Column 1, Paragraph 2]: “We again analyzed the effect of LRU on the performance, and 14-LruNet-2x achieves 5.14% better classification accuracy than 1-LruNet-2x.” Kopuklu teaches that the convolution is performed to determine a classification score (as demonstrated by the percent increase in accuracy) and the classification as output. The fact that the experiment compares the classification accuracies of different networks, with and without the iterative block, shows that training improves the assessment by the cost function, as the comparison reflects a positive change in classification accuracy.)
… and/or parameters that characterize a behavior of the auxiliary ANN, … (Cha [Page 4, Paragraph 2]: “Let … a be … the auxiliary score” Cha teaches that the auxiliary score is represented by a in the equation that follows. Cha [Page 4, Equation 4]: Cha teaches that the loss function is differentiable with respect to the auxiliary score, as the differential is calculated in Equation 4. This shows that the auxiliary score characterizes the behavior of the auxiliary ANN and, through it, the iterative block, and that it can be optimized because the loss function is differentiable with respect to it.)
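For illustration only, the recited training steps (providing learning inputs and associated learning outputs, mapping the inputs onto outputs, assessing the discrepancy with a cost function, and optimizing a parameter so the assessment improves) can be sketched on a toy one-parameter model; this minimal C++ sketch is hypothetical and is not code taken from Kopuklu or Cha.

    #include <cstddef>
    #include <cstdio>
    #include <vector>

    int main() {
        // Learning inputs and the associated learning outputs the model should map them to.
        std::vector<float> learning_inputs  = {1.0f, 2.0f, 3.0f};
        std::vector<float> learning_outputs = {2.0f, 4.0f, 6.0f};
        float w = 0.5f;                // parameter characterizing the (toy) layer y = w * x
        float learning_rate = 0.05f;
        for (int epoch = 0; epoch < 100; ++epoch) {
            float grad = 0.0f;
            for (std::size_t i = 0; i < learning_inputs.size(); ++i) {
                float out = w * learning_inputs[i];         // map learning input onto an output
                float err = out - learning_outputs[i];      // discrepancy assessed by the cost function
                grad += 2.0f * err * learning_inputs[i];    // gradient of the squared-error cost
            }
            // Optimize the parameter so the assessment by the cost function improves.
            w -= learning_rate * grad / static_cast<float>(learning_inputs.size());
        }
        std::printf("trained parameter w = %f\n", w);  // approaches 2.0
        return 0;
    }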
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 14, Kopuklu and Cha teach:
The method as recited in claim 13,
Elhoushi teaches:
wherein the cost function includes a contribution that depends on the number of parameters that is changed at the time of switching between iterations, and/or a rate of change of the changed parameters, and/or an absolute or relative change over all parameters. (Elhoushi [Page 4, Column 1, Equation 12]: Elhoushi teaches that the backward-pass gradient is the derivative of the cost function with respect to the weights. This shows that the cost function includes a contribution that depends on the parameters and on how they change.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 15, Kopuklu and Cha teach:
The method as recited in claim 13,
Kopuklu teaches:
wherein, at the same time as and/or in alternation with the parameters that characterize the behavior of the layers in the iterative block, further parameters which characterize the behavior of further neurons and/or other processing units of the ANN outside the iterative block, are also optimized to an assessment by the cost function that is expected to be better. (Kopuklu [Page 4, Column 1, Paragraph 2]: “We again analyzed the effect of LRU on the performance, and 14-LruNet-2x achieves 5.14% better classification accuracy than 1-LruNet-2x.” Kopuklu teaches that the convolution is performed to determine a classification score (as demonstrated by the percent increase in accuracy) and the classification as output. The fact that the experiment compares the classification accuracies of different networks, with and without the iterative block, shows that training improves the assessment by the cost function, as the comparison reflects a positive change in classification accuracy.)
The reasons to combine are substantially similar to those of claim 1.
Claims 16-18 are rejected under 35 U.S.C. 103 as being unpatentable over Kopuklu in view of Cha, and further in view of Hayat et al. (US 20210101616 A1), hereinafter Hayat.
Claim 16 is substantially similar to claim 1, but has the following additional elements:
Regarding independent claim 16, Kopuklu and Cha teach the elements that claim 16 shares with claim 1. Kopuklu and Cha do not explicitly teach:
A control device for a vehicle, comprising: an input interface configured to be connected to one or more sensors of the vehicle;
an output interface configured to be connected to one or more actuators of the vehicle;
and an artificial neural network (ANN), the ANN taking part in processing of measurement data obtained over the input interface by the one or more sensors to give a control signal for the output interface;
However, Hayat teaches:
A control device for a vehicle, comprising: an input interface configured to be connected to one or more sensors of the vehicle; (Hayat [¶ 0071]: “video inputs for receiving image data from multiple image sensors … system 100 may be included on a vehicle 200” Hayat teaches a video input for receiving images from an image sensor from a vehicle.)
an output interface configured to be connected to one or more actuators of the vehicle; (Hayat [¶ 0157]: “For example, the gain of the heading error tracking control loop may depend on … a steering actuator loop” Hayat teaches that the gain of the heading error, which is the output, is connected to the steering actuator loop.)
and an artificial neural network (ANN), the ANN taking part in processing of measurement data obtained over the input interface by the one or more sensors to give a control signal for the output interface; (Hayat [¶ 0139]: “any of the modules (e.g., modules 402, 404, and 406) disclosed herein may implement techniques associated with a trained system (such as a neural network or a deep neural network)” Hayat teaches that a neural network may be implemented in the modules that provide the processing leading to the output interface.)
Hayat is in the same field as the present invention, since it is directed to using a neural network to process data from a vehicle's sensors and to supply the result to the vehicle. It would have been obvious, before the effective filing date of the claimed invention, to a person of ordinary skill in the art to combine the neural network with iterative, reusable blocks taught by Kopuklu as modified by Cha with the vehicle that obtains input data through its sensors and acts on the output of the neural network as taught by Hayat. One of ordinary skill in the art would have been motivated to modify the teachings of Kopuklu as modified by Cha to include the teachings of Hayat because the combination allows neural networks containing reusable blocks to be used in vehicles, including autonomous vehicles. This has the potential benefit of making the vehicle's neural-network computations more efficient, because the network benefits from the optimization of having reusable, iterative blocks.
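For illustration only, the arrangement that Hayat is relied upon for (an input interface to vehicle sensors, an ANN taking part in processing the measurement data, and an output interface to an actuator) can be sketched as follows; this minimal C++ sketch is hypothetical (read_sensor, ann_forward, and write_actuator are placeholders) and is not code taken from Hayat.

    #include <iostream>
    #include <vector>

    using Tensor = std::vector<float>;

    // Hypothetical input interface: obtains measurement data from a vehicle sensor.
    Tensor read_sensor() { return Tensor{0.1f, 0.2f, 0.3f}; }

    // Hypothetical ANN taking part in processing the measurement data.
    float ann_forward(const Tensor& measurements) {
        float control = 0.0f;
        for (float m : measurements) control += m;  // toy network
        return control;
    }

    // Hypothetical output interface: passes the control signal to an actuator.
    void write_actuator(float control_signal) {
        std::cout << "actuator command: " << control_signal << "\n";
    }

    int main() {
        Tensor measurements = read_sensor();                 // input interface to the sensors
        float control_signal = ann_forward(measurements);    // ANN processing
        write_actuator(control_signal);                      // output interface to the actuators
        return 0;
    }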
Claim 17 is substantially similar to claim 1, but has the following additional elements:
Regarding independent claim 17, Kopuklu and Cha teach the elements that claim 17 shares with claim 1. Kopuklu and Cha do not explicitly teach:
A non-transitory machine-readable data carrier on which is stored a computer program for operating an artificial neural network (ANN), which processes inputs in a sequence of layers, to give outputs, the computer program, when executed by one or more computers, causing the one or more computers to perform the following steps:
However, Hayat teaches:
A non-transitory machine-readable data carrier on which is stored a computer program for operating an artificial neural network (ANN), which processes inputs in a sequence of layers, to give outputs, the computer program, when executed by one or more computers, causing the one or more computers to perform the following steps: (Hayat [¶ 0287]: “non-transitory memory … may be configured to store data, such as computer codes or instructions executable by a processor” Hayat [¶ 0139]: “any of the modules (e.g., modules 402, 404, and 406) disclosed herein may implement techniques associated with a trained system (such as a neural network or a deep neural network)” Hayat teaches that the non-transitory memory that is used for the computations regarding neural networks stores the instructions for the processors to perform.)
The reasons to combine are substantially similar to those of claim 16.
Claim 18 is substantially similar to claim 1, but has the following additional elements:
Regarding independent claim 18, Kopuklu and Cha teach the elements that claim 18 shares with claim 1. Kopuklu and Cha do not explicitly teach:
A computer configured to operate an artificial neural network (ANN), which processes inputs in a sequence of layers, to give outputs, the computer configured to:
However, Hayat teaches:
A computer configured to operate an artificial neural network (ANN), which processes inputs in a sequence of layers, to give outputs, the computer configured to: (Hayat [¶ 0287]: “non-transitory memory … may be configured to store data, such as computer codes or instructions executable by a processor” Hayat [¶ 0139]: “any of the modules (e.g., modules 402, 404, and 406) disclosed herein may implement techniques associated with a trained system (such as a neural network or a deep neural network)” Hayat teaches that the non-transitory memory that is used for the computations regarding neural networks stores the instructions for the processors to perform.)
The reasons to combine are substantially similar to those of claim 16.
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KYU HYUNG HAN whose telephone number is (703) 756-5529. The examiner can normally be reached on M-F 9-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov can be reached on (571) 270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Kyu Hyung Han/
Examiner
Art Unit 2123
/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123