Prosecution Insights
Last updated: April 19, 2026
Application No. 17/201,768

PRUNING NEURAL NETWORKS

Status: Final Rejection (§101, §102, §103)
Filed: Mar 15, 2021
Examiner: HOANG, AMY P
Art Unit: 2143
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Nvidia Corporation
OA Round: 4 (Final)

Grant Probability: 70% (Favorable)
Predicted OA Rounds: 5-6
Predicted Time to Grant: 3y 3m
Grant Probability with Interview: 99%

Examiner Intelligence

Career Allow Rate: 70%, above average (163 granted / 232 resolved; +15.3% vs Tech Center average)
Interview Lift: +64.2% among resolved cases with interview
Typical Timeline: 3y 3m average prosecution; 31 applications currently pending
Career History: 263 total applications across all art units

Statute-Specific Performance

§101: 15.9% (-24.1% vs TC avg)
§102: 17.0% (-23.0% vs TC avg)
§103: 46.0% (+6.0% vs TC avg)
§112: 13.4% (-26.6% vs TC avg)

Tech Center averages are estimates; based on career data from 232 resolved cases.

Office Action

Rejections under 35 U.S.C. §§ 101, 102 and 103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement

The information disclosure statements submitted on 06/02/2025, 08/28/2025 and 11/26/2025 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Response to Amendment

The Amendment filed on 10/03/2025 has been entered. Claims 1-21 and 29-34 remain pending in the application.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-21 and 29-34 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1: In the instant case, the claims are directed to a processor (claims 1-7 and 29-34), a machine-readable medium (claims 8-14) and a system (claims 15-21). Therefore, the claims are eligible under Step 1 for being directed to a machine and a manufacture.

Step 2A Prong 1: The claim(s) recite(s):

Independent claims 1, 8, 15 and 29: train a neural network, at least in part, by:

calculate, for a sub-network of a neural network, a stability value that indicates a measure of potential change to parameters of a plurality of nodes of the sub-network during one or more subsequent updates of the neural network - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical calculation with calculating a value that indicates a measure of potential change to parameters to at least in part train a neural network.

determine that the sub-network is stable according to a determination that the stability value satisfies one or more criteria - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical relationship with determining that the stability value satisfies one or more criteria.

based on the determination that the sub-network is stable, remove one or more nodes of the neural network other than the plurality of nodes - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data when the sub-network is stable, selecting data based on judgment and removing nodes based on the determination that the sub-network is stable, which is observing, evaluating and judging that is practically capable of being performed in the human mind with the assistance of pen and paper.
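For orientation only, the following is a minimal Python sketch of the stability-gated pruning recited in the independent claims. The mean-absolute-change stability measure, the threshold criterion, and every name below (stability_value, prune_if_stable, sub_mask) are illustrative assumptions, not the applicant's disclosed algorithm or the examiner's construction.

    import numpy as np

    def stability_value(prev_params: np.ndarray, curr_params: np.ndarray) -> float:
        """A measure of potential change to a sub-network's parameters across
        updates: here, mean absolute parameter change (smaller = more stable)."""
        return float(np.mean(np.abs(curr_params - prev_params)))

    def prune_if_stable(weights: np.ndarray, sub_mask: np.ndarray,
                        prev_weights: np.ndarray, threshold: float = 1e-3) -> np.ndarray:
        """If the masked sub-network is stable, zero every node outside it
        ("remove one or more nodes ... other than the plurality of nodes")."""
        value = stability_value(prev_weights[sub_mask], weights[sub_mask])
        if value < threshold:  # the stability value satisfies the criterion
            return np.where(sub_mask, weights, 0.0)
        return weights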
Dependent Claim 2: Identify the sub-network by at least: calculating a set of importance scores for the plurality of nodes of the sub-network - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical calculation with calculating a set of scores to identify the sub-network; determining that the set of importance scores for the plurality of nodes of the sub-network satisfies one or more other criteria - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical relationship with determining that the set of importance scores satisfies one or more other criteria.

Dependent Claim 3: wherein the one or more criteria comprises the value being above a pre-defined threshold - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical relationship wherein the one or more criteria comprises the value being above a pre-defined threshold.

Dependent Claim 4: wherein the set of importance scores are based on a magnitude-based criterion - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical relationship relating the set of importance scores to a magnitude-based criterion.

Dependent Claim 5: wherein the sub-network is determined based at least in part on maximum values of the set of importance scores - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical relationship with determining the sub-networks based on maximum values relationship of a set of scores.

Dependent Claim 6: wherein the value is determined based on normalized differences between different sub-networks of the neural network - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical calculation with determining the value based on normalized differences between sub-networks.

Dependent Claim 7: wherein the normalized differences are based on differences between numbers of nodes of layers of the different sub-networks of the neural network - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical relationship wherein the normalized differences are based on differences between numbers of nodes of layers of the different sub-networks of the neural network.

Dependent Claim 9: determine a set of sub-networks of the neural network - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mentally performable process with determining a set of sub-networks; determine a set of values corresponding to the set of sub-networks - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical calculation with determining a set of values; and select the sub-network based at least in part on the set of values - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data, which is observing, evaluating and judging that is practically capable of being performed in the human mind with the assistance of pen and paper.
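As a concrete and purely hypothetical reading of the importance-score limitations of claims 2-5: a magnitude-based criterion can be as simple as an L1 norm per node, with the sub-network taken as the nodes whose scores exceed a pre-defined threshold. Neither choice is fixed by the claims.

    import numpy as np

    def importance_scores(layer_weights: np.ndarray) -> np.ndarray:
        """Magnitude-based criterion: score each node (row) by the L1 norm of its weights."""
        return np.abs(layer_weights).sum(axis=1)

    def identify_sub_network(layer_weights: np.ndarray, threshold: float) -> np.ndarray:
        """Indices of nodes whose importance score is above a pre-defined threshold (claim 3)."""
        return np.flatnonzero(importance_scores(layer_weights) > threshold)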
Dependent Claim 10: wherein the set of sub-networks are determined based on gradient-based criterion - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical calculation with determining the set of sub-networks based on a criterion.

Dependent Claim 11: perform one or more pruning processes on the neural network to obtain a second neural network that matches the sub-network - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mentally performable process with aid of pen and paper to perform a pruning process on the observed neural network to obtain a second neural network that is judged to match the sub-network.

Dependent Claim 12: wherein a first value of the set of values is determined based at least in part on a difference between numbers of neurons of a layer of a first sub-network and a corresponding layer of a second sub-network - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical calculation with determining a first value of the set of values based on a difference between numbers of neurons of a layer of a first sub-network and a corresponding layer of a second sub-network.

Dependent Claim 13: wherein the sub-network corresponds to a value of the set of values that is greater than at least one or more other values of the set of values - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical relationship wherein the sub-network corresponds to a value of the set of values that is greater than at least one or more other values of the set of values.

Dependent Claim 16: determine a first set of importance scores for nodes of the neural network for a first training epoch - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical calculation with determining a set of scores; determine a first sub-network of the neural network based on the first set of importance scores - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical relationship relating a first sub-network with the first set of importance scores; and calculate a first value for the first sub-network - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical calculation with calculating a value.
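The "normalized differences" of claims 6-7 and 12 compare sub-networks on per-layer node counts. A toy formula, under the assumption (not dictated by the claims) that each layer difference is normalized by the larger of the two layers:

    def normalized_layer_difference(sizes_a: list[int], sizes_b: list[int]) -> float:
        """Mean normalized difference between numbers of nodes of corresponding layers."""
        diffs = [abs(a - b) / max(a, b, 1) for a, b in zip(sizes_a, sizes_b)]
        return sum(diffs) / len(diffs)

    # e.g. normalized_layer_difference([128, 64, 10], [96, 64, 10]) ~= 0.083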
Dependent Claim 17: determine a second set of importance scores for the nodes of the neural network for a second training epoch - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical calculation with determining a set of importance scores; determine a second sub-network of the neural network based on the second set of importance scores - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical relationship relating a second sub-network with the second set of importance scores; and calculate a second value for the second sub-network - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical calculation with calculating a value.

Dependent Claim 18: wherein the second value is calculated based on differences between the second sub-network and the first sub-network - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical calculation with calculating a value.

Dependent Claim 19: compare the first value with the second value - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical relationship with comparing first and second values.

Dependent Claim 20: remove the one or more nodes of the neural network by at least, as a result of determining that the second value is greater than the first value and one or more values for one or more sub-networks - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical relationship relating removal of the one or more nodes of the neural network to the second value being greater than the first value and one or more values for one or more sub-networks.

Dependent Claim 21: perform one or more training processes on the pruned neural network using one or more gradient descent algorithms - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical calculation with performing training processes using algorithms.

Dependent Claim 30: determine a first set of metric values for the neural network - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical calculation with determining values; determine a second set of metric values for the neural network - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical calculation with determining values; and compare the second set of metric values and the first set of metric values to determine a metric value of the second set of metric values - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical relationship comparatively relating the second set of metric values and the first set of metric values to determine a metric value of the second set of metric values.
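A minimal sketch of the per-epoch limitations of claims 16-20, assuming (again, only for illustration) that each epoch's sub-network is the top-k nodes by importance score and that a sub-network's "value" is its overlap with the previous epoch's selection:

    import numpy as np

    def epoch_sub_network(layer_weights: np.ndarray, k: int) -> np.ndarray:
        """Sub-network for one training epoch: the k highest-scoring nodes."""
        scores = np.abs(layer_weights).sum(axis=1)  # importance scores for this epoch
        return np.sort(np.argsort(scores)[-k:])

    def sub_network_value(prev_idx: np.ndarray, curr_idx: np.ndarray) -> float:
        """Fraction of the current sub-network already present in the previous one;
        a higher second value suggests the selection has settled between epochs."""
        return len(np.intersect1d(prev_idx, curr_idx)) / len(curr_idx)

Comparing the first and second values (claim 19) and pruning when the second value is the greater (claim 20) would then be a one-line comparison on top of these helpers.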
Dependent Claim 31: wherein the metric value is greater than one or more metric values of the first set of metric values and the second set of metric values - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of a mathematical relationship wherein the metric value is greater than one or more metric values of the first set of metric values and the second set of metric values.

Dependent Claim 32: remove the one or more nodes of the neural network based at least in part on the metric value to obtain the sub-network corresponding to the metric value - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept of removing the one or more nodes based on a mathematical relationship to the metric value and to obtain the sub-network corresponding to the metric value.

Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claims recite the following additional elements:

Independent claims 1, 8, 15 and 29: A processor, one or more circuits; a machine-readable medium having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to - these limitations amount to components of a general purpose computer that applies a judicial exception, by use of conventional computer functions (see MPEP § 2106.05(b)). A system, comprising: one or more computers having one or more processors to - these limitations amount to components of a general purpose computer that applies a judicial exception, by use of conventional computer functions (see MPEP § 2106.05(b)).

Independent claim 29: use a neural network to infer information from one or more inputs, wherein the neural network is trained, at least in part, by - these limitations are mere instructions to implement a judicial exception with using a neural network to infer information from one or more inputs, wherein the neural network is trained, but fail to recite details of a specific training technique, e.g. via supervised/unsupervised/reinforcement training/etc., and no description of the mechanism for the neural network, e.g. convolutional neural network/recurrent neural network/etc.; thus these additional elements do not integrate a judicial exception into a practical application (see MPEP 2106.05(f)).

Dependent Claims 2, 9, 11, 16-17, 19-21, 30, 32: wherein the one or more circuits are to; wherein the set of instructions further include instructions, which if performed by the one or more processors, cause the one or more processors to; wherein the one or more processors are further to - these limitations amount to components of a general purpose computer that applies a judicial exception, by use of conventional computer functions (see MPEP § 2106.05(b)).

Dependent Claim 14: wherein the neural network is an image processing neural network of one or more vehicle systems - this limitation is recited at a high level of generality and can be viewed as nothing more than an attempt to generally link the use of the judicial exception, determining a set of sub-networks of the neural network, to the field of use or technological environment wherein the neural network is an image processing neural network of one or more vehicle systems (see MPEP 2106.05(h)).
Dependent Claim 33: wherein the one or more inputs comprise one or more images - this limitation is recited at a high level of generality and can be viewed as nothing more than an attempt to generally link the use of the judicial exception, calculating a value for a sub-network of the neural network, to the field of use or technological environment wherein the one or more inputs comprise one or more images (see MPEP 2106.05(h)).

Dependent Claim 34: wherein the processor is part of one or more edge devices - this limitation is recited at a high level of generality and can be viewed as nothing more than an attempt to generally link the use of the judicial exception, calculating a value for a sub-network of the neural network, to the field of use or technological environment wherein the processor is part of one or more edge devices (see MPEP 2106.05(h)).

Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims are thus directed to the abstract idea.

Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception. The additional elements are:

Independent claims 1, 8, 15 and 29: A processor, one or more circuits; a machine-readable medium having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to - these limitations amount to components of a general purpose computer that applies a judicial exception, by use of conventional computer functions (see MPEP § 2106.05(b)). A system, comprising: one or more computers having one or more processors to - these limitations amount to components of a general purpose computer that applies a judicial exception, by use of conventional computer functions (see MPEP § 2106.05(b)).

Independent claim 29: use a neural network to infer information from one or more inputs, wherein the neural network is trained, at least in part, by - these limitations are mere instructions to implement a judicial exception with using a neural network to infer information from one or more inputs, wherein the neural network is trained, but fail to recite details of a specific training technique, e.g. via supervised/unsupervised/reinforcement training/etc., and no description of the mechanism for the neural network, e.g. convolutional neural network/recurrent neural network/etc.; thus these additional elements do not integrate a judicial exception into a practical application (see MPEP 2106.05(f)).

Dependent Claims 2, 9, 11, 16-17, 19-21, 30, 32: wherein the one or more circuits are to; wherein the set of instructions further include instructions, which if performed by the one or more processors, cause the one or more processors to; wherein the one or more processors are further to - these limitations amount to components of a general purpose computer that applies a judicial exception, by use of conventional computer functions (see MPEP § 2106.05(b)).
Dependent Claim 14: wherein the neural network is an image processing neural network of one or more vehicle systems - this limitation is recited at a high level of generality and can be viewed as nothing more than an attempt to generally link the use of the judicial exception, determining a set of sub-networks of the neural network, to the field of use or technological environment wherein the neural network is an image processing neural network of one or more vehicle systems (see MPEP 2106.05(h)).

Dependent Claim 33: wherein the one or more inputs comprise one or more images - this limitation is recited at a high level of generality and can be viewed as nothing more than an attempt to generally link the use of the judicial exception, calculating a value for a sub-network of the neural network, to the field of use or technological environment wherein the one or more inputs comprise one or more images (see MPEP 2106.05(h)).

Dependent Claim 34: wherein the processor is part of one or more edge devices - this limitation is recited at a high level of generality and can be viewed as nothing more than an attempt to generally link the use of the judicial exception, calculating a value for a sub-network of the neural network, to the field of use or technological environment wherein the processor is part of one or more edge devices (see MPEP 2106.05(h)).

Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action: A person shall be entitled to a patent unless - (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-4, 8-9, 11-13, 15-17, 29 and 34 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Yu et al. (hereinafter Yu), US 20180373978 A1.

Regarding independent claim 1, Yu discloses a processor (Fig. 1, 110; [0014]), comprising: one or more circuits ([0090] Referring again to FIG. 6, processor 620 may comprise one or more circuits, such as digital circuits, to perform at least a portion of a computing procedure and/or process) to: calculate, for a sub-network of a neural network, a stability value that indicates a measure of potential change to parameters of a plurality of nodes of the sub-network during one or more subsequent updates of the neural network ([0017] importance parameters may be determined for respective weight parameters and/or groups of weight parameters. Weight parameters and/or groups of weight parameters having importance parameters below a specified threshold value may be removed, in an embodiment. As utilized herein, the term "importance parameter" refers to a parameter for a respective weight parameter and/or a respective group of weight parameters that may be compared against a threshold value, such as to determine whether a weight parameter and/or group of weight parameters are to be removed; Fig. 3; [0026] In an embodiment, signals and/or states representative of a set of neural network weight parameters, such as set of neural network weight parameters 310, may be processed to group the set of neural network weight parameters into a plurality of weight groups (i.e. sub-network of a neural network) according to a specified group size, such as depicted at block 322, for example; [0027] pruning groups of neural network weight parameters may include calculating importance parameters (i.e. calculate, for a sub-network of a neural network, a stability value) for respective groups of neural network weight parameters. For example, calculating importance parameters for respective groups of neural network weight parameters may include calculating root-mean-square (RMS) parameters for the respective groups, for example; [0028] Further, as depicted at block 326, for example, a grouped and/or pruned set of network weight parameters may be retrained, for example to maintain a desired and/or specified level of accuracy within the neural network model (i.e. a stability value that indicates a measure of potential change to parameters)); determine that the sub-network is stable according to a determination that the stability value satisfies one or more criteria ([0027] pruning groups of neural network weight parameters may include removing one or more groups determined to have an importance parameter below a specified threshold value (Note: Examiner interprets the claimed one or more criteria as groups of neural network weight parameters with importance parameter above a specified threshold value)); and based on the determination that the sub-network is stable, remove one or more nodes of the neural network other than the plurality of nodes ([0027] groups of neural network weight parameters may be pruned, as depicted at block 324 … pruning groups of neural network weight parameters may include removing one or more groups determined to have an importance parameter below a specified threshold value (Note: Examiner notes that groups of neural network weight parameters with importance parameter below a specified threshold value are removed and groups of neural network weight parameters with importance parameter above a specified threshold value are not removed)).

Regarding dependent claim 2, Yu discloses all the limitations as set forth in the rejection of claim 1 that is incorporated. Yu further discloses wherein the one or more circuits are to identify the sub-network by at least: calculating a set of importance scores for the plurality of nodes of the sub-network ([0027] pruning groups of neural network weight parameters may include calculating importance parameters (i.e. a set of importance scores) for respective groups of neural network weight parameters); determining that the set of importance scores for the plurality of nodes of the sub-network satisfies one or more other criteria ([0027] pruning groups of neural network weight parameters may include removing one or more groups determined to have an importance parameter below a specified threshold value. (Note: Examiner interprets the claimed one or more criteria as groups of neural network weight parameters with importance parameter above a specified threshold value)).

Regarding dependent claim 3, Yu discloses all the limitations as set forth in the rejection of claim 2 that is incorporated.
Yu further discloses wherein the one or more criteria comprises the value being above a pre-defined threshold ([0027] pruning groups of neural network weight parameters may include removing one or more groups determined to have an importance parameter below a specified threshold value. (Note: Examiner interprets the claimed one or more criteria comprises the value being above a pre-defined threshold as groups of neural network weight parameters with importance parameter value above a specified threshold value)).

Regarding dependent claim 4, Yu discloses all the limitations as set forth in the rejection of claim 2 that is incorporated. Yu further discloses wherein the set of importance scores are based on a magnitude-based criterion ([0017] importance parameters may be determined for respective weight parameters and/or groups of weight parameters. Weight parameters and/or groups of weight parameters having importance parameters below a specified threshold value may be removed, in an embodiment. As utilized herein, the term "importance parameter" refers to a parameter for a respective weight parameter and/or a respective group of weight parameters that may be compared against a threshold value, such as to determine whether a weight parameter and/or group of weight parameters are to be removed. Also, the terms "weight parameter," "neural network weight parameter," "importance parameter," and/or the like are merely convenient labels, and are to be associated with appropriate physical quantities).

Regarding independent claim 8, claim 8 contains substantially similar limitations to those found in claim 1. Therefore, it is rejected for the same reason as claim 1 above. Yu further discloses a non-transitory machine-readable medium having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to (Fig. 6, 640; [0086] Memory 622 may store electronic files and/or electronic documents, such as relating to one or more users, and may also comprise a computer-readable medium that may carry and/or make accessible content, including code and/or instructions, for example, executable by processor 620 and/or some other device, such as a controller, as one example, capable of executing computer instructions).

Regarding dependent claim 9, Yu discloses all the limitations as set forth in the rejection of claim 8 that is incorporated. Yu further discloses wherein the set of instructions further include instructions, which if performed by the one or more processors, cause the one or more processors to: determine a set of sub-networks of the neural network ([0027] pruning groups of neural network weight parameters may include removing one or more groups determined to have an importance parameter below a specified threshold value); determine a set of values corresponding to the set of sub-networks ([0029] FIG. 4 is an illustration of an embodiment 400 of an example process for formatting a set of neural network weight parameters. In an embodiment, such as example embodiment 400, a relatively dense set of parameters, such as matrix 410, may include weight parameters, such as may be associated with one or more nodes, connections, and/or layers, for example, of a neural network model); and select the sub-network based at least in part on the set of values ([0029] signals and/or states representative of a set of neural network weight parameters, such as matrix 410, may be processed to group the set of neural network weight parameters into a plurality of groups according to a specified group size. For example, neural network weight parameters of matrix 410, for example, may be grouped into groups of two, as depicted via gray scale shading in FIG. 4).

Regarding dependent claim 11, Yu discloses all the limitations as set forth in the rejection of claim 9 that is incorporated. Yu further discloses wherein the set of instructions further include instructions, which if performed by the one or more processors, cause the one or more processors to perform one or more pruning processes on the neural network to obtain a second neural network that matches the sub-network ([0090] processor to perform at least a portion of computing procedure/process, [0027] pruning groups of neural network weight parameters may include removing one or more groups determined to have an importance parameter below a specified threshold value; [0028] Further, as depicted at block 326, for example, a grouped and/or pruned set of network weight parameters may be retrained, for example to maintain a desired and/or specified level of accuracy within the neural network model. In an embodiment, a process of grouping and/or pruning sets of neural network weight parameters and retraining the grouped and/or pruned sets of neural network weight parameters may be performed in an iterative fashion, such as until a calculated accuracy parameter is determined to fall below a specified threshold, for example; [0030] FIG. 4, matrix 420 may represent a set of neural network weight parameters resulting from example grouping and/or pruning processes).

Regarding dependent claim 12, Yu discloses all the limitations as set forth in the rejection of claim 9 that is incorporated. Yu further discloses wherein a first value of the set of values is determined based at least in part on a difference between numbers of neurons of a layer of a first sub-network and a corresponding layer of a second sub-network ([0037] processing set of weight parameters include calculating root-mean-square, [0041] one or more weight parameters may operate in a specified manner on one or more parameters representative of one or more output values to yield a connection, such as between a node of a first layer and a node of a second layer).

Regarding dependent claim 13, Yu discloses all the limitations as set forth in the rejection of claim 9 that is incorporated. Yu further discloses wherein the sub-network corresponds to a value of the set of values that is greater than at least one or more other values of the set of values ([0029] FIG. 4 is an illustration of an embodiment 400 of an example process for formatting a set of neural network weight parameters. In an embodiment, such as example embodiment 400, a relatively dense set of parameters, such as matrix 410, may include weight parameters, such as may be associated with one or more nodes, connections, and/or layers, for example, of a neural network model).
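To make the examiner's reading of Yu concrete: as the Office Action characterizes Yu [0026]-[0027], weights are grouped by a specified group size, a root-mean-square importance parameter is calculated per group, and groups below a threshold are removed. The sketch below follows that characterization only; the flat-vector layout, ragged-tail handling, and in-place zeroing are assumptions.

    import numpy as np

    def prune_weight_groups(weights: np.ndarray, group_size: int, threshold: float) -> np.ndarray:
        """Group a flat weight vector, compute an RMS importance parameter per group,
        and zero groups whose RMS falls below the specified threshold."""
        w = weights.copy()
        usable = len(w) - len(w) % group_size        # any ragged tail is left unpruned
        groups = w[:usable].reshape(-1, group_size)  # a view into w, so edits land in w
        rms = np.sqrt((groups ** 2).mean(axis=1))    # importance parameter per group
        groups[rms < threshold] = 0.0                # remove low-importance groups
        return w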
Regarding independent claim 15, claim 15 contains substantially similar limitations to those found in claim 1. Therefore, it is rejected for the same reason as claim 1 above. Yu further discloses a system (Fig. 6; [0075]), comprising: one or more computers ([0075]) having one or more processors ([0076]) to train a neural network ([0016] relatively large neural network models, such as Deep Neural Network (DNN) models, for example, may utilize considerable memory storage space, memory interface bandwidth, and/or other computing resources, for example. Such utilization of computing resources may pose challenges for mobile devices, such as mobile device 100, for example. In some circumstances, neural network models, such as deep neural network (DNN) models, may include relatively significant redundancy, such as, for example, with respect to weight parameters that may make up, at least in part, vectors and/or multidimensional matrices representative of various inputs, outputs, connections and/or nodes of a neural network model, for example. In some circumstances, weight parameters in neural network models and/or systems may be intended to model, at least in part, synapses between biological neurons; [0017] neural network weight parameter pruning operations may be utilized to remove redundant weight parameters within a neural network. As utilized herein, the term "redundant" in connection with a neural network weight parameter and/or the like refers to a weight parameter and/or the like that may be removed from a neural network model while maintaining a specified level of accuracy, such as after retraining operations, for example).

Regarding dependent claim 16, Yu discloses all the limitations as set forth in the rejection of claim 15 that is incorporated. Yu further discloses wherein the one or more processors are further to: determine a first set of importance scores for nodes of the neural network for a first training epoch ([0019] FIG. 2 is an illustration of an embodiment 200 of an example process for formatting a set of neural network weight parameters (i.e. a first set of scores), in accordance with an embodiment. In an embodiment, such as example embodiment 200, a relatively dense matrix, such as matrix 210, may include weight parameters, such as may be associated with one or more nodes, connections, and/or layers, for example, of a neural network model; [0028] a process of grouping and/or pruning sets of neural network weight parameters and retraining the grouped and/or pruned sets of neural network weight parameters may be performed in an iterative fashion (i.e. a first training epoch); [0038] processing signals and/or states representative of a set of neural network weight parameters may include iteratively formatting and retraining the set of neural network weight parameters until a calculated accuracy parameter is determined to fall below a specified threshold); determine a first sub-network of the neural network based on the first set of importance scores (Fig. 3; [0026] signals and/or states representative of a set of neural network weight parameters, such as set of neural network weight parameters 310, may be processed to group the set of neural network weight parameters into a plurality of weight groups according to a specified group size (i.e. a first sub-network)); and calculate a first value for the first sub-network ([0027] pruning groups of neural network weight parameters may include calculating importance parameters (i.e. a first value) for respective groups of neural network weight parameters. For example, calculating importance parameters for respective groups of neural network weight parameters may include calculating root-mean-square (RMS) parameters for the respective groups).

Regarding dependent claim 17, Yu discloses all the limitations as set forth in the rejection of claim 16 that is incorporated. Yu further discloses wherein the one or more processors are further to: determine a second set of importance scores for the nodes of the neural network for a second training epoch ([0019] FIG. 2 is an illustration of an embodiment 200 of an example process for formatting a set of neural network weight parameters (i.e. a second set of scores), in accordance with an embodiment. In an embodiment, such as example embodiment 200, a relatively dense matrix, such as matrix 210, may include weight parameters, such as may be associated with one or more nodes, connections, and/or layers, for example, of a neural network model; [0028] a process of grouping and/or pruning sets of neural network weight parameters and retraining the grouped and/or pruned sets of neural network weight parameters may be performed in an iterative fashion (i.e. a second training epoch); [0038] processing signals and/or states representative of a set of neural network weight parameters may include iteratively formatting and retraining the set of neural network weight parameters until a calculated accuracy parameter is determined to fall below a specified threshold); determine a second sub-network of the neural network based on the second set of importance scores (Fig. 3; [0026] signals and/or states representative of a set of neural network weight parameters, such as set of neural network weight parameters 310, may be processed to group the set of neural network weight parameters into a plurality of weight groups according to a specified group size (i.e. a second sub-network)); and calculate a second value for the second sub-network ([0027] pruning groups of neural network weight parameters may include calculating importance parameters (i.e. a second value) for respective groups of neural network weight parameters. For example, calculating importance parameters for respective groups of neural network weight parameters may include calculating root-mean-square (RMS) parameters for the respective groups).

Regarding independent claim 29, claim 29 contains substantially similar limitations to those found in claim 1. Therefore, it is rejected for the same reason as claim 1 above. Yu further discloses one or more circuits ([0090] Referring again to FIG. 6, processor 620 may comprise one or more circuits, such as digital circuits, to perform at least a portion of a computing procedure and/or process) to use a network to infer information from one or more inputs ([0051] Employing a model permits collected measurements to potentially be identified and/or processed, and/or potentially permits estimation and/or prediction of an underlying deterministic component), wherein the neural network is trained ([0016] relatively large neural network models, such as Deep Neural Network (DNN) models, for example, may utilize considerable memory storage space, memory interface bandwidth, and/or other computing resources, for example. Such utilization of computing resources may pose challenges for mobile devices, such as mobile device 100, for example. In some circumstances, neural network models, such as deep neural network (DNN) models, may include relatively significant redundancy, such as, for example, with respect to weight parameters that may make up, at least in part, vectors and/or multidimensional matrices representative of various inputs, outputs, connections and/or nodes of a neural network model, for example. In some circumstances, weight parameters in neural network models and/or systems may be intended to model, at least in part, synapses between biological neurons).

Regarding dependent claim 34, Yu discloses all the limitations as set forth in the rejection of claim 29 that is incorporated. Yu further discloses wherein the processor is part of one or more edge devices ([0055] Network devices capable of operating as a server device, a client device and/or otherwise, may include, as examples, dedicated rack-mounted servers, desktop computers, laptop computers, set top boxes, tablets, netbooks, smart phones, wearable devices, integrated devices combining two or more features of the foregoing devices, and/or the like, or any combination thereof).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 5-7, 18-19 and 30 are rejected under 35 U.S.C. 103 as being unpatentable over Yu as applied in claims 1, 15 and 29, in view of SIVARAMAN et al. (hereinafter SIVARAMAN), US 20220086070 A1.

Regarding dependent claim 5, Yu discloses all the limitations as set forth in the rejection of claim 2 that is incorporated. Yu does not explicitly teach wherein the sub-network is determined based at least in part on maximum values of the set of scores. However, in the same field of endeavor, SIVARAMAN teaches wherein the sub-network is determined based at least in part on maximum values of the set of importance scores ([0116] At the end of each epoch, a device (or a group of devices) will be chosen as the "winner" that has the maximum similarity score with the IoT device whose run-time profile is being checked. It is expected to have a group of winner devices when the dynamic similarity is considered). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of removing redundant nodes based on the calculated maximum similarity score as suggested in SIVARAMAN into Yu's system because both of these systems are addressing removing redundant nodes based on similarity scores. This modification would have been motivated by the desire to improve network performance and security (SIVARAMAN, [0002]-[0003]).

Regarding dependent claim 6, Yu discloses all the limitations as set forth in the rejection of claim 3 that is incorporated. Yu does not explicitly teach wherein the value is determined based on normalized differences between different sub-networks of the neural network.
However, in the same field of endeavor, SIVARAMAN teaches wherein the value is determined based on normalized differences between different sub-networks of the neural network ([0111] There are a number of metrics for measuring the similarity of two sets. For example, the Jaccard index has been used for comparing two sets of categorical values, and is defined by the ratio of the size of the intersection of two sets to the size of their union; [0113] These two metrics collectively represent the Jaccard index. Each of these metrics would take a value between 0 (i.e., dissimilar) and 1 (i.e., identical). Similarity scores are computed every epoch time (e.g., 15 minutes). When computing |R∩Mi|, redundant branches of the run-time profile are temporarily removed based on the MUD profile that it is being checked against. This assures that duplicate elements are pruned from R when checking against each M.sub.i). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of removing redundant nodes based on the calculated maximum similarity score as suggested in SIVARAMAN into Yu's system because both of these systems are addressing removing redundant nodes based on similarity scores. This modification would have been motivated by the desire to improve network performance and security (SIVARAMAN, [0002]-[0003]).

Regarding dependent claim 7, the combination of Yu and SIVARAMAN teaches all the limitations as set forth in the rejection of claim 6 that is incorporated. Yu further teaches wherein the normalized differences are based on differences between numbers of nodes of layers of the different sub-networks of the neural network ([0041] one or more weight parameters may operate in a specified manner on one or more parameters representative of one or more output values to yield a connection, such as between a node of a first layer and a node of a second layer).

Regarding dependent claim 18, Yu teaches all the limitations as set forth in the rejection of claim 17 that is incorporated. Yu does not explicitly teach wherein the second value is calculated based on differences between the second sub-network and the first sub-network. However, in the same field of endeavor, SIVARAMAN teaches wherein the second value is calculated based on differences between the second sub-network and the first sub-network ([0111] There are a number of metrics for measuring the similarity of two sets. For example, the Jaccard index has been used for comparing two sets of categorical values, and is defined by the ratio of the size of the intersection of two sets to the size of their union; [0113] These two metrics collectively represent the Jaccard index. Each of these metrics would take a value between 0 (i.e., dissimilar) and 1 (i.e., identical). Similarity scores are computed every epoch time (e.g., 15 minutes). When computing |R∩Mi|, redundant branches of the run-time profile are temporarily removed based on the MUD profile that it is being checked against. This assures that duplicate elements are pruned from R when checking against each M.sub.i). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of removing redundant nodes based on the calculated maximum similarity score as suggested in SIVARAMAN into Yu's system because both of these systems are addressing removing redundant nodes based on similarity scores. This modification would have been motivated by the desire to improve network performance and security (SIVARAMAN, [0002]-[0003]).

Regarding dependent claim 19, Yu teaches all the limitations as set forth in the rejection of claim 17 that is incorporated. Yu does not explicitly teach wherein the one or more processors are further to compare the first value with the second value. However, in the same field of endeavor, SIVARAMAN teaches wherein the one or more processors are further to compare the first value with the second value ([0111] There are a number of metrics for measuring the similarity of two sets. For example, the Jaccard index has been used for comparing two sets of categorical values, and is defined by the ratio of the size of the intersection of two sets to the size of their union; [0113] These two metrics collectively represent the Jaccard index. Each of these metrics would take a value between 0 (i.e., dissimilar) and 1 (i.e., identical). Similarity scores are computed every epoch time (e.g., 15 minutes). When computing |R∩Mi|, redundant branches of the run-time profile are temporarily removed based on the MUD profile that it is being checked against. This assures that duplicate elements are pruned from R when checking against each M.sub.i). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of removing redundant nodes based on the calculated maximum similarity score as suggested in SIVARAMAN into Yu's system because both of these systems are addressing removing redundant nodes based on similarity scores. This modification would have been motivated by the desire to improve network performance and security (SIVARAMAN, [0002]-[0003]).

Regarding dependent claim 30, Yu teaches all the limitations as set forth in the rejection of claim 29 that is incorporated. Yu further teaches determine a first set of metric values for the neural network ([0027] pruning groups of neural network weight parameters may include calculating importance parameters (i.e. a first set of metric values) for respective groups of neural network weight parameters. For example, calculating importance parameters for respective groups of neural network weight parameters may include calculating root-mean-square (RMS) parameters for the respective groups); determine a second set of metric values for the neural network ([0027] pruning groups of neural network weight parameters may include calculating importance parameters (i.e. a second set of metric values) for respective groups of neural network weight parameters. For example, calculating importance parameters for respective groups of neural network weight parameters may include calculating root-mean-square (RMS) parameters for the respective groups). Yu does not explicitly teach compare the second set of metric values and the first set of metric values to determine a metric value of the second set of metric values. However, in the same field of endeavor, SIVARAMAN teaches compare the second set of metric values and the first set of metric values to determine a metric value of the second set of metric values ([0111] There are a number of metrics for measuring the similarity of two sets. For example, the Jaccard index has been used for comparing two sets of categorical values, and is defined by the ratio of the size of the intersection of two sets to the size of their union; [0113] These two metrics collectively represent the Jaccard index. Each of these metrics would take a value between 0 (i.e., dissimilar) and 1 (i.e., identical). Similarity scores are computed every epoch time (e.g., 15 minutes). When computing |R∩Mi|, redundant branches of the run-time profile are temporarily removed based on the MUD profile that it is being checked against. This assures that duplicate elements are pruned from R when checking against each M.sub.i). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of removing redundant nodes based on the calculated maximum similarity score as suggested in SIVARAMAN into Yu's processor because both are addressing removing redundant nodes based on similarity scores. This modification would have been motivated by the desire to improve network performance and security (SIVARAMAN, [0002]-[0003]).

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Yu as applied in claim 8, in view of TANG et al. (hereinafter TANG), US 20200372363 A1.

Regarding dependent claim 10, Yu teaches all the limitations as set forth in the rejection of claim 9 that is incorporated. Yu does not explicitly teach wherein the set of sub-networks are determined based on gradient-based criterion. However, in the same field of endeavor, TANG teaches wherein the set of sub-networks are determined based on gradient-based criterion ([0012] Each processing node in the layer Lyr(j) may be coupled to one or more processing nodes in the preceding layer Lyr(j−1) via connections therebetween. Each connection may be associated with a weight, the processing node may compute a weighted sum of one or more pieces of input data from the processing nodes in the preceding layer Lyr(j−1). A connection associated with a weight larger in magnitude is more influential in generating the weighted sum than a connection associated with a weight smaller in magnitude. When the value of a weight is 0, the connection associated with the weight may be regarded as being eliminated from the artificial neural network 1, achieving network connectivity sparsity, and reducing computational complexity, power consumption and operational costs; [0031] The artificial neural network 1 separates the weights w into the connectivity variables {tilde over (m)} and the weight variables {tilde over (w)}, trains the connectivity variables {tilde over (m)} to form sparse connectivity structure, and trains the weight variables {tilde over (w)} to form a simple model for the artificial neural network 1. Further, in order to train the connectivity variables {tilde over (m)} and the weight variables {tilde over (w)}, the connectivity variable gradient is redefined as the connectivity mask gradient and the weight variable gradient is redefined as the weight gradient. The resultant sparse connectivity structure of the artificial neural network 1 can significantly reduce computational complexity, memory requirements and power consumption). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of reducing a total number of the connections based on connectivity variable gradients as suggested in TANG into Yu's processor instructions because both are addressing reducing a total number of connections between the plurality of processing nodes to reduce a performance loss. This modification would have been motivated by the desire to improve network performance (TANG, [0004]).
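The Jaccard index that SIVARAMAN [0111] relies on is standard set similarity: the ratio of the intersection size to the union size. A self-contained rendering (the example sets are placeholders, not data from the reference):

    def jaccard_index(a: set, b: set) -> float:
        """Ratio of intersection size to union size; 0 = dissimilar, 1 = identical."""
        union = a | b
        return len(a & b) / len(union) if union else 1.0

    # e.g. jaccard_index({"dns", "ntp", "http"}, {"dns", "http"}) == 2/3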
Claims 14 and 33 are rejected under 35 U.S.C. 103 as being unpatentable over Yu as applied in claims 8 and 29, in view of Jain et al. (hereinafter Jain), US 20240007414 A1. Regarding dependent claim 14, Yu teaches all the limitations as set forth in the rejection of claim 9 that is incorporated. Yu does not explicitly teach wherein the neural network is an image processing neural network of one or more vehicle systems. However, in the same field of endeavor, Jain teaches wherein the neural network is an image processing neural network of one or more vehicle systems (Fig. F3; [0182] Other example groups of IoT devices may include remote weather stations F314, local information terminals F316, alarm systems F318, automated teller machines F320, alarm panels F322, or moving vehicles, such as emergency vehicles F324 or other vehicles F326, among many others; [0688] At least one benefit of examples disclosed herein results in better filters (sometimes referred to as “descriptors”) generated during image convolution (i.e. an image processing neural network). Briefly turning to FIG. ID14_2B, the example input image ID14_210 is shown with eight (8) example filters surrounding it. The illustrated example of FIG. ID14_2B includes a first dynamic filter ID14_250, a second dynamic filter ID14_252, a third dynamic filter ID14_254, a fourth dynamic filter ID14_256, a fifth dynamic filter ID14_258, a sixth dynamic filter ID14_260, a seventh dynamic filter ID14_262 and an eighth dynamic filter ID14_264. Unlike traditional convolution techniques that apply the same filters in a manner independent of specific portions of the input image ID14_210, example dynamic filters disclosed herein are uniquely generated based on each portion of the input image ID14_210; [0708]-[0709] FIG. ID14_4 is a flowchart representative of example machine-readable instructions ID14_400 which may be executed to implement the example machine learning trainer ID14_300 to train a CNN using dynamic kernels). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of image processing in a convolutional neural network as suggested in Jain into Yu’s processor instructions because both are addressing pruning technique. This modification would have been motivated by the desire to provide pruning technique to minimize loss in accuracy and performance (Jain, [0494]). Regarding dependent claim 33, Yu teaches all the limitations as set forth in the rejection of claim 29 that is incorporated. Yu does not explicitly teach wherein the one or more inputs comprise one or more images. However, in the same field of endeavor, Jain teaches wherein the one or more inputs comprise one or more images ([0684] FIG. ID14_2A is a conceptual illustration of an example convolution operation using a dynamic kernel ID14_200. As shown, an example input image ID14_210 is represented similarly to input image ID14_110 as a grid of pixel values). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of image processing in a convolutional neural network as suggested in Jain into Yu’s processor because both are addressing pruning technique. This modification would have been motivated by the desire to provide pruning technique to minimize loss in accuracy and performance (Jain, [0494]). Claims 20-21 and 31-32 are rejected under 35 U.S.C. 
Claims 20-21 and 31-32 are rejected under 35 U.S.C. 103 as being unpatentable over Yu, in view of SIVARAMAN as applied in claims 19 and 30, further in view of TANG.

Regarding dependent claim 20, the combination of Yu and SIVARAMAN teaches all the limitations as set forth in the rejection of claim 19, which is incorporated herein. The combination of Yu and SIVARAMAN does not explicitly teach wherein the one or more processors are to remove the one or more nodes of the neural network by at least, as a result of determining that the second value is greater than the first value and one or more values for one or more sub-networks, removing the one or more nodes from the neural network to result in a pruned neural network corresponding to the second sub-network. However, in the same field of endeavor, TANG teaches wherein the one or more processors are to remove the one or more nodes of the neural network by at least, as a result of determining that the second value is greater than the first value and one or more values for one or more sub-networks, removing the one or more nodes from the neural network to result in a pruned neural network corresponding to the second sub-network ([0012] Each processing node in the layer Lyr(j) may be coupled to one or more processing nodes in the preceding layer Lyr(j−1) via connections therebetween. Each connection may be associated with a weight, and the processing node may compute a weighted sum of one or more pieces of input data from the processing nodes in the preceding layer Lyr(j−1). A connection associated with a weight larger in magnitude is more influential in generating the weighted sum than a connection associated with a weight smaller in magnitude. When the value of a weight is 0, the connection associated with the weight may be regarded as being eliminated from the artificial neural network 1, achieving network connectivity sparsity and reducing computational complexity, power consumption and operational costs). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of reducing a total number of the connections based on connectivity variable gradients, as suggested in TANG, into Yu and SIVARAMAN's system because both address reducing a total number of connections between the plurality of processing nodes to reduce performance loss. This modification would have been motivated by the desire to improve network performance (TANG, [0004]).
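A short sketch of the claim 20 mechanism may help: two (or more) candidate sub-networks each carry a value, the one with the greater value wins, and nodes outside the winning plurality are removed. The scoring rule and every name below are hypothetical; neither TANG nor the claim dictates this particular arithmetic.

def prune_to_best_subnetwork(node_values: dict, subnetworks: dict) -> set:
    # Value a candidate sub-network as the mean of its nodes' values.
    def value(nodes):
        return sum(node_values[n] for n in nodes) / len(nodes)

    # Keep the sub-network whose value exceeds the others' (e.g., the
    # "second value is greater than the first value" determination).
    best = max(subnetworks, key=lambda name: value(subnetworks[name]))
    removed = set(node_values) - subnetworks[best]
    return removed  # nodes of the network other than the kept plurality

values = {"n1": 0.2, "n2": 0.9, "n3": 0.8, "n4": 0.1}
candidates = {"first": {"n1", "n4"}, "second": {"n2", "n3"}}
print(sorted(prune_to_best_subnetwork(values, candidates)))  # ['n1', 'n4']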
Regarding dependent claim 21, the combination of Yu, SIVARAMAN and TANG teaches all the limitations as set forth in the rejection of claim 20, which is incorporated herein. TANG further teaches wherein the one or more processors are further to perform one or more training processes on the pruned neural network using one or more gradient descent algorithms ([0012] Each processing node in the layer Lyr(j) may be coupled to one or more processing nodes in the preceding layer Lyr(j−1) via connections therebetween. Each connection may be associated with a weight, and the processing node may compute a weighted sum of one or more pieces of input data from the processing nodes in the preceding layer Lyr(j−1). A connection associated with a weight larger in magnitude is more influential in generating the weighted sum than a connection associated with a weight smaller in magnitude. When the value of a weight is 0, the connection associated with the weight may be regarded as being eliminated from the artificial neural network 1, achieving network connectivity sparsity and reducing computational complexity, power consumption and operational costs; [0031] The artificial neural network 1 separates the weights w into the connectivity variables m̃ and the weight variables w̃, trains the connectivity variables m̃ to form a sparse connectivity structure, and trains the weight variables w̃ to form a simple model for the artificial neural network 1. Further, in order to train the connectivity variables m̃ and the weight variables w̃, the connectivity variable gradient is redefined as the connectivity mask gradient and the weight variable gradient is redefined as the weight gradient. The resultant sparse connectivity structure of the artificial neural network 1 can significantly reduce computational complexity, memory requirements and power consumption).

Regarding dependent claim 31, the combination of Yu and SIVARAMAN teaches all the limitations as set forth in the rejection of claim 30, which is incorporated herein. The combination of Yu and SIVARAMAN does not explicitly teach wherein the metric value is greater than one or more metric values of the first set of metric values and the second set of metric values. However, in the same field of endeavor, TANG teaches wherein the metric value is greater than one or more metric values of the first set of metric values and the second set of metric values ([0012] Each processing node in the layer Lyr(j) may be coupled to one or more processing nodes in the preceding layer Lyr(j−1) via connections therebetween. Each connection may be associated with a weight, and the processing node may compute a weighted sum of one or more pieces of input data from the processing nodes in the preceding layer Lyr(j−1). A connection associated with a weight larger in magnitude is more influential in generating the weighted sum than a connection associated with a weight smaller in magnitude. When the value of a weight is 0, the connection associated with the weight may be regarded as being eliminated from the artificial neural network 1, achieving network connectivity sparsity and reducing computational complexity, power consumption and operational costs). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of reducing a total number of the connections based on connectivity variable gradients, as suggested in TANG, into Yu and SIVARAMAN's processor because both address reducing a total number of connections between the plurality of processing nodes to reduce performance loss. This modification would have been motivated by the desire to improve network performance (TANG, [0004]).

Regarding dependent claim 32, the combination of Yu, SIVARAMAN and TANG teaches all the limitations as set forth in the rejection of claim 31, which is incorporated herein. TANG further teaches wherein the one or more circuits are further to remove the one or more nodes of the neural network based at least in part on the metric value to obtain the sub-network corresponding to the metric value ([0012] A connection associated with a weight larger in magnitude is more influential in generating the weighted sum than a connection associated with a weight smaller in magnitude. When the value of a weight is 0, the connection associated with the weight may be regarded as being eliminated from the artificial neural network 1, achieving network connectivity sparsity and reducing computational complexity, power consumption and operational costs).
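Since TANG's weight/connectivity-mask separation anchors several of the rejections above, a compact PyTorch-style sketch may be useful. The straight-through mask, the initialization, and the training loop are assumptions made for illustration; TANG's actual gradient redefinitions may differ.

import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    # Weights are split into weight variables w and connectivity logits m;
    # an effective weight of 0 means the connection is eliminated.
    def __init__(self, n_in: int, n_out: int):
        super().__init__()
        self.w = nn.Parameter(torch.randn(n_out, n_in) * 0.1)
        self.m = nn.Parameter(torch.ones(n_out, n_in))  # connectivity logits

    def forward(self, x):
        soft = torch.sigmoid(self.m)
        # Straight-through estimator: hard 0/1 mask on the forward pass, soft
        # gradient on the backward pass, so the connectivity mask itself is
        # trained by gradient descent.
        mask = (soft > 0.5).float() + soft - soft.detach()
        return x @ (self.w * mask).t()

layer = MaskedLinear(8, 4)
opt = torch.optim.SGD(layer.parameters(), lr=0.1)
x, target = torch.randn(16, 8), torch.randn(16, 4)
for _ in range(200):
    # Joint training of weight variables and connectivity logits; continued
    # training after the mask settles amounts to gradient-descent fine-tuning
    # of the pruned network (the claim 21 flavor).
    opt.zero_grad()
    loss = ((layer(x) - target) ** 2).mean()
    loss.backward()
    opt.step()
sparsity = 1 - (torch.sigmoid(layer.m) > 0.5).float().mean().item()
print(f"connections eliminated: {sparsity:.0%}")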
Response to Arguments

Applicant's arguments filed 10/03/2025 have been fully considered.

(1) The 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, rejection of claim 5 is withdrawn in response to Applicant's amendment to the claim.

(2) In the Remarks, Applicant argues that Yu fails to teach or suggest calculating, for a sub-network of a neural network, a stability value that indicates a measure of potential change to parameters of a plurality of nodes of the sub-network during one or more subsequent updates of the neural network, as recited by amended claim 1. Applicant asserts that Yu makes no mention of calculating any value that indicates a measure of potential change to parameters of nodes of a network, much less calculating the stability value that indicates the measure of potential change to the parameters during one or more subsequent updates of the neural network, which is used to determine that the sub-network is stable, as recited by amended claim 1.

As to point (2), Examiner respectfully disagrees. Examiner notes that the claims place no limitations on how a stability value indicates a measure of potential change to parameters of a plurality of nodes of the sub-network during one or more subsequent updates of the neural network, or on what criteria the stability value must satisfy for a sub-network to be considered stable. Yu discloses an “importance parameter” for a respective weight parameter and/or a respective group of weight parameters, which may be compared against a threshold value to determine whether the weight parameter and/or group of weight parameters is to be removed. Yu further discloses that the process of grouping and/or pruning sets of neural network weight parameters, and retraining the grouped and/or pruned sets, may be performed in an iterative fashion, such as until a calculated accuracy parameter is determined to fall below a specified threshold, with importance parameters calculated for groups of neural network weight parameters and evaluated against a specified threshold value ([0017]; Fig. 3; [0026]-[0028]). This disclosure is considered within the broadest reasonable interpretation of the claim. Similar arguments have been presented for independent claims 8, 15 and 29, and thus Applicant's arguments are not persuasive for the same reasons. Claims 1-21 and 29-34 remain rejected as set forth above.
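For readers unfamiliar with Yu, the loop below restates the Examiner's characterization of Yu's iterative prune-and-retrain process as runnable pseudocode. All callables, thresholds, and names are placeholders supplied by the caller; this is a sketch of the cited workflow, not Yu's disclosed implementation.

def iterative_prune(model, groups, importance_fn, prune_fn, retrain_fn,
                    accuracy_fn, importance_thresh=0.05, accuracy_floor=0.90):
    # Per the Examiner's summary of Yu ([0017]; Fig. 3; [0026]-[0028]):
    # compare each group's importance parameter to a threshold, prune the
    # low-importance groups, retrain, and iterate until accuracy falls below
    # the specified floor (or nothing is left to prune).
    while groups:
        prunable = [g for g in groups
                    if importance_fn(model, g) < importance_thresh]
        if not prunable:
            break
        for g in prunable:
            model = prune_fn(model, g)
            groups.remove(g)
        model = retrain_fn(model)
        if accuracy_fn(model) < accuracy_floor:
            break
    return model

# Toy usage with stand-in callables:
final = iterative_prune(
    model={"weights": 10},
    groups=["g1", "g2", "g3"],
    importance_fn=lambda m, g: {"g1": 0.01, "g2": 0.20, "g3": 0.03}[g],
    prune_fn=lambda m, g: {"weights": m["weights"] - 1},
    retrain_fn=lambda m: m,
    accuracy_fn=lambda m: 0.95,
)
print(final)  # {'weights': 8}: g1 and g3 pruned; g2 survives the threshold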
(3) Regarding the 35 U.S.C. § 101 rejection, Applicant argues that amended claim 1 does not recite a "mathematical concept" because, although it involves a calculation, the claim also recites other features that are based on that calculation. For example, amended claim 1 recites removing nodes of a neural network based on a determination that the sub-network is stable. Applicant therefore contends that the amended claims do not fall within the subject matter grouping of "mathematical concepts."

As to point (3), Examiner respectfully disagrees. Amended claim 1 recites "calculate, for a sub-network of a neural network, a stability value that indicates a measure of potential change to parameters of a plurality of nodes of the sub-network during one or more subsequent updates of the neural network." This limitation encompasses a mathematical concept, namely a mathematical calculation of a value that indicates a measure of potential change to parameters in order to at least in part train a neural network. Further, the claim recites "based on the determination that the sub-network is stable, remove one or more nodes of the neural network other than the plurality of nodes." As discussed in the rejection above, the claims are directed to an abstract idea that encompasses mental processes: evaluating data when the sub-network is stable, selecting data based on judgment, and removing nodes based on the determination that the sub-network is stable. The claims place no limits on how the mental steps are performed. That is, nothing in the claim elements precludes the steps from practically being performed in the mind. Thus, under the broadest reasonable interpretation, the steps fall within the mental process grouping of abstract ideas because they cover concepts performed in the human mind, including observation, evaluation, judgment, and opinion. See MPEP 2106.04.

(4) Applicant alleges that the claim is not directed to a judicial exception because the claim as a whole integrates the judicial exception into a practical application. Applicant asserts that the claims reflect an improvement to the technical field of training neural networks and that the use of the alleged judicial exception "meaningfully limits the claim by going beyond generally linking the use of the judicial exception to a particular technological environment." For example, amended claim 1 recites "calculate, for a sub-network of a neural network, a stability value that indicates a measure of potential change to parameters of a plurality of nodes of the sub-network of the neural network during one or more subsequent updates of the neural network ... based on the determination that the sub-network is stable, remove one or more nodes of the neural network other than the plurality of nodes." Applicant argues that by removing nodes of a neural network based on the stability value, as recited in amended claim 1, the amount of time to train a neural network can be reduced, resulting in an improvement in training speed.

As to point (4), Examiner respectfully disagrees. One way to determine integration into a practical application is when the claimed invention improves the functioning of a computer or improves another technology or technical field. To evaluate an improvement to a computer or technical field, the specification must set forth an improvement in technology, and the claim itself must reflect the disclosed improvement. See MPEP 2106.04(d)(1) and 2106.05(a). The consideration of whether the claim as a whole includes an improvement to a computer or to a technological field requires an evaluation of the specification and the claim to ensure that a technical explanation of the asserted improvement is present in the specification and that the claim reflects the asserted improvement. While the disclosure states that "techniques described herein achieve various technical advantages, including but not limited to: an ability to determine an optimal time to prune one or more neurons from a neural network; an ability to reduce complexity of a neural network during one or more training processes; an ability to reduce training time of a pruned neural network; and various other technical advantages," there is no improvement to the functioning of a computer or to any other technology.
At best, the claimed combination amounts to an improvement to the abstract idea rather than to any technology. See MPEP 2106.05(a). Any purported improvements are provided by the judicial exception alone, i.e., mathematical calculations/relationships; thus the claim as a whole does not integrate the judicial exception into a practical application, nor does it provide significantly more than the judicial exception. Thus, the claims are patent ineligible and are rejected under 35 U.S.C. 101 as detailed in the rejections set forth above.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Applicant is required under 37 C.F.R. § 1.111(c) to consider these references fully when responding to this action. Deangelis et al. (US 5734797 A) discloses generating and optimizing artificial neural networks. It is noted that any citation to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. In re Heck, 699 F.2d 1331, 1332-33, 216 U.S.P.Q. 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 U.S.P.Q. 275, 277 (C.C.P.A. 1968)).

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to AMY P HOANG, whose telephone number is (469) 295-9134. The examiner can normally be reached M-Th, 8:30 AM-5:00 PM. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, JENNIFER WELCH, can be reached at 571-272-7212. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format.
For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/AMY P HOANG/
Examiner, Art Unit 2143

/JENNIFER N WELCH/
Supervisory Patent Examiner, Art Unit 2143

Prosecution Timeline

Mar 15, 2021
Application Filed
Apr 24, 2024
Non-Final Rejection — §101, §102, §103
Jul 16, 2024
Interview Requested
Jul 23, 2024
Examiner Interview Summary
Jul 23, 2024
Applicant Interview (Telephonic)
Aug 29, 2024
Response Filed
Oct 21, 2024
Final Rejection — §101, §102, §103
Dec 06, 2024
Interview Requested
Mar 24, 2025
Applicant Interview (Telephonic)
Mar 24, 2025
Examiner Interview Summary
Apr 24, 2025
Request for Continued Examination
Apr 28, 2025
Response after Non-Final Action
May 30, 2025
Non-Final Rejection — §101, §102, §103
Oct 03, 2025
Response Filed
Jan 05, 2026
Final Rejection — §101, §102, §103
Feb 25, 2026
Applicant Interview (Telephonic)
Feb 25, 2026
Examiner Interview Summary

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602596: APPARATUS AND METHOD FOR VALIDATING DATASET BASED ON FEATURE COVERAGE
Granted Apr 14, 2026 (2y 5m to grant)

Patent 12572263: ACCESS CARD WITH CONFIGURABLE RULES
Granted Mar 10, 2026 (2y 5m to grant)

Patent 12536432: PRE-TRAINING METHOD OF NEURAL NETWORK MODEL, ELECTRONIC DEVICE AND MEDIUM
Granted Jan 27, 2026 (2y 5m to grant)

Patent 12475669: METHOD AND APPARATUS WITH NEURAL NETWORK OPERATION FOR DATA NORMALIZATION
Granted Nov 18, 2025 (2y 5m to grant)

Patent 12461595: SYSTEM AND METHOD FOR EMBEDDED COGNITIVE STATE METRIC SYSTEM
Granted Nov 04, 2025 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 70%
Grant Probability With Interview: 99% (+64.2%)
Median Time to Grant: 3y 3m
PTA Risk: High
Based on 232 resolved cases by this examiner. Grant probability derived from career allow rate.
