Prosecution Insights
Last updated: April 19, 2026
Application No. 18/411,542

IMPORTANCE-AWARE MODEL PRUNING AND RE-TRAINING FOR EFFICIENT CONVOLUTIONAL NEURAL NETWORKS

Non-Final OA: §101, §102, §103
Filed: Jan 12, 2024
Examiner: VARNDELL, ROSS E
Art Unit: 2674
Tech Center: 2600 — Communications
Assignee: Intel Corporation
OA Round: 1 (Non-Final)

Grant Probability: 85% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 2y 4m
With Interview: 98%

Examiner Intelligence

Career Allow Rate: 85% (above average); 520 granted / 615 resolved; +22.6% vs TC avg
Interview Lift: +13.0% (moderate), measured on resolved cases with interview
Typical Timeline: 2y 4m average prosecution; 28 applications currently pending
Career History: 643 total applications across all art units

Statute-Specific Performance

§101: 6.3% (-33.7% vs TC avg)
§102: 6.4% (-33.6% vs TC avg)
§103: 66.9% (+26.9% vs TC avg)
§112: 10.7% (-29.3% vs TC avg)
Tech Center averages are estimates • Based on career data from 615 resolved cases

Office Action

DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 26-45 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims recite mathematical calculations for pruning a neural network: computing a loss, deriving importance scores for weights based on that loss, selecting weights with lower importance scores, and setting those weights to zero. This judicial exception is not integrated into a practical application because the claims are drafted at a purely functional level, with no recitation of a specific technical improvement, particular machine, or transformation beyond generic computing components. The covariance-based improvement described in the specification, which is the actual technical advance over conventional pruning, is not recited in the claims. The claims do not include additional elements sufficient to amount to significantly more than the judicial exception because the recited additional elements (a neural network, a computer processor, and non-transitory computer-readable media) are generic computing components that are well-understood, routine, and conventional, and do not meaningfully limit the abstract idea.

Claims 26, 33, 40:

Step 1: All three are statutory categories (process, manufacture, machine).

Step 2A, Prong 1: The claims recite mathematical concepts - computing a loss, deriving importance scores from the loss, selecting weights by score, and zeroing them. This is a mathematical calculation/relationship under MPEP § 2106.04(a)(2)(I). Judicial exception identified.

Step 2A, Prong 2: No practical application. The claims are purely functional, with no recitation of a specific technical improvement, particular machine, or transformation. The covariance-based improvement described in the specification is not claimed. A "neural network" at this level of abstraction is itself a mathematical construct.

Step 2B: The generic processor and non-transitory media are well-understood, routine, and conventional. No inventive concept.

Conclusion: Claims 26, 33, and 40 are rejected under 35 U.S.C. § 101 as directed to the abstract idea of mathematical calculations - computing a loss, scoring weights by their effect on that loss, selecting low-scoring weights, and zeroing them - without integration into a practical application or significantly more. See MPEP §§ 2106, 2106.04, 2106.04(a)(2)(I), 2106.05, 2106.07; Alice Corp. v. CLS Bank Int'l, 573 U.S. 208 (2014); Mayo Collaborative Servs. v. Prometheus Labs., Inc., 566 U.S. 66 (2012); Recentive Analytics, Inc. v. Fox Corp., 134 F.4th 1205 (Fed. Cir. 2025).

| Claim | Meaningful Limitation Added | Eligible? | Rationale |
| 27, 34, 41 | Select weight with smaller importance score | No | Mathematical comparison - adds math to math |
| 28, 35, 42 | Input data is training data | No | Field-of-use; does not integrate exception into practical application |
| 29, 36, 43 | Network already trained; maintain unselected weights; further train after zeroing | No | Data-state descriptor and additional mathematical steps on the model |
| 30, 37, 44 | During retraining, hold zeros fixed; update only non-zero weights | No | Constrained optimization - limits abstract idea without integrating it into a practical application |
| 31, 38, 45 | Select and zero an additional unselected weight | No | Iterating the same abstract mathematical process |
| 32, 39 | Layers are convolutional | No | Architectural field-of-use limitation; per Recentive Analytics, applying math to a particular model type is insufficient |

All dependent claims 27-32, 34-39, and 41-45 remain ineligible. None adds a limitation that (1) reflects a specific technical improvement grounded in the claim language, (2) ties the abstract idea to a particular machine in a non-generic way, or (3) effects a transformation of a physical article. The dependent claims collectively add mathematical refinements (score comparison method, data type, training state, fixed-zero retraining, iteration, architecture type), all of which operate entirely within the abstract idea or at best constitute field-of-use or data-category narrowing.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 26-31, 33-38, and 40-45 are rejected under 35 U.S.C.
102(a)(1) as being anticipated by Le Cun et al., "Optimal Brain Damage," Advances in Neural Information Processing Systems, vol. 2, pp. 598-605, 1990 (hereinafter "OBD").

Claims 26, 33, and 40. (New) OBD discloses one or more non-transitory computer-readable media storing instructions executable to perform operations (OBD describes a computational algorithm executed on a trained network: "training has converged" and the method proceeds by "computing the second derivatives ... hkk" via backpropagation, then iterating "to step 2." The entire OBD procedure (forward pass, Hessian diagonal computation, saliency ranking, and weight deletion) is an algorithmic sequence that can only be performed by a processor operating on weights stored in memory. The paper reports empirical results on a digit recognition system, confirming actual computer implementation. The examiner takes official notice that implementing such an algorithm on a processor and storing it on a non-transitory computer-readable medium was well-known and the only possible means of execution.), the operations comprising:

providing input data to a neural network, the neural network comprising one or more layers with weights, the input data processed in the one or more layers (OBD: "The method was validated using our handwritten digit recognition network trained with backpropagation ... The network state is computed using the standard formulae [equation: xi = f(ai), ai = Σj Wij xj] ... where xi is the state of unit i, ai its total input (weighted sum) ... and Wij is the connection going from unit j to unit i" (Section 2.1). This teaches providing input data to a multi-layer neural network whose layers have weight parameters Wij.);

computing a loss of the neural network based on the input data and the weights (OBD: "We assume the objective function is the usual mean-squared error (MSE); generalization to other additive error measures is straightforward ... We approximate the objective function E by a Taylor series. A perturbation δU of the parameter vector will change the objective function by [equation: δE = Σi gi δui + ½ Σi hii δui² + ½ Σi≠j hij δui δuj + O(‖δU‖³)]" (Section 2). This teaches computing a loss function E (the MSE objective) as a function of the input data and the network weights.);

determining importance scores for the weights based on the loss, an importance score of a weight indicating a measurement of a change in the loss by removing the weight (OBD: "it is more than reasonable to define the saliency of a parameter to be the change in the objective function caused by deleting that parameter ... 4. Compute the saliencies for each parameter: sk = hkk uk2 / 2" (Section 2 and Section 2.2, The Recipe, Step 4). The saliency sk is derived from the diagonal second derivative hkk of the loss E and the weight value uk, and directly approximates ΔE – the change in the loss – caused by deleting (setting to zero) weight parameter k. This teaches determining an importance score (saliency) for each weight that indicates a measurement of the change in the loss by removing that weight.);

selecting one or more weights based on the importance scores of the weights (OBD: "Sort the parameters by saliency and delete some low-saliency parameters" (Section 2.2, The Recipe, Step 5). This teaches selecting one or more weights – specifically those with the lowest importance scores (saliencies) – for removal.); and

changing the one or more selected weights to one or more zeros (OBD: "Deleting a parameter is defined as setting it to 0 and freezing it there" (Section 2.2). This teaches changing the selected low-saliency weights to zero.).

Claims 27, 34, and 41.
(New) OBD discloses the one or more non-transitory computer-readable media of claims 26, 33, and 40, wherein selecting the one or more weights based on the importance scores of the weights comprises: comparing an importance score of a first weight with an importance score of a second weight; and selecting the first weight over the second weight based on the importance score of the first weight being smaller than the importance score of the second weight (OBD: "Sort the parameters by saliency and delete some low-saliency parameters" (Section 2.2, Step 5); "It is clear that deleting parameters by order of saliency causes a significantly smaller increase of the objective function than deleting them according to their magnitude" (Section 2.3). Sorting by saliency rank and deleting those with the lowest (smallest) saliency scores directly teaches comparing the importance scores of individual weights and selecting a weight with a smaller importance score over a weight with a larger importance score for deletion.).

Claims 28, 35, and 42. (New) OBD discloses the one or more non-transitory computer-readable media of claims 26, 33, and 40, wherein the input data is training data used to train the neural network (OBD: "It was trained on a database of segmented handwritten zip code digits and printed digits containing approximately 9300 training examples" (Section 2.3). This teaches that the input data provided to the neural network is training data used in training the network.).

Claims 29, 36, and 43. (New) OBD discloses the one or more non-transitory computer-readable media of claims 26, 33, and 40, wherein the neural network has been trained, and the operations further comprise: maintaining one or more values of one or more unselected weights; and after changing the one or more selected weights to the one or more zeros and maintaining the one or more values of the one or more unselected weights, further training the neural network (OBD: "Train the network until a reasonable solution is obtained" (Section 2.2, Step 2) – the neural network has been trained prior to pruning. "Deleting a parameter is defined as setting it to 0 and freezing it there" (Section 2.2) – only the selected low-saliency weights are frozen at zero; the unselected weights retain their current values (are maintained). "Iterate to step 2" (Section 2.2, Step 6) – the network undergoes further training (retraining) after the selected weights are set to zero. This teaches maintaining the values of unselected weights while further training the network after changing selected weights to zeros.).

Claims 30, 37, and 44. (New) OBD discloses the one or more non-transitory computer-readable media of claims 29, 36, and 43, wherein further training the neural network comprises: maintaining the one or more zeros; and modifying the one or more values of the one or more unselected weights (OBD: "Deleting a parameter is defined as setting it to 0 and freezing it there" (Section 2.2). The phrase "freezing it there" expressly teaches that the zeroed weights are maintained (held at zero) throughout further training. The remaining unselected weights are then updated through continued backpropagation in Step 2, thereby modifying their values while the zeros are maintained. This teaches maintaining zeros of selected weights and modifying values of unselected weights during further training.).

Claims 31, 38, and 45.
(New) OBD discloses the one or more non-transitory computer-readable media of claims 29, 36, and 43, wherein the operations further comprise: selecting an additional weight from the one or more unselected weights based on one or more importance scores of the one or more unselected weights; and changing the additional weight to a zero (OBD: "Iterate to step 2" (Section 2.2, Step 6). In each subsequent iteration, OBD recomputes the second-derivative saliency scores (Steps 3-4) for the remaining non-zero (previously unselected) weights, then "sort[s] the parameters by saliency and delete[s] some low-saliency parameters" (Step 5) – i.e., selects one or more additional weights from the unselected pool based on their updated importance scores and sets them to zero. This teaches iteratively selecting additional weights from the unselected weights based on importance scores and changing them to zero. See also OBD Section 2.3, Figure 2, showing the performance benefit of iterative pruning with retraining.).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims, the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 32 and 39 are rejected under 35 U.S.C. 103 as being unpatentable over OBD as applied to claims 26 and 33 above, in view of Han et al., "Learning both Weights and Connections for Efficient Neural Networks," arXiv:1506.02626, 2015 (hereinafter "Han").

Claims 32 and 39. (New) The one or more non-transitory computer-readable media of claims 26 and 33, wherein the one or more layers comprises one or more convolutional layers. OBD does not explicitly recite that the one or more layers comprise one or more convolutional layers, as OBD describes its network as a shared-weight architecture. However, Han, in the same field of neural network pruning for computational efficiency, explicitly teaches that the weight selection and zeroing methodology is applicable to one or more convolutional layers (Han: "Both CONV and FC layers can be pruned, but with different sensitivity" (Section 5); "pruning reduces the number of weights by 12x and computation by 6x ... [for layers] conv1, conv2" (Table 3, LeNet-5 results); "We further examine the performance of pruning on the ImageNet ... dataset ... VGG-16 has far more convolutional layers ...
We aggressively pruned both convolutional and fully-connected layers" (Section 4.3). This teaches that a weight-pruning-and-zeroing methodology like that of OBD applies to, and is particularly beneficial for, one or more convolutional layers in a CNN.)

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply OBD's importance-score-based weight pruning to convolutional layers, as taught by Han. The motivation for this combination would have been to reduce the storage and computational burden of convolutional layers, which, as Han demonstrates, account for the majority of parameters and arithmetic operations in deep CNNs used for computer vision tasks, thereby enabling deployment of accurate neural network models on resource-constrained mobile and embedded devices without loss of predictive accuracy.

Conclusion

The prior art made of record and not relied upon, yet considered pertinent to the applicant's disclosure, is listed on the PTO-892 form.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Ross Varndell, whose telephone number is (571) 270-1922. The examiner can normally be reached M-F, 9-5 EST.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, O'Neal Mistry, can be reached at (313) 446-4912. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Ross Varndell/
Primary Examiner, Art Unit 2674
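
For orientation, the OBD recipe that the §102 mapping walks through (compute diagonal-Hessian saliencies sk = hkk uk²/2, zero the lowest-saliency weights, freeze the zeros "there," and keep training the survivors) can be sketched in a few lines of NumPy. This is our illustrative reconstruction, not code from OBD or the application; all function names are ours, and the diagonal Hessian (which OBD obtains via a backpropagation-like pass) is simply passed in by the caller.

```python
import numpy as np

def saliency(weights, hessian_diag):
    """OBD saliency s_k = h_kk * u_k^2 / 2: the estimated increase in the
    loss E caused by deleting (zeroing) parameter u_k."""
    return 0.5 * hessian_diag * weights ** 2

def prune_lowest(weights, hessian_diag, mask, n_delete):
    """Zero the n_delete lowest-saliency weights that are still alive.
    mask: True = alive, False = deleted (frozen at zero)."""
    s = saliency(weights, hessian_diag)
    s[~mask] = np.inf                    # already-deleted weights are ineligible
    idx = np.argsort(s)[:n_delete]       # smallest saliencies first
    mask = mask.copy()
    mask[idx] = False                    # mark the selected weights as deleted
    return np.where(mask, weights, 0.0), mask

def masked_sgd_step(weights, grad, mask, lr=0.1):
    """One retraining step honoring 'freezing it there': pruned slots stay
    exactly zero; every other weight takes an ordinary gradient update."""
    return np.where(mask, weights - lr * grad, 0.0)

# One prune-then-retrain iteration on toy values (unit Hessian diagonal,
# so saliency reduces to u_k^2 / 2 and the two tiny weights are deleted):
w = np.array([0.5, -0.01, 2.0, 0.001])
h = np.ones_like(w)
mask = np.ones_like(w, dtype=bool)
w, mask = prune_lowest(w, h, mask, n_delete=2)
w = masked_sgd_step(w, np.full(4, 0.5), mask)   # zeros remain frozen
```

Iterating these two steps (recompute saliencies for the surviving weights, delete a few more, retrain) mirrors OBD's "Iterate to step 2" loop cited against claims 31, 38, and 45.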

Prosecution Timeline

Jan 12, 2024
Application Filed
Apr 01, 2026
Non-Final Rejection — §101, §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12603810: System and Method for Communications Beam Recovery (granted Apr 14, 2026; 2y 5m to grant)
Patent 12597238: AUTOMATIC IMAGE VARIETY SIMULATION FOR IMPROVED DEEP LEARNING PERFORMANCE (granted Apr 07, 2026; 2y 5m to grant)
Patent 12582348: DEVICE AND METHOD FOR INSPECTING A HAIR SAMPLE (granted Mar 24, 2026; 2y 5m to grant)
Patent 12579441: SYSTEMS AND METHODS FOR IMAGE RECONSTRUCTION (granted Mar 17, 2026; 2y 5m to grant)
Patent 12579786: SYSTEM AND METHOD FOR PROPERTY TYPICALITY DETERMINATION (granted Mar 17, 2026; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 85%
With Interview: 98% (+13.0%)
Median Time to Grant: 2y 4m
PTA Risk: Low
Based on 615 resolved cases by this examiner. Grant probability derived from career allow rate.
