Notice of Pre-AIA or AIA Status
This final Office action is in response to Application No. 17/842,611 filed 06/16/2022. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
The amendment filed 10/29/2025, which amends claims 1 and 5, cancels claims 2-4 and 6-7, and adds claims 8 and 9, has been entered. Claims 1, 5, and 8-9 are pending. Applicant’s amendments have overcome the 35 U.S.C. 112(b) and 101 rejections.
Response to Arguments
Applicant’s arguments with respect to 35 U.S.C. § 103 filed 10/29/2025 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 5, and 8-9 are rejected under 35 U.S.C. 103 as being unpatentable over Xu (US 2022/0067527 A1) in view of Nagel et al., “Data-Free Quantization Through Weight Equalization and Bias Correction” (NPL, published 2019) (hereinafter “Nagel”).
Regarding claim 1, Xu teaches:
A method of retraining a compression model executed in a computer device, wherein the computer device comprises at least one processor configured to execute computer-readable instructions included in a memory, and ([0028] “In some implementations, a network compression engine 205 may be equipped with logic executable to train deep neural networks to jointly optimize for both sparse and low-precision neural networks while maintaining high accuracy and providing a high compression rate.”)
training, by the at least one processor, a deep learning model to generate initially trained weights of the trained deep learning model; and ([0138] “Example 16 is a method including: performing a plurality of a training iterations using a set of training data to train a neural network model, where each of the plurality of training iterations includes a respective forward pass and a respective backward pass, and the neural network model includes a plurality of layers;”)
performing, by the at least one processor, a plurality of training iterations, wherein each training iteration of the plurality of training iterations comprises: ([0031] “Following completion of the training iteration, the next training iteration may be performed, with the next iteration of sparsification-quantization being performed on the updated full-precision weights.”)
pruning, by the at least one processor, the weights of the trained deep learning model to acquire pruned weights… ([0023] “For instance, network pruning may be utilized to assist in deploy neural network models on embedded systems with limited hardware resources. Network pruning may reduce network size by pruning redundant connections or channels from pre-trained models, and fine-tuning the pruned model to recover accuracy. During network pruning, while many of the related approaches differ in the method of determining the importance of weights or channels of the subject neural network model may be determined, with weights or channels determined to be of lesser importance pruned, or removed, from the model.”)
retraining, by the at least one processor, the deep learning model… ([0054] “During at least a subset of the training iterations to be performed during the training, weights of one or more layers may be sparsified 615 during corresponding forward passes of the training iteration. The sparsification may be according to a layer-wise statistic aware sparsification (or pruning) technique (among other example sparsification techniques), which results in a subset of the weights of the corresponding layer being removed (at least for this particular training iteration).” See FIG. 6 for training iterations.)
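For illustration only, the pruning step mapped above can be sketched as generic magnitude pruning in numpy; the function name and the smallest-magnitude threshold rule are illustrative assumptions and are not asserted to be Xu’s exact layer-wise statistic-aware method. The sketch also shows the effect recited in the claim: removing the smallest-magnitude weights raises the variance of the surviving weights.

```python
import numpy as np

def magnitude_prune(W, sparsity=0.5):
    """Zero out the smallest-magnitude fraction of weights.

    Generic magnitude pruning sketch (one common variant of the
    sparsification Xu describes, not necessarily his exact technique).
    Returns the pruned weight array and the boolean keep-mask.
    """
    k = int(np.ceil(sparsity * W.size))          # number of weights to prune
    if k == 0:
        return W.copy(), np.ones_like(W, dtype=bool)
    # threshold = k-th smallest absolute value across the layer
    thresh = np.partition(np.abs(W).ravel(), k - 1)[k - 1]
    mask = np.abs(W) > thresh                    # keep strictly larger weights
    return W * mask, mask
```

Because the removed weights cluster around zero, the empirical variance of the surviving (nonzero) weights exceeds that of the original layer, which is the variance increase the claim attributes to pruning.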
Xu does not teach:
the pruning increases a variance of the pruned weights;
reducing, by the at least one processor, the increased variance of the pruned weights to generate variance-reduced weights by applying variance equalization
However, Nagel does teach this:
the pruning increases a variance of the pruned weights; (Section 3.2 Biased Quantization Error “Such a biased error on the outputs can be introduced in many settings, e.g. when weights or activations are clipped [23], or in non-quantization approaches, such as weight tensor factorization or channel pruning [13, 34]”)
reducing, by the at least one processor, the increased variance of the pruned weights to generate variance-reduced weights by applying variance equalization (Section 4.1.2, Equalizing ranges over multiple layers, describes how the ranges of the channels/weights are equalized. This also equalizes the variance of the channels/weights.)
Xu and Nagel are considered analogous art to the claimed invention because they are in the same field of endeavor, namely model compression and quantization. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the weight pruning of Xu with the weight equalization of Nagel and then apply the retraining of Xu to the equalized weights. One would have been motivated to do so because pruning weights can introduce error (Nagel, Section 3.2 Biased Quantization Error) and weight equalization can mitigate these errors.
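For illustration only, the cross-layer equalization relied on above (Nagel, Section 4.1.2, scale factor s_i = sqrt(r1_i · r2_i) / r2_i) can be sketched in numpy; the function name and the choice of max absolute value as the per-channel range measure are illustrative assumptions.

```python
import numpy as np

def equalize_pair(W1, W2, eps=1e-12):
    """Cross-layer weight equalization sketch (after Nagel, Sec. 4.1.2).

    W1: (out, in) weights of layer 1; W2: (out2, out) weights of layer 2.
    Assumes a positively-scale-equivariant activation (e.g. ReLU) between
    the layers, so rescaling preserves the network function.
    """
    r1 = np.abs(W1).max(axis=1)                 # per-output-channel range of W1
    r2 = np.abs(W2).max(axis=0)                 # per-input-channel range of W2
    s = np.sqrt(r1 * r2) / np.maximum(r2, eps)  # s_i = sqrt(r1_i * r2_i) / r2_i
    W1_eq = W1 / s[:, None]                     # scale each row of W1 by 1/s_i
    W2_eq = W2 * s[None, :]                     # scale each column of W2 by s_i
    return W1_eq, W2_eq, s
```

After rescaling, channel i of both layers has range sqrt(r1_i · r2_i), so disparate per-channel ranges (and hence variance spread) introduced by a compression step are evened out while the layer pair computes the same function.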
Regarding claim 5, Xu teaches:
at least one processor implemented to execute computer-readable instructions included in a memory ([0100] “In an example, the instructions 1082 provided via the memory 1054, the storage 1058, or the processor 1052 may be embodied as a non-transitory, machine readable medium 1060 including code to direct the processor 1052 to perform electronic operations in the IoT device 1050.”)
training a deep learning model to generate initially trained weights of the trained deep learning model; and ([0138] “Example 16 is a method including: performing a plurality of a training iterations using a set of training data to train a neural network model, where each of the plurality of training iterations includes a respective forward pass and a respective backward pass, and the neural network model includes a plurality of layers;”)
performing, by the at least one processor, a plurality of training iterations, wherein each training iteration of the plurality of training iterations comprises: ([0031] “Following completion of the training iteration, the next training iteration may be performed, with the next iteration of sparsification-quantization being performed on the updated full-precision weights.”)
pruning the weights of the trained deep learning model to acquire pruned weights… ([0023] “For instance, network pruning may be utilized to assist in deploy neural network models on embedded systems with limited hardware resources. Network pruning may reduce network size by pruning redundant connections or channels from pre-trained models, and fine-tuning the pruned model to recover accuracy. During network pruning, while many of the related approaches differ in the method of determining the importance of weights or channels of the subject neural network model may be determined, with weights or channels determined to be of lesser importance pruned, or removed, from the model.”)
retraining the deep learning model… ([0054] “During at least a subset of the training iterations to be performed during the training, weights of one or more layers may be sparsified 615 during corresponding forward passes of the training iteration. The sparsification may be according to a layer-wise statistic aware sparsification (or pruning) technique (among other example sparsification techniques), which results in a subset of the weights of the corresponding layer being removed (at least for this particular training iteration).” See FIG. 6 for training iterations.)
Xu does not teach:
the pruning increases a variance of the pruned weights;
reducing the increased variance of the pruned weights to generate variance-reduced weights by applying variance equalization
However, Nagel does teach this:
the pruning increases a variance of the pruned weights; (Section 3.2 Biased Quantization Error “Such a biased error on the outputs can be introduced in many settings, e.g. when weights or activations are clipped [23], or in non-quantization approaches, such as weight tensor factorization or channel pruning [13, 34]”)
reducing the increased variance of the pruned weights to generate variance-reduced weights by applying variance equalization (Section 4.1.2, Equalizing ranges over multiple layers, describes how the ranges of the channels/weights are equalized. This also equalizes the variance of the channels/weights.)
Xu and Nagel are considered analogous art to the claimed invention because they are in the same field of endeavor, namely model compression and quantization. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the weight pruning of Xu with the weight equalization of Nagel and then apply the retraining of Xu to the equalized weights. One would have been motivated to do so because pruning weights can introduce error (Nagel, Section 3.2 Biased Quantization Error) and weight equalization can mitigate these errors.
Regarding claim 8, Xu in view of Nagel teaches claim 1 as outlined above. Nagel further teaches:
the reducing comprises reducing the increased variance of the pruned weights to generate the variance-reduced weights by applying the variance equalization based on the initially trained weights. (Section 4.1.2, Equalizing ranges over multiple layers, describes how the ranges of the channels/weights are equalized. This also equalizes the variance of the channels/weights.)
Regarding claim 9, Xu in view of Nagel teaches claim 5 as outlined above. Nagel further teaches:
reduce the increased variance of the pruned weights to generate the variance-reduced weights by applying the variance equalization based on the initially trained weights. (Section 4.1.2, Equalizing ranges over multiple layers, describes how the ranges of the channels/weights are equalized. This also equalizes the variance of the channels/weights.)
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DANIEL PATRICK GRUSZKA whose telephone number is (571)272-5259. The examiner can normally be reached M-F 9:00 AM - 6:00 PM ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li Zhen, can be reached at (571) 272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/DANIEL GRUSZKA/Examiner, Art Unit 2121
/Li B. Zhen/Supervisory Patent Examiner, Art Unit 2121