Prosecution Insights
Last updated: April 19, 2026
Application No. 18/362,435

Neural Network Training Method and Related Device

Non-Final OA (§101, §102, §103)
Filed: Jul 31, 2023
Examiner: HICKS, AUSTIN JAMES
Art Unit: 2142
Tech Center: 2100 — Computer Architecture & Software
Assignee: Huawei Technologies Co., Ltd.
OA Round: 1 (Non-Final)
Grant Probability: 76% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 3y 4m
Grant Probability with Interview: 99%

Examiner Intelligence

Career Allow Rate: 76% (above average; 308 granted / 403 resolved; +21.4% vs TC avg)
Interview Lift: +25.1% (strong), among resolved cases with interview
Typical Timeline: 3y 4m avg prosecution; 54 currently pending
Career History: 457 total applications across all art units

Statute-Specific Performance

§101: 13.9% (-26.1% vs TC avg)
§103: 46.3% (+6.3% vs TC avg)
§102: 17.3% (-22.7% vs TC avg)
§112: 19.2% (-20.8% vs TC avg)
TC averages are estimates • Based on career data from 403 resolved cases

Office Action

Rejections: §101, §102, §103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea of a mathematical concept without significantly more. The claims recite calculating the weights of a neural network using forward propagation, determining the fit using expansion and binarization, calculating the gradient of the error for the weights, calculating a subfunction of the series expansion, fitting the error to a neural network, and using a Fourier, wavelet, or discrete Fourier expansion. This judicial exception is not integrated into a practical application because the claims to a specific data type merely link the abstract idea to the field of computers. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because components such as the memory and processor are generic computer parts.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action: A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1, 2, 4, 5, 7-9, 11, 12, 14-16, 18 and 19 are rejected under 35 U.S.C. 102(a)(1) as being described by "Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks" by Gong et al.

Gong teaches claims 1, 8 and 15. A neural network training method, comprising: (Gong abs.: "DSQ can automatically evolve during training to gradually approximate the standard quantization.") performing, in a forward propagation process and using a binarization function (Gong sec. 3.1: "For 1-bit binary quantization, the binary neural network (BNN) limits its activations and weights to either -1 or +1 usually using the binary function… sgn(x)…" The binarizing happens in the forward pass in Algorithm 1 in section 3.5, see below.), [Image: excerpt of Gong Algorithm 1] binarization processing on a target weight to obtain a weight of a first neural network layer in a neural network (see Gong Algorithm 1 above, where wq is the weight and it is a function of binarization of the weight wsq; these weights are the layer weights), or on an activation value of a second neural network layer in the neural network to obtain an input of the first neural network layer (the inputs are quantized/binarized too, because Gong sec. 4.1 says "When building up a quantized model, we simply insert DSQ function to all places that will be quantized, e.g., the inputs and weights of a convolution layer."); determining a fitting function based on series expansion of the binarization function (Gong sec. 1: "DSQ employs a series of hyperbolic tangent functions to gradually approach the staircase function for low-bit quantization (e.g., sign for 1-bit case), and meanwhile keeps the smoothness for easy gradient calculation." The binarization is approximated by a series of tangent functions.); and calculating, in a backward propagation process, a first gradient of a loss function with respect to the target weight using a second gradient of the fitting function as a third gradient of the binarization function. (Gong sec. 1: "DSQ employs a series of hyperbolic tangent functions to gradually approach the staircase function for low-bit quantization (e.g., sign for 1-bit case), and meanwhile keeps the smoothness for easy gradient calculation." Gong abs.: "Owing to its differentiable property, DSQ can help pursue the accurate gradients in backward propagation…" This shows that Gong uses a series of smooth, differentiable tangent functions to approximate the gradient used in backpropagation. This teaches taking the gradient of the fitting function instead of the gradient of the binarization function, because binarization functions are not differentiable.)

Gong teaches claims 2, 9 and 16. The neural network training method of claim 1, further comprising determining a plurality of subfunctions based on the series expansion, wherein the fitting function comprises the plurality of subfunctions and an error function. (Gong sec. 1: "DSQ employs a series of hyperbolic tangent functions to gradually approach the staircase function for low-bit quantization (e.g., sign for 1-bit case), and meanwhile keeps the smoothness for easy gradient calculation." The subfunctions are the tangent functions; the error function is included because the gradient is calculated, and the gradient is a gradient of error.)

Gong teaches claims 4, 11 and 18. The neural network training method of claim 2, further comprising fitting the error function using at least one neural network layer, wherein calculating the first gradient comprises: calculating, in the backward propagation process, fourth gradients of the plurality of subfunctions with respect to the target weight; calculating a fifth gradient of the at least one neural network layer with respect to the target weight; and calculating the first gradient based on the fourth gradients and the fifth gradient. (Gong Algorithm 1 and equation 6, see below. The gradient of alpha is the gradient of the subfunctions. It is all with respect to weight because Gong sec. 4.1 says "When building up a quantized model, we simply insert DSQ function to all places that will be quantized, e.g., the inputs and weights of a convolution layer.") [Image: Gong Algorithm 1] [Image: Gong Equation 6]

Gong teaches claims 5, 12 and 19. The neural network training method of claim 1, further comprising determining a plurality of subfunctions based on the series expansion, wherein the fitting function comprises the plurality of subfunctions. (Gong sec. 1: "DSQ employs a series of hyperbolic tangent functions to gradually approach the staircase function for low-bit quantization (e.g., sign for 1-bit case), and meanwhile keeps the smoothness for easy gradient calculation." The tangent functions are the series expansion and, collectively, they are the fitting function.)

Gong teaches claims 7 and 14. The neural network training method of claim 1, wherein a data type of the target weight is a 32-bit floating point type, a 64-bit floating point type, a 32-bit integer type, or an 8-bit integer type. (Gong sec. 2.2: "frameworks usually support 8-bit integer arithmetic…" Gong sec. 4.5 says "while existing open-source high performance inference frameworks (e.g., NCNN-8-bit [31]) usually only support 8-bit operations. In practice, the lower bitwidth doesn't mean a faster inference speed, mainly due to the overflow and transferring among the registers…" This teaches an 8-bit integer type as the standard before this paper. That means the 8-bit integer type was known at the time of filing.)

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 3, 10 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over "Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks" by Gong et al. and "Bi-Real Net: Enhancing the Performance of 1-bit CNNs With Improved Representational Capability and Advanced Training Algorithm" by Liu et al. Claims 6, 13 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over "Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks" by Gong et al. and https://www.physics.smu.edu/scalise/P4321sp20/fs.pdf (Olver).

Gong teaches claims 3, 10 and 17. The neural network training method of claim 2, further comprising fitting the error function by using a (Gong sec. 4.1: "All convolution and fully-connected layers except the first and the last one are quantized with DSQ.") Gong doesn't teach a two-layer network with a residual. However, Liu teaches a two-layer fully connected neural network with a residual. (Liu p. 14: "keep the weights and activations in the first convolution and the last fully-connected layers to be real-valued." Liu p. 2: "we propose to keep these real activations via adding a simple yet effective shortcut, dubbed Bi-Real net. As shown in Fig. 1(b), the shortcut connects the real activations to an addition operator with the real-valued activations of the next block." The shortcut connection is the residual connection.) Liu, Gong and the claims are all directed to training fully connected neural networks. It would have been obvious to a person having ordinary skill in the art, at the time of filing, to have a residual shortcut to "keep these real activations" (Liu p. 14) because "the representational capability of the Bi-Real net is significantly enhanced and the additional cost on computation is negligible." (Liu abs.)

Gong teaches claims 6, 13 and 20. The neural network training method of claim 1, wherein the series expansion is a (Gong sec. 1: "DSQ employs a series of hyperbolic tangent functions to gradually approach the staircase function for low-bit quantization (e.g., sign for 1-bit case), and meanwhile keeps the smoothness for easy gradient calculation.") Gong doesn't teach a Fourier series expansion of the binarization function. However, Olver teaches a Fourier series expansion of the binarization function. (Olver p. 645: "Thus, the Fourier series converges, as expected, to f(x) at all points of continuity; at discontinuities, the Fourier series can't decide whether to converge to the right or left hand limit, and so ends up 'splitting the difference' by converging to their average; see Figure 12.4." Fig. 12.4 shows a Fourier series of a binary/step function, and equation 12.41 shows a Fourier series expansion of a step function, below.) [Image: Olver Fig. 12.4] [Image: Olver eq. 12.41] Olver, Gong and the claims all turn a non-continuous step function into a smooth function. It would have been obvious to a person having ordinary skill in the art, at the time of filing, to use a Fourier series to do the conversion because "If f(x) is any piecewise continuous function, then its Fourier coefficients are well defined — the integrals (12.28) exist and are finite." (Olver p. 644)

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Austin Hicks, whose telephone number is (571) 270-3377. The examiner can normally be reached Monday - Thursday, 8-4 PST. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Mariela Reyes, can be reached at (571) 270-1006. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/AUSTIN HICKS/ Primary Examiner, Art Unit 2142
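The training pattern the §102 rejection maps onto Gong can be sketched in a few lines: binarize the latent weight with a hard sign function in the forward pass, then backpropagate through the gradient of a smooth fitting function instead of the (zero almost everywhere) gradient of the sign. This is an illustrative sketch only, not the claimed method or Gong's exact DSQ function; the single tanh with sharpness k and the toy upstream gradient are assumptions.

```python
import numpy as np

def binarize(w):
    """Forward pass: hard binarization to -1/+1 (non-differentiable)."""
    return np.where(w >= 0, 1.0, -1.0)

def fit_grad(w, k=5.0):
    """Backward pass: gradient of a smooth fitting function tanh(k*w),
    used as a stand-in for the gradient of the binarization function."""
    return k * (1.0 - np.tanh(k * w) ** 2)

# Toy chain rule: dL/dw = dL/d(binarized w) * d(fitting fn)/dw
w = np.array([-0.8, -0.1, 0.2, 0.9])        # latent real-valued target weights
w_bin = binarize(w)                          # used in forward propagation
upstream = np.array([0.5, -1.0, 1.0, 0.25])  # assumed dL/dw_bin from later layers
grad_w = upstream * fit_grad(w)              # gradient reaching the target weight
```

Note how `fit_grad` is largest near w = 0 and saturates for large |w|, so updates concentrate on weights whose sign is still uncertain; this is the smoothness property the rejection quotes from Gong.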
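Olver's observation, quoted in the §103 rejection, that a Fourier series smoothly approximates a step function and "splits the difference" at the jump, can be checked numerically. The sketch below uses the standard square-wave expansion sign(x) ≈ (4/π) Σ sin((2k+1)x)/(2k+1) on (-π, π) as an assumed example; it is not an expansion recited in the claims.

```python
import numpy as np

def fourier_sign(x, n_terms=200):
    """Partial Fourier sum of the step function sign(x) on (-pi, pi):
    (4/pi) * sum over odd n of sin(n*x)/n."""
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    for k in range(n_terms):
        n = 2 * k + 1
        total += np.sin(n * x) / n
    return 4.0 / np.pi * total

# At points of continuity the partial sums approach sign(x); at the
# discontinuity x = 0 every sine term vanishes, so the series converges
# to 0, the average of the left and right limits -1 and +1.
vals = fourier_sign([1.0, -1.0, 0.0], n_terms=400)
```

Each partial sum is smooth and differentiable everywhere, which is exactly the property that makes a series expansion usable as a fitting function for backpropagation.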

Prosecution Timeline

Jul 31, 2023
Application Filed
Feb 24, 2026
Non-Final Rejection — §101, §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12591767
NEURAL NETWORK ACCELERATION CIRCUIT AND METHOD
2y 5m to grant Granted Mar 31, 2026
Patent 12554795
REDUCING CLASS IMBALANCE IN MACHINE-LEARNING TRAINING DATASET
2y 5m to grant Granted Feb 17, 2026
Patent 12530630
Hierarchical Gradient Averaging For Enforcing Subject Level Privacy
2y 5m to grant Granted Jan 20, 2026
Patent 12524694
OPTIMIZING ROUTE MODIFICATION USING QUANTUM GENERATED ROUTE REPOSITORY
2y 5m to grant Granted Jan 13, 2026
Patent 12524646
VARIABLE CURVATURE BENDING ARC CONTROL METHOD FOR ROLL BENDING MACHINE
2y 5m to grant Granted Jan 13, 2026
Study what changed to get past this examiner, based on the 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 76%
With Interview: 99% (+25.1%)
Median Time to Grant: 3y 4m
PTA Risk: Low
Based on 403 resolved cases by this examiner. Grant probability derived from career allow rate.
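The headline figures are consistent with the career counts: 308 grants out of 403 resolved cases gives the 76% grant probability. How the +25.1% interview lift relates to the 99% figure is not disclosed; the reading below, that the lift is the gap between with-interview and without-interview allow rates, is an assumption for illustration.

```python
# Career figures from the examiner panel
granted, resolved = 308, 403
allow_rate = granted / resolved              # ~0.764, displayed as 76%

# Assumed reading of the interview lift: the difference between
# with-interview and without-interview allow rates.
with_interview = 0.99                        # displayed figure
without_interview = with_interview - 0.251   # ~0.739 under this assumption
```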
