Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 2/2/2026 has been entered.
Response to Arguments
Applicant’s arguments have been considered but are moot in view of the new grounds of rejection.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 1, 5-6, 8, 12, and 15-20 are rejected under 35 U.S.C. 103 as being unpatentable over US 20220044114 A1 to Sriram et al. (hereinafter referred to as “Sri”), in view of US 20230139347 A1 to Bondarenko et al. (hereinafter referred to as “Bond”).
Regarding claim 1, Sri discloses an apparatus comprising at least one processor ([0002]); and at least one non-transitory memory comprising instructions that, when executed with the at least one processor ([0118]), cause the apparatus at least to perform:
determine one or more quantization parameters (quantizers) based at least on one or more of the following ([0112], wherein the weights can be quantized by finding the absolute maximum value):
a mean absolute value computed based on a set of parameters of a neural network comprising a parameter;
a maximum absolute value computed based on a set of activations of the neural network comprising an activation ([0113], wherein both weights and activations are uniformly quantized during the forward-pass of the training using the absolute maximum value of weights and a running average of the absolute maximum value of activations);
a number of parameters in the set of parameters of the neural network comprising the parameter ([0113]); or
a maximum absolute value computed based on an output value computed based on the parameter and the activation; and
quantize at least one of the parameters or the activation based at least on the one or more quantization parameters ([0112], wherein parameters may be quantized; [0119], quantizing the parameters); and
overfitting one or more multiplier parameters, wherein the one or more multiplier parameters are used for scaling one or more activations to a range ([0098]), and wherein values of the one or more multiplier parameters are determined based at least on a training process ([0247]). According to the instant applicant's publication at [0493], overfitting refers to training or fine-tuning a neural network. Therefore, consistent with applicant's specification, Sri discloses DNN training in [0060]. Sri further discloses in [0062] that, during training, a neural network first applies QAT to generate a first trained model and then applies PTQ on the first trained model to output a second trained model that has parameters (e.g., weights and activations) represented by low-bit integers. In addition, Sri discloses in [0070] that QAT may be applied by quantizing all weights and activations of the neural network except for layers that require finer granularity in representation than the 8-bit quantization can provide (e.g., regression layers). In some embodiments, QAT applies quantization to all weights and activations except for the last layer of the neural network, which may result in a mixed-precision DNN model. Weights refer to the parameters within a neural network that transform input data within the network's hidden layers. Within each node of the neural network, there is a set of inputs, weights, and a bias value; as the input enters each node, a weight is multiplied on the input, a bias is added, and an activation function is applied. The resulting output is either observed or passed to the next layer in the neural network. Each input node takes in information that can be numerically expressed (e.g., activation values), and this information is passed through the neural network during training.
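For clarity of the record, the absolute-maximum quantization scheme Sri describes at [0112]-[0113] (quantizing weights by the absolute maximum value, and activations by a running average of the absolute maximum) can be sketched as follows. This is an illustrative sketch only: the function names, the 8-bit width, and the momentum value are assumptions for illustration, not taken from either reference.

```python
def absmax_quantize(values, num_bits=8):
    # Symmetric quantization: the scale is derived from the absolute
    # maximum value, as in Sri [0112]; bit-width is an assumed parameter.
    qmax = 2 ** (num_bits - 1) - 1
    scale = max(abs(v) for v in values) / qmax
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale

def update_running_absmax(running, activations, momentum=0.9):
    # Running average of the absolute maximum of activations, per [0113];
    # the momentum value is an assumption.
    current = max(abs(a) for a in activations)
    return momentum * running + (1 - momentum) * current
```

For example, quantizing the weights [0.5, -1.0, 0.25, 0.75] with this sketch maps the largest-magnitude weight to the integer extreme and scales the rest proportionally.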
Sri fails to disclose wherein the one or more quantization parameters are determined further based at least on one or more of the following: a maximum absolute value of one or more kernel weights, a scaling factor of one or more input activations, a precision of an accumulator, or an approximate maximum value in accumulation.
However, in the same field of endeavor, Bond discloses wherein the one or more quantization parameters are determined further based at least on one or more of the following: a maximum absolute value of one or more kernel weights, a scaling factor of one or more input activations, a precision of an accumulator, or an approximate maximum value in accumulation ([0050], wherein, for example, distinct scaling factors and zero-points may be applied per embedding dimension of an activation tensor rather than having two scalars for an entire activation tensor; as such, the quantization parameters may be collectively denoted by vectors s, z ∈ ℝ^d, where s represents the scaling factors and z represents the zero-points).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the apparatus disclosed by Sri to disclose wherein the one or more quantization parameters are determined further based at least on one or more of the following: a maximum absolute value of one or more kernel weights, a scaling factor of one or more input activations, a precision of an accumulator, or an approximate maximum value in accumulation, as taught by Bond, to improve quantization for neural networks ([0026], Bond).
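The per-dimension scaling-factor and zero-point scheme Bond describes at [0050] can be sketched as follows, purely for illustration; the function names, the unsigned 8-bit range, and the example values are assumptions and are not drawn from Bond.

```python
def affine_quantize(column, scale, zero_point, num_bits=8):
    # Asymmetric (affine) quantization with a scaling factor s and
    # zero-point z, i.e. q = round(x / s) + z, clipped to the integer
    # range, in the manner Bond describes at [0050].
    qmin, qmax = 0, 2 ** num_bits - 1
    return [max(qmin, min(qmax, round(x / scale) + zero_point)) for x in column]

def per_dimension_quantize(tensor, scales, zero_points):
    # Distinct s and z per embedding dimension (column) of the activation
    # tensor, rather than two scalars for the entire tensor; scales and
    # zero_points play the role of the vectors s, z in R^d.
    columns = list(zip(*tensor))
    qcols = [affine_quantize(col, s, z)
             for col, s, z in zip(columns, scales, zero_points)]
    return [list(row) for row in zip(*qcols)]
```

Each column of the tensor is quantized with its own (s, z) pair, so dimensions with very different dynamic ranges do not share a single scalar scale.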
Regarding claim 5, analyses are analogous to those presented for claim 1 and are applicable for claim 5.
Regarding claim 6, Sri discloses the apparatus of claim 5, wherein the apparatus is further caused to: determine allocation bits to be used for representing the set of activations and the set of parameters of the neural network, based on one or more test data samples ([0519]); and signal the allocation bits to a decoder ([0397], wherein send instruction to decoder).
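Regarding the determination of allocation bits from test data samples, one possible realization, offered purely for illustration and not drawn from either reference, is to select the smallest bit-width whose quantization error on the test samples stays within a tolerance. The candidate widths and the error bound below are assumptions.

```python
def select_allocation_bits(samples, candidates=(4, 6, 8), max_err=0.05):
    # Illustrative sketch: pick the smallest candidate bit-width whose
    # worst-case symmetric-quantization error on the test samples stays
    # within an assumed tolerance.
    for bits in sorted(candidates):
        qmax = 2 ** (bits - 1) - 1
        scale = max(abs(v) for v in samples) / qmax
        err = max(abs(v - max(-qmax, min(qmax, round(v / scale))) * scale)
                  for v in samples)
        if err <= max_err:
            return bits
    return max(candidates)
```

The selected bit-width could then be signaled to a decoder so that both sides use the same integer representation.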
Regarding claim 8, analyses are analogous to those presented for claim 1 and are applicable for claim 8.
Regarding claim 12, analyses are analogous to those presented for claim 1 and are applicable for claim 12.
Regarding claim 15, analyses are analogous to those presented for claim 1 and are applicable for claim 15.
Regarding claim 16, analyses are analogous to those presented for claim 1 and are applicable for claim 16.
Regarding claim 17, Sri discloses the apparatus of claim 1, wherein values of the one or more multiplier parameters are determined based at least on a training process (see the discussion of Sri's training disclosures at [0060], [0062], and [0070], and of applicant's publication at [0493], presented for claim 1 above).
Regarding claim 18, analyses are analogous to those presented for claim 17 and are applicable for claim 18.
Regarding claim 19, analyses are analogous to those presented for claim 17 and are applicable for claim 19.
Regarding claim 20, analyses are analogous to those presented for claim 17 and are applicable for claim 20.
Claim(s) 2-4, 7, 9-11, and 13 are rejected under 35 U.S.C. 103 as being unpatentable over US 20220044114 A1 to Sriram et al. (hereinafter referred to as “Sri”), in view of US 20230139347 A1 to Bondarenko et al. (hereinafter referred to as “Bond”), and further in view of US 20190191172 A1 to Ruanovskyy et al. (hereinafter referred to as “Rus”).
Regarding claim 2, Sri discloses the apparatus of claim 1 (see the rejection of claim 1).
Sri fails to disclose wherein the apparatus is caused to signal the one or more quantization parameters to a decoder.
However, in the same field of endeavor, Rus discloses wherein the apparatus is caused to signal the one or more quantization parameters to a decoder ([0139]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the apparatus disclosed by Sri to disclose wherein the apparatus is caused to signal the one or more quantization parameters to a decoder, as taught by Rus, to improve the compression efficiency ([0024], Rus).
Regarding claim 3, Rus discloses the apparatus of claim 2, wherein the one or more quantization parameters are signaled as part of a supplemental enhancement information message or an adaptation parameter set ([0139]).
Regarding claim 4, Rus discloses the apparatus of claim 2, wherein the apparatus is further caused to signal association between the signaled one or more quantization parameters and data to be quantized by using the one or more quantization parameters ([0139]).
Regarding claim 7, Sri discloses the apparatus of claim 5, wherein the apparatus is further caused to: determine the one or more quantization parameters for one or more activations based on one or more test data samples ([0112]-[0113]).
Sri fails to disclose: signal the one or more quantization parameters to a decoder, wherein a reconstructed quantization parameter is applied by the decoder to quantize the one or more activations at a decoding stage.
However, in the same field of endeavor, Rus discloses determining the one or more quantization parameters (abstract); and signaling the one or more quantization parameters to a decoder ([0139]), wherein a reconstructed quantization parameter is applied by the decoder to quantize the one or more activations at a decoding stage ([0060]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the apparatus disclosed by Sri to signal the one or more quantization parameters to a decoder, wherein a reconstructed quantization parameter is applied by the decoder to quantize the one or more activations at a decoding stage, as taught by Rus, to improve the compression efficiency ([0024], Rus).
Regarding claim 9, analyses are analogous to those presented for claim 2 and are applicable for claim 9.
Regarding claim 10, analyses are analogous to those presented for claim 3 and are applicable for claim 10.
Regarding claim 11, analyses are analogous to those presented for claim 4 and are applicable for claim 11.
Regarding claim 13, analyses are analogous to those presented for claim 7 and are applicable for claim 13.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LERON BECK whose telephone number is (571)270-1175. The examiner can normally be reached M-F 8 am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, David Czekaj can be reached at (571) 272-7327. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
LERON BECK
Examiner
Art Unit 2487
/LERON BECK/Primary Examiner, Art Unit 2487