Last updated: May 29, 2026
Application No. 17/210,097
NETWORK QUANTIZATION METHOD, AND INFERENCE METHOD

Non-Final OA §101 §103
Filed
Mar 23, 2021
Priority
Sep 27, 2018 (continuation of PCT/JP2018/036104)
Examiner
MAC, GARY
Art Unit
2127
Tech Center
2100 — Computer Architecture & Software
Assignee
Socionext Inc.
OA Round
4 (Non-Final)
This examiner grants 41% of cases after interview (+30.6% interview lift). A telephonic interview to clarify the technical implementation could significantly improve the outcome.
Based on 17 resolved cases, 2023–2026
Examiner Intelligence

MAC, GARY
Grants 41% of resolved cases
Career Allowance Rate
7 granted / 17 resolved
-13.8% vs TC avg
Strong +31% interview lift
Interview Lift: +30.6% (resolved cases with interview vs. without)
Typical timeline
4y 3m
Avg Prosecution
13 currently pending
Career history
Total Applications
across all art units
Statute-Specific Performance

§101: 10.2% (-29.8% vs TC avg)
§103: 89.8% (+49.8% vs TC avg)
Based on career data from 17 resolved cases
Office Action

§101 §103
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-2, 11-16, 19-20, and 23-26 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1: Claims 1-2, 11-16, 19-20, and 23-26 are method claims for implementing a neural network in a computer. Therefore, claims 1-2, 11-16, 19-20, and 23-26 are directed to a process, machine, manufacture, or composition of matter.

Claim 1:
Step 2A Prong 1: The claim recites the following limitations:
obtaining, …, through the plurality of tests, statistical information of tensors handled by the neural network (Mental process of evaluation and judgement which can be reasonably performed in one’s mind or with the aid of pencil and paper – See par. 38 in Specification – Determining relationship between the value and frequency of each tensor by calculating statistical information such as mean value, median value, etc.) 
generating, …, a quantization parameter set by quantizing values of the tensors based on the statistical information and the neural network (Mental process of evaluation and judgement which can be reasonably performed in one’s mind or with the aid of pencil and paper – a person could manually evaluate the tensors and then determine a quantization parameter set by quantizing values of the tensors based on the evaluation)
generating weight data by quantizing the neural weight data of the neural network using the quantization parameter set (Mental process of evaluation and judgement which can be reasonably performed in one’s mind or with the aid of pencil and paper – a person could manually evaluate the data and apply quantization to the data)
wherein in the generating, based on the statistical information, an adaptive quantization step interval, which is an adaptive distance between two directly adjacent quantized values, in each of a plurality of high-frequency regions, each of the plurality of high-frequency regions corresponding to each peak in a multimodal distribution of the tensors, is set to be smaller than a quantization step interval in a low-frequency region, the plurality of high-frequency regions each including a value, among the values of the tensors, having a frequency that is a local maximum, and the low-frequency region including a value, among the values of the tensors, having a lower frequency than in each of the high-frequency regions (Mental process of evaluation and judgement which can be reasonably performed in one’s mind or with the aid of pencil and paper – a person could manually evaluate the statistical information and set the quantization step intervals)
Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
performing, …, a plurality of tests by inputting a plurality of test datasets to the neural network (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f))
weight data (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f))
implementing the quantized network in the computer (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f))
by using a/the processor (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f))
Accordingly, the claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim recites the following additional elements:
performing, …, a plurality of tests by inputting a plurality of test datasets to the neural network (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f))
weight data (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f))
implementing the quantized network in the computer (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f))
by using a/the processor (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f))
The claim is not patent eligible.
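The step-interval limitation of claim 1 can be pictured with a minimal sketch, assuming a simple fixed-width histogram as the "statistical information". It places more quantization levels (a smaller step) in each histogram bin that is a local maximum and fewer levels elsewhere. The function names, the bin width, and the number of levels per bin are hypothetical and are not taken from the application or the cited references.

```python
import math
from collections import Counter

def histogram(values, bin_width):
    # Map each value to the index of the fixed-width bin that contains it.
    return Counter(math.floor(v / bin_width) for v in values)

def peak_bins(hist):
    # A bin is treated as a peak when its count strictly exceeds both neighbours.
    return sorted(b for b, c in hist.items()
                  if c > hist.get(b - 1, 0) and c > hist.get(b + 1, 0))

def build_levels(values, bin_width=1.0, fine_per_bin=4, coarse_per_bin=1):
    # Assumption: high-frequency regions are the peak bins; all other bins
    # between the smallest and largest observed bin are low-frequency regions.
    hist = histogram(values, bin_width)
    peaks = set(peak_bins(hist))
    levels = []
    for b in range(min(hist), max(hist) + 1):
        n = fine_per_bin if b in peaks else coarse_per_bin
        step = bin_width / n  # smaller step inside peak bins
        levels.extend(b * bin_width + (i + 0.5) * step for i in range(n))
    return levels

def quantize(x, levels):
    # Snap x to the nearest quantization level.
    return min(levels, key=lambda q: abs(q - x))

# Bimodal toy data: clusters near -2.5 and 2.5, sparse values in between.
values = [-2.6, -2.5, -2.4, -2.5, -1.5, 0.2, 1.4, 2.4, 2.5, 2.6, 2.5]
levels = build_levels(values)
print(peak_bins(histogram(values, 1.0)))
print(levels)
```

With these toy values, adjacent levels inside the two peak bins are 0.25 apart, while adjacent levels in the sparse middle are 1.0 apart.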

Claim 2:
Step 2A Prong 1: The claim recites the following limitations:
obtaining, …, through the plurality of tests, statistical information of tensors handled by the neural network (Mental process of evaluation and judgement which can be reasonably performed in one’s mind or with the aid of pencil and paper – a person could manually evaluate the results of the plurality of tests and obtain statistical information of the tensors handled by the neural network based on the evaluation) 
generating, …, a quantization parameter set by quantizing values of the tensors based on the statistical information and the neural network (Mental process of evaluation and judgement which can be reasonably performed in one’s mind or with the aid of pencil and paper - a person could manually evaluate the tensors and then determine a quantization parameter set by quantizing values of the tensors based on the evaluation)
generating weight data by quantizing the neural weight data of the neural network using the quantization parameter set (Mental process of evaluation and judgement which can be reasonably performed in one’s mind or with the aid of pencil and paper – a person could manually evaluate the data and apply quantization to the data)
wherein in the generating, a plurality of quantization regions and a non-quantization region which does not overlap with the plurality of quantization regions are determined based on the statistical information, the plurality of quantization regions each including a value, among the values of the tensors, having a frequency that is a local maximum, and values of the tensors in each of the plurality of quantization regions are quantized while values of the tensors in the non-quantization region are not quantized (Mental process of evaluation and judgement which can be reasonably performed in one’s mind or with the aid of pencil and paper – a person could manually evaluate the statistical information and then determine quantization regions)
Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
performing, …, a plurality of tests by inputting a plurality of test datasets to the neural network (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f))
weight data (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f))
implementing a quantized network including at least the quantized weight data in the computer (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f))
by a/the processor (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f))
Accordingly, the claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim recites the following additional elements:
performing, …, a plurality of tests by inputting a plurality of test datasets to the neural network (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f))
weight data (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f))
implementing a quantized network including at least the quantized weight data in the computer (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f))
by a/the processor (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f))
The claim is not patent eligible.
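The region-based limitation of claim 2 can be illustrated with a minimal sketch, assuming the quantization regions are the histogram bins whose counts are local maxima and everything else is the non-quantization region. Values inside a quantization region are snapped to a grid, and values outside are passed through unchanged. The function names, bin width, and step are hypothetical and do not come from the application.

```python
import math
from collections import Counter

def quantization_regions(values, bin_width):
    # Assumption: each quantization region is one histogram bin whose count
    # strictly exceeds both neighbouring bins (a local maximum).
    hist = Counter(math.floor(v / bin_width) for v in values)
    return [(b * bin_width, (b + 1) * bin_width)
            for b in sorted(hist)
            if hist[b] > hist.get(b - 1, 0) and hist[b] > hist.get(b + 1, 0)]

def quantize_in_regions(values, regions, step):
    out = []
    for v in values:
        for low, high in regions:
            if low <= v < high:
                # Inside a quantization region: snap to the nearest grid point.
                v = low + round((v - low) / step) * step
                break
        # Values that matched no region fall in the non-quantization region
        # and are kept at full precision.
        out.append(v)
    return out

values = [-2.6, -2.5, -2.4, -2.5, -1.5, 0.2, 1.4, 2.4, 2.5, 2.6, 2.5]
regions = quantization_regions(values, 1.0)
print(regions)
print(quantize_in_regions([-2.6, 0.2, 2.4], regions, 0.25))
```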


Claim 11 incorporates the rejection of claim 2.
Step 2A Prong 1: The judicial exceptions of claim 2 are incorporated. The claim recites: wherein each of the plurality of quantization regions includes, among the values of the tensors, a value having a frequency that is a local maximum, and the non-quantization region includes, among the values of the tensors, a value having a lower frequency than the value in each of the plurality of quantization regions (Mental process of evaluation and judgement which can be reasonably performed in one’s mind or with the aid of pencil and paper – a person could manually evaluate the tensors and statistical information and, when determining a plurality of quantization regions and a non-quantization region, select each quantization region to include a value having a frequency that is a local maximum and select the non-quantization region to include a value having a lower frequency than the value in each quantization region). Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites no additional elements. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. The claim is not patent eligible.

Claim 12 incorporates the rejection of claim 1.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements: wherein each of the plurality of high-frequency regions includes a first region and a second region each including a value, among the values of the tensors, having a frequency that is a local maximum, and the low-frequency region includes a third region including a value, among the values of the tensors, that is between the values in the first region and the second region (Amounts to generally linking the abstract ideas to a particular technological environment or field of use, as discussed in MPEP 2106.05(h)). Accordingly, the additional elements alone or in combination do not integrate the abstract ideas into a practical application because they do not impose any meaningful limits on practicing the abstract ideas. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. The claim recites the following additional elements: wherein each of the plurality of high-frequency regions includes a first region and a second region each including a value, among the values of the tensors, having a frequency that is a local maximum, and the low-frequency region includes a third region including a value, among the values of the tensors, that is between the values in the first region and the second region (Amounts to generally linking the abstract ideas to a particular technological environment or field of use, as discussed in MPEP 2106.05(h)). The claim is not patent eligible.

Claim 13 incorporates the rejection of claim 1.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites: wherein in the generating, the values of the tensors in at least part of the low-frequency region are not quantized (Mental process of evaluation and judgment which can be reasonably performed in one’s mind or with the aid of pencil and paper – a person could manually evaluate the tensors in the low-frequency regions and determine to not quantize the values in at least part of each of the low-frequency regions). Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites no additional elements. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. The claim is not patent eligible.

Claim 14 incorporates the rejection of claim 12.
Step 2A Prong 1: The judicial exceptions of claim 12 are incorporated. The claim recites: wherein in the generating, the values of the tensors in at least part of the low-frequency region are not quantized (Mental process of evaluation and judgment which can be reasonably performed in one’s mind or with the aid of pencil and paper - a person could manually evaluate the tensors in the low-frequency regions and determine to not quantize the values in at least part of each of the low-frequency regions). Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites no additional elements. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. The claim is not patent eligible.

Claim 15 incorporates the rejection of claim 1.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements: causing the quantized network to perform machine learning (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f)). Accordingly, the additional elements alone or in combination do not integrate the abstract ideas into a practical application because they do not impose any meaningful limits on practicing the abstract ideas. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. The claim recites the following additional elements: causing the quantized network to perform machine learning (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP 2106.05(f)). The claim is not patent eligible.

Claim 16 has similar limitations as those of claim 15. Therefore, it is rejected under the same rationale.

Claim 19 incorporates the rejection of claim 1.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites: classifying at least some of the plurality of test datasets into a first type and a second type based on respective instances of statistical information in the plurality of test datasets (Mental process of evaluation and judgment which can be reasonably performed in one’s mind or with the aid of pencil and paper – a person could manually evaluate the statistical information and test datasets and then classify the datasets into a first type and a second type). Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements: … wherein the statistical information includes a first information subset and a second information subset corresponding to the first type and the second type, respectively, the quantization parameter set includes a first parameter subset and a second parameter subset corresponding to the first information subset and the second information subset, respectively, and the quantized network includes a first network subset and a second network subset constructed by quantizing the neural network using the first parameter subset and the second parameter subset, respectively (Amounts to generally linking the abstract ideas to a particular technological environment or field of use, as discussed in MPEP 2106.05(h)). Accordingly, the additional elements alone or in combination do not integrate the abstract ideas into a practical application because they do not impose any meaningful limits on practicing the abstract ideas. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. The claim recites the following additional elements: … wherein the statistical information includes a first information subset and a second information subset corresponding to the first type and the second type, respectively, the quantization parameter set includes a first parameter subset and a second parameter subset corresponding to the first information subset and the second information subset, respectively, and the quantized network includes a first network subset and a second network subset constructed by quantizing the neural network using the first parameter subset and the second parameter subset, respectively (Amounts to generally linking the abstract ideas to a particular technological environment or field of use, as discussed in MPEP 2106.05(h)). The claim is not patent eligible.

Claim 20 has similar limitations as those of claim 19. Therefore, it is rejected under the same rationale.

Claim 23 incorporates the rejection of claim 1.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements: wherein the frequency of each of the low-frequency region is not zero (Amounts to generally linking the abstract ideas to a particular technological environment or field of use, as discussed in MPEP 2106.05(h)). Accordingly, the additional elements alone or in combination do not integrate the abstract ideas into a practical application because they do not impose any meaningful limits on practicing the abstract ideas. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. The claim recites the following additional elements: wherein the frequency of each of the plurality of low-frequency regions is not zero (Amounts to generally linking the abstract ideas to a particular technological environment or field of use, as discussed in MPEP 2106.05(h)). The claim is not patent eligible.

Claim 24 incorporates the rejection of claim 2.
Step 2A Prong 1: The judicial exceptions of claim 2 are incorporated. The claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements: wherein the frequency of each of the non-quantization region is not zero (Amounts to generally linking the abstract ideas to a particular technological environment or field of use, as discussed in MPEP 2106.05(h)). Accordingly, the additional elements alone or in combination do not integrate the abstract ideas into a practical application because they do not impose any meaningful limits on practicing the abstract ideas. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. The claim recites the following additional elements: wherein the frequency of each of the plurality of non-quantization regions is not zero (Amounts to generally linking the abstract ideas to a particular technological environment or field of use, as discussed in MPEP 2106.05(h)). The claim is not patent eligible.

Claim 25 incorporates the rejection of claim 19.
Step 2A Prong 1: The judicial exceptions of claim 19 are incorporated. The claim recites: selecting, from the first type and the second type, a type into which input data input to the quantized network is to be classified (Mental process of evaluation and judgement which can be reasonably performed in one’s mind or with the aid of pencil and paper – a person could manually evaluate input data and select a type into which input data is to be classified); selecting one of the first network subset and the second network subset based on the type, of the first type and the second type, selected in the selecting of the type (Mental process of evaluation and judgement which can be reasonably performed in one’s mind or with the aid of pencil and paper – a person could manually evaluate the network subsets and selected type and then select one of the network subsets); inputting the input data into the one of the first network subset and the second network subset selected in the selecting of the one of the first network subset and the second network subset (Mental process of evaluation and judgement which can be reasonably performed in one’s mind or with the aid of pencil and paper – a person could manually evaluate the input data and selected network subset and then input the input data into the selected network subset). Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements: Outputting an inference result obtained based on the input data using the one of the first network subset and the second network subset (Adding insignificant extra-solution activity of mere data output to the judicial exception – see MPEP 2106.05(g)). Accordingly, the additional elements alone or in combination do not integrate the abstract ideas into a practical application because they do not impose any meaningful limits on practicing the abstract ideas. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. The claim recites the following additional elements: outputting an inference result obtained based on the input data using the one of the first network subset and the second network subset (MPEP 2106.05(d)(II) indicates that merely “Presenting offers and gathering statistics” is a well-understood, routine, conventional function when it is claimed in a merely generic manner (as it is in the present claim)). The claim is not patent eligible.

Claim 26 has similar limitations as those of claim 25. Therefore, it is rejected under the same rationale.
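
The type-based flow recited in claims 19 and 25 can be sketched as follows, assuming a toy one-weight "network" and a hypothetical classification rule (mean absolute value against a threshold). Test datasets are split into a first and a second type, a quantization step is derived per type, one quantized network subset is built per type, and at inference the subset matching the input's type is selected. None of the names, the threshold, or the step formula come from the application.

```python
from statistics import mean

def classify(dataset, threshold=1.0):
    # Hypothetical rule: the type depends on the mean absolute value.
    return "first" if mean(abs(x) for x in dataset) < threshold else "second"

def step_for(datasets, levels=4):
    # Per-type quantization parameter: step sized to the largest observed magnitude.
    peak = max(abs(x) for d in datasets for x in d)
    return peak / levels

def make_subset(step, weight=2.0):
    # Toy "network subset": quantize the input with the type's step, then scale.
    def run(x):
        return weight * round(x / step) * step
    return run

def build(datasets):
    groups = {"first": [], "second": []}
    for d in datasets:
        groups[classify(d)].append(d)
    # One network subset per type that has at least one dataset.
    return {t: make_subset(step_for(ds)) for t, ds in groups.items() if ds}

def infer(subsets, x):
    # Select the type for the input, then the matching network subset.
    return subsets[classify([x])](x)

datasets = [[0.1, -0.2, 0.5], [0.3, 0.2, -0.1], [4.0, -8.0, 2.0], [6.0, 1.0, -3.0]]
subsets = build(datasets)
print(infer(subsets, 0.3), infer(subsets, 5.1))
```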


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 11, 13, 15-16, 19-20, and 23-24 are rejected under 35 U.S.C. 103 as being unpatentable over Baum et al. Pub. No. US 20180285736 A1 (hereinafter Baum) in view of Nitta et al. Pub. No. US 20190325340 A1 (hereinafter Nitta).

Regarding claim 1, Baum teaches: An implementation method of implementing a neural network in a computer, the implementation method comprising: performing, by using a processor, a plurality of tests by inputting a plurality of datasets to the neural network ([0102] teaches performing inference to perform a data-driven diagnosis by collecting statistics on neuronal activity (i.e., tests are performed by inputting data to the neural network), which may include data input to the neuron (e.g., test data), data output from the neuron, and the internal weights; [0103] teaches that if not enough statistics have been collected, more data is input and additional inference is performed (i.e., a plurality of tests may be performed by inputting a plurality of datasets); [0051] teaches the use of a computer to perform the disclosed functions/acts);
obtaining, by using the processor, through the plurality of tests, statistical information of tensors handled by the neural network ([0102] teaches performing inference to perform a data-driven diagnosis by collecting statistics on neuronal activity (i.e., obtaining statistical information through tests), which may include data input to the neuron, data output from the neuron, and the internal weights (i.e., tensors obtained when datasets are input to the neural network); [0103] teaches that if not enough statistics have been collected, more data is input and additional inference is performed (i.e., the statistics may be obtained based on a plurality of tests performed by inputting a plurality of datasets));
generating, by using the processor, a quantization parameter set by quantizing values of the tensors based on the statistical information and the neural network ([0105] teaches determining a scaling factor and shift parameters (i.e., a quantization parameter set) that reassign values of the tensors to a narrower region to better represent where most of the data lies (i.e., by quantizing values of the tensors collected from neuronal activity based on the data distribution); [0054] teaches the use of a processor to perform the disclosed functions/acts); and
generating and outputting, by using the processor, a quantized weight data by quantizing weight data of the neural network using the quantization parameter set ([0106] teaches applying the scale and shift parameters to the current weight quantization and the new quantization scheme can be applied to the weight, input data, or both.),
implementing a quantized network in the computer ([abstract, 0051] teaches the use of a computer to perform the disclosed functions/acts of optimizing the quantization scheme),
wherein in the generating, based on the statistical information, an adaptive quantization step interval, which is an adaptive distance between two directly adjacent quantized values, in each of a plurality of high-frequency regions, each of the plurality of high-frequency regions corresponding to each peak in a multimodal distribution of the tensors, is set to be smaller than a quantization step interval in a low-frequency region, the plurality of high-frequency regions each including a value, among the values of the tensors, having a frequency that is a local maximum, and the low-frequency region including a value, among the values of the tensors, having a lower frequency than in each of the high-frequency regions (Figure 14 and [0101, 0111] teach an example quantization where values associated with high probability (i.e., a high-frequency region) are assigned values closer together (i.e., a smaller quantization step interval) compared to the values with lower probability in the upper end of data values (i.e., a plurality of low-frequency regions). From Figure 14, the bit width is nonlinearly spaced across the lower half of the data range, which is to the left of the half point arrow. The left of the half point arrow shows data with the highest probabilities and there are multiple peaks of high probability data values. The right of the half point arrow of Figure 14 shows data values with low probabilities and the values are quantized with a fixed bit width. The bit width (adaptive distance between two directly adjacent quantized values) is smaller in areas where the data values have high probability compared to areas where the data values have low probability. The distribution of data may include input data, output data, and weights (multimodal distribution).).
Baum does not explicitly teach: performing … a plurality of tests by inputting a plurality of test datasets to the neural network.
However, in the analogous art, Nitta teaches: performing … a plurality of tests by inputting a plurality of test datasets to the neural network ([0029] and [0071-0072] teach performing 5-fold cross validation (i.e., a plurality of tests) with a neural network in which data serving as test data is divided into five (i.e., a plurality of test datasets)).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have modified the invention of Baum with the above teaching of Nitta because doing so would lead to an expected improvement in learning accuracy and precision (Nitta, [0069-0070] and [0073]).
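
The combination relied on above (several test datasets, statistics gathered across the runs, then a scale and shift derived from those statistics) can be pictured with a minimal sketch. It assumes the data is simply divided into five subsets, that the collected statistic is the minimum and maximum output, and that the scale and shift map that range onto an 8-bit grid. The function names, the toy forward function, and the min/max choice are hypothetical and are not taken from Baum or Nitta.

```python
def k_folds(data, k=5):
    # Divide the data into k disjoint subsets by striding.
    return [data[i::k] for i in range(k)]

def collect_range(folds, forward):
    # Run one "test" per fold and track the min and max output seen overall.
    lo, hi = float("inf"), float("-inf")
    for fold in folds:
        outputs = [forward(x) for x in fold]
        lo = min([lo] + outputs)
        hi = max([hi] + outputs)
    return lo, hi

def scale_and_shift(lo, hi, bits=8):
    # Assumed parameterization: q = round((x - shift) / scale), q in [0, 2**bits - 1].
    scale = (hi - lo) / (2 ** bits - 1)
    return scale, lo

data = list(range(10))
folds = k_folds(data)
lo, hi = collect_range(folds, lambda x: 0.5 * x - 1.0)  # toy stand-in for the network
scale, shift = scale_and_shift(lo, hi)
print(folds, (lo, hi), (scale, shift))
```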

Regarding claim 2, Baum teaches: A implementation method of implementing a neural network in a computer, the implementation method comprising: performing, by a processor, a plurality of tests by inputting a plurality of  datasets to the neural network ([0102] teaches performing inference to perform a data-driven diagnosis by collecting statistics on neuronal activity (i.e., tests are performed by inputting data to the neural network), which may include data input to the neuron (e.g., test data), data output from the neuron, and the internal weights; [0103] teaches that if not enough statistics have been collected, more data is input and additional inference is performed (i.e., a plurality of tests may be performed by inputting a plurality of datasets); [0051] teaches the use of a computer to perform the disclosed functions/acts);
obtaining, by using the processor, through the plurality of tests, statistical information of tensors handled by the neural network ([0102] teaches performing inference to perform a data-driven diagnosis by collecting statistics on neuronal activity (i.e., obtaining statistical information through tests), which may include data input to the neuron, data output from the neuron, and the internal weights (i.e., tensors obtained when datasets are input to the neural network); [0103] teaches that if not enough statistics have been collected, more data is input and additional inference is performed (i.e., the statistics may be obtained based on a plurality of tests performed by inputting a plurality of datasets));
generating, by the processor, a quantization parameter set by quantizing values of the tensors based on the statistical information and the neural network ([0105] teaches determining a scaling factor and shift parameters (i.e., a quantization parameter set) that reassign values of the tensors to a narrower region to better represent where most of the data lies (i.e., by quantizing values of the tensors collected from neuronal activity based on the data distribution)); and
generating and outputting, by the processor, quantized weight data by quantizing weight data of the neural network using the quantization parameter set ([0106] teaches applying the scale and shift parameters to the current weight quantization and the new quantization scheme can be applied to the weight, input data, or both.),
implementing a quantized network including at least the quantized weight data in the computer ([abstract, 0051, 0106] teaches the use of a computer to perform the disclosed functions/acts of optimizing the quantization scheme),
wherein in the generating, a plurality of quantization regions and a non-quantization region which does not overlap with the plurality of quantization regions are determined based on the statistical information, the plurality of quantization regions each including a value, among the values of the tensors, having a frequency that is a local maximum, and values of the tensors in each of the plurality of quantization regions are quantized while values of the tensors in the non-quantization region are not quantized (Figure 14, [0019], [0080], [0099], and [0109-0111] teach choosing to only quantize the values around the lower end, which have the highest frequency (i.e., not the data values in the low-frequency regions in the non-overlapping upper end). Figure 14 shows on the left side of the histogram multiple peaks of high probability data values and each peak may be considered as a local maximum.).
Baum does not explicitly teach: performing … a plurality of tests by inputting a plurality of test datasets to the neural network.
	However, in the analogous art, Nitta teaches: performing … a plurality of tests by inputting a plurality of test datasets to the neural network ([0029] and [0071-0072] teach performing 5-fold cross validation (i.e., a plurality of tests) with a neural network in which data serving as test data is divided into five (i.e., a plurality of test datasets)).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have modified the invention of Baum with the above teaching of Nitta because doing so would lead to an expected improvement in learning accuracy and precision (Nitta, [0069-0070] and [0073]).
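For illustration only, the claim 2 limitation discussed above (values inside each quantization region are quantized, while values in the non-overlapping non-quantization region are left as they are) can be sketched as follows. This is a sketch under stated assumptions, not the method of Baum or of the application: the function name `quantize_in_regions` is hypothetical, the regions are assumed to have already been chosen from the collected statistics, and a uniform `step` inside each region is an arbitrary simplification.

```python
def quantize_in_regions(values, regions, step):
    """Quantize values that fall inside any (low, high) region.

    Values outside every region belong to the non-quantization region and are
    returned unchanged. Regions are assumed not to overlap.
    """
    out = []
    for v in values:
        for low, high in regions:
            if low <= v < high:
                # Snap to the nearest grid point measured from the region start.
                out.append(low + round((v - low) / step) * step)
                break
        else:
            out.append(v)  # non-quantization region: keep full precision
    return out


# Two hypothetical quantization regions, one around each histogram peak.
regions = [(0.0, 0.25), (0.75, 1.0)]
result = quantize_in_regions([0.1, 0.5, 0.8], regions, step=0.125)
```

Here 0.1 and 0.8 fall inside a region and are snapped to its grid, while 0.5 lies between the regions and passes through unquantized.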

Regarding claim 11, the combination of Baum and Nitta teaches all of the elements of claim 2 as shown in the rejection above. Baum further teaches: wherein each of the plurality of quantization regions includes, among the values of the tensors, a value having a frequency that is a local maximum, and the non-quantization region includes, among the values of the tensors, a value having a lower frequency than the value in each of the plurality of quantization regions ([0019], [0080], [0099], [0109-0111] and Figure 14 teaches that each of the values in the non-quantization regions in the upper end of data values include a value having a lower probability (i.e., frequency) than the local maximum value in the quantization region).

Regarding claim 13, the combination of Baum and Nitta teaches all of the elements of claim 1 as shown in the rejection above. Baum further teaches: wherein in the generating, the values of the tensors in at least part of the  low-frequency region are not quantized ([0019], [0080], [0099], and [0109-0111] teach choosing to only quantize the values around the lower end, which have the highest frequency (i.e., not the data values in the low-frequency regions in the upper end)).

Regarding claim 15, the combination of Baum and Nitta teaches all of the elements of claim 1 as shown in the rejection above. Baum further teaches: causing the quantized network to perform machine learning (Figure 17 and [0116-0117]).

Regarding claim 16, the combination of Baum and Nitta teaches all of the elements of claim 2 as shown in the rejection above. Baum further teaches: causing the quantized network to perform machine learning (Figure 17 and [0116-0117]).

Regarding claim 19, the combination of Baum and Nitta teaches all of the elements of claim 1 as shown in the rejection above. Baum further teaches: classifying at least some of the plurality of test datasets into a first type and a second type based on respective instances of statistical information in the plurality of test datasets ([0116-0117] teaches that a quantization change 214 may be applied due to subsequent input data having a different collection of statistics (i.e., a second dataset is classified as a second type based on statistical information of the dataset)),
wherein the statistical information includes a first information subset and a second information subset corresponding to the first type and the second type, respectively ([0116-0117] teaches that each quantization change 212 and 214 represents a collection of statistics on the input (i.e., each quantization change represents a subset of statistical information)),
the quantization parameter set includes a first parameter subset and a second parameter subset corresponding to the first information subset and the second information subset, respectively ([0116-0117] teaches applying the quantization changes as a result of the mechanism of the present invention, which would include calculating a second set of quantization parameters for the second collection of statistics according to [0105]), and
the quantized network includes a first network subset and a second network subset constructed by quantizing the neural network using the first parameter subset and the second parameter subset, respectively ([0116-0117] teaches applying the quantization change to determine a new quantization scheme as a result of the mechanism of the present invention for the neural network (i.e., constructing a second quantized network using the second parameter subset determined using the second collection of statistics)).
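For illustration only, the arrangement recited in claim 19 (test datasets classified into two types based on their statistics, with a separate parameter subset derived per type) can be sketched as follows. This is a sketch under stated assumptions: classifying by the dataset mean against a `threshold`, the function names `classify` and `params_per_type`, and the min/max derivation of the scale and shift values are all hypothetical choices. Only the idea of scale and shift parameters comes from the examiner's characterization of Baum [0105].

```python
from statistics import mean


def classify(dataset, threshold):
    """Assign a dataset to the 'first' or 'second' type by its mean value."""
    return "first" if mean(dataset) < threshold else "second"


def params_per_type(datasets, threshold, n_levels=256):
    """Collect values per type and derive one (shift, scale) pair per type."""
    grouped = {"first": [], "second": []}
    for ds in datasets:
        grouped[classify(ds, threshold)].extend(ds)

    params = {}
    for kind, vals in grouped.items():
        if vals:  # skip a type that received no datasets
            lo, hi = min(vals), max(vals)
            params[kind] = {"shift": lo, "scale": (hi - lo) / (n_levels - 1)}
    return params


# Two toy datasets with clearly different value ranges.
subsets = params_per_type([[0.0, 1.0], [10.0, 20.0]], threshold=5, n_levels=3)
```

Each entry of `subsets` stands in for one parameter subset. Quantizing the same network once with each entry would yield the two network subsets the claim refers to.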

Regarding claim 20, the combination of Baum and Nitta teaches all of the elements of claim 2 as shown in the rejection above. Baum further teaches: classifying at least some of the plurality of test datasets into a first type and a second type based on respective instances of statistical information in the plurality of test datasets ([0116-0117] teaches that a quantization change 214 may be applied due to subsequent input data having a different collection of statistics (i.e., a second dataset is classified as a second type based on statistical information of the dataset)),
wherein the statistical information includes a first information subset and a second information subset corresponding to the first type and the second type, respectively ([0116-0117] teaches that each quantization change 212 and 214 represents a collection of statistics on the input (i.e., each quantization change represents a subset of statistical information)),
the quantization parameter set includes a first parameter subset and a second parameter subset corresponding to the first information subset and the second information subset, respectively ([0116-0117] teaches applying the quantization changes as a result of the mechanism of the present invention, which would include calculating a second set of quantization parameters for the second collection of statistics according to [0105]), and
the quantized network includes a first network subset and a second network subset constructed by quantizing the neural network using the first parameter subset and the second parameter subset, respectively ([0116-0117] teaches applying the quantization change to determine a new quantization scheme as a result of the mechanism of the present invention for the neural network (i.e., constructing a second quantized network using the second parameter subset determined using the second collection of statistics)).

Regarding claim 23, the combination of Baum and Nitta teaches all of the elements of claim 1 as shown in the rejection above. Baum further teaches: wherein the frequency of the low-frequency region is not zero ([0099], [0109-0111], and Figures 12-14 teach that the values in the low-probability (i.e., low-frequency) regions in the upper end of data values have a non-zero probability (i.e., frequency)).

Regarding claim 24, the combination of Baum and Nitta teaches all of the elements of claim 2 as shown in the rejection above. Baum further teaches: wherein the frequency of each of the non-quantization region is not zero ([0099], [0109-0111], and Figures 12-14 teach that the values in the low-probability regions (i.e., the non-quantization regions) in the upper end of data values have a non-zero probability (i.e., frequency)).


Claims 12 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Baum et al. Pub. No. US 20180285736 A1 (hereinafter Baum) in view of Nitta et al. Pub. No. US 20190325340 A1 (hereinafter Nitta), and in further view of Sarrafzadeh et al. Pub. No. US 20140155774 A1 (hereinafter Sarrafzadeh).

Regarding claim 12, the combination of Baum and Nitta teaches all of the elements of claim 1 as shown in the rejection above, but does not explicitly teach: wherein each of the plurality of high-frequency regions includes a first region and a second region each including a value, among the values of the tensors, having a frequency that is a local maximum, and the low-frequency region includes a third region including a value, among the values of the tensors, that is between the values in the first region and the second region.
	However, in the analogous art, Sarrafzadeh teaches: wherein each of the plurality of high-frequency regions includes a first region and a second region each including a value, among the values of the tensors, having a frequency that is a local maximum, and the low-frequency region includes a third region including a value, among the values of the tensors, that is between the values in the first region and the second region ([Abstract], [0063], and Figure 12 teach having input data that oscillates between local maximums (i.e., wherein the high-frequency region includes a first region and a second region each including a value, among the values of the tensors, having a frequency that is a local maximum), with a local minimum between each peak (i.e., each of the plurality of low-frequency regions includes a third region including a value, among the values of the tensors, that is between the values in the first region and the second region)).
	It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have substituted the input data of Sarrafzadeh for the input data used in the combination of Baum and Nitta because: 1) The combination of Baum and Nitta teaches a method of neural network quantization that differs from the method in claim 12 by the substitution of input data that contains the high-frequency data values toward the lower end of data values and the low-frequency data values toward the upper end of data values; 2) Sarrafzadeh teaches that the substituted input data that has a high-frequency region that includes a first region and a second region each including a value, among the values of the tensors, having a frequency that is a local maximum, and each of the plurality of low-frequency regions includes a third region including a value, among the values of the tensors, that is between the values in the first region and the second region was known in the art; and 3) the input data used in the combination of Baum and Nitta could have easily been substituted for the oscillating input data of Sarrafzadeh, and the results of performing network quantization with the oscillating input data would have been predictable.

Regarding claim 14, the combination of Baum, Nitta, and Sarrafzadeh teaches all of the elements of claim 12 as shown in the rejection above. Baum further teaches: wherein in the generating, the values of the tensors in at least part of the low-frequency region are not quantized ([0019], [0080], [0099], and [0109-0111] teach choosing to only quantize the values around the lower end, which have the highest frequency (i.e., not the data values in the low-frequency regions in the upper end)).

Claims 25-26 are rejected under 35 U.S.C. 103 as being unpatentable over Baum et al. Pub. No. US 20180285736 A1 (hereinafter Baum) in view of Nitta et al. Pub. No. US 20190325340 A1 (hereinafter Nitta), and in further view of Hope Simpson et al. Pub. No. US 20190336108 A1 (hereinafter Hope Simpson).

Regarding claim 25, the combination of Baum and Nitta teaches all of the elements of claim 19 as shown in the rejection above, but does not explicitly teach: selecting, from the first type and the second type, a type into which input data input to the quantized network is to be classified; selecting one of the first network subset and the second network subset based on the type, of the first type and the second type, selected in the selecting of the type; and inputting the input data into the one of the first network subset and the second network subset selected in the selecting of the one of the first network subset and the second network subset.
However, in the analogous art, Hope Simpson teaches: selecting, from the first type and the second type, a type into which input data input to the quantized network is to be classified ([0042-0044] teaches that the neural network may be trained to operate in a plurality of different modes based on the type of imaging data or tissue information that may be desired to be obtained, and the neural network may be configured to output different types of imaging data responsive to the same input (i.e., a selection is made of the desired type into which input data is to be classified));
selecting one of the first network subset and the second network subset based on the type, of the first type and the second type, selected in the selecting of the type ([0044] teaches that the neural network may be trained to operate in one or a plurality of different modes based on the type of imaging data or tissue information that may be desired to be obtained and that the propagation path or mode may be selected by the user or automatically invoked by the system depending on the imaging mode or application (i.e., the one of the first network subset and the second network subset is selected based on the desired classification type selected));
inputting the input data into the one of the first network subset and the second network subset selected in the selecting of the one of the first network subset and the second network subset ([0008], [0031], and [0041-0042]); and
outputting an inference result obtained based on the input data using the one of the first network subset and the second network subset ([0031] teaches that the neural network in a selected mode may propagate the input through the network of nodes to obtain predicted data (i.e., an inference result based on the input data), which may be further processed for display (i.e., output)).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Baum and Nitta with the above teaching of Hope Simpson because doing so would lead to an expected improvement in system flexibility for future improvements and adaption to user needs (Hope Simpson, [0003]).

Regarding claim 26, the combination of Baum and Nitta teaches all of the elements of claim 20 as shown in the rejection above, but does not explicitly teach: selecting, from the first type and the second type, a type into which input data input to the quantized network is to be classified; selecting one of the first network subset and the second network subset based on the type, of the first type and the second type, selected in the selecting of the type; and inputting the input data into the one of the first network subset and the second network subset selected in the selecting of the one of the first network subset and the second network subset.
However, in the analogous art, Hope Simpson teaches: selecting, from the first type and the second type, a type into which input data input to the quantized network is to be classified ([0042-0044] teaches that the neural network may be trained to operate in a plurality of different modes based on the type of imaging data or tissue information that may be desired to be obtained, and the neural network may be configured to output different types of imaging data responsive to the same input (i.e., a selection is made of the desired type into which input data is to be classified));
selecting one of the first network subset and the second network subset based on the type, of the first type and the second type, selected in the selecting of the type ([0044] teaches that the neural network may be trained to operate in one or a plurality of different modes based on the type of imaging data or tissue information that may be desired to be obtained and that the propagation path or mode may be selected by the user or automatically invoked by the system depending on the imaging mode or application (i.e., the one of the first network subset and the second network subset is selected based on the desired classification type selected));
inputting the input data into the one of the first network subset and the second network subset selected in the selecting of the one of the first network subset and the second network subset ([0008], [0031], and [0041-0042]); and
outputting an inference result obtained based on the input data using the one of the first network subset and the second network subset ([0031] teaches that the neural network in a selected mode may propagate the input through the network of nodes to obtain predicted data (i.e., an inference result based on the input data), which may be further processed for display (i.e., output)).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Baum and Nitta with the above teaching of Hope Simpson because doing so would lead to an expected improvement in system flexibility for future improvements and adaption to user needs (Hope Simpson, [0003]).
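For illustration only, the selection steps recited in claims 25 and 26 (select a type for the input data, select the matching network subset, input the data, and output the inference result) can be sketched as follows. This is a sketch under stated assumptions: the function name `infer`, the `select_type` callback, and the use of plain callables as stand-ins for the quantized network subsets are hypothetical and are not drawn from Hope Simpson or the application.

```python
def infer(input_data, networks, select_type):
    """Route input_data to the network subset that matches its selected type.

    networks: dict mapping a type name to a callable network subset.
    select_type: callable that returns the type name for the input data.
    """
    kind = select_type(input_data)     # selecting the type
    network = networks[kind]           # selecting the network subset
    return kind, network(input_data)   # inputting the data, returning the result


# Stand-in "network subsets": simple functions in place of quantized models.
networks = {
    "first": lambda xs: sum(xs),
    "second": lambda xs: max(xs),
}


def by_range(xs):
    """Hypothetical type selector based on the largest input value."""
    return "first" if max(xs) < 5 else "second"


kind, result = infer([1, 2, 3], networks, by_range)
```

The selector could equally be a user choice or a system mode, as in the passages the examiner cites. The dictionary lookup is just the simplest way to show the routing.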


Response to Arguments
Examiner herein responds to Applicant’s remarks and claim amendments filed 09/17/2025.

Claim rejections under 35 U.S.C. 101 (Remarks pp. 7-8): Applicant’s arguments have been fully considered but they are not persuasive.

Applicant’s argument #1: Applicant argues that “Applicant notes that the claims recite particular solutions to identified technical problems, not merely the idea of quantization. Claim 1 specifies a particular mechanism—adaptively narrowing intervals in high-frequency regions and broadening them in low-frequency regions. Claim 2 specifies a particular way—establishing quantization regions and expressly excluding non-quantization regions. Both are concrete technical solutions disclosed in the specification (see Figs. 5-8, [0062]-[0070]) to the problem of reducing quantization error and computation in neural networks.”

Examiner’s response #1: Examiner respectfully disagrees. The quantization of data is an abstract idea that can be performed in the human mind with the aid of pen and paper because quantization of data involves reducing the precision of a value and this can be done as a mental process. The claim limitation “generating ... quantized weight data” defines what type of data to perform the quantization and is still considered to fall under an abstract idea of a mental process. 
	Claim 1 further recites “generating ... an adaptive quantization step interval”, which is a mental process that can be performed in the human mind with the aid of pen and paper. The determination is based on reviewing statistical data and assigning different step intervals based on where data have local maximums and minimums. 
During examination, the examiner should analyze the "improvements" consideration by evaluating the specification and the claims to ensure that a technical explanation of the asserted improvement is present in the specification, and that the claim reflects the asserted improvement (see MPEP §2106.05(a)). The MPEP (§2106.05(a)(II)) also warns, “it is important to keep in mind that an improvement in the abstract idea itself (e.g. a recited fundamental economic concept) is not an improvement in technology.”
An important consideration in determining whether a claim improves technology is the extent to which the claim covers a particular solution to a problem or a particular way to achieve a desired outcome, as opposed to merely claiming the idea of a solution or outcome (see MPEP 2106.05(a)). The amended claims do not provide sufficient details to describe any technological improvement. If the specification sets forth an improvement only in a conclusory manner (see MPEP 2106.04(d)(1): a bare assertion of an improvement without the detail necessary to be apparent to a person of ordinary skill in the art), the examiner should not determine that the claim improves technology.

The rejections of claims 1‐2, 11‐16, 19‐20, and 23‐26 under 35 U.S.C. 101 are maintained.

Claim rejections under 35 U.S.C. 103 (Remarks pp. 8-11): Applicant’s arguments have been fully considered but they are not persuasive.

Applicant’s argument #1: Applicant argues that “However, as amended, it is clarified that the recited plurality of high-frequency regions correspond to each peak in a multimodal distribution of the tensors. Therefore, Fig. 14 of Baum should be interpreted as illustrating two high-frequency regions - namely, the region on the left side of arrow 162 and the region on the right side of arrow 162. Then, the adaptive quantization step interval of one of the plurality of high-frequency regions illustrated in Fig. 14, namely, the region on the right side of arrow 162, is not set to be smaller than a quantization step interval in a low-frequency region, as recited in claim 1.”
Examiner’s response #1: Examiner respectfully disagrees, as the cited Baum reference teaches in Figure 14 and [0111] an example quantization where values associated with high probability (i.e., high frequency regions) are assigned values closer together (i.e., a smaller quantization step interval) compared to the values with lower probability in the upper end of data values (i.e., a plurality of low-frequency regions). In Figure 14, a plurality of high-frequency regions is represented by the peaks of data values having a higher frequency on the left half of the histogram and the low frequency region is represented by the peaks of data values having a low frequency on the right half of the histogram. There are multiple peaks on the left side of the histogram of Figure 14 and each peak represents a local maximum. The bit width (step interval) is also nonlinearly spaced in the areas of high probability, and the bit widths there are narrower than the bit widths in the areas of low probability.

Applicant’s argument #2: Applicant argues that “However, Baum does not disclose quantization in a plurality of quantization regions, each of which includes a value among the values of the tensors having a frequency that is a local maximum, as recited in claim 2. Rather, Baum merely discloses quantization in a single lower end region.”
Examiner’s response #2: Examiner respectfully disagrees, as the cited Baum reference teaches in [0111] quantization that is applied nonlinearly. The quantization is applied nonlinearly across a range of data values having high probability as shown in Figure 14. Figure 14 is a histogram showing data values having multiple peaks, that is, one or more local maxima.

The rejections of claims 1‐2, 11‐16, 19‐20, and 23‐26 under 35 U.S.C. 103 are maintained.

Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GARY MAC whose telephone number is (703)756-1517. The examiner can normally be reached Monday - Friday 8:00 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar can be reached on (571) 270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/GARY MAC/ Examiner, Art Unit 2127
/ABDULLAH AL KAWSAR/ Supervisory Patent Examiner, Art Unit 2127
Prosecution Timeline

Nov 22, 2024
Final Rejection mailed — §101, §103
Feb 06, 2025
Response after Non-Final Action
Mar 11, 2025
Request for Continued Examination
Mar 17, 2025
Response after Non-Final Action
Jun 30, 2025
Non-Final Rejection mailed — §101, §103
Sep 17, 2025
Response Filed
Nov 28, 2025
Final Rejection mailed — §101, §103
Jan 14, 2026
Response after Non-Final Action
Precedent Cases

Applications granted by this same examiner with similar technology

17/530,486
Patent 12626130
METHOD AND DEVICE FOR COMPRESSING NEURAL NETWORK
4y 5m to grant Granted May 12, 2026
17/473,957
Patent 12608643
GENERATING WORKFLOW REPRESENTATIONS USING REINFORCED FEEDBACK ANALYSIS
4y 7m to grant Granted Apr 21, 2026
17/378,867
Patent 12596907
NEURAL NETWORK OPERATION APPARATUS AND METHOD
4y 8m to grant Granted Apr 07, 2026
17/067,547
Patent 12572842
METHODS AND SYSTEMS FOR DECENTRALIZED FEDERATED LEARNING
5y 5m to grant Granted Mar 10, 2026
Study what changed to get past this examiner. Based on 4 most recent grants.
Prosecution Projections

4-5
Expected OA Rounds
41%
Grant Probability
72%
With Interview (+30.6%)
4y 3m (~0m remaining)
Median Time to Grant
High
PTA Risk
Based on 17 resolved cases by this examiner. Grant probability derived from career allowance rate.