Prosecution Insights
Last updated: April 19, 2026
Application No. 17/843,772

Title: Concepts for Coding Neural Networks Parameters
Status: Non-Final Office Action (§101, §103, Double Patenting)

Filed: Jun 17, 2022
Examiner: YI, HYUNGJUN B
Art Unit: 2146
Tech Center: 2100 — Computer Architecture & Software
Assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
OA Round: 2 (Non-Final)
Grant Probability: 18% (At Risk)
OA Rounds: 2-3
To Grant: 4y 7m
With Interview: 49%

Examiner Intelligence

Career Allow Rate: 18% (grants only 18% of cases; 3 granted / 17 resolved; -37.4% vs. TC avg)
Interview Lift: +31.7% for resolved cases with interview (a strong lift of roughly +32%)
Avg Prosecution: 4y 7m typical timeline; 39 applications currently pending
Total Applications: 56 career history, across all art units

Statute-Specific Performance

§101: 26.3% (-13.7% vs. TC avg)
§103: 53.9% (+13.9% vs. TC avg)
§102: 12.9% (-27.1% vs. TC avg)
§112: 4.7% (-35.3% vs. TC avg)
Comparisons are against a Tech Center average estimate; based on career data from 17 resolved cases.

Office Action

Rejections: §101, §103, Double Patenting
DETAILED ACTION

This action is responsive to the claims filed on 11/11/2025. Claims 1-8, 16, 18-21, 24-25, 28-29, 34-46, 106-107, 110, and 112-113 are pending for examination.

Information Disclosure Statement

The information disclosure statement (IDS) submitted on 11/11/2025 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Response to Arguments

“Kasner is non-analogous art … Kasner … is directed to Universal Trellis Coded Quantization (UTCQ) for image signals and does not relate to deep neural networks or compression of neural network weights… [thus] non-analogous art.” (Remarks, page 19)

Regarding Applicant’s argument that “Kasner is non-analogous art,” Applicant asserts that Kasner cannot be combined because it is “directed to … image signals.” The Examiner respectfully disagrees. Kasner is reasonably pertinent to the problem addressed by the claims: sequential quantization/dequantization using a state transition process that selects among sets of reconstruction levels and produces an index stream. Kasner expressly teaches a trellis/state structure in which “at any given trellis state, the next codeword must come from one of two supersets,” and that “a sequence of indices … is sufficient to allow the decoder to reproduce” (Kasner, page 1677, col. 2, paragraph 2) the chosen codewords. This is a general quantization/codec mechanism that is reasonably pertinent to the claimed problem of decoding/encoding a sequential stream of parameters (here, neural network parameters). Han provides the DNN-parameter context and index-based representation (codebook/indices), i.e., “maintaining a codebook structure” with stored indices for each connection. A person of ordinary skill in the art (POSITA) would have recognized Kasner as relevant for implementing the claimed state-dependent reconstruction-set selection and state update for any sequential index-coded parameter stream, including Han’s neural network weights.

“Kasner depends on ECTCQ, absent in Han … Han does not disclose or suggest ECTCQ, providing no reason to seek the improvement of ECTCQ’s disadvantages. Without ECTCQ in Kasner, there is no basis for a skilled artisan to apply Kasner to DNN compression.” (Remarks, page 20)

Regarding Applicant’s argument that Kasner presupposes ECTCQ and therefore allegedly cannot be applied absent ECTCQ in Han, the Examiner respectfully disagrees. The amended claims do not require ECTCQ. The claims require (i) selecting among reconstruction level sets via a state transition process and (ii) entropy coding/decoding indices using arithmetic coding with a probability model depending on the state. Kasner directly teaches the trellis/state selection mechanism and index-sequence reproduction independent of whether ECTCQ is used elsewhere. Further, Kasner explicitly discusses the interaction between the superset structure and entropy coding models (e.g., “arithmetic models (or Huffman tables)”) (Kasner, page 1677, col. 2, paragraph 1), which supports using state/superset information to select probability models for entropy coding. Accordingly, ECTCQ is not required for the teachings relied upon in Kasner that correspond to the amended claim features.
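The state-dependent set selection and index-driven state update that the Examiner maps onto Kasner can be pictured with a short sketch. The following Python fragment is illustrative only, using an assumed four-state transition table and a parity-based update consistent with the claim language (claims 18-19); it is not code from Kasner, Han, or the application.

```python
# Illustrative sketch: select between two reconstruction-level sets via a state
# transition process, and update the state from the parity of the previous index.

# Hypothetical 4-state table: next_state = TRANSITION[state][parity of last index]
TRANSITION = [[0, 2], [2, 0], [1, 3], [3, 1]]

def select_level_set(state):
    """States 0-1 select reconstruction-level set A, states 2-3 select set B."""
    return "A" if state < 2 else "B"

def walk_states(indices):
    """Show which set would apply to each sequentially coded parameter."""
    state, choices = 0, []
    for idx in indices:
        choices.append(select_level_set(state))   # selection depends only on the state
        state = TRANSITION[state][idx & 1]         # update from the preceding index
    return choices

print(walk_states([1, 0, -2, 3]))   # -> ['A', 'B', 'A', 'B']
```

The point of the sketch is only that the set used for the current parameter is fixed by a state that was determined entirely by previously coded indices, which is the mechanism the rejection attributes to Kasner's trellis.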
“No motivation to combine … There is no teaching, suggestion, or motivation in the prior art to apply Kasner’s signal-processing quantization techniques to neural network compression… the combination would require an inventive step, not a routine application.” (Remarks, page 21)

Regarding Applicant’s argument that there is “no motivation to combine,” Applicant frames the combination as non-routine. The Examiner respectfully disagrees. Han already teaches compressing neural network parameters using a codebook/index representation and further lossless coding of those values, i.e., a modular compression pipeline that includes entropy coding. Han explains that Huffman coding is a lossless compressor and that “Huffman coding these non-uniformly distributed values saves” storage. A POSITA seeking predictable improvements in compression efficiency would have been motivated to apply known state-dependent quantization structures (Kasner’s trellis selecting among reconstruction sets) to Han’s sequential parameter stream because Kasner teaches exactly the claimed mechanism for state-dependent set selection and state update. The combination is a straightforward substitution/augmentation of known codec methods (quantizer codebook structure plus entropy coder) toward the same overall goal of parameter compression.

“However, it is unclear, how the codebook of Han is to be swapped with the codebook of Kasner… Han discloses training shared weights using feed forward and back-propagation … whereas Kasner discloses training codewords by ‘taking the sample mean…’ … Han and Kasner fail to disclose how to combine two different approaches… undue redesign.” (Remarks, pages 21-22)

Regarding Applicant’s argument that the combination requires “undue redesign” due to differences in training/retraining approaches, the Examiner respectfully disagrees. Applicant is addressing implementation details not recited in the amended claims. Amended claims 1 and 46 are directed to the encoding/decoding operations: selecting reconstruction level sets based on prior indices/state, decoding/encoding a quantization index, dequantizing/quantizing to a selected reconstruction level, updating state, and arithmetic coding/decoding using a probability model depending on state. The claims do not require any specific method of training reconstruction levels. Han teaches codebook/index usage for neural network parameters (“maintaining a codebook structure” and storing indices). Kasner teaches a trellis-constrained mechanism for selecting allowed codewords and producing an index stream (“Viterbi algorithm… pick the sequence of codewords allowed by the trellis” (Kasner, page 1677, col. 2, paragraph 2)). Thus, a POSITA could implement the claimed state-dependent selection and index coding without requiring any particular reconciliation of training techniques.

“Neural network weights often exhibit lower correlation and different distributions … It is not readily predictable how the differing statistical distributions … would interact with Kasner’s UTCQ method… does not establish that the combination would have been obvious…” (Remarks, page 22)

Regarding Applicant’s argument that differing statistics make Kasner’s technique unpredictably inapplicable to neural network weights, the Examiner respectfully disagrees.
The amended claims recite structural/functional codec steps (state-dependent set selection, index coding, state update, arithmetic coding with a probability model depending on state) and do not require any particular correlation level or compression gain. Moreover, Han itself demonstrates that quantized weights and indices have biased distributions suitable for entropy coding, i.e., “both distributions are biased” (Han, page 5, section 4), and Huffman coding saves storage. Kasner’s trellis mechanism is expressly about selecting permissible reconstruction levels by state and representing selections by indices. As such, applying a known trellis/state selection mechanism to another sequential parameter stream (Han’s indices for NN parameters) is a predictable use of prior art techniques for their established purpose.

“Paragraph [0006] only mentions … ‘information from nearby elements’ … fails to specify the information… It is unclear how said paragraph discloses a probability model which depends on the state…” (Remarks, pages 22-23)

Regarding Applicant’s argument that Sze does not disclose a probability model depending on a state, the Examiner respectfully disagrees. Applicant asserts that Sze only generically mentions “nearby elements” in paragraph [0006]. However, Sze expressly teaches CABAC context modeling in which a context model includes a state value and bins are coded using the current state of the selected context model; specifically, “a context model includes a state value” and “a bin is encoded based on the current state of the context model” (Sze, paragraph 27) selected by the context modeler, and the model is updated after coding. Sze also teaches that the bin encoder/decoder performs arithmetic coding/decoding “using the context model (probability) selected.” Thus, Sze provides the claimed arithmetic coding/decoding using a probability model whose selection and probability estimation depend on state/context.

“However, this state value is not the state defined in amended claim 1… a state that determines quantization levels is not the same as a state that indicates a symbol probability… Accordingly, Sze does not disclose or suggest the claimed limitation.” (Remarks, page 24)

Regarding Applicant’s argument that CABAC’s state is not the same as Kasner’s trellis state, the Examiner respectfully disagrees. The amended claim language requires that arithmetic coding use “a probability model which depends on the state for the current neural network parameter.” In the applied combination, the trellis/quantizer state (Kasner) is the “state … associated with the current neural network parameter” that determines the reconstruction level set, and that same state can be used as a conditioning context to select which probability model is used for arithmetic coding/decoding of the index (Sze). This is consistent with Kasner’s own discussion that separate superset index streams would otherwise require multiple entropy coding models, i.e., “arithmetic models (or Huffman tables).” Sze teaches selecting a probability/context model and coding bins based on the selected model/state. Therefore, the combined teachings provide a probability model whose selection depends on the (Kasner) state used for the current parameter, satisfying the claim without requiring that Kasner’s trellis state and CABAC’s internal probability-state be identical constructs.
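To illustrate how a quantizer state could double as the condition for choosing an entropy-coding probability model, here is a minimal sketch. The adaptive-model class, its update rule, and the per-state model mapping are assumptions for illustration only; this is not Sze's CABAC implementation.

```python
# Illustrative sketch: the quantizer state selects the probability model used for
# binary arithmetic coding of a bin, and the model adapts after each coded bin.

class BinaryModel:
    """A simple adaptive probability estimate for one context (assumed update rule)."""
    def __init__(self, p_one=0.5, rate=0.05):
        self.p_one, self.rate = p_one, rate

    def update(self, bin_value):
        # Move the probability estimate toward the bin value that was just coded.
        target = 1.0 if bin_value else 0.0
        self.p_one += self.rate * (target - self.p_one)

# One model per quantizer state (hypothetical: four states, as in the sketch above).
models = {state: BinaryModel() for state in range(4)}

def code_significance_bin(state, bin_value):
    """Select the probability model by state, use its estimate, then update it."""
    model = models[state]     # the probability model depends on the state
    p_used = model.p_one      # an arithmetic coder would code the bin with this probability
    model.update(bin_value)   # the model's own state adapts after coding
    return p_used

print(code_significance_bin(0, 1), code_significance_bin(0, 1))  # -> 0.5 0.525
```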
“The combination proposed by the Examiner would not yield the claimed arrangement… There is no reasonable expectation of success in arriving at the claimed structure from the cited references.” (Remarks, page 25)

Regarding Applicant’s argument that the combination “would not yield the claimed arrangement” and lacks a reasonable expectation of success, the Examiner respectfully disagrees. The proposed combination is a coherent codec pipeline: Han provides neural network parameter compression using a codebook/index representation and lossless entropy coding; Han notes Huffman coding is lossless and can be performed offline. Kasner provides the claimed state transition process that selects among reconstruction level sets and updates the state based on previous indices. Sze provides arithmetic coding/decoding using selected probability models and stateful context models. A POSITA would have had a reasonable expectation that substituting a known trellis/state-dependent quantizer structure (Kasner) into a known DNN parameter compression framework (Han), and substituting arithmetic coding with selectable probability models (Sze) for Huffman coding, is a predictable combination of known techniques to achieve known compression objectives (reducing bitstream/storage).

Claim Objections

Claims 24-25, 28, and 34 are objected to because of the following informalities: these claims depend upon a cancelled claim 23; for the purposes of examination they are being interpreted as depending on claim 1. Appropriate correction is required.

Double Patenting

A rejection based on double patenting of the “same invention” type finds its support in the language of 35 U.S.C. 101, which states that “whoever invents or discovers any new and useful process... may obtain a patent therefor...” (emphasis added). Thus, the term “same invention,” in this context, means an invention drawn to identical subject matter. See Miller v. Eagle Mfg. Co., 151 U.S. 186 (1894); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Ockert, 245 F.2d 467, 114 USPQ 330 (CCPA 1957).

A statutory type (35 U.S.C. 101) double patenting rejection can be overcome by canceling or amending the claims that are directed to the same invention so they are no longer coextensive in scope. The filing of a terminal disclaimer cannot overcome a double patenting rejection based upon 35 U.S.C. 101.

Claims 1, 24-25, and 34-38 are provisionally rejected under 35 U.S.C. 101 as claiming the same invention as that of claims 14-21 of copending Application No. 19/267,146 (reference application). This is a provisional statutory double patenting rejection since the claims directed to the same invention have not in fact been patented. The corresponding claims are reproduced below for comparison.

Instant Application | Application No. 19/267,146

1.
(Currently Amended) Apparatus for decoding neural network parameters, which define a neural network, from a data stream, configured to sequentially decode the neural network parameters by selecting, for a current neural network parameter, a set of reconstruction levels out of a plurality of reconstruction level sets depending on quantization indices decoded from the data stream for previous neural network parameters, decoding a quantization index for the current neural network parameter from the data stream, wherein the quantization index indicates one reconstruction level out of the selected set of reconstruction levels for the current neural network parameter, dequantizing the current neural network parameter onto the one reconstruction level of the selected set of reconstruction levels that is indicated by the quantization index for the current neural network parameter, 1. Apparatus for decoding neural network parameters, which define a neural network, from a data stream, comprising a processor configured to sequentially decode the neural network parameters by selecting, for a current neural network parameter, a set of reconstruction levels out of a plurality of reconstruction level sets depending on quantization indices decoded from the data stream for previous neural network parameters, decoding a quantization index for the current neural network parameter from the data stream, wherein the quantization index indicates one reconstruction level out of the selected set of reconstruction levels for the current neural network parameter, dequantizing the current neural network parameter onto the one reconstruction level of the selected set of reconstruction levels that is indicated by the quantization index for the current neural network parameter. select, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets by means of a state transition process by determining, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets depending on a state associated with the current neural network parameter, and updating the state for a subsequent neural network parameter depending on the quantization index decoded from the data stream for the immediately preceding neural network parameter, and decode the quantization index for the current neural network parameter from the data stream using arithmetic coding using a probability model which depends on the state for the current neural network parameter. 14. Apparatus of claim 1, configured to select, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets by means of a state transition process by determining, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets depending on a state associated with the current neural network parameter, and updating the state for a subsequent neural network parameter depending on the quantization index decoded from the data stream for the immediately preceding neural network parameter, and decode the quantization index for the current neural network parameter from the data stream using arithmetic coding using a probability model which depends on the state for the current neural network parameter. 24. 
(Original) Apparatus of claim 23, configured to decode the quantization index for the current neural network parameter from the data stream using binary arithmetic coding by using the probability model which depends on the state for the current neural network parameter for at least one bin of a binarization of the quantization index. 15. Apparatus of claim 14, configured to decode the quantization index for the current neural network parameter from the data stream using binary arithmetic coding by using the probability model which depends on the state for the current neural network parameter for at least one bin of a binarization of the quantization index. 25. (Original) Apparatus of claim 23, wherein the at least one bin comprises a significance bin indicative of the quantization index of the current neural network parameter being equal to zero or not. 16. Apparatus of claim 14, wherein the at least one bin comprises a significance bin indicative of the quantization index of the current neural network parameter being equal to zero or not. 34. (Currently Amended) Apparatus of claim [[22]]23, wherein the probability model additionally depends on the quantization index of previously decoded neural network parameters. 17. Apparatus of claim 14, wherein the probability model additionally depends on the quantization index of previously decoded neural network parameters. 35. (Original) Apparatus of claim 34, configured to preselect, depending on the state or the set of reconstruction levels selected for the current neural network parameter, a subset of probability models out of a plurality of probability models and select the probability model for the current neural network parameter out of the subset of probability models depending on the quantization index of previously decoded neural network parameters. 18. Apparatus of claim 17, configured to preselect, depending on the state or the set of reconstruction levels selected for the current neural network parameter, a subset of probability models out of a plurality of probability models and select the probability model for the current neural network parameter out of the subset of probability models depending on the quantization index of previously decoded neural network parameters. 36. (Original) Apparatus of claim 35, configured to preselect, depending on the state or the set of reconstruction levels selected for the current neural network parameter, the subset of probability models out of the plurality of probability models in a manner so that a subset preselected for a first state or reconstruction levels set is disjoint to a subset preselected for any other state or reconstruction levels set. 19. Apparatus of claim 18, configured to preselect, depending on the state or the set of reconstruction levels selected for the current neural network parameter, the subset of probability models out of the plurality of probability models in a manner so that a subset preselected for a first state or reconstruction levels set is disjoint to a subset preselected for any other state or reconstruction levels set. 37. (Original) Apparatus of claim 35, configured to select the probability model for the current neural network parameter out of the subset of probability models depending on the quantization index of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to. 20. 
Apparatus of claim 18, configured to select the probability model for the current neural network parameter out of the subset of probability models depending on the quantization index of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to. 38. (Original) Apparatus of claim 35, configured to select the probability model for the current neural network parameter out of the subset of probability models depending on a characteristic of the quantization index of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to, the characteristic comprising on or more of the signs of non-zero quantization indices of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to, the number of quantization indices of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to, and which are non-zero a sum of the absolute values of quantization indices of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to a difference between a sum of the absolute values of quantization indices of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to, and the number of quantization indices of the previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to, and which are non-zero. 21. Apparatus of claim 18, configured to select the probability model for the current neural network parameter out of the subset of probability models depending on a characteristic of the quantization index of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to, the characteristic comprising on or more of the signs of non-zero quantization indices of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to, the number of quantization indices of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to, and which are non-zero a sum of the absolute values of quantization indices of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to a difference between a sum of the absolute values of quantization indices of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to, and the number of quantization indices of the previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to, and which are non-zero. 
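The preselection and neighborhood-dependent model choice recited in claims 35-38 above (mirrored by claims 18-21 of the reference application) can be pictured with a small sketch. Everything in it (the function name, the bucketing of the neighborhood statistic, and the number of models per state) is an illustrative assumption, not language from either application.

```python
# Illustrative sketch of neighborhood-based probability-model selection: the model
# index is derived from a state-dependent, disjoint subset of models and from
# already-decoded quantization indices of neighboring parameters.

def context_index(state, neighbor_indices, models_per_state=3):
    """Preselect a disjoint range of models by state, then pick one within that
    range from the count of non-zero neighboring quantization indices."""
    nonzero = sum(1 for q in neighbor_indices if q != 0)
    local = min(nonzero, models_per_state - 1)   # bucket the neighborhood statistic
    return state * models_per_state + local      # disjoint ranges per state

print(context_index(state=2, neighbor_indices=[0, 3, -1]))   # -> 8
```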
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).

A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).

The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional, the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to a final Office action, see 37 CFR 1.113(c). A request for reconsideration, while not provided for in 37 CFR 1.113(c), may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.

The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.

Claims 106 and 112 are rejected on the ground of nonstatutory double patenting as being unpatentable over claim 14 of copending Application No. 19/267,146. Although the claims at issue are not identical, they are not patentably distinct from each other because they recite substantially identical claims under a different statutory category. This is a provisional nonstatutory double patenting rejection since the claims directed to the same invention have not in fact been patented.

106.
(Currently Amended) Method for decoding neural network parameters, which define a neural network, from a data stream, the method comprising: sequentially decoding the neural network parameters by selecting, for a current neural network parameter, a set of reconstruction levels out of a plurality of reconstruction level sets depending on quantization indices decoded from the data stream for previous neural network parameters, decoding a quantization index for the current neural network parameter from the data stream, wherein the quantization index indicates one reconstruction level out of the selected set of reconstruction levels for the current neural network parameter, dequantizing the current neural network parameter onto the one reconstruction level of the selected set of reconstruction levels that is indicated by the quantization index for the current neural network parameter; 1. Apparatus for decoding neural network parameters, which define a neural network, from a data stream, comprising a processor configured to sequentially decode the neural network parameters by selecting, for a current neural network parameter, a set of reconstruction levels out of a plurality of reconstruction level sets depending on quantization indices decoded from the data stream for previous neural network parameters, decoding a quantization index for the current neural network parameter from the data stream, wherein the quantization index indicates one reconstruction level out of the selected set of reconstruction levels for the current neural network parameter, dequantizing the current neural network parameter onto the one reconstruction level of the selected set of reconstruction levels that is indicated by the quantization index for the current neural network parameter. selecting, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets by means of a state transition process by determining, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets depending on a state associated with the current neural network parameter, and updating the state for a subsequent neural network parameter depending on the quantization index decoded from the data stream for the immediately preceding neural network parameter, and decoding the quantization index for the current neural network parameter from the data stream using arithmetic coding using a probability model which depends on the state for the current neural network parameter. 14. Apparatus of claim 1, configured to select, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets by means of a state transition process by determining, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets depending on a state associated with the current neural network parameter, and updating the state for a subsequent neural network parameter depending on the quantization index decoded from the data stream for the immediately preceding neural network parameter, and decode the quantization index for the current neural network parameter from the data stream using arithmetic coding using a probability model which depends on the state for the current neural network parameter. 112. 
(Currently Amended) A non-transitory digital storage medium having a computer program stored thereon to perform the method for decoding neural network parameters, which define a neural network, from a data stream, the method comprising: sequentially decoding the neural network parameters by selecting, for a current neural network parameter, a set of reconstruction levels out of a plurality of reconstruction level sets depending on quantization indices decoded from the data stream for previous neural network parameters, decoding a quantization index for the current neural network parameter from the data stream, wherein the quantization index indicates one reconstruction level out of the selected set of reconstruction levels for the current neural network parameter, dequantizing the current neural network parameter onto the one reconstruction level of the selected set of reconstruction levels that is indicated by the quantization index for the current neural network parameters 1. Apparatus for decoding neural network parameters, which define a neural network, from a data stream, comprising a processor configured to sequentially decode the neural network parameters by selecting, for a current neural network parameter, a set of reconstruction levels out of a plurality of reconstruction level sets depending on quantization indices decoded from the data stream for previous neural network parameters, decoding a quantization index for the current neural network parameter from the data stream, wherein the quantization index indicates one reconstruction level out of the selected set of reconstruction levels for the current neural network parameter, dequantizing the current neural network parameter onto the one reconstruction level of the selected set of reconstruction levels that is indicated by the quantization index for the current neural network parameter. selecting, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets by means of a state transition process by determining, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets depending on a state associated with the current neural network parameter, and updating the state for a subsequent neural network parameter depending on the quantization index decoded from the data stream for the immediately preceding neural network parameter, and decoding the quantization index for the current neural network parameter from the data stream using arithmetic coding using a probability model which depends on the state for the current neural network parameter, when said computer program is run by a computer. 14. Apparatus of claim 1, configured to select, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets by means of a state transition process by determining, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets depending on a state associated with the current neural network parameter, and updating the state for a subsequent neural network parameter depending on the quantization index decoded from the data stream for the immediately preceding neural network parameter, and decode the quantization index for the current neural network parameter from the data stream using arithmetic coding using a probability model which depends on the state for the current neural network parameter. 
Claims 46, 107, and 113 are rejected on the ground of nonstatutory double patenting as being unpatentable over claim 14 of copending Application No. 19/267,146 in view of Sze (US 2013/0272389 A1). This is a provisional nonstatutory double patenting rejection since the claims directed to the same invention have not in fact been patented.

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to implement the decoding functionality of claim 14 of Application No. 19/267,146 in a corresponding encoding mode (i.e., to encode the quantization indices/neural-network parameters into a data stream), because encoding and decoding are complementary inverse operations in the same compression system and implementing the encoder counterpart is a predictable variation. Furthermore, it would have been obvious to use arithmetic/entropy coding with an adaptive probability state for the quantization indices and to update the coder’s state after coding each index for use in coding the next index, because adaptive probability modeling/state updating based on prior coded values is a known technique to improve compression efficiency; Sze teaches CABAC encoding in which a bin encoder performs binary arithmetic coding using a selected context model (probability), where the context model includes a state value, and the context models are updated throughout the coding process such that a bin is encoded based on the current state of the selected context model and then the context model is updated after the bin is coded.

46. (Currently Amended) Apparatus for encoding neural network parameters, which define a neural network, into a data stream, configured to sequentially encode the neural network parameters by selecting, for a current neural network parameter, a set of reconstruction levels out of a plurality of reconstruction level sets depending on quantization indices encoded into the data stream for previously encoded neural network parameters, quantizing the current neural network parameter onto the one reconstruction level of the selected set of reconstruction levels, and encoding a quantization index for the current neural network parameter that indicates the one reconstruction level onto which the quantization index for the current neural network parameter is quantized into the data stream.

1. Apparatus for decoding neural network parameters, which define a neural network, from a data stream, comprising a processor configured to sequentially decode the neural network parameters by selecting, for a current neural network parameter, a set of reconstruction levels out of a plurality of reconstruction level sets depending on quantization indices decoded from the data stream for previous neural network parameters, decoding a quantization index for the current neural network parameter from the data stream, wherein the quantization index indicates one reconstruction level out of the selected set of reconstruction levels for the current neural network parameter, dequantizing the current neural network parameter onto the one reconstruction level of the selected set of reconstruction levels that is indicated by the quantization index for the current neural network parameter.
select, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets by means of a state transition process by determining, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets depending on a state associated with the current neural network parameter, 14. Apparatus of claim 1, configured to select, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets by means of a state transition process by determining, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets depending on a state associated with the current neural network parameter, and updating the state for a subsequent neural network parameter depending on the quantization index encoded into the data stream for the immediately preceding neural network parameter, and encode the quantization index for the current neural network parameter into the data stream using arithmetic coding using a probability model which depends on the state for the current neural network parameter. 1… updating the probability of the binary value according to the binary value of the current bin for a next bin by using multiple-parameter probability models, wherein each of the multiple-parameter probability models is updated using an individual set of probability states associated with a corresponding parameter. 107. (Currently Amended) Method for encoding neural network parameters, which define a neural network, into a data stream, the method comprising: sequentially encoding the neural network parameters by selecting, for a current neural network parameter, a set of reconstruction levels out of a plurality of reconstruction level sets depending on quantization indices encoded into the data stream for previously encoded neural network parameters, quantizing the current neural network parameter onto the one reconstruction level of the selected set of reconstruction levels, and encoding a quantization index for the current neural network parameter that indicates the one reconstruction level onto which the quantization index for the current neural network parameter is quantized into the data stream 1. Apparatus for decoding neural network parameters, which define a neural network, from a data stream, comprising a processor configured to sequentially decode the neural network parameters by selecting, for a current neural network parameter, a set of reconstruction levels out of a plurality of reconstruction level sets depending on quantization indices decoded from the data stream for previous neural network parameters, decoding a quantization index for the current neural network parameter from the data stream, wherein the quantization index indicates one reconstruction level out of the selected set of reconstruction levels for the current neural network parameter, dequantizing the current neural network parameter onto the one reconstruction level of the selected set of reconstruction levels that is indicated by the quantization index for the current neural network parameter. 
selecting, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets by means of a state transition process by determining, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets depending on a state associated with the current neural network parameter, 14. Apparatus of claim 1, configured to select, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets by means of a state transition process by determining, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets depending on a state associated with the current neural network parameter, and updating the state for a subsequent neural network parameter depending on the quantization index encoded into the data stream for the immediately preceding neural network parameter, and encoding the quantization index for the current neural network parameter into the data stream using arithmetic coding using a probability model which depends on the state for the current neural network parameter. 1… updating the probability of the binary value according to the binary value of the current bin for a next bin by using multiple-parameter probability models, wherein each of the multiple-parameter probability models is updated using an individual set of probability states associated with a corresponding parameter. 113. (Currently Amended) A non-transitory digital storage medium having a computer program stored thereon to perform the method for encoding neural network parameters, which define a neural network, into a data stream, the method comprising: sequentially encoding the neural network parameters by selecting, for a current neural network parameter, a set of reconstruction levels out of a plurality of reconstruction level sets depending on quantization indices encoded into the data stream for previously encoded neural network parameters, quantizing the current neural network parameter onto the one reconstruction level of the selected set of reconstruction levels, and encoding a quantization index for the current neural network parameter that indicates the one reconstruction level onto which the quantization index for the current neural network parameter is quantized into the data stream. 1. Apparatus for decoding neural network parameters, which define a neural network, from a data stream, comprising a processor configured to sequentially decode the neural network parameters by selecting, for a current neural network parameter, a set of reconstruction levels out of a plurality of reconstruction level sets depending on quantization indices decoded from the data stream for previous neural network parameters, decoding a quantization index for the current neural network parameter from the data stream, wherein the quantization index indicates one reconstruction level out of the selected set of reconstruction levels for the current neural network parameter, dequantizing the current neural network parameter onto the one reconstruction level of the selected set of reconstruction levels that is indicated by the quantization index for the current neural network parameter. 
selecting, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets by means of a state transition process by determining, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets depending on a state associated with the current neural network parameter, 14. Apparatus of claim 1, configured to select, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets by means of a state transition process by determining, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets depending on a state associated with the current neural network parameter, and updating the state for a subsequent neural network parameter depending on the quantization index encoded into the data stream for the immediately preceding neural network parameter, and encoding the quantization index for the current neural network parameter into the data stream using arithmetic coding using a probability model which depends on the state for the current neural network parameter, when said computer program is run by a computer. 1… updating the probability of the binary value according to the binary value of the current bin for a next bin by using multiple-parameter probability models, wherein each of the multiple-parameter probability models is updated using an individual set of probability states associated with a corresponding parameter.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-8, 16, 18-21, 24-25, 28-29, 34-46, 106-107, 110, and 112-113 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Statutory Categories

Claims 1-8, 16, 18-21, 24-25, 28-29, and 34-45 are directed to an apparatus. Claim 46 is directed to an apparatus. Claim 106 is directed to a method. Claim 107 is directed to a method. Claim 112 is directed to a computer-readable medium. Claim 113 is directed to a computer-readable medium.

Independent Claims 1, 106, and 112

Step 2A Prong 1: Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes.
Independent claims 1, 106, and 112 recite limitations that are abstract ideas in the form of mental processes. Claim 1 recites:

sequentially decode the neural network parameters by selecting, for a current neural network parameter, a set of reconstruction levels out of a plurality of reconstruction level sets depending on quantization indices decoded from the data stream for previous neural network parameters, (simply selecting a set is interpreted to be a mental process of evaluation which can reasonably be performed in the human mind)

decoding a quantization index for the current neural network parameter from the data stream, wherein the quantization index indicates one reconstruction level out of the selected set of reconstruction levels for the current neural network parameter, (this limitation merely comprises a mathematical analysis of data and is being considered as directed to a mathematical concept, see MPEP 2106.04(a); page 27, section 2.2.2, of this application’s specification outlines the mathematical procedure for this step)

dequantizing the current neural network parameter onto the one reconstruction level of the selected set of reconstruction levels that is indicated by the quantization index for the current neural network parameter. (this limitation merely comprises a mathematical analysis of data and is being considered as directed to a mathematical concept, see MPEP 2106.04(a); page 27, section 2.2.2, of this application’s specification outlines the mathematical procedure for this step)

select, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets by means of a state transition process by determining, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets depending on a state associated with the current neural network parameter, and (a selection that is being considered as a mental process of evaluation which can reasonably be performed in the mind or with the aid of pen and paper)

updating the state for a subsequent neural network parameter depending on the quantization index decoded from the data stream for the immediately preceding neural network parameter, and (an updating process that is being considered as a mathematical implementation; see pages 52-53 for support for the mathematical process of this limitation)

decode the quantization index for the current neural network parameter from the data stream using arithmetic coding using a probability model which depends on the state for the current neural network parameter. (a decoding process that is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations; see page 27, section 2.2.2, for support for the mathematical process of this limitation)

This claim further recites the following additional elements for the purposes of the Step 2A Prong Two analysis:

Apparatus for decoding neural network parameters, which define a neural network, from a data stream, comprising a processor configured to (this limitation invokes computers or other machinery merely as a tool to perform an existing process and is considered as mere instructions to apply an exception, see MPEP 2106.05(f))

The additional limitations fail Step 2A Prong Two of the 101 analysis because they do not transform the claim into a practical application. These limitations are too abstract or lack a technical improvement that would make the concept practically useful.
Without clear utility or integration into a specific field, the claim does not relate to any particular application. It does not meet the requirements of Step 2A Prong Two, as it fails to make the concept meaningfully applicable in practice. Since the claim as a whole, looking at the additional elements individually and in combination, does not contain any other additional elements that are indicative of integration into a practical application, the claim is directed to an abstract idea.

This claim recites the following additional elements for the purposes of the Step 2B analysis:

Apparatus for decoding neural network parameters, which define a neural network, from a data stream, comprising a processor configured to (this limitation invokes computers or other machinery merely as a tool to perform an existing process and is considered as mere instructions to apply an exception, see MPEP 2106.05(f))

The claim also fails Step 2B of the analysis because the additional limitations do not amount to significantly more than the abstract idea itself. The additional limitations do not enhance the claim in a way that would move it beyond its abstract ideas, as they minimally elaborate on the core concept without adding any inventive or technical substance. Considering the additional elements individually and in combination, and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. Therefore, the claim is not patent eligible.

Claims 106 and 112 recite limitations substantially similar to claim 1; as such, a similar analysis applies.

Dependents of Claim 1

The remaining dependent claims corresponding to independent claim 1 do not recite additional elements, whether considered individually or in combination, that are sufficient to integrate the judicial exception into a practical application or amount to significantly more than the judicial exception. The analysis is shown below.

The claims below recite additional limitations which fail Step 2A Prong Two of the 101 analysis because they do not transform the claims into a practical application. These limitations are too abstract or lack a technical improvement that would make the concept practically useful. Without clear utility or integration into a specific field, the claims do not relate to any particular application and do not meet the requirements of Step 2A Prong Two, as they fail to make the concept meaningfully applicable in practice. The claims also fail Step 2B of the analysis because the additional limitations do not amount to significantly more than the abstract idea itself. The additional limitations do not enhance the claims in a way that would move them beyond their abstract ideas, as they minimally elaborate on the core concept without adding any inventive or technical substance. The claims are unpatentable.

Claim 2 recites the further limitation of:

Apparatus of claim 1, wherein the neural network parameters relate to weights of neuron interconnections of the neural network. (For Step 2A and Step 2B: this limitation is merely directed to a field of use, i.e., neural networks. See MPEP 2106.05(h))

Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.

Claim 3 recites the further limitation of:

Apparatus of claim 1, wherein the number of reconstruction level sets of the plurality of reconstruction level sets is two.
(further specifying a parameter for the aforementioned mathematical process is still being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations. Page 23, section 2.1, provides support for the mathematical process of this claim.)

Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.

Claim 4 recites the further limitation of:

Apparatus of claim 1, configured to parametrize the plurality of reconstruction level sets by way of a predetermined quantization step size and derive information on the predetermined quantization step size from the data stream. (further defining parameterization and step size for this process is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations; page 23, section 2.1, provides support for the mathematical process of this claim)

Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.

Claim 5 recites the further limitation of:

Apparatus of claim 1, wherein the neural network comprises a one or more NN layers and the apparatus is configured to derive, for each NN layer, information on a predetermined quantization step size for the respective NN layer from the data stream, (computing/obtaining per-layer step sizes is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations; page 23, section 2.1, provides support for the mathematical process of this claim) and parametrize, for each NN layer, the plurality of reconstruction level sets using the predetermined quantization step size derived for the respective NN layer so as to be used for dequantizing the neural network parameters belonging to the respective NN layer. (a parameterization per layer that is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations; page 27, section 2.2.2, provides support for the mathematical process of this claim)

Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.

Claim 6 recites the further limitation of:

Apparatus of claim 1, wherein the number of reconstruction level sets of the plurality of reconstruction level sets is two and the plurality of reconstruction level sets comprises (further specifying a parameter for the aforementioned mathematical process is still being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations. Page 23, section 2.1, provides support for the mathematical process of this claim.) a first reconstruction level set that comprises zero and even multiples of a predetermined quantization step size, (further defining these reconstruction level sets is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations. Page 33, rule c, provides support for the mathematical process of this claim.) and a second reconstruction level set that comprises zero and odd multiples of the predetermined quantization step size.
(further defining these construction sets is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations. Page 33, rule c, provides support for the mathematical process of this claim.) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible Claim 7 recites the further limitation of: Apparatus of claim 1, wherein all reconstruction levels of all reconstruction level sets represent integer multiples of a predetermined quantization step size, and the apparatus is configured to dequantize the neural network parameters by (further defining step sizes is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations, see page 33, rules a-d for support for the mathematical process of this limitation) deriving, for each neural network parameter, an intermediate integer value depending on the selected reconstruction level set for the respective neural network parameter and the entropy decoded quantization index for the respective neural network parameter, (the process of derivation is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations, see page 33, rules a-d for support for the mathematical process of this limitation) and multiplying, for each neural network parameter, the intermediate value for the respective neural network parameter with the predetermined quantization step size for the respective neural network parameter. (a mathematical implementation involving mathematical concepts, algorithms, or calculations, see page 25, line 5-6, for support for the mathematical process of this limitation) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible Claim 8 recites the further limitation of: Apparatus of claim 6, wherein the number of reconstruction level sets of the plurality of reconstruction level sets is two and the apparatus is configured to derive the intermediate value for each neural network parameter by, if the selected reconstruction level set for the respective neural network parameter is a first set, multiply the quantization index for the respective neural network parameter by two to acquire the intermediate value for the respective neural network parameter; and(an arithmetic rule that is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations) if the selected reconstruction level set for a respective neural network parameter is a second set and the quantization index for the respective neural network parameter is equal to zero, set the intermediate value for the respective neural network parameter equal to zero; and (an arithmetic rule that is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations) if the selected reconstruction level set for a respective neural network parameter is a second set and the quantization index for the respective neural network parameter is greater than zero, multiply the quantization index for the respective neural network parameter by two and subtract one from the result of the multiplication to acquire the intermediate value for the respective neural network 
parameter; and (an arithmetic rule that is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations) if the selected reconstruction level set for a current neural network parameter is a second set and the quantization index for the respective neural network parameter is less than zero, multiply the quantization index for the respective neural network parameter by two and add one to the result of the multiplication to acquire the intermediate value for the respective neural network parameter. (an arithmetic rule that is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible Claim 16 recites the further limitation of: Apparatus of claim 1, wherein the apparatus is configured to select, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets by means of a state transition process by determining, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets depending on a state associated with the current neural network parameter, and (a selection that is being considered as a mental process of evaluation which can reasonably be performed in mind or with aid of pen and paper) updating the state for a subsequent neural network parameter depending on the quantization index decoded from the data stream for the immediately preceding neural network parameter. (an updating process, stated at a high level of generality, that is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations, see page 27, section 2.2.2, for support for the mathematical process of this limitation) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible Claim 18 recites the further limitation of: Apparatus of claim 8, configured to update the state for the subsequent neural network parameter using a parity of the quantization index decoded from the data stream for the immediately preceding neural network parameter. (an updating process that is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations, see page 27, section 2.2.2, for support for the mathematical process of this limitation) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible Claim 19 recites the further limitation of: Apparatus of claim 8, wherein the state transition process is configured to transition between four or eight possible states. 
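For illustration only, a minimal sketch of the kind of state-driven dequantization recited in claims 8 and 16-19, assuming two reconstruction level sets, an eight-state transition driven by the parity of the preceding index, and a hypothetical step size; the transition table and all numeric values below are invented for the example and are not taken from the application, its specification, or the cited art:

    DELTA = 0.25          # hypothetical quantization step size
    NUM_STATES = 8        # cf. claim 19 (four or eight possible states)
    # Hypothetical transition table: next state = TRANS[state][parity of index].
    TRANS = [[0, 4], [5, 1], [2, 6], [7, 3], [1, 5], [4, 0], [3, 7], [6, 2]]

    def reconstruction_set(state):
        # First half of the states selects set 0, second half selects set 1 (cf. claim 20).
        return 0 if state < NUM_STATES // 2 else 1

    def intermediate_value(set_id, q):
        # Arithmetic rules in the style of claim 8.
        if set_id == 0:
            return 2 * q
        if q == 0:
            return 0
        return 2 * q - 1 if q > 0 else 2 * q + 1

    def dequantize(indices):
        state, values = 0, []
        for q in indices:
            set_id = reconstruction_set(state)
            values.append(intermediate_value(set_id, q) * DELTA)  # cf. claim 7
            state = TRANS[state][q & 1]                           # parity update, cf. claim 18
        return values

    print(dequantize([0, 1, -2, 3]))  # [0.0, 0.5, -0.75, 1.5]

An encoder would run the same state machine so that the decoder's set selection stays synchronized with the decoded indices.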
(further defining the state process is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations, see page 52-53 of the spec, for support for the mathematical process of this limitation) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible Claim 20 recites the further limitation of: Apparatus of claim 8, configured to transition, in the state transition process, between an even number of possible states and the number of reconstruction level sets of the plurality of reconstruction level sets is two, (further parameterizing the state transition process is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations, see pages 52-53, for support for the mathematical process of this limitation) wherein the determining, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets depending on the state associated with the current neural network parameter determines a first reconstruction level set out of the plurality of reconstruction level sets if the state belongs to a first half of the even number of possible states, and a second reconstruction level set out of the plurality of reconstruction level sets if the state belongs to a second half of the even number of possible states. (a determination that is being considered as a mental process of evaluation which can reasonably be performed in mind or with aid of pen and paper) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible Claim 21 recites the further limitation of: Apparatus of claim 16, configured to perform the update of the state by means of a transition table which maps a combination of the state and a parity of the quantization index decoded from the data stream for the immediately preceding neural network parameter onto a further state associated with the subsequent neural network parameter. (an updating process that is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations, see page 27, section 2.2.2, for support for the mathematical process of this limitation) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible Claim 24 recites the further limitation of: Apparatus of claim 23, configured to decode the quantization index for the current neural network parameter from the data stream using binary arithmetic coding by using the probability model which depends on the state for the current neural network parameter for at least one bin of a binarization of the quantization index. 
(a decoding process that is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations, see page 26, section 2.2.1, for support for the mathematical process of this limitation) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible Claim 25 recites the further limitation of: Apparatus of claim 23, wherein the at least one bin comprises a significance bin indicative of the quantization index of the current neural network parameter being equal to zero or not. (further defining a significant bit for this process is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations, see pages 26, section 2.2.1, for support for the mathematical process of this limitation) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible Claim 28 recites the further limitation of: Apparatus of claim 23, configured so that the dependency of the probability model involves a selection of a context out of a set of contexts for the neural network parameters using the dependency, each context having a predetermined probability model associated therewith. (a mental process of selection that can reasonably be performed in the human mind or with aid of pen and paper) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible Claim 29 recites the further limitation of: Apparatus of claim 28, configured to update the predetermined probability model associated with each of the contexts based on the quantization index arithmetically coded using the respective context. (an updating process stated at a high level of generality based on predetermined values is being considered a mental process of selection that can reasonably be performed in the human mind or with aid of pen and paper) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible Claim 34 recites the further limitation of: Apparatus of claim 23, wherein the probability model additionally depends on the quantization index of previously decoded neural network parameters. (this limitation merely comprises a mathematical analysis of data and is being considered as directed to a mathematical concept.) 
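For illustration only, a minimal sketch of a probability-model selection that depends on previously decoded quantization indices, of the general kind recited in claim 34 and refined in claims 35-38; the statistics, the bucketing, and the index values are invented for the example and are not taken from the application or the cited art:

    # Hypothetical previously decoded quantization indices from a neighboring
    # portion of the network (values invented purely for illustration).
    neighbor_indices = [0, 2, -1, 0, 3]

    num_nonzero = sum(1 for q in neighbor_indices if q != 0)          # 3
    sum_abs = sum(abs(q) for q in neighbor_indices)                   # 6
    diff = sum_abs - num_nonzero                                      # 3
    signs = [1 if q > 0 else -1 for q in neighbor_indices if q != 0]  # [1, -1, 1]

    # Hypothetical bucketing of one statistic into a small set of probability
    # models; a state-dependent preselection (claims 35-36) could first narrow
    # the candidates to a disjoint subset before this step.
    model_index = min(num_nonzero, 3)

A decoder would then decode the next bin with the probability estimate held by the selected model and update that model afterwards.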
Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible Claim 35 recites the further limitation of: Apparatus of claim 34, configured to preselect, depending on the state or the set of reconstruction levels selected for the current neural network parameter, a subset of probability models out of a plurality of probability models and (a simple preselection of a set of models is being considered as a mental process of evaluation which can reasonably be performed in one’s mind) select the probability model for the current neural network parameter out of the subset of probability models depending on the quantization index of previously decoded neural network parameters. (a selection of a model based on a ascertained value is being considered as a mental process of evaluation which can reasonably be performed in one’s mind) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible Claim 36 recites the further limitation of: Apparatus of claim 35, configured to preselect, depending on the state or the set of reconstruction levels selected for the current neural network parameter, the subset of probability models out of the plurality of probability models in a manner so that a subset preselected for a first state or reconstruction levels set is disjoint to a subset preselected for any other state or reconstruction levels set. (a selection of a model between inherently different sets is being considered as a mental process of evaluation which can reasonably be performed in one’s mind) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible Claim 37 recites the further limitation of: Apparatus of claim 35, configured to select the probability model for the current neural network parameter out of the subset of probability models depending on the quantization index of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to. 
(a selection that is being considered as a mental process of evaluation which can reasonably be performed in mind or with aid of pen and paper) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible. Claim 38 recites the further limitation of: Apparatus of claim 35, configured to select the probability model for the current neural network parameter out of the subset of probability models depending on a characteristic of the quantization index of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to, the characteristic comprising one or more of (a selection that is being considered as a mental process of evaluation which can reasonably be performed in mind or with aid of pen and paper) the signs of non-zero quantization indices of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to, (a selection that is being considered as a mental process of evaluation which can reasonably be performed in mind or with aid of pen and paper) the number of quantization indices of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to, and which are non-zero (a counting statistic that is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations, see page 27, section 2.2.2, for support for the mathematical process of this limitation) a sum of the absolute values of quantization indices of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to (a sum that is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations) a difference between a sum of the absolute values of quantization indices of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to, and the number of quantization indices of the previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to, and which are non-zero. (a difference that is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible. Claim 39 recites the further limitation of: Apparatus of claim 37, configured to locate the previously decoded neural network parameters so that the previously decoded neural network parameters relate to the same neural network layer as the current neural network parameter.
(Under step 2A prong II and step 2B this limitation only recites the idea of a solution or outcome of locating previously decoded parameters and fails to recite details of how the solution is accomplished, this limitation is being considered mere instructions to apply an exception, see MPEP 2106.05(f)) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible Claim 40 recites the further limitation of: Apparatus of claim 37, configured to locate one or more of the previously decoded neural network parameters in a manner so that the one or more previously decoded neural network parameters relate to neuron interconnections which emerge from, or lead towards, a neuron to which a neuron interconnection relates which the current neural network parameter refers to, or a further neuron neighboring said neuron. (Under step 2A prong II and step 2B this limitation only recites the idea of a solution or outcome of locating previously decoded parameters and fails to recite details of how the solution is accomplished, this limitation is being considered mere instructions to apply an exception, see MPEP 2106.05(f)) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible Claim 41 recites the further limitation of: Apparatus of claim 1, configured to decode the quantization indices for the neural network parameters and perform the dequantization of the neural network parameters along a common sequential order among the neural network parameters. (a decoding process that is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations, see page 27, section 2.2.2 for support for the mathematical process of this limitation) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible Claim 42 recites the further limitation of: Apparatus of claim 1, configured to decode the quantization index for the current neural network parameter from the data stream using binary arithmetic coding by using the probability model which depends on previously decoded neural network parameters for one or more leading bins of a binarization of the quantization index and by using an equi-probable bypass mode suffix bins of the binarization of the quantization index which follow the one or more leading bins. 
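For illustration only, a minimal sketch of a binarization in which one or more leading bins would be coded with adaptive probability models and any remainder would be coded as equi-probable bypass bins, loosely in the spirit of claims 42-43; the prefix length and the fixed-length suffix are assumptions made solely for the example and are not taken from the application or the cited art:

    MAX_PREFIX = 2   # hypothetical cap on what the leading (context-coded) bins represent

    def binarize(abs_index):
        # Truncated-unary prefix: one "1" per unit, capped at MAX_PREFIX,
        # terminated by a "0" when the value lies below the cap.
        prefix = [1] * min(abs_index, MAX_PREFIX)
        if abs_index < MAX_PREFIX:
            prefix.append(0)
            return prefix, []
        # Values at or above the cap carry a suffix of equi-probable bypass
        # bins; a fixed 4-bit field is assumed here purely for illustration.
        suffix = [int(b) for b in format(abs_index - MAX_PREFIX, "04b")]
        return prefix, suffix

    print(binarize(1))  # ([1, 0], [])
    print(binarize(5))  # ([1, 1], [0, 0, 1, 1])

A claim-43-style refinement could swap in a different suffix code depending on previously decoded indices; the fixed 4-bit field above is only a placeholder.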
(an updating process that is being considered as a mathematical implementation, see pages 27-28, for support for the mathematical process of this limitation) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible. Claim 43 recites the further limitation of: Apparatus of claim 42, wherein the suffix bins of the binarization of the quantization index represent bins of a binarization code of a suffix binarization for binarizing values of the quantization index an absolute value of which exceeds a maximum absolute value representable by the one or more leading bins, (an updating process that is being considered as a mathematical implementation, see pages 27-28, for support for the mathematical process of this limitation) wherein the apparatus is configured to select the suffix binarization depending on the quantization index of previously decoded neural network parameters. (a selection that is being considered as a mental process of evaluation which can reasonably be performed in mind or with aid of pen and paper) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible. Claim 44 recites the further limitation of: Apparatus of claim 1, wherein the neural network parameters relate to one reconstruction layer of reconstruction layers using which the neural network is represented, (relating parameters to reconstruction layers stated at a high level of generality is being considered as a mental process of evaluation which can reasonably be performed in mind or with aid of pen and paper) and the apparatus is configured to reconstruct the neural network by combining the neural network parameters, neural network parameter wise, with corresponding neural network parameters of one or more further reconstruction layers. (a reconstruction process that is being considered as a mathematical implementation, see pages 26-27, for support for the mathematical process of this limitation) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible. Claim 45 recites the further limitation of: Apparatus of claim 44, configured to decode the quantization index for the current neural network parameter from the data stream using arithmetic coding using a probability model which depends on corresponding neural network parameter corresponding to the current neural network parameter. (a decoding process that is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations, see page 28, section 2.2.2 for support for the mathematical process of this limitation) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible. Independent Claims 46, 107, and 113 Step 2A Prong 1: Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes.
Independent claims 46, 107, and 113 recite limitations that are abstract ideas in the form of mental processes: Claim 46 recites: sequentially encode the neural network parameters by selecting, for a current neural network parameter, a set of reconstruction levels out of a plurality of reconstruction level sets depending on quantization indices encoded into the data stream for previously encoded neural network parameters, (simply selecting a set is interpreted to be a mental process of evaluation which can reasonably be performed in the human mind) quantizing the current neural network parameter onto the one reconstruction level of the selected set of reconstruction levels, and (this limitation merely comprises a mathematical analysis of data and is being considered as directed to a mathematical concept, see MPEP 2106.04(a), page 27, section 2.2.2 of this application’s specification outlines the mathematical procedure for this step) encoding a quantization index for the current neural network parameter that indicates the one reconstruction level onto which the quantization index for the current neural network parameter is quantized into the data stream. (this limitation merely comprises a mathematical analysis of data and is being considered as directed to a mathematical concept, see MPEP 2106.04(a), page 27, section 2.2.2 of this application’s specification outlines the mathematical procedure for this step) select, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets by means of a state transition process by determining, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets depending on a state associated with the current neural network parameter, (a selection that is being considered as a mental process of evaluation which can reasonably be performed in mind or with aid of pen and paper) and updating the state for a subsequent neural network parameter depending on the quantization index encoded into the data stream for the immediately preceding neural network parameter, (an updating process that is being considered as a mathematical implementation, see pages 52-53, for support for the mathematical process of this limitation) and encode the quantization index for the current neural network parameter into the data stream using arithmetic coding using a probability model which depends on the state for the current neural network parameter. (an encoding process that is being considered as a mathematical implementation involving mathematical concepts, algorithms, or calculations, see page 27, section 2.2.2 for support for the mathematical process of this limitation) This claim further recites the following additional elements for the purposes of Step 2A Prong Two analysis: Apparatus for encoding neural network parameters, which define a neural network, into a data stream, configured to (this limitation invokes neural networks merely as a tool to perform an existing process and is considered as mere instructions to apply an exception, see MPEP 2106.05(f)) The additional limitations fail step 2A Prong 2 of the 101 analysis because they do not transform the claim into a practical application. These limitations are too abstract or lack technical improvement that would make the concept practically useful. Without clear utility or integration into a specific field, the claim does not relate to any particular application.
It does not meet the requirements of Step 2A Prong 2, as it fails to make the concept meaningfully applicable in practice. Since the claim as a whole, looking at the additional elements individually and in combination, does not contain any other additional elements that are indicative of integration into a practical application, the claim is directed to an abstract idea. This claim recites the following additional elements for the purposes of Step 2B analysis: Apparatus for encoding neural network parameters, which define a neural network, into a data stream, configured to (this limitation invokes neural networks merely as a tool to perform an existing process and is considered as mere instructions to apply an exception, see MPEP 2106.05(f)) The claim also fails Step 2B of the analysis because the additional limitations do not amount to significantly more than the abstract idea itself. The additional limitations do not enhance the claim in a way that would move it beyond its abstract ideas as they minimally elaborate on the core concept without adding any inventive or technical substance. Considering the additional elements individually and in combination, and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. Therefore, the claim is not patent eligible. Claims 107 and 113 recites limitations substantially similar to claim 46, as such a similar analysis applies. Claim 113 recites the following additional limitation for consideration under Step 2A Prong II and Step 2B: A non-transitory digital storage medium having a computer program stored thereon to perform the method (Under step 2A prong II and step 2B this limitation invokes computers or other machinery merely as a tool to perform an existing process and is considered as mere instructions to apply an exception, see MPEP 2106.05(f)) Claim Rejections - 35 USC § 103 In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows: 1. Determining the scope and contents of the prior art. 2. Ascertaining the differences between the prior art and the claims at issue. 3. Resolving the level of ordinary skill in the pertinent art. 4. Considering objective evidence present in the application indicating obviousness or nonobviousness. This application currently names joint inventors. 
In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention. Claims 1-5, 7, 16, 19, 24-25, 28-29, 34-37, 39-44, 45-46, 106-107, 110 and 112-113 are rejected under 35 U.S.C. 103 as being unpatentable over Han et al. (Han, S., Mao, H., & Dally, W. J. (2015). Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. arXiv preprint arXiv:1510.00149.), hereafter referred to as Han, in view of Kasner et al. (Kasner, J. H., Marcellin, M. W., & Hunt, B. R. (1999). Universal trellis coded quantization. IEEE Transactions on Image Processing, 8(12), 1677-1687.), hereafter referred to as Kasner, and in further view of Sze et al. (US20130272389A1), hereafter referred to as Sze. Claim 1: Han teaches the following limitations: Apparatus for decoding neural network parameters, which define a neural network, from a data stream, configured to sequentially decode the neural network parameters (Han, abstract, “Our method first prunes the network by learning only the important connections. Next, we quantize the weights to enforce weight sharing, finally, we apply Huffman coding. After the first two steps we retrain the network to fine tune the remaining connections and the quantized centroids.”, Han describes decoding quantized indices from neural network connections.) Kasner, in the related field of scalar quantization and entropy coding, teaches the following limitations which the above fails to teach: by selecting, for a current neural network parameter, a set of reconstruction levels out of a plurality of reconstruction level sets depending on quantization indices decoded from the data stream for previous neural network parameters, (Kasner, page 1677, col. 2, paragraph 2, “Fig. 1 shows that at any given trellis state, the next codeword must come from one of two supersets [symbol omitted]. Given an initial state, a sequence of indices specifying which codeword was chosen from the appropriate superset at each step, is sufficient to allow the decoder to reproduce the sequence of codewords chosen”, In Kasner, each trellis state is associated with access to one of two supersets of codewords (each superset being a subset of the overall uniform codebook). At any given state, the next reconstruction value must be chosen from one of these two supersets; and the state itself is updated based on the previously encoded index sequence.
Thus, the superset of codewords at the current state is what is being interpreted as the “set of reconstruction levels (one reconstruction-level set)”, the collection of supersets over the trellis is the “plurality of reconstruction level sets,” and the fact that the next state (and thus which superset applies to the current sample) depends on the previously chosen indices is what is being interpreted as “selecting … a set of reconstruction levels … depending on quantization indices decoded from the data stream for previous neural network parameters.”) decoding a quantization index for the current neural network parameter from the data stream, wherein the quantization index indicates one reconstruction level out of the selected set of reconstruction levels for the current neural network parameter, (Kasner, page 1678, col. 2, paragraph 1, “The UTCQ quantizer returns the S0 indices and the negative of the S1 indices, allowing one probability model to be used for entropy coding. The decoder may uniquely recover the index stream by simply keeping track of the current state, and negating codewords accordingly”, In Kasner, for each source sample, the UTCQ quantizer returns an index within the currently active superset; the decoder, given the current state and the received index, looks up exactly one reconstruction codeword in that superset. Hence, the superset’s codewords are being interpreted as the “reconstruction levels,” and the index within that superset is the claimed “quantization index” that uniquely points to one codeword (one reconstruction level) out of the set associated with the current state.) dequantizing the current neural network parameter onto the one reconstruction level of the selected set of reconstruction levels that is indicated by the quantization index for the current neural network parameter. (Kasner, page 1678, col. 2, paragraph 2, “During dequantization, two types of reconstruction levels are employed, uniform and trained. For [symbol omitted], uniform levels are used (i.e., the codeword is the center of the quantization cell). The remaining codewords are trained on the source data itself, except CW0 which is typically set to 0. The trained codeword [symbol omitted], is determined by taking the sample mean of all source symbols that map to [symbol omitted] and the negative of all source symbols mapping to [symbol omitted]”, Kasner’s decoder, in whichever subset is selected (S0 or S1), performs a lookup of the exact reconstruction level corresponding to the decoded index i or -i.) select, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets by means of a state transition process (Kasner, page 1677, col. 2, paragraph 2, “Fig. 1 shows the eight state trellis used in this work… Fig. 1 shows that at any given trellis state, the next codeword must come from one of two supersets [symbol omitted]. Given an initial state, a sequence of indices specifying which codeword was chosen from the appropriate superset at each step, is sufficient to allow the decoder to reproduce the sequence of codewords chosen [3].”, Kasner’s TCQ trellis is a state transition process that selects which codebook subset (superset) to use for each coefficient based on the current state.
Kasner explicitly shows an eight-state trellis and explains that “at any given trellis state, the next codeword must come from one of two supersets,” and that given an initial state and a sequence of indices the decoder can reconstruct the sequence of chosen codewords. The movement from state to state along the trellis edges as each new symbol/index is processed is the claimed “state transition process.” At each step, the current state determines which superset (reconstruction-level set) is used, and the next state is determined by the current state and the chosen index, so the trellis operation as a whole is being interpreted as “selecting … the set of quantization levels … by means of a state transition process.” ) by determining, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets depending on a state associated with the current neural network parameter, (Kasner, page 1677, col. 2, paragraph 2, “Fig. 1 shows that at any given trellis state, the next codeword must come from one of two supersets [symbol omitted].”, The excerpt shows that which superset (the reconstruction level set comprising the set of quantization levels) is valid depends on the current state.) and updating the state for a subsequent neural network parameter depending on the quantization index decoded from the data stream for the immediately preceding neural network parameter, (Kasner, page 1677, col. 2, paragraph 2, “Given an initial state, a sequence of indices specifying which codeword was chosen from the appropriate superset at each step, is sufficient to allow the decoder to reproduce the sequence of codewords chosen [3].”, the decoder updates its state after each decoded index so that the next state (and thus the next superset) is determined by the previous index.) A person of ordinary skill in the art (POSITA) before the effective filing date of the claimed invention would have recognized that Kasner’s trellis-coded quantization is applicable to sequential scalar parameters and would have recognized it as reasonably applicable to Han’s sequential neural-network parameter indices. Han’s Deep Compression already teaches a quantization technique that uses a learned codebook of centroids per layer and represents each weight by an index into that codebook. Kasner’s TCQ similarly uses codebooks partitioned into subsets and represents each sample by a subset-specific index. To combine the two methods, a POSITA would have applied Kasner’s trellis/state-based subset-selection mechanism to the sequence of quantized neural-network parameters/indices used in Han. Specifically, the decoder would maintain a state as taught by Kasner and, for each parameter, would use the current state to select the applicable reconstruction-level set (superset/subset) before interpreting the decoded index as the reconstruction level within that selected set. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Han (i.e., deep neural network quantization methods) by incorporating the teachings of Kasner (i.e., index-based Trellis-Coded Quantization methods). A motivation of which is to provide a codebook quantization technique without needing additional storage and training of codebooks. (Kasner, page 1687, “The advantages of UTCQ are simplicity and flexibility.
Unlike previous ECTCQ systems, no prior codebook training is needed and no codebooks are stored. We have shown that the distortion-rate performance of UTCQ is comparable with that of optimal ECTCQ for memoryless sources at most encoding rates.”) Sze, in the same field of data encoding, teaches the following limitations which the above fails to teach: and decode the quantization index for the current neural network parameter from the data stream using arithmetic coding using a probability model which depends on the state for the current neural network parameter. (Sze, paragraph 6, “CABAC has multiple probability modes for different contexts. It first converts all non-binary symbols to binary symbols referred to as bins. Then, for each bin, the coder selects which probability model to use, and uses information from nearby elements to optimize the probability estimate. Arithmetic coding is then applied to compress the data.”, CABAC’s arithmetic coder selects a probability model (context) based on the current state, and then arithmetically decodes each bin under that model, teaching state-dependent arithmetic decoding.) It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Han (i.e., deep neural network quantization methods) and Kasner by incorporating the teachings of Sze (i.e., Context-Adaptive Binary Arithmetic Coding methods). Kasner further recognizes that different sets of indices (corresponding to different supersets) may be entropy coded with different coding models (e.g., different arithmetic models or different Huffman tables), supporting selection of the probability model based on the state/superset associated with the current parameter. A motivation of which is to provide an entropy coding technique (arithmetic coding with context/probability models) that would further reduce the overall bitstream size for the quantization indices. (Sze, paragraph 6, “CABAC is an inherently lossless compression technique notable for providing considerably better compression than most other encoding algorithms used in video encoding at the cost of increased complexity.”, by adopting CABAC for entropy-coding Han’s quantization indices, a POSITA would directly reduce the overall bitstream size of a neural network’s parameters.) Claim 2: Han, Kasner, and Sze teach the limitations of claim 1, Han further teaches: Apparatus of claim 1, wherein the neural network parameters relate to weights of neuron interconnections of the neural network. (Han, abstract, “To address this limitation, we introduce “deep compression”, a three stage pipeline: pruning, trained quantization and Huffman coding, that work together to reduce the storage requirement of neural networks by 35× to 49× without affecting their accuracy. Our method first prunes the network by learning only the important connections. Next, we quantize the weights to enforce weight sharing, finally, we apply Huffman coding.”, Han explicitly treats the neural network parameters as weights.) Claim 3: Han, Kasner, and Sze teach the limitations of claim 1, Kasner further teaches: Apparatus of claim 1, wherein the number of reconstruction level sets of the plurality of reconstruction level sets is two. (Kasner, page 1677, col. 2, paragraph 2, “Fig. 1 shows that at any given trellis state, the next codeword must come from one of two supersets [symbol omitted].
Given an initial state, a sequence of indices specifying which codeword was chosen from the appropriate superset at each step, is sufficient to allow the decoder to reproduce the sequence of codewords chosen”, at each step exactly two supersets (reconstruction level sets) are available.) Claim 4: Han, Kasner, and Sze teach the limitations of claim 1, Kasner further teaches: Apparatus of claim 1, configured to parametrize the plurality of reconstruction level sets by way of a predetermined quantization step size and derive information on the predetermined quantization step size from the data stream. (Kasner, page 1678, col. 2, paragraph 1, “The decoder may uniquely recover the index stream by simply keeping track of the current state, and negating codewords accordingly…. The quantization thresholds are simply the midpoints between the reconstruction levels within a subset. This allows for fast computation of superset indices requiring only scaling and rounding. No thresholds need to be precomputed, nor is a binary tree search necessary. For a given trellis, the encoder is completely characterized by the stepsize parameter Δ.”, this discloses the reconstruction level sets are explicitly parameterized using a quantization step size gained from a stream of data (index stream). “During dequantization, two types of reconstruction levels are employed, uniform and trained. For [symbol omitted], uniform levels are used (i.e., the codeword is the center of the quantization cell)… The remaining codewords are trained on the source data itself”, the reconstruction level sets (which comprise the step size) are derived from the index stream.) Claim 5: Han, Kasner, and Sze teach the limitations of claim 1, Kasner further teaches: Apparatus of claim 1, wherein the neural network comprises a one or more NN layers and the apparatus is configured to derive, for each NN layer, information on a predetermined quantization step size for the respective NN layer from the data stream, (Kasner, page 1678, col. 2, paragraph 1, “The UTCQ quantizer returns the S0 indices and the negative of the S1 indices, allowing one probability model to be used for entropy coding. The decoder may uniquely recover the index stream by simply keeping track of the current state, and negating codewords accordingly”, Kasner states that, “for a given trellis, the encoder is completely characterized by the stepsize parameter Δ,” and that UTCQ uses uniform thresholds and reconstruction levels based on this Δ. The value of Δ (or its encoded representation) is the “information on a predetermined quantization step size” that the decoder must know or derive from the bitstream to reconstruct the uniform codebook. In the combined mapping with Han, Han supplies the multiple NN layers; for each such layer, the trellis-based quantizer operates with a chosen Δ, and the Δ value signaled or implied in the data stream for that layer is what is being interpreted as “information on a predetermined quantization step size for the respective NN layer … derived from the data stream.” The decoder can recover the index stream by tracking state and negating codewords, and derives quantization information, such as step sizes, from this data stream.) and parametrize, for each NN layer, the plurality of reconstruction level sets using the predetermined quantization step size derived for the respective NN layer so as to be used for dequantizing the neural network parameters belonging to the respective NN layer.
(Kasner, page 1678, col. 2, paragraph 1, “The decoder may uniquely recover the index stream by simply keeping track of the current state, and negating codewords accordingly…. The quantization thresholds are simply the midpoints between the reconstruction levels within a subset. This allows for fast computation of superset indices requiring only scaling and rounding. No thresholds need to be precomputed, nor is a binary tree search necessary. For a given trellis, the encoder is completely characterized by the stepsize parameter Δ.”, “During dequantization, two types of reconstruction levels are employed, uniform and trained. For [symbol omitted], uniform levels are used (i.e., the codeword is the center of the quantization cell)… The remaining codewords are trained on the source data itself”, the quantization thresholds are midpoints between reconstruction levels, and the encoder is completely characterized by the step size parameter Δ. This directly teaches that the reconstruction level sets are parameterized using the predetermined step size, which enables dequantization tailored per NN layer.) Claim 7: Han, Kasner, and Sze teach the limitations of claim 1, Kasner further teaches: Apparatus of claim 1, wherein all reconstruction levels of all reconstruction level sets represent integer multiples of a predetermined quantization step size, and the apparatus is configured to dequantize the neural network parameters by (Kasner, page 1678, col. 2, paragraph 1, “UTCQ uses uniform thresholds and reconstruction levels at the encoder. The quantization thresholds are simply the midpoints between the reconstruction levels within a subset. This allows for fast computation of superset indices requiring only scaling and rounding. No thresholds need to be precomputed, nor is a binary tree search necessary. For a given trellis, the encoder is completely characterized by the stepsize parameter Δ.”, Kasner explains that UTCQ uses uniform thresholds and reconstruction levels and that the encoder is characterized by a stepsize parameter Δ. In a uniform scalar quantizer, the reconstruction values are positioned at regularly spaced points (centers of quantization cells) on the real line, each separated by Δ; these regular positions can be expressed as k·Δ for integer k. Thus, the uniform reconstruction levels in the UTCQ codebook are being interpreted as the claimed “reconstruction levels,” Δ is the “predetermined quantization step size,” and the fact that these levels lie on a uniform grid defined by Δ is what supports the statement that all reconstruction levels of all reconstruction level sets represent integer multiples of Δ. A uniform threshold scheme places reconstruction levels at exact multiples of the uniform step size Δ (i.e., midpoints at k·Δ) so all codewords in every subset are integer multiples of Δ.) [Figure 2 of Kasner] deriving, for each neural network parameter, an intermediate integer value depending on the selected reconstruction level set for the respective neural network parameter and the entropy decoded quantization index for the respective neural network parameter, (Kasner, page 1679, col. 2, paragraph 3, “Given a source sample to quantize, subset quantization indices may be computed directly. Given a quantization index, the reconstruction level (for uniform codewords) may be computed.”, Kasner, page 1677, col. 2, paragraph 2, “Fig.
1 shows that at any given trellis state, the next codeword must come from one of two supersets [symbol omitted]. Given an initial state, a sequence of indices specifying which codeword was chosen from the appropriate superset at each step, is sufficient to allow the decoder to reproduce the sequence of codewords chosen”, In UTCQ, Fig. 2 shows a uniform codebook partitioned into subsets, and Kasner explains that “a sequence of indices specifying which codeword was chosen from the appropriate superset at each step, is sufficient to allow the decoder to reproduce the sequence of codewords chosen” (Kasner, p.1677). Kasner also states that “given a quantization index, the reconstruction level (for uniform codewords) may be computed” (Kasner, p.1679). Under a broadest reasonable interpretation, the index that is entropy-decoded for the active superset is the claimed “entropy decoded quantization index.” Because the underlying uniform codebook is ordered and parameterized by the step size Δ, that index corresponds to a particular uniform codeword location on the real line, which is treated as the claimed “intermediate integer value” that identifies which Δ-spaced reconstruction level is to be used.) and multiplying, for each neural network parameter, the intermediate value for the respective neural network parameter with the predetermined quantization step size for the respective neural network parameter. (Kasner, page 1678, col. 2, paragraph 2, “During dequantization, two types of reconstruction levels are employed, uniform and trained. For [symbol omitted], uniform levels are used (i.e., the codeword is the center of the quantization cell). The remaining codewords are trained on the source data itself, except CW0 which is typically set to 0. The trained codeword [symbol omitted], is determined by taking the sample mean of all source symbols that map to [symbol omitted] and the negative of all source symbols mapping to [symbol omitted]”, once the decoder has its intermediate integer value (the subset quantization index), it directly looks up the corresponding reconstruction level in the selected subset. It is interpreted by the examiner that the subset employing uniform levels (e.g., the codeword is the center of the quantization cell) has to multiply the intermediate value (the quantization index) by the step size (Δ) in order to preserve the uniform level nature of the reconstruction sets.) Claim 16: Han, Kasner, and Sze teach the limitations of claim 1, Kasner further teaches: Apparatus of claim 1, wherein the apparatus is configured to select, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets by means of a state transition process by (Kasner, page 1677, col. 2, paragraph 2, “Fig. 1 shows the eight state trellis used in this work… Fig. 1 shows that at any given trellis state, the next codeword must come from one of two supersets [symbol omitted].
Given an initial state, a sequence of indices specifying which codeword was chosen from the appropriate superset at each step, is sufficient to allow the decoder to reproduce the sequence of codewords chosen [3].”, Kasner’s TCQ trellis is a state transition process that selects which codebook subset (superset) to use for each coefficient based on the current state. Kasner explicitly shows an eight-state trellis and explains that “at any given trellis state, the next codeword must come from one of two supersets,” and that given an initial state and a sequence of indices the decoder can reconstruct the sequence of chosen codewords. The movement from state to state along the trellis edges as each new symbol/index is processed is the claimed “state transition process.” At each step, the current state determines which superset (reconstruction-level set) is used, and the next state is determined by the current state and the chosen index, so the trellis operation as a whole is being interpreted as “selecting … the set of quantization levels … by means of a state transition process.” ) determining, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets depending on a state associated with the current neural network parameter, (Kasner, page 1677, col. 2, paragraph 2, “Fig. 1 shows the eight state trellis used in this work… Fig. 1 shows that at any given trellis state, the next codeword must come from one of two supersets [symbol omitted].”, the trellis state corresponds to the state associated with the current neural network parameter when combined with Han.) and updating the state for a subsequent neural network parameter depending on the quantization index decoded from the data stream for the immediately preceding neural network parameter. (Kasner, page 1678, col. 1, paragraph 1, “Equation (1) states that if we take the codeword associated with index i ∈ S₀, and that corresponding to –i ∈ S₁, the codewords will be the negative of one another. Equation (2) states that the probability of the codeword with index i ∈ S₀ equals the probability of the codeword with index –i ∈ S₁. These relationships allow the use of a single variable-rate code for both supersets [8]. The UTCQ quantizer returns the S₀ indices and the negative of the S₁ indices, allowing one probability model to be used for entropy coding. The decoder may uniquely recover the index stream by simply keeping track of the current state, and negating codewords accordingly.”, Kasner teaches that after decoding each index, the decoder “keeps track of the current state” by noting whether the index came from superset S₀ (positive codeword) or S₁ (negative codeword) and then applies that state when interpreting the next index. When combined with Han’s sequential decoding of neural-network weight indices, each decoded weight index becomes the “quantization index” that drives Kasner’s sign-based state update. Thus, for every neural-network parameter in Han’s stream, Kasner’s rule updates the internal trellis state based on the immediately preceding decoded index.) Claim 19: Han, Kasner, and Sze teach the limitations of claim 1, Kasner further teaches: Apparatus of claim 16, wherein the state transition process is configured to transition between four or eight possible states. (Kasner, page 1677, col. 2, paragraph 2, “Fig. 1 shows the eight state trellis used in this work.”, Kasner showcases an eight-state trellis model. Kasner, page 1678, col.
1, paragraph 1, “If such a codeword is required, the trellis must switch to state four by choosing a nonzero codeword from D2. If the following source samples require a string of zero reconstruction levels, the trellis must work its way back to state zero.”, Trellis Coded Quantization (TCQ) must switch (transition) between states as an inherent process.)
Claim 24: Kasner, Han, and Sze teach the limitations of claim 23. Sze further teaches: Apparatus of claim 23, configured to decode the quantization index for the current neural network parameter from the data stream using binary arithmetic coding by using the probability model which depends on the state for the current neural network parameter for at least one bin of a binarization of the quantization index. (Sze, paragraph 6, “In CABAC, bins can be either context coded or bypass coded. Bypass coded bins do not require context selection which allows these bins to be processed at a much high throughput than context coded bins.”, CABAC’s binary arithmetic coding of “bins” (the binarized bits of the index) using a state-dependent context model for at least one bin directly maps to the claimed bin-wise arithmetic decoding.)
Claim 25: Kasner, Han, and Sze teach the limitations of claim 23. Sze further teaches: Apparatus of claim 23, wherein the at least one bin comprises a significance bin indicative of the quantization index of the current neural network parameter being equal to zero or not. (Sze, paragraph 6, “The theory and operation of CABAC coding for H.264/AVC is defined in the International Telecommunication Union, Telecommunication Standardization Sector (ITU-T) standard “Advanced video coding for generic audiovisual services” H.264, revision March 2005 or later, which is incorporated by reference herein. General principles are explained in “Context-Based Adaptive Binary Arithmetic Coding in the H.264/AVC Video Compression Standard,” Detlev Marpe, July 2003, which is incorporated by reference herein.”, Sze incorporates the H.264/AVC CABAC scheme by reference, where for transform coefficients a significant_flag (or significant_coeff_flag) is the first bin in the binarization signaling whether the coefficient is zero or non-zero. In that context, the bin representing significant_flag is the claimed “significance bin indicative of the quantization index … being equal to zero or not”—a bin value of 0 indicates a zero coefficient, and a bin value of 1 indicates a non-zero coefficient. The text in Sze pointing to CABAC’s standard operation is thus being interpreted as importing this significance-bin behavior.)
Claim 28: Kasner, Han, and Sze teach the limitations of claim 23. Sze further teaches: Apparatus of claim 22, configured so that the dependency of the probability model involves a selection of a context out of a set of contexts for the neural network parameters using the dependency, each context having a predetermined probability model associated therewith. (Sze, paragraph 6, “In brief, CABAC has multiple probability modes for different contexts. It first converts all non-binary symbols to binary symbols referred to as bins. Then, for each bin, the coder selects which probability model to use, and uses information from nearby elements to optimize the probability estimate.”, CABAC’s explicit selection of one context (probability model) out of a set for each bin is analogous to the claim’s context-selection mechanic.)
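To make the context-selection mechanic concrete, the following is a minimal, illustrative Python sketch of a bank of per-context probability models of the kind the rejection attributes to CABAC: a context is chosen as a function of a state, the bin is coded against that context's estimate, and the estimate is then adapted. The class name, the state-to-context rule, and the smoothing update are hypothetical simplifications; they are not taken from Sze, Han, or the claims.

    import math

    class ToyContextCoder:
        # Illustrative bank of per-context binary probability models (not CABAC itself).
        def __init__(self, num_contexts, adapt_rate=0.05):
            self.p_one = [0.5] * num_contexts   # per-context estimate of P(bin = 1)
            self.adapt_rate = adapt_rate
            self.bits = 0.0                     # accumulated ideal code length

        def select_context(self, state):
            # Hypothetical rule: the quantizer state picks the context directly.
            return state % len(self.p_one)

        def code_bin(self, ctx, bin_val):
            p = self.p_one[ctx] if bin_val == 1 else 1.0 - self.p_one[ctx]
            self.bits += -math.log2(max(p, 1e-12))    # cost of the bin under the chosen model
            target = 1.0 if bin_val == 1 else 0.0
            self.p_one[ctx] += self.adapt_rate * (target - self.p_one[ctx])  # adapt the model

    coder = ToyContextCoder(num_contexts=4)
    for state, bin_val in [(0, 1), (0, 1), (1, 0), (0, 1)]:
        coder.code_bin(coder.select_context(state), bin_val)
    print(round(coder.bits, 2), [round(p, 3) for p in coder.p_one])

A real CABAC engine performs range-based binary arithmetic coding with table-driven probability updates; the ideal code length accumulated here only stands in for that step.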
Claim 29: Kasner, Han, and Sze teach the limitations of claim 28. Sze further teaches: Apparatus of claim 28, configured to update the predetermined probability model associated with each of the contexts based on the quantization index arithmetically coded using the respective context. (Sze, paragraph 30, “The CABAC decoding process is the inverse of the encoding process and has similar feedback loops. Referring now to FIG. 1B, a CABAC decoder includes a bin decoder 112, a context modeler 110, and a de-binarizer 114. The context modeler 110 selects a context model for the next context bin to be decoded. As in the encoder, the context models are updated throughout the decoding process to track the probability estimations. That is, a bin is decoded based on the current state of the context model selected by the context modeler 110, and the context model is then updated to reflect the state transition and the MPS after the bin is decoded.”, after each bin (i.e., a portion of the quantization index) is arithmetically decoded under a chosen context model (referred to as the ‘probability model’ in paragraph 6 previously), the decoder updates that context model based on the decoded symbol (the MPS, Most Probable Symbol), which is the bin value.)
Claim 34: Kasner and Han teach the limitations of claim 22. Sze further teaches: Apparatus of claim 22, wherein the probability model additionally depends on the quantization index of previously decoded neural network parameters. (Sze, paragraph 59, “The intra-prediction estimation component 424 (IPE) performs intra-prediction estimation in which tests on CUs in an LCU based on multiple intra-prediction modes, prediction unit sizes, and transform unit sizes are performed using reconstructed data from previously encoded neighboring CUs stored in a buffer (not shown) to choose the best CU partitioning, prediction unit/transform unit partitioning, and intra-prediction modes based on coding cost, e.g., a rate distortion coding cost.”, in the Sze patent the context modeler (the probability model) relies on outputs of the intra-prediction estimation component. The intra-prediction estimation component supplies the previously decoded data that the context modeler uses as its context inputs. Thus, previously decoded neural network parameters (when combined with Han) are what the probability model (the context model of Sze) depends on.)
Claim 35: Kasner, Han, and Sze teach the limitations of claim 34. Sze further teaches: Apparatus of claim 34, configured to preselect, depending on the state or the set of reconstruction levels selected for the current neural network parameter, a subset of probability models out of a plurality of probability models (Sze, paragraph 36, “Referring now to the CABAC encoder of FIG. 2A, the binarizer 200 converts syntax elements into strings of one or more binary symbols. The binarizer 200 directs each bin to either the context coding 206 or the bypass coding 208 of the bin encoder 204 based on a bin type determined by the context modeler 202.”, Sze, paragraph 6, “Then, for each bin, the coder selects which probability model to use, and uses information from nearby elements to optimize the probability estimate.”, In Sze’s CABAC encoder (Fig. 2A; paragraph 36), the binarizer 200 “directs each bin to either the context coding 206 or the bypass coding 208 of the bin encoder 204 based on a bin type determined by the context modeler 202.” Sze also explains that, “for each bin, the coder selects which probability model to use” (paragraph 6).
Under a broadest reasonable interpretation, the “plurality of probability models” are the various context probability models used on the context-coded branch together with the (effectively fixed) model used on the bypass branch. When the binarizer/context modeler decides whether a bin is sent to the context coding 206 path or the bypass 208 path, that decision preselects which subset of the available probability models will be used for that bin (the context-coded subset vs. the bypass subset), corresponding to “preselect[ing] … a subset of probability models out of a plurality of probability models.”) and select the probability model for the current neural network parameter out of the subset of probability models depending on the quantization index of previously decoded neural network parameters. (Sze, paragraph 6, “Then, for each bin, the coder selects which probability model to use, and uses information from nearby elements to optimize the probability estimate”, the selection of the probability model (the context model, as described further in the patent) is based on the values of nearby elements. By combining with Han’s neural network quantization, a POSITA would have applied the nearby-elements paradigm to its neural network parameters.)
Claim 36: Kasner, Han, and Sze teach the limitations of claim 35. Kasner further teaches: Apparatus of claim 35, configured to preselect, depending on the state or the set of reconstruction levels selected for the current neural network parameter, the subset of probability models out of the plurality of probability models in a manner so that a subset preselected for a first state or reconstruction levels set is disjoint to a subset preselected for any other state or reconstruction levels set. (Kasner, page 1677, col. 2, paragraph 2, “Fig. 1 shows the eight state trellis used in this work… Fig. 1 shows that at any given trellis state, the next codeword must come from one of two supersets. Given an initial state, a sequence of indices specifying which codeword was chosen from the appropriate superset at each step, is sufficient to allow the decoder to reproduce the sequence of codewords chosen [3].”, As discussed for claim 18 above, Sze teaches having a plurality of probability models (context models and bypass mode) and preselecting a subset of those probability models for a given bin (e.g., by directing the bin to the context-coded branch or to the bypass branch of the CABAC engine). Kasner, in turn, teaches that at any given trellis state “the next codeword must come from one of two supersets” S₀ or S₁ (the reconstruction level sets from which to select), and that a sequence of indices identifying which codeword from the appropriate superset was chosen allows the decoder to reproduce the sequence of codewords (Kasner, p. 1677, col. 2). Thus, Kasner’s trellis state determines which of two disjoint reconstruction-level supersets (S₀ vs. S₁) is active at each step. It is interpreted by the examiner that the choice of S₀ vs. S₁ serves as an additional input to Sze’s context modeler so that, when S₀ (a first reconstruction level set selected) is active, one subset of Sze’s probability models is preselected, and when S₁ is active (a second reconstruction level set selected), a different, non-overlapping subset is preselected.
In this combined system, the subset of probability models preselected for a first state / reconstruction-level set (S₀) is disjoint from the subset preselected for another state / reconstruction-level set (S₁), as required by the claim.)
Claim 37: Kasner, Han, and Sze teach the limitations of claim 35. Sze further teaches: Apparatus of claim 35, configured to select the probability model for the current neural network parameter out of the subset of probability models depending on the quantization index of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to. (Sze, paragraph 6, “Then, for each bin, the coder selects which probability model to use, and uses information from nearby elements to optimize the probability estimate”, Sze (via CABAC) states that, for each bin, the probability model (context) is selected using information from nearby elements—for transform coefficients, this includes previously decoded significant_flags, levels, and positions in the same block. When this is applied to neural-network parameters (via Han), the “nearby elements” correspond to previously decoded quantized parameters in a neighboring portion of the network (e.g., adjacent weights or nodes). Thus, CABAC’s rule of choosing a context based on the already decoded neighboring coefficients is what is being interpreted as “selecting the probability model … depending on the quantization index of previously decoded neural network parameters which relate to a portion of the neural network neighboring the portion of the current parameter.”)
Claim 39: Kasner, Han, and Sze teach the limitations of claim 37. Han further teaches: Apparatus of claim 37, configured to locate the previously decoded neural network parameters so that the previously decoded neural network parameters relate to the same neural network layer as the current neural network parameter. (Han, page 4, section 3.1, “We use k-means clustering to identify the shared weights for each layer of a trained network, so that all the weights that fall into the same cluster will share the same weight.”, Han processes weights per layer, applying quantization and clustering independently by layer. It is interpreted by the examiner that Han’s layer-by-layer processing, which locates previous parameters within a layer, is combined with Sze’s contextual modeling and decoding process to fully teach this claim.)
Claim 40: Kasner, Han, and Sze teach the limitations of claim 37. Han further teaches: Apparatus of claim 37, configured to locate one or more of the previously decoded neural network parameters in a manner so that the one or more previously decoded neural network parameters relate to neuron interconnections which emerge from, or lead towards, a neuron to which a neuron interconnection relates which the current neural network parameter refers to, or a further neuron neighboring said neuron. (Han, page 3, section 3, paragraph 2, “Weight sharing is illustrated in Figure 3. Suppose we have a layer that has 4 input neurons and 4 output neurons, the weight is a 4 × 4 matrix. On the top left is the 4 × 4 weight matrix, and on the bottom left is the 4 × 4 gradient matrix. The weights are quantized to 4 bins (denoted with 4 colors), all the weights in the same bin share the same value, thus for each weight, we then need to store only a small index into a table of shared weights.
During update, all the gradients are grouped by the color and summed together, multiplied by the learning rate and subtracted from the shared centroids from last iteration. For pruned AlexNet, we are able to quantize to 8-bits (256 shared weights) for each CONV layers, and 5-bits (32 shared weights) for each FC layer without any loss of accuracy.”, Han processes neural connections at the level of individual neurons in fully connected layers. Neural network parameters are tied to specific neuron interconnections.)
Claim 41: Kasner, Han, and Sze teach the limitations of claim 37. Han further teaches: Apparatus of claim 1, configured to decode the quantization indices for the neural network parameters and perform the dequantization of the neural network parameters along a common sequential order among the neural network parameters. (Han, abstract, “Our method first prunes the network by learning only the important connections. Next, we quantize the weights to enforce weight sharing, finally, we apply Huffman coding.”, Han presents a strict pipeline of pruning, retraining, quantizing, and Huffman coding. The Huffman coding is interpreted as the decoding process for each quantized index of the neural network parameters and inherently proceeds in a sequential manner.)
Claim 42: Kasner, Han, and Sze teach the limitations of claim 37. Sze further teaches: Apparatus of claim 1, configured to decode the quantization index for the current neural network parameter from the data stream using binary arithmetic coding by using the probability model which depends on previously decoded neural network parameters for one or more leading bins of a binarization of the quantization index and by using an equi-probable bypass mode suffix bins of the binarization of the quantization index which follow the one or more leading bins. (Sze, paragraph 36, “Referring now to the CABAC encoder of FIG. 2A, the binarizer 200 converts syntax elements into strings of one or more binary symbols. The binarizer 200 directs each bin to either the context coding 206 or the bypass coding 208 of the bin encoder 204 based on a bin type determined by the context modeler 202. The binarizer also provides a bin index (binIdx) for each bin to the context modeler 202.”, Sze teaches using context models for some bins and using bypass coding (equi-probable) for others. A POSITA would have applied this to Han’s quantization indices by binarizing them and then decoding with Sze’s context-based adaptive binary arithmetic coding (CABAC) instead of the Huffman coding applied in Han. This uses probability models (based on prior indices) for leading bins, and bypass mode (equi-probable) for suffix bins.)
Claim 43: Kasner, Han, and Sze teach the limitations of claim 42. Sze further teaches: Apparatus of claim 42, wherein the suffix bins of the binarization of the quantization index represent bins of a binarization code of a suffix binarization for binarizing values of the quantization index an absolute value of which exceeds a maximum absolute value representable by the one or more leading bins, wherein the apparatus is configured to select the suffix binarization depending on the quantization index of previously decoded neural network parameters. (Sze, paragraph 84, “Referring again to FIG. 6, in this method, the variable i is a bin counter and the variable N is the absolute value of a delta quantization parameter (delta qp) syntax element. Initially, the value of the bin counter i is set to 0.
In this method, for values of N greater than or equal to cMax, cMax bins with a value of 1 are context coded into the compressed bit stream followed by bypass coded bins corresponding to the EGk codeword for N-cMax.”, Sze teaches a binarization scheme where the absolute value of a delta quantization parameter is first encoded using a fixed number of leading context-coded bins. If the value exceeds that representable range, additional bypass-coded suffix bins are used to encode the excess portion using an Exp-Golomb code. Thus, the suffix binarization is selected based on whether the quantization index exceeds a threshold.)
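Because the prefix/suffix split carries the weight of claims 42 and 43 as applied, a brief Python sketch may help: up to c_max context-coded prefix bins (truncated unary), then a bypass-coded order-k Exp-Golomb suffix for the excess n - c_max. The function name, the default c_max, and the order k are illustrative assumptions, not Sze's actual syntax elements.

    def binarize_index(n, c_max=4, k=0):
        # Prefix-only case: n ones terminated by a zero (truncated unary).
        if n < c_max:
            return "1" * n + "0", ""
        # Escape case: c_max ones, then an order-k Exp-Golomb code for the excess.
        v = n - c_max
        u = v + (1 << k)
        suffix = "0" * (u.bit_length() - k - 1) + bin(u)[2:]
        return "1" * c_max, suffix

    for n in (0, 2, 4, 9):
        prefix, suffix = binarize_index(n)
        print(n, prefix, suffix or "-")
    # prints: 0 0 -   then   2 110 -   then   4 1111 1   then   9 1111 00110

In a combination of the kind described above, the prefix bins would be coded with context models that depend on previously decoded indices, while the suffix bins would use the equi-probable bypass mode.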
Claim 44: Han, Kasner, and Sze teach the limitations of claim 1. Kasner further teaches: Apparatus of claim 1, wherein the neural network parameters relate to one reconstruction layer of reconstruction layers using which the neural network is represented, and the apparatus is configured to reconstruct the neural network by combining the neural network parameters, neural network parameter wise, with corresponding neural network parameters of one or more further reconstruction layers. (Kasner, page 1679, col. 2, section 4, paragraph 3, “Given a quantization index, the reconstruction level (for uniform codewords) may be computed.”, Kasner teaches the generation of reconstructed values based on decoded quantization indices. Kasner, page 1678, col. 2, paragraph 2, “During dequantization, two types of reconstruction levels are employed, uniform and trained.”, during dequantization (the reconstruction of the neural network) the reconstruction sets are used to determine each dequantized parameter value. When combined with Han these would be applied to neural network parameter values.)
Claim 45: Kasner, Han, and Sze teach the limitations of claim 44. Sze further teaches: Apparatus of claim 44, configured to decode the quantization index for the current neural network parameter from the data stream using arithmetic coding using a probability model which depends on corresponding neural network parameter corresponding to the current neural network parameter. (Sze, paragraph 6, “Then, for each bin, the coder selects which probability model to use, and uses information from nearby elements to optimize the probability estimate. Arithmetic coding is then applied to compress the data.”, Sze, paragraph 41, “The bins generated by the context coding 224 and bypass coding 222 are provided the multiplexer 226. The multiplexor 226 selects the output of the context coding 224 or the bypass coding 222 to be provided to the de-binarizer 230 according to the bin type provided by the context modeler 228. The de-binarizer 230 receives decoded bins for a syntax element from the bin decoder 220 and operates to reverse the binarization of the encoder to reconstruct the syntax elements.”, the decoding process in Sze includes the de-binarizer, which is fed by bins generated using probability models chosen by the context modeler. Since the selection depends on the value being decoded, it is dependent on the corresponding parameter, and when combined with Han would be adapted to neural network parameters.)
Claim 46: Han teaches: Apparatus for encoding neural network parameters, which define a neural network, into a data stream, configured to sequentially encode the neural network parameters (Han, abstract, “Our method first prunes the network by learning only the important connections. Next, we quantize the weights to enforce weight sharing, finally, we apply Huffman coding.”, Han explicitly applies quantized weights to neural network parameters, thereby encoding them into a compressed stream.) Kasner, in the same field of quantization methods, teaches the following limitations which the above fails to teach: by selecting, for a current neural network parameter, a set of reconstruction levels out of a plurality of reconstruction level sets depending on quantization indices encoded into the data stream for previously encoded neural network parameters, (Kasner, page 1677, col. 2, paragraph 2, “Fig. 1 shows the eight state trellis used in this work… Fig. 1 shows that at any given trellis state, the next codeword must come from one of two supersets. Given an initial state, a sequence of indices specifying which codeword was chosen from the appropriate superset at each step, is sufficient to allow the decoder to reproduce the sequence of codewords chosen [3].”, Kasner’s TCQ trellis (Fig. 1) uses the previously encoded index to transition the trellis state, which in turn selects superset S₀ (even multiples) or S₁ (odd multiples) for the next parameter. Han’s neural‐network parameters take the place of Kasner’s “coefficients,” so this teaches using previous indices to choose the reconstruction‐level set. The superset of codewords at the current state is what is being interpreted as the “set of reconstruction levels”, the collection of supersets over the trellis is the “plurality of reconstruction level sets,” and the fact that the next state (and thus which superset applies to the current sample) depends on the previously chosen indices is what is being interpreted as “selecting … a set of reconstruction levels … depending on quantization indices decoded from the data stream for previous neural network parameters.”) quantizing the current neural network parameter onto the one reconstruction level of the selected set of reconstruction levels, (Kasner, page 1677, col. 2, paragraph 2, “During quantization, the Viterbi algorithm [6] is used to pick the sequence of codewords allowed by the trellis structure that minimizes the cumulative MSE between the input data and output reconstruction.”, Kasner’s encoder actually quantizes each parameter value by finding the closest codeword in the active superset via Viterbi. In combination with Han’s weight‐sharing (where each shared weight is treated as a codebook entry), this teaches quantizing a neural-network parameter onto its selected reconstruction level. The superset’s codewords are being interpreted as the “reconstruction levels,” and the index within that superset is the claimed quantization index that uniquely points to one codeword (one reconstruction level) out of the set associated with the current state.) and encoding a quantization index for the current neural network parameter that indicates the one reconstruction level onto which the quantization index for the current neural network parameter is quantized into the data stream. (Kasner, page 1678, col. 2, paragraph 1, “The UTCQ quantizer returns the S0 indices and the negative of the S1 indices, allowing one probability model to be used for entropy coding.
The decoder may uniquely recover the index stream by simply keeping track of the current state, and negating codewords accordingly”, Kasner describes how each quantization index—which inherently indicates a specific reconstruction level in its superset—is entropy-coded into the output stream (here, using sign-shifted indices). By combining with Han’s teaching that these indices represent neural network weights, one of ordinary skill would recognize that Kasner’s index bit-stream serves as the encoded quantization index for each neural network parameter.) select, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets by means of a state transition process (Kasner, page 1677, col. 2, paragraph 2, “Fig. 1 shows the eight state trellis used in this work… Fig. 1 shows that at any given trellis state, the next codeword must come from one of two supersets. Given an initial state, a sequence of indices specifying which codeword was chosen from the appropriate superset at each step, is sufficient to allow the decoder to reproduce the sequence of codewords chosen [3].”, Kasner’s trellis is both a state machine and a selection mechanism: (i) at each symbol time, the current trellis state determines which of the two supersets (reconstruction-level sets) is eligible, and (ii) the transition to the next state is determined by the index chosen within that superset. That is, stateₙ → choice of superset for sample n, and (stateₙ, indexₙ) → stateₙ₊₁. This coupling of superset selection to the evolving trellis state is what is being interpreted as “selecting … the set of quantization levels … by means of a state transition process”—the selection and the transitions are two aspects of the same trellis operation. Thus, Kasner’s TCQ trellis is a state transition process that selects which codebook subset (superset) to use for each coefficient based on the current state.) by determining, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets depending on a state associated with the current neural network parameter, (Kasner, page 1677, col. 2, paragraph 2, “Fig. 1 shows that at any given trellis state, the next codeword must come from one of two supersets.”, The excerpt shows that which superset (the reconstruction level set comprising the set of quantization levels) is valid depends on the current state.) and updating the state for a subsequent neural network parameter depending on the quantization index decoded from the data stream for the immediately preceding neural network parameter, (Kasner, page 1677, col. 2, paragraph 2, “Given an initial state, a sequence of indices specifying which codeword was chosen from the appropriate superset at each step, is sufficient to allow the decoder to reproduce the sequence of codewords chosen [3].”, the decoder updates its state after each decoded index so that the next state (and thus the next superset) is determined by the previous index.) The motivation to combine Han with Kasner is substantially similar to that applied for claim 1 above.
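For readability, the two mappings relied on above (stateₙ selects the eligible level set; (stateₙ, indexₙ) yields stateₙ₊₁) can be illustrated with a small, self-contained Python sketch. The two level sets, the transition table, and the parity-driven update below are hypothetical placeholders chosen only to make the mechanism concrete; they are not the structures of Kasner's Fig. 1, of Han, or of the claims.

    DELTA = 0.5
    LEVELS = {                                         # which levels a state makes eligible
        "A": [k * DELTA for k in (-4, -2, 0, 2, 4)],   # first hypothetical level set
        "B": [k * DELTA for k in (-3, -1, 0, 1, 3)],   # second hypothetical level set
    }
    SET_FOR_STATE = {0: "A", 1: "A", 2: "B", 3: "B"}
    NEXT_STATE = {(0, 0): 0, (0, 1): 2, (1, 0): 2, (1, 1): 0,
                  (2, 0): 1, (2, 1): 3, (3, 0): 3, (3, 1): 1}

    def encode(params):
        state, indices = 0, []
        for x in params:
            levels = LEVELS[SET_FOR_STATE[state]]             # state -> eligible level set
            idx = min(range(len(levels)), key=lambda i: abs(levels[i] - x))
            indices.append(idx)
            state = NEXT_STATE[(state, idx & 1)]              # (state, index) -> next state
        return indices

    print(encode([0.9, -1.3, 0.1, 2.2]))

Unlike this greedy per-sample stand-in, Kasner's encoder picks the whole index sequence with the Viterbi algorithm so that the cumulative distortion permitted by the trellis is minimized.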
Sze, in the same field of data encoding, teaches the following limitations which the above fails to teach: and encode the quantization index for the current neural network parameter from the data stream using arithmetic coding using a probability model which depends on the state for the current neural network parameter. (Sze, paragraph 6, “CABAC has multiple probability modes for different contexts. It first converts all non-binary symbols to binary symbols referred to as bins. Then, for each bin, the coder selects which probability model to use, and uses information from nearby elements to optimize the probability estimate. Arithmetic coding is then applied to compress the data.”, CABAC’s arithmetic coder selects a probability model (context) based on the current state, and then arithmetically encodes each bin under that model, teaching state-dependent arithmetic encoding.) The motivation to combine Han and Kasner with Sze is substantially similar to that applied for claim 1 above. Claims 106 and 112 recite substantially similar limitations to claim 1 and as such a similar analysis applies. Claims 107, 110, and 113 recite substantially similar limitations to claim 46 and as such a similar analysis applies.
Claims 6, 18, 20, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Kasner in view of Han and in further view of Sze and Coban et al. (US 11,451,840 B2), hereafter referred to as Coban.
Claim 6: Han, Kasner, and Sze teach the limitations of claim 1. Coban, in the same field of trellis coded quantization, teaches the following limitations which the above fails to teach: Apparatus of claim 1, wherein the number of reconstruction level sets of the plurality of reconstruction level sets is two (Coban, col. 12, line 54, “FIG. 3 is a conceptual diagram illustrating an example of how two scalar quantizers can be used to perform quantization… a first quantizer (e.g., Q1) may be configured with a first set of quantization parameters and a second quantizer (e.g., Q2) may be configured with a second set of quantization parameters that are different in value from the first set”, two reconstruction level sets Q1 and Q2 are present in Coban.) and the plurality of reconstruction level sets comprises a first reconstruction level set that comprises zero and even multiples of a predetermined quantization step size, (Coban, col. 13, line 1, “In particular, when using two scalar quantizers, a first quantizer Q0 may map transform coefficient levels (numbers below the points, e.g., absolute values) to even integer multiples of quantization step size Δ.”) and a second reconstruction level set that comprises zero and odd multiples of the predetermined quantization step size. (Coban, col. 13, line 4, “The second quantizer Q1 may map the transform coefficient levels to odd integer multiples of the quantization step size Δ or to zero.”) It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Han (i.e., deep neural network quantization methods), Kasner, and Sze by incorporating the teachings of Coban (i.e., Trellis-Coded Quantization using parity-based indexing). A motivation is to provide a codebook quantization technique using Trellis Coded Quantization so as to increase computational efficiency. (Coban, col. 33, line 26, “In this way, video encoder 200 and/or video decoder 300 may quantize or inverse quantize a set of syntax elements representing the remaining levels (e.g.
, gt3 418A-418N) of the plurality of coefficient levels for transform coefficients of residual data for the block of video data without grouping all bypass coded bins for simpler parsing, thereby improving a computation efficiency of video encoder 200 and/or video decoder 300.”)
Claim 18: Han, Kasner, and Sze teach the limitations of claim 16. Coban, in the same field of trellis coded quantization, teaches the following limitations which the above fails to teach: Apparatus of claim 16, configured to update the state for the subsequent neural network parameter using a parity of the quantization index decoded from the data stream for the immediately preceding neural network parameter. (Coban, col. 13, line 1, “In particular, when using two scalar quantizers, a first quantizer Q0 may map transform coefficient levels (numbers below the points, e.g., absolute values) to even integer multiples of quantization step size Δ. The second quantizer Q1 may map the transform coefficient levels to odd integer multiples of the quantization step size Δ or to zero.”, Coban drives state transitions by the parity (even/odd) of the previous decoded coefficient level. Because Coban explicitly organizes reconstruction levels into even multiples (Q0) and odd multiples (Q1) of Δ, the quantizer’s behavior is governed by the parity (even/odd) of the relevant integer index or level. Whenever the scheme chooses between Q0 and Q1 based on whether a level (or its integer multiple index) is even or odd, it is effectively using the parity of the quantization index as the update variable for the state machine. That even-versus-odd test is what is being interpreted as “using a parity of the quantization index … to update the state for the subsequent neural network parameter.” It is interpreted that this form of Trellis-Coded Quantization would then be applied to the neural network parameters of Han.)
Claim 20: Han, Kasner, and Sze teach the limitations of claim 16. Coban, in the same field of trellis coded quantization, teaches the following limitations which the above fails to teach: Apparatus of claim 16, configured to transition, in the state transition process, between an even number of possible states and the number of reconstruction level sets of the plurality of reconstruction level sets is two, wherein the determining, for the current neural network parameter, the set of quantization levels out of the plurality of reconstruction level sets depending on the state associated with the current neural network parameter determines a first reconstruction level set out of the plurality of reconstruction level sets if the state belongs to a first half of the even number of possible states, and a second reconstruction level set out of the plurality of reconstruction level sets if the state belongs to a second half of the even number of possible states. (Coban, col. 13, line 17, “In this example, coefficients in state 0 and state 1 use the Q0 (even integer multiples of stepsize) quantizer, and coefficients in states 2 and 3 use the Q1 (odd integer multiples of stepsize) quantizer.”, Coban’s 4-state example (states 0–3) shows that states 0 and 1 use Q0 (even-multiple quantizer) while states 2 and 3 use Q1 (odd-multiple quantizer).
Here, the total number of states (4) is even; the first two states (0,1) constitute the “first half” of the states and are associated with the first reconstruction-level set (Q0), while the remaining two states (2,3) constitute the “second half” and are associated with the second reconstruction-level set (Q1). Thus, Coban’s mapping of Q0 to states in the first half and Q1 to states in the second half of an even-sized state set is what is being interpreted as the claimed “if the state belongs to a first half … use the first reconstruction level set … if the state belongs to a second half … use the second reconstruction level set.”)
Claim 21: Han, Kasner, and Sze teach the limitations of claim 16. Coban, in the same field of trellis coded quantization, teaches the following limitations which the above fails to teach: Apparatus of claim 16, configured to perform the update of the state by means of a transition table which maps a combination of the state and a parity of the quantization index decoded from the data stream for the immediately preceding neural network parameter onto a further state associated with the subsequent neural network parameter. (Coban, col. 13, line 14, “FIG. 4 is a conceptual diagram illustrating an example state transition scheme for two scalar quantizers used to perform quantization. In this example, coefficients in state 0 and state 1 use the Q0 (even integer multiples of stepsize) quantizer, and coefficients in states 2 and 3 use the Q1 (odd integer multiples of stepsize) quantizer.”, Figure 4 of Coban clearly discloses a state transition scheme which maps the combination of the state and the parity of the quantization index and shows transitions to further states. It is interpreted by the examiner that the neural network quantization of Han and its data stream would similarly apply here and provide a data stream of neural-network-related parameters on which to perform the trellis-coded quantization of Coban.)
Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Kasner in view of Han and in further view of Sze, Coban, and Schwarz et al. (Schwarz, H., Nguyen, T., Marpe, D., & Wiegand, T. (2019, March). Hybrid video coding with trellis-coded quantization. In 2019 Data Compression Conference (DCC) (pp. 182-191). IEEE.), hereafter referred to as Schwarz.
Claim 8: Han, Kasner, and Sze teach the limitations of claim 1. Coban, in the same field of trellis coded quantization, teaches the following limitations which the above fails to teach: Apparatus of claim 7, wherein the number of reconstruction level sets of the plurality of reconstruction level sets is two and the apparatus is configured to derive the intermediate value for each neural network parameter by, if the selected reconstruction level set for the respective neural network parameter is a first set, multiply the quantization index for the respective neural network parameter by two to acquire the intermediate value for the respective neural network parameter; (Coban, col. 13, line 1, “In particular, when using two scalar quantizers, a first quantizer Q0 may map transform coefficient levels (numbers below the points, e.g., absolute values) to even integer multiples of quantization step size Δ.”, mapping each index to an even integer multiple of Δ is the same as computing an intermediate value equal to the index multiplied by two.)
and if the selected reconstruction level set for a respective neural network parameter is a second set and the quantization index for the respective neural network parameter is equal to zero, set the intermediate value for the respective neural network parameter equal to zero; (Coban, col. 13, line 4, “The second quantizer Q1 may map the transform coefficient levels to odd integer multiples of the quantization step size Δ or to zero.”, the case when the decoded index is 0 is covered by Q1 outputting 0 as the intermediate value.) Schwarz, in the same field of trellis coded quantization, teaches the following limitations which Kasner, Han, Sze, and Coban fail to teach: and if the selected reconstruction level set for a respective neural network parameter is a second set and the quantization index for the respective neural network parameter is greater than zero, multiply the quantization index for the respective neural network parameter by two and subtract one from the result of the multiplication to acquire the intermediate value for the respective neural network parameter; (Schwarz, page 184, paragraph 3, “The structure of the two scalar quantizers Q0 and Q1 used in our approach is shown in Fig. 1. The reconstruction values for Q0 are given by the even multiples of the quantization step size Δ; the reconstruction values for Q1 are given by the odd multiples of Δ and, in addition, the value of zero… For both quantizers, the reconstruction values are indicated by integer quantization indexes q, where the quantization index equal to zero corresponds to the reconstruction value equal to zero.”, page 185, paragraph 1, “Given the N quantization indexes q_k of a block, with k indicating the coding order, the associated reconstructed transform coefficients t_k can be obtained by the following simple algorithm: [figure omitted]”, In Schwarz, the TCQ design defines two scalar quantizers Q0 and Q1, where Q0 uses even multiples of Δ and Q1 uses odd multiples of Δ (plus zero), with each reconstruction value associated with an integer quantization index q. For each coefficient, the decoder computes the reconstructed value as t_k = (2·q_k − (s_k >> 1)·sgn(q_k))·Δ. For states corresponding to the second quantizer (Q1), (s_k >> 1) = 1, so the formula specializes to t_k = (2·q_k − sgn(q_k))·Δ. For a positive quantization index q_k > 0, sgn(q_k) = 1, and thus t_k = (2·q_k − 1)·Δ. Under the claim’s terminology, the quantization index is q_k, the “intermediate value” is the integer (2·q_k − 1), and multiplying that intermediate value by the predetermined quantization step size Δ yields the odd reconstruction level. This directly corresponds to “multiply the quantization index … by two and subtract one … to acquire the intermediate value” when the selected reconstruction-level set is the second set and the quantization index is greater than zero.) and if the selected reconstruction level set for a current neural network parameter is a second set and the quantization index for the respective neural network parameter is less than zero, multiply the quantization index for the respective neural network parameter by two and add one to the result of the multiplication to acquire the intermediate value for the respective neural network parameter. (Schwarz, page 184, paragraph 3, “The structure of the two scalar quantizers Q0 and Q1 used in our approach is shown in Fig. 1.
The reconstruction values for Q0 are given by the even multiples of the quantization step size Δ; the reconstruction values for Q1 are given by the odd multiples of Δ and, in addition, the value of zero… For both quantizers, the reconstruction values are indicated by integer quantization indexes q, where the quantization index equal to zero corresponds to the reconstruction value equal to zero.”, page 185, paragraph 1, “Given the N quantization indexes q_k of a block, with k indicating the coding order, the associated reconstructed transform coefficients t_k can be obtained by the following simple algorithm: [figure omitted]”, As above, Schwarz’s TCQ scheme uses two quantizers Q0 and Q1; Q1’s reconstruction values are odd integer multiples of Δ plus zero, with each value addressed by an integer quantization index q. The decoder reconstructs each coefficient via t_k = (2·q_k − (s_k >> 1)·sgn(q_k))·Δ. For the second quantizer Q1, (s_k >> 1) = 1, giving t_k = (2·q_k − sgn(q_k))·Δ. When the quantization index is negative (q_k < 0), sgn(q_k) = −1, so the expression becomes t_k = (2·q_k + 1)·Δ. The factor (2·q_k + 1) is a negative odd integer (… −5, −3, −1, …), which, when multiplied by Δ, yields the negative odd-multiple reconstruction levels required for Q1. Under the claim’s terminology, the quantization index is q_k, the “intermediate value” is the integer (2·q_k + 1), and multiplying that intermediate value by the predetermined quantization step size Δ produces the final reconstruction level. This matches “multiply the quantization index … by two and add one … to acquire the intermediate value” for the second reconstruction-level set when the quantization index is less than zero.) It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to incorporate the trellis-coded quantization scheme of Schwarz into the existing Kasner, Han, Sze, and Coban combination in order to improve coding efficiency and rate–distortion performance when quantizing and reconstructing coefficients. Schwarz describes replacing conventional scalar quantization by trellis-coded quantization and reports that “the coding efficiency of transform coding can be improved by replacing scalar quantization with trellis-coded quantization (TCQ) and using advanced entropy coding techniques for coding the quantization indexes” (Abstract) and that their implementation in the VVC test model “yielded average bit-rate savings of 4.9% for intra-only coding and 3.3% for typical random access configurations” (Abstract) relative to a scalar-quantization baseline. A person of ordinary skill in the art implementing Han’s neural-network quantization with Kasner/Coban-style trellis coding would therefore have been motivated to adopt Schwarz’s explicit TCQ reconstruction rule—using intermediate values of the form 2·q_k − sgn(q_k) (i.e., 2i−1 and 2i+1 for positive and negative indices)—to obtain the same kind of bit-rate savings and improved coding efficiency for neural-network parameters.
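For clarity, the case analysis above reduces to a short intermediate-value rule: 2·q for the first set, and 0, 2·q − 1, or 2·q + 1 (i.e., 2·q − sgn(q)) for the second set, which is then scaled by the step size Δ. The Python sketch below restates that rule; the function name and the example step size are illustrative only and are not drawn from Schwarz, Coban, or the claims.

    def intermediate_value(q, second_set):
        # First set (even multiples of the step size): 2*q.
        if not second_set:
            return 2 * q
        # Second set (odd multiples of the step size, plus zero): 2*q - sign(q).
        if q == 0:
            return 0
        return 2 * q - 1 if q > 0 else 2 * q + 1

    DELTA = 0.25  # hypothetical quantization step size
    for q, second in [(2, False), (3, True), (-3, True), (0, True)]:
        print(q, second, intermediate_value(q, second) * DELTA)
    # reconstructed values for the four cases: 1.0, 1.25, -1.25, 0.0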
Claim 38 is rejected under 35 U.S.C. 103 as being unpatentable over Kasner in view of Han and in further view of Sze and Marpe et al. (Marpe, D., Schwarz, H., & Wiegand, T. (2003). Context-based adaptive binary arithmetic coding in the H.264/AVC video compression standard. IEEE Transactions on Circuits and Systems for Video Technology, 13(7), 620-636.), hereafter referred to as Marpe.
Claim 38: Kasner, Han, and Sze teach the limitations of claim 35. Marpe, in the same field of context-based adaptive binary arithmetic coding, further teaches: Apparatus of claim 35, configured to select the probability model for the current neural network parameter out of the subset of probability models depending on a characteristic of the quantization index of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to, (Marpe, page 629, col. 1, paragraph 2, “Characteristic Features: For the coding of residual data within the H.264/AVC standard specifically designed syntax elements are used in CABAC entropy coding mode. These elements and their related coding scheme are characterized by the following distinct features… Context models for coding of nonzero transform coefficients are chosen based on the number of previously transmitted nonzero levels within the reverse scanning path.”, H.264’s CABAC selects a context model for the nonzero transform coefficients of a block based on previously transmitted nonzero levels, which is interpreted as analogous to choosing a probability model based on previously decoded weight indices from neighboring parameters when combined with Han.) the characteristic comprising one or more of the signs of non-zero quantization indices of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to, (Marpe, page 629, col. 2, paragraph 2, “Significance Map: If the coded_block_flag indicates that a block has significant coefficients, a binary-valued significance map is encoded. For each coefficient in scanning order, a one-bit symbol significant_coeff_flag is transmitted.”, Marpe’s CABAC description encodes, for each coefficient, both (i) a significant_coeff_flag indicating zero vs. non-zero, and (ii) a coeff_sign_flag representing the sign of each non-zero coefficient. Contexts for these flags are conditioned on neighboring coefficients in the same block. When mapped to neural networks, each transform coefficient corresponds to a parameter; the neighboring coefficients in the block correspond to a neighboring portion of the neural network, and the coeff_sign_flag bits for those neighbors are the “signs of non-zero quantization indices of previously decoded neural network parameters.” Thus, using those sign flags in context templates is what is being interpreted as using the sign characteristics of neighboring non-zero indices to drive probability-model selection.) the number of quantization indices of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to, and which are non-zero (Marpe, page 625, col. 2, paragraph 1, “In contrast to all other types of context models, both types depend on the context categories of different block types, as specified below… modeling functions are specified that involve the evaluation of the accumulated number of encoded (decoded) levels with a specific value prior to the current level bin to encode (decode).”, Marpe, page 630, col.
2, last paragraph, “When a level with an absolute value greater than 1 has been encoded, i.e., when NumLgt1 is greater than 0, a context index increment of 4 is used for all remaining levels of the regarded block.”, Marpe specifies context models that depend on counters like NumLgt1(i) and NumT1(i), which are accumulated numbers of previously encoded non-zero coefficients with certain absolute values before the current bin. These counters effectively count how many prior coefficients in the same block (neighboring region) are non-zero. When mapped onto neural-network parameters, that block of coefficients is the neighboring portion of the network, and the count NumLgt1/NumT1 is what is being interpreted as the “number of quantization indices of previously decoded neural network parameters … in the neighboring portion … which are non-zero.”) a sum of the absolute values of quantization indices of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to, (Marpe, page 630, col. 2, last paragraph, “When a level with an absolute value greater than 1 has been encoded, i.e., when NumLgt1 is greater than 0, a context index increment of 4 is used for all remaining levels of the regarded block.”, Marpe defines counters such as NumLgt1 that accumulate how many previously encoded coefficients in the block have an absolute level greater than 1, and uses conditions like “when NumLgt1 is greater than 0, a context index increment of 4 is used for all remaining levels of the regarded block” (Marpe, p. 630, col. 2). Each coefficient with |level|>1 contributes at least 2 to the total sum of absolute values across the block, so the condition NumLgt1>0 implies that the sum of absolute values of previously encoded coefficients in that block exceeds a positive minimum. Using NumLgt1 as a context-selection condition is therefore being interpreted, under a broad reading, as using a characteristic derived from the sum of absolute values of previously decoded neighboring quantization indices when selecting a probability model, corresponding to the claimed “sum of the absolute values … of previously decoded … neighboring parameters.”) a difference between a sum of the absolute values of quantization indices of previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to, and the number of quantization indices of the previously decoded neural network parameters which relate to a portion of the neural network neighboring a portion which the current neural network parameter relates to, and which are non-zero. (Marpe, page 630, col. 2, last paragraph, “Then, the context for the first bin of coeff_abs_level_minus1 is determined by the current value NumT1, where the following additional rules apply. If more than three past coded coefficients have an absolute value of 1, the context index increment of three is always chosen.
When a level with an absolute value greater than 1 has been encoded, i.e., when NumLgt1 is greater than 0, a context index increment of 4 is used for all remaining levels of the regarded block.”, Marpe’s decision logic jointly examines NumT1 (the count of prior levels equal to 1) and NumLgt1 (the count of prior levels greater than 1) to choose between two context-index increments (3 vs. 4). In effect, it is checking whether the difference between the total magnitude (sum) and the simple count exceeds a threshold, mapping directly to using a difference between the “sum of absolute values of quantization indices” and the “number of quantization indices” for model selection.) It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Han (i.e., deep neural network quantization methods) by incorporating the teachings of Marpe (i.e., context-based adaptive binary arithmetic coding in H.264/AVC). A motivation is to provide an entropy coding technique that would further reduce the overall bitstream size. (Marpe, page 632, col. 2, paragraph 3, “The CABAC entropy coding method is part of the Main profile of H.264/AVC [1] and may find its way into video streaming, broadcast, or storage applications within this profile. Experimental results have shown the superior performance of CABAC in comparison to the baseline entropy coding method of VLC/CAVLC. For typical test sequences in broadcast applications, averaged bit-rate savings of 9% to 14% corresponding to a range of acceptable video quality of about 30–38 dB were obtained.”, Marpe’s CABAC method provides explicitly superior compression performance at an acceptable level of quality.)
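Read this broadly, the neighborhood statistics at issue can be illustrated with a short Python sketch that derives a context index from already-decoded neighboring indices using the count of non-zero neighbors, the sum of their absolute values, and the difference between the two. The function, formula, and thresholds are hypothetical stand-ins, not Marpe's context tables and not the claimed apparatus.

    def context_index(prev_indices, num_contexts=4):
        # Local statistics over previously decoded neighboring quantization indices.
        nonzero = sum(1 for q in prev_indices if q != 0)    # how many neighbors are non-zero
        abs_sum = sum(abs(q) for q in prev_indices)         # total magnitude of the neighbors
        excess = abs_sum - nonzero                           # positive only if some |q| > 1
        return min(nonzero + (1 if excess > 0 else 0), num_contexts - 1)

    print(context_index([0, 1, -1]))   # two non-zero neighbors: context 2
    print(context_index([2, 0, -3]))   # larger-magnitude neighbors: context 3
    print(context_index([0, 0, 0]))    # all-zero neighborhood: context 0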
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Reagan, B., Gupta, U., Adolf, B., Mitzenmacher, M., Rush, A., Wei, G. Y., & Brooks, D. (2018, July). Weightless: Lossy weight encoding for deep neural network compression. In International Conference on Machine Learning (pp. 4324-4333). PMLR.
Oktay, D., Ballé, J., Singh, S., & Shrivastava, A. (2019). Scalable model compression by entropy penalized reparameterization. arXiv preprint arXiv:1906.06624.
US20190387259A1 - Trellis coded quantization coefficient coding
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HYUNGJUN B YI whose telephone number is (703) 756-4799. The examiner can normally be reached M-F 9-5. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Usmaan Saeed, can be reached at (571) 270-0419. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/H.B.Y./ Examiner, Art Unit 2146
/USMAAN SAEED/ Supervisory Patent Examiner, Art Unit 2146

Prosecution Timeline

Jun 17, 2022
Application Filed
Aug 05, 2025
Non-Final Rejection — §101, §103, §DP
Nov 11, 2025
Response Filed
Mar 05, 2026
Non-Final Rejection — §101, §103, §DP (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12536429
INTELLIGENTLY MODIFYING DIGITAL CALENDARS UTILIZING A GRAPH NEURAL NETWORK AND REINFORCEMENT LEARNING
2y 5m to grant Granted Jan 27, 2026
Study what changed to get past this examiner. Based on the 1 most recent grant.


Prosecution Projections

2-3
Expected OA Rounds
18%
Grant Probability
49%
With Interview (+31.7%)
4y 7m
Median Time to Grant
Moderate
PTA Risk
Based on 17 resolved cases by this examiner. Grant probability derived from career allow rate.
