Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 12/8/2025 has been entered.
Remarks
This Office Action is responsive to Applicant's Amendment filed on December 8, 2025, in which claims 1, 5, 9, and 12 are currently amended. Claims 1-12 are currently pending.
Response to Arguments
Applicant's arguments with respect to the rejection of claims 1-12 under 35 U.S.C. 101 have been considered but are not persuasive.
With respect to Applicant's arguments that the claims cannot be performed entirely in the mind, with or without the assistance of tools such as pen and paper, and/or are not directed towards a mathematical concept, Examiner respectfully disagrees. In the Final Office Action mailed 10/9/2025, Examiner provided objective evidence that entropy coding can be performed entirely in the mind with the assistance of tools such as pen and paper and without the use of a computer, such that the mere recitation of generic computer components to apply the judicial exception does not integrate the judicial exception into a practical application. Applicant's arguments on pp. 7-8 of the Remarks submitted 12/8/2025 do not provide evidence otherwise but rather amount to mere allegations of eligibility. Similarly, with respect to Applicant's arguments on pp. 10-11 of the Remarks submitted 12/8/2025, Applicant has alleged that the instant claims present a technical improvement without actually pointing out what the technical improvement is (see MPEP 2106.05(a): "An important consideration in determining whether a claim improves technology is the extent to which the claim covers a particular solution to a problem or a particular way to achieve a desired outcome, as opposed to merely claiming the idea of a solution or outcome."). For at least these reasons, and those further detailed below, Examiner asserts that it is reasonable and appropriate to maintain the rejection under 35 U.S.C. 101.
Applicant's arguments with respect to the rejection of claims 1-12 under 35 U.S.C. 102/103 based on the amendments have been considered but are not persuasive.
With respect to Applicant's argument on p. 13 of the Remarks submitted 12/8/2025 that "the concept of 'sparse activation map' in Georgiadis is substantially a different coding method from the distinction between 'global quantization' and 'local quantization' in the present invention", Examiner respectfully disagrees. Local quantization is defined in non-limiting embodiments of the instant specification as applying to sub-blocks of layers ([¶0014]), whereas global quantization is described in non-limiting embodiments of the instant specification as applying uniform or linear quantization to the layer as a whole ([¶0056]). In view of the instant specification, the layer activation-map quantization step in Georgiadis (the second step in the compression pipeline) is therefore interpreted as global quantization, since the quantization occurs for the entire activation map before the map is split into smaller quantized sub-blocks (local quantization) in the entropy coding stage ([¶0019] "the quantized activation maps may be divided into smaller units, referred to herein as compress blocks").
[Image: FIG. 1A of US20200143226A1]
For at least these reasons, the global quantization in Georgiadis is seen as analogous to the global quantization described in the instant specification.
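Purely as an illustrative sketch of the distinction drawn above, and not as a characterization of any party's implementation, the following Python fragment contrasts a single layer-wide quantization scale with per-sub-block scales (the function names, 8-bit width, and 4×4 sub-block size are hypothetical and appear in neither the instant specification nor Georgiadis):

```python
import numpy as np

def global_quantize(layer, bits=8):
    # One scale for the entire layer matrix (uniform/linear quantization):
    # every weight in the layer shares the same step size.
    scale = max(np.abs(layer).max(), 1e-12) / (2 ** (bits - 1) - 1)
    return np.round(layer / scale).astype(np.int32), scale

def local_quantize(layer, block=4, bits=8):
    # A separate scale per sub-block: the layer matrix is tiled into
    # block-by-block sub-blocks, and each tile is quantized independently.
    q = np.empty(layer.shape, dtype=np.int32)
    scales = {}
    for i in range(0, layer.shape[0], block):
        for j in range(0, layer.shape[1], block):
            tile = layer[i:i + block, j:j + block]
            s = max(np.abs(tile).max(), 1e-12) / (2 ** (bits - 1) - 1)
            q[i:i + block, j:j + block] = np.round(tile / s)
            scales[(i, j)] = s
    return q, scales
```

Under the Examiner's reading, the quantization stage in Georgiadis corresponds to the single-scale path, while the division into compress blocks occurs only afterward, in the entropy coding stage.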
With respect to Applicant's argument on p. 13 of the Remarks submitted 12/8/2025 that Georgiadis does not teach "when local quantization is performed on the current layer, the quantization information includes [...]", Examiner notes that the instant claims only require one of global quantization or local quantization, and Georgiadis has been interpreted as disclosing global quantization in view of the instant specification. Since Georgiadis does not disclose performing local quantization, the local quantization is never performed, which satisfies the contingent claim limitation. MPEP 2111.04 instructs that under the broadest reasonable interpretation (BRI), a method claim with contingent limitations does not include requirements whose condition precedent is not met: "assume a method claim requires step A if a first condition happens and step B if a second condition happens. If the claimed invention may be practiced without either the first or second condition happening, then neither step A nor step B is required by the broadest reasonable interpretation of the claim."
For at least these reasons, and those further detailed below, Examiner asserts that it is reasonable and appropriate to maintain the prior art rejections in view of Georgiadis.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-12 are rejected under 35 USC § 101 because the claimed invention is directed to non-statutory subject matter.
Regarding Claim 1: Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 1 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: Claim 1, under its broadest reasonable interpretation, recites a series of mental processes. But for the generic computer components language, the limitations of this claim, in the context of neural network processing, encompass mental processes, including the following:
entropy decoding quantization information for a current layer, wherein the quantization information includes an indicator indicating whether a global quantization or a local quantization is performed on the current layer (observation, evaluation, and judgement. Arithmetic coding is an example of entropy coding that can readily and practically be performed entirely by hand; see Vashishtha, "Arithmetic Coding", 2018, at https://www.youtube.com/watch?v=-R2a2a1-2MM, which shows arithmetic coding being performed entirely by hand, and the illustrative sketch following this analysis.),
performing dequantization on the current layer using one of the global quantization or the local quantization based on the indicator (observation, evaluation, and judgement. The European Mathematical Society Encyclopedia of Math defines quantization as: "The partitioning of a set of possible communications generated by an information source (cf. Information, source of) into a finite (or sometimes countable) number of disjoint subsets in such a way that the information in each class can be represented with a given precision of reproduction of the information (cf. Information, exactness of reproducibility of) by some specially selected element. To a given quantization of information corresponds a way of coding the information source, defined by a coding function [...]. Such a quantization enables one to replace the sending of a continuous signal by that of a discrete signal without violating certain conditions on the precision of reproduction of information", which can readily and practically be performed entirely in the mind with or without the assistance of tools such as pen and paper. A neural network layer is routinely represented as a matrix, such that dequantization of a layer is interpreted as synonymous with dequantization of a matrix. This interpretation is explicitly reinforced by the instant specification ([¶0043] "one layer of the deep neural network may correspond to a learned weight matrix having one dimension").)
wherein one of the global quantization or the local quantization is performed on the current layer based on a distortion test (observation, evaluation, and judgement)
wherein, when local quantization is performed on the current layer, the quantization information includes, local quantization application information regarding whether or not local quantization is applied to the entire current layer, sub-block size fix information regarding whether or not a sub-block size is applied, sub-block size information on a sub-block size, sub-block local quantization application information regarding whether or not local quantization is applied to a sub-block, local quantization mode information on a local quantization mode, and sub-block position information on a sub-block position. (observation, evaluation, and judgement)
Therefore, claim 1 recites an abstract idea which is a judicial exception.
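To illustrate that the interval arithmetic underlying arithmetic coding involves only addition and multiplication of fractions, i.e., the kind of computation asserted above to be practically performable with pen and paper, the following minimal Python sketch encodes a short message exactly (the three-symbol alphabet and its probabilities are hypothetical and are not taken from the cited video):

```python
from fractions import Fraction

def arithmetic_encode(message, probs):
    # Shrink the interval [low, high) symbol by symbol, exactly as one
    # would on paper: each symbol selects its probability sub-interval.
    cum, c = {}, Fraction(0)
    for sym, p in probs.items():
        cum[sym] = c       # cumulative probability preceding sym
        c += p
    low, high = Fraction(0), Fraction(1)
    for sym in message:
        width = high - low
        high = low + width * (cum[sym] + probs[sym])
        low = low + width * cum[sym]
    return low, high       # any number in [low, high) identifies the message

probs = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
print(arithmetic_encode("ab", probs))  # (Fraction(1, 4), Fraction(3, 8))
```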
Step 2A Prong Two Analysis: Claim 1 recites additional elements "in a plurality of layers of the deep neural network" and "a deep neural network". However, these additional features are computer components recited at a high level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component. An additional element that merely recites the words "apply it" (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application. Claim 1 also recites the additional element "obtaining a plurality of layers of the deep neural network", which amounts to generally linking the judicial exception to a particular technology or field of use. Therefore, claim 1 is directed to a judicial exception.
Step 2B Analysis: Claim 1 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 1 amount to no more than mere instructions to apply the judicial exception using a generic computer component and insignificant extra-solution activity. The gathering and outputting of data is considered well-understood, routine, and conventional in the art (see MPEP 2106.05(d)(II)(i)).
For the reasons above, claim 1 is rejected as being directed to non-patentable subject matter under §101. This rejection applies equally to independent claim 12, which recites a computer program product, as well as to dependent claims 2-8. The additional limitations of the dependent claims are addressed briefly below:
Dependent claim 2 recites additional observation, evaluation, and judgement “when global quantization is performed on the current layer, the quantization information includes at least one of global quantization mode information on a global quantization mode, bit size information on a bit size, uniform quantization application information regarding whether or not uniform quantization is applied, individual decoding information on individual decoding of the plurality of layers, parallel decoding information regarding whether or not parallel decoding is performed, codebook information on a codebook, step size information on a step size, and channel number information on a number of channels in the current layer.”
Dependent claim 3 recites additional observation, evaluation, and judgement “when nonuniform quantization is performed on the current layer, the quantization information includes outlier-aware quantization application information regarding application of an outlier-aware quantization mode.”
Dependent claim 4 recites additional observation, evaluation, and judgement “when the global quantization mode is a special global quantization mode, the quantization information includes transform function list position information regarding a position in a transform function list”.
Dependent claim 5 recites additional observation, evaluation, and judgement “when local quantization is performed on the current layer, the quantization information includes sub-block codebook information on a sub-block codebook, and channel number information on a number of channels of the current layer”.
Dependent claim 6 recites additional observation, evaluation, and judgement "wherein, when the local quantization mode is a mode for allocating a specific bit, the quantization information includes local quantization bit size information on a local quantization bit size".
Dependent claim 7 recites additional observation, evaluation, and judgement "the entropy decoding of the quantization information for the current layer uses at least one of a limited K-th order Exp_Golomb binarization method, a fixed-length binarization method, a unary binarization method, and a truncated binary binarization method".
Dependent claim 8 recites additional observation, evaluation, and judgement "The method of claim 7, wherein the entropy decoding of the quantization information for the current layer uses, for information generated through binarization, at least one of a context-based adaptive binary arithmetic coding (CABAC) method, a context-based adaptive variable length coding (CAVLC) method, a conditional arithmetic coding method, and a bypass coding method".
Regarding Claim 9: Claim 9 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 9 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: Claim 9, under its broadest reasonable interpretation, recites a series of mental processes. But for the generic computer components language, the limitations of this claim, in the context of neural network processing, encompass mental processes, including the following:
performing quantization for a current layer using one of a global quantization or a local quantization (observation, evaluation, and judgement. The European Mathematical Society Encyclopedia of Math defines quantization as: "The partitioning of a set of possible communications generated by an information source (cf. Information, source of) into a finite (or sometimes countable) number of disjoint subsets in such a way that the information in each class can be represented with a given precision of reproduction of the information (cf. Information, exactness of reproducibility of) by some specially selected element. To a given quantization of information corresponds a way of coding the information source, defined by a coding function [...]. Such a quantization enables one to replace the sending of a continuous signal by that of a discrete signal without violating certain conditions on the precision of reproduction of information", which can readily and practically be performed entirely in the mind with or without the assistance of tools such as pen and paper. A neural network layer is routinely represented as a matrix, such that quantization of a layer is interpreted as synonymous with quantization of a matrix. This interpretation is explicitly reinforced by the instant specification ([¶0043] "one layer of the deep neural network may correspond to a learned weight matrix having one dimension").)
entropy encoding quantization information for the current layer, wherein the quantization information includes an indicator indicating whether the global quantization or the local quantization is performed on the current layer (observation, evaluation, and judgement. Arithmetic coding is an example of entropy coding that can readily and practically be performed entirely by hand; see Vashishtha, "Arithmetic Coding", 2018, at https://www.youtube.com/watch?v=-R2a2a1-2MM, which shows arithmetic coding being performed entirely by hand, and the interval-arithmetic sketch provided with respect to claim 1.),
generating a bitstream including the quantization information (observation, evaluation, and judgement. A bitstream is a series of 0s and 1s, which can readily be generated entirely in the mind with or without the assistance of tools such as pen and paper.)
wherein one of global quantization or local quantization is performed on the current layer based on a distortion test (observation, evaluation, and judgement)
and wherein, when local quantization is performed on the current layer, the quantization information includes, local quantization application information regarding whether or not local quantization is applied to the entire current layer, sub-block size fix information regarding whether or not a sub-block size is applied, sub-block size information on a sub-block size, sub-block local quantization application information regarding whether or not local quantization is applied to a sub-block, local quantization mode information on a local quantization mode, and sub-block position information on a sub-block position. (observation, evaluation, and judgement)
Therefore, claim 9 recites an abstract idea which is a judicial exception.
Step 2A Prong Two Analysis: Claim 9 recites additional elements "in a plurality of layers of the deep neural network" and "a deep neural network". However, these additional features are computer components recited at a high level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component. An additional element that merely recites the words "apply it" (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application. Therefore, claim 9 is directed to a judicial exception.
Step 2B Analysis: Claim 9 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 9 amount to no more than mere instructions to apply the judicial exception using a generic computer component.
For the reasons above, claim 9 is rejected as being directed to non-patentable subject matter under §101. This rejection applies equally to dependent claims 10 and 11, which recite additional observation, evaluation, and judgement paralleling claims 7 and 8, respectively.
Therefore, when considering the claim elements separately and in combination, they do not amount to significantly more than the judicial exception. Accordingly, claims 1-12 are rejected under 35 U.S.C. § 101.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1, 2, 4, 5, 6, 7, 9, 10, and 12 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Georgiadis (US20200143226A1).
[Image: FIG. 1A of US20200143226A1]
Regarding claim 1, Georgiadis teaches A method for decoding a deep neural network, the method comprising: ([Abstract] "A system and a method provide compression and decompression of an activation map of a layer of a neural network […] For decoding, each block is decoded independently from other blocks using at least one decompression mode corresponding to the at least one compression mode used to compress the block; and deformatted into a tensor having the size of H×W×C.")
in a plurality of layers of the deep neural network, entropy decoding quantization information for a current layer, wherein the quantization information includes an indicator indicating whether a global quantization or a local quantization is performed on the current layer ([¶0006] "the at least one lossless compression mode may be selected from a group including Exponential-Golomb encoding, Sparse-Exponential-Golomb encoding, Sparse-Exponential-Golomb-RemoveMin encoding, Golomb-Rice encoding, Exponent-Mantissa encoding, Zero-encoding, Fixed length encoding, and Sparse fixed length encoding" [¶0019] "In the quantization stage, the activation maps of each layer are quantized. In the entropy coding stage, the quantized activation maps may be divided into smaller units, referred to herein as compress blocks, that are compressed using a variety of different compression modes. [...] the compress blocks are compressed to generate a bit stream representing the compressed activation maps of a layer of the neural network. The compress units may be decompressed, dequantized and reformatted into the original shape of the sparsified activation maps" Each of the listed encoding methods is an entropy encoding method. See FIG. 1A, where information is quantized at step 108 as part of the compressor into quantized information (sparsified quantized maps) for a current layer. See also FIG. 3, which shows that the Encoder and Decoder operate on current layer L; Georgiadis explicitly teaches performing FIG. 3 bidirectionally [¶0019].)
performing dequantization on the current layer using one of the global quantization or the local quantization based on the indicator ([¶0019] "the activation maps of each layer are quantized. In the entropy coding stage, the quantized activation maps may be divided into smaller units, referred to herein as compress blocks [...] The compress units may be decompressed, dequantized and reformatted into the original shape of the sparsified activation maps" The quantization/dequantization in Georgiadis is interpreted as global quantization/dequantization in view of the instant specification. Global quantization is described in non-limiting embodiments of the instant specification as applying uniform or linear quantization to the layer as a whole ([¶0056]), such that the layer activation-map quantization step in Georgiadis (the second step in the compression pipeline) is interpreted as global quantization, since the quantization occurs for the entire activation map before the map is split into smaller quantized sub-blocks (local quantization) in the entropy coding stage ([¶0019] "the quantized activation maps may be divided into smaller units, referred to herein as compress blocks").)
obtaining a plurality of layers of the deep neural network, wherein at least one of global quantization and local quantization is performed on the current layer ([¶0020] "The encoding and decoding may be performed on the activation maps for each layer of the neural network independently from encoding of activation maps of other layers, and as needed by the training algorithm" [¶0062] "the sparsified activation map that has been generated at a layer of a neural network is configured to be a tensor of size H×W×C in which H corresponds to the height of the input tensor, W to the width of the input tensor, and C to the number of channels of the input tensor. If the values of the sparsified activation map have not been quantized from floating-point numbers to be integers, then at 204 the non-quantized values of the sparsified activation map may be quantized into integer values having any bit width to form a sparsified quantized activation map.")
based on a distortion test ([¶0057] "The decompressor 104 decompresses a bitstream 114 to form activation maps 120 for a neural network 105′ (FIG. 1), which are lossy decompressions corresponding to the original non-sparsified activation maps 106" Lossy (distorted) decompression versus lossless (undistorted) decompression is interpreted as being based on a distortion test.)
and wherein, when local quantization is performed on the current layer, the quantization information includes, local quantization application information regarding whether or not local quantization is applied to the entire current layer, sub-block size fix information regarding whether or not a sub-block size is applied, sub-block size information on a sub-block size, sub-block local quantization application information regarding whether or not local quantization is applied to a sub-block, local quantization mode information on a local quantization mode, and sub-block position information on a sub-block position. (Examiner interprets Georgiadis as not performing local quantization in view of the instant specification. MPEP 2111.04 instructs that under BRI a method claim with contingent limitations does not include requirements whose condition precedent is not met: "assume a method claim requires step A if a first condition happens and step B if a second condition happens. If the claimed invention may be practiced without either the first or second condition happening, then neither step A nor step B is required by the broadest reasonable interpretation of the claim.")
Regarding claim 2, Georgiadis teaches The method of claim 1, wherein, when global quantization is performed on the current layer, the quantization information includes at least one of global quantization mode information on a global quantization mode, bit size information on a bit size, uniform quantization application information regarding whether or not uniform quantization is applied, individual decoding information on individual decoding of the plurality of layers, parallel decoding information regarding whether or not parallel decoding is performed, codebook information on a codebook, step size information on a step size, and channel number information on a number of channels in the current layer. (Georgiadis [¶0019] "a bit stream representing the compressed activation maps" [¶0021] "an encoder that may be configured to receive as an input a tensor of size H×W×C in which H corresponds to the height of the input tensor, W to the width of the input tensor, and C to the number of channels of the input tensor" [¶0029] "the non-quantized values of the sparsified activation map 111 may be quantized by the quantizer 108 into integer values having any bit width (i.e., 8 bits, 12 bits, 16 bits, etc.)" [¶0029] "Typically linear (uniform) quantization is used and q may be anything between 1 and 16 bits").
Regarding claim 4, Georgiadis teaches The method of claim 2, wherein, when the global quantization mode is a special global quantization mode, the quantization information includes transform function list position information regarding a position in a transform function list. (Georgiadis [¶0055] "The compression modes available for each compress unit may be fixed during compression of an activation map. In one embodiment, the full range of available compression modes may be represented by L bits. If, for example, four compression modes are available, a two bit prefix may be used to indicate corresponding indices (i.e., 00, 01, 10 and 11) for the four available compression modes" [¶0056] "The corresponding index for the selected compression mode may be appended as a prefix to the beginning of the bitstream for the particular compress unit and then the resulting bitstream for the compress unit may be added to the bitstream for the entire activation map" The compression mode list is interpreted as synonymous with the transform function list, and the compression mode index is interpreted as synonymous with transform function list position information.)
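As a non-authoritative sketch of the prefix-index mechanism quoted above from ¶0055-¶0056 of Georgiadis (the mode table and payload bits below are placeholders rather than Georgiadis's actual modes or bitstream layout), a two-bit prefix selects one of four available compression modes for each compress unit:

```python
# Hypothetical four-entry mode table; Georgiadis lists its own modes.
MODES = {0b00: "fixed-length", 0b01: "Exponential-Golomb",
         0b10: "Golomb-Rice", 0b11: "zero-encoding"}

def emit_compress_unit(mode_index, payload_bits):
    # Two-bit prefix (L = 2 bits for four available modes) followed by
    # the compress unit's payload bits.
    return format(mode_index, "02b") + payload_bits

# Each unit's bitstream is appended to the bitstream for the whole map;
# a decoder reads 2 bits, looks up MODES, then decodes that payload.
stream = emit_compress_unit(0b01, "10110") + emit_compress_unit(0b00, "0001")
```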
Regarding claim 5, Georgiadis teaches The method of claim 1, wherein, when local quantization is performed on the current layer, the quantization information further includes sub-block codebook information on a sub-block codebook, and channel number information on a number of channels of the current layer. (Georgiadis [Abstract] "A system and a method provide compression and decompression of an activation map of a layer of a neural network […] For decoding, each block is decoded independently from other blocks using at least one decompression mode corresponding to the at least one compression mode used to compress the block; and deformatted into a tensor having the size of H×W×C.")
Regarding claim 6, Georgiadis teaches The method of claim 5, wherein, when the local quantization mode is a mode for allocating a specific bit, the quantization information includes local quantization bit size information on a local quantization bit size. (Georgiadis [¶0029] "the non-quantized values of the sparsified activation map 111 may be quantized by the quantizer 108 into integer values having any bit width (i.e., 8 bits, 12 bits, 16 bits, etc.) to form a sparsified and quantized activation map 112")
Regarding claim 7, Georgiadis teaches The method of claim 1, wherein the entropy decoding of the quantization information for the current layer uses at least one of a limited K-th order Exp_Golomb binarization method, a fixed-length binarization method, a unary binarization method, and a truncated binary binarization method. (Georgiadis [¶0031] "Example lossless compression modes include, but are not limited to, Exponential-Golomb encoding, Sparse-Exponential-Golomb encoding, Sparse-Exponential-Golomb-RemoveMin encoding, Golomb-Rice encoding, Exponent-Mantissa encoding, Zero-encoding, Fixed length encoding and Sparse fixed length encoding" [¶0032] "The Exponential-Golomb encoding is a well-known compression mode that assigns variable length codes in which smaller numbers are assigned shorter codes. The number of bits used to encode numbers increases exponentially, and one parameter, commonly referred to as the order k parameter, controls the rate at which the number of bits increase")
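A minimal sketch of the order-k mechanism described in Georgiadis at ¶0032, assuming the conventional order-k Exponential-Golomb code for non-negative integers (general background, not a quotation of Georgiadis's implementation):

```python
def exp_golomb_encode(x, k=0):
    # Order-k Exponential-Golomb: smaller values receive shorter codes,
    # and k controls how quickly code length grows with magnitude.
    value = x + (1 << k)
    zeros = value.bit_length() - k - 1
    return "0" * zeros + format(value, "b")

def exp_golomb_decode(code, k=0):
    # For a single standalone codeword: count the leading zeros, read the
    # remaining bits as a binary value, and remove the 2**k offset.
    zeros = len(code) - len(code.lstrip("0"))
    return int(code[zeros:], 2) - (1 << k)

assert exp_golomb_encode(3, k=0) == "00100"  # standard order-0 code for 3
assert exp_golomb_decode("00100", k=0) == 3
assert exp_golomb_encode(3, k=1) == "0101"   # order-1 yields a shorter code
```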
Regarding claim 9, Georgiadis teaches A method for encoding a deep neural network, the method comprising: in a plurality of layers of the deep neural network, performing quantization for a current layer using one of a global quantization or a local quantization; entropy encoding quantization information for the current layer wherein the quantization information includes an indicator indicating whether the global quantization or the local quantization is performed on the current layer; ([¶0019] "In the quantization stage, the activation maps of each layer are quantized. In the entropy coding stage, the quantized activation maps may be divided into smaller units, referred to herein as compress blocks, that are compressed using a variety of different compression modes. [...] the compress blocks are compressed to generate a bit stream representing the compressed activation maps of a layer of the neural network. The compress units may be decompressed, dequantized and reformatted into the original shape of the sparsified activation maps" [¶0006] "the at least one lossless compression mode may be selected from a group including Exponential-Golomb encoding, Sparse-Exponential-Golomb encoding, Sparse-Exponential-Golomb-RemoveMin encoding, Golomb-Rice encoding, Exponent-Mantissa encoding, Zero-encoding, Fixed length encoding, and Sparse fixed length encoding" Each of the listed encoding methods is an entropy encoding method. See FIG. 1A, where information is quantized at step 108 as part of the compressor into quantized information (sparsified quantized maps) for a current layer. See also FIG. 3, which shows that the Encoder and Decoder operate on current layer L; Georgiadis explicitly teaches performing FIG. 3 bidirectionally [¶0019].)
generating a bitstream including the quantization information ([¶0043] "If the value x that is to be encoded is a 0, then it is encoded as a “1,” and (4) otherwise a “0” is added to the bitstream and then x - y is encoded using the Exponential-Golomb compression mode.")
wherein at least one of global quantization and local quantization is performed on the current layer ([¶0020] "The encoding and decoding may be performed on the activation maps for each layer of the neural network independently from encoding of activation maps of other layers, and as needed by the training algorithm" [¶0062] "the sparsified activation map that has been generated at a layer of a neural network is configured to be a tensor of size H×W×C in which H corresponds to the height of the input tensor, W to the width of the input tensor, and C to the number of channels of the input tensor. If the values of the sparsified activation map have not been quantized from floating-point numbers to be integers, then at 204 the non-quantized values of the sparsified activation map may be quantized into integer values having any bit width to form a sparsified quantized activation map.")
based on a distortion test ([¶0057] "The decompressor 104 decompresses a bitstream 114 to form activation maps 120 for a neural network 105′ (FIG. 1), which are lossy decompressions corresponding to the original non-sparsified activation maps 106" Lossy (distorted) decompression versus lossless (undistorted) decompression is interpreted as being based on a distortion test.)
and wherein, when local quantization is performed on the current layer, the quantization information includes, local quantization application information regarding whether or not local quantization is applied to the entire current layer, sub-block size fix information regarding whether or not a sub-block size is applied, sub-block size information on a sub-block size, sub-block local quantization application information regarding whether or not local quantization is applied to a sub-block, local quantization mode information on a local quantization mode, and sub-block position information on a sub-block position. (Not required by the claim; see the discussion of contingent limitations under MPEP 2111.04 with respect to claim 1.)
Regarding claim 10, Georgiadis teaches The method of claim 9, wherein the entropy encoding of the quantization information for the current layer uses at least one of a limited K-th order Exp_Golomb binarization method, a fixed-length binarization method, a unary binarization method, and a truncated binary binarization method. (Georgiadis [¶0031] "Example lossless compression modes include, but are not limited to, Exponential-Golomb encoding, Sparse-Exponential-Golomb encoding, Sparse-Exponential-Golomb-RemoveMin encoding, Golomb-Rice encoding, Exponent-Mantissa encoding, Zero-encoding, Fixed length encoding and Sparse fixed length encoding" [¶0032] "The Exponential-Golomb encoding is a well-known compression mode that assigns variable length codes in which smaller numbers are assigned shorter codes. The number of bits used to encode numbers increases exponentially, and one parameter, commonly referred to as the order k parameter, controls the rate at which the number of bits increase")
Regarding claim 12, claim 12 is directed towards a computer program product for performing the method of claim 1. Therefore, the rejection applied to claim 1 also applies to claim 12.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Georgiadis and Park (“Energy-efficient Neural Network Accelerator Based on Outlier-aware Low-precision Computation”, 2018).
Regarding claim 3, Georgiadis teaches The method of claim 2.
However, Georgiadis does not explicitly teach wherein, when nonuniform quantization is performed on the current layer, the quantization information includes outlier-aware quantization application information regarding application of an outlier-aware quantization mode.
Examiner notes that the claim limitation is explicitly contingent on nonuniform quantization being performed, and as Examiner has interpreted Georgiadis as never performing nonuniform quantization, the contingent limitation is not required under the broadest reasonable interpretation. However, in the interest of advancing prosecution, Examiner provides the following combination of Georgiadis with Park.
Park, in the same field of endeavor, teaches when nonuniform quantization is performed on the current layer, the quantization information includes outlier-aware quantization application information regarding application of an outlier-aware quantization mode. ([Abstract] "outlier-aware accelerator (OLAccel)" [p. 697 §VII] "In this paper, we proposed a hardware accelerator called OLAccel. It implements outlier-aware quantization, which provides a majority of data with fine-grained quantization" [p. 694 §V] "OLAccel16 spends a long execution cycle for the first convolutional layer [...] OLAccel8 slows down from the second to fifth convolutional layer compared to OLAccel16").
Georgiadis as well as Park are directed towards neural network quantization. Therefore, Georgiadis and Park are analogous art in the same field of endeavor. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of Georgiadis with the teachings of Park by using the outlier-aware quantization method of Park in addition to the other quantization modes in Georgiadis, which explicitly teaches ([¶0031] “It should be understood that other lossless encoding techniques may be used either in addition or as an alternative one of the example compression modes”). Park provides as additional motivation for combination ([p. 697 §VII] "Our experiments show that OLAccel reduces energy consumption by 43.5% and 27.0% on AlexNet compared with the state-of-the-art 16-bit and 8-bit zero-aware accelerators"). This motivation for combination also applies to the remaining claims which depend on this combination.
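As a non-authoritative sketch of the fine-grained-majority idea quoted from Park (the 4-bit width and 1% outlier fraction are illustrative assumptions, not OLAccel's actual design parameters), the bulk of a layer is quantized at low precision while the largest-magnitude outliers are stored separately at full precision:

```python
import numpy as np

def outlier_aware_quantize(layer, bits=4, outlier_pct=1.0):
    # Threshold separating the fine-grained majority from the outliers.
    cutoff = np.percentile(np.abs(layer), 100.0 - outlier_pct)
    outlier_mask = np.abs(layer) > cutoff
    # Low-precision uniform quantization for the majority of values.
    scale = max(cutoff, 1e-12) / (2 ** (bits - 1) - 1)
    q = np.clip(np.round(layer / scale),
                -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    outliers = layer[outlier_mask]  # kept separately at full precision
    q[outlier_mask] = 0             # placeholder in the low-precision grid
    return q.astype(np.int8), scale, outlier_mask, outliers
```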
Claims 8 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Georgiadis and Kim (US20190230354A1).
Regarding claim 8, Georgiadis teaches The method of claim 7.
However, Georgiadis does not explicitly teach wherein the entropy decoding of the quantization information for the current layer uses, for information generated through binarization, at least one of a context-based adaptive binary arithmetic coding (CABAC) method, a context-based adaptive variable length coding (CAVLC) method, a conditional arithmetic coding method, and a bypass coding method.
Kim, in the same field of endeavor, teaches the entropy decoding of the quantization information for the current layer uses, for information generated through binarization, at least one of a context-based adaptive binary arithmetic coding (CABAC) method, a context-based adaptive variable length coding (CAVLC) method, a conditional arithmetic coding method, and a bypass coding method. ([¶0098] "The entropy encoder 130 may perform entropy encoding on the prediction block {tilde over (f)} and/or the quantized residual transformation Ê. For example, the entropy encoder 130 may perform entropy encoding using an encoding scheme, such as a context adaptive variable length coding (CAVLC), a context adaptive binary arithmetic coding (CABAC), and a syntax based context adaptive binary arithmetic coding (SBAC).").
Georgiadis as well as Kim are directed towards entropy coding in neural networks. Therefore, Georgiadis and Kim are analogous art in the same field of endeavor. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of Georgiadis with the teachings of Kim by using CAVLC or CABAC as the entropy coding method in the entropy coding step of Georgiadis. Kim provides as additional motivation for combination that the encoding method ([¶0004] “may provide technology for removing a block boundary artefact, a ringing artefact, and a high frequency blurring artefact occurring due to quantization through in-loop filtering”).
Regarding claim 11, Georgiadis teaches The method of claim 10.
However, Georgiadis does not explicitly teach wherein the entropy encoding of the quantization information for the current layer uses, for information generated through binarization, at least one of a context-based adaptive binary arithmetic coding (CABAC) method, a context-based adaptive variable length coding (CAVLC) method, a conditional arithmetic coding method, and a bypass coding method.
Kim, in the same field of endeavor, teaches the entropy encoding of the quantization information for the current layer uses, for information generated through binarization, at least one of a context-based adaptive binary arithmetic coding (CABAC) method, a context-based adaptive variable length coding (CAVLC) method, a conditional arithmetic coding method, and a bypass coding method. ([¶0098] "The entropy encoder 130 may perform entropy encoding on the prediction block {tilde over (f)} and/or the quantized residual transformation Ê. For example, the entropy encoder 130 may perform entropy encoding using an encoding scheme, such as a context adaptive variable length coding (CAVLC), a context adaptive binary arithmetic coding (CABAC), and a syntax based context adaptive binary arithmetic coding (SBAC).").
Georgiadis as well as Kim are directed towards entropy coding in neural networks. Therefore, Georgiadis and Kim are analogous art in the same field of endeavor. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of Georgiadis with the teachings of Kim by using CAVLC or CABAC as the entropy coding method in the entropy coding step of Georgiadis. Kim provides as additional motivation for combination that the encoding method ([¶0004] “may provide technology for removing a block boundary artefact, a ringing artefact, and a high frequency blurring artefact occurring due to quantization through in-loop filtering”).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Yang (US20190073582A1) is directed towards a local/sub-block quantization method which encodes quantization mode and policy parameters.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SIDNEY VINCENT BOSTWICK whose telephone number is (571)272-4720. The examiner can normally be reached M-F 7:30am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached at (571)270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SIDNEY VINCENT BOSTWICK/Examiner, Art Unit 2124