Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of Claims
The present application is being examined on the basis of the claims filed 01/05/2026. The status of the claims is as follows:
Claims 1 and 3-7 are pending.
Claims 1, 3, and 5 are amended.
Claim 2 is cancelled.
Claims 6 and 7 are new.
Response to Amendment
This Office Action is in response to Applicant’s communication filed 01/05/2026, responding to the Office Action mailed 10/15/2025. The Applicant’s remarks and any amendments to the claims or specification have been considered, with the results that follow.
Response to Arguments
Regarding 35 U.S.C. § 101
Applicant argues that the “claims do not recite purely mathematical operations” (Remarks, p. 1) because claim 1 recites “storing the updated weight … and performing an inference process …” and that the claims are therefore eligible under Step 2A Prong One.
Examiner’s response:
Applicant’s arguments have been fully considered but are not persuasive. As explained in the Non-Final Office Action, the claims recite mathematical concepts, including compressing/expanding numeric ranges, calculating Kalman gain after update, and updating weights based on Kalman gain and an error term (Non-Final OA p. 3). The additional limitations regarding “storing” and “performing inference” (now moved into amended independent claim 1) do not remove the recited mathematical calculations; rather, they describe post-calculation use of computed results in a generic neural-network workflow.
The amendment to claim 1 expressly adds the limitations “storing the updated weight in memory; and performing an inference process with the updated weight”. However, these limitations are not the type of claim limitations that negate the presence of an abstract idea under Step 2A Prong One, where the claim still recites the above mathematical calculations.
Accordingly, the rejection under 35 U.S.C. § 101 is maintained.
Applicant argues claims 1 and 5 achieve a “practical application” because they provide technical benefits for edge devices and efficient memory usage; Applicant cites paragraphs 5, 6, and 8 and Enfish.
Examiner’s response:
Applicant’s arguments have been fully considered but are not persuasive. As explained in the Non-Final Office Action, the claims’ additional elements are field-of-use limitations and/or generic implementation that do not integrate the judicial exception into a practical application. In particular, the Office Action explained that limiting the math to particular numeric formats/representations (e.g., decimal-point notation) is a representation choice and not a technological improvement to computer functionality (Non-Final OA p. 8). Likewise, “maintained in learning” and “used in inference” were treated as insignificant extra-solution activity / mere use of results in a generic environment (Non-Final OA pp. 6-7).
Applicant’s reliance on asserted advantages (edge device implementation, smaller memory requirements) is not commensurate with the claim scope. The amended claims do not recite a specific technical mechanism (e.g., a specific memory architecture, storage format arrangement, quantization engine, or processor/memory interaction) that effectuates an improvement to computer functionality beyond implementing the mathematical calculations and using the results. Instead, amended claim 1 broadly recites generic steps of storing and using the weight for inference (Amendment p. 1), which remains the type of generic post-solution activity discussed in the Non-Final Office Action (Non-Final OA pp. 6-7).
Accordingly, the rejection under 35 U.S.C. § 101 is maintained.
Applicant argues that the Examiner’s statement that quantizing/storing weights is well-understood, routine, and conventional (WURC) is incorrect. Applicant asserts practicality “at a level that can be applied to computational processing” (Remarks, p. 2).
Examiner’s response:
Applicant’s arguments have been fully considered but are not persuasive. The Non-Final Office Action explained that persisting quantized weights and using them for inference is well-understood, routine, and conventional in generic computing environments to save memory/operations and is treated generically in the present disclosure (Non-Final OA pp. 6-7). Applicant does not provide evidence establishing that these limitations, as broadly claimed, constitute an inventive concept or integrate the abstract idea into a practical application.
Further, Applicant has now moved the quantization/storage/inference limitations into amended independent claims 1 and 5 (Amendment pp. 1-2). This change does not add a specific technological mechanism; it merely relocates the same generic implementation features previously considered (Non-Final OA pp. 6-7).
Therefore, the §101 rejection is maintained.
Regarding new claims 6 and 7:
Applicant’s amendment introduces new claims 6 and 7, which recite repeating the inference process until the error is equal to or less than a preset constant value (Remarks, p. 3).
Examiner’s response:
The additional limitations constitute a generic convergence/termination condition for an iterative algorithm and do not integrate the judicial exception into a practical application, nor do they add significantly more than the abstract idea. Therefore, claims 6 and 7 remain subject to the 35 U.S.C. § 101 rejection for at least the reasons applied to their respective base claims, and additionally because the termination condition is conventional for iterative computational procedures.
Regarding 35 U.S.C. § 103
Applicant argues that the prior art reference Dally (US20210056446A1) does not disclose “the updated weight being quantized … second decimal point notation … shorter word length and shorter decimal-part length …”; Applicant argues that paragraphs [0072], [0073], [0075], and [0110] do not disclose the specific claimed “second decimal point notation” limitation and that Dally is directed to logarithmic arithmetic rather than compressing floating-point word length/decimal-part length (Remarks pp. 3-4).
Applicant further argues that Dally’s discussion of Tensor Cores (16-bit inputs with 32-bit accumulation) does not disclose compressing longer word-length values into shorter word lengths (Remarks p. 4).
Examiner’s response:
Applicant’s arguments have been fully considered but are not persuasive. The Non-Final Office Action relied on Dally for its teaching of low-precision numeric representations and representation conversion in neural-network computations to reduce energy and resource usage (Non-Final OA, pp. 12-13). Dally teaches converting values to alternative numeric formats and using reduced precision representation in neural-network computations.
Amended independent claims 1 and 5 now expressly recite the quantization limitation. However, Dally’s teachings of reduced-precision representations and format conversions render it obvious to express computed parameters (including weights) in reduced-precision formats to achieve predictable resource savings, as explained in the Non-Final Office Action (pp. 10-13).
Obviousness does not require that Dally use Applicant’s exact terminology (“decimal part length”). It is sufficient that Dally teaches reducing precision and converting representations in neural-network processing systems.
Accordingly, the rejections of claims 1 and 5 under 35 U.S.C. § 103 are maintained.
Applicant added new claims 6 and 7, which recite repeating the inference process until an error threshold is satisfied.
Examiner’s response:
The addition of repeating until an error threshold is satisfied constitutes a conventional iterative stopping condition in training or inference procedures and would have been an obvious design choice in the combined system of Anderson and Dally (see the rejections under 35 U.S.C. § 103 in this Final Office Action).
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1 and 3-7 are rejected under 35 U.S.C. 101 as being directed to a judicial exception (i.e., an abstract idea) without significantly more.
Regarding claim 1
Claim 1 – Step 1 – Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is to a process.
Claim 1 – Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Yes, the claim recites an abstract idea.
“compressing a range of possible values of a Kalman gain before an update;” – this recites numeric range manipulation of a parameter (Kalman gain). The specification teaches doing so by changing decimal-format ranges or applying nonlinear functions (e.g., logarithm), confirming the mathematical nature of the step. This limitation constitutes a mathematical concept under MPEP § 2106.04(a)(2).
“obtaining a Kalman gain after the update from the compressed Kalman gain before the update using an expanded Kalman filter method;” – this recites performing an expanded Kalman filter computation to obtain an updated Kalman gain (i.e., a mathematical algorithm/calculation). MPEP § 2106.04(a)(2)(I).
“expanding the range of possible values of the Kalman gain after the update,” – this limitation again recites range manipulation, which the specification says may be done by re-expressing number formats or by a nonlinear function (e.g., exponential). This is a mathematical concept under MPEP § 2106.04(a)(2) and thus an abstract idea.
“updating a weight by adding a weight before the update to a result obtained by multiplying the Kalman gain in which the range of the possible values of the Kalman gain is expanded by an error between a training signal and an inference result in which the weight before the update is used” – this limitation recites a multiply/add weight update using a numerically-defined error, a mathematical step and concept under MPEP § 2106.04(a).
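For added clarity, the recited sequence may be summarized in equation form (an illustrative paraphrase of the claim language using the specification’s logarithm/exponential examples; the symbols below are the Examiner’s, not claim terms):

    \tilde{K} = \log(K_{\text{pre}})    (compress range of the Kalman gain before the update)
    K_{\text{post}} = f_{\mathrm{EKF}}(\tilde{K})    (obtain the Kalman gain after the update)
    K_{\text{exp}} = \exp(K_{\text{post}})    (expand range of the Kalman gain after the update)
    w_{\text{new}} = w_{\text{old}} + K_{\text{exp}} \cdot \bigl(d - \hat{y}(w_{\text{old}})\bigr)    (weight update)

where d is the training signal and \hat{y}(w_{\text{old}}) is the inference result obtained using the weight before the update. Each step is a mathematical calculation within the meaning of MPEP § 2106.04(a)(2).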
Claim 1 – Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application?
No. There are no additional elements that integrate the judicial exception into a practical application. The additional elements:
“controlling a processor and memory to execute steps comprising:” – this recites generic computer components used as a tool to perform the claimed steps; it does not, by itself, avoid recitation of an abstract idea where the steps that follow are mathematical calculations. See MPEP § 2106.05(f) (generic computer components).
“the updated weight being quantized by being expressed in a second decimal point notation in which a word length and a length of a decimal part are shorter than the weight before the update expressed in a first decimal point notation;” – this limitation is a formatting/precision choice applied to a computed value (the updated weight). As claimed, it does not recite a specific technological improvement to computer functionality (e.g., a particular memory architecture, hardware quantizer, or non-generic data structure) and instead amounts to using a reduced-precision representation as part of implementing the mathematical calculation. MPEP § 2106.05(f) (generic computer implementation). (An illustrative sketch of such a representation choice follows this Prong Two analysis.)
“storing the updated weight in the memory;” – this is post-calculation handling of the result (the updated weight) using generic memory, which does not meaningfully limit the judicial exception or apply it in a particular improved technological way. See MPEP § 2106.05(g) (insignificant extra-solution activity).
“and performing an inference process with the updated weight” – this is using the computed result in a conventional inference workflow, broadly recited without any specific technical mechanism that improves computer functionality; it is an application/use of the output of the mathematical calculations. See MPEP § 2106.05(g) and MPEP § 2106.05(h) (field-of-use / apply it).
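For illustration only, the following minimal sketch shows the kind of generic precision/representation choice addressed above, assuming a fixed-point reading of “decimal point notation”; the Qm.n formats, numeric values, and function below are hypothetical and are not drawn from the claims or the record:

    # Hypothetical sketch: re-expressing a weight from a longer fixed-point format
    # (Q8.8: 16-bit word, 8-bit decimal part) in a shorter one (Q4.4: 8-bit word,
    # 4-bit decimal part), i.e., a shorter word length and shorter decimal-part length.
    def quantize_fixed_point(value: float, frac_bits: int, word_bits: int) -> float:
        scale = 1 << frac_bits                      # 2**frac_bits
        q = round(value * scale)                    # integer code in the shorter format
        max_code = (1 << (word_bits - 1)) - 1       # saturate to the representable range
        q = max(-(max_code + 1), min(max_code, q))
        return q / scale                            # dequantized (lower-precision) value

    w_before = 0.73046875                           # 187/256, exactly representable in Q8.8
    w_updated = quantize_fixed_point(w_before, frac_bits=4, word_bits=8)
    # w_updated == 0.75 (12/16): shorter word and decimal part, with precision loss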
Claim 1 – Step 2B – Does the claim recite additional elements that amount to significantly more than the judicial exception?
No. There are no additional elements that amount to significantly more than the judicial exception. The additional elements are:
“controlling a processor and memory to execute steps comprising:” – this limitation recites generic processor/memory that performs functions; this constitutes well-understood, routine, and conventional (WURC) activity and does not add an inventive concept. See MPEP § 2106.05(d).
“the updated weight being quantized by being expressed in a second decimal point notation in which a word length and a length of a decimal part are shorter than the weight before the update expressed in a first decimal point notation;” – as broadly claimed, this limitation represents a well-understood, routine, and conventional (WURC) precision/representation choice applied to the computed result and does not add significantly more. See MPEP § 2106.05(d).
“storing the updated weight in the memory;” – this constitutes well-understood, routine, and conventional (WURC) post-calculation activity that does not add an inventive concept. See MPEP § 2106.05(d).
“and performing an inference process with the updated weight” – this constitutes well-understood, routine, and conventional (WURC) post-calculation activity that does not add an inventive concept. See MPEP § 2106.05(d).
Regarding claim 3
Claim 3 – Step 1 – Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is to a process.
Claim 3 – Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Yes, the claim recites an abstract idea.
Claim 3 – Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application?
No. There are no additional elements that integrate the judicial exception into a practical application. The additional elements:
“wherein the weight expressed in the second decimal point notation is maintained in learning,” – this limitation broadly recites continuing to use/maintain the quantized weight during learning, which is a conventional use of a computed/represented value in the claimed workflow and does not recite a specific technological mechanism that integrates the judicial exception into a practical application. See MPEP § 2106.05(g).
“and wherein the weight expressed in the second decimal point notation is used in inference.” – this recites using the quantized weight during inference, which is a conventional application of the output of the mathematical operations and does not impose a meaningful limit or recite a particular technological implementation improving computer functionality. See MPEP § 2106.05(g); MPEP § 2106.05(h).
Claim 3 – Step 2B – Does the claim recite additional elements that amount to significantly more than the judicial exception?
No. There are no additional elements that amount to significantly more than the judicial exception. The additional elements are:
“wherein the weight expressed in the second decimal point notation is maintained in learning,” – this limitation amounts to conventional post-calculation use of results (maintaining/using a quantized weight during learning) and is therefore a well-understood, routine, and conventional (WURC) activity in neural-network systems. It does not provide an inventive concept, either individually or as an ordered combination with the judicial exception recited in claim 1. See MPEP § 2106.05(d).
“and wherein the weight expressed in the second decimal point notation is used in inference.” – this limitation amounts to conventional post-calculation use of results (maintaining/using a quantized weight during inference) and is therefore a well-understood, routine, and conventional (WURC) activity in neural-network systems. It does not provide an inventive concept, either individually or as an ordered combination with the judicial exception recited in claim 1. See MPEP § 2106.05(d).
Regarding claim 4:
Claim 4 – Step 1 – Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is to a process.
Claim 4 – Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Yes, the claim recites an abstract idea. Claim 4 depends from claim 1 which has been determined to recite an abstract idea (see rejection of claim 1).
Claim 4 – Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application?
No, there are no additional elements that integrate the judicial exception into a practical application. The additional elements:
“wherein the Kalman gain is expressed in an arbitrary decimal-point format.” – this merely limits the abstract calculations to a particular technological environment/representation (a numeric decimal-point format). The specification expressly states the Kalman gain “can be expressed in any floating-point format … for example float32 or bfloat16” (spec. ¶[0037]), and treats decimal-point notation as a generic/conventional selectable representation (floating- or fixed-point; word length; mantissa/decimal part; exponent). Constraining the math to “an arbitrary decimal-point format” is therefore a field-of-use/representation choice, not an improvement to computer functionality; it does not integrate the exception into a practical application. See MPEP § 2106.05(h); § 2106.04(d).
Claim 4 – Step 2B – Does the claim recite additional elements that amount to significantly more than the judicial exception?
No, there are no additional elements that amount to significantly more than the judicial exception. The additional elements are:
“wherein the Kalman gain is expressed in an arbitrary decimal-point format.” – choosing among known numeric formats (float/fixed; decimal-point variants) is well-understood, routine, and conventional (WURC) in generic computing environments; considered alone or in combination with claim 1, it does not add an inventive concept. See MPEP § 2106.05(d).
Regarding claim 5
Claim 5 – Step 1 – Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is to a machine (an online learning device).
Claim 5 – Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Yes, the claim recites an abstract idea.
“wherein the compressor circuit compresses a range of possible values of a Kalman gain before an update,” – this limitation recites manipulation of numerical values associated with a Kalman gain (compressing a range of possible values), which is a mathematical concept. See MPEP § 2106.04(a)(2).
“wherein the operator circuit performs a first operation to obtain a Kalman gain after the update from the compressed Kalman gain before the update using an expanded Kalman filter method” – this limitation recites determining an updated Kalman gain using an expanded Kalman filter method (i.e., an algorithmic calculation using numerical values), which is a mathematical concept. See MPEP § 2106.04(a)(2).
“and performs a second operation to update a weight by adding a weight before the update to a result obtained by multiplying the Kalman gain in which the range of the possible values of the Kalman gain is expanded by an error between a training signal and an inference result in which the weight before the update is used” – this limitation recites a weight-update rule expressed as multiplication and addition using a Kalman gain and an error term, which is a mathematical concept. See MPEP § 2106.04(a)(2).
“wherein the expander circuit expands the range of the possible values of the Kalman gain after the update,” – this limitation recites additional numerical manipulation (expanding a range of possible values), constituting a mathematical concept. See MPEP § 2106.04(a)(2).
Claim 5 – Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application?
No. There are no additional elements that integrate the judicial exception into a practical application. The additional elements:
“a processor comprising a memory, a compressor circuit, an operator circuit, and an expander circuit,” – this limitation recites generic computing components and generic circuitry in functional terms, and does not recite a specific technological improvement that integrates the judicial exception into a practical application (i.e., it amounts to using generic components as tools to perform the claimed calculations). See MPEP § 2106.05(f).
“the updated weight being quantized by being expressed in a second decimal point notation in which a word length and a length of a decimal part are shorter than the weight before the update expressed in a first decimal point notation,” – this limitation broadly recites expressing the computed weight in a shorter-precision notation. As claimed, it does not recite a particular technological mechanism (e.g., a specific quantization engine, memory layout, or hardware implementation improving computer functionality) that integrates the judicial exception into a practical application. See MPEP § 2106.05(f).
“and wherein the processor is configured to store the updated weight in the memory;” – this limitation recites storing the output of the mathematical calculations, which is an insignificant extra-solution activity and does not integrate the exception into a practical application. See MPEP § 2106.05(g).
“and perform an inference process with the updated weight.” – this limitation broadly recites using the computed result in inference, without reciting a specific technological implementation that integrates the judicial exception into a practical application. See MPEP § 2106.05(h); MPEP § 2106.05(g).
Claim 5 – Step 2B – Does the claim recite additional elements that amount to significantly more than the judicial exception?
No. There are no additional elements that amount to significantly more than the judicial exception. The additional elements are:
“a processor comprising a memory, a compressor circuit, an operator circuit, and an expander circuit,” – this limitation recites generic processor/memory/circuitry performing well-understood, routine, and conventional (WURC) functions to execute the claimed operations, which does not add an inventive concept. See MPEP § 2106.05(d).
“the updated weight being quantized by being expressed in a second decimal point notation in which a word length and a length of a decimal part are shorter than the weight before the update expressed in a first decimal point notation,” – as broadly claimed, this is a well-understood, routine, and conventional (WURC) precision/representation choice applied to the computed result and does not add significantly more than the judicial exception. See MPEP § 2106.05(d).
“and wherein the processor is configured to store the updated weight in the memory;” – this limitation recites a generic processor configured to store a computed result in memory, which is a well-understood, routine, and conventional (WURC) activity in the art and does not add significantly more than the judicial exception. See MPEP § 2106.05(d).
“and perform an inference process with the updated weight.” – this limitation, as broadly claimed, recites a well-understood, routine, and conventional (WURC) step in neural networks. Merely performing inference with an updated weight does not add significantly more than the judicial exception. See MPEP § 2106.05(d).
Regarding claim 6
Claim 6 – Step 1 – Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is to a process.
Claim 6 – Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Yes, the claim recites an abstract idea.
Claim 6 – Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application?
No. There are no additional elements that integrate the judicial exception into a practical application. The additional elements:
“wherein the inference process is repeated until an error between the inference result and training data is equal to or less than a preset constant value.” – this limitation broadly recites repeating the inference process until a specified error threshold is met (i.e., a conventional convergence/termination criterion for an iterative procedure) and does not recite a particular technological implementation that meaningfully limits the judicial exception or integrates it into a practical application. See MPEP § 2106.04(d); MPEP § 2106.05(h).
Claim 6 – Step 2B – Does the claim recite additional elements that amount to significantly more than the judicial exception?
No. There are no additional elements that amount to significantly more than the judicial exception. The additional elements are:
“wherein the inference process is repeated until an error between the inference result and training data is equal to or less than a preset constant value.” – this limitation represents a well-understood, routine, and conventional (WURC) iterative stopping condition for a computational procedure (i.e., repeating until an error threshold is met) and does not provide an inventive concept either alone or in combination with the judicial exception recited in claim 1. See MPEP § 2106.05(d).
Regarding claim 7
Claim 7 – Step 1 – Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is to a machine (an online learning device, per base claim 5).
Claim 7 – Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Yes, the claim recites an abstract idea.
Claim 7 – Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application?
No. There are no additional elements that integrate the judicial exception into a practical application. The additional elements:
“wherein the inference process is repeated until an error between the inference result and training data is equal to or less than a preset constant value.” – this limitation broadly recites repeating the inference process until a specified error threshold is met (i.e., a conventional convergence/termination criterion for an iterative procedure) and does not recite a particular technological implementation that meaningfully limits the judicial exception or integrates it into a practical application. See MPEP § 2106.04(d); MPEP § 2106.05(h).
Claim 7 – Step 2B – Does the claim recite additional elements that amount to significantly more than the judicial exception?
No. There are no additional elements that amount to significantly more than the judicial exception. The additional elements are:
“wherein the inference process is repeated until an error between the inference result and training data is equal to or less than a preset constant value.” – this limitation represents a well-understood, routine, and conventional (WURC) iterative stopping condition for a computational procedure (i.e., repeating until an error threshold is met) and does not provide an inventive concept either alone or in combination with the judicial exception recited in claim 5. See MPEP § 2106.05(d).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1 and 3-7 are rejected under 35 U.S.C. 103 as being unpatentable over John K. Anderson et al. (US20150227802A1), henceforth ‘Anderson’, in view of William James Dally et al. (US20210056446A1), henceforth ‘Dally’.
Regarding claim 1, Anderson in view of Dally teaches an online learning method comprising:
“controlling a processor and memory to execute steps comprising:” – Anderson teaches this limitation. Anderson explicitly teaches control logic executed by a processor:
“… control logic … embodied as non-transitory computer readable media on a computer readable medium containing executable program instructions executed by a processor, controller or the like.” (Anderson, p. 2, ¶[0025])
And explicitly discloses storage devices that may include memory:
“A storage device is … any medium capable of storing processes or data, in any form, and may … include hard drives, memory, …” (Anderson, p. 2, ¶[0026])
“compressing a range of possible values of a Kalman gain before an update;” – Anderson teaches this limitation in part. Anderson teaches the Extended Kalman Filter (EKF) / Unscented Kalman Filter (UKF) context (e.g., innovation/error; Kalman gain; weight adaptation):
“the error between the target output and the network output.” (Anderson, pg. 3, ¶[0037])
“the Kalman gain may be computed ... an estimation update may be performed.” (Anderson, pg. 4, ¶[0039])
“obtaining a Kalman gain after the update from the compressed Kalman gain before the update using an expanded Kalman filter method;” – Anderson teaches this limitation in part. Anderson teaches the Kalman-gain computation and update context:
“the statistics for the update may be computed and the Kalman gain may be computed therefrom. In response to computing the Kalman gain, an estimation update may be performed.” (Anderson, pg. 4, ¶[0039])
And explicitly discloses the training method type:
“may be a method … and may be selected from Extended Kalman Filter (EKF) …” (Anderson, p. 3, ¶[0034])
“and updating a weight by adding a weight before the update to a result obtained by multiplying the Kalman gain in which the range of the possible values of the Kalman gain is expanded by an error between a training signal and an inference result in which the weight before the update is used,” – Anderson teaches this limitation. Anderson explicitly teaches the innovation/error and Kalman-based weight adaptation:
“the error between the target output and the network output.” (Anderson, pg. 3, ¶[0037])
“weights of the network may be adapted using EKF equations” (Anderson, pg. 3, ¶[0037])
“the updated weight …” – Anderson teaches this limitation in part. Anderson explicitly teaches that weights are changed (updated/adapted):
“the weights of the network may be updated in step 415.” (Anderson, p. 3, ¶[0032])
“weights of the network may be adapted using EKF equations” (Anderson p. 3, ¶[0037])
“In response to computing the Kalman gain, an estimation update may be performed.” (Anderson p. 4, ¶[0039])
Anderson does not teach these limitations:
“compressing a range of possible values …”
“… from the compressed Kalman gain before the update …”
“expanding the range of possible values of the Kalman gain after the update,”
“the … being quantized by being expressed in a second decimal point notation in which a word length and a length of a decimal part are shorter than the weight before the update expressed in a first decimal point notation;”
“storing the updated weight in the memory;”
“and performing an inference process with the updated weight.”
Dally, however, teaches these limitations:
“compressing a range of possible values …” – Dally teaches compression via representing values in reduced/log domains by conversion:
“the sum … is converted into logarithmic format … by the conversion unit 130” (Dally, p. 6, ¶[0072])
“Logarithmic based arithmetic may be used to reduce area and energy consumption, particularly for neural network computations requiring many multiplications and summing of products.” (Dally, p. 6, ¶[0073])
“… from the compressed Kalman gain before the update …” – Dally teaches performing neural-network core math while converting formats as needed (i.e., compute flow while values are represented in compressed/log formats):
“The system 100, 155, and/or 215 may be used to perform dot product and multiply accumulate functions that are core math functions for linear algebra involved in deep learning inference or training systems.” (Dally, pg. 6, ¶[0075])
“Partial sums and sums may be computed for values represented in a logarithmic format to efficiently perform dot product operations, multiply accumulate operations, … for neural network training and inferencing.” (Dally, pg. 10, ¶[0110])
“expanding the range of possible values of the Kalman gain after the update,” – Dally teaches increasing precision/range at update/accumulation time (i.e., widening representation mid-computation):
“Tensor Cores operate on 16-bit floating point input data with 32-bit floating point accumulation.” (Dally, pg. 10, ¶[0110])
“The 16-bit floating point multiply … results in a full precision product that is then accumulated using 32-bit floating point addition with the other intermediate products…” (Dally, pg. 10, ¶[0110])
“the updated weight being quantized by being expressed in a second decimal point notation in which a word length and a length of a decimal part are shorter than the weight before the update expressed in a first decimal point notation;” – Dally teaches using compressed/low-precision representations (including logarithmic format conversions) to reduce area/energy for neural network computations:
“Logarithmic based arithmetic may be used to reduce area and energy consumption, particularly for neural network computations …” (Dally, p. 6, ¶[0073])
“the sum … is converted into logarithmic format … by the conversion unit 130” (Dally, p. 6, ¶[0072])
“storing the updated weight in the memory;” – Dally expressly teaches storing weights in a memory structure (weight buffer):
“The scalar inference accelerator 700 includes a weight buffer 701 … The weight buffer 701 stores weight values w … values are stored to the weight buffer 701 … by the LSU 554 …” (Dally, p. 17, ¶[0177])
“and performing an inference process with the updated weight.” – Dally teaches performing inference with weights stored in memory, as shown by the Scalar Inference Accelerator 700, where weight values wᵢ from the weight buffer 701 are used by Multiplier 710 and downstream logic to generate an output activation:
“FIG. 7A illustrates … scalar inference accelerator 700 … includes a weight buffer 701 … [that] stores weight values wᵢ …” (Dally, p. 17, ¶[0177])
“Weights and input activations … are ‘multiplied’ via addition to produce … output activations” (Dally, p. 17, ¶[0176])
Anderson provides the EKF/UKF neural-network workflow with Kalman gain computation and weight updates, and Dally provides well-known low-precision/logarithmic numeric representations, conversions, and mixed-precision accumulation to reduce compute/memory/energy for deep-learning training/inference. A POSITA would have been motivated to apply Dally’s low-precision/representation-conversion techniques and weight buffering/inference accelerator teachings to Anderson’s Kalman-based online learning method to obtain predictable efficiency improvements (e.g., reduced resource use) while performing the same online learning computations.
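To further illustrate the combination rationale, the following hypothetical sketch (the Examiner’s illustration, not code from Anderson or Dally; the function names, the placeholder gain update, and the numeric values are assumptions) traces the claimed compress/update/expand/weight-update sequence using the combined teachings:

    import math

    # Hypothetical sketch: Anderson supplies the Kalman-gain-driven weight update;
    # Dally supplies compressed (e.g., log-domain) representation and re-expansion.
    def online_update(w_before, k_gain, target, inference):
        k_compressed = math.log(k_gain)        # compress range (log-format conversion)
        k_after = k_compressed                 # placeholder for the EKF gain update
        k_expanded = math.exp(k_after)         # expand range after the update
        error = target - inference(w_before)   # innovation/error (Anderson ¶[0037])
        return w_before + k_expanded * error   # estimation update applied to the weight

    w_new = online_update(w_before=0.5, k_gain=0.2, target=1.0,
                          inference=lambda w: 0.8 * w)
    # w_new == 0.5 + 0.2 * (1.0 - 0.4) == 0.62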
Regarding claim 3, Anderson in view of Dally teaches the online learning method according to claim 1,
“wherein the weight … is maintained in learning,” – Anderson teaches this limitation in part. Anderson teaches training/weight update:
“compute the innovation (the error between the target output and network output.) … adapt weights using the EKF equations” (Anderson, FIG. 8A)
Anderson does not teach:
“… expressed in the second decimal point notation …”
“and wherein the weight expressed in the second decimal point notation is used in inference.”
Dally, however, teaches these limitations:
“… expressed in the second decimal point notation …” – Dally teaches using compressed/low-precision representations in training contexts (i.e., keeping values in such representations while performing training math):
“Logarithmic-based arithmetic may be used to reduce area and energy consumption, particularly for neural network computations” (Dally, p. 6, ¶[0073])
“The system … may be used to perform dot product and multiply accumulate functions that are core math functions for linear algebra involved in deep learning inference or training systems.” (Dally, p. 6, ¶[0075])
“and wherein the weight expressed in the second decimal point notation is used in inference.” – Anderson does not teach this limitation. Dally, however, teaches this limitation. Dally expressly teaches performing inference using stored weights in an inference accelerator architecture including a weight buffer feeding arithmetic operations:
“Scalar Inference Accelerator 700” with a “Weight Buffer 701” feeding the arithmetic pipeline (Partial Sums Generation Unit 705, Addition Unit 725, etc.). (Dally, Fig. 7A)
Dally also discloses dedicated tensor cores:
“configured to perform deep learning matrix arithmetic … for neural network training and inferencing” (Dally, p. 10, ¶[0109])
And Dally discloses low-precision use in inference:
“operate on 16-bit floating point input data with 32-bit floating point accumulation” (Dally, p. 10, ¶[0109])
A POSITA would have been motivated to apply Dally’s inference accelerator and weight buffering/usage teachings to Anderson’s Kalman-based online learning update method to use the updated (and quantized) weights in subsequent inference and to maintain them for continued learning, yielding the predictable result of enabling inference and continued learning with the updated weights.
Regarding claim 4, Anderson in view of Dally teaches the online learning method according to claim 1,
“wherein the Kalman gain …” – Anderson teaches this limitation in part. Anderson teaches the Kalman gain in the training loop:
“Then, the statistics for the update may be computed and the Kalman gain may be computed therefrom. In response to computing the Kalman gain, an estimation update may be performed.” (Anderson, p. 4, ¶[0039])
Anderson does not teach a particular numeric format for expressing the Kalman gain:
“… is expressed in an arbitrary decimal-point format”
Dally, however, teaches this limitation:
“… is expressed in an arbitrary decimal-point format” – Dally teaches expressing NN values in various numeric formats (including floating-point, i.e., decimal-point, formats) and converting between them for computational efficiency:
“Compared with performing multiplication on integer, fixed-point, or floating-point format values, performing multiplication on logarithmic format values is straightforward and energy efficient …” (Dally, p. 1, ¶[0004])
“Convert the sum of the plurality of input values into a logarithmic format” (Dally, Fig. 2D)
“Tensor Cores operate on 16-bit floating point input data with 32-bit floating point accumulation.” (Dally, p. 10, ¶[0110])
These passages teach explicit standard number formats used in NNs, representation conversion/selection, choosing a particular floating-point word length, and promoting/accumulating at another precision.
A person of ordinary skill in the art (POSITA) would apply Dally’s representation choices to Anderson’s Kalman-gain workflow, selecting an arbitrary decimal-point format for the Kalman gain to meet accuracy/efficiency goals, a predictable implementation detail in light of Dally.
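As a further illustration (hypothetical sketch; the variable names are the Examiner’s, and numpy’s float32/float16 merely stand in for the float32/bfloat16 examples in spec. ¶[0037]), expressing a value in an arbitrarily selected floating-point (“decimal-point”) format is a routine representation choice:

    import numpy as np

    k_gain = 0.12345678901234567      # double-precision Python float
    k_f32 = np.float32(k_gain)        # 32-bit word, 23-bit mantissa (longer format)
    k_f16 = np.float16(k_gain)        # 16-bit word, 10-bit mantissa (shorter format)
    print(k_f32, k_f16)               # prints approximately: 0.12345679 0.1235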
Regarding claim 5, Anderson in view of Dally teaches an online learning device comprising:
“wherein the compressor circuit compresses a range of possible values of a Kalman gain before an update,” – Anderson teaches this limitation in part. Anderson teaches the Kalman-gain context:
“the Kalman gain may be computed … [then] an estimation update may be performed.” (Anderson, p. 4, ¶[0039])
“wherein the operator circuit performs a first operation to obtain a Kalman gain after the update from the compressed Kalman gain before the update using an expanded Kalman filter method” – Anderson teaches this limitation in part (Kalman filter math):
“9) COMPUTE THE STATISTICS FOR THE UPDATE. 10) COMPUTE THE KALMAN GAIN. 11) PERFORM THE ESTIMATION UPDATE.” (Anderson, FIG. 8B)
“and performs a second operation to update a weight by adding a weight before the update to a result obtained by multiplying the Kalman gain in which the range of the possible values of the Kalman gain is expanded by an error between a training signal and an inference result in which the weight before the update is used,” – Anderson teaches this limitation. Anderson teaches the error/innovation and weight adaptation with Kalman methods:
“compute the innovation (the error between the target output and network output.) … adapt weights using the EKF equations” (Anderson, FIG. 8A)
Anderson also teaches the Kalman-gain-driven estimation update (i.e., applying K to the error):
“Compute the Kalman gain … Perform the estimation update.” (Anderson, FIG. 8B)
“the updated weight …” – Anderson teaches this limitation in part. Anderson explicitly teaches that weights are changed (updated/adapted):
“the weights of the network may be updated in step 415.” (Anderson, p. 3, ¶[0032])
“weights of the network may be adapted using EKF equations” (Anderson p. 3, ¶[0037])
“In response to computing the Kalman gain, an estimation update may be performed.” (Anderson p. 4, ¶[0039])
Anderson does not teach these limitations:
“a processor comprising a memory, a compressor circuit, an operator circuit, and an expander circuit.”
“wherein the compressor circuit compresses …”
“… after the update from the compressed Kalman gain …”
“the updated weight being quantized by being expressed in a second decimal point notation …”
“and wherein the expander circuit expands the range of the possible values of the Kalman gain after the update,”
“and wherein the processor is configured to store the updated weight in the memory;”
“and perform an inference process with the updated weight.”
Dally, however, teaches these limitations:
“a processor comprising a memory, a compressor circuit, an operator circuit, and an expander circuit.” – Dally explicitly discloses a parallel processing unit (PPU) that is a:
“multi-threaded processor … implemented on one or more integrated circuit devices.” (Dally, p. 7, ¶[0076])
And further discloses memory:
“The buffer is a region in a memory that is accessible (e.g., read/write) by both the host processor and the PPU 300.” (Dally, p. 7, ¶[0082])
And discloses conversion units, arithmetic units, and inference accelerator structures:
“… the conversion unit 130 may be included in the scalar inference accelerator 700 …” (Dally, p. 18, ¶[0182])
“Each core 550 may include a fully-pipelined, single-precision, double-precision, and/or mixed precision processing unit that includes a floating point arithmetic logic unit and an integer arithmetic logic unit.” (Dally, p. 10, ¶[0108])
“wherein the compressor circuit compresses …” – Dally teaches a compression mechanism (representing values in a reduced/log domain via conversion):
“Logarithmic-based arithmetic may be used to reduce area and energy consumption, particularly for neural network computations” (Dally, pg. 6, ¶[0073])
“the sum of the plurality of input values is converted into logarithmic format … by the conversion unit 130.” (Dally, pg. 6, ¶[0072])
“… after the update from the compressed Kalman gain …” – Dally teaches the operator/compute path for NN math while values reside in compressed formats:
“The system 100, 155, and/or 215 may be used to perform dot product and multiply accumulate functions that are core math functions for linear algebra involved in deep learning inference or training systems.” (Dally, pg. 6, ¶[0075])
And teaches explicit conversion in the path:
“converted into logarithmic format … by the conversion unit 130” (Dally, pg. 6, ¶[0072])
“the updated weight being quantized by being expressed in a second decimal point notation in which a word length and a length of a decimal part are shorter than the weight before the update expressed in a first decimal point notation,” – Dally teaches using compressed/low-precision representations (including logarithmic format conversions) to reduce area/energy for neural network computations:
“Logarithmic based arithmetic may be used to reduce area and energy consumption, particularly for neural network computations …” (Dally, p. 6, ¶[0073])
“the sum … is converted into logarithmic format … by the conversion unit 130” (Dally, p. 6, ¶[0072])
“and wherein the expander circuit expands the range of the possible values of the Kalman gain after the update,” – Dally teaches expansion mid-computation (promotion to a wider precision/range):
“Tensor Cores operate on 16-bit floating point input data with 32-bit floating point accumulation. … The 16-bit floating point multiply … [is] accumulated using 32-bit floating point addition …” (Dally, pg. 10, ¶[0110])
(In addition, drawings from Dally further support the “expander … expands the range … after the update” concept: Fig. 5A shows expansion (e.g., 16-bit → 32-bit accumulation), while Fig. 1B and Fig. 2C show the Conversion Unit 130 that performs the post-update format conversion consistent with an “expander”.)
“and wherein the processor is configured to store the updated weight in the memory;” – Dally expressly teaches storing weights in a memory structure (weight buffer):
“The scalar inference accelerator 700 includes a weight buffer 701 … The weight buffer 701 stores weight values w … values are stored to the weight buffer 701 … by the LSU 554 …” (Dally, p. 17, ¶[0177])
“and perform an inference process with the updated weight.” – Dally teaches performing inference with weights stored in memory, as shown by the Scalar Inference Accelerator 700, where weight values wᵢ from the weight buffer 701 are used by Multiplier 710 and downstream logic to generate an output activation:
“FIG. 7A illustrates … scalar inference accelerator 700 … includes a weight buffer 701 … [that] stores weight values wᵢ …” (Dally, p. 17, ¶[0177])
“Weights and input activations … are ‘multiplied’ via addition to produce … output activations” (Dally, p. 17, ¶[0176])
Dally’s stated benefits regarding reduced area/energy and efficient neural network computation motivate applying Dally’s representation conversion/low-precision and inference-accelerator teachings to Anderson’s Kalman-based online learning system to achieve predictable efficiency gains while performing the same learning/inference tasks.
Regarding claim 6, Anderson in view of Dally teaches the online learning method according to claim 1,
“wherein the inference process is repeated until an error between the inference result and training data is equal to or less than a preset constant value.” – Anderson teaches this limitation in part. Anderson teaches performing online learning using an error between the network output and target output and updating weights based on this error:
“the error between the target output and the network output.” (Anderson, p. 3, ¶[0037])
Anderson further teaches performing estimation updates in response to computing the Kalman gain:
“In response to computing the Kalman gain, an estimation update may be performed.” (Anderson, p. 4, ¶[0039])
These passages teach an iterative learning process driven by an error signal, where weights are repeatedly updated as new error values are computed.
Anderson does not explicitly teach repeating the inference process until:
“… an error between the inference result and the training data is equal to or less than a preset constant value.”
However, because Anderson’s learning method already performs iterative weight updates based on an error signal, a person of ordinary skill in the art would have recognized that such iterative learning procedures require a termination condition to determine when sufficient learning has occurred. One of the finite number of predictable termination criteria for such an iterative process is stopping when the error reaches or falls below a predefined threshold. Implementing such an error-threshold stopping condition would have been an obvious design choice to ensure convergence of the learning process while avoiding unnecessary additional iterations.
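For illustration only, such a conventional error-threshold stopping condition may be sketched as follows (hypothetical code; epsilon, the update rule, and the iteration guard are assumptions, not claim terms or teachings of the references):

    # Hypothetical sketch: repeat inference/update until the error between the
    # inference result and the training data is at or below a preset constant value.
    def train_until_converged(w, k_expanded, target, inference,
                              epsilon=1e-3, max_iters=10_000):
        for _ in range(max_iters):             # guard against non-convergence
            error = target - inference(w)      # error vs. training data
            if abs(error) <= epsilon:          # preset constant value reached: stop
                break
            w = w + k_expanded * error         # gain-weighted estimation update
        return w

    w_final = train_until_converged(w=0.0, k_expanded=0.5, target=1.0,
                                    inference=lambda w: w)
    # halts once |error| <= 1e-3, with w_final near 1.0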
It would have been obvious to a POSITA to modify Anderson’s EKF/UKF-based online learning method for neural network weight adaptation by incorporating Dally’s compressed/low-precision number representations and conversion techniques because Dally expressly teaches that logarithmic-based arithmetic and representation conversion may be used to reduce area and energy consumption for neural network computations requiring many multiplications and summing of products (Dally, p. 6, ¶¶[0072]-[0075]). Anderson’s online learning method likewise involves repeated numerical computations for weight adaptation using Kalman gain and error/innovation terms (Anderson, e.g., ¶¶[0032], [0037], [0039]), which includes multiply/add operations. A POSITA would therefore have been motivated to apply Dally’s techniques to Anderson to obtain the predictable result of reduced resource consumption while performing the same online learning computations.
Regarding claim 7, Anderson in view of Dally teaches the online learning device according to claim 5,
“wherein the inference process is repeated until an error between the inference result and training data is equal to or less than a preset constant value.” – Anderson teaches this limitation in part. Anderson teaches performing online learning using an error between the network output and target output and updating weights based on this error:
“the error between the target output and the network output.” (Anderson, p. 3, ¶[0037])
Anderson further teaches performing estimation updates in response to computing the Kalman gain:
“In response to computing the Kalman gain, an estimation update may be performed.” (Anderson, p. 4, ¶[0039])
These passages teach an iterative learning process driven by an error signal, where weights are repeatedly updated as new error values are computed.
Anderson does not explicitly teach repeating the inference process until:
“… an error between the inference result and the training data is equal to or less than a preset constant value.”
However, because Anderson’s learning method already performs iterative weight updates based on an error signal, a person of ordinary skill in the art would have recognized that such iterative learning procedures require a termination condition to determine when sufficient learning has occurred. One of the finite number of predictable termination criteria for such an iterative process is stopping when the error reaches or falls below a predefined threshold. Implementing such an error-threshold stopping condition would have been an obvious design choice to ensure convergence of the learning process while avoiding unnecessary additional iterations.
It would have been obvious to a POSITA to modify Anderson’s EKF/UKF-based online learning method for neural network weight adaptation by incorporating Dally’s compressed/low-precision number representations and conversion techniques because Dally expressly teaches that logarithmic-based arithmetic and representation conversion may be used to reduce area and energy consumption for neural network computations requiring many multiplications and summing of products (Dally, p. 6, ¶¶[0072]-[0075]). Anderson’s online learning method likewise involves repeated numerical computations for weight adaptation using Kalman gain and error/innovation terms (Anderson, e.g., ¶¶[0032], [0037], [0039]), which includes multiply/add operations. A POSITA would therefore have been motivated to apply Dally’s techniques to Anderson to obtain the predictable result of reduced resource consumption while performing the same online learning computations.
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Paul Coleman whose telephone number is (571)272-4687. The examiner can normally be reached Mon-Fri.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, David Yi can be reached at (571) 270-7519. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/PAUL COLEMAN/Examiner, Art Unit 2126
/DAVID YI/Supervisory Patent Examiner, Art Unit 2126