Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Claims 1-20 are presented for examination.
This is a Final Action.
Response to Arguments
Applicant's arguments filed 12/19/2025 have been fully considered but they are not persuasive.
With respect to the 35 U.S.C. § 101 abstract idea rejection, Applicant makes the following arguments:
1. “Applicant submits that claim 1 is directed to patent eligible subject matter at least because the office action has not met its burden of establishing prima facie unpatentability of claim 1. Any rejection must be based on the ‘preponderance of evidence,’ MPEP 706(I). The examiner should only ‘reject a claim if, in view of the prior art and evidence of record, it is more likely than not that the claim is unpatentable.’ To satisfy this burden, the Office Action must present evidence and articulated reasoning sufficient to support its legal conclusions. A rejection that relies on conclusory statements or evaluates the claims at ‘a high level of generality’ is procedurally improper.”
Examiner respectfully disagrees with the applicant. The rejection specifically quoted and analyzed the limitations: selecting… from a first searchable subspace…; selecting… from a second searchable subspace…; and evaluating… one or more performance metrics. The rejection explained why those limitations constitute selection, comparison, and evaluation. The rejection separately addressed the additional elements under Step 2A Prong Two and Step 2B. Therefore, the rejection provided articulated reasoning and was not conclusory. Disagreement with the conclusion does not establish procedural impropriety.
2. Applicant argues, “The office action does not adequately support the rejection based on the evidence on record. First, the Office Action analyzes the claim language only at a ‘high level of generality’ without fully considering the interactions between all elements of the claims. The Office Action’s conclusory reduction of all of the above-recited elements as ‘generic data input and output steps’ is a conclusory interpretation at a high level of generality. Further, the Office Action does not contain any analysis of how all the elements of the claim – both the alleged abstract ideas and the alleged ‘additional elements’ – interact.”
Examiner respectfully disagrees with the applicant. The rejection identified specific operative limitations (selection from subspaces and evaluation of metrics). The rejection did not merely label the entire claim as generic input/output; it separately addressed the abstract limitations and the additional computing elements. The interaction of the elements was considered in determining that the abstract selection/evaluation was implemented using generic computing components.
3. Applicant argues, “Second, the Office Action also does not give sufficient weight to the evidence on record demonstrating that the claimed invention provides a technical solution to a technical problem. Prior network architecture search methodologies failed to provide a meaningful way to optimize the quantization of the model in view of the characteristics of MAC operations; Systems and methods according to aspects of the present disclosure stand in stark contrast to past network searches, which have failed to recognize the benefits of jointly searching multiple search space to balance precision of layer values with the number of filters in the layer. Specifically, past techniques have failed to appreciate the effect of quantization on the optimum number of filers in a layer…; In this manner, a model may be quantized while accounting for performance.”
Examiner respectfully disagrees with the applicant. The eligibility analysis is based on the claim language. Claim 1 does not recite MAC operations, energy consumption calculations, any specific quantization algorithm, or any hardware-level improvements. The claim recites selecting values from subspaces and evaluating performance metrics. While the specification describes potential technical benefits, those features are not required by the claim. Therefore, the claim remains directed to abstract selection and evaluation implemented on generic computing components.
4. Applicant argues, “The improved techniques described in the specification are reflected in claim 1… Therefore, Applicant submits that claim 1 is directed to patent eligible subject matter at least because the Office Action has not met its burden of establishing prima facie unpatentability of claim 1.”
Examiner respectfully disagrees with the applicant. Restating the claim does not demonstrate a technological improvement. The claim recites selecting values from defined subspaces, evaluating performance metrics, and outputting a model. The claim does not recite a specific technological mechanism that improves computer functionality. Merely organizing parameter selection into multiple subspaces does not convert abstract decision-making into a technological improvement.
5. Applicant argues that, for at least the reasons discussed above with respect to independent claim 1, independent claims 14 and 20 are also patent eligible, further arguing that the claims respectively dependent thereon are also patent eligible.
Examiner respectfully disagrees with the applicant. Claims 14 and 20 recite limitations substantially similar to claim 1 and are therefore directed to the same abstract idea implemented using generic computer components. Accordingly, the § 101 rejection is maintained for those claims. Furthermore, the dependent claims do not recite additional technological improvements sufficient to integrate the abstract idea into a practical application. They merely add further limitations related to evaluation metrics or parameter selection, which remain within the abstract framework.
For the reasons discussed above, Applicant’s arguments are not persuasive. The rejection identified specific claim limitations, namely, selecting values from defined searchable subspaces and evaluating performance metrics, and properly determined that these limitations recite abstract selection, comparison, and evaluation activities. The additional elements, including the computing system, neural network model, receiving, and outputting steps, merely implement these abstract operations using generic computer components and do not recite a specific technological improvement to computer functionality or hardware operation. Although the specification describes potential technical benefits, such features are not required by the claim language. When considered as a whole and as an ordered combination, the claims remain directed to an abstract idea implemented on a generic computer and do not integrate the judicial exception into a practical application or amount to significantly more. Accordingly, the rejection under 35 U.S.C. § 101 is maintained.
Applicant makes the following arguments with respect to the 35 U.S.C. § 103 rejection of claim 1:
1. Applicant argues that Ahmed does not teach a second searchable subspace. Specifically, Applicant argues: (1) the Office Action relies on Ahmed’s “state space,” which includes static parameters; (2) static layer parameters do not constitute a second searchable subspace; and (3) therefore, Ahmed fails to disclose selecting values from two searchable subspaces as claimed.
Examiner respectfully disagrees with the applicant. Ahmed is relied upon for the first searchable subspace corresponding to quantization levels. Ovtcharov teaches architecture search parameters including number of filters and neurons per layer, which corresponds to layer size. Thus, the combination of Ahmed and Ovtcharov teaches selecting parameters from two distinct subspaces as claimed.
2. Applicant argues that Ahmed only searches quantization levels. Specifically, Applicant asserts that Ahmed only explores the search space by changing quantization levels, and therefore does not teach selecting values from multiple subspaces.
Examiner respectfully disagrees with the applicant. Ahmed teaches automated quantization parameter search using reinforcement learning. Ovtcharov teaches automated architecture hyperparameter search utilizing layer size variations. Because both optimize performance metrics (accuracy, cost, efficiency), combining these known optimization techniques to achieve predictable trade-offs would have been obvious.
3. Applicant argues that Ovtcharov does not cure Ahmed’s deficiency. Specifically, Applicant argues that Ovtcharov is not relied upon to teach multiple search subspaces and therefore does not remedy the alleged deficiency in Ahmed.
Examiner respectfully disagrees with the applicant. Ovtcharov explicitly discloses determining network topology parameters such as the number of filters and neurons in layers. These parameters correspond to the size of a layer and therefore represent a second searchable subspace. The combination of Ahmed’s quantization search and Ovtcharov’s architecture search teaches the claimed limitations.
4. Applicant argues that independent claims 14 and 20 are not obvious over the material cited in the Office Action, for reasons similar to those presented for claim 1.
Examiner respectfully disagrees with the applicant. Claims 14 and 20 recite limitations substantially similar to claim 1 and are therefore rejected for the same reasons as claim 1.
5. Applicant argues that claim 11 is not taught by Ahmed. Specifically, Applicant argues that claim 11 requires “a scaling factor negatively correlating to a difference in energy consumption between candidate and reference models.” Applicant asserts the Office relied on Ahmed’s state of relative accuracy, which measures an accuracy ratio rather than energy consumption differences.
Examiner respectfully disagrees with the applicant. Examiner has further clarified the mappings below in view of Applicant’s arguments. Specifically, as mapped in claim 11, Ahmed teaches evaluating candidate neural network models using performance metrics that jointly consider accuracy and computation cost, including memory access energy and multiply-accumulate operations (see Section 2.4). These metrics are incorporated into a reward-shaping function that balances accuracy against compute and memory cost, thereby providing a scaling factor reflecting the trade-off between model performance and energy consumption relative to the full-precision reference model.
6. With respect to dependent claims, applicant argues, “Dependent claims include additional limitations not taught by the art.”
Examiner respectfully disagrees with the applicant. The dependent claims add standard performance metrics and reinforcement learning components, which are disclosed in the references. Accordingly, they are rendered obvious for the same reasons.
Examiner respectfully reiterates: Ahmed teaches quantization search; Ovtcharov teaches architecture (layer size) search; and combining these known hyperparameter optimization techniques to improve neural network performance would have been obvious to a person of ordinary skill in the art.
Claim Rejections - 35 U.S.C. §101
35 U.S.C. §101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. § 101 as being directed to an abstract idea without significantly more.
With respect to independent claims 1, 14 and 20, specifically claim 1 recites “modifying, by the computing system, the reference neural network model to generate a candidate neural network model, wherein the candidate neural network model is generated by selecting one or more values from a first searchable subspace and one or more values from a second searchable subspace, wherein the first searchable subspace corresponds to a quantization scheme for quantizing one or more values of the candidate neural network model, and the second searchable subspace corresponds to a size of a layer of the candidate neural network model; evaluating, by the computing system, one or more performance metrics of the candidate neural network model”. These limitations, “selecting… from a first searchable subspace” and “selecting… from a second searchable subspace,” describe choosing values from defined sets, a form of comparison and decision making; “evaluating… one or more performance metrics” describes analyzing the candidate model’s characteristics. Such activities, namely selection, comparison, and evaluation, are types of mental processes that can be performed in the human mind through observation and evaluation. Accordingly, the claim recites a mental process and mathematical relationships, which can be performed using pen and paper.
Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. At Step 2A, Prong Two, claims 1, 14 and 20 recite the additional elements of “neural network model, computer system comprising one or more computing devices, non-transitory computer-readable media, a controller model”; “receiving… a reference neural network model”; and “outputting… a new neural network model based… on the… performance metrics.” These additional elements are generic data input and output steps performed “by a computing system” without any specialized hardware or specific improvement to computer functionality. The claim does not recite a particular algorithmic technique that improves how a computer operates; it merely applies generic computing to the mental process.
Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Claims 1, 14 and 20, at Step 2B, do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As explained with respect to Step 2A Prong Two, the additional elements, considered individually or in combination, do not add “significantly more” than the abstract idea; they are no more than well-understood, routine, and conventional computer functions that merely apply the abstract idea on a generic computer. There is no unconventional hardware or transformative data processing beyond the abstract idea; implementing model selection and evaluation on a generic computer is a conventional practice in the AI/ML field. Therefore, when viewed as an ordered combination, these additional elements do not integrate the abstract idea into a practical application and do not add significantly more than the abstract idea itself. Accordingly, claims 1, 14 and 20 are ineligible under 35 U.S.C. § 101.
Claims 2-13 are dependent claims and do not recite any additional elements that would amount to significantly more than the abstract idea. Specifically,
Claim 2. With respect to step 2A prong 1, “selecting, by the computing system, the one or more values from the first searchable subspace and the one or more values from the second searchable subspace using a controller model” recites the abstract idea of mental steps (observation and evaluation); a person can select from the first and second searchable subspaces, and the controller model can be a set of rules/logic that a person applies.
Claim 3. With respect to step 2A prong 2 “updating, by the computing system, the controller model based at least in part on the one or more performance metrics; and generating, by the computing system, the new neural network model using the updated controller model.” recites additional elements of insignificant extra solution activity. With respect to step 2B the recited insignificant extra solution activity is recited at a high level of generality, wherein the updating and generating steps are routine ML/optimization operations carried out on a generic computing system, which are well-understood, routine and conventional.
Claim 4. With respect to step 2A prong 2, “wherein the controller model comprises a reinforcement learning agent.” recites additional elements of insignificant extra solution activity. With respect to step 2B, the recited insignificant extra solution activity is recited at a high level of generality, wherein the “reinforcement learning agent” is a generic AI tool performing routine functions, which are well-understood, routine and conventional as taught by the prior art of record.
Claim 5. With respect to step 2A prong 2, “wherein the quantization scheme is selected from binary, modified binary, ternary, exponent, and mantissa quantization schemes.” recites additional elements of insignificant extra solution activity. With respect to step 2B, specifying the quantization schemes merely narrows the mental process to a known set of options; this is routine and conventional activity in the art of neural network optimization, does not add “significantly more” to the abstract idea, and is recited at a high level of generality, being well-understood, routine and conventional as taught by the prior art of record.
Claim 6. With respect to step 2A prong 2, “wherein the second searchable subspace corresponds to at least one of a quantity of output units and a quantity of filters.” recites additional elements of insignificant extra solution activity. With respect to step 2B, the recited insignificant extra solution activity is recited at a high level of generality; selecting the number of output units or filters in a neural network is a well-understood, routine and conventional design choice in the ML/AI field.
Claim 7. With respect to step 2A prong 1, “wherein the one or more performance metrics comprises an estimated energy consumption of the candidate neural network model directly computed using one or more look up tables or estimation functions.” recites the abstract idea of mental steps (observation and evaluation); a person can perform such calculations and estimations.
Claim 8. With respect to step 2A prong 2, “wherein the one or more performance metrics comprises a real-world energy consumption associated with implementation of the candidate neural network model on a real-world device.” recites additional elements of insignificant extra solution activity. With respect to step 2B, the recited insignificant extra solution activity is recited at a high level of generality; measurement using generic computing equipment and conventional device execution, without any recitation of technical improvement, is well-understood, routine and conventional.
Claim 9. With respect to step 2A prong 1, “determining, by the computing system, a reward based at least in part on the one or more performance metrics; and modifying, by the computing system, one or more parameters of the controller model based on the reward.” recites the abstract idea of mental steps (observation and evaluation); a person can determine a reward from known data and modify parameters based on a score, which are both calculations and evaluations.
Claim 10. With respect to step 2A prong 1, “wherein the controller model is configured to generate the candidate neural network model through performance of evolutionary mutations, and wherein modifying, by the computing system, the reference neural network model to generate a new neural network model comprises: determining, by the computing system, whether to retain or discard the candidate neural network model based at least in part on the one or more performance metrics.” recites the abstract idea of mental steps (observation and evaluation); a person can make decisions based on testing and logical reasoning, involving observation, evaluation, and decision making.
Claim 11. With respect to step 2A prong 2, “wherein the one or more performance metrics comprises a scaling factor which negatively correlates to a difference in energy consumption between the candidate neural network model and the reference neural network model.” recites additional elements of insignificant extra solution activity. With respect to step 2B, the recited insignificant extra solution activity is recited at a high level of generality; utilization of a performance metric reflecting correlations between models, using generic computing equipment and conventional device execution without any recitation of technical improvement, is well-understood, routine and conventional.
Claim 12. With respect to step 2A prong 1, “evaluating, by the computing system, an energy cost associated with each of two or more of the plurality of layers; modifying, by the computing system, each of the two or more plurality of layers in an order determined by a descending order of the energy costs associated with each of the two or more of the plurality of layers.” recites the abstract idea of mental steps (observation and evaluation); a person can evaluate the energy cost of each layer and determine an order of modification based on those costs.
Claim 13. With respect to step 2A prong 1, “wherein modifying, by the computing system, each of the two or more plurality of layers comprises: selecting, by the computing system, a first quantization scheme for quantizing values within a first layer and a second quantization scheme for quantizing values within a second layer, wherein the first quantization scheme is different than the second quantization scheme, and wherein the first layer is associated with a first energy cost higher than a second energy cost associated with the second layer.” recites the abstract idea of mental steps (observation and evaluation); a person can identify energy costs for layers (data evaluation), determine which layer has the higher cost (comparison), and select different quantization schemes based on the comparison (decision-making).
Claim 14 is similar to claim 1 hence rejected similarly.
Claim 15 is similar to claim 3 hence rejected similarly.
Claim 16 is similar to claim 7 hence rejected similarly.
Claim 17 is similar to claim 8 hence rejected similarly.
Claim 18 is similar to claim 9 hence rejected similarly.
Claim 19 is similar to the combination of claims 5 and 6 hence rejected similarly.
Claim 20 is similar to claim 1 hence rejected similarly.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Ahmed et al. (“ReLeQ: A Reinforcement Learning Approach for Deep Quantization of Neural Networks” – IDS) in view of Ovtcharov et al. (WO 2020/190542).
1. Ahmed teaches, A computer-implemented method for quantizing a neural network model while accounting for performance (Page 3, Col 1, first paragraph teaches – “RELEQ is automated method for efficient… different results”, Ahmed), the method comprising:
receiving, by a computing system comprising one or more computing devices, a reference neural network model (Fig 4, Page 2, Col 1, first paragraph teaches - “The RL agent starts from a full-precision previously trained model and learns the sensitivity of final classification accuracy with respect to the quantization of each layer”, Ahmed);
modifying, by the computing system, the reference neural network model to generate a candidate neural network model (Fig 4, Page 3, Col 1, Paragraph 2 - teaches steps sequentially through the layers and chooses a bitwidth from a predefined set…; Page 5, Col 1, last paragraph teaches - weights for this layer are quantized to the predicted bitwidth, Ahmed), wherein the candidate neural network model is generated by selecting one or more values from a first searchable subspace and one or more values from a second searchable subspace (Table 1, Page 3, Col 1: Sec 2.3, 1st paragraph - teaches the agent… chooses a bitwidth from a predefined set… per layer… table 1 shows parameters used to embed the state space, including layer-specific parameters such as layer dimensions and quantization level (bitwidth), Ahmed), wherein the first searchable subspace corresponds to a quantization scheme for quantizing one or more values of the candidate neural network model (Page 3, Col 1, Sec 2.3, 1st paragraph - teaches in order to consider the effects of previous layer’s quantization levels, the agent steps sequentially through the layers and chooses bitwidth… the set of bitwidths is (1, 2, 3, 4, 5, 6, 7, 8)… the quantization level of each layer, Ahmed);
evaluating, by the computing system, one or more performance metrics of the candidate neural network model (Page 3, section 2.4 Col 2, 2nd – 3rd paragraphs - teaches state of quantization… computed using the compute cost and memory cost of each layer.. state of relative accuracy… ratio of current accuracy… to accuracy of the network when it runs with full prevision, Ahmed); and
outputting, by the computing system, a new neural network model based at least in part on the one or more performance metrics (Page 5, Sec 3, Col 2, 1st paragraph - teaches after the learning process is complete and the agent has converged… perform a long retraining… then obtain the final accuracy for the quantized version of the network, Ahmed).
Ahmed further teaches
the second searchable subspace of the candidate neural network model (Table 1 - teaches layer dimensions listed under layer-specific/static parameters, Ahmed).
However,
Ahmed does not explicitly teach,
the second searchable subspace corresponds to a size of a layer of the candidate neural network model.
However, Ovtcharov teaches,
the second searchable subspace corresponds to a size of a layer of the candidate neural network model (Paragraph 105 - teaches determines network topology parameters including number of filters and number of neurons per layers).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which said subject matter pertains to combine Ahmed’s invention with Ovtcharov’s invention because both references are in the same field of machine learning model optimization, with a focus on quantization and architecture tuning.
2. The combination of Ahmed and Ovtcharov teach, The computer-implemented method of claim 1, wherein modifying, by the computing system, the reference neural network model to generate the candidate neural network model comprises:
selecting, by the computing system, the one or more values from the first searchable subspace (Page 3, Col 1: Sec 2.3, 1st paragraph - teaches the agent chooses a bitwidth from a predefined set … per layer, Ahmed) and the one or more values from the second searchable subspace (Table 1 – teaches parameters including layer dimensions and quantization level (bitwidth), Ahmed) using a controller model (Page 3, Section 2.3, Col 1: 2nd paragraph - teaches ReLeQ trains a reinforcement learning agent… policy and value networks… select quantization levels…, Ahmed).
3. The combination of Ahmed and Ovtcharov teach, The computer-implemented method of claim 2, wherein outputting, by the computing system, the new neural network model comprises:
updating, by the computing system, the controller model based at least in part on the one or more performance metrics (Page 3, Section 2.3, Col 1: 2nd paragraph - teaches reward signal… proportional to its accuracy after quantization and its benefits in terms of computation and memory cost… Page 5: col 2: 1st paragraph - use proximal policy optimization… to update the policy and value networks of RELEQ agent, Ahmed); and
generating, by the computing system, the new neural network model using the updated controller model (Page 5, Col 2: 1st paragraph - teaches after.. the agent has converged to a quantization level for each layer… perform long retraining… obtain final accuracy for the quantized version of the network, Ahmed).
4. The combination of Ahmed and Ovtcharov teach, The computer-implemented method of claim 2, wherein the controller model comprises a reinforcement learning agent (Page 5: col 2: 1st paragraph – teaches RELEQ trains a reinforcement learning agent… employs… Proximal policy Optimization … consists of both policy and value networks).
5. The combination of Ahmed and Ovtcharov teach, The computer-implemented method of claim 1, wherein the quantization scheme is selected from binary and ternary schemes (P. 3, section 2.3, Col 1: 2nd paragraph; section 2.4 Col 2: 2nd paragraph - teaches bitwidth from a predefined set (1..8) … includes ternary (2-bit) and binary (1-bit) as possible quantization levels, Ahmed).
The combination of Ahmed and Ovtcharov do not explicitly teach,
… modified binary, exponent, and mantissa quantization schemes.
However, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which said subject matter pertains to extend the quantization schemes selectable in RELEQ to include other well-known schemes such as modified binary, exponent-based, and mantissa-based quantization. At the time of the invention, quantization was a mature area of deep learning optimization, and these schemes were recognized as equivalents to binary and ternary for representing neural network weights and activations. The choice among these known schemes would have been a matter of design selection and optimization for hardware or accuracy requirements, yielding predictable results. Modified binary was understood to improve dynamic range over pure binary; exponent and mantissa quantization were recognized as common low-precision floating-point techniques. Substituting or adding such known schemes into RELEQ’s quantization search space would have been no more than the predictable use of prior art elements according to their established functions, and would have been well within the routine skill of an ordinary artisan, requiring no more than ordinary creativity.
6. The combination of Ahmed and Ovtcharov teach, The computer-implemented method of claim 1, wherein the second searchable subspace corresponds to at least one of a quantity of output units and a quantity of filters (Paragraph 29 - controller determines network topology parameters including number of filters per convolutional layer and number of neurons per fully connected layer, Ovtcharov).
7. The combination of Ahmed and Ovtcharov teach, The computer-implemented method of claim 1, wherein the one or more performance metrics comprises an estimated energy consumption of the candidate neural network model directly computed using one or more look up tables or estimation functions (P. 3, Sec 2.4, Col 2, paragraph 2 – teaches memory cost layer l… scaled by... memory access energy… computed… as part of state of quantization… linear scaling with bits per layer, Ahmed).
8. The combination of Ahmed and Ovtcharov teach, The computer-implemented method of claim 1, wherein the one or more performance metrics comprises a real-world energy consumption associated with implementation of the candidate neural network model on a real-world device (Fig 9, Sec 5.4, p. 8 - evaluated… on stripes custom accelerator… speedup and energy reduction benefits compared to 8-bits, Ahmed).
9. The combination of Ahmed and Ovtcharov teaches, The computer-implemented method of claim 2, wherein outputting, by the computing system, the new neural network model comprises:
determining, by the computing system, a reward based at least in part on the one or more performance metrics (P. 4, Sec 2.3, Col 1: 2nd paragraph and Sec 2.6, Col 2 - teaches receive a reward signal that is proportional to accuracy… and benefits… computation and memory cost… shaped to prioritize accuracy, Ahmed); and
modifying, by the computing system, one or more parameters of the controller model based on the reward (P. 4, Sec 2.7, Col 2 – PPO… updates the policy and value networks of the RELEQ agent, Ahmed).
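The reward determination and controller-parameter update recited in claim 9 can be illustrated with the following non-evidentiary sketch. The shaping coefficient, baseline, and learning rate are hypothetical (not taken from Ahmed), and a single-parameter REINFORCE-style step stands in for Ahmed's PPO update.

```python
import math

def shaped_reward(accuracy_ratio, quant_benefit, a=0.7):
    # Asymmetric shaping in the spirit of Ahmed Sec. 2.6: the
    # accuracy term is weighted more heavily than the compute/
    # memory benefit (the 0.7 weight is illustrative)
    return a * accuracy_ratio + (1.0 - a) * quant_benefit

def policy_prob(theta):
    # Toy one-parameter controller: probability of choosing the
    # aggressive low-bitwidth action
    return 1.0 / (1.0 + math.exp(-theta))

def update(theta, action, r, baseline=0.5, lr=0.1):
    # Policy-gradient step: parameters move toward actions whose
    # reward exceeds a baseline (stand-in for the PPO update)
    p = policy_prob(theta)
    grad = (1.0 - p) if action == 1 else -p
    return theta + lr * (r - baseline) * grad

theta = 0.0
r = shaped_reward(accuracy_ratio=0.98, quant_benefit=0.9)
theta = update(theta, action=1, r=r)
print(round(r, 3), round(theta, 4))  # → 0.956 0.0228
```

The sketch shows the two claimed steps in sequence: a reward computed from performance metrics, then controller parameters modified based on that reward.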
10. The combination of Ahmed and Ovtcharov teaches, The computer-implemented method of claim 2, wherein the controller model is configured to generate the candidate neural network model through performance of reinforcement learning (Page 4, Col 2, 2nd paragraph – PPO is an actor-critic style algorithm, so the RELEQ agent consists of both policy and value networks, Ahmed) and wherein modifying, by the computing system, the reference neural network model to generate a new neural network model comprises: determining, by the computing system, whether to retain or discard the candidate neural network model based at least in part on the one or more performance metrics (Fig 6, page 7, paragraph 2 – "Each point on these charts is a unique combination of bitwidths that are assigned to the layers of the network. The boundary of the solutions denotes the Pareto frontier and is highlighted by a dashed line. The solution found by RELEQ is marked out using an arrow and lays on the desired section of the Pareto frontier where the accuracy loss can be recovered through fine-tuning, which demonstrates the quality of the obtained solution.", Ahmed).
Ahmed does not explicitly teach, performing evolutionary mutations. However, both evolutionary mutation and reinforcement learning are well-known heuristic search and optimization techniques for exploring large design spaces and refining candidate solutions in machine learning model optimization. As recognized in the art, the two approaches are interchangeable methods for generating and refining candidate model configurations: evolutionary algorithms apply random or guided “mutations” to parameters and select high-performing variants, while RL applies policy-driven “actions” to adjust parameters and select high-reward variants.
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which said subject matter pertains, as a PHOSITA in 2018 familiar with both evolutionary algorithms and RL would have recognized that RL could be substituted for evolutionary mutations to perform the same role – generating candidate models, evaluating them, and discarding poor performers – with predictable results, namely an automated exploration of the architecture/quantization search space. This is a straightforward substitution of one known search technique for another, each performing the same abstract mathematical activity of estimating parameters of a neural network model based on feedback. (KSR Int'l Co. v. Teleflex Inc.)
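The asserted interchangeability can be illustrated with a non-evidentiary hill-climbing sketch: an evolutionary-style loop that mutates per-layer bitwidths and retains the fitter candidate performs the same generate/evaluate/retain-or-discard cycle that an RL agent drives via rewards. The fitness function, bit limits, and iteration count are hypothetical.

```python
import random

random.seed(0)

def fitness(bits):
    # Toy multi-objective metric: accuracy proxy saturates at 4
    # bits, while cost grows with every bit (both illustrative)
    acc = sum(min(b, 4) for b in bits) / (4 * len(bits))
    cost = sum(bits) / (8 * len(bits))
    return acc - 0.5 * cost

def mutate(bits):
    # Evolutionary "action": perturb one layer's bitwidth by +/- 1
    child = list(bits)
    i = random.randrange(len(child))
    child[i] = max(1, min(8, child[i] + random.choice([-1, 1])))
    return child

best = [8, 8, 8]  # start from a full 8-bit reference model
for _ in range(200):
    cand = mutate(best)
    # Retain or discard on the metric: the same decision an RL
    # agent encodes in its reward signal
    if fitness(cand) > fitness(best):
        best = cand
print(best, round(fitness(best), 3))
```

Replacing `mutate` with a policy-driven action and `fitness` with a reward leaves the loop structure unchanged, which is the substitution rationale applied above.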
11. The combination of Ahmed and Ovtcharov teaches, The computer-implemented method of claim 1, wherein the one or more performance metrics (Section 2.3 teaches "The underlying optimization problem is multi-objective (higher accuracy, lower compute and reduced memory); however preserving the accuracy is the primary concern", Ahmed) comprises a scaling factor (Section 2.6 teaches "Reward Shaping formulation provides the asymmetry and puts more emphasis on maintaining the accuracy… The reward uses the same terms of State of Quantization and State of Relative Accuracy" – the reward shaping function scales the optimization objective using factors derived from quantization and accuracy metrics, Ahmed) which negatively correlates to a difference in energy consumption (Section 2.4 teaches "State of quantization is a metric to evaluate the benefit of quantization… computed using the compute cost and memory cost of each layer" and "Compute cost of layer l is the number of Multiply-Accumulate Operations (MAcc)… memory cost is the number of weights scaled by the ratio of memory access energy to computation energy" – thus explicitly teaching that compute cost and memory cost correspond to energy consumption differences between models with different quantization levels, further tying the metric to the energy consumption characteristics of the model, Ahmed) between the candidate neural network model (Section 2.6 teaches that the reward formulation for ReLeQ aims to preserve accuracy and minimize the bitwidth of the layers simultaneously – lower bitwidth reduces compute and memory cost (energy) but may reduce accuracy, creating the negative correlation described in the claim, Ahmed) and the reference neural network model (Section 2.3 teaches "The state of Relative Accuracy is defined as the ratio of the current accuracy… to the accuracy of the network when it runs with full precision" – thus disclosing that the full-precision network functions as the reference neural network model for performance comparison, Ahmed).
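The claimed negative correlation can be illustrated with a non-evidentiary sketch in which a scaling factor derived from a bitwidth-weighted cost proxy falls as the candidate's energy consumption rises toward (or beyond) the reference's. The MAC counts and cost model below are hypothetical.

```python
def energy_proxy(bits, macs):
    # Toy compute/memory cost: per-layer MAC count weighted by
    # bitwidth, echoing Ahmed's state-of-quantization costs
    return sum(b * m for b, m in zip(bits, macs))

def scaled_reward(acc_ratio, cand_bits, ref_bits, macs):
    ref_e = energy_proxy(ref_bits, macs)
    cand_e = energy_proxy(cand_bits, macs)
    # Scaling factor shrinks as candidate energy grows: it
    # negatively correlates with the difference (cand_e - ref_e)
    scale = ref_e / cand_e
    return acc_ratio * scale

macs = [100, 50]  # hypothetical per-layer MAC counts
low = scaled_reward(0.97, cand_bits=[2, 2], ref_bits=[8, 8], macs=macs)
high = scaled_reward(0.97, cand_bits=[8, 8], ref_bits=[8, 8], macs=macs)
print(round(low, 2), round(high, 2))  # → 3.88 0.97
```

A candidate well below the full-precision reference's energy receives a larger scaled reward, which is the relationship the claim recites.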
12. The combination of Ahmed and Ovtcharov teaches, The computer-implemented method of claim 1, wherein the reference neural network model comprises a plurality of layers (Fig 4, Ahmed), and wherein the method further comprises:
evaluating, by the computing system, an energy cost associated with each of two or more of the plurality of layers (Page 3, Col 1, paragraph 2 – the agent, consequently, receives a reward signal that is proportional to its accuracy after quantization and its benefits in terms of computation and memory cost; Page 3, Col 2, paragraph 2 – "state of quantization is a metric to evaluate the benefit of quantization for the network and it is calculated using the compute cost and memory cost of each layer"); and modifying, by the computing system, each of the two or more of the plurality of layers in an order determined by a descending order of the energy costs associated with each of the two or more of the plurality of layers (Page 11 – teaches "techniques for selecting quantization levels"… HAQ utilizes accuracy in the reward formulation and then adjusts the RL solution through an approach that sequentially decreases the layers' bitwidths to stay within a predefined resource budget; the agent sequentially steps through layers, and per-layer energy cost metrics are computed and could be used for ordering, Ahmed).
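The descending-order traversal recited in claim 12 can be illustrated with a brief non-evidentiary sketch; the per-layer energy figures are hypothetical.

```python
def layers_by_descending_energy(costs):
    # Return layer indices sorted so the most energy-hungry layer
    # is modified (e.g., quantized) first
    return sorted(range(len(costs)), key=lambda i: -costs[i])

layer_energy = [12.0, 40.5, 7.2, 33.1]  # hypothetical per-layer costs
order = layers_by_descending_energy(layer_energy)
print(order)  # → [1, 3, 0, 2]
```

Visiting layers in this order applies modifications where the evaluated energy cost (and hence the potential savings) is largest first.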
13. The combination of Ahmed and Ovtcharov teaches, The computer-implemented method of claim 12, wherein modifying, by the computing system, each of the two or more of the plurality of layers comprises:
selecting, by the computing system, a first quantization scheme for quantizing values within a first layer and a second quantization scheme for quantizing values within a second layer, wherein the first quantization scheme is different than the second quantization scheme, and wherein the first layer is associated with a first energy cost higher than a second energy cost associated with the second layer (Table 2, Page 7: col 1, last paragraph, Ahmed).
Claim 14 is similar to claim 1 and is hence rejected similarly.
Claim 15 is similar to claim 3 and is hence rejected similarly.
Claim 16 is similar to claim 7 and is hence rejected similarly.
Claim 17 is similar to claim 8 and is hence rejected similarly.
Claim 18 is similar to claim 9 and is hence rejected similarly.
Claim 19 is similar to the combination of claims 5 and 6 and is hence rejected similarly.
Claim 20 is similar to claim 1 and is hence rejected similarly.
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AMRESH SINGH whose telephone number is (571)270-3560. The examiner can normally be reached Monday-Friday 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann J. Lo can be reached at (571) 272-9767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/AMRESH SINGH/Primary Examiner, Art Unit 2159