DETAILED ACTION
Notice of AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 12/23/2025 has been entered.
Response to Amendment
Applicant’s Amendment and remarks dated 12/23/2025 have been considered. Claims 8-20 are pending.
Claim Objections. The objections to claims 8 and 12 are withdrawn in view of Applicant’s amendments to such claims.
35 U.S.C. 112(b) Rejections. The rejection of claim 13 under 35 U.S.C. 112(b) is withdrawn in view of Applicant’s amendments to such claim.
Response to Arguments
On page 7 of Applicant’s 12/23/2025 Amendment and remarks, Applicant asserts that no new matter is introduced by amendment.
The examiner agrees that the instant specification provides sufficient written description support for the claim amendments.
On pages 8-9 of Applicant’s 12/23/2025 Amendment and remarks, with respect to the rejection of claim 8 under 35 U.S.C. 101, Applicant argues that “when considered in its entirety” claim 8 is not a mental process.
The examiner respectfully disagrees. First, the office action identifies at least 4 mental processes (“define a first neural network based upon a first selected network architecture”, “assess a first task accuracy of the first neural network”, “assess a first metric of the first neural network... based on interpolating entries of a lookup table”, and “determine a first network hardware-aware score based upon the assessed first task accuracy and assessed first metric”), and Applicant has not specifically rebutted any of these findings of a mental process.
Second, when evaluating claim 8 in its entirety, the claim is directed to assessing metrics and task accuracy of a neural network, where such “assessment” is a mental process. While the examiner agrees that the “implement the first neural network in hardware based on the first network hardware-aware score satisfying a first criterion” limitation is not a mental step, such limitation is examined under Step 2A, Prong 2 and Step 2B, and has been found to simply be akin to adding the words “apply it” (or an equivalent) with the judicial exception, because the limitation attempts to cover a solution to an identified problem with no restriction on how the result is accomplished, or provides no description of the mechanism for accomplishing the result (e.g., any type of implementation based on a criterion being met). Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea (See MPEP 2106.05(f)).
On pages 9-10 of Applicant’s 12/23/2025 Amendment and remarks, with respect to the rejection of claim 8 under 35 U.S.C. 101, Applicant argues that claim 8 relates to an improvement “to the operation of the hardware-aware neural architecture technology” and is not a “generic computer implementation.”
The examiner respectfully disagrees. MPEP 2106.04(d)(1) explains that, when evaluating improvements in the functioning of a computer or other technology, the examiner should consider the following:
In short, first the specification should be evaluated to determine if the disclosure provides sufficient details such that one of ordinary skill in the art would recognize the claimed invention as providing an improvement. The specification need not explicitly set forth the improvement, but it must describe the invention such that the improvement would be apparent to one of ordinary skill in the art. Conversely, if the specification explicitly sets forth an improvement but in a conclusory manner (i.e., a bare assertion of an improvement without the detail necessary to be apparent to a person of ordinary skill in the art), the examiner should not determine the claim improves technology. Second, if the specification sets forth an improvement in technology, the claim must be evaluated to ensure that the claim itself reflects the disclosed improvement. That is, the claim includes the components or steps of the invention that provide the improvement described in the specification.
First, the examiner respectfully disagrees that the specification provides sufficient details that would enable one of ordinary skill in the art to recognize the claimed invention as providing an improvement to a computer or other technology. While paras. 0034-0038 describe conventional Hardware-in-the-Loop (HIL) limitations, simply using a machine learning model to simulate hardware (rather than using actual hardware) is not an improvement to technology, at least because the claims do not provide any specific improvements to hardware simulators. And while paras. 0034-0038 explain that look-up tables are limited to the values in such tables, simply using interpolation techniques to estimate values not present in the look-up tables is a mental process and not an improvement to any technology.
Second, the examiner notes that the claim does not reflect any improvement. The claim does not provide any specific technological improvement to hardware simulations or to look-up table prediction.
Applicant’s citation to Ex parte Desjardins on page 9 also does not support Applicant’s argument. In Desjardins, the limitation “adjust the first values of the plurality of parameters to optimize performance of the machine learning model on the second machine learning task while protecting performance of the machine learning model on the first machine learning task” reflected the improvement disclosed in the specification, and was itself an improvement to how machine learning models are trained to learn new tasks while protecting knowledge about previous tasks to overcome the problem of “catastrophic forgetting” encountered in continual learning systems. In contrast, there is no such improvement to machine learning training technologies (or any other technologies) with respect to the present claim 8 (or any of the dependent claims).
On pages 11-12 of Applicant’s 12/23/2025 Amendment and remarks, with respect to the rejection of claim 8 under 35 U.S.C. 103, Applicant argues that JIANG does not teach the entire “assess a first metric of the first neural network using a model predictor that comprises one or more machine learning models to generate the first metric based on interpolating entries of a lookup table” limitation.
The examiner respectfully disagrees with how Applicant has interpreted section 3.6 of JIANG. Section 3.6 recites:
Summary. FNAS framework considers the performance of child networks on target FPGAs in the neural architecture search process. As shown in Formula 1, if the latency cannot satisfy the timing specification, there is no need to train the generated child network. In addition, the controller will be guided to avoid searching architectures that have insufficient performance. Consequently, the search process can be dramatically accelerated, and the performance of the resultant child network on target FPGAs can be guaranteed.
Significantly, the FNAS framework predicts the latency of a child network, and does not train or test that child network if the predicted latency does not satisfy the timing specification. Therefore, JIANG does not teach the “conventional Hardware-in-the-Loop” prior art as alleged by Applicant, because JIANG teaches predicting latency rather than measuring actual latency on actual target FPGAs.
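For purposes of illustration only, the following minimal Python sketch reflects the search behavior described in section 3.6 of JIANG: a child network’s latency is predicted first, and training is skipped entirely when the timing specification is not met. The search-space values mirror the MNIST configuration reported in section 4.1 of JIANG (filter sizes of 5, 7, or 14; channel numbers of 9, 18, or 36; 60 child networks); all function names and the toy latency model are hypothetical and are not code from JIANG.

```python
import random

def predict_latency_ms(arch):
    # Hypothetical stand-in for JIANG's FNAS-Analyzer, which predicts a child
    # network's latency rather than measuring it on actual target FPGAs.
    return 0.01 * arch["filter_size"] * arch["channels"]

def train_and_validate(arch):
    # Hypothetical stand-in for training a child network and obtaining its
    # maximum validation accuracy.
    return random.uniform(0.90, 0.99)

def fnas_style_search(required_latency_ms=1.0, num_children=60):
    best_arch, best_reward = None, float("-inf")
    for _ in range(num_children):
        arch = {"filter_size": random.choice([5, 7, 14]),   # MNIST space, sec. 4.1
                "channels": random.choice([9, 18, 36])}
        if predict_latency_ms(arch) > required_latency_ms:
            continue  # timing spec not met: no need to train this child network
        reward = train_and_validate(arch)  # reward computation simplified here
        if reward > best_reward:
            best_arch, best_reward = arch, reward
    return best_arch
```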
On pages 12-13 of Applicant’s 12/23/2025 Amendment and remarks, with respect to the rejection of claim 8 under 35 U.S.C. 103, Applicant argues that ANTONY does not teach the “based on interpolating entries of a lookup table” limitation, citing to para. 0016 of ANTONY as support.
The examiner respectfully disagrees, and notes that Applicant has ignored para. 0017 of ANTONY, which is what was actually cited in the office action. Moreover, while it is true that lookup tables can be used to approximate activation functions (e.g., the discrete data points of a function), in order to have a continuous function, interpolation is required, which is what para. 0017 explains. Para. 0017 explicitly states that “A lookup table approximates a function, such as an activation function defined on an interval, by a piecewise-linear function that linearly interpolates the values of the function between a set of sample points for which the values of the function are evaluated and stored in the lookup table.” Such linear interpolation is necessary to estimate data points that are not explicitly measured and contained within the lookup table.
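For purposes of illustration only, the following minimal Python sketch shows piecewise-linear interpolation of lookup table entries of the kind para. 0017 of ANTONY describes; the choice of tanh as the approximated activation function, the sample spacing, and all names are hypothetical and are not code from ANTONY.

```python
import bisect
import math

xs = [i * 0.5 for i in range(-8, 9)]   # evenly spaced sample points on [-4, 4]
ys = [math.tanh(x) for x in xs]        # function values stored as LUT entries

def lut_eval(x):
    # Piecewise-linear approximation: linearly interpolate the stored values
    # between the two sample points that bracket x.
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect.bisect_right(xs, x) - 1       # index of the left sample point
    t = (x - xs[i]) / (xs[i + 1] - xs[i])    # fractional position between points
    return ys[i] + t * (ys[i + 1] - ys[i])   # interpolated (estimated) value

# lut_eval(0.75) estimates a value not stored in the LUT by interpolating
# between the entries stored at 0.5 and 1.0.
```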
On page 13 of Applicant’s 12/23/2025 Amendment and remarks, with respect to the rejection of claim 8 under 35 U.S.C. 103, Applicant argues that ANTONY does not teach the combination of a “model predictor that comprises one or more machine learning models that generate the first metric based on interpolating entries of a lookup table.”
In response to Applicant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986). Here, JIANG teaches the “model predictor that comprises one or more machine learning models that generate the first metric” as explained above and ANTONY teaches the “based on interpolating entries of a lookup table”, and it is the combination that has been used to reject this limitation.
On page 13 of Applicant’s 12/23/2025 Amendment and remarks, with respect to the rejection of claim 8 under 35 U.S.C. 103, Applicant argues that CHU does not teach the “assess a first metric of the first neural network using a model predictor that comprises one or more machine learning models that generate the first metric based on interpolating entries of a lookup table” limitation.
The examiner respectfully submits that Applicant’s argument is moot because CHU is not relied upon to teach such limitation.
On page 14 of Applicant’s 12/23/2025 Amendment and remarks, with respect to the rejection of claims 9-20 under 35 U.S.C. 103, Applicant argues that such claims should be allowed for the same reasons argued with respect to claim 8.
The examiner respectfully disagrees for the same reasons explained with respect to claim 8.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 8-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Regarding Step 1 of the Alice/Mayo framework, Claims 8-20 are directed to a system (a machine), which falls within one of the four statutory categories of inventions.
Regarding Claim 8
Step 2A, Prong 1 (Is the claim directed to a law of nature, a natural phenomenon, or an abstract idea?).
Claim 8 recites the following mental processes, each of which, under the broadest reasonable interpretation, covers performance of the limitation in the mind (including by observation, evaluation, judgment, or opinion) or with the aid of pencil and paper, but for the recitation of generic computer components (e.g., “neural network”, “memory”, “processor”):
define a first neural network based upon a first selected network architecture (under the broadest reasonable interpretation, a human can mentally perform this limitation, for example, a human can mentally define a neural network by selecting hyperparameters (e.g., learning rate, batch size) based upon a first selected network architecture, such as a CNN with an input layer, a convolutional layer, and an output layer)
assess a first task accuracy of the first neural network (under the broadest reasonable interpretation, a human can mentally perform this limitation, for example, a human can mentally review the results of a neural network performing a first task and assess how accurate the neural network appears to be for such task)
assess a first metric of the first neural network... based on interpolating entries of a lookup table (under the broadest reasonable interpretation, a human can mentally perform this limitation, for example, a human can mentally assess a metric determined by the model predictor, for example, the model predictor can predict latency, and a human can assess whether such latency is acceptable or not, where the inputs to the model predictor are based on interpolating LUT entries, and a human can mentally interpolate LUT entries (e.g., find a midpoint))
determine a first network hardware-aware score based upon the assessed first task accuracy and assessed first metric (under the broadest reasonable interpretation, a human can mentally perform this limitation, for example, a human can mentally determine a score, such as by adding the accuracy + assessed metric (latency) together)
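By way of a worked illustration using hypothetical numbers, a human can perform the final two limitations above with pencil and paper: interpolating the lookup table entries 10 ms and 14 ms by finding their midpoint yields (10 + 14)/2 = 12 ms as the assessed first metric, and a first network hardware-aware score can then be determined by adding the assessed task accuracy and the assessed metric together, e.g., 0.95 + 12 = 12.95.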
Step 2A, Prong 2 (Does the claim recite additional elements that integrate the judicial exception into a practical application?).
The judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements (e.g., “neural network”, “memory”, “processor”) which are recited at a high-level of generality such that they amount to no more than mere instructions to apply the exception using a generic computer component (See MPEP 2106.05(f)).
Regarding the “A hardware-aware neural architecture search (HA-NAS) system configured to search for a neural network architecture to implement a neural network, comprising: a memory; a processor coupled to the memory, wherein the processor is further configured to” limitations, such limitations are recited at a high-level of generality and amount to no more than adding the words “apply it” (or an equivalent) with the judicial exception. In particular, the claim only recites the additional elements of neural networks, memory, and a processor. These additional elements are recited at a high-level of generality and amount to no more than mere instructions to apply the exception using generic computer components. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea (See MPEP 2106.05(f)).
Regarding the “using a model predictor that comprises one or more machine learning models to generate the first metric” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception. In particular, the claim only recites the additional element of machine learning models. This additional element is recited at a high-level of generality and amounts to no more than mere instructions to apply the exception using a generic computer component (machine learning models). Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea (See MPEP 2106.05(f)).
Regarding the “implement the first neural network in hardware based on the first network hardware-aware score satisfying a first criterion” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception, because the limitation attempts to cover a solution to an identified problem with no restriction on how the result is accomplished, or provides no description of the mechanism for accomplishing the result (e.g., any type of implementation based on a criterion being met). Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea (See MPEP 2106.05(f)).
Step 2B (Does the claim recite additional elements that amount to significantly more than the judicial exception?)
In accordance with Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above, the additional elements (e.g., “neural network”, “memory”, “processor”) are recited at a high-level of generality such that they amount to no more than mere instructions to apply the exception using a generic computer component (See MPEP 2106.05(f)).
Regarding the “A hardware-aware neural architecture search (HA-NAS) system configured to search for a neural network architecture to implement a neural network, comprising: a memory; a processor coupled to the memory, wherein the processor is further configured to” limitations, such limitations are recited at a high-level of generality and amount to no more than adding the words “apply it” (or an equivalent) with the judicial exception, because the limitations merely provide instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea. Accordingly, these additional elements do not add significantly more than the judicial exception. (See MPEP 2106.05(f)).
Regarding the “using a model predictor that comprises one or more machine learning models to generate the first metric” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception, because the limitation merely provides instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea. Accordingly, this additional element does not add significantly more than the judicial exception. (See MPEP 2106.05(f)).
Regarding the “implement the first neural network in hardware based on the first network hardware-aware score satisfying a first criterion” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception, because the limitation attempts to cover a solution to an identified problem with no restriction on how the result is accomplished, or provides no description of the mechanism for accomplishing the result. Accordingly, this additional element does not add significantly more than the judicial exception. (See MPEP 2106.05(f)).
Regarding Claim 9
Step 2A, Prong 1
select the first selected network architecture based upon a search strategy and a search space. (under the broadest reasonable interpretation, a human can mentally perform this limitation, for example, a human can mentally devise a search strategy (e.g., search for network architectures having latency below a threshold) and a search space (only CNNs with 5 layers or less, with less than 1024 nodes in a convolutional layer), and then select the first network architecture with respect to the search space as a starting point for the search)
Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.
Regarding Claim 10
Step 2A, Prong 1
update the search strategy based upon the first network hardware-aware score. (under the broadest reasonable interpretation, a human can mentally perform this limitation, for example, a human can mentally update the search strategy based on the first network hardware-aware score, such as by adjusting a latency threshold of the strategy based on the first score already meeting the threshold, to try to find an even better network architecture)
Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.
Regarding Claim 11
Step 2A, Prong 1
select a second network architecture based upon the updated search strategy; (under the broadest reasonable interpretation, a human can mentally perform this limitation, for example, a human can mentally select a child network architecture based on the updated search strategy)
define a second neural network based upon a second selected network architecture; (under the broadest reasonable interpretation, a human can mentally perform this limitation, for example, a human can mentally define a neural network by selecting hyperparameters (e.g., learning rate, batch size) based upon a second selected network architecture, such as a CNN with an input layer, a convolutional layer, and an output layer)
assess a second task accuracy of the second neural network; (under the broadest reasonable interpretation, a human can mentally perform this limitation, for example, a human can mentally review the results of a second task and assess how accurate the neural network appears to be)
assess a second metric of the second neural network using the model predictor; and (under the broadest reasonable interpretation, a human can mentally perform this limitation, for example, a human can mentally assess a metric determined by the model predictor, for example, the model predictor can predict latency, and a human can assess whether such latency is acceptable or not)
determine a second network hardware-aware score based upon the assessed second task accuracy and assessed second metric of the second neural network. (under the broadest reasonable interpretation, a human can mentally perform this limitation, for example, a human can mentally determine a score, such as by adding the accuracy + assessed metric (latency) together)
Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.
Regarding Claim 12
Step 2A, Prong 1
repeat the steps of selecting a network architecture based upon the updated search strategy, defining a neural network, assessing a task accuracy, and determining a network hardware-aware score for a plurality of iterations to produce a plurality of network hardware-aware scores; and (under the broadest reasonable interpretation, a human can mentally perform this limitation, for example, a human can mentally and repeatedly perform these steps as explained with respect to claim 11)
Step 2A, Prong 2
Regarding the “implement the neural network associated with the best network hardware-aware score” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception, because the limitation attempts to cover a solution to an identified problem with no restriction on how the result is accomplished, or provides no description of the mechanism for accomplishing the result. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea (See MPEP 2106.05(f)).
Step 2B
Regarding the “implement the neural network associated with the best network hardware-aware score” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception, because the limitation attempts to cover a solution to an identified problem with no restriction on how the result is accomplished, or provides no description of the mechanism for accomplishing the result. Accordingly, this additional element does not add significantly more than the judicial exception. (See MPEP 2106.05(f)).
Regarding Claim 13
Step 2A, Prong 1
repeat the steps of selecting a network architecture based upon the updated search strategy, defining a neural network, assessing a task accuracy, and determining a network hardware-aware score until a maximum number of steps have been executed. (under the broadest reasonable interpretation, a human can mentally perform this limitation, for example, a human can mentally determine a maximum number of trials to execute, such that the neural architecture search can be stopped when the maximum number of trials has been executed, and then mentally perform the recited steps for the maximum number of trials determined)
responsive to determining that the maximum number of steps have been executed, select the neural network associated with the best network hardware-aware score for implementation (under the broadest reasonable interpretation, a human can mentally perform this limitation, for example, a human can mentally review all the network hardware-aware scores and select a best one based on some criteria)
Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.
Regarding Claim 14
Step 2A, Prong 1
wherein assessing the first metric of the first neural network includes: (under the broadest reasonable interpretation, a human can mentally perform this limitation, for example, a human can mentally assess metrics as explained above with respect to claim 8)
combining the plurality of first block metrics to produce the first metric. (under the broadest reasonable interpretation, a human can mentally perform this limitation, for example, a human can mentally combine block metrics to produce the first metric, such as by adding the plurality of first block metrics to produce the first metric)
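For purposes of illustration only, the following minimal Python sketch shows one hypothetical way (not Applicant’s disclosed implementation) of producing a first metric by generating a plurality of block metrics and combining them by addition, consistent with the example given above.

```python
from dataclasses import dataclass

@dataclass
class Block:
    kind: str    # hypothetical block type, e.g., "conv" or "dense"
    params: int  # a toy size feature for the block

# Hypothetical per-block predictors, one model per block type; each maps a
# block to a block metric (e.g., an estimated latency contribution in ms).
predictors = {
    "conv":  lambda b: 0.02 * b.params,
    "dense": lambda b: 0.01 * b.params,
}

def assess_first_metric(blocks):
    block_metrics = [predictors[b.kind](b) for b in blocks]  # first block metrics
    return sum(block_metrics)  # combine the block metrics, e.g., by adding them

# assess_first_metric([Block("conv", 900), Block("dense", 128)]) -> 19.28 (toy numbers)
```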
Step 2A, Prong 2
Regarding the “breaking the first neural network down into a plurality of blocks” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception, because the limitation attempts to cover a solution to an identified problem with no restriction on how the result is accomplished, or provides no description of the mechanism for accomplishing the result (merely decomposing a model into constituent parts, e.g., by layers or sub-models, without providing sufficient description about how to perform such decomposition). Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea (See MPEP 2106.05(f)).
Regarding the “inputting each block of the plurality of blocks to one of the one or more machine learning models to generate a plurality of first block metrics” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception. In particular, the claim only recites the additional element of a “model”. This additional element is recited at a high-level of generality and amounts to no more than mere instructions to apply the exception using a generic computer component (a model). Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea (See MPEP 2106.05(f)).
Step 2B
Regarding the “breaking the first neural network down into a plurality of blocks, wherein the model predictor includes a plurality models corresponding to the plurality of blocks” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception, because the limitation attempts to cover a solution to an identified problem with no restriction on how the result is accomplished, or provides no description of the mechanism for accomplishing the result. Accordingly, this additional element does not add significantly more than the judicial exception. (See MPEP 2106.05(f)).
Regarding the “inputting each block of the plurality of blocks to one of the one or more machine learning models to generate a plurality of first block metrics” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception, because the limitation merely provides instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea. Accordingly, this additional element does not add significantly more than the judicial exception. (See MPEP 2106.05(f)).
Regarding Claim 15
Step 2A, Prong 1
wherein the first metric is a latency of the first neural network. (under the broadest reasonable interpretation, a human can mentally perform this limitation, for example, a human can mentally assess latency of the first neural network, e.g., whether the measured latency is acceptable or not)
Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.
Regarding Claim 16
Step 2A, Prong 2
Regarding the “wherein each machine learning model of the one or more machine learning models is directed to different target hardware” limitation, this limitation merely describes that the predictor model has multiple components, and such limitation amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use (predictor models having a plurality of models, where each is directed to different target hardware). As explained by the Supreme Court, a claim directed to a judicial exception cannot be made eligible "simply by having the applicant acquiesce to limiting the reach of the patent for the formula to a particular technological use." Diamond v. Diehr, 450 U.S. 175, 192 n.14, 209 USPQ 1, 10 n. 14 (1981). Thus, limitations that amount to merely indicating a field of use or technological environment in which to apply a judicial exception do not integrate a judicial exception into a practical application.
Step 2B
Regarding the “wherein each machine learning model of the one or more machine learning models is directed to different target hardware” limitation, such limitation amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use as explained above, which does not amount to significantly more than the judicial exception. MPEP 2106.05(h).
Regarding Claim 17
Step 2A, Prong 2
Regarding the “wherein the first neural network includes a plurality of blocks including a plurality of block types” limitation, this limitation merely describes that the first neural network has multiple blocks, and such limitation amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use (neural networks being composed of multiple blocks having different types, such as different layers (blocks) with different types (convolutional, feed-forward, input, output)). As explained by the Supreme Court, a claim directed to a judicial exception cannot be made eligible "simply by having the applicant acquiesce to limiting the reach of the patent for the formula to a particular technological use." Diamond v. Diehr, 450 U.S. 175, 192 n.14, 209 USPQ 1, 10 n. 14 (1981). Thus, limitations that amount to merely indicating a field of use or technological environment in which to apply a judicial exception do not integrate a judicial exception into a practical application.
Regarding the “wherein each machine learning model of the one or more machine learning models is directed to a different block type” limitation, this limitation merely describes that the predictor model has multiple components, and such limitation amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use (predictor models having a plurality of models, where each is directed to a different block type). As explained by the Supreme Court, a claim directed to a judicial exception cannot be made eligible "simply by having the applicant acquiesce to limiting the reach of the patent for the formula to a particular technological use." Diamond v. Diehr, 450 U.S. 175, 192 n.14, 209 USPQ 1, 10 n. 14 (1981). Thus, limitations that amount to merely indicating a field of use or technological environment in which to apply a judicial exception do not integrate a judicial exception into a practical application.
Step 2B
Regarding the “wherein the first neural network includes a plurality of blocks including a plurality of block types” limitation, such limitation amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use as explained above, which does not amount to significantly more than the judicial exception. MPEP 2106.05(h).
Regarding the “wherein each machine learning model of the one or more machine learning models is directed to a different block type” limitation, such limitation amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use as explained above, which does not amount to significantly more than the judicial exception. MPEP 2106.05(h).
Regarding Claim 18
Step 2A, Prong 2
Regarding the “wherein the first neural network includes a plurality of blocks including a plurality of block types” limitation, this limitation merely describes that the first neural network has multiple blocks, and such limitation amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use (neural networks being composed of multiple blocks having different types, such as different layers (blocks) with different types (convolutional, feed-forward, input, output)). As explained by the Supreme Court, a claim directed to a judicial exception cannot be made eligible "simply by having the applicant acquiesce to limiting the reach of the patent for the formula to a particular technological use." Diamond v. Diehr, 450 U.S. 175, 192 n.14, 209 USPQ 1, 10 n. 14 (1981). Thus, limitations that amount to merely indicating a field of use or technological environment in which to apply a judicial exception do not integrate a judicial exception into a practical application.
Regarding the “wherein each machine learning model of the one or more machine learning models is directed to a different block type and a different hardware target” limitation, this limitation merely describes that the predictor model has multiple components, and such limitation amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use (predictor models having a plurality of models, where each is directed to different block types and hardware targets). As explained by the Supreme Court, a claim directed to a judicial exception cannot be made eligible "simply by having the applicant acquiesce to limiting the reach of the patent for the formula to a particular technological use." Diamond v. Diehr, 450 U.S. 175, 192 n.14, 209 USPQ 1, 10 n. 14 (1981). Thus, limitations that amount to merely indicating a field of use or technological environment in which to apply a judicial exception do not integrate a judicial exception into a practical application.
Step 2B
Regarding the “wherein the first neural network includes a plurality of blocks including a plurality of block types” limitation, such limitation amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use as explained above, which does not amount to significantly more than the judicial exception. MPEP 2106.05(h).
Regarding the “wherein each machine learning model of the one or more machine learning models is directed to a different block type and a different hardware target” limitation, such limitation amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use as explained above, which does not amount to significantly more than the judicial exception. MPEP 2106.05(h).
Regarding Claim 19
Step 2A, Prong 1
wherein assessing the first task accuracy uses an accuracy predictor function. (under the broadest reasonable interpretation, a human can mentally perform this limitation, for example, a human can mentally perform this by mentally deriving and implementing a function for the accuracy predictor, e.g., predicting accuracy > 99% if the hyperparameters satisfy particular criteria)
Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.
Regarding Claim 20
Step 2A, Prong 1
wherein the accuracy predictor function is based on support vector regression in combination with an early stopping scheme. (under the broadest reasonable interpretation, a human can mentally perform this limitation, for example, a human can mentally use a support vector regression technique, in combination with an early stopping scheme (e.g., stop when accuracy > 99.99%) to predict the accuracy of a model or network)
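For purposes of illustration only, the following minimal Python sketch shows an accuracy predictor of the general kind recited: support vector regression (here, scikit-learn’s SVR, an assumed implementation choice) predicting final accuracy from a partial learning curve, combined with an early stopping scheme that stops once the predicted accuracy exceeds a target, mirroring the example above. The training data shown are hypothetical.

```python
import numpy as np
from sklearn.svm import SVR

# Hypothetical training data: accuracy after each of the first three epochs
# (partial learning curves) paired with the final accuracy eventually reached.
X_train = np.array([[0.60, 0.72, 0.78],
                    [0.55, 0.65, 0.70],
                    [0.70, 0.80, 0.85]])
y_train = np.array([0.91, 0.84, 0.95])

predictor = SVR(kernel="rbf", C=1.0).fit(X_train, y_train)

def early_stop(partial_curve, target=0.9999):
    # Early stopping scheme per the example above: stop once the predicted
    # accuracy exceeds the target (e.g., stop when accuracy > 99.99%).
    predicted_final = predictor.predict([partial_curve])[0]
    return predicted_final > target
```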
Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.
Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103, as set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 8-15 are rejected under 35 U.S.C. 103 as being unpatentable over Jiang, Weiwen, et al. "Accuracy vs. efficiency: Achieving both through fpga-implementation aware neural architecture search." Proceedings of the 56th Annual Design Automation Conference 2019, pp. 1-6, hereinafter referenced as JIANG, in view of US 20210397596 A1, hereinafter referenced as ANTONY, and further in view of US 20210110276 A1, hereinafter referenced as CHU.
Regarding Claim 8
JIANG teaches:
A hardware-aware neural architecture search (HA-NAS) system configured to search for a neural network architecture to implement a neural network, comprising: (JIANG, p. 1, section 1: “In this paper, we propose a novel hardware-aware NAS framework to address the above issues. To illustrate our framework, we choose to use Field Programmable Gate Array (FPGA) as a vehicle, as it has gradually become one of the most popular platforms to implement DNNs due to its high performance and energy efficiency, in particular for low-batch real-time applications”;
JIANG, p. 2, section 1: “We build an FPGA-implementation aware neural architecture search framework, namely FNAS, which can generate optimal DNN architectures with guaranteed latency on target FPGAs.”)
a memory; a processor coupled to the memory, wherein the processor is further configured to: (JIANG, p. 3, section 3.3: “In FNAS, each layer is allocated to a dedicated PE, and PEs are performed in the pipeline fashion. Such architecture can be implemented on one FPGA as in [8, 15] or multiple FPGAs as in [4]. The resource (e.g., DSP and memory bandwidth) for each layer can be obtained by considering the load balance.”;
JIANG, p. 5, section 4.1: “To compare FNAS with NAS, we employ both low-end and high-end FPGAs to implement the resultant architectures using MNIST data set. The low-end and high-end FPGAs selected are Xilinx 7A50T and 7Z020, respectively.”;
Examiner’s Note (EN): pursuant to MPEP 2131.01 II, the examiner cites to the Xilinx 7 Series FPGAs Data Sheet: Overview (Sept. 8, 2020) to explain the meaning of the term “7A50T” in JIANG, where at page 3, table 4, the XC7A50T is identified as an Artix-7 FPGA, and on p. 1, table 1, the Artix-7 FPGA family has 13 Mb of RAM and a MicroBlaze CPU, establishing that the FPGA architectures of JIANG include at least a memory (the 13 Mb of RAM) and a processor coupled to the memory (the MicroBlaze CPU))
define a first neural network based upon a first selected network architecture; (JIANG, p. 3, section 3.1 and Fig. 2: “The problem is formally designed as follows: Given a specific data set, a target FPGA platform and a required inference latency rL, our objective is to automatically generate a neural network, such that its inference latency on the given FPGA platform is less than rL, while achieving the maximum accuracy for the machine learning task on the given data set. ... In FNAS, it takes the FPGA-based inference performance into consideration during child network searching.”;
JIANG, p. 5, section 3.6: “FNAS framework considers the performance of child networks on target FPGAs in the neural architecture search process.”;
(EN): the initial child network evaluated corresponds to the recited “first neural network” and the initial architecture in the neural architecture search process corresponds to the recited “first selected network architecture”; as shown in Fig. 2, the neural network architectures are defined by “hyperparameters”)
assess a first task accuracy of the first neural network; (JIANG, p. 3, section 3.1: “The problem is formally designed as follows: Given a specific data set, a target FPGA platform and a required inference latency rL, our objective is to automatically generate a neural network, such that its inference latency on the given FPGA platform is less than rL, while achieving the maximum accuracy for the machine learning task on the given data set. ... Specifically, instead of directly applying accuracy A as reward, FNAS employs a reward function f to calculate the reward in terms of accuracy A and performance/latency L.”;
JIANG, p. 5, section 4.1: “In training of the child networks, the number of epochs is set as 25, and the maximum validation accuracy in the last 5 epochs will be utilized to compute the reward for updating the controller.”;
(EN): JIANG discloses maximizing accuracy for a machine learning task for a given dataset (corresponding to “assess a first task accuracy”) with respect to a child network (corresponding to the recited “first neural network”))
assess a first metric of the first neural network using a model predictor that comprises one or more machine learning models to generate the first metric ... ; and (JIANG, p. 4, section 3.6: “FNAS-Analyzer aims to efficiently and accurately compute the latency L of a neural architecture on target FPGAs with determined schedule. ... FNAS framework considers the performance of child networks on target FPGAs in the neural architecture search process. As shown in Formula 1, if the latency cannot satisfy the timing specification, there is no need to train the generated child network. In addition, the controller will be guided to avoid searching architectures that have insufficient performance. Consequently, the search process can be dramatically accelerated, and the performance of the resultant child network on target FPGAs can be guaranteed.”
(EN): latency corresponds to the recited “first metric of the neural network” and FNAS-Analyzer corresponds to the recited “model predictor”; as shown in Fig. 1, the NAS framework is a neural network model)
determine a first network hardware-aware score based upon the assessed first task accuracy and assessed first metric; and (JIANG, p. 3, section 3.2: “Reward function takes the accuracy A, latency L, and the required latency rL to calculate the reward signal. The function to calculate the reward R is defined as follows.
[Formula 1 of JIANG, reproduced as an image (media_image1.png), defining the reward R as a function of accuracy A, latency L, and required latency rL]”;
Examiner’s Note (EN): the reward corresponds to the recited “determine a first network hardware-aware score” and takes as inputs accuracy and latency)
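Because Formula 1 is reproduced above only as an image, the following Python sketch is a hypothetical reward of the general form described in section 3.2 of JIANG (a function of accuracy A, latency L, and required latency rL); it is not JIANG’s actual Formula 1.

```python
def reward(accuracy, latency, required_latency):
    # Hypothetical reward of the general form in JIANG section 3.2; this is
    # NOT JIANG's actual Formula 1, which appears above only as an image.
    if latency > required_latency:
        return 0.0       # timing specification not satisfied
    return accuracy      # otherwise the reward tracks task accuracy
```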
However, JIANG fails to explicitly teach:
based on interpolating entries of a lookup table
implement the first neural network in hardware based on the first network hardware-aware score satisfying a first criterion
However, in a related field of endeavor (developing machine learning applications, see para. 0002), ANTONY teaches:
assess a first metric of the first neural network using a model predictor that comprises one or more machine learning models to generate the first metric based on interpolating entries of a lookup table (ANTONY, para. 0017: “A lookup table approximates a function, such as an activation function defined on an interval, by a piecewise-linear function that linearly interpolates the values of the function between a set of sample points for which the values of the function are evaluated and stored in the lookup table. The precision with which the lookup table approximates the function depends on the number of sample points (which can be limited by hardware limits on the size of the lookup table), and the selection of the sample points themselves. Some hardware implementations can also limit the selection of the sample points to evenly spaced sample points. For this reason, the precision for a particular machine learning application can be higher than a lookup table approximation can provide for a given hardware implementation”;
ANTONY, para. 0032: “In one or more implementations, the neural processor 212 (and/or CPU 204 and/or GPU 206) may evaluate a neural network associated with ML model 220 by evaluating each of one or more activation functions by obtaining, interpolating, and/or combining the lookup table entries of the multiple LUTs 230 for that activation function.”;
Examiner’s Note: ANTONY discloses interpolating values in a look-up table to generate approximations; the JIANG-ANTONY combination now modifies the hardware-aware neural architecture search system of JIANG so that data points are interpolated using known data points from a look-up table as taught by ANTONY)
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the hardware-aware neural architecture search system of JIANG with the teachings of ANTONY concerning interpolating look-up table entries. As disclosed by ANTONY, one of ordinary skill would have been motivated to do so in order to approximate unknown values from known values using interpolation. (para. 0017). One of ordinary skill would further understand the benefit of using interpolation to generate additional training data to train a machine learning model.
However, JIANG and ANTONY fail to explicitly teach:
implement the first neural network in hardware based on the first network hardware-aware score satisfying a first criterion
However, in a related field of endeavor (a search method for a neural network model structure, see para. 0002), CHU teaches:
implement the first neural network in hardware based on the first network hardware-aware score satisfying a first criterion (CHU, para. 0036: “In step S17, the next generation population of network model structure is used as the current population of generation network model repeated until a multi-objective optimization state is optimal structure, the above process is, and a neural network model structure suitable for different scenarios is selected from a final generation population of network model structure.”; (EN): the JIANG-ANTONY-CHU combination now selects a neural network architecture based on the “multi-objective optimization state is optimal” of CHU (corresponding to recited “first criterion”) and implements the neural network architecture in a FPGA (hardware) as disclosed by JIANG)
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the hardware-aware neural architecture search system of JIANG with the teachings of ANTONY concerning interpolating look-up table entries, and further with the teachings of CHU pertaining to updating a neural architecture search strategy based on results. As disclosed by CHU, one of ordinary skill would have been motivated to do so because CHU teaches that by “combining reinforcement learning mutation and random mutation, the use of reinforcement learning algorithm to adjust random mutation process in evolutionary algorithm is realized, and a balance between exploring randomness and using learned information is achieved.” (para. 0037).
JIANG further discusses that there is a trade-off between efficiency and accuracy (see pp. 5-6, section 4.2), and one of ordinary skill in the art would have been motivated to use the best reward score in situations where accuracy is favored over efficiency.
Regarding Claim 9
JIANG, ANTONY, and CHU teach the HA-NAS system of claim 8. JIANG further teaches:
select the first selected network architecture based upon a search strategy and a search space. (JIANG, p. 2, section 2: “With the fact that the architectures are growing deeper, the search space grows exponentially, which makes the search process difficult”;
JIANG, p. 3, section 3.1: “The problem is formally designed as follows: Given a specific data set, a target FPGA platform and a required inference latency rL, our objective is to automatically generate a neural network, such that its inference latency on the given FPGA platform is less than rL, while achieving the maximum accuracy for the machine learning task on the given data set.”;
JIANG, p. 5, section 3.6: “FNAS framework considers the performance of child networks on target FPGAs in the neural architecture search process. As shown in Formula 1, if the latency cannot satisfy the timing specification, there is no need to train the generated child network. In addition, the controller will be guided to avoid searching architectures that have insufficient performance. Consequently, the search process can be dramatically accelerated, and the performance of the resultant child network on target FPGAs can be guaranteed.”;
JIANG, p. 5, Figs. 5(b) and 5(c);
Examiner’s Note: the search strategy is to generate a child NN “such that its inference latency on the given FPGA platform is less than rL, while achieving the maximum accuracy for the machine learning task on the given data set” and this strategy is reflected in Figs. 5(b) and 5(c); the search space corresponds to the universe of architectures to actively search, but excluding those that the “controller will be guided to avoid searching”)
Regarding Claim 10
JIANG, ANTONY, and CHU teach the HA-NAS system of claim 9. However, JIANG and ANTONY fail to explicitly teach:
wherein the processor is further configured to: update the search strategy based upon the first network hardware-aware score.
However, in a related field of endeavor (a search method for a neural network model structure, see para. 0002), CHU teaches:
wherein the processor is further configured to: update the search strategy based upon the first network hardware-aware score. (CHU, para. 0023: “In the search method of the neural network model structure, the search strategy 102 is used to search for a network structure in the search space 101, the performance of the searched network structure is evaluated by the performance evaluation strategy 103, and the search strategy 102 is updated according to evaluation results.”; (EN): the JIANG-CHU combination now uses the reward score of JIANG to update the search strategy as disclosed by CHU)
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the hardware-aware neural architecture search system of JIANG with the teachings of ANTONY concerning interpolating look-up table entries, and further with the teachings of CHU pertaining to updating a neural architecture search strategy based on results. As disclosed by CHU, one of ordinary skill would have been motivated to do so because CHU teaches that by “combining reinforcement learning mutation and random mutation, the use of reinforcement learning algorithm to adjust random mutation process in evolutionary algorithm is realized, and a balance between exploring randomness and using learned information is achieved.” (para. 0037).
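For purposes of illustration only, the following minimal Python sketch shows one hypothetical way a search strategy could be updated based upon a network hardware-aware score, in the general manner CHU describes; the sampling-weight update rule is illustrative and is not CHU’s algorithm.

```python
import random

# Hypothetical search strategy: sampling weights over candidate filter sizes.
strategy = {5: 1.0, 7: 1.0, 14: 1.0}

def sample_filter_size():
    sizes, weights = zip(*strategy.items())
    return random.choices(sizes, weights=weights, k=1)[0]

def update_strategy(filter_size, score, baseline=0.5, lr=0.1):
    # Reinforcement-style update (illustrative): increase the sampling weight
    # of a choice whose network produced an above-baseline hardware-aware score.
    strategy[filter_size] *= (1.0 + lr * (score - baseline))
```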
Regarding Claim 11
JIANG, ANTONY, and CHU teach the HA-NAS system of claim 10. JIANG, ANTONY, and CHU further make obvious:
wherein the processor is further configured to:
select a second network architecture based upon the updated search strategy; (JIANG, p. 5, section 4.1: “For instance, we explain the configuration of MNIST: (1) its child network has 4 layers, (2) the possible filter size (height and width) is 5, 7 or 14, (3) the possible channel number is 9, 18, or 36, and (4) it will find 60 child networks.”; (EN): the JIANG-CHU combination now iteratively searches the next child network (of up to 60 child networks) using the updated search strategy of CHU; pursuant to MPEP 2144.04 VI.B, duplication of parts or steps has “no patentable significance unless a new and unexpected result is produced” and the instant limitation is a duplication of the selection of a network architecture step of claim 8)
define a second neural network based upon a second selected network architecture; (JIANG, p. 3, section 3.1 and Fig. 2: “The problem is formally designed as follows: Given a specific data set, a target FPGA platform and a required inference latency rL, our objective is to automatically generate a neural network, such that its inference latency on the given FPGA platform is less than rL, while achieving the maximum accuracy for the machine learning task on the given data set. ... In FNAS, it takes the FPGA-based inference performance into consideration during child network searching.”;
JIANG, p. 5, section 3.6: “FNAS framework considers the performance of child networks on target FPGAs in the neural architecture search process.”;
(EN): the next child network evaluated corresponds to the recited “second neural network” and the next architecture in the neural architecture search process corresponds to the recited “second selected network architecture”; as shown in Fig. 2, the neural network architectures are defined by “hyperparameters”; pursuant to MPEP 2144.04 VI.B, duplication of parts or steps has “no patentable significance unless a new and unexpected result is produced” and the instant limitation is a duplication of the step of claim 8 pertaining to defining a first neural network based upon a first selected network architecture)
assess a second task accuracy of the second neural network; (JIANG, p. 3, section 3.1: “The problem is formally designed as follows: Given a specific data set, a target FPGA platform and a required inference latency rL, our objective is to automatically generate a neural network, such that its inference latency on the given FPGA platform is less than rL, while achieving the maximum accuracy for the machine learning task on the given data set. ... Specifically, instead of directly applying accuracy A as reward, FNAS employs a reward function f to calculate the reward in terms of accuracy A and performance/latency L.”;
JIANG, p. 5, section 4.1: “In training of the child networks, the number of epochs is set as 25, and the maximum validation accuracy in the last 5 epochs will be utilized to compute the reward for updating the controller.”;
(EN): pursuant to MPEP 2144.04 VI.B, duplication of parts or steps has “no patentable significance unless a new and unexpected result is produced” and the instant limitation is a duplication of the step of claim 8 pertaining to assessing a first task accuracy of the first neural network)
assess a second metric of the second neural network using the model predictor; and (JIANG, p. 4, section 3.6: “FNAS-Analyzer aims to efficiently and accurately compute the latency L of a neural architecture on target FPGAs with determined schedule.”; (EN): pursuant to MPEP 2144.04 VI.B, duplication of parts or steps has “no patentable significance unless a new and unexpected result is produced” and the instant limitation is a duplication of the step of claim 8 pertaining to assessing a first metric of the first neural network using the model predictor)
determine a second network hardware-aware score based upon the assessed second task accuracy and assessed second metric of the second neural network. (JIANG, p. 3, section 3.2: “Reward function takes the accuracy A, latency L, and the required latency rL to calculate the reward signal. The function to calculate the reward R is defined as follows.”
[Equation reproduced as an image (media_image1.png): JIANG’s reward function R, calculated from the accuracy A, the latency L, and the required latency rL.]
Examiner’s Note (EN): pursuant to MPEP 2144.04 VI.B, duplication of parts or steps has “no patentable significance unless a new and unexpected result is produced” and the instant limitation is a duplication of the step of claim 8 pertaining to determining a first network hardware-aware score based upon the assessed first task accuracy and assessed first metric)
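Because the reward equation above is reproduced only as an image, the following purely illustrative sketch shows a reward of the general kind JIANG describes (full accuracy when the latency requirement is met, discounted otherwise); the penalty form below is an assumed placeholder, not JIANG’s equation.

def reward(accuracy: float, latency: float, required_latency: float) -> float:
    # Assumed placeholder form: the reward is the accuracy A when the latency
    # requirement is met, proportionally discounted when L exceeds rL.
    if latency <= required_latency:
        return accuracy
    return accuracy * (required_latency / latency)

print(reward(0.95, 8.0, 10.0))   # 0.95  -- meets the latency requirement
print(reward(0.95, 20.0, 10.0))  # 0.475 -- penalized for exceeding rL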
Before the effective filing date of the present application, it would have been obvious to one of ordinary skill in the art to combine the hardware-aware neural architecture search system of JIANG with the teachings of ANTONY concerning interpolating look-up table entries, and further with the teachings of CHU pertaining to updating a neural architecture search strategy based on results. As disclosed by CHU, one of ordinary skill would have been motivated to do so because CHU teaches that by “combining reinforcement learning mutation and random mutation, the use of reinforcement learning algorithm to adjust random mutation process in evolutionary algorithm is realized, and a balance between exploring randomness and using learned information is achieved.” (para. 0037).
Regarding Claim 12
JIANG, ANTONY, and CHU teach the HA-NAS system of claim 11. JIANG, ANTONY, and CHU further make obvious:
the processor is further configured to:
repeat the steps of selecting a network architecture based upon the updated search strategy, defining a neural network, assessing a task accuracy, and determining a network hardware-aware score for a plurality of iterations to produce a plurality of network hardware-aware scores; and (JIANG, p. 5, section 4.1: “For instance, we explain the configuration of MNIST: (1) its child network has 4 layers, (2) the possible filter size (height and width) is 5, 7 or 14, (3) the possible channel number is 9, 18, or 36, and (4) it will find 60 child networks.”; (EN): the JIANG-CHU combination now iteratively searches the next child network (of up to 60 child networks) using the updated search strategy of CHU; pursuant to MPEP 2144.04 VI.B, duplication of parts or steps has “no patentable significance unless a new and unexpected result is produced” and the instant limitation is a duplication of the steps of claim 8 with respect to the first network architecture and of claim 11 with respect to the second network architecture)
implement the neural network associated with the best network hardware-aware score. (CHU, para. 0036: “In step S17, the next generation population of network model structure is used as the current population of generation network model repeated until a multi-objective optimization state is optimal structure, the above process is, and a neural network model structure suitable for different scenarios is selected from a final generation population of network model structure.”; (EN): the JIANG-ANTONY-CHU combination now uses the highest reward value of JIANG (corresponding to recited “best network hardware-aware score”) as the “optimization state” of CHU when selecting the final network, and implements the network in the FPGAs of JIANG)
Before the effective filing date of the present application, it would have been obvious to one of ordinary skill in the art to combine the hardware-aware neural architecture search system of JIANG with the teachings of ANTONY concerning interpolating look-up table entries, and further with the teachings of CHU pertaining to updating a neural architecture search strategy based on results. As disclosed by CHU, one of ordinary skill would have been motivated to do so because CHU teaches that by “combining reinforcement learning mutation and random mutation, the use of reinforcement learning algorithm to adjust random mutation process in evolutionary algorithm is realized, and a balance between exploring randomness and using learned information is achieved.” (para. 0037).
JIANG further discusses that there is a trade-off between efficiency and accuracy (see pp. 5-6, section 4.2), and one of ordinary skill in the art would have been motivated to use the best reward score in situations where accuracy is favored over efficiency.
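For illustration only, the following sketch shows the iterative search discussed for claims 11 and 12 (and the maximum-step bound of claim 13 below): select an architecture under the updated strategy, score it, update the strategy, and implement the best scorer. Every helper and parameter here is a hypothetical stand-in, not the method of any cited reference.

import random

MAX_TRIALS = 60  # cf. JIANG's 60 child networks

def select_architecture(strategy):
    # Hypothetical stand-in for strategy-driven architecture selection.
    return {"layers": 4, "filter": random.choice([5, 7, 14]),
            "channels": random.choice([9, 18, 36])}

def assess_accuracy(arch):
    return random.random()            # stand-in for training/validating the child network

def assess_metric(arch):
    return random.uniform(1.0, 20.0)  # stand-in for a predicted latency

def score(acc, lat, r_lat=10.0):
    return acc if lat <= r_lat else acc * (r_lat / lat)

strategy = {}
best_arch, best_score = None, float("-inf")
for _ in range(MAX_TRIALS):               # stop after a maximum number of steps
    arch = select_architecture(strategy)  # select under the updated search strategy
    s = score(assess_accuracy(arch), assess_metric(arch))
    if s > best_score:
        best_arch, best_score = arch, s   # track the best hardware-aware score
    strategy["last_score"] = s            # stand-in for a CHU-style strategy update

# best_arch is the network selected for implementation once the trials complete.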
Regarding Claim 13
JIANG, ANTONY, and CHU teach the HA-NAS system of claim 12. JIANG, ANTONY, and CHU further make obvious:
wherein the processor is further configured to: repeat the steps of selecting a network architecture based upon the updated search strategy, defining a neural network, assessing a task accuracy, and determining a network hardware-aware score until a maximum number of steps have been executed; and (JIANG, p. 5, section 4.1: “For instance, we explain the configuration of MNIST: (1) its child network has 4 layers, (2) the possible filter size (height and width) is 5, 7 or 14, (3) the possible channel number is 9, 18, or 36, and (4) it will find 60 child networks.”; (EN): pursuant to MPEP 2144.04 VI.B, duplication of parts or steps has “no patentable significance unless a new and unexpected result is produced” and the instant limitation is a duplication of the steps of claim 12; JIANG discloses a maximum of 60 trials corresponding to 60 child networks, and now the JIANG-ANTONY-CHU combination repeats the recited steps a maximum of 60 times as disclosed by JIANG)
responsive to determining that the maximum number of steps have been executed, select the neural network associated with the best network hardware-aware score for implementation (CHU, para. 0036: “In step S17, the next generation population of network model structure is used as the current population of generation network model repeated until a multi-objective optimization state is optimal structure, the above process is, and a neural network model structure suitable for different scenarios is selected from a final generation population of network model structure.”; (EN): the JIANG-CHU combination now uses the highest reward value of JIANG (corresponding to recited “best network hardware-aware score”) as the “optimization state” of CHU when selecting the final network, and implements the network in the FPGAs of JIANG, after the 60 trials for the 60 trial networks of JIANG are performed, corresponding to recited “responsive to determining that the maximum number of steps have been executed”)
Before the effective filing date of the present application, it would have been obvious to one of ordinary skill in the art to combine the hardware-aware neural architecture search system of JIANG with the teachings of ANTONY concerning interpolating look-up table entries, and further with the teachings of CHU pertaining to updating a neural architecture search strategy based on results. As disclosed by CHU, one of ordinary skill would have been motivated to do so because CHU teaches that by “combining reinforcement learning mutation and random mutation, the use of reinforcement learning algorithm to adjust random mutation process in evolutionary algorithm is realized, and a balance between exploring randomness and using learned information is achieved.” (para. 0037).
JIANG further discusses that there is a trade-off between efficiency and accuracy (see pp. 5-6, section 4.2), and one of ordinary skill in the art would have been motivated to use the best reward score in situations where accuracy is favored over efficiency.
Regarding Claim 14
JIANG, ANTONY, and CHU teach the HA-NAS system of claim 8. JIANG further teaches:
wherein assessing the first metric of the first neural network includes: (JIANG, p. 4, section 3.6: “FNAS-Analyzer aims to efficiently and accurately compute the latency L of a neural architecture on target FPGAs with determined schedule.”)
breaking the first neural network down into a plurality of blocks, (JIANG, p. 3, section 3.3: “Due to the limited resource on FPGA, it may be difficult to place a whole convolutional layer on FPGA. In consequence, it is common to apply tiling technique to split convolutional operations into multiple small tasks [8, 12, 13, 15]. FNAS-Design is to determine the tiling parameters for a given NN architecture on target FPGAs. ... After tiling the IFM/OFM/row/col, one convolutional operation is divided to smaller tasks, as shown in Figure 3(c). Each task corresponds to a pair of IFM/OFM tiles. Tasks in one layer will be continuously loaded to a Processing Element (PE) on FPGA for execution (the load sequence is determined by➂FNAS-Sched).”; (EN): each set of IFM/OFM tiles corresponds to a recited “block”, and each processing element performs a convolution operation (corresponding to the recited “plurality of models corresponding to the plurality of blocks”) where such PE models the specific convolution operation for the tile)
inputting each block of the plurality of blocks to one of the one or more machine learning models; and (JIANG, p. 3, section 3.3: “Due to the limited resource on FPGA, it may be difficult to place a whole convolutional layer on FPGA. In consequence, it is common to apply tiling technique to split convolutional operations into multiple small tasks [8, 12, 13, 15]. FNAS-Design is to determine the tiling parameters for a given NN architecture on target FPGAs. ... After tiling the IFM/OFM/row/col, one convolutional operation is divided to smaller tasks, as shown in Figure 3(c). Each task corresponds to a pair of IFM/OFM tiles. Tasks in one layer will be continuously loaded to a Processing Element (PE) on FPGA for execution (the load sequence is determined by➂FNAS-Sched).”;
JIANG, pp. 4-5, section 3.6: “Processing Time. We first determine the execution time of tasks in the tile-based graph. Since all tasks in layer i utilize the same accelerator for execution, they have the same execution time. ... Start Time. The start time of a layer depends on its previous layer’s start time and data reuse strategy.”;
(EN): the processing time for each tile, based on the starting time, corresponds to the latency for each block)
combining the plurality of first block metrics to produce the first metric. (JIANG, pp. 4-5, section 3.6: “Latency. We can then derive a tight lower bound on latency Latsys by summing up processing time and starting time. For a total of N processing elements (PE), assume the first and the last PEs apply OFM reuse, we can calculate Latsys as follows.”
[Equation reproduced as an image (media_image2.png): JIANG’s system latency Latsys, computed by summing the processing times and start times over the N processing elements.]
Examiner’s Note: The processing times and start time are combined to determine the overall latency, corresponding to the recited “first metric”)
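For illustration only, the following sketch shows the block-wise assessment mapped above for claim 14: break the network into blocks, estimate a per-block latency, and combine (sum) the per-block estimates, analogous to JIANG’s summation of processing and start times. The block descriptions and the per-block estimator are hypothetical.

def per_block_latency(block: dict) -> float:
    # Hypothetical stand-in for a per-block predictor or lookup.
    return 0.01 * block["channels"] * block["filter"] ** 2

blocks = [{"filter": 5, "channels": 18}, {"filter": 7, "channels": 36}]
total_latency = sum(per_block_latency(b) for b in blocks)  # the combined "first metric"
print(total_latency)  # 22.14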
Regarding Claim 15
JIANG, ANTONY, and CHU teach the HA-NAS system of claim 8. JIANG further teaches:
wherein the first metric is a latency of the first neural network. (JIANG, p. 4, section 3.6: “FNAS-Analyzer aims to efficiently and accurately compute the latency L of a neural architecture on target FPGAs with determined schedule.”)
Claims 16-18 are rejected under 35 U.S.C. 103 as being unpatentable over JIANG in view of ANTONY and CHU and further in view of US 20220138550 A1, hereinafter referenced as ZHANG I.
Regarding Claim 16
JIANG, ANTONY, and CHU teach the HA-NAS system of claim 8. However, JIANG, ANTONY, and CHU fail to explicitly teach:
wherein each machine learning model of the one or more machine learning models is directed to different target hardware
However, in a related field of endeavor (artificial intelligence models, including neural networks, paras. 0002-0003), ZHANG I teaches:
wherein each machine learning model of the one or more machine learning models is directed to different target hardware (ZHANG I, para. 0029: “According to various aspects, an AI model may be broken-up (e.g., split, divided, decomposed, etc.) into a plurality of sub-models by a gateway system (e.g., a blockchain client, etc.) ... The gateway system may break-up the neural network into smaller subsets of neurons (e.g., sub-models) within the neural network, and assign the sub-models to different blockchain peers such that each peer only executes a portion of the AI model, but not the entire AI model. As a non-limiting example, each layer in a deep learning neural network may be a sub-model, and may be assigned to a different blockchain peer (or peers) for training.”; (EN): ZHANG I teaches that a model can be decomposed into sub-models; the JIANG-ANTONY-CHU-ZHANG I combination now modifies the neural architecture search system of JIANG to break up models into sub-models as in ZHANG I, where each sub-model is directed to different target hardware, e.g., the Xilinx 7A50T or the 7Z020 FPGA hardware as disclosed by section 4 of JIANG)
Before the effective filing date of the present application, it would have been obvious to one of ordinary skill in the art to combine the hardware-aware neural architecture search system of JIANG with the teachings of ANTONY concerning interpolating look-up table entries, and further with the teachings of CHU pertaining to updating a neural architecture search strategy based on results, and further with the teachings of ZHANG I pertaining to decomposing a neural network into sub-models. As disclosed by ZHANG I, one of ordinary skill would have been motivated to do so, for example, in distributed computing environments such as a blockchain. (para. 0029).
Further, JIANG discloses that an objective is to “exploit parallelism” (p. 3, section 3.5), and one of ordinary skill in the art would understand that parallelism can reduce computing time.
Regarding Claim 17
JIANG, ANTONY, and CHU teach the HA-NAS system of claim 8. JIANG further teaches:
wherein the first neural network includes a plurality of blocks (JIANG, p. 3, section 3.3: “Due to the limited resource on FPGA, it may be difficult to place a whole convolutional layer on FPGA. In consequence, it is common to apply tiling technique to split convolutional operations into multiple small tasks [8, 12, 13, 15]. FNAS-Design is to determine the tiling parameters for a given NN architecture on target FPGAs. ... After tiling the IFM/OFM/row/col, one convolutional operation is divided to smaller tasks, as shown in Figure 3(c). Each task corresponds to a pair of IFM/OFM tiles. Tasks in one layer will be continuously loaded to a Processing Element (PE) on FPGA for execution (the load sequence is determined by➂FNAS-Sched).”; (EN): each set of IFM/OFM tiles corresponds to a recited “block”, and each processing element performs a convolution operation (corresponding to the recited “plurality of models corresponding to the plurality of blocks”) where such PE models the specific convolution operation for the tile)
However, JIANG, ANTONY, and CHU fail to explicitly teach:
including a plurality of block types; and
wherein each machine learning model of the one or more machine learning models is directed to a different block type
However, in a related field of endeavor (artificial intelligence models, including neural networks, paras. 0002-0003), ZHANG I teaches:
including a plurality of block types; and (ZHANG I, para. 0029: “According to various aspects, an AI model may be broken-up (e.g., split, divided, decomposed, etc.) into a plurality of sub-models by a gateway system (e.g., a blockchain client, etc.) ... The gateway system may break-up the neural network into smaller subsets of neurons (e.g., sub-models) within the neural network, and assign the sub-models to different blockchain peers such that each peer only executes a portion of the AI model, but not the entire AI model. As a non-limiting example, each layer in a deep learning neural network may be a sub-model, and may be assigned to a different blockchain peer (or peers) for training.”; (EN): ZHANG I teaches a plurality of sub-models which can be different layers; the JIANG-ANTONY-CHU-ZHANG I combination now modifies the neural architecture search system of JIANG to break up models by type of layer (e.g., convolutional layer, feed-forward layer, output layer))
wherein each machine learning model of the one or more machine learning models is directed to a different block type. (ZHANG I, para. 0029: “According to various aspects, an AI model may be broken-up (e.g., split, divided, decomposed, etc.) into a plurality of sub-models by a gateway system (e.g., a blockchain client, etc.) ... The gateway system may break-up the neural network into smaller subsets of neurons (e.g., sub-models) within the neural network, and assign the sub-models to different blockchain peers such that each peer only executes a portion of the AI model, but not the entire AI model. As a non-limiting example, each layer in a deep learning neural network may be a sub-model, and may be assigned to a different blockchain peer (or peers) for training.”; (EN): ZHANG I teaches a plurality of sub-models which can be different layers; the JIANG-ANTONY-CHU-ZHANG I combination now modifies the neural architecture search system of JIANG to break up models by type of layer (e.g., convolutional layer, feed-forward layer, output layer) and the sub-models of ZHANG I are directed towards the different types of layers)
Before the effective filing date of the present application, it would have been obvious to one of ordinary skill in the art to combine the hardware-aware neural architecture search system of JIANG with the teachings of ANTONY concerning interpolating look-up table entries, and further with the teachings of CHU pertaining to updating a neural architecture search strategy based on results, and further with the teachings of ZHANG I pertaining to decomposing a neural network into sub-models, including sub-models according to layer. As disclosed by ZHANG I, one of ordinary skill would have been motivated to do so, for example, in distributed computing environments such as a blockchain. (para. 0029).
Further, JIANG discloses that an objective is to “exploit parallelism” (p. 3, section 3.5), and one of ordinary skill in the art would understand that parallelism can reduce computing time.
Regarding Claim 18
JIANG, ANTONY, and CHU teach the HA-NAS system of claim 8. JIANG further teaches:
wherein the first neural network includes a plurality of blocks (JIANG, p. 3, section 3.3: “Due to the limited resource on FPGA, it may be difficult to place a whole convolutional layer on FPGA. In consequence, it is common to apply tiling technique to split convolutional operations into multiple small tasks [8, 12, 13, 15]. FNAS-Design is to determine the tiling parameters for a given NN architecture on target FPGAs. ... After tiling the IFM/OFM/row/col, one convolutional operation is divided to smaller tasks, as shown in Figure 3(c). Each task corresponds to a pair of IFM/OFM tiles. Tasks in one layer will be continuously loaded to a Processing Element (PE) on FPGA for execution (the load sequence is determined by➂FNAS-Sched).”; (EN): each set of IFM/OFM tiles corresponds to a recited “block”, and each processing element performs a convolution operation (corresponding to the recited “plurality of models corresponding to the plurality of blocks”) where such PE models the specific convolution operation for the tile)
However, JIANG, ANTONY, and CHU fail to explicitly teach:
including a plurality of block types; and
wherein each machine learning model of the one or more machine learning models is directed to a different block type and a different hardware target
However, in a related field of endeavor (artificial intelligence models, including neural networks, paras. 0002-0003), ZHANG I teaches:
including a plurality of block types; and (ZHANG I, para. 0029: “According to various aspects, an AI model may be broken-up (e.g., split, divided, decomposed, etc.) into a plurality of sub-models by a gateway system (e.g., a blockchain client, etc.) ... The gateway system may break-up the neural network into smaller subsets of neurons (e.g., sub-models) within the neural network, and assign the sub-models to different blockchain peers such that each peer only executes a portion of the AI model, but not the entire AI model. As a non-limiting example, each layer in a deep learning neural network may be a sub-model, and may be assigned to a different blockchain peer (or peers) for training.”; (EN): ZHANG I teaches a plurality of sub-models which can be different layers; the JIANG-ANTONY-CHU-ZHANG I combination now modifies the neural architecture search system of JIANG to break up models by type of layer (e.g., convolutional layer, feed-forward layer, output layer))
wherein each machine learning model of the one or more machine learning models is directed to a different block type and a different hardware target. (ZHANG I, para. 0029: “According to various aspects, an AI model may be broken-up (e.g., split, divided, decomposed, etc.) into a plurality of sub-models by a gateway system (e.g., a blockchain client, etc.) ... The gateway system may break-up the neural network into smaller subsets of neurons (e.g., sub-models) within the neural network, and assign the sub-models to different blockchain peers such that each peer only executes a portion of the AI model, but not the entire AI model. As a non-limiting example, each layer in a deep learning neural network may be a sub-model, and may be assigned to a different blockchain peer (or peers) for training.”; (EN): ZHANG I teaches a plurality of sub-models which can be different layers; the JIANG-ANTONY-CHU-ZHANG I combination now modifies the neural architecture search system of JIANG to break up models by type of layer (e.g., convolutional layer, feed-forward layer, output layer) and the sub-models of ZHANG I are directed towards the different types of layers and/or different processing elements for FPGAs of JIANG (see p. 3, section 3.3 and p. 5, section 4))
Before the effective filing date of the present application, it would have been obvious to one of ordinary skill in the art to combine the hardware-aware neural architecture search system of JIANG with the teachings of ANTONY concerning interpolating look-up table entries, and further with the teachings of CHU pertaining to updating a neural architecture search strategy based on results, and further with the teachings of ZHANG I pertaining to decomposing a neural network into sub-models, including sub-models according to layer. As disclosed by ZHANG I, one of ordinary skill would have been motivated to do so, for example, in distributed computing environments such as a blockchain. (para. 0029).
Further, JIANG discloses that an objective is to “exploit parallelism” (p. 3, section 3.5), and one of ordinary skill in the art would understand that parallelism can reduce computing time.
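For illustration only, the following sketch shows the arrangement mapped above for claims 16-18: a separate predictor model per block type and per hardware target, with each block dispatched to the matching model. The target names follow the FPGAs disclosed in section 4 of JIANG; the block types and predictor functions are hypothetical.

PREDICTORS = {
    # (block type, hardware target) -> hypothetical per-block latency model.
    ("conv", "7A50T"):  lambda blk: 0.02 * blk["channels"],
    ("conv", "7Z020"):  lambda blk: 0.03 * blk["channels"],
    ("dense", "7A50T"): lambda blk: 0.01 * blk["units"],
    ("dense", "7Z020"): lambda blk: 0.015 * blk["units"],
}

def predict_block(block: dict, target: str) -> float:
    # Dispatch the block to the model for its type on the chosen hardware.
    return PREDICTORS[(block["type"], target)](block)

network = [{"type": "conv", "channels": 18}, {"type": "dense", "units": 128}]
print(sum(predict_block(b, "7Z020") for b in network))  # 2.46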
Claims 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over JIANG in view of ANTONY and CHU and further in view of US 20130110756 A1, hereinafter referenced as ZHANG II.
Regarding Claim 19
JIANG, ANTONY, and CHU teach the HA-NAS system of claim 8. However, JIANG, ANTONY, and CHU fail to explicitly teach:
wherein assessing the first task accuracy uses an accuracy predictor function.
However, in a related field of endeavor (load forecasting using neural networks, see para. 0018), ZHANG II teaches:
wherein assessing the first task accuracy uses an accuracy predictor function. (ZHANG II, para. 0018: “The present disclosure focuses on the support vector regression approach due to its accuracy and efficiency in practical prediction problems.”; (EN): the JIANG-ANTONY-CHU-ZHANG II combination now modifies the hardware-aware neural architecture search system of JIANG to use a support vector regression approach as in ZHANG II to predict the accuracy of a child network instead of fully training and measuring the accuracy, e.g., in order to save time and computing resources)
Before the effective filing date of the present application, it would have been obvious to one of ordinary skill in the art to combine the hardware-aware neural architecture search system of JIANG with the teachings of ANTONY concerning interpolating look-up table entries, and further with the teachings of CHU pertaining to updating a neural architecture search strategy based on results, and to use a support vector regression approach as in ZHANG II. As disclosed by ZHANG II, one of ordinary skill would have been motivated to do so because ZHANG II teaches techniques for improving such prediction accuracy. (para. 0018). Moreover, one of ordinary skill would understand the benefit of predicting accuracy instead of fully training and measuring a child model’s accuracy because the prediction can save computation time and resources and reduce latency.
Regarding Claim 20
JIANG, ANTONY, CHU, and ZHANG II teach the HA-NAS system of claim 19. However, JIANG, ANTONY, and CHU fail to explicitly teach:
wherein the accuracy predictor function is based on support vector regression in combination with an early stopping scheme.
However, in a related field of endeavor (load forecasting using neural networks, see para. 0018), ZHANG II teaches:
wherein the accuracy predictor function is based on support vector regression in combination with an early stopping scheme. (ZHANG II, para. 0018: “The present disclosure focuses on the support vector regression approach due to its accuracy and efficiency in practical prediction problems.”; (EN): the JIANG-ANTONY-CHU-ZHANG II combination now modifies the hardware-aware neural architecture search system of JIANG to use a support vector regression approach as in ZHANG II to predict the accuracy of a child network instead of fully training and measuring the accuracy, and will stop if the predicted reward function is lower than L/rL as disclosed by p. 3, section 3.1 of JIANG).
Before the effective filing date of the present application, it would have been obvious to one of ordinary skill in the art to combine the hardware-aware neural architecture search system of JIANG with the teachings of ANTONY concerning interpolating look-up table entries, and further with the teachings of CHU pertaining to updating a neural architecture search strategy based on results, and to use a support vector regression approach as in ZHANG II. As disclosed by ZHANG II, one of ordinary skill would have been motivated to do so because ZHANG II teaches techniques for improving such prediction accuracy. (para. 0018). Moreover, one of ordinary skill would understand the benefit of predicting accuracy instead of fully training and measuring a child model’s accuracy because the prediction can save computation time and resources and reduce latency.
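For illustration only, the following sketch shows an accuracy predictor function of the general kind discussed for claims 19 and 20: a support vector regression fit on early-epoch learning curves, used to stop training early when the predicted final accuracy falls below a threshold. The training data, the threshold, and the coupling of the predictor to an early-stopping check are hypothetical assumptions, not ZHANG II’s implementation.

import numpy as np
from sklearn.svm import SVR

# Hypothetical history: first-5-epoch validation accuracies -> final accuracy.
X_hist = np.array([[0.20, 0.40, 0.50, 0.55, 0.60],
                   [0.30, 0.50, 0.60, 0.65, 0.70],
                   [0.10, 0.20, 0.25, 0.30, 0.32]])
y_hist = np.array([0.82, 0.90, 0.55])

predictor = SVR(kernel="rbf").fit(X_hist, y_hist)  # the accuracy predictor function

def should_stop_early(partial_curve, threshold=0.60):
    # Predict the final accuracy from a partial learning curve and
    # stop (return True) if the prediction falls below the threshold.
    predicted = float(predictor.predict(np.array([partial_curve]))[0])
    return predicted < threshold, predicted

stop, predicted = should_stop_early([0.15, 0.25, 0.30, 0.33, 0.35])
print(stop, predicted)  # prints whether this candidate would be stopped early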
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL C LEE whose telephone number is (571)272-4933. The examiner can normally be reached M-F 12:00 pm - 8:00 pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Omar Fernandez Rivas can be reached at 571-272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MICHAEL C. LEE/Examiner, Art Unit 2128