Last updated: May 29, 2026
Application No. 17/459,644
DETERMINING ONE OR MORE NEURAL NETWORKS FOR OBJECT CLASSIFICATION

Final Rejection §101 §103
Filed
Aug 27, 2021
Examiner
HOOVER, BRENT JOHNSTON
Art Unit
2127
Tech Center
2100 — Computer Architecture & Software
Assignee
Nvidia Corporation
OA Round
4 (Final)
Interview Optional

Examiner has a relatively high allowance rate (83%) and a +23.4% interview lift. A written response may suffice.
Based on 363 resolved cases, 2023–2026
Examiner Intelligence

HOOVER, BRENT JOHNSTON
Grants 83% — above average
Career Allowance Rate
300 granted / 363 resolved
+27.6% vs TC avg
Interview Lift: +23.4% (resolved cases with interview vs. without)
Typical timeline
3y 5m
Avg Prosecution
21 currently pending
Career history
387
Total Applications
across all art units
Statute-Specific Performance

§101: 22.0% (-18.0% vs TC avg)
§103: 64.9% (+24.9% vs TC avg)
§102: 6.6% (-33.4% vs TC avg)
§112: 3.7% (-36.3% vs TC avg)
Based on career data from 363 resolved cases
Office Action

§101 §103
DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This action is responsive to the original application filed on 8/27/2021 and the Remarks and Amendments filed on 2/17/2026.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
  

Claims 1-18 and 25-30 are further rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.  The analysis of the claims will follow the 2019 Revised Patent Subject Matter Eligibility Guidance, 84 Fed. Reg. 50 (“2019 PEG”).

When considering subject matter eligibility under 35 U.S.C. 101, it must be determined whether the claim is directed to one of the four statutory categories of invention, i.e., process, machine, manufacture, or composition of matter (Step 1). If the claim does fall within one of the statutory categories, the second step in the analysis is to determine whether the claim is directed to a judicial exception (Step 2A). The Step 2A analysis is broken into two prongs. In the first prong (Step 2A, Prong 1), it is determined whether or not the claims recite a judicial exception (e.g., mathematical concepts, mental processes, certain methods of organizing human activity). If it is determined in Step 2A, Prong 1 that the claims recite a judicial exception, the analysis proceeds to the second prong (Step 2A, Prong 2), where it is determined whether or not the claims integrate the judicial exception into a practical application. If it is determined at Step 2A, Prong 2 that the claims do not integrate the judicial exception into a practical application, the analysis proceeds to determining whether the claim is a patent-eligible application of the exception (Step 2B). If an abstract idea is present in the claim, any element or combination of elements in the claim must be sufficient to ensure that the claim integrates the judicial exception into a practical application, or else amounts to significantly more than the abstract idea itself.
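
The order of the eligibility steps described above can be sketched as a small decision function. This is only an illustrative sketch: the `ClaimFindings` container and the `is_eligible` function are hypothetical names introduced here, and each boolean stands in for a finding that an examiner would reach through legal analysis rather than computation.

```python
from dataclasses import dataclass


@dataclass
class ClaimFindings:
    """Hypothetical container for the findings reached on a single claim."""
    statutory_category: bool          # Step 1: process, machine, manufacture, or composition
    recites_judicial_exception: bool  # Step 2A, Prong 1
    practical_application: bool       # Step 2A, Prong 2
    significantly_more: bool          # Step 2B


def is_eligible(findings: ClaimFindings) -> bool:
    """Walk the steps in the order given in the paragraph above."""
    if not findings.statutory_category:
        return False  # fails Step 1
    if not findings.recites_judicial_exception:
        return True   # no judicial exception recited, analysis ends
    if findings.practical_application:
        return True   # exception integrated into a practical application
    return findings.significantly_more  # Step 2B decides the outcome


# The findings stated for claim 1 in this action: a machine that recites a
# mental process, with no practical application and nothing significantly more.
claim_1 = ClaimFindings(True, True, False, False)
print(is_eligible(claim_1))
```

Under the findings stated for claim 1, the function returns `False`, which matches the conclusion that the claim is ineligible.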

Claim 1
Step 1:  The claim recites a processor; thus, it is directed to the statutory category of a machine.
Step 2A Prong 1:  The claim recites, inter alia:
select, based on an inference task to be performed, a network configuration from a configuration search space comprising a plurality of encodings of different combinations of neural architectures and data augmentations, wherein each respective encoding of the plurality of encodings encodes a distinct combination of a respective neural architecture and a respective set of one or more data augmentations together in the respective encoding: Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of selecting a neural network configuration comprising encodings of neural architectures and data augmentations, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
one or more data augmentations to be applied to data to train the one or more second neural networks: Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of augmenting data to use for training, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  Specifically, the additional elements consist of “one or more circuits to use one or more first neural networks to” and “wherein the selected network configuration comprises: a neural architecture for one or more second neural networks”. 
	The additional element “one or more circuits to” amounts to invoking computers or other machinery merely as a tool to perform existing processes or judicial exceptions. The additional element “use one or more first neural networks to” amounts to reciting only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished because it is unclear how the first neural network is used to select a second neural network. Thus, these additional elements amount to no more than a recitation of the words "apply it" (or an equivalent) or are no more than mere instructions to implement an abstract idea or other exception on a computer (see MPEP §2106.05(f)).
The additional element of “wherein the selected network configuration comprises: a neural architecture for one or more second neural networks” amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)).
Thus, the additional elements do not provide any meaningful limits on the execution of the abstract idea. Even when viewed in combination, these additional elements do not integrate the abstract idea into a practical application and the claim is thus directed to the abstract idea.
Step 2B:  The claim does not contain significantly more than the judicial exception.  
	The additional element “one or more circuits to” amounts to invoking computers or other machinery merely as a tool to perform existing processes or judicial exceptions. The additional element “use one or more first neural networks to” amounts to reciting only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished because it is unclear how the first neural network is used to select a second neural network. Thus, these additional elements amount to no more than a recitation of the words "apply it" (or an equivalent) or are no more than mere instructions to implement an abstract idea or other exception on a computer (see MPEP §2106.05(f)).
The additional element of “wherein the selected network configuration comprises: a neural architecture for one or more second neural networks” amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)).
Nothing in the claim provides significantly more than that abstract idea.  As such, the claim is ineligible.

Claim 2
Step 1:  A machine, as above.
Step 2A Prong 1: The claim recites the abstract ideas in the preceding claims from which it depends.
Step 2A Prong 2, Step 2B: The additional element of “wherein the neural architecture and the one or more data augmentations are to be selected together based, at least in part, upon a type of information to be inferenced by the one or more second neural networks” amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use (see MPEP §2106.05(h)).  Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 3
Step 1:  A machine, as above.
Step 2A Prong 1: The claim recites, inter alia:
predict which network configuration, for each of a plurality of pairs of candidate second neural networks, will be more accurate for an inference to be generated by the one or more second neural networks: Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of predicting which neural network will be accurate for an inference, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2, Step 2B: The additional element of “wherein the one or more first neural networks include a relational predictor network” amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use (see MPEP §2106.05(h)).  Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 4
Step 1:  A machine, as above.
Step 2A Prong 1: The claim recites, inter alia:
select a set of hyperparameters for the one or more second neural networks: Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of selecting a set of hyperparameters, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2, Step 2B: The additional element of “the one or more circuits to use the one or more first neural networks to” amounts to reciting only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished because it is unclear how the first neural network is used to select hyperparameters. Thus, this additional element amounts to no more than a recitation of the words "apply it" (or an equivalent) or is no more than mere instructions to implement an abstract idea or other exception on a computer (see MPEP §2106.05(f)).
 Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 5
Step 1:  A machine, as above.
Step 2A Prong 1: The claim recites the abstract ideas in the preceding claims from which it depends.
Step 2A Prong 2, Step 2B: The additional element of “wherein the neural architecture, the one or more data augmentations, and the set of hyperparameters are part of at least one network configuration, wherein the at least one network configuration is selected from a plurality of candidate configurations sampled from the configuration search space, wherein the candidate configurations in the configuration search space correspond to architectures having at least one of different numbers, types, connections, asymmetries, or spatial resolutions of network layers” amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use (see MPEP §2106.05(h)).  Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 6
Step 1:  A machine, as above.
Step 2A Prong 1: The claim recites, inter alia:
wherein the candidate configurations are sorted based on results of the relational predictor network with respect to other candidate configurations using a binary classification between pairs of candidate configurations without predicting accuracy values for the candidate configurations: Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of sorting configurations using a binary classification, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2, Step 2B: The additional elements of “wherein the one or more first neural networks comprise a relational predictor network, wherein the candidate configurations are encoded as vectors to be compared by the relational predictor network” amount to no more than generally linking the use of a judicial exception to a particular technological environment or field of use (see MPEP §2106.05(h)).  Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.
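
The sorting limitation of claim 6 (ordering candidate configurations from pairwise binary decisions, without predicting an accuracy value for any candidate) can be illustrated with a comparator-based sort. This is a minimal sketch under stated assumptions: `rank_configs` and `toy_predictor` are hypothetical names, and the toy predictor is a stand-in rule, not the relational predictor network of the application.

```python
from functools import cmp_to_key
from typing import Callable, List, Sequence

Vector = Sequence[float]


def rank_configs(
    configs: List[Vector],
    first_is_better: Callable[[Vector, Vector], bool],
) -> List[Vector]:
    """Sort configurations best-first using only pairwise binary decisions."""

    def compare(a: Vector, b: Vector) -> int:
        if first_is_better(a, b):
            return -1  # a ranks ahead of b
        if first_is_better(b, a):
            return 1   # b ranks ahead of a
        return 0       # predictor expresses no preference

    # Note: a learned pairwise classifier is not guaranteed to be transitive,
    # so the resulting order is only as consistent as the predictor itself.
    return sorted(configs, key=cmp_to_key(compare))


def toy_predictor(a: Vector, b: Vector) -> bool:
    """Stand-in for a trained pairwise classifier (assumption for illustration)."""
    return sum(a) > sum(b)


candidates = [[0.1, 0.2], [0.9, 0.4], [0.5, 0.5]]
ranked = rank_configs(candidates, toy_predictor)
print(ranked)
```

No accuracy score is ever computed or stored for an individual candidate; the ordering emerges entirely from the answers to "which of these two is better?".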

Claims 7-12
Claims 7-12 recite a system (step 1: a machine) using a processor to perform the steps of claims 1-6, which by MPEP 2106.05(f) (“apply it”) cannot integrate an abstract idea into a practical application or provide significantly more than the abstract idea by itself, and are thus rejected for the same reasons set forth in the rejection of claims 1-6.

Claims 13-18
Claims 13-18 recite a method (step 1: a process) to perform the steps of claims 1-6, without any additional elements that integrate the abstract ideas into a practical application or provide significantly more than the abstract idea by itself, and are thus rejected for the same reasons set forth in the rejection of claims 1-6, respectively.

Claim 25
Step 1:  The claim recites a system; thus, it is directed to the statutory category of a machine.
Step 2A Prong 1:  The claim recites, inter alia:
select, based on an inference task to be performed, a network configuration from a configuration search space comprising a plurality of encodings of different combinations of neural architectures and data augmentations, wherein each respective encoding of the plurality of encodings encodes a distinct combination of a respective neural architecture and a respective set of one or more data augmentations together in the respective encoding: Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of selecting a neural network configuration comprising encodings of neural architectures and data augmentations, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
one or more data augmentations to be applied to data to train the one or more second neural networks: Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of augmenting data to use for training, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  Specifically, the additional elements consist of “one or more processors to use one or more first neural networks to”, “wherein the selected network configuration comprises: a neural architecture for one or more second neural networks”, and “memory for storing network parameters for the one or more first or second neural networks”. 
	The additional element “one or more processors to” amounts to invoking computers or other machinery merely as a tool to perform existing processes or judicial exceptions. The additional element “use one or more first neural networks to” amounts to reciting only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished because it is unclear how the first neural network is used to select a second neural network. Thus, these additional elements amount to no more than a recitation of the words "apply it" (or an equivalent) or are no more than mere instructions to implement an abstract idea or other exception on a computer (see MPEP §2106.05(f)).
The additional element “memory for storing network parameters for the one or more first or second neural networks” is an insignificant extra-solution activity (see MPEP §2106.05(g)).
The additional element of “wherein the selected network configuration comprises: a neural architecture for one or more second neural networks” amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)).
Thus, the additional elements do not provide any meaningful limits on the execution of the abstract idea. Even when viewed in combination, these additional elements do not integrate the abstract idea into a practical application and the claim is thus directed to the abstract idea.
Step 2B:  The claim does not contain significantly more than the judicial exception.  
	The additional element “one or more processors to” amounts to invoking computers or other machinery merely as a tool to perform existing processes or judicial exceptions. The additional element “use one or more first neural networks to” amounts to reciting only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished because it is unclear how the first neural network is used to select a second neural network. Thus, these additional elements amount to no more than a recitation of the words "apply it" (or an equivalent) or are no more than mere instructions to implement an abstract idea or other exception on a computer (see MPEP §2106.05(f)).
The additional element “memory for storing network parameters for the one or more first or second neural networks” is an insignificant extra-solution activity (see MPEP §2106.05(g)), and is a well-understood, routine, conventional activity (see MPEP § 2106.05(d); “Receiving or transmitting data over a network” and “Storing and retrieving information in memory”).
The additional element of “wherein the selected network configuration comprises: a neural architecture for one or more second neural networks” amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)).
Nothing in the claim provides significantly more than that abstract idea.  As such, the claim is ineligible.

Claims 26-30
Claims 26-30 recite a system (step 1: a machine) using a processor to perform the steps of claims 2-6, which by MPEP 2106.05(f) (“apply it”) cannot integrate an abstract idea into a practical application or provide significantly more than the abstract idea by itself, and are thus rejected for the same reasons set forth in the rejection of claims 2-6.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.



Claims 1-18 and 25-30 are rejected under 35 U.S.C. 103 as being obvious over Dai et al. (Dai et al., “FBNetV3: Joint Architecture-Recipe Search using Predictor Pretraining”, Mar. 30, 2021, arXiv:2006.02049v3, pp. 1-13, hereinafter “Dai”) in view of Kashima et al. (Kashima et al., “Joint Search of Data Augmentation Policies and Network Architectures”, Jan 12, 2021, arXiv:2012.09407v2, pp. 1-5, hereinafter “Kashima”) and Kaur et al. (US 20210097383 A1, hereinafter “Kaur”).

Regarding claim 1, Dai discloses [a] processor, comprising: one or more circuits to (Page 5, §4; Dai inherently uses a processor with a circuit to implement the Experiments section of the paper)
use one or more first neural networks to select: … a network configuration from a configuration search space (Abstract; “To address this, we present Neural Architecture-Recipe Search (NARS) to search both (a) architectures and (b) their corresponding training recipes, simultaneously. NARS utilizes an accuracy predictor that scores architecture and training recipes jointly, guiding both sample selection and ranking”, which discloses using a first neural network to select a second neural network using a neural architecture search technique; and Page 3, Algorithm 1; the algorithm discloses using a first neural network “u” to select a second neural network when it is retrained as part of the neural architecture search.  Specifically, this is reflected in the lines “Retrain the accuracy predictor u on Dt” and “Initialize D* with p best-performing samples in DT and q randomly generated samples paired with scores predicted by u” and “Augment D* with C paired with scores predicted by u” as a way to determine or select the second neural network “u”/predictor; and Page 3, §3.1; “Our predictor aims to predict accuracy given representations of an architecture and a training recipe. The architecture and training recipe are encoded using one-hot categorical variables (e.g., for block types) and min-max normalized continuous values (e.g., for channel counts). See the full search space in Table 2. The predictor architecture is a multi-layer perceptron (Fig. 3) consisting of several fully-connected layers and two heads: (1) An auxiliary “proxy” head, used for pretraining the encoder, predicts architecture statistics (e.g., FLOPs and #Parameters) from architecture representations; and (2) the accuracy head, fine-tuned in constrained iterative optimization (Sec 3.3), predicts accuracy from joint representations of the architecture and training recipe”; and see generally §3.4 and 3.5; and Figure 3).
Dai fails to explicitly disclose but Kashima discloses select … , based on an inference task to be performed, a network configuration from a configuration search space comprising a plurality of encodings of different combinations of neural architectures and data augmentations (Page 2, Column 1; “Specifically, we jointly optimize the differentiable approaches for augmentation policy search … and architecture search”, which discloses selecting a network configuration from a search space with respect to neural architectures and data augmentations; and Page 2, Column 2; “In terms of search spaces, even a single part in the training pipeline has a large one. For example, an efficient NAS method, ENAS (Pham et al. 2018), still has a large search space over 1.3 × 10^11 possible networks. AutoAugment (Cubuk et al. 2019) is a method to automatically choose the data augmentation policies during training, and the search space has roughly 2.9 × 10^32 possibilities. It means that searching over all possibilities of the combination of these two parts will have about 3.8 × 10^43 possibilities”; and Algorithm 1; the algorithm jointly optimizes encodings of different combinations of neural architectures and data augmentations, and the encodings are represented as alpha, z, p and mu values in the algorithm; and Page 3, Column 1; “We sequentially combine Faster-AA and DARTS, then optimize both in the end-to-end manner … Specifically, augmentation policies and network architectures are both optimized by minimizing a loss function on the validation dataset, while the network weights are optimized using the training dataset”, which discloses that a joint configuration search space includes architecture parameters (DARTS) and augmentation policies (Faster-AA). The optimization procedure in Algorithm 1 selects a configuration from this search space. Further, the selection of the configuration is task specific because it is optimized to minimize a validation loss on a target dataset)
 wherein the selected network configuration comprises: a neural architecture for one or more second neural networks, and (Page 2, Column 2; “Each cell is represented as a directed acyclic graph (DAG) consisting of N nodes which represent intermediate features (e.g., feature maps). A cell takes two input nodes and one output node, and an edge f (i,j) between two nodes i, j represents an operation such as convolution or pooling.”, which discloses wherein the configuration includes a neural architecture that comprises nodes and neural network functions)
one or more data augmentations to be applied to data to train the one or more second neural networks (Page 2, Column 1; “Data augmentation is a series of transformation applied on the input data. Typically, we have to choose which operations should be applied with what magnitudes. Several methods for automatic search of probability distribution on the selection of operators and their magnitudes have been proposed (Cubuk et al. 2019; Lim et al. 2019; Hataya et al. 2020). The Faster-AA considers that a policy consists of L sub-policies each of which has K consecutive operations”, which discloses the data augmentation technique that is applied to the data to train a neural network; and Page 2, Column 1; “A sub-policy is a series of K operations, and during training, the output of kth operation X′ from an input X is calculated as a weighted sum over all possible #O operations as follows”, which discloses that the augmentation is applied to training data; and Algorithm 1).
Dai and Kashima are analogous art because both are concerned with neural architecture search. Before the effective filing date of the claimed invention, it would have been obvious to one skilled in neural architecture search to combine the data augmentation and architecture search of Kashima with the neural architecture search process of Dai to yield the predictable result of selecting, based on an inference task to be performed, a network configuration from a configuration search space comprising a plurality of encodings of different combinations of neural architectures and data augmentations, wherein the selected network configuration comprises: a neural architecture for one or more second neural networks, and one or more data augmentations to be applied to data to train the one or more second neural networks.  The motivation for doing so would be to provide for a joint optimization method for data augmentation policies and network architectures to bring more automation to the design of training pipeline (Kashima; Abstract).
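
The search-space sizes quoted from Kashima multiply together because each architecture can be paired with each augmentation policy. The short check below assumes the extracted figures read as roughly 1.3 × 10^11 architectures and 2.9 × 10^32 augmentation policies; the variable names are introduced here for illustration only.

```python
# Sizes as read from the Kashima passage quoted above (interpretation of the
# extracted figures, stated here as an assumption).
architectures = 1.3e11   # ENAS search space
policies = 2.9e32        # AutoAugment search space

# A joint space pairs every architecture with every augmentation policy.
joint = architectures * policies
print(f"{joint:.1e}")
```

The product, about 3.8 × 10^43, agrees with the combined figure in the quoted passage, which supports reading the three numbers in scientific notation.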
Dai fails to explicitly disclose but Kaur discloses wherein each respective encoding of the plurality of encodings encodes a distinct combination of a respective neural architecture and a respective set of one or more data augmentations together in the respective encoding ([0036]; “The multi-objective learning process may select the optimal combination by implementing at least one search space. The at least one search may include a combined search space comprising (i) a plurality of deep learning architectures and (ii) a plurality of data pre-processing steps … The multi-objective learning process may include training a recurrent neural network to rank combinations of deep learning architectures and data pre-processing strategies from the at least one search space, based on the data and the deep learning task, and wherein the optimal combination may be selected based on said ranking. The ranking may rank the combinations according to at least one performance metric for performing the deep learning task relative to other combinations. Each of the combinations may include at least one of: a deep learning architecture different from the other combinations, and a data pre-processing strategy different from the other combinations … The one or more data pre-processing steps may include at least one of: …  augmenting at least a portion of said data” (emphasis added), wherein an encoding is interpreted as an optimal combination, and the encoding encodes or represents a distinct combination of a neural or deep learning architecture and a data augmentation or pre-processing step. Note that the originally filed specification at [0058] discloses that a combination of encoding is a reference for a predictor to determine an optimal architecture and training configuration, so it follows that the claimed “encoding” is any reference or combination of a neural architecture and data augmentation information, such as the “optimal combination” that is disclosed in Kaur).
Dai, Kashima, and Kaur are analogous art because all are concerned with neural architecture search. Before the effective filing date of the claimed invention, it would have been obvious to one skilled in neural architecture search to combine the combined encoding of data augmentation and architecture search of Kaur with the neural architecture search process of Dai and Kashima to yield the predictable result of wherein each respective encoding of the plurality of encodings encodes a distinct combination of a respective neural architecture and a respective set of one or more data augmentations together in the respective encoding.  The motivation for doing so would be to provide for a joint optimization method for data augmentation policies and network architectures to allow an optimal architecture to be found for a particular dataset in an improved manner (Kaur; [0018]).
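
One way to picture an encoding that carries a neural architecture and a set of data augmentations "together" is a single vector built from one-hot categorical entries and min-max normalized continuous values, the two encoding styles named in the Dai passage quoted earlier. The sketch below is hypothetical throughout: the block types, augmentation names, channel range, and the `encode` function are assumptions made for illustration and are not taken from the application or from any cited reference.

```python
from typing import Iterable, List, Sequence

# Hypothetical vocabularies and ranges, chosen only for illustration.
BLOCK_TYPES = ["conv3x3", "conv5x5", "mbconv"]
AUGMENTATIONS = ["flip", "rotate", "cutout"]
CHANNEL_RANGE = (16, 256)


def one_hot(value: str, choices: Sequence[str]) -> List[float]:
    """Encode a single categorical choice."""
    return [1.0 if c == value else 0.0 for c in choices]


def multi_hot(values: Iterable[str], choices: Sequence[str]) -> List[float]:
    """Encode a set of selected items, such as several augmentations."""
    selected = set(values)
    return [1.0 if c in selected else 0.0 for c in choices]


def min_max(x: float, lo: float, hi: float) -> float:
    """Scale a continuous value into the range [0, 1]."""
    return (x - lo) / (hi - lo)


def encode(block_type: str, channels: int, augmentations: Iterable[str]) -> List[float]:
    """Concatenate architecture and augmentation features into one vector."""
    return (
        one_hot(block_type, BLOCK_TYPES)
        + [min_max(channels, *CHANNEL_RANGE)]
        + multi_hot(augmentations, AUGMENTATIONS)
    )


vec = encode("mbconv", 136, {"flip", "cutout"})
print(vec)
```

In this sketch the first four entries describe the architecture and the last three describe the augmentation set, so two candidates that differ in either part produce distinct vectors.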

Regarding claim 7, it is a system claim corresponding to the steps of claim 1, and is rejected for the same reasons as claim 1.

Regarding claim 13, it is a method claim corresponding to the steps of claim 1, and is rejected for the same reasons as claim 1.

Regarding claim 25, Dai discloses [a] network selection system, comprising: one or more processors to (Page 5, §4; Dai inherently uses a processor as part of a system to implement the Experiments section of the paper)
use one or more first neural networks to select: … a network configuration from a configuration search space (Abstract; “To address this, we present Neural Architecture-Recipe Search (NARS) to search both (a) architectures and (b) their corresponding training recipes, simultaneously. NARS utilizes an accuracy predictor that scores architecture and training recipes jointly, guiding both sample selection and ranking”, which discloses using a first neural network to select a second neural network using a neural architecture search technique; and Page 3, Algorithm 1; the algorithm discloses using a first neural network “u” to select a second neural network when it is retrained as part of the neural architecture search.  Specifically, this is reflected in the lines “Retrain the accuracy predictor u on Dt” and “Initialize D* with p best-performing samples in DT and q randomly generated samples paired with scores predicted by u” and “Augment D* with C paired with scores predicted by u” as a way to determine or select the second neural network “u”/predictor; and Page 3, §3.1; “Our predictor aims to predict accuracy given representations of an architecture and a training recipe. The architecture and training recipe are encoded using one-hot categorical variables (e.g., for block types) and min-max normalized continuous values (e.g., for channel counts). See the full search space in Table 2. The predictor architecture is a multi-layer perceptron (Fig. 3) consisting of several fully-connected layers and two heads: (1) An auxiliary “proxy” head, used for pretraining the encoder, predicts architecture statistics (e.g., FLOPs and #Parameters) from architecture representations; and (2) the accuracy head, fine-tuned in constrained iterative optimization (Sec 3.3), predicts accuracy from joint representations of the architecture and training recipe”; and see generally §3.4 and 3.5; and Figure 3).
memory for storing network parameters for the one or more first or second neural networks (Page 5, §4; Dai inherently uses a memory as part of a system to implement the Experiments section and algorithms of the paper and to store parameters).
Dai fails to explicitly disclose but Kashima discloses select … , based on an inference task to be performed, a network configuration from a configuration search space comprising a plurality of encodings of different combinations of neural architectures and data augmentations (Page 2, Column 1; “Specifically, we jointly optimize the differentiable approaches for augmentation policy search … and architecture search”, which discloses selecting a network configuration from a search space with respect to neural architectures and data augmentations; and Page 2, Column 2; “In terms of search spaces, even a single part in the training pipeline has a large one. For example, an efficient NAS method, ENAS (Pham et al. 2018), still has a large search space over 1.3 × 10^11 possible networks. Auto Augment (Cubuk et al. 2019) is a method to automatically choose the data augmentation policies during training, and the search space has roughly 2.9 × 10^32 possibilities. It means that searching over all possibilities of the combination of these two parts will have about 3.8 × 10^43 possibilities”; and Algorithm 1; the algorithm jointly optimizes encodings of different combinations of neural architectures and data augmentations, and the encodings are represented as alpha, z, p and mu values in the algorithm; and Page 3, Column 1; “We sequentially combine Faster-AA and DARTS, then optimize both in the end-to-end manner … Specifically, augmentation policies and network architectures are both optimized by minimizing a loss function on the validation dataset, while the network weights are optimized using the training dataset”, which discloses a joint configuration search space includes architecture parameters (DARTS) and augmentation policies (Faster-AA). The optimization procedure in algorithm 1 selects a configuration from this search space. Further, the selection of the configuration is task specific because it is optimized to minimize a validation loss on a target dataset)
 wherein the selected network configuration comprises: a neural architecture for one or more second neural networks, and (Page 2, Column 2; “Each cell is represented as a directed acyclic graph (DAG) consisting of N nodes which represent intermediate features (e.g., feature maps). A cell takes two input nodes and one output node, and an edge f (i,j) between two nodes i, j represents an operation such as convolution or pooling.”, which discloses wherein the configuration includes a neural architecture that comprises nodes and neural network functions)
one or more data augmentations to be applied to data to train the one or more second neural networks (Page 2, Column 1; “Data augmentation is a series of transformation applied on the input data. Typically, we have to choose which operations should be applied with what magnitudes. Several methods for automatic search of probability distribution on the selection of operators and their magnitudes have been proposed (Cubuk et al. 2019; Lim et al. 2019; Hataya et al. 2020). The Faster-AA considers that a policy consists of L sub-policies each of which has K consecutive operations”, which discloses the data augmentation technique that is applied to the data to train a neural network; and Page 2, Column 2; “A sub-policy is a series of K operations, and during training, the output of kth operation X′ from an input X is calculated as a weighted sum over all possible #O operations as follows”, which discloses that the augmentation is applied to training data; and Algorithm 1).
The motivation to combine Dai and Kashima is the same as discussed above with respect to claim 1.
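The size of the joint search space quoted from Kashima can be checked with simple arithmetic. The sketch below reads the quoted sizes as 1.3 × 10^11 (ENAS architectures) and 2.9 × 10^32 (AutoAugment policies); the variable names are illustrative and do not come from either reference.

```python
# Search-space sizes as quoted from Kashima in the rejection above.
# The variable names are hypothetical, chosen only for this illustration.
nas_space = 1.3e11   # possible networks in the ENAS search space
aug_space = 2.9e32   # possible AutoAugment policies

# Every architecture can be paired with every augmentation policy,
# so the joint space is the product of the two.
joint_space = nas_space * aug_space

print(f"{joint_space:.1e}")  # 3.8e+43
```

The product agrees with the "about 3.8 × 10^43 possibilities" figure in the quoted passage.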
Dai fails to explicitly disclose but Kaur discloses wherein each respective encoding of the plurality of encodings encodes a distinct combination of a respective neural architecture and a respective set of one or more data augmentations together in the respective encoding ([0036]; “The multi-objective learning process may select the optimal combination by implementing at least one search space. The at least one search may include a combined search space comprising (i) a plurality of deep learning architectures and (ii) a plurality of data pre-processing steps … The multi-objective learning process may include training a recurrent neural network to rank combinations of deep learning architectures and data pre-processing strategies from the at least one search space, based on the data and the deep learning task, and wherein the optimal combination may be selected based on said ranking. The ranking may rank the combinations according to at least one performance metric for performing the deep learning task relative to other combinations. Each of the combinations may include at least one of: a deep learning architecture different from the other combinations, and a data pre-processing strategy different from the other combinations … The one or more data pre-processing steps may include at least one of: …  augmenting at least a portion of said data”, wherein an encoding is interpreted as an optimal combination, and the encoding encodes or represents a distinct combination of a neural or deep learning architecture and a data augmentation or pre-processing step. Note that the originally filed specification at [0058] discloses that a combination of encoding is a reference for a predictor to determine an optimal architecture and training configuration, so it follows that the claimed “encoding” is any reference or combination of a neural architecture and data augmentation information, such as the “optimal combination” that is disclosed in Kaur).
The motivation to combine Dai, Kashima, and Kaur is the same as discussed above with respect to claim 1.
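For orientation, a single encoding that holds an architecture and a data augmentation together might look like the following minimal sketch. It borrows the one-hot and min-max normalization scheme quoted from Dai §3.1, but the choice lists, the channel range, and the function names are assumptions made for illustration; none of them appear in Dai, Kashima, Kaur, or the application.

```python
# Hypothetical choice lists and ranges; not taken from any cited reference.
BLOCK_TYPES = ["conv", "mbconv", "identity"]
AUGMENTATIONS = ["none", "flip", "rotate", "cutout"]
CHANNEL_RANGE = (16, 256)


def one_hot(value, choices):
    """Encode a categorical value as a one-hot list over `choices`."""
    return [1.0 if value == choice else 0.0 for choice in choices]


def min_max(value, low, high):
    """Scale a continuous value into [0, 1] given its allowed range."""
    return (value - low) / (high - low)


def encode(block_type, channels, augmentation):
    """Concatenate architecture and augmentation features into one vector."""
    return (
        one_hot(block_type, BLOCK_TYPES)
        + [min_max(channels, *CHANNEL_RANGE)]
        + one_hot(augmentation, AUGMENTATIONS)
    )


vec = encode("mbconv", 136, "flip")
print(vec)  # [0.0, 1.0, 0.0, 0.5, 0.0, 1.0, 0.0, 0.0]
```

Each distinct (architecture, augmentation) pair yields a distinct vector, which is the property the claim language about "a distinct combination ... together in the respective encoding" appears to turn on.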

Regarding claims 2, 8, 14, and 26, the rejections of claims 1, 7, 13, 19, and 25 are incorporated and Dai further discloses wherein the neural architecture … are to be selected together based, at least in part, upon information to be inferenced by the one or more second neural networks (§3.4; the section discloses selecting a neural network architecture based on information that is inferenced by the second predictor or neural network; and Figure 3).
Dai fails to explicitly disclose but Kashima discloses wherein … the one or more data augmentations are to be selected based, at least in part, upon information to be inferenced by the one or more second neural networks (Page 2, Column 1; “we propose a joint optimization method for both data augmentation policies and network architectures … Data augmentation is a series of transformation applied on the input data. Typically, we have to choose which operations should be applied with what magnitudes. Several methods for automatic search of probability distribution on the selection of operators and their magnitudes have been proposed (Cubuk et al. 2019; Lim et al. 2019; Hataya et al. 2020). The Faster-AA considers that a policy consists of L sub-policies each of which has K consecutive operations”; and Abstract; and Page 2, Column 2; “During inference, the k-th operation is sampled from the categorical distribution Cat(σ_η(z_k)), so that we obtain transformed data X′ by Eq. (1).”).
The motivation to combine Dai and Kashima is the same as discussed above with respect to claim 1.

Regarding claims 3, 9, 15, and 27, the rejections of claims 1, 7, 13, 19, and 25 are incorporated and Dai discloses wherein the one or more first neural networks include a relational predictor network to predict which network configuration, for each of a plurality of pairs of candidate network configurations, will be more accurate for an inference to be generated by the one or more second neural networks (§3.1 and §3.4; and Figure 3).

Regarding claims 4, 10, 16, and 28, the rejections of claims 1, 7, 13, 19, and 25 are incorporated and Dai further discloses select a set of hyperparameters for the one or more second neural networks (Abstract; “To address this, we present Neural Architecture-Recipe Search (NARS) to search both (a) architectures and (b) their corresponding training recipes, simultaneously. NARS utilizes an accuracy predictor that scores architecture and training recipes jointly, guiding both sample selection and ranking”; and Table 4; and §3.2).

Regarding claims 5, 11, 17, and 29, the rejections of claims 1, 7, 13, 19, 25, 4, 10, 16, 22, and 28 are incorporated and Dai further discloses wherein the neural architecture, [[the one or more data augmentations]], and the set of hyperparameters are part of at least one network configuration, wherein the at least one network configuration is selected from a plurality of candidate configurations sampled from the configuration search space, wherein the candidate configurations in the configuration search space correspond to architectures having at least one of different numbers, types, connections, asymmetries, or spatial resolutions of network layers (Algorithm 1; and Page 4, § 3.3; “Constrained iterative optimization: We first use Quasi Monte-Carlo (QMC) [37] sampling to generate a sample pool of architecture-recipe pairs from the search space”; and Page 6, Table 2).
Dai fails to explicitly disclose but Kashima discloses the one or more data augmentations (Page 2, Column 1; “we propose a joint optimization method for both data augmentation policies and network architectures … Data augmentation is a series of transformation applied on the input data. Typically, we have to choose which operations should be applied with what magnitudes. Several methods for automatic search of probability distribution on the selection of operators and their magnitudes have been proposed (Cubuk et al. 2019; Lim et al. 2019; Hataya et al. 2020). The Faster-AA considers that a policy consists of L sub-policies each of which has K consecutive operations”; and Abstract).
The motivation to combine Dai and Kashima is the same as discussed above with respect to claim 1.

Regarding claims 6, 12, 18, and 30, the rejections of claims 1, 7, 13, 19, 25, 4, 10, 16, 22, 28, 5, 11, 17, 23, and 29 are incorporated and Dai further discloses wherein the one or more first neural networks comprise a relational predictor network, wherein the candidate configurations are encoded as vectors to be compared by the relational predictor network, wherein the candidate configurations are sorted based on results of the relational predictor network with respect to other candidate configurations using a binary classification between pairs of candidate configurations without predicting accuracy values for the candidate configurations (Algorithm 1; the algorithm discloses that the configurations are encoded as vectors for comparison by a relational network and sorting the configurations by rank).
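The limitation recited for claims 6, 12, 18, and 30 (sorting candidates through pairwise binary classification, with no accuracy value predicted for any single candidate) can be illustrated with a short sketch. It illustrates the claim language only and does not reproduce Dai's Algorithm 1. The comparator below is a hypothetical stand-in with made-up weights; in practice a trained relational network would take its place.

```python
from functools import cmp_to_key

# Hypothetical stand-in for a trained relational predictor. It only ever
# answers "is the first configuration better than the second?" and never
# returns an accuracy estimate for an individual configuration.
PAIR_WEIGHTS = [0.2, 0.9, -0.1, 0.5]


def first_is_better(config_a, config_b):
    """Binary classification over a pair of encoded configurations."""
    margin = sum(
        w * (a - b) for w, a, b in zip(PAIR_WEIGHTS, config_a, config_b)
    )
    return margin > 0


def compare(config_a, config_b):
    """Adapt the binary decision to the comparator protocol used by sorted()."""
    if config_a == config_b:
        return 0
    return -1 if first_is_better(config_a, config_b) else 1


# Candidate configurations encoded as vectors (illustrative values).
candidates = [
    [1.0, 0.0, 0.0, 0.2],
    [0.0, 1.0, 0.0, 0.8],
    [0.0, 0.0, 1.0, 0.5],
]

# Best candidate first, determined purely by pairwise comparisons.
ranked = sorted(candidates, key=cmp_to_key(compare))
print(ranked[0])  # [0.0, 1.0, 0.0, 0.8]
```

The linear stand-in keeps the pairwise decisions consistent with one another, which `sorted` relies on; a learned comparator need not have that property, so a real system might aggregate pairwise wins instead.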

Response to Arguments

Applicant’s arguments and amendments, filed on 2/17/2026, with respect to the 35 USC § 103 rejection of the pending claims have been fully considered but are moot because the arguments do not apply to the references used to reject the independent claims. Dai, Kashima, and Kaur are now being used to render the independent claims obvious under 35 USC § 103.

Applicant’s arguments and amendments, filed on 2/17/2026, with respect to the 35 USC § 101 abstract idea rejection of the pending claims have been fully considered but are not persuasive.

Beginning on page 9 of the Remarks, filed on 2/17/2026, Applicant argues that using a neural network to select a neural architecture and data augmentations is not something that can be performed in the human mind. Examiner respectfully disagrees.

The broad use of the neural network to do the claimed selecting is not part of the abstract ideas of the independent claims.  The use of a neural network to do the claimed selecting of an architecture and data augmentations is a mere instruction to apply per §2106.05(f) of the MPEP because the neural network is used as a tool to perform the abstract ideas of selecting.  This does not provide a technical improvement, integrate the abstract ideas into a practical application, or provide significantly more than the abstract ideas.  Further, with respect to Applicant’s argument that the human mind cannot perform the claimed selecting as recited in the independent claims, Applicant has failed to provide persuasive evidence from the claim language or originally filed specification that the claimed selecting of architectures and data augmentations is impossible to do in the human mind.  The claim language does not make it clear that the act of selecting is “combinatorially large” and nearly impossible to do in the human mind.  

Applicant last argues that “as discussed in paras. [0064] - [0066] of Applicant's specification, encoding different combinations of architecture and data augmentations together in respective vector encoding enables a light-weight comparison between such vectors that allows training and search to be performed much faster than conventional approaches. Thus, Applicant's claimed approach provide an improvement in computer technology supported by explicit results provided in the specific”. Examiner respectfully disagrees.

Applicant has failed to identify any additional elements in the claim language beyond the abstract ideas that integrate the abstract ideas of the claim into a practical application, reflect a technical improvement, or provide significantly more than the abstract ideas.

Accordingly, Applicant’s arguments are not persuasive, and the 35 USC § 101 rejection of the pending claims STANDS.

Conclusion
                                                                                                                                                                                    
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
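The reply periods described above reduce to calendar-month arithmetic from the mailing date. The sketch below assumes a mailing date of Mar 31, 2026, as listed in the prosecution timeline for the current action, and assumes that a period ending on a day the target month lacks falls on that month's last day. It ignores weekends, federal holidays, and the advisory-action adjustment, so the printed dates are illustrative and should be confirmed against the record. The `add_months` helper is written for this example.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Add calendar months, clamping to the last day of a shorter month."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


# Assumption: mailing date taken from the prosecution timeline entry.
mailing_date = date(2026, 3, 31)

two_month_mark = add_months(mailing_date, 2)     # first-reply window
shortened_period = add_months(mailing_date, 3)   # shortened statutory period
statutory_maximum = add_months(mailing_date, 6)  # outer limit for reply

print(two_month_mark)     # 2026-05-31
print(shortened_period)   # 2026-06-30
print(statutory_maximum)  # 2026-09-30
```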

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Brent Hoover whose telephone number is (303)297-4403. The examiner can normally be reached Monday - Friday 9-5 MST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar, can be reached at 571-270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/BRENT JOHNSTON HOOVER/Primary Examiner, Art Unit 2127
Prosecution Timeline

Sep 06, 2024: Non-Final Rejection mailed — §101, §103
Mar 06, 2025: Response Filed
Jun 10, 2025: Final Rejection mailed — §101, §103
Oct 10, 2025: Request for Continued Examination
Oct 16, 2025: Response after Non-Final Action
Nov 18, 2025: Non-Final Rejection mailed — §101, §103
Feb 17, 2026: Response Filed
Mar 31, 2026: Final Rejection mailed — §101, §103 (current)
Precedent Cases

Applications granted by this same examiner with similar technology

17/962,729
Patent 12639637
SYSTEM AND METHOD OF TRAINING MACHINE-LEARNING-BASED MODEL
3y 7m to grant Granted May 26, 2026
16/949,359
Patent 12632772
METHOD, APPARATUS AND COMPUTER PROGRAM PRODUCT FOR IMPROVING INTERPRETABILITY OF SOFTWARE BLACK-BOX MACHINE LEARNING MODEL OUTPUTS
5y 6m to grant Granted May 19, 2026
18/659,042
Patent 12632732
METHOD AND APPARATUS FOR MULTI-LABEL CLASS CLASSIFICATION BASED ON COARSE-TO-FINE CONVOLUTIONAL NEURAL NETWORK
2y 0m to grant Granted May 19, 2026
17/090,071
Patent 12626135
DYNAMICALLY DIVIDING ACTIVATIONS AND KERNELS FOR IMPROVING MEMORY EFFICIENCY
5y 6m to grant Granted May 12, 2026
17/492,460
Patent 12626125
SYNTHESIZING A SINGULAR ENSEMBLE MACHINE LEARNING MODEL FROM AN ENSEMBLE OF MODELS
4y 7m to grant Granted May 12, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 83%
With Interview (+23.4%): 99%
Median Time to Grant: 3y 5m (~0m remaining)
PTA Risk: High
Based on 363 resolved cases by this examiner. Grant probability derived from career allowance rate.