DETAILED ACTION
This action is responsive to the amendment filed on 12/19/2025. Claims 1-21 and 29-32 are pending in the case. Claims 1-6, 8-15, 17-18, and 29-32 are currently amended. Claims 1, 8, 15, and 29 are independent claims.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 11/06/2025 is being considered by the examiner.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-21 and 29-32 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Regarding claim 1, the claim recites “validate ones of the second subgroup of architecture configurations” on line 14 and “the ones of the second subgroup” on lines 15-16. It is unclear what applicant intends “ones” to refer to. For examination purposes, the limitations have been interpreted to mean “validate a subset of architecture configurations of the second subgroup of architecture configurations” and “the subset of architecture configurations of the second subgroup”.
Claims 2-7 are rejected as being dependent upon a rejected base claim without curing any of the deficiencies.
Regarding claim 8, the claim recites “validate ones of the second subgroup of architecture configurations” on line 10 and “the ones of the second subgroup” on lines 11-12. It is unclear what applicant intends “ones” to refer to. For examination purposes, the limitations have been interpreted to mean “validate a subset of architecture configurations of the second subgroup of architecture configurations” and “the subset of architecture configurations of the second subgroup”.
Claims 9-14 are rejected as being dependent upon a rejected base claim without curing any of the deficiencies.
Regarding claim 15, the claim recites “validate ones of the second subgroup of architecture configurations” on lines 28-29 and “the ones of the second subgroup” on line 31. It is unclear what applicant intends “ones” to refer to. For examination purposes, the limitations have been interpreted to mean “validate a subset of architecture configurations of the second subgroup of architecture configurations” and “the subset of architecture configurations of the second subgroup”.
Claims 16-21 are rejected as being dependent upon a rejected base claim without curing any of the deficiencies.
Regarding claim 29, the claim recites “validate ones of the second subgroup of architecture configurations” on lines 12-13 and “the ones of the second subgroup” on line 15. It is unclear what applicant intends “ones” to refer to. For examination purposes, the limitations have been interpreted to mean “validate a subset of architecture configurations of the second subgroup of architecture configurations” and “the subset of architecture configurations of the second subgroup”.
Claims 30-32 are rejected as being dependent upon a rejected base claim without curing any of the deficiencies.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-21 and 29-32 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Regarding claim 1:
Step 1 Statutory Category: Claim 1 is directed to an apparatus, which falls under one of the four statutory categories.
Step 2A Prong 1 Judicial exception: Claim 1 recites, in part, “generate a first plurality of candidate architecture configurations” and “generate a second plurality of candidate architecture configurations”. Under the broadest reasonable interpretation, these limitations cover the recitation of the abstract idea of a mathematical calculation: “a claim that recites a mathematical calculation, when the claim is given its broadest reasonable interpretation in light of the specification, will be considered as falling within the ‘mathematical concepts’ grouping. A mathematical calculation is a mathematical operation (such as multiplication) or an act of calculating using mathematical methods to determine a variable or number”. See MPEP §2106.04(a)(2)(I)(C). Further, the claim recites: “generate a second subgroup of architecture configurations by selecting a number of the first plurality of candidate architecture configurations”. Under the broadest reasonable interpretation, this limitation covers the recitation of the abstract idea of a mental process that can practically be performed in the human mind, with or without the use of a physical aid such as pen and paper (including an observation, evaluation, judgment, or opinion), in this case judgment. See MPEP §2106.04(a)(2)(III). Further, the claim recites: “validate ones of the second subgroup of architecture configurations”. Under the broadest reasonable interpretation, this limitation covers the recitation of a mathematical concept. See MPEP §2106.04(a)(2)(I). Further, the claim recites: “select an architecture configuration from the second plurality of candidate architecture configurations for implementation in an artificial intelligence model”. Under the broadest reasonable interpretation, this limitation covers the recitation of the abstract idea of a mental process that can practically be performed in the human mind, with or without the use of a physical aid such as pen and paper (including an observation, evaluation, judgment, or opinion), in this case judgment. See MPEP §2106.04(a)(2)(III).
Step 2A Prong 2 Integration into a practical application: This judicial exception is not integrated into a practical application. In particular, the claim recites: “an apparatus”, “an interface to access a first subgroup of neural network architecture configurations from a plurality of neural network architecture configurations”, “machine-readable instructions”, and “at least one processor circuit to be programmed by the machine-readable instructions”. These limitations are additional elements that amount to adding the words “apply it” (or an equivalent) with the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer in its ordinary capacity as a tool to perform an existing process. See MPEP §2106.05(f). Further, the claim recites: “a neural architecture search”, “by executing the trained predictor model to search the plurality of neural network architecture configurations”, and “by executing the re-trained predictor model to search the plurality of neural network architecture configurations”. These limitations are additional elements that generally link the use of the judicial exception to a particular technological environment or field of use. See MPEP §2106.05(h). Further, the claim recites: “train a predictor model based on the first subgroup” and “re-train the predictor model based on the first subgroup and the ones of the second subgroup”. These limitations are additional elements that amount to adding the words “apply it” (or an equivalent) with the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer in its ordinary capacity as a tool to perform an existing process. See MPEP §2106.05(f).
Step 2B Significantly more: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements “an apparatus”, “an interface to access a first subgroup of neural network architecture configurations from a plurality of neural network architecture configurations”, “machine-readable instructions”, “at least one processor circuit to be programmed by the machine-readable instructions”, “train a predictor model based on the first subgroup”, and “re-train the predictor model based on the first subgroup and the ones of the second subgroup” amount to adding the words “apply it” (or an equivalent) with the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer in its ordinary capacity as a tool to perform an existing process. See MPEP §2106.05(f). Such elements cannot provide an inventive concept. Further, the additional elements “a neural architecture search”, “by executing the trained predictor model to search the plurality of neural network architecture configurations”, and “by executing the re-trained predictor model to search the plurality of neural network architecture configurations” generally link the use of the judicial exception to a particular technological environment or field of use. See MPEP §2106.05(h). Such elements cannot provide an inventive concept. The claim is not patent eligible.
Regarding claim 2, the rejection of claim 1 is incorporated, and further, the claim recites: “measure a first objective of the first subgroup and a second objective of the first subgroup”. This limitation recites mathematical concepts in addition to those identified in the rejection of the parent claim, and thus recites a judicial exception.
Further, the claim recites “the at least one processor circuit” and “train the predictor model based on the first and second objectives”. These are additional elements that amount to adding the words “apply it” (or an equivalent) with the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer in its ordinary capacity as a tool to perform an existing process. See MPEP §2106.05(f). Such elements cannot provide an inventive concept. The claim is not patent eligible.
Regarding claim 3, the rejection of claim 1 is incorporated, and further, the claim recites: “train the predictor model using first predictors”. This limitation is an additional element that generally links the use of the judicial exception to a particular technological environment or field of use. See MPEP §2106.05(h). Such elements cannot provide an inventive concept. Further, the claim recites “the at least one processor circuit”. This is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer in its ordinary capacity as a tool to perform an existing process. See MPEP §2106.05(f). Such elements cannot provide an inventive concept. The claim is not patent eligible.
Regarding claim 4, the rejection of claim 3 is incorporated, and further, the claim recites: “select the predictor model based on an error of the predictor model”. This limitation recites a mental process in addition to those identified in the rejection of the parent claim, and thus recites a judicial exception.
Further, the claim recites “the at least one processor circuit”. This is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer in its ordinary capacity as a tool to perform an existing process. See MPEP §2106.05(f). Such elements cannot provide an inventive concept. The claim is not patent eligible.
Regarding claim 5, the rejection of claim 1 is incorporated, and further, the claim recites: “generate the first plurality of candidate architecture configurations using an evolutionary protocol”. This limitation is a continuation of the “generate a first plurality of candidate architecture configurations” limitation identified as an abstract idea in the rejection of the parent claim, and thus recites a judicial exception.
Further, the claim recites “the at least one processor circuit”. This is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer in its ordinary capacity as a tool to perform an existing process. See MPEP §2106.05(f). Such elements cannot provide an inventive concept. The claim is not patent eligible.
Regarding claim 6, the rejection of claim 1 is incorporated, and further, the claim recites: “generate the first plurality of candidate architecture configurations during a first iteration”. This limitation is a continuation of the “generate a first plurality of candidate architecture configurations” limitation identified as an abstract idea in the rejection of the parent claim. Further, the claim recites: “generate the second plurality of candidate architecture configurations during a second iteration”. This limitation is a continuation of the “generate a second plurality of candidate architecture configurations” limitation identified as an abstract idea in the rejection of the parent claim. Finally, the claim recites: “stop performing iterations based on a hypervolume metric corresponding to generated architecture configurations corresponding to the second iteration”. Under the broadest reasonable interpretation, this limitation recites a mental process in addition to those identified in the rejection of the parent claim. Thus, the claim recites a judicial exception.
Further, the claim recites “the at least one processor circuit”. This is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer in its ordinary capacity as a tool to perform an existing process. See MPEP §2106.05(f). Such elements cannot provide an inventive concept. The claim is not patent eligible.
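For context regarding the “hypervolume metric” recited in claim 6, the following is a minimal illustrative sketch of a two-dimensional hypervolume computation (the area dominated by a maximization Pareto front relative to a reference point). The function name, reference point, and plateau-based stopping test are assumptions for illustration only; they are not drawn from the claims or the cited art.

```python
# A minimal sketch: 2-D hypervolume of a mutually non-dominated set of
# (obj1, obj2) points, both objectives maximized, relative to a reference
# point that every point dominates.
def hypervolume_2d(front, ref=(0.0, 0.0)):
    area, prev_x = 0.0, ref[0]
    for x, y in sorted(front):        # ascending in obj1; obj2 then descends
        area += (x - prev_x) * (y - ref[1])
        prev_x = x
    return area

# A search loop would typically stop once the metric plateaus, e.g.:
# if hypervolume_2d(front_now) - hypervolume_2d(front_prev) < tolerance: stop.
print(hypervolume_2d([(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]))  # prints 6.0
```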
Regarding claim 7, the rejection of claim 1 is incorporated, and further, the claim recites: “wherein the first subgroup of architecture configurations includes less than fifty one architecture configurations”. This limitation is an additional element that generally links the use of the judicial exception to a particular technological environment or field of use, in this case weakly trained predictors. See MPEP §2106.05(h). Such elements cannot provide an inventive concept. The claim is not patent eligible.
Regarding claim 8:
Step 1 Statutory Category: Claim 8 is directed to an article of manufacture, which falls under one of the four statutory categories.
Step 2A Prong 1 Judicial exception: Claim 8 recites, in part, “generate a first plurality of candidate architecture configurations” and “generate a second plurality of candidate architecture configurations”. Under the broadest reasonable interpretation, these limitations cover the recitation of the abstract idea of a mathematical calculation: “a claim that recites a mathematical calculation, when the claim is given its broadest reasonable interpretation in light of the specification, will be considered as falling within the ‘mathematical concepts’ grouping. A mathematical calculation is a mathematical operation (such as multiplication) or an act of calculating using mathematical methods to determine a variable or number”. See MPEP §2106.04(a)(2)(I)(C). Further, the claim recites: “generate a second subgroup of architecture configurations by selecting a number of the first plurality of candidate architecture configurations”. Under the broadest reasonable interpretation, this limitation covers the recitation of the abstract idea of a mental process that can practically be performed in the human mind, with or without the use of a physical aid such as pen and paper (including an observation, evaluation, judgment, or opinion), in this case judgment. See MPEP §2106.04(a)(2)(III). Further, the claim recites: “validate ones of the second subgroup of architecture configurations”. Under the broadest reasonable interpretation, this limitation covers the recitation of a mathematical concept. See MPEP §2106.04(a)(2)(I). Further, the claim recites: “select an architecture configuration from the second plurality of candidate architecture configurations for implementation in an artificial intelligence model”. Under the broadest reasonable interpretation, this limitation covers the recitation of the abstract idea of a mental process that can practically be performed in the human mind, with or without the use of a physical aid such as pen and paper (including an observation, evaluation, judgment, or opinion), in this case judgment. See MPEP §2106.04(a)(2)(III).
Step 2A Prong 2 Integration into a practical application: This judicial exception is not integrated into a practical application. In particular, the claim recites: “A non-transitory computer readable medium comprising instructions which, when executed, cause one or more processors to…”. This limitation is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer in its ordinary capacity as a tool to perform an existing process. See MPEP §2106.05(f). Further, the claim recites: “train a predictor model using a first subgroup of architecture configurations” and “re-train the predictor model based on the first subgroup and the ones of the second subgroup”. These limitations are additional elements that amount to adding the words “apply it” (or an equivalent) with the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer in its ordinary capacity as a tool to perform an existing process. See MPEP §2106.05(f). Finally, the claim recites: “by executing the trained predictor model to search a plurality of neural network architecture configurations” and “by executing the re-trained predictor model to search the plurality of neural network architecture configurations”. These limitations are additional elements that generally link the use of the judicial exception to a particular technological environment or field of use. See MPEP §2106.05(h).
Step 2B Significantly more: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements “A non-transitory computer readable medium comprising instructions which, when executed, cause one or more processors to…”, “train a predictor model using a first subgroup of architecture configurations”, and “re-train the predictor model based on the first subgroup and the ones of the second subgroup” amount to adding the words “apply it” (or an equivalent) with the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer in its ordinary capacity as a tool to perform an existing process. Such elements cannot provide an inventive concept. Further, the additional elements “by executing the trained predictor model to search a plurality of neural network architecture configurations” and “by executing the re-trained predictor model to search the plurality of neural network architecture configurations” generally link the use of the judicial exception to a particular technological environment or field of use. Such elements cannot provide an inventive concept. The claim is not patent eligible.
Regarding claim 9, the rejection of claim 8 is incorporated, and further, claim 9 is substantially similar to claim 2 and is rejected in the same manner, with the same reasoning applying.
Regarding claim 10, the rejection of claim 8 is incorporated, and further, claim 10 is substantially similar to claim 3 and is rejected in the same manner, with the same reasoning applying.
Regarding claim 11, the rejection of claim 10 is incorporated, and further, claim 11 is substantially similar to claim 4 and is rejected in the same manner, with the same reasoning applying.
Regarding claim 12, the rejection of claim 8 is incorporated, and further, claim 12 is substantially similar to claim 5 and is rejected in the same manner, with the same reasoning applying.
Regarding claim 13, the rejection of claim 8 is incorporated, and further, claim 13 is substantially similar to claim 6 and is rejected in the same manner, with the same reasoning applying.
Regarding claim 14, the rejection of claim 8 is incorporated, and further, claim 14 is substantially similar to claim 7 and is rejected in the same manner, with the same reasoning applying.
Regarding claim 15:
Step 1 Statutory Category: Claim 15 is directed to an apparatus, which falls under one of the four statutory categories.
Step 2A Prong 1 Judicial exception: Claim 15 recites, in part, “generate a first plurality of candidate architecture configurations” and “generate a second plurality of candidate architecture configurations”. Under the broadest reasonable interpretation, these limitations cover the recitation of the abstract idea of a mathematical calculation: “a claim that recites a mathematical calculation, when the claim is given its broadest reasonable interpretation in light of the specification, will be considered as falling within the ‘mathematical concepts’ grouping. A mathematical calculation is a mathematical operation (such as multiplication) or an act of calculating using mathematical methods to determine a variable or number”. See MPEP §2106.04(a)(2)(I)(C). Further, the claim recites: “generate a second subgroup of architecture configurations by selecting a number of the first plurality of candidate architecture configurations”. Under the broadest reasonable interpretation, this limitation covers the recitation of the abstract idea of a mental process that can practically be performed in the human mind, with or without the use of a physical aid such as pen and paper (including an observation, evaluation, judgment, or opinion), in this case judgment. See MPEP §2106.04(a)(2)(III). Further, the claim recites: “validate ones of the second subgroup of architecture configurations”. Under the broadest reasonable interpretation, this limitation covers the recitation of a mathematical concept. See MPEP §2106.04(a)(2)(I). Further, the claim recites: “select an architecture configuration from the second plurality of candidate architecture configurations for implementation in an artificial intelligence model”. Under the broadest reasonable interpretation, this limitation covers the recitation of the abstract idea of a mental process that can practically be performed in the human mind, with or without the use of a physical aid such as pen and paper (including an observation, evaluation, judgment, or opinion), in this case judgment. See MPEP §2106.04(a)(2)(III).
Step 2A Prong 2 Integration into a practical application: This judicial exception is not integrated into a practical application. In particular, the claim recites: “an apparatus”, “interface circuitry to access a first subgroup of neural network architecture configurations from a plurality of neural network architecture configurations”, “processor circuitry including one or more of: at least one of a central processing unit, a graphics processing unit or a digital signal processor, the at least one of the central processing unit, the graphics processing unit or the digital signal processor having control circuitry, one or more registers, and arithmetic and logic circuitry to perform one or more first operations corresponding to instructions in the apparatus, and; a Field Programmable Gate Array (FPGA), the FPGA including logic gate circuitry, a plurality of configurable interconnections, and storage circuitry, the logic gate circuitry and interconnections to perform one or more second operations; or Application Specific Integrated Circuitry (ASIC) including logic gate circuitry to perform one or more third operations; the processor circuitry to perform at least one of the first operations, the second operations or the third operations to instantiate”, “predictor training circuitry”, “evolutionary protocol circuitry”, and “architecture selection circuitry”. These limitations are additional elements that amount to adding the words “apply it” (or an equivalent) with the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer in its ordinary capacity as a tool to perform an existing process. See MPEP §2106.05(f). Further, the claim recites: “train a predictor model using a first subgroup of architecture configurations” and “re-train the predictor model based on the first subgroup and the ones of the second subgroup”. These limitations are additional elements that amount to adding the words “apply it” (or an equivalent) with the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer in its ordinary capacity as a tool to perform an existing process. See MPEP §2106.05(f). Finally, the claim recites: “a neural architecture search”, “by executing the trained predictor model to search the plurality of neural network architecture configurations”, and “by executing the re-trained predictor model to search the plurality of neural network architecture configurations”. These limitations are additional elements that generally link the use of the judicial exception to a particular technological environment or field of use. See MPEP §2106.05(h).
Step 2B Significantly more: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements “an apparatus”, “interface circuitry to access a first subgroup of neural network architecture configurations from a plurality of neural network architecture configurations”, the processor circuitry limitation quoted above, “predictor training circuitry”, “evolutionary protocol circuitry”, “architecture selection circuitry”, “train a predictor model using a first subgroup of architecture configurations”, and “re-train the predictor model based on the first subgroup and the ones of the second subgroup” amount to adding the words “apply it” (or an equivalent) with the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer in its ordinary capacity as a tool to perform an existing process. Such elements cannot provide an inventive concept. Further, the additional elements “a neural architecture search”, “by executing the trained predictor model to search the plurality of neural network architecture configurations”, and “by executing the re-trained predictor model to search the plurality of neural network architecture configurations” generally link the use of the judicial exception to a particular technological environment or field of use. Such elements cannot provide an inventive concept. The claim is not patent eligible.
Regarding claim 16, the rejection of claim 15 is incorporated, and further, the claim recites: “measure a first objective of the first subgroup and a second objective of the first subgroup”. This limitation recites mathematical concepts in addition to those identified in the rejection of the parent claim, and thus recites a judicial exception.
Further, the claim recites: “validation circuitry”, “the predictor training circuitry”, and “train the first predictors based on the first and second objectives”. These are additional elements that amount to adding the words “apply it” (or an equivalent) with the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer in its ordinary capacity as a tool to perform an existing process. See MPEP §2106.05(f). Such elements cannot provide an inventive concept. The claim is not patent eligible.
Regarding claim 17, the rejection of claim 15 is incorporated, and further, claim 17 is substantially similar to claims 3 and 10 and is rejected in the same manner, with the same reasoning applying.
Regarding claim 18, the rejection of claim 17 is incorporated, and further, claim 18 is substantially similar to claims 4 and 11 and is rejected in the same manner, with the same reasoning applying.
Regarding claim 19, the rejection of claim 15 is incorporated, and further, claim 19 is substantially similar to claims 5 and 12 and is rejected in the same manner, with the same reasoning applying.
Regarding claim 20, the rejection of claim 15 is incorporated, and further, claim 20 is substantially similar to claims 6 and 13 and is rejected in the same manner, with the same reasoning applying.
Regarding claim 21, the rejection of claim 15 is incorporated, and further, claim 21 is substantially similar to claims 7 and 14 and is rejected in the same manner, with the same reasoning applying.
Regarding claim 29:
Step 1 Statutory Category: Claim 29 is directed to a method, which falls under one of the four statutory categories.
Step 2A Prong 1 Judicial exception: Claim 29 recites, in part, “generating, …, a first plurality of candidate architecture configurations” and “generating, …, a second plurality of candidate architecture configurations”. Under the broadest reasonable interpretation, these limitations cover the recitation of the abstract idea of a mathematical calculation: “a claim that recites a mathematical calculation, when the claim is given its broadest reasonable interpretation in light of the specification, will be considered as falling within the ‘mathematical concepts’ grouping. A mathematical calculation is a mathematical operation (such as multiplication) or an act of calculating using mathematical methods to determine a variable or number”. See MPEP §2106.04(a)(2)(I)(C). Further, the claim recites: “generating, …, a second subgroup of architecture configurations by selecting a number of the first plurality of candidate architecture configurations”. Under the broadest reasonable interpretation, this limitation covers the recitation of the abstract idea of a mental process that can practically be performed in the human mind, with or without the use of a physical aid such as pen and paper (including an observation, evaluation, judgment, or opinion), in this case judgment. See MPEP §2106.04(a)(2)(III). Further, the claim recites: “validating, …, ones of the second subgroup of architecture configurations”. Under the broadest reasonable interpretation, this limitation covers the recitation of a mathematical concept. See MPEP §2106.04(a)(2)(I). Further, the claim recites: “selecting, …, an architecture configuration from the second plurality of candidate architecture configurations for implementation in an artificial intelligence model”. Under the broadest reasonable interpretation, this limitation covers the recitation of the abstract idea of a mental process that can practically be performed in the human mind, with or without the use of a physical aid such as pen and paper (including an observation, evaluation, judgment, or opinion), in this case judgment. See MPEP §2106.04(a)(2)(III).
Step 2A Prong 2 Integration into a practical application: This judicial exception is not integrated into a practical application. In particular, the claim recites: “a neural architecture search”. This limitation is an additional element that generally links the use of the judicial exception to a particular technological environment or field of use. See MPEP §2106.05(h). Further, the claim recites: “training, …, a predictor model based on a first subgroup of architecture configurations” and “re-training, …, the predictor model using the first subgroup and the ones of the second subgroup”. These limitations are additional elements that amount to adding the words “apply it” (or an equivalent) with the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer in its ordinary capacity as a tool to perform an existing process. See MPEP §2106.05(f). Further, the claim recites: “by executing an instruction with one or more processors”. This limitation is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer in its ordinary capacity as a tool to perform an existing process. See MPEP §2106.05(f). Finally, the claim recites: “by executing the trained predictor model to search a plurality of neural network architecture configurations” and “by executing the re-trained predictor model to search the plurality of neural network architecture configurations”. These limitations are additional elements that generally link the use of the judicial exception to a particular technological environment or field of use. See MPEP §2106.05(h).
Step 2B Significantly more: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements “a neural architecture search”, “by executing the trained predictor model to search a plurality of neural network architecture configurations”, and “by executing the re-trained predictor model to search the plurality of neural network architecture configurations” generally link the use of the judicial exception to a particular technological environment or field of use. Such elements cannot provide an inventive concept. Further, the additional elements “training, …, a predictor model based on a first subgroup of architecture configurations”, “re-training, …, the predictor model using the first subgroup and the ones of the second subgroup”, and “by executing an instruction with one or more processors” amount to adding the words “apply it” (or an equivalent) with the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer in its ordinary capacity as a tool to perform an existing process. Such elements cannot provide an inventive concept. The claim is not patent eligible.
Regarding claim 30, the rejection of claim 29 is incorporated, and further, claim 30 is substantially similar to claims 2 and 9 and is rejected in the same manner, with the same reasoning applying.
Regarding claim 31, the rejection of claim 29 is incorporated, and further, claim 31 is substantially similar to claims 3, 10, and 17 and is rejected in the same manner, with the same reasoning applying.
Regarding claim 32, the rejection of claim 31 is incorporated, and further, claim 32 is substantially similar to claims 4, 11, and 18 and is rejected in the same manner, with the same reasoning applying.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 8-12 and 29-31 are rejected under 35 U.S.C. 103 as being unpatentable over Lu et al., NSGANetV2: Evolutionary Multi-Objective Surrogate-Assisted Neural Architecture Search, 07/20/2020, https://arxiv.org/abs/2007.10396, hereinafter referred to as “Lu” in view of Letunovskiy et al., U.S. Patent Application Publication No. 20240169213, hereinafter referred to as “Letunovskiy”.
Regarding claim 8, Lu teaches A non-transitory computer readable medium comprising instructions which, when executed, cause one or more processors (Lu, Pages 9-10, Section 4.2, Paragraph 2, Lines 1-7, “We then compare the search efficiency of MSuNAS to NSGANet [22] and random search under a bi-objective setup: Top-1 accuracy and #MAdds. To perform the comparison, we run MSuNAS for 30 iterations, leading to 350 architectures evaluated in total. We record the cumulative hypervolume [42] achieved against the number of architectures evaluated. We repeat this process five times on both ImageNet and CIFAR-10 datasets to capture the variance in performance due to randomness in the search initialization”; A person of ordinary skill in the art would recognize that this process would require a computer, thus providing evidence for “a non-transitory computer readable medium”, “instructions”, and “one or more processors”) to at least:
train a predictor model using a first subgroup of architecture configurations (Lu, Page 6, Section 3.2, Paragraph 2, Lines 12-14, “we start with an accuracy predictor constructed from only a limited number of architectures sampled randomly from the search space”; “construct[ing]” the “accuracy predictor” is considered to be “train[ing] a predictor model”);
generate a first plurality of candidate architecture configurations by executing the trained predictor model to search a plurality of neural network architecture configurations (Lu, Page 7, Figure 3 Description, Lines 3-5, “New candidate architectures (brown boxes in (b)) are obtained by solving the auxiliary single-level multi-objective problem F̃ = {Sf, C} (line 10 in Algo 1)”; Lu, Page 8, Paragraph 2, Lines 1-6, “we apply the NSGA-II algorithm to simultaneously optimize for both accuracy (predicted) and other objectives of interest to the user (line 10 in Algorithm 1) ... At the conclusion of the NSGA-II search, a set of non-dominated architectures is output, see Fig. 3(b)”; The “NSGA-II algorithm” uses the predictor model, and thus the “first plurality of candidate architecture configurations” is generated “by executing the trained predictor model to search a plurality of neural network architecture configurations”);
generate a second subgroup of architecture configurations by selecting a number of the first plurality of candidate architecture configurations (Lu, Page 7, Figure 3 Description, Lines 5-6, “A subset of the candidate architectures is chosen to diversify the Pareto front (c) - (d)”; See also: Lu, Algorithm 1, Line 11; Lu, Page 8, Paragraph 2, Lines 7-10, “To select a subset, we first select the architecture with highest predicted accuracy. Then we project all other architecture candidates to the #MAdds axis, and pick the remaining architectures from the sparse regions that help in extending the Pareto frontier to diverse #MAdds regimes, see Fig. 3(c) - (d)”);
…
re-train the predictor model based on the first subgroup and … the second subgroup (Lu, Page 7, Figure 3 Description, Lines 6-7, “The selected candidate architectures are then evaluated and added to the archive (e)”; Lu, Page 7, Figure 3 Description, Lines 1-3, “In each iteration, accuracy prediction surrogate models Sf are constructed from an archive of previously evaluated architectures (a)”; constructing the “surrogate models” is considered to be “re-train[ing] the predictor model”; Lu, Page 8, Paragraph 2, Lines 12-13, “We finally add these architectures to the training samples to refine our accuracy predictor models and proceed to next iteration, see Fig. 3(e)”; Lu uses “surrogate model” and “accuracy predictor” interchangeably, so the two are considered equivalent); and
generate a second plurality of candidate architecture configurations by executing the re-trained predictor model to search the plurality of neural network architecture configurations (Lu, Page 7, Figure 3 Description, Lines 3-5, “New candidate architectures (brown boxes in (b)) are obtained by solving the auxiliary single-level multi-objective problem F̃ = {Sf, C} (line 10 in Algo 1)”; Lu, Page 8, Paragraph 2, Lines 1-6, “we apply the NSGA-II algorithm to simultaneously optimize for both accuracy (predicted) and other objectives of interest to the user (line 10 in Algorithm 1) ... At the conclusion of the NSGA-II search, a set of non-dominated architectures is output, see Fig. 3(b)”; This operation is performed at each iteration; the “second plurality of candidate architecture configurations” is considered to be the configurations generated during the second iteration); and
select an architecture configuration from the second plurality of candidate architecture configurations for implementation in an artificial intelligence model (Lu, Page 12, Figure 6 Description, Lines 4-6, “Our models are obtained by directly searching on the respective datasets. In most problems, MSuNAS finds more accurate solutions with fewer parameters”; see also Lu, Page 12, Figure 6; In order to generate accuracy results, a model must have been selected for implementation; a person of ordinary skill in the art would recognize that “our models” are models generated from the selected architectures).
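For context, the search loop of Lu mapped in the preceding limitations (train an accuracy predictor on a small random sample, search the space with the predictor, select a diverse subset of candidates, evaluate them, and re-train) can be summarized in the following minimal, runnable sketch. The architecture encoding and the evaluate(), madds(), train_predictor(), search(), and diverse_subset() stand-ins are illustrative assumptions, not Lu's implementation.

```python
import random

def evaluate(arch):
    # Stand-in for Lu's costly lower-level step (SGD training + validation).
    return sum(arch) / (10.0 * len(arch))

def madds(arch):
    # Stand-in for the second objective (#MAdds / model complexity).
    return float(sum(a * a for a in arch))

def train_predictor(archive):
    # Stand-in accuracy predictor: 1-nearest-neighbour over evaluated archs.
    def predict(arch):
        dist = lambda e: sum((a - b) ** 2 for a, b in zip(e[0], arch))
        return min(archive, key=dist)[1]
    return predict

def search(predictor, space, n=50):
    # Stand-in for the NSGA-II search of the space using the predictor.
    return sorted(random.sample(space, n), key=predictor, reverse=True)

def diverse_subset(candidates, k=4):
    # Echo of Lu's subset step: keep the best-predicted candidate, then
    # spread the remaining picks across the complexity (#MAdds) axis.
    rest = sorted(candidates[1:], key=madds)
    step = max(1, len(rest) // (k - 1))
    return [candidates[0]] + rest[::step][:k - 1]

def surrogate_assisted_nas(space, iterations=3, init=16):
    archive = [(a, evaluate(a)) for a in random.sample(space, init)]  # "first subgroup"
    for _ in range(iterations):
        predictor = train_predictor(archive)           # train / re-train the predictor
        candidates = search(predictor, space)          # "plurality of candidates"
        subset = diverse_subset(candidates)            # "second subgroup"
        archive += [(a, evaluate(a)) for a in subset]  # evaluate; add to archive
    return max(archive, key=lambda e: e[1])[0]         # select a configuration

space = [tuple(random.randint(1, 8) for _ in range(4)) for _ in range(500)]
print(surrogate_assisted_nas(space))
```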
Lu does not explicitly teach validate ones of the second subgroup of architecture configurations nor that the retraining is performed using the ones of the second subgroup.
Letunovskiy teaches validate ones of the second subgroup of architecture configurations and that the retraining is performed using the ones of the second subgroup (Letunovskiy, Paragraph 0177, Lines 8-15, “In step 320, architectures of the first set are filtered to obtain best architectures. The filtering may be performed according to the accuracy and latency predicted by the surrogate model E. In particular, the filtering may consist of discarding from the first set architectures which do not satisfy a validation accuracy threshold a and a latency threshold 1, thereby obtaining the second set of N.sub.1 filtered architectures”; Because the architectures that do not satisfy a validation accuracy are removed, only the remaining architectures are used in the remainder of the method, including retraining).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the neural architecture search method of Lu to include validating selected architectures, as taught by Letunovskiy. The motivation to do so would have been to meet the hardware requirements of the device on which the model will be implemented, as well as to reduce computation time and computational resources by removing poor-performing architectures (Letunovskiy, Paragraph 0177).
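An illustrative sketch of the threshold filtering described in Letunovskiy's paragraph 0177 follows: architectures whose surrogate-predicted accuracy or latency fails a threshold are discarded, so only the survivors feed re-training. The threshold values and the predict() interface are assumptions for illustration only.

```python
# A minimal sketch of accuracy/latency threshold filtering; thresholds and
# the predict() interface are illustrative assumptions.
def filter_candidates(candidates, predict, acc_min=0.75, lat_max=20.0):
    kept = []
    for arch in candidates:
        accuracy, latency = predict(arch)     # surrogate-predicted objectives
        if accuracy >= acc_min and latency <= lat_max:
            kept.append(arch)                 # survivors are used for re-training
    return kept

# Toy use: predict() returns (accuracy, latency) for an encoded architecture.
demo = filter_candidates([(1, 2), (4, 4), (8, 8)],
                         predict=lambda a: (sum(a) / 10.0, float(max(a))))
print(demo)   # (1, 2) is dropped: predicted accuracy 0.3 < 0.75
```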
Regarding claim 9, the rejection of claim 8 is incorporated, and further, the proposed combination teaches measure a first objective of the first subgroup and a second objective of the first subgroup (Lu, Page 8, Paragraph 2, Lines 1-4, “With the accuracy predictor selected by AS, we apply the NSGA-II algorithm to simultaneously optimize for both accuracy (predicted) and other objectives of interest to the user (line 10 in Algorithm 1). For the purpose of illustration, we assume that the user is interested in optimizing #MAdds as the second objective”); and
train the first predictors based on the first and second objectives (Lu, Page 8, Paragraph 2, Lines 5-6, “At the conclusion of the NSGA-II search, a set of non-dominated architectures is output, see Fig. 3(b)”; Lu, Page 8, Paragraph 2, Lines 11-13, “The architectures from the chosen subset are then sent to the lower level for SGD training. We finally add these architectures to the training samples to refine our accuracy predictor models and proceed to next iteration, see Fig. 3(e)”; The predictor models are trained using the selected architectures, and the architectures are selected to optimize the objectives; thus the models are trained “based on” the first and second objectives).
Regarding claim 10, the rejection of claim 8 is incorporated, and further, the proposed combination teaches train the predictor model using first predictors (Lu, Page 8, Paragraph 1, “We first collected four different surrogate models for accuracy prediction from the literature, namely, Multi Layer Perceptron (MLP) [19], Classification And Regression Trees (CART) [34], Radial Basis Function (RBF) [1] and Gaussian Process (GP) [10]. From our ablation study, we observed that no one surrogate model is consistently better than others in terms of the above two criteria on all datasets (see section 4.1). Hence, we propose a selection mechanism, dubbed Adaptive Switching (AS), which constructs all four types of surrogate models at every iteration and adaptively selects the best model via cross-validation”).
Regarding claim 11, the rejection of claim 10 is incorporated, and further, the proposed combination teaches select the predictor model based on an error of the predictor model (Lu, Page 8, Paragraph 1, Lines 6-8, “we propose a selection mechanism, dubbed Adaptive Switching (AS), which constructs all four types of surrogate models at every iteration and adaptively selects the best model via cross-validation”; selecting based on “cross-validation” is considered to be selecting “based on an error of the predictor model”).
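For context, the “Adaptive Switching” mechanism cited above (construct several surrogate models each iteration and keep the one with the lowest cross-validation error) can be sketched as follows. The use of scikit-learn regressors is an assumption for illustration (MLPRegressor ~ MLP, DecisionTreeRegressor ~ CART, KernelRidge with an RBF kernel ~ RBF, GaussianProcessRegressor ~ GP); Lu's implementations are not reproduced here.

```python
# A sketch of surrogate-model selection by cross-validation error, in the
# spirit of Lu's Adaptive Switching; library stand-ins are assumptions.
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor

def adaptive_switching(X, y):
    """Fit each candidate surrogate; keep the lowest cross-validation MSE."""
    candidates = [MLPRegressor(max_iter=2000), DecisionTreeRegressor(),
                  KernelRidge(kernel="rbf"), GaussianProcessRegressor()]
    scored = [(-cross_val_score(m, X, y, cv=5,
                                scoring="neg_mean_squared_error").mean(), m)
              for m in candidates]
    err, best = min(scored, key=lambda t: t[0])   # "based on an error"
    return best.fit(X, y)
```

Given an array X of architecture encodings and a vector y of measured accuracies, the fitted model returned by adaptive_switching(X, y) would serve as the predictor for the next search iteration.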
Regarding claim 12, the rejection of claim 8 is incorporated, and further, the proposed combination teaches generate the first plurality of candidate architecture configurations using an evolutionary protocol (Lu, Page 7, Figure 3 Description, Lines 3-5, “New candidate architectures (brown boxes in (b)) are obtained by solving the auxiliary single-level multi-objective problem F̃ = {Sf, C} (line 10 in Algo 1)”; Lu, Page 8, Paragraph 2, Lines 1-6, “we apply the NSGA-II algorithm to simultaneously optimize for both accuracy (predicted) and other objectives of interest to the user (line 10 in Algorithm 1) ... At the conclusion of the NSGA-II search, a set of non-dominated architectures is output, see Fig. 3(b)”; A person of ordinary skill in the art would recognize that “NSGA-II” is “an evolutionary protocol”; this is further supported by the title of the reference).
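For context, a candidate-generation step of an “evolutionary protocol” can be sketched as follows. This is a deliberately simplified single-objective illustration: NSGA-II's non-dominated sorting and crowding-distance selection are omitted, and all names, encodings, and parameters are assumptions rather than Lu's algorithm.

```python
import random

def evolve(population, predictor, generations=10, mutation_rate=0.1):
    """Produce candidates by crossover and mutation, ranked by the predictor."""
    for _ in range(generations):
        offspring = []
        for _ in range(len(population)):
            p1, p2 = random.sample(population, 2)
            cut = random.randrange(1, len(p1))
            child = list(p1[:cut] + p2[cut:])              # one-point crossover
            for i in range(len(child)):
                if random.random() < mutation_rate:
                    child[i] = random.randint(1, 8)        # point mutation
            offspring.append(tuple(child))
        # Survivor selection by predicted accuracy (stand-in for Pareto ranking).
        population = sorted(population + offspring,
                            key=predictor, reverse=True)[:len(population)]
    return population

# Toy demonstration with 4-gene encodings and a stand-in predictor.
pop = [tuple(random.randint(1, 8) for _ in range(4)) for _ in range(20)]
print(evolve(pop, predictor=lambda arch: sum(arch))[:3])
```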
Regarding claim 29, Lu teaches A method to perform a neural architecture search (Lu, Page 1, Abstract, Line 1, “In this paper, we propose an efficient NAS algorithm”), the method comprising:
training, …, a predictor model based on a first subgroup of architecture configurations (Lu, Page 6, Section 3.2, Paragraph 2, Lines 12-14, “we start with an accuracy predictor constructed from only a limited number of architectures sampled randomly from the search space”; “construct[ing]” the “accuracy predictor” is considered to be “train[ing] a predictor model”);
generating, …, a first plurality of candidate architecture configurations by executing the trained predictor model to search a plurality of neural network architecture configurations (Lu, Page 7, Figure 3 Description, Lines 3-5, “New candidate architectures (brown boxes in (b)) are obtained by solving the auxiliary single-level multi-objective problem F̃ = {Sf, C} (line 10 in Algo 1)”; Lu, Page 8, Paragraph 2, Lines 1-6, “we apply the NSGA-II algorithm to simultaneously optimize for both accuracy (predicted) and other objectives of interest to the user (line 10 in Algorithm 1) ... At the conclusion of the NSGA-II search, a set of non-dominated architectures is output, see Fig. 3(b)”);
generating, …, a second subgroup of architecture configurations by selecting a number of the first plurality of candidate architecture configurations (Lu, Page 7, Figure 3 Description, Lines 5-6, “A subset of the candidate architectures is chosen to diversify the Pareto front (c) - (d)”; See also: Lu, Algorithm 1, Line 11; Lu, Page 8, Paragraph 2, Lines 7-10, “To select a subset, we first select the architecture with highest predicted accuracy. Then we project all other architecture candidates to the #MAdds axis, and pick the remaining architectures from the sparse regions that help in extending the Pareto frontier to diverse #MAdds regimes, see Fig. 3(c) - (d)”);
…
re-training, …, the predictor model using the first subgroup and the … second subgroup (Lu, Page 7, Figure 3 Description, Lines 6-7, “The selected candidate architectures are then evaluated and added to the archive (e)”; Lu, Page 7, Figure 3 Description, Lines 1-3, “In each iteration, accuracy prediction surrogate models Sf are constructed from an archive of previously evaluated architectures (a)”; constructing the “surrogate models” is considered to be “re-train[ing] the predictor model”; Lu, Page 8, Paragraph 2, Lines 12-13, “We finally add these architectures to the training samples to refine our accuracy predictor models and proceed to next iteration, see Fig. 3(e)”; Lu uses “surrogate model” and “accuracy predictor” interchangeably, so the two are considered equivalent);
generating, …, a second plurality of candidate architecture configurations by executing the re-trained predictor model to search the plurality of neural network architecture configurations (Lu, Page 7, Figure 3 Description, Lines 3-5, “New candidate architectures (brown boxes in (b)) are obtained by solving the auxiliary single-level multi-objective problem F̃ = {Sf, C} (line 10 in Algo 1)”; Lu, Page 8, Paragraph 2, Lines 1-6, “we apply the NSGA-II algorithm to simultaneously optimize for both accuracy (predicted) and other objectives of interest to the user (line 10 in Algorithm 1) ... At the conclusion of the NSGA-II search, a set of non-dominated architectures is output, see Fig. 3(b)”; This operation is performed at each iteration; the “second plurality of candidate architecture configurations” is considered to be the configurations generated during the second iteration); and
selecting, …, an architecture configuration from the second plurality of candidate architecture configurations for implementation in an artificial intelligence model (Lu, Page 12, Figure 6 Description, Lines 4-6, “Our models are obtained by directly searching on the respective datasets. In most problems, MSuNAS finds more accurate solutions with fewer parameters”; see also Lu, Page 12, Figure 6; In order to generate accuracy results, a model must have been selected for implementation; a person of ordinary skill in the art would recognize that “our models” are models generated from the selected architectures).
Further, Lu teaches that the steps of the method are performed by executing an instruction with one or more processors (Lu, Pages 9-10, Section 4.2, Paragraph 2, Lines 1-7, “We then compare the search efficiency of MSuNAS to NSGANet [22] and random search under a bi-objective setup: Top-1 accuracy and #MAdds. To perform the comparison, we run MSuNAS for 30 iterations, leading to 350 architectures evaluated in total. We record the cumulative hypervolume [42] achieved against the number of architectures evaluated. We repeat this process five times on both ImageNet and CIFAR-10 datasets to capture the variance in performance due to randomness in the search initialization”; A person of ordinary skill in the art would recognize that this process would require a computer, thus providing evidence for “instructions” and “one or more processors”).
Lu does not explicitly teach validating, …, ones of the second subgroup of architecture configurations nor that the retraining is performed using the ones of the second subgroup.
Letunovskiy teaches validating ones of the second subgroup of architecture configurations and that the retraining is performed using the ones of the second subgroup (Letunovskiy, Paragraph 0177, Lines 8-15, “In step 320, architectures of the first set are filtered to obtain best architectures. The filtering may be performed according to the accuracy and latency predicted by the surrogate model E. In particular, the filtering may consist of discarding from the first set architectures which do not satisfy a validation accuracy threshold a and a latency threshold 1, thereby obtaining the second set of N.sub.1 filtered architectures”; Because the architectures that do not satisfy a validation accuracy are removed, only the remaining architectures are used in the remainder of the method, including retraining).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the neural architecture search method of Lu to include validating selected architectures as taught by Letunovskiy. The motivation to do so would have been to meet hardware requirements of the device on which the model will be implemented, as well as to reduce computation time and computational resources by removing poor-performing architectures (Letunovskiy, Paragraph 0177).
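For illustration only, the following Python sketch shows the kind of surrogate-assisted search loop mapped above: an initial subgroup is sampled and evaluated, a predictor is trained on it, an evolutionary step proposes candidates, a subgroup of candidates is validated against accuracy and latency thresholds in the manner of Letunovskiy, and only the validated configurations are added to the data used to re-train the predictor. Every identifier here (SEARCH_SPACE, true_accuracy, latency, train_surrogate, evolutionary_candidates, ACC_MIN, LAT_MAX) is a hypothetical toy stand-in; the sketch does not reproduce code from Lu or Letunovskiy.

```python
import random

random.seed(0)

# Toy stand-ins (hypothetical; not code from Lu or Letunovskiy): an
# "architecture configuration" is a tuple of six integer layer widths.
SEARCH_SPACE = [tuple(random.randint(1, 8) for _ in range(6)) for _ in range(2000)]

def true_accuracy(arch):
    # Proxy for fully training and measuring the architecture.
    return sum(w * (i + 1) for i, w in enumerate(arch)) / 168.0

def latency(arch):
    # Proxy for measured latency on the target device (arbitrary units).
    return float(sum(arch))

def train_surrogate(archive):
    # Degenerate 1-nearest-neighbour "predictor"; Lu instead fits
    # MLP/CART/RBF/GP surrogate models to the archive.
    def predict(arch):
        nearest = min(archive,
                      key=lambda r: sum((a - b) ** 2 for a, b in zip(arch, r[0])))
        return nearest[1]
    return predict

def evolutionary_candidates(predict, parents, n_out=24):
    # Mutation plus greedy ranking stands in for Lu's NSGA-II search.
    children = [tuple(max(1, w + random.choice((-1, 0, 1)))
                      for w in random.choice(parents))
                for _ in range(200)]
    return sorted(children, key=lambda a: (-predict(a), latency(a)))[:n_out]

ACC_MIN, LAT_MAX = 0.55, 30.0  # Letunovskiy-style validation thresholds

# First subgroup: architectures sampled randomly from the search space.
archive = [(a, true_accuracy(a)) for a in random.sample(SEARCH_SPACE, 50)]
for _ in range(5):
    predict = train_surrogate(archive)                      # (re-)train the predictor
    candidates = evolutionary_candidates(predict, [a for a, _ in archive])
    subgroup = candidates[:8]                               # second subgroup
    validated = [a for a in subgroup                        # validate "ones of" it
                 if true_accuracy(a) >= ACC_MIN and latency(a) <= LAT_MAX]
    archive.extend((a, true_accuracy(a)) for a in validated)  # data for re-training

best = max(archive, key=lambda r: r[1])
print("selected configuration:", best[0], "accuracy:", round(best[1], 3))
```

The 1-nearest-neighbour “predictor” and mutation-based candidate generator are deliberate simplifications of the surrogate models and NSGA-II search that Lu actually describes; only the overall loop structure is intended to track the claim mapping above.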
Regarding claim 30, the rejection of claim 29 is incorporated, and further, the proposed combination teaches measuring a first objective of the first subgroup and a second objective of the first subgroup (Lu, Page 8, Paragraph 2, Lines 1-4, “With the accuracy predictor selected by AS, we apply the NSGA-II algorithm to simultaneously optimize for both accuracy (predicted) and other objectives of interest to the user (line 10 in Algorithm 1). For the purpose of illustration, we assume that the user is interested in optimizing #MAdds as the second objective”); and
training the predictor model based on the first and second objectives (Lu, Page 8, Paragraph 2, Lines 5-6, “At the conclusion of the NSGA-II search, a set of non-dominated architectures is output, see Fig. 3(b)”; Lu, Page 8, Paragraph 2, Lines 11-13, “The architectures from the chosen subset are then sent to the lower level for SGD training. We finally add these architectures to the training samples to refine our accuracy predictor models and proceed to next iteration, see Fig. 3(e)” The predictor models are trained using the selected architectures, and the architectures are selected to optimize the objectives; thus the models are trained “based on” the first and second objectives).
Regarding claim 31, the rejection of claim 29 is incorporated, and further, the proposed combination teaches training the predictor model using first predictors (Lu, Page 8, Paragraph 1, “We first collected four different surrogate models for accuracy prediction from the literature, namely, Multi Layer Perceptron (MLP) [19], Classification And Regression Trees (CART) [34], Radial Basis Function (RBF) [1] and Gaussian Process (GP) [10]. From our ablation study, we observed that no one surrogate model is consistently better than others in terms of the above two criteria on all datasets (see section 4.1). Hence, we propose a selection mechanism, dubbed Adaptive Switching (AS), which constructs all four types of surrogate models at every iteration and adaptively selects the best model via cross-validation”).
Regarding claim 32, the rejection of claim 31 is incorporated, and further, the proposed combination teaches selecting the predictor model based on an error of the predictor model (Lu, Page 8, Paragraph 1, Lines 6-8, “we propose a selection mechanism, dubbed Adaptive Switching (AS), which constructs all four types of surrogate models at every iteration and adaptively selects the best model via cross-validation” selecting based on “cross-validation” is considered to be selecting “based on an error of the predictor model”).
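For illustration of the Adaptive Switching mechanism quoted above, the sketch below fits several surrogate families and keeps the one with the lowest cross-validation error, which is the sense in which selection by cross-validation is selection “based on an error.” It uses scikit-learn estimators as rough analogues of Lu's MLP/CART/RBF/GP surrogates (KernelRidge with an RBF kernel standing in for the RBF model); the toy data are invented, and nothing here reproduces Lu's implementation.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.kernel_ridge import KernelRidge

# Toy data standing in for (architecture encoding, measured accuracy) pairs.
rng = np.random.default_rng(0)
X = rng.integers(1, 9, size=(50, 6)).astype(float)
y = X @ np.linspace(0.02, 0.12, 6) + rng.normal(0, 0.01, 50)

# Candidate surrogate families, loosely mirroring Lu's MLP/CART/RBF/GP set.
models = {
    "MLP": MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
    "CART": DecisionTreeRegressor(random_state=0),
    "RBF": KernelRidge(kernel="rbf"),
    "GP": GaussianProcessRegressor(random_state=0),
}

# Adaptive Switching: pick the model with the lowest cross-validated error.
errors = {name: -cross_val_score(m, X, y, cv=5,
                                 scoring="neg_mean_squared_error").mean()
          for name, m in models.items()}
best_name = min(errors, key=errors.get)
print("selected surrogate:", best_name, "CV-MSE:", round(errors[best_name], 5))
models[best_name].fit(X, y)  # re-fit the winner on all available samples
```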
Claims 1-5 and 15-19 are rejected under 35 U.S.C. 103 as being unpatentable over Lu in view of Letunovskiy in further view of Loh et al., U.S. Patent Application Publication No. 20210192337, hereinafter referred to as "Loh".
Regarding claim 1, Lu teaches perform a neural architecture search (Lu, Page 1, Abstract, Line 1, “In this paper, we propose an efficient NAS algorithm”) … access a first subgroup of neural network architecture configurations from a plurality of neural network architecture configurations (Lu, Page 6, Section 3.2, Paragraph 2, Lines 12-14, “we start with an accuracy predictor constructed from only a limited number of architectures sampled randomly from the search space” The “limited number of architectures sampled randomly from the search space” is considered to be the “first subgroup”);
train a predictor model based on the first subgroup (Lu, Page 6, Section 3.2, Paragraph 2, Lines 12-14, “we start with an accuracy predictor constructed from only a limited number of architectures sampled randomly from the search space” “construct[ing]” the “accuracy predictor” is considered to be “train[ing] a predictor model”);
generate a first plurality of candidate architecture configurations by executing the trained predictor model to search the plurality of neural network architecture configurations (Lu, Page 7, Figure 3 Description, Lines 3-5, “New candidate architectures (brown boxes in (b)) are obtained by solving the auxiliary single-level multi-objective problem F̃ = (S_f, C) (line 10 in Algo 1)”; Lu, Page 8, Paragraph 2, Lines 1-6, “we apply the NSGA-II algorithm to simultaneously optimize for both accuracy (predicted) and other objectives of interest to the user (line 10 in Algorithm 1) ... At the conclusion of the NSGA-II search, a set of non-dominated architectures is output, see Fig. 3(b)”; The “NSGA-II algorithm” uses the predictor model, and thus the “first plurality of candidate architecture configurations” is generated “by executing the trained predictor model to search a plurality of neural network architecture configurations”);
generate a second subgroup of architecture configurations by selecting a number of the first plurality of candidate architecture configurations (Lu, Page 7, Figure 3 Description, Lines 5-6, “A subset of the candidate architectures is chosen to diversify the Pareto front (c) - (d)”; See also: Lu, Algorithm 1, Line 11);
…
re-train the predictor model based on the first subgroup and the … second subgroup (Lu, Page 7, Figure 3 Description, Lines 6-7, “The selected candidate architectures are then evaluated and added to the archive (e)”; Lu, Page 7, Figure 3 Description, Lines 1-3, “In each iteration, accuracy prediction surrogate models Sf are constructed from an archive of previously evaluated architectures (a)” constructing the “surrogate models” is considered to be “re-train[ing] the predictor model”; Lu, Page 8, Paragraph 2, Lines 12-13, “We finally add these architectures to the training samples to refine our accuracy predictor models and proceed to next iteration, see Fig. 3(e)” Lu uses “surrogate model” and “accuracy predictor” interchangeably; they are considered to be equivalent);
generate a second plurality of candidate architecture configurations by executing the re-trained predictor model to search the plurality of neural network architecture configurations (Lu, Page 7, Figure 3 Description, Lines 3-5, “New candidate architectures (brown boxes in (b)) are obtained by solving the auxiliary single-level multi-objective problem F̃ = (S_f, C) (line 10 in Algo 1)”; Lu, Page 8, Paragraph 2, Lines 1-6, “we apply the NSGA-II algorithm to simultaneously optimize for both accuracy (predicted) and other objectives of interest to the user (line 10 in Algorithm 1) ... At the conclusion of the NSGA-II search, a set of non-dominated architectures is output, see Fig. 3(b)” This operation is performed at each iteration; the “second plurality of candidate architecture configurations” is considered to be the architectures generated during the second iteration); and
select an architecture configuration from the second plurality of candidate architecture configurations for implementation in an artificial intelligence model (Lu, Page 12, Figure 6 Description, Lines 4-6, “Our models are obtained by directly searching on the respective datasets. In most problems, MSuNAS finds more accurate solutions with fewer parameters”; see also Lu, Page 12, Figure 6; In order to generate accuracy results, a model must have been selected for implementation; a person of ordinary skill in the art would recognize that “our models” are models generated from the selected architectures).
Lu does not explicitly teach validate ones of the second subgroup of architecture configurations nor that the retraining is performed using the ones of the second subgroup nor an apparatus, an interface, machine-readable instructions, nor at least one processor circuit to be programmed by the machine-readable instructions.
Letunovskiy teaches validate ones of the second subgroup of architecture configurations and that the retraining is performed using the ones of the second subgroup (Letunovskiy, Paragraph 0177, Lines 8-15, “In step 320, architectures of the first set are filtered to obtain best architectures. The filtering may be performed according to the accuracy and latency predicted by the surrogate model E. In particular, the filtering may consist of discarding from the first set architectures which do not satisfy a validation accuracy threshold a and a latency threshold l, thereby obtaining the second set of N.sub.1 filtered architectures”; Because the architectures that do not satisfy the validation accuracy threshold are removed, only the remaining architectures are used in the remainder of the method, including retraining).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the neural architecture search method of Lu to include validating selected architectures as taught by Letunovskiy. The motivation to do so would have been to meet hardware requirements of the device on which the model will be implemented, as well as to reduce computation time and computational resources by removing poor-performing architectures (Letunovskiy, Paragraph 0177).
The proposed combination does not explicitly teach an apparatus, an interface, machine-readable instructions, nor at least one processor circuit to be programmed by the machine-readable instructions.
Loh teaches an apparatus, an interface, instructions, and processor circuitry to execute the instructions (Loh, Paragraph 0041, Lines 1-4, “System 100 includes communication bus 110 coupled to one or more processors 120, memory 130, I/O interfaces 140, display interface 150, one or more communication interfaces 160 and one or more HAs 170”; Loh, Paragraph 0043, Lines 1-4, “Processor 120 includes one or more general-purpose or application-specific microprocessors that executes instructions to perform control, computation, input/output, etc. functions for system 100”; Loh, Paragraph 0049, Lines 1-6, “I/O interfaces 140 are configured to transmit and/or receive data from I/O devices 142. I/O interfaces 140 enable connectivity between processor 120 and I/O devices 142 by encoding data to be sent from processor 120 to I/O devices 142, and decoding data received from I/O devices 142 for processor 120”; Loh, Paragraph 0050, Lines 1-2, “Generally, I/O devices 142 provide input to system 100 and/or output from system 100”).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the method of the proposed combination to include the hardware (apparatus, interface, instructions, and processor circuitry) of Loh. The motivation for doing so would have been that the interface allows for communication with a user (Loh, Paragraphs 0049-0050). Further, any person of ordinary skill in the art would understand that it is necessary to implement a computer-based methodology on a computer infrastructure. Additionally, one would have been motivated to make such a combination in order to better verify the results of the method taught by the proposed combination.
Regarding claim 2, the rejection of claim 1 is incorporated, and further, the proposed combination teaches wherein the one or more of the at least one processor circuit (Loh, Paragraph 0043, Lines 1-4, “Processor 120 includes one or more general-purpose or application-specific microprocessors that executes instructions to perform control, computation, input/output, etc. functions for system 100”) is to: measure a first objective of the first subgroup and a second objective of the first subgroup (Lu, Page 8, Paragraph 2, Lines 1-4, “With the accuracy predictor selected by AS, we apply the NSGA-II algorithm to simultaneously optimize for both accuracy (predicted) and other objectives of interest to the user (line 10 in Algorithm 1). For the purpose of illustration, we assume that the user is interested in optimizing #MAdds as the second objective”); and
train the predictor model based on the first and second objectives (Lu, Page 8, Paragraph 2, Lines 5-6, “At the conclusion of the NSGA-II search, a set of non-dominated architectures is output, see Fig. 3(b)”; Lu, Page 8, Paragraph 2, Lines 11-13, “The architectures from the chosen subset are then sent to the lower level for SGD training. We finally add these architectures to the training samples to refine our accuracy predictor models and proceed to next iteration, see Fig. 3(e)” The predictor models are trained using the selected architectures, and the architectures are selected to optimize the objectives; thus the models are trained “based on” the first and second objectives).
Regarding claim 3, the rejection of claim 1 is incorporated, and further, the proposed combination teaches wherein the one or more of the at least one processor circuit (Loh, Paragraph 0043, Lines 1-4, “Processor 120 includes one or more general-purpose or application-specific microprocessors that executes instructions to perform control, computation, input/output, etc. functions for system 100”) is to train the predictor model using first predictors (Lu, Page 8, Paragraph 1, “We first collected four different surrogate models for accuracy prediction from the literature, namely, Multi Layer Perceptron (MLP) [19], Classification And Regression Trees (CART) [34], Radial Basis Function (RBF) [1] and Gaussian Process (GP) [10]. From our ablation study, we observed that no one surrogate model is consistently better than others in terms of the above two criteria on all datasets (see section 4.1). Hence, we propose a selection mechanism, dubbed Adaptive Switching (AS), which constructs all four types of surrogate models at every iteration and adaptively selects the best model via cross-validation”).
Regarding claim 4, the rejection of claim 3 is incorporated, and further, the proposed combination teaches wherein one or more of the at least one processor circuit (Loh, Paragraph 0043, Lines 1-4, “Processor 120 includes one or more general-purpose or application-specific microprocessors that executes instructions to perform control, computation, input/output, etc. functions for system 100”) is to select the predictor model based on an error of the predictor model (Lu, Page 8, Paragraph 1, Lines 6-8, “we propose a selection mechanism, dubbed Adaptive Switching (AS), which constructs all four types of surrogate models at every iteration and adaptively selects the best model via cross-validation” selecting based on “cross-validation” is considered to be selected “based on an error of the predictor model”).
Regarding claim 5, the rejection of claim 1 is incorporated, and further, the proposed combination teaches wherein one or more of the at least one processor circuit (Loh, Paragraph 0043, Lines 1-4, “Processor 120 includes one or more general-purpose or application-specific microprocessors that executes instructions to perform control, computation, input/output, etc. functions for system 100”) is to generate the first plurality of candidate architecture configurations using an evolutionary protocol (Lu, Page 7, Figure 3 Description, Lines 3-5, “New candidate architectures (brown boxes in (b)) are obtained by solving the auxiliary single-level multi-objective problem F̃ = (S_f, C) (line 10 in Algo 1)”; Lu, Page 8, Paragraph 2, Lines 1-6, “we apply the NSGA-II algorithm to simultaneously optimize for both accuracy (predicted) and other objectives of interest to the user (line 10 in Algorithm 1) ... At the conclusion of the NSGA-II search, a set of non-dominated architectures is output, see Fig. 3(b)” A person of ordinary skill in the art would recognize that “NSGA-II” is considered to be “an evolutionary protocol”; this is further supported by the title of the reference).
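To make concrete why NSGA-II is an evolutionary, Pareto-based protocol, the sketch below shows only the non-dominated filtering step that produces the “set of non-dominated architectures” quoted above. It is a minimal illustration under invented scores, not Lu's implementation; a full NSGA-II additionally uses non-dominated sorting into ranks, crowding distance, tournament selection, and crossover/mutation.

```python
def dominates(p, q):
    # p and q are (predicted_error, madds) pairs; lower is better in both.
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def non_dominated(points):
    # Keep every point that no other point dominates.
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

# Hypothetical candidates scored as (1 - predicted accuracy, #MAdds).
scores = [(0.08, 300), (0.10, 280), (0.07, 480), (0.12, 200), (0.09, 260)]
print(non_dominated(scores))
# -> [(0.08, 300), (0.07, 480), (0.12, 200), (0.09, 260)]
```

Here (0.10, 280) is discarded because (0.09, 260) is better in both objectives; the surviving points form the Pareto front from which a diversified subset would then be chosen.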
Regarding claim 15, Lu teaches perform a neural architecture search (Lu, Page 1, Abstract, Line 1, “In this paper, we propose an efficient NAS algorithm”)… access a first subgroup of neural network architecture configurations from a plurality of neural network architecture configurations (Lu, Page 6, Section 3.2, Paragraph 2, Lines 12-14, “we start with an accuracy predictor constructed from only a limited number of architectures sampled randomly from the search space” The “limited number of architectures sampled randomly from the search space” is considered to be the “first subgroup”)
… train a predictor model using a first subgroup of architecture configurations (Lu, Page 6, Section 3.2, Paragraph 2, Lines 12-14, “we start with an accuracy predictor constructed from only a limited number of architectures sampled randomly from the search space” “construct[ing]” the “accuracy predictor” is considered to be “train[ing] a predictor model”);
… generate a first plurality of candidate architecture configurations by executing the trained predictor model to search the plurality of neural network architecture configurations (Lu, Page 7, Figure 3 Description, Lines 3-5, “New candidate architectures (brown boxes in (b)) are obtained by solving the auxiliary single-level multi-objective problem F̃ = (S_f, C) (line 10 in Algo 1)”; Lu, Page 8, Paragraph 2, Lines 1-6, “we apply the NSGA-II algorithm to simultaneously optimize for both accuracy (predicted) and other objectives of interest to the user (line 10 in Algorithm 1) ... At the conclusion of the NSGA-II search, a set of non-dominated architectures is output, see Fig. 3(b)” The “NSGA-II algorithm” uses the predictor model, and thus the “first plurality of candidate architecture configurations” is generated “by executing the trained predictor model to search a plurality of neural network architecture configurations”);
… generate a second subgroup of architecture configurations by selecting a number of the first plurality of candidate architecture configurations (Lu, Page 7, Figure 3 Description, Lines 5-6, “A subset of the candidate architectures is chosen to diversify the Pareto front (c) - (d)”; See also: Lu, Algorithm 1, Line 11); and
…
… re-train the predictor model based on the first subgroup and the … second subgroup (Lu, Page 7, Figure 3 Description, Lines 6-7, “The selected candidate architectures are then evaluated and added to the archive (e)”; Lu, Page 7, Figure 3 Description, Lines 1-3, “In each iteration, accuracy prediction surrogate models Sf are constructed from an archive of previously evaluated architectures (a)” constructing the “surrogate models” is considered to be “re-train[ing] the predictor model”; Lu, Page 8, Paragraph 2, Lines 12-13, “We finally add these architectures to the training samples to refine our accuracy predictor models and proceed to next iteration, see Fig. 3(e)” Lu uses “surrogate model” and “accuracy predictor” interchangeably; they are considered to be equivalent); and
… generate a second plurality of candidate architecture configurations by executing the re-trained predictor model to search the plurality of candidate architecture configurations for implementation in an artificial intelligence model (Lu, Page 7, Figure 3 Description, Lines 3-5, “New candidate architectures (brown boxes in (b)) are obtained by solving the auxiliary single-level multi-objective problem F̃ = (S_f, C) (line 10 in Algo 1)”; Lu, Page 8, Paragraph 2, Lines 1-6, “we apply the NSGA-II algorithm to simultaneously optimize for both accuracy (predicted) and other objectives of interest to the user (line 10 in Algorithm 1) ... At the conclusion of the NSGA-II search, a set of non-dominated architectures is output, see Fig. 3(b)” This operation is performed at each iteration; the “second plurality of candidate architecture configurations” is considered to be the architectures generated during the second iteration).
Lu does not explicitly teach validate ones of the second subgroup of architecture configurations nor that the retraining is performed using the ones of the second subgroup nor an apparatus, interface circuitry, processor circuitry including one or more of: at least one of a central processing unit, a graphics processing unit or a digital signal processor, the at least one of the central processing unit, the graphics processing unit or the digital signal processor having control circuitry, one or more registers, and arithmetic and logic circuitry to perform one or more first operations corresponding to instructions in the apparatus, and; a Field Programmable Gate Array (FPGA), the FPGA including logic gate circuitry, a plurality of configurable interconnections, and storage circuitry, the logic gate circuitry and interconnections to perform one or more second operations; or Application Specific Integrated Circuitry (ASIC) including logic gate circuitry to perform one or more third operations; the processor circuitry to perform at least one of the first operations, the second operations or the third operations to instantiate: predictor training circuitry, evolutionary protocol circuitry, or architecture selection circuitry.
Letunovskiy teaches validate ones of the second subgroup of architecture configurations and that the retraining is performed using the ones of the second subgroup (Letunovskiy, Paragraph 0177, Lines 8-15, “In step 320, architectures of the first set are filtered to obtain best architectures. The filtering may be performed according to the accuracy and latency predicted by the surrogate model E. In particular, the filtering may consist of discarding from the first set architectures which do not satisfy a validation accuracy threshold a and a latency threshold l, thereby obtaining the second set of N.sub.1 filtered architectures”; Because the architectures that do not satisfy the validation accuracy threshold are removed, only the remaining architectures are used in the remainder of the method, including retraining).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the neural architecture search method of Lu to include validating selected architectures as taught by Letunovskiy. The motivation to do so would have been to meet hardware requirements of the device on which the model will be implemented, as well as to reduce computation time and computational resources by removing poor-performing architectures (Letunovskiy, Paragraph 0177).
As noted above, Lu also does not explicitly teach the recited apparatus, interface circuitry, or processor circuitry, nor the predictor training circuitry, evolutionary protocol circuitry, or architecture selection circuitry instantiated thereby.
Loh teaches an apparatus, interface circuitry, and processor circuitry (Loh, Paragraph 0041, Lines 1-4, “System 100 includes communication bus 110 coupled to one or more processors 120, memory 130, I/O interfaces 140, display interface 150, one or more communication interfaces 160 and one or more HAs 170”; Loh, Paragraph 0043, Lines 1-4, “Processor 120 includes one or more general-purpose or application-specific microprocessors that executes instructions to perform control, computation, input/output, etc. functions for system 100”; Loh, Paragraph 0049, Lines 1-6, “I/O interfaces 140 are configured to transmit and/or receive data from I/O devices 142. I/O interfaces 140 enable connectivity between processor 120 and I/O devices 142 by encoding data to be sent from processor 120 to I/O devices 142, and decoding data received from I/O devices 142 for processor 120”; Loh, Paragraph 0050, Lines 1-2, “Generally, I/O devices 142 provide input to system 100 and/or output from system 100”) including one or more of: a central processing unit, the central processing unit having control circuitry, one or more registers, and arithmetic and logic circuitry to perform one or more first operations corresponding to instructions in the apparatus (Loh, Paragraph 0043, Lines 8-10, “system 100 may include one or more central processing units (CPUs) 120, each containing one or more processing cores”; Loh, Paragraph 0043, Lines 1-4, “Processor 120 includes one or more general-purpose or application-specific microprocessors that executes instructions to perform control, computation, input/output, etc. functions for system 100”; See also, Loh, Figure 3; A person of ordinary skill in the art would recognize that a CPU has control circuitry, one or more registers, and arithmetic and logic circuitry), and; Application Specific Integrated Circuitry (ASIC) including logic gate circuitry to perform one or more third operations (Loh, Paragraph 0053, Lines 8-13, “As noted above, processor 120 may provide the same functionality as HAs 170. Additionally, HAs 170 may include one or more DSPs, FPGAs, ASICs, etc., and may include one or more memory blocks including RAM, ROM, EEPROM, flash memory, etc., integrated circuits, programmable circuits, etc.” A person of ordinary skill in the art would recognize that an ASIC includes logic gate circuitry); the processor circuitry to perform at least one of the first operations, the second operations or the third operations to instantiate: predictor training circuitry … evolutionary protocol circuitry … architecture selection circuitry (Loh, Paragraph 0040, “FIG. 3 depicts a block diagram of a system, in accordance with an embodiment of the present disclosure”; Loh, Paragraph 0043, Lines 1-4, “Processor 120 includes one or more general-purpose or application-specific microprocessors that executes instructions to perform control, computation, input/output, etc. functions for system 100” In light of applicant’s specification, “predictor training circuitry”, “evolutionary protocol circuitry”, and “architecture selection circuitry” are all parts of the “processor circuitry”).
It is noted that the claim recites alternative language, and Loh teaches at least one of the alternatives.
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the method of the proposed combination to include the hardware (apparatus, interface, instructions, and processor circuitry) of Loh. The motivation for doing so would have been that the interface allows for communication with a user (Loh, Paragraphs 0049-0050). Further, any person of ordinary skill in the art would understand that it is necessary to implement a computer-based methodology on a computer infrastructure. Additionally, one would have been motivated to make such a combination in order to better verify the results of the method taught by the proposed combination.
Regarding claim 16, the rejection of claim 15 is incorporated, and further, the proposed combination teaches validation circuitry (Loh, Paragraph 0043, Lines 1-4, “Processor 120 includes one or more general-purpose or application-specific microprocessors that executes instructions to perform control, computation, input/output, etc. functions for system 100”; In light of applicant’s specification, “validation circuitry” is part of the “processor circuitry”) to measure a first objective of the first subgroup and a second objective of the first subgroup (Lu, Page 8, Paragraph 2, Lines 1-4, “With the accuracy predictor selected by AS, we apply the NSGA-II algorithm to simultaneously optimize for both accuracy (predicted) and other objectives of interest to the user (line 10 in Algorithm 1). For the purpose of illustration, we assume that the user is interested in optimizing #MAdds as the second objective”), the predictor training circuitry (Loh, Paragraph 0043, Lines 1-4, “Processor 120 includes one or more general-purpose or application-specific microprocessors that executes instructions to perform control, computation, input/output, etc. functions for system 100”; In light of applicant’s specification, “the predictor training circuitry” is part of the “processor circuitry”) to train the first predictors based on the first and second objectives (Lu, Page 8, Paragraph 2, Lines 5-6, “At the conclusion of the NSGA-II search, a set of non-dominated architectures is output, see Fig. 3(b)”; Lu, Page 8, Paragraph 2, Lines 11-13, “The architectures from the chosen subset are then sent to the lower level for SGD training. We finally add these architectures to the training samples to refine our accuracy predictor models and proceed to next iteration, see Fig. 3(e)” The predictor models are trained using the selected architectures, and the architectures are selected to optimize the objectives; thus the models are trained “based on” the first and second objectives).
Regarding claim 17, the rejection of claim 15 is incorporated, and further, the proposed combination teaches wherein the predictor training circuitry (Loh, Paragraph 0043, Lines 1-4, “Processor 120 includes one or more general-purpose or application-specific microprocessors that executes instructions to perform control, computation, input/output, etc. functions for system 100”; In light of applicant’s specification, “the predictor training circuitry” is part of the “processor circuitry”) is to train the predictor model using first predictors (Lu, Page 8, Paragraph 1, “We first collected four different surrogate models for accuracy prediction from the literature, namely, Multi Layer Perceptron (MLP) [19], Classification And Regression Trees (CART) [34], Radial Basis Function (RBF) [1] and Gaussian Process (GP) [10]. From our ablation study, we observed that no one surrogate model is consistently better than others in terms of the above two criteria on all datasets (see section 4.1). Hence, we propose a selection mechanism, dubbed Adaptive Switching (AS), which constructs all four types of surrogate models at every iteration and adaptively selects the best model via cross-validation”).
Regarding claim 18, the rejection of claim 17 is incorporated, and further, the proposed combination teaches wherein the predictor training circuitry (Loh, Paragraph 0043, Lines 1-4, “Processor 120 includes one or more general-purpose or application-specific microprocessors that executes instructions to perform control, computation, input/output, etc. functions for system 100”; In light of applicant’s specification, “the predictor training circuitry” is part of the “processor circuitry”) is to select the predictor model based on an error of the predictor model (Lu, Page 8, Paragraph 1, Lines 6-8, “we propose a selection mechanism, dubbed Adaptive Switching (AS), which constructs all four types of surrogate models at every iteration and adaptively selects the best model via cross-validation” selecting based on “cross-validation” is considered to be selecting “based on an error of the predictor model”).
Regarding claim 19, the rejection of claim 15 is incorporated, and further, the proposed combination teaches wherein the evolutionary protocol circuitry (Loh, Paragraph 0043, Lines 1-4, “Processor 120 includes one or more general-purpose or application-specific microprocessors that executes instructions to perform control, computation, input/output, etc. functions for system 100”; In light of applicant’s specification, “the evolutionary protocol circuitry” is part of the “processor circuitry”) is to generate the first plurality of candidate architecture configurations using an evolutionary protocol (Lu, Page 7, Figure 3 Description, Lines 3-5, “New candidate architectures (brown boxes in (b)) are obtained by solving the auxiliary single-level multi-objective problem F̃ = (S_f, C) (line 10 in Algo 1)”; Lu, Page 8, Paragraph 2, Lines 1-6, “we apply the NSGA-II algorithm to simultaneously optimize for both accuracy (predicted) and other objectives of interest to the user (line 10 in Algorithm 1) ... At the conclusion of the NSGA-II search, a set of non-dominated architectures is output, see Fig. 3(b)” A person of ordinary skill in the art would recognize that “NSGA-II” is considered to be “an evolutionary protocol”; this is further supported by the title of the reference).
Claims 13-14 are rejected under 35 U.S.C. 103 as being unpatentable over Lu in view of Letunovskiy in further view of Han Shi, Renjie Pi, Hang Xu, Zhenguo Li, James T. Kwok, and Tong Zhang. 11/21/2019. Multi-objective neural architecture search via predictive network performance optimization. arXiv:1911.09336 (https://arxiv.org/abs/1911.09336), hereinafter referred to as “Shi”.
Regarding claim 13, the rejection of claim 8 is incorporated, and further, the proposed combination teaches generate the first plurality of candidate architecture configurations during a first iteration; generate the second plurality of candidate architecture configurations during a second iteration (Lu, Page 7, Figure 3 Description, Lines 3-5, “New candidate architectures (brown boxes in (b)) are obtained by solving the auxiliary single-level multi-objective problem F̃ = (S_f, C) (line 10 in Algo 1)”; Lu, Page 8, Paragraph 2, Lines 1-6, “we apply the NSGA-II algorithm to simultaneously optimize for both accuracy (predicted) and other objectives of interest to the user (line 10 in Algorithm 1) ... At the conclusion of the NSGA-II search, a set of non-dominated architectures is output, see Fig. 3(b)”; Lu, Page 6, Section 3.2, Paragraph 2, Lines 18-20, “We repeat this process for a pre-specified number of iterations and output the non-dominated solutions from the pool of evaluated architectures”).
The proposed combination does not explicitly teach stop performing iterations based on a hypervolume metric corresponding to generated architecture configurations corresponding to the second iteration.
Shi teaches stop performing iterations based on a hypervolume metric corresponding to generated architecture configurations corresponding to the second iteration (Shi, Page 6, Algorithm 1, Line 3, “|P̃_f ∩ P_f| / |P_f| < threshold”; the condition is considered to be “a hypervolume metric” because it tests how close the current Pareto front is to the optimal Pareto front).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the neural architecture search method of the proposed combination to include a stopping criterion based on a hypervolume metric as taught by Shi. The motivation for doing so would have been to obtain a Pareto front that is closer to the optimal Pareto front, where the optimal Pareto front is the set of all Pareto optimal architectures; thus, stopping based on a hypervolume metric tends to yield results closer to the optimum (Shi, Page 9, Section 4.3, Paragraph 2, Lines 4-8, “Figure 5b shows the Pareto fronts estimated by different algorithms, which demonstrates the superiority of BOGCN method. Compared with other baselines, the models sampled by our method are gathered near the optimal Pareto front, and the found Pareto front is also closer to the optimal Pareto front. This validates the efficiency of our algorithm on this multi-objective search task”).
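For illustration of a hypervolume-based stopping criterion of the kind claim 13 recites, the sketch below computes the two-dimensional hypervolume dominated by each iteration's front relative to a reference point and stops when the improvement falls below a threshold. This is a generic construction under assumed toy data, not Shi's exact condition (which, as quoted above, compares the found Pareto front with the optimal one); the fronts, reference point, and EPS threshold are all invented for the example.

```python
def hypervolume_2d(front, ref):
    # front: (error, cost) points, both minimized; ref must be worse than
    # every counted point in both objectives.
    pts = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    hv, prev_cost = 0.0, ref[1]
    for err, cost in pts:
        if cost >= prev_cost:        # dominated by an earlier point; skip
            continue
        hv += (ref[0] - err) * (prev_cost - cost)
        prev_cost = cost
    return hv

# Hypothetical Pareto fronts from three successive search iterations.
fronts = [
    [(0.20, 400.0), (0.30, 300.0)],
    [(0.15, 380.0), (0.22, 290.0), (0.35, 250.0)],
    [(0.15, 380.0), (0.21, 290.0), (0.35, 250.0)],
]
REF, EPS = (1.0, 500.0), 1.0
prev_hv = 0.0
for it, front in enumerate(fronts):
    hv = hypervolume_2d(front, REF)
    if it > 0 and hv - prev_hv < EPS:  # negligible improvement: stop searching
        print(f"stopping after iteration {it}; hypervolume {hv:.1f}")
        break
    prev_hv = hv
```

Run on these toy fronts, the hypervolume rises from 150.0 to 198.2 and then only to 199.1, so the loop stops at the third iteration once the gain drops below EPS.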
Regarding claim 14, the rejection of claim 8 is incorporated.
The proposed combination does not explicitly teach wherein the first subgroup of architecture configurations includes less than fifty one architecture configurations.
Shi teaches wherein the first subgroup of architecture configurations includes less than fifty one architecture configurations (Shi, Page 7, Section 4.2, Lines 1-3, “For the proposed BOGCN-NAS, we randomly sample 50 architectures to fully train and use them as the initial trained architecture sets, which is counted into the total sample numbers”).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the neural architecture search method of the proposed combination to include the first subgroup having less than fifty one architecture configurations. The motivation for doing so would have been that beginning with fewer samples results in fewer samples being used overall, which results in a method that is less computationally expensive (Shi, Page 7, Section 4.2, Paragraph 3, “Figure 3 and Table 2 show the number of samples until finding the global optimal architecture using different methods. The proposed algorithm outperforms the other baselines consistently on the two different datasets. On NASBench, BOGCN-NAS is 128.4×, 59.6×, 50.5×, 7.8× more sample-efficient than Random Search, Regularized Evolution, MCTS and LaNAS respectively. On the smaller NLP dataset, BOGCN-NAS can still search and find the optimal architecture with fewer samples”).
Claims 6-7 and 20-21 are rejected under 35 U.S.C. 103 as being unpatentable over Lu in view of Letunovskiy, in further view of Loh, and in further view of Shi.
Regarding claim 6, the rejection of claim 1 is incorporated, and further, the proposed combination teaches wherein one or more of the at least one processor circuit (Loh, Paragraph 0043, Lines 1-4, “Processor 120 includes one or more general-purpose or application-specific microprocessors that executes instructions to perform control, computation, input/output, etc. functions for system 100”) is to: generate the first plurality of candidate architecture configurations during a first iteration; generate the second plurality of candidate architecture configurations during a second iteration (Lu, Page 7, Figure 3 Description, Lines 3-5, “New candidate architectures (brown boxes in (b)) are obtained by solving the auxiliary single-level multi-objective problem F̃ = (S_f, C) (line 10 in Algo 1)”; Lu, Page 8, Paragraph 2, Lines 1-6, “we apply the NSGA-II algorithm to simultaneously optimize for both accuracy (predicted) and other objectives of interest to the user (line 10 in Algorithm 1) ... At the conclusion of the NSGA-II search, a set of non-dominated architectures is output, see Fig. 3(b)”; Lu, Page 6, Section 3.2, Paragraph 2, Lines 18-20, “We repeat this process for a pre-specified number of iterations and output the non-dominated solutions from the pool of evaluated architectures”).
The proposed combination thus far does not explicitly teach stop performing iterations based on a hypervolume metric corresponding to generated architecture configurations corresponding to the second iteration.
Shi teaches stop performing iterations based on a hypervolume metric corresponding to generated architecture configurations corresponding to the second iteration (Shi, Page 6, Algorithm 1, Line 3, “|P̃_f ∩ P_f| / |P_f| < threshold”; the condition is considered to be “a hypervolume metric” because it tests how close the current Pareto front is to the optimal Pareto front).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the neural architecture search method of the proposed combination to include a stopping criterion based on a hypervolume metric as taught by Shi. The motivation for doing so would have been to obtain a Pareto front that is closer to the optimal Pareto front, where the optimal Pareto front is the set of all Pareto optimal architectures; thus, stopping based on a hypervolume metric tends to yield results closer to the optimum (Shi, Page 9, Section 4.3, Paragraph 2, Lines 4-8, “Figure 5b shows the Pareto fronts estimated by different algorithms, which demonstrates the superiority of BOGCN method. Compared with other baselines, the models sampled by our method are gathered near the optimal Pareto front, and the found Pareto front is also closer to the optimal Pareto front. This validates the efficiency of our algorithm on this multi-objective search task”).
Regarding claim 7, the rejection of claim 1 is incorporated.
The proposed combination does not explicitly teach wherein the first subgroup of architecture configurations includes less than fifty one architecture configurations.
Shi teaches wherein the first subgroup of architecture configurations includes less than fifty one architecture configurations (Shi, Page 7, Section 4.2, Lines 1-3, “For the proposed BOGCN-NAS, we randomly sample 50 architectures to fully train and use them as the initial trained architecture sets, which is counted into the total sample numbers”).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the neural architecture search method of the proposed combination to include the first subgroup having less than fifty one architecture configurations. The motivation for doing so would have been that beginning with fewer samples results in fewer samples being used overall, which results in a method that is less computationally expensive (Shi, Page 7, Section 4.2, Paragraph 3, “Figure 3 and Table 2 show the number of samples until finding the global optimal architecture using different methods. The proposed algorithm outperforms the other baselines consistently on the two different datasets. On NASBench, BOGCN-NAS is 128.4×, 59.6×, 50.5×, 7.8× more sample-efficient than Random Search, Regularized Evolution, MCTS and LaNAS respectively. On the smaller NLP dataset, BOGCN-NAS can still search and find the optimal architecture with fewer samples”).
Regarding claim 20, the rejection of claim 15 is incorporated, and further, the proposed combination teaches wherein the predictor training circuitry (Loh, Paragraph 0043, Lines 1-4, “Processor 120 includes one or more general-purpose or application-specific microprocessors that executes instructions to perform control, computation, input/output, etc. functions for system 100”; In light of applicant’s specification, “the predictor training circuitry” is part of the “processor circuitry”) is to: generate the first plurality of candidate architecture configurations during a first iteration; generate the second plurality of candidate architecture configurations during a second iteration (Lu, Page 7, Figure 3 Description, Lines 3-5, “New candidate architectures (brown boxes in (b)) are obtained by solving the auxiliary single-level multi-objective problem F̃ = (S_f, C) (line 10 in Algo 1)”; Lu, Page 8, Paragraph 2, Lines 1-6, “we apply the NSGA-II algorithm to simultaneously optimize for both accuracy (predicted) and other objectives of interest to the user (line 10 in Algorithm 1) ... At the conclusion of the NSGA-II search, a set of non-dominated architectures is output, see Fig. 3(b)”; Lu, Page 6, Section 3.2, Paragraph 2, Lines 18-20, “We repeat this process for a pre-specified number of iterations and output the non-dominated solutions from the pool of evaluated architectures”).
The proposed combination thus far does not explicitly teach stop performing iterations based on a hypervolume metric corresponding to generated architecture configurations corresponding to the second iteration.
Shi teaches stop performing iterations based on a hypervolume metric corresponding to generated architecture configurations corresponding to the second iteration (Shi, Page 6, Algorithm 1, Line 3, “|P̃_f ∩ P_f| / |P_f| < threshold”; the condition is considered to be “a hypervolume metric” because it tests how close the current Pareto front is to the optimal Pareto front).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the neural architecture search method of the proposed combination to include a stopping criterion based on a hypervolume metric as taught by Shi. The motivation for doing so would have been to obtain a Pareto front that is closer to the optimal Pareto front, where the optimal Pareto front is the set of all Pareto optimal architectures; thus, stopping based on a hypervolume metric tends to yield results closer to the optimum (Shi, Page 9, Section 4.3, Paragraph 2, Lines 4-8, “Figure 5b shows the Pareto fronts estimated by different algorithms, which demonstrates the superiority of BOGCN method. Compared with other baselines, the models sampled by our method are gathered near the optimal Pareto front, and the found Pareto front is also closer to the optimal Pareto front. This validates the efficiency of our algorithm on this multi-objective search task”).
Regarding claim 21, the rejection of claim 15 is incorporated.
The proposed combination does not explicitly teach wherein the first subgroup of architecture configurations includes less than fifty one architecture configurations.
Shi teaches wherein the first subgroup of architecture configurations includes less than fifty one architecture configurations (Shi, Page 7, Section 4.2, Lines 1-3, “For the proposed BOGCN-NAS, we randomly sample 50 architectures to fully train and use them as the initial trained architecture sets, which is counted into the total sample numbers”).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the neural architecture search method of the proposed combination to include the first subgroup having less than fifty one architecture configurations. The motivation for doing so would have been that beginning with fewer samples results in fewer samples being used overall, which results in a method that is less computationally expensive (Shi, Page 7, Section 4.2, Paragraph 3, “Figure 3 and Table 2 show the number of samples until finding the global optimal architecture using different methods. The proposed algorithm outperforms the other baselines consistently on the two different datasets. On NASBench, BOGCN-NAS is 128.4×, 59.6×, 50.5×, 7.8× more sample-efficient than Random Search, Regularized Evolution, MCTS and LaNAS respectively. On the smaller NLP dataset, BOGCN-NAS can still search and find the optimal architecture with fewer samples”).
Response to Arguments
Applicant’s amendments to claims 9-14 with respect to objections to the claims have been fully considered, and overcome the objections set forth in the nonfinal office action dated 09/19/2025. Consequently, the objections to the claims have been withdrawn.
Applicant’s amendments to claim 8 with respect to 35 U.S.C. 112(b) indefiniteness rejections to claim 12 have been fully considered, and overcome the rejections set forth in the nonfinal office action dated 09/19/2025. Consequently, the 35 U.S.C. 112(b) indefiniteness rejections to claim 12 have been withdrawn.
Applicant’s arguments regarding the 35 U.S.C. 101 rejections of the claims have been fully considered but are unpersuasive.
Applicant first argues, on page 9, final paragraph – page 11, paragraph 2 of the response, that claim 1 is not directed to an abstract idea and points to Desjardins as evidence. Examiner respectfully disagrees. The fact patterns of the instant case are not identical to those present in Desjardins; thus, the logic applied in that case cannot be directly applied to the instant case.
Applicant next argues, on page 11, paragraph 3 – page 13, paragraph 1 of the response, that claim 1 is not directed to an abstract idea. Examiner respectfully disagrees. Applicant specifically argues the elements of the claims cannot be practically performed in the mind and points to “performing training of the predictor model”. This argument is not persuasive as the elements “train a predictor model…” and “re-train the predictor model…” were not identified as abstract ideas in the analysis of claim 1, but rather as additional elements that amount to adding the words “apply it” (or an equivalent) with the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer in its ordinary capacity as a tool to perform an existing process. See the updated 35 U.S.C. 101 rejections above. Further, applicant argues claim 1 “requires the accessing of computer memory” and therefore does not recite a mental process. Examiner respectfully disagrees. It is important to note that a claim that requires a computer may still recite a mental process, see MPEP 2106.04(a)(2)(III)(C). Finally, applicant argues the claim does not recite an abstract idea because claim 1 is “grounded in the practical application of ‘reduc[ing] the number of computationally heavy validations needed to perform a conventional NAS protocol’”. Examiner respectfully disagrees. It is important to note that claiming the improved speed or efficiency inherent with applying the abstract idea on a computer does not integrate a judicial exception into a practical application or provide an inventive concept, see MPEP 2106.05(f).
Applicant next argues, on page 13, paragraph 2 – page 14, paragraph 2 of the response, that claim 1 amounts to significantly more than an abstract idea, specifically arguing that claim 1 results in a “technical improvement in the field of artificial intelligence-based machine learning model identification using neural architecture searches” and pointing to applicant’s method being “more efficient”, “faster”, “more accurate”, and “requir[ing] less computation resources”. Examiner respectfully disagrees. It is important to note that claiming the improved speed or efficiency inherent with applying the abstract idea on a computer does not integrate a judicial exception into a practical application or provide an inventive concept, see MPEP 2106.05(f). Further, it is important to keep in mind that an improvement in the abstract idea itself (e.g. a recited fundamental economic concept) is not an improvement in technology, see MPEP 2106.05(a).
Applicant's arguments regarding the remainder of the claims rely upon the arguments asserted with respect to the independent claims, and are thus unpersuasive.
Applicant’s arguments regarding the 35 U.S.C. 102 and 35 U.S.C. 103 rejections of the claims have been fully considered but are unpersuasive.
Applicant argues the combination of Lu and Loh does not teach or suggest “re-train the predictor model based on the first subgroup and the ones of the second subgroup”. Examiner respectfully disagrees. Lu does teach “re-train[ing]” the predictor model because each iteration of the algorithm constructs (or reconstructs) the predictor, and thus the model is re-trained. With regard to arguments geared toward the validation elements of amended claim 1, the arguments are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record.
Applicant's arguments regarding the remainder of the claims rely upon the arguments asserted with respect to the independent claims, and are thus unpersuasive.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOLLY CLARKE SIPPEL whose telephone number is (571)272-3270. The examiner can normally be reached Monday - Friday, 7:30 a.m. - 4:30 p.m. ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at (571)272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/M.C.S./ Examiner, Art Unit 2122
/KAKALI CHAKI/ Supervisory Patent Examiner, Art Unit 2122