Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Examiner’s Note
The Examiner encourages Applicant to schedule an interview to discuss the issues related to, for example, the rejections noted below under 35 U.S.C. §§ 112, 101, and 103, to move toward allowance.
Applicant is strongly requested to provide supporting paragraph(s) for each limitation of amended/new claim(s) in the Remarks, to enable clear and definite claim interpretation by the Examiner.
Priority
Acknowledgment is made of the filing of the present application on 07/14/2023.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.
Such claim limitation(s) is/are:
Claim 2: “the first node is configured for providing a preset search space and dataset”
Claim 2: “the second node is configured for generating a variational autoencoder and dividing the search space”
Claim 7: “a search sub-node for determining candidate neural architectures for work sub-node”
Claim 7: “a work sub-node for determining the performance of the candidate neural architectures, training a performance predictor, and updating the embedding position”
Claim 7: “a validation sub-node for performing a verification process based on a Merkel tree in a non-trusted environment to verify the data information broadcasted by other third nodes”
Claim 8: “the work sub-node is further configured to test the performance of the candidate neural architectures through the dataset or obtain the performance of the candidate neural architectures by collecting data information from other third nodes”
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 2 and 7-8 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Claim limitations (Claim 2: “the first node is configured for providing a preset search space and dataset”; Claim 2: “the second node is configured for generating a variational autoencoder and dividing the search space”; Claim 7: “a search sub-node for determining candidate neural architectures for work sub-node”; Claim 7: “a work sub-node for determining the performance of the candidate neural architectures, training a performance predictor, and updating the embedding position”; Claim 7: “a validation sub-node for performing a verification process based on a Merkel tree in a non-trusted environment to verify the data information broadcasted by other third nodes”; Claim 8: “the work sub-node is further configured to test the performance of the candidate neural architectures through the dataset or obtain the performance of the candidate neural architectures by collecting data information from other third nodes”) invoke 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. The specification is devoid of adequate structure to perform the claimed functions. The specification does not provide sufficient details such that one of ordinary skill in the art would understand which structure(s) perform(s) the claimed functions, as recited in the claims previously identified under 112(f). Therefore, the claims are indefinite and are rejected under 35 U.S.C. 112(b) or pre-AIA 35 U.S.C. 112, second paragraph.
Applicant may:
(a) Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph;
(b) Amend the written description of the specification such that it expressly recites what structure, material, or acts perform the entire claimed function, without introducing any new matter (35 U.S.C. 132(a)); or
(c) Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. 132(a)).
If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either:
(a) Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or
(b) Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function. For more information, see 37 CFR 1.75(d) and MPEP §§ 608.01(o) and 2181.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
Claims 2 and 7-8 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. As described above, the disclosure does not provide adequate structure to perform the claimed functions. The specification does not demonstrate that the applicant has made an invention that achieves the claimed functions, because the invention is not described with sufficient detail such that one of ordinary skill in the art can reasonably conclude that the inventor had possession of the claimed invention. (FP 7.31.01)
Claim Objections
Claim(s) 1-8 is/are objected to because of the following informalities.
Claim(s) 1 is/are objected to because of the following informalities:
it appears that “tunning” (line 6) needs to read “tuning” or something else. Appropriate correction is required.
it appears that “is is” (line 7) needs to read “is” or something else. Appropriate correction is required.
it appears that “broadcast” (line 8) needs to read “broadcasts” or something else. Appropriate correction is required.
it appears that “allow” (line 10) needs to read “allowed” or something else. Appropriate correction is required.
it appears that “broadcast” (line 15) needs to read “broadcasting” or something else. Appropriate correction is required.
it appears that “embedding” (line 18) needs to read “embed” or something else. Appropriate correction is required.
Claim(s) 6 is/are objected to because of the following informalities: it appears that “a performance predictor” (line 1) needs to read “the performance predictor” or something else. Appropriate correction is required. This will help avoid a rejection of “the performance predictor” (line 4) under 35 U.S.C. 112(b).
Claim(s) 6 is/are objected to because of the following informalities: “wherein k is a positive integer” may need to be added. Appropriate correction is required.
Claim(s) 1, 6 each contain(s) informalities as set forth above, and their dependent claims are objected to at least based on their direct and/or indirect dependency from the claims listed above. Appropriate explanation and/or amendment is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claim(s) 1-10 is/are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Claim(s) 1 recite(s) the limitation “the system” (line 2). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “the apparatus” to indicate “apparatus” (line 1), or something else. For the purposes of examination, “the apparatus” is used.
Claim(s) 1 recite(s) the limitation “the fitness” (line 3). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “a fitness”, or something else. For the purposes of examination, “a fitness” is used.
Claim(s) 1 recite(s) the limitation “the dataset” (line 4). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “a dataset”, or something else. For the purposes of examination, “a dataset” is used.
Claim(s) 1 recite(s) the limitation “the processes” (line 5). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “processes”, or something else. For the purposes of examination, “processes” is used.
Claim(s) 1 recite(s) the limitation “the specification” (line 8). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “a specification”, or something else. For the purposes of examination, “a specification” is used.
Claim(s) 1 recite(s) the limitation “the problem” (line 8). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “a problem”, or something else. For the purposes of examination, “a problem” is used.
Claim(s) 1 recite(s) the limitation “the types” (line 10). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “types”, or something else. For the purposes of examination, “types” is used.
Claim(s) 1 recite(s) the limitation “the evaluation method” (line 11). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “an evaluation method”, or something else. For the purposes of examination, “an evaluation method” is used.
Claim(s) 1 recite(s) the limitation “the termination criteria of the parameter tunning” (line 11). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “termination criteria of a parameter tunning”, or something else. For the purposes of examination, “termination criteria of a parameter tunning” is used.
Claim(s) 1 recite(s) the limitation “the target dataset” (line 12). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “a target dataset”, or something else. For the purposes of examination, “a target dataset” is used.
Claim(s) 1 recite(s) the limitation “the training dataset” (line 13). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “a training dataset”, or something else. For the purposes of examination, “a training dataset” is used.
Claim(s) 1 recite(s) the limitation “the test dataset” (line 14). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “a test dataset”, or something else. For the purposes of examination, “a test dataset” is used.
The term “similar” (claim 1, line 19) is a relative term which renders the claim indefinite. The term “similar” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
The term “adjacent” (claim 1, line 19) is a relative term which renders the claim indefinite. The term “adjacent” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
Claim(s) 1 recite(s) the limitation “the third node” (line 21). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “a third node”, or something else. For the purposes of examination, “a third node” is used.
Claim(s) 1 recite(s) the limitation “the embedding position” (line 22). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “an embedding position”, or something else. For the purposes of examination, “an embedding position” is used.
Claim(s) 1 recite(s) the limitation “the gradient direction” (line 22). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “a gradient direction”, or something else. For the purposes of examination, “a gradient direction” is used.
Claim(s) 1 recite(s) the limitation “the performance trainer” (line 23). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “a performance trainer”, or something else. For the purposes of examination, “a performance trainer” is used.
Claim(s) 1 recite(s) the limitation “the performance” (line 25). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “a performance”, or something else. For the purposes of examination, “a performance” is used.
Claim(s) 2 recite(s) the limitation “the search space” (line 4). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to since it is not clear if it indicates “a preset search space” (line 3) or something else. It appears it may need to read “a search space”, or something else. For the purposes of examination, “a search space” is used. In addition, claim(s) 2 (line 8) is/are rejected for the same reason.
Claim(s) 2 recite(s) the limitation “the distribution” (line 9). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “a distribution”, or something else. For the purposes of examination, “a distribution” is used.
Claim(s) 2 recite(s) the limitation “the distribution” (line 10). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “a distribution”, or something else. For the purposes of examination, “a distribution” is used.
Claim(s) 3 recite(s) the limitation “the initial embedding position” (line 3). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “an initial embedding position”, or something else. For the purposes of examination, “an initial embedding position” is used.
Claim(s) 3 recite(s) the limitation “the updated result” (line 6). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “an updated result”, or something else. For the purposes of examination, “an updated result” is used.
Claim(s) 4 recite(s) the limitation “the embedding position of the initial neural architecture” (line 2). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “an embedding position of an initial neural architecture”, or something else. For the purposes of examination, “an embedding position of an initial neural architecture” is used.
Claim(s) 4 recite(s) the limitation “the neural architecture assigned by the first node” (line 3). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “a neural architecture assigned by the first node”, or something else. For the purposes of examination, “a neural architecture assigned by the first node” is used.
Claim(s) 4 recite(s) the limitation “the identifier as the random seed and the setting of the search space” (line 5). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “an identifier as a random seed and a setting of a search space”, or something else. For the purposes of examination, “an identifier as a random seed and a setting of a search space” is used.
Claim(s) 5 recite(s) the limitation “the embedding position according to the top N preferred neural architectures” (line 2). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “an embedding position according to top N preferred neural architectures”, or something else. For the purposes of examination, “an embedding position according to top N preferred neural architectures” is used.
Claim(s) 5 recite(s) the limitation “the preferred neural architectures are sorted according to the performance evaluation result” (line 3). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “preferred neural architectures are sorted according to a performance evaluation result”, or something else. For the purposes of examination, “preferred neural architectures are sorted according to a performance evaluation result” is used.
Claim(s) 5 recite(s) the limitation “better” (line 4). However, it is not clear what it is better than. It appears it may need to read “better performance in the sorting” or something else. For the purposes of examination, “better performance in the sorting” is used.
Claim(s) 6 recite(s) the limitation “the performance predictor” (line 4). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to since it may indicate “a performance predictor” (claim 1) or “a performance predictor” (claim 6, line 1), or something else. It appears it may need to read “a performance predictor”, or something else. For the purposes of examination, “a performance predictor” is used.
The term “peripheral” (claim 6, line 6) is a relative term which renders the claim indefinite. The term “peripheral” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
Claim(s) 7 recite(s) the limitation “the performance” (line 3). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to since it may indicate “performance” (claim 1, line 21) or “performance” (claim 1, line 25), or something else. It appears it may need to read “a performance”, or something else. For the purposes of examination, “a performance” is used.
Claim(s) 7 recite(s) the limitation “the candidate neural architectures” (line 3). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to since it may indicate “multiple candidate neural architectures” (claim 1, line 21) or “candidate neural architectures” (claim 1, line 25), or something else. It appears it may need to read “candidate neural architectures”, or something else. For the purposes of examination, “candidate neural architectures” is used.
Claim(s) 7 recite(s) the limitation “the data information broadcasted by other third nodes” (line 6). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “data information broadcasted by other third nodes”, or something else. For the purposes of examination, “data information broadcasted by other third nodes” is used.
Claim(s) 9 recite(s) the limitation “the data information of the initial neural architecture” (line 6). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “data information of the initial neural architecture”, or something else. For the purposes of examination, “data information of the initial neural architecture” is used.
Claim(s) 9 recite(s) the limitation “the embedding positions in the direction of the gradient provided by the performance predictor” (line 9). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “embedding positions in a direction of a gradient provided by the performance predictor”, or something else. For the purposes of examination, “embedding positions in a direction of a gradient provided by the performance predictor” is used.
Claim(s) 10 recite(s) the limitation “the embedding position of the top N preferred neural architectures” (line 3). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “an embedding position of top N preferred neural architectures”, or something else. For the purposes of examination, “an embedding position of top N preferred neural architectures” is used.
Claim(s) 10 recite(s) the limitation “the preferred neural architectures” (line 5). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “preferred neural architectures”, or something else. For the purposes of examination, “preferred neural architectures” is used.
Claim(s) 10 recite(s) the limitation “those with better performance” (line 6). There is insufficient antecedent basis for this limitation in the claim. It is not clear what it is referring to. It appears it may need to read “neural architectures with better performance”, or something else. For the purposes of examination, “neural architectures with better performance” is used.
Claim(s) 10 recite(s) the limitation “better” (line 6). However, it is not clear what it is better than. It appears it may need to read “better performance in the sorting” or something else. For the purposes of examination, “better performance in the sorting” is used.
Claim(s) 1-7, 9-10 each recite(s) limitations that raise issues of indefiniteness as set forth above, and their dependent claims are rejected at least based on their direct and/or indirect dependency from the claims listed above. Appropriate explanation and/or amendment is required.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1 and 3-6 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claims do not fall within at least one of the four categories of patent eligible subject matter because the claimed invention describes a computer program without further limiting the computer program with any structural limitations in its body. Even though claim 1 recites “An apparatus for collaborative neural architecture search, wherein the system consists of a collective set of independent computation nodes”, the apparatus may be interpreted as software. Thus, under the broadest reasonable interpretation, the system is a collection of instructions (software per se) and does not fall within at least one of the four statutory categories. A computer program is merely a set of instructions capable of being executed by a computer; the computer program itself is not a process, and without a computer-readable medium, the computer program’s functionality is considered nonstatutory functional descriptive material. See MPEP § 2106(I) and MPEP § 2111.05. For clarification, the claim may be amended (e.g., “An apparatus for collaborative neural architecture search, comprising a hardware processor and a non-transitory memory”) to ensure that the claim falls within one of the four statutory categories.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-10 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Regarding claim 1
The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: Due to the statutory issue regarding the “apparatus”, the claim does not pass Step 1 of the test for patent eligibility because it is not directed to one of the four categories of patent eligible subject matter. However, if the claim is amended to fall under one of the four categories of statutory subject matter, it would further be rejected based on claim limitations that are directed to a judicially recognized exception of an abstract idea, as follows.
Step 2A Prong 1:
The limitations of
“… for collaborative neural architecture search,
…, …;
…
… and …;
…;
…,
wherein … embedding a neural architecture to a vector space so that neural architectures with similar structures have adjacent embedding positions;
… determine a performance of multiple candidate neural architectures, …, and update the embedding position based on the gradient direction …;
…, and …”, as drafted, cover performance of the limitations in the mind under their broadest reasonable interpretation. That is, nothing in the claim elements precludes the steps from practically being performed in the mind. For example, the limitations in the context of this claim encompass a user mentally performing the steps with a physical aid (e.g., pencil and paper).
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: This judicial exception is not integrated into a practical application.
In particular, the claim recites additional elements (“wherein the system consists of a collective set of independent computation nodes”, “the processes comprises: an Initialization process, search space division process, parameter tunning process, validation process and search process”, “wherein the initialization process is is a process to maintain the supply of data”, “wherein the specification of the problem comprises: the types of neural architectures allow to be generated in the search process, the evaluation method, the termination criteria of the parameter tunning, the target dataset, the training dataset, and the test dataset”, “wherein the search space division process is a process that is responsible for broadcast a fully trained VAE (variational autoencoder) or a mathematical model that takes in a random vector and outputs a graph”, “the data information comprises the performance of candidate neural architectures”). This is a recitation of a particular type or source of model/data to be used in performing the abstract idea. Limiting the abstract idea to a particular type or source of model/data is an attempt to limit the abstract idea to a particular field of use or technological environment, which does not integrate the abstract idea into a practical application. See MPEP 2106.05(h).
In particular, the claim recites additional elements (“wherein each of the nodes share a ledger that records the fitness of a set of mathematical models with respect to the dataset”, “in the amid of the initialization process, each node broadcast the specification of the problem”, “wherein data information is shared among different third nodes”) – the act of transmitting data. The claim adds insignificant extra-solution activity to the judicial exception – see MPEP 2106.05(g). The act of transmitting data is recited at a high level of generality (i.e., as a generic act of transmitting data) such that it amounts to no more than a mere act to apply the exception using a generic act of transmitting. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim recites additional elements that are mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). In particular, the claim recites additional elements (“each of the nodes run the processes”, “the variational autoencoder is used to”, “at least two mutually collaborative third nodes; wherein the third node is used to”, “provided by the performance trainer”) – using a device and/or a model to process data. The device and the model in each step are recited at a high level of generality (i.e., as a generic computer performing a generic computer function of processing data) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
In particular, the claim recites an additional element (“train a performance predictor”). This additional element is recited at such a high level, without any details as to how the model is trained, that it amounts to only the idea of a solution or outcome; it fails to recite details of how a solution to the problem is accomplished and, therefore, represents no more than mere instructions to apply the judicial exception on a computer (see MPEP 2106.05(f)). Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
This is a recitation of a particular type or source of model/data to be used in performing the abstract idea. Limiting the abstract idea to a particular type or source of model/data is an attempt to limit the abstract idea to a particular field of use or technological environment, which does not amount to significantly more than the abstract idea. See MPEP 2106.05(h).
As discussed above, the claim recites additional elements of transmitting data at a high level of generality, which adds insignificant extra-solution activity to the judicial exception – see MPEP 2106.05(g). However, the addition of insignificant extra-solution activity does not amount to an inventive concept, particularly when the activity is well-understood, routine, and conventional. See MPEP 2106.05(d)(II) – “Receiving or transmitting data over a network” or “Storing and retrieving information in memory”. Accordingly, these additional elements do not provide an inventive concept or amount to significantly more than the abstract idea. Thus, the claim is not patent eligible.
As discussed above, with respect to integration of the abstract idea into a practical application, the additional elements of using a generic computer component to perform each step amount to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. See MPEP 2106.05(f). The claim is not patent eligible.
The additional elements regarding training are recited at such a high level, without any details as to how the model is trained, that they amount to only the idea of a solution or outcome; they fail to recite details of how a solution to the problem is accomplished and, therefore, represent no more than mere instructions to apply the judicial exception on a computer (see MPEP 2106.05(f)). Accordingly, these additional elements do not amount to significantly more than the abstract idea. The claim is not patent eligible.
Regarding claim 2
The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: The claim recites a system; therefore, it falls into the statutory category of a machine.
Step 2A Prong 1:
The limitations of
“…;
wherein …, and … and dividing the search space; …;
…; and
the distribution of the neural architectures in the vector space is selected from a Gaussian distribution with a standard deviation of 1 or the distribution specified by the first node”, as drafted, are processes that, under their broadest reasonable interpretation, cover performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, the limitations in the context of this claim encompass the user mentally thinking with a physical aid (e.g., pencil and paper).
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: This judicial exception is not integrated into a practical application.
In particular, the claim recites additional elements (“wherein the initialization process is namely a first node and the search space division process is namely a second node”, “wherein the encoder training dataset consists of several randomly generated neural architectures that meet the search space”). This is a recitation of a particular type or source of model/data to be used in performing the abstract idea. Limiting the abstract idea to a particular type or source of model/data is an attempt to limit the abstract idea to a particular field of use or technological environment, which does not integrate the abstract idea into a practical application. See MPEP 2106.05(h).
The claim recites additional elements that are mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). In particular, the claim recites additional elements (“the first node is configured for”, “the second node is configured for”) – using a device and/or a model to process data. The device and the model in each step are recited at a high level of generality (i.e., as a generic computer performing a generic computer function of processing data) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
In particular, the claim recites additional elements (“providing a preset search space and dataset”, “the variational autoencoder is … broadcasted to the third nodes”) – the act of transmitting data. The claim adds insignificant extra-solution activity to the judicial exception – see MPEP 2106.05(g). The act of transmitting data is recited at a high level of generality (i.e., as a generic act of transmitting data) such that it amounts to no more than a mere act to apply the exception using a generic act of transmitting. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
In particular, the claim recites an additional element (“generating a variational autoencoder …; wherein the variational autoencoder is trained on an encoder training dataset”). This additional element is recited at such a high level, without any details as to how the model is trained, that it amounts to only the idea of a solution or outcome; it fails to recite details of how a solution to the problem is accomplished and, therefore, represents no more than mere instructions to apply the judicial exception on a computer (see MPEP 2106.05(f)). Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
This is a recitation of a particular type or source of model/data to be used in performing the abstract idea. Limiting the abstract idea to a particular type or source of model/data is an attempt to limit the abstract idea to a particular field of use or technological environment, which does not amount to significantly more than the abstract idea. See MPEP 2106.05(h).
As discussed above, with respect to integration of the abstract idea into a practical application, the additional elements of using a generic computer component to perform each step amount to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. See MPEP 2106.05(f). The claim is not patent eligible.
As discussed above, the claim recites additional elements of transmitting data at a high level of generality, which adds insignificant extra-solution activity to the judicial exception – see MPEP 2106.05(g). However, the addition of insignificant extra-solution activity does not amount to an inventive concept, particularly when the activity is well-understood, routine, and conventional. See MPEP 2106.05(d)(II) – “Receiving or transmitting data over a network” or “Storing and retrieving information in memory”. Accordingly, these additional elements do not provide an inventive concept or amount to significantly more than the abstract idea. Thus, the claim is not patent eligible.
The additional elements regarding training are recited at such a high level, without any details as to how the model is trained, that they amount to only the idea of a solution or outcome; they fail to recite details of how a solution to the problem is accomplished and, therefore, represent no more than mere instructions to apply the judicial exception on a computer (see MPEP 2106.05(f)). Accordingly, these additional elements do not amount to significantly more than the abstract idea. The claim is not patent eligible.
Regarding claim 3
The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: Due to the statutory issue regarding the “apparatus”, the claim does not pass Step 1 of the test for patent eligibility as not being one of the four categories of patent eligible subject matter. However, if the claim is amended to fall under one of the four categories of statutory subject matter, it would further be rejected based on claim limitations that are directed to a judicially recognized exception of an abstract idea, as follows.
Step 2A Prong 1:
The limitations of
“determining the initial embedding position;
…;
…”, as drafted, are processes that, under their broadest reasonable interpretation, cover performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, the limitations in the context of this claim encompass the user mentally thinking with a physical aid (e.g., pencil and paper).
The limitations of
“calculating the gradient direction given … through backpropagation algorithm, and
obtaining the updated result of the initial embedding position through gradient ascent”, as drafted, are processes that, under their broadest reasonable interpretation, cover performance of the limitation based on mathematical relationships and/or mathematical formulas or equations and/or mathematical calculations. That is, nothing in the claim element precludes the step from practically being performed based on mathematical relationships and/or mathematical formulas or equations and/or mathematical calculations.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation based on mathematical relationships and/or mathematical formulas or equations and/or mathematical calculations, but for the recitation of generic computer components, then it falls within the “Mathematical concepts” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: This judicial exception is not integrated into a practical application.
In particular, the claim recites an additional element (“wherein the updated result is an optimized embedding position that performs better than the initial embedding position”). This is a recitation of a particular type or source of model/data to be used in performing the abstract idea. Limiting the abstract idea to a particular type or source of model/data is an attempt to limit the abstract idea to a particular field of use or technological environment, which does not integrate the abstract idea into a practical application. See MPEP 2106.05(h).
The claim recites additional elements that are mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). In particular, the claim recites an additional element (“by the performance predictor”) – using a device and/or a model to process data. The device and the model in each step are recited at a high level of generality (i.e., as a generic computer performing a generic computer function of processing data) such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
This is a recitation of a particular type or source of model/data to be used in performing the abstract idea. Limiting the abstract idea to a particular type or source of model/data is an attempt to limit the abstract idea to a particular field of use or technological environment, which does not amount to significantly more than the abstract idea. See MPEP 2106.05(h).
As discussed above, with respect to integration of the abstract idea into a practical application, the additional elements of using a generic computer component to perform each step amount to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. See MPEP 2106.05(f). The claim is not patent eligible.
Regarding claim 4
The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: Due to the statutory issue regarding the “apparatus”, the claim does not pass Step 1 of the test for patent eligibility as not being one of the four categories of patent eligible subject matter. However, if the claim is amended to fall under one of the four categories of statutory subject matter, it would further be rejected based on claim limitations that are directed to a judicially recognized exception of an abstract idea, as follows.
Step 2A Prong 1:
The limitations of
“…, which is selected from the neural architecture assigned by the first node or a randomly generated neural architecture;
wherein the randomly generated neural architecture is generated randomly based on the identifier as the random seed and the setting of the search space; and
the identifier is assigned … or generated automatically …”, as drafted, are processes that, under their broadest reasonable interpretation, cover performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, the limitations in the context of this claim encompass the user mentally thinking with a physical aid (e.g., pencil and paper).
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: This judicial exception is not integrated into a practical application.
In particular, the claim recites an additional element (“wherein the initial embedding position is the embedding position of the initial neural architecture in the vector space”). This is a recitation of a particular type or source of model/data to be used in performing the abstract idea. Limiting the abstract idea to a particular type or source of model/data is an attempt to limit the abstract idea to a particular field of use or technological environment, which does not integrate the abstract idea into a practical application. See MPEP 2106.05(h).
The claim recites additional elements that are mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). In particular, the claim recites additional elements (“by the second node”, “by the third node”) – using a device and/or a model to process data. The device and the model in each step are recited at a high level of generality (i.e., as a generic computer performing a generic computer function of processing data) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
This is a recitation of a particular type or source of model/data to be used in performing the abstract idea. Limiting the abstract idea to a particular type or source of model/data is an attempt to limit the abstract idea to a particular field of use or technological environment, which does not amount to significantly more than the abstract idea. See MPEP 2106.05(h).
As discussed above, with respect to integration of the abstract idea into a practical application, the additional elements of using a generic computer component to perform each step amount to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. See MPEP 2106.05(f). The claim is not patent eligible.
Regarding claim 5
The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: Due to the statutory issue regarding the “apparatus”, the claim does not pass Step 1 of the test for patent eligibility as not being one of the four categories of patent eligible subject matter. However, if the claim is amended to fall under one of the four categories of statutory subject matter, it would further be rejected based on claim limitations that are directed to a judicially recognized exception of an abstract idea, as follows.
Step 2A Prong 1:
The limitations of
“…;
…, and the preferred neural architectures are sorted according to the performance evaluation results, and …”, as drafted, are processes that, under their broadest reasonable interpretation, cover performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, the limitations in the context of this claim encompass the user mentally thinking with a physical aid (e.g., pencil and paper).
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: This judicial exception is not integrated into a practical application.
In particular, the claim recites additional elements (“wherein the initial embedding position is the embedding position according to the top N preferred neural architectures in the vector space; wherein N is a positive integer”, “neural architectures with better performance have lower numbers”). This is a recitation of a particular type or source of model/data to be used in performing the abstract idea. Limiting the abstract idea to a particular type or source of model/data is an attempt to limit the abstract idea to a particular field of use or technological environment, which does not integrate the abstract idea into a practical application. See MPEP 2106.05(h).
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
This is a recitation of a particular type or source of model/data to be used in performing the abstract idea. Limiting the abstract idea to a particular type or source of model/data is an attempt to limit the abstract idea to a particular field of use or technological environment, which does not amount to significantly more than the abstract idea. See MPEP 2106.05(h).
Regarding claim 6
The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: Due to the statutory issue regarding the “apparatus”, the claim does not pass Step 1 of the test for patent eligibility as not being one of the four categories of patent eligible subject matter. However, if the claim is amended to fall under one of the four categories of statutory subject matter, it would further be rejected based on claim limitations that are directed to a judicially recognized exception of an abstract idea, as follows.
Step 2A Prong 1:
The limitations of
“;
;
…, and the candidate neural architecture is sampled from a peripheral area of the initial embedding position”, as drafted, are processes that, under their broadest reasonable interpretation, cover performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, the limitations in the context of this claim encompass the user mentally thinking with a physical aid (e.g., pencil and paper).
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: This judicial exception is not integrated into a practical application.
In particular, the claim recites an additional element (“obtaining k sets of training data to form a predictor training dataset”) – the act of receiving data. The claim adds insignificant extra-solution activity to the judicial exception – see MPEP 2106.05(g). The act of receiving data is recited at a high level of generality (i.e., as a generic act of receiving data) such that it amounts to no more than a mere act to apply the exception using a generic act of receiving. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
In particular, the claim recites an additional element (“training the performance predictor based on the predictor training dataset”). This additional element is recited at such a high level, without any details as to how the model is trained, that it amounts to only the idea of a solution or outcome; it fails to recite details of how a solution to the problem is accomplished and, therefore, represents no more than mere instructions to apply the judicial exception on a computer (see MPEP 2106.05(f)). Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
In particular, the claim recites an additional element (“wherein each set of training data comprises: candidate neural architecture and its corresponding performance”). This is a recitation of a particular type or source of model/data to be used in performing the abstract idea. Limiting the abstract idea to a particular type or source of model/data is an attempt to limit the abstract idea to a particular field of use or technological environment, which does not integrate the abstract idea into a practical application. See MPEP 2106.05(h).
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
As discussed above, the claim recites the additional element of receiving data at a high level of generality, which adds insignificant extra-solution activity to the judicial exception – see MPEP 2106.05(g). However, the addition of insignificant extra-solution activity does not amount to an inventive concept, particularly when the activity is well-understood, routine, and conventional. See MPEP 2106.05(d)(II) – “Receiving or transmitting data over a network” or “Storing and retrieving information in memory”. Accordingly, this additional element does not provide an inventive concept or amount to significantly more than the abstract idea. Thus, the claim is not patent eligible.
The additional elements regarding training are recited at such a high level, without any details as to how the model is trained, that they amount to only the idea of a solution or outcome; they fail to recite details of how a solution to the problem is accomplished and, therefore, represent no more than mere instructions to apply the judicial exception on a computer (see MPEP 2106.05(f)). Accordingly, these additional elements do not amount to significantly more than the abstract idea. The claim is not patent eligible.
This is a recitation of a particular type or source of model/data to be used in performing the abstract idea. Limiting the abstract idea to a particular type or source of model/data is an attempt to limit the abstract idea to a particular field of use or technological environment, which does not amount to significantly more than the abstract idea. See MPEP 2106.05(h).
Regarding claim 7
The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: The claim recites a system; therefore, it falls into the statutory category of a machine.
Step 2A Prong 1: The claim recites the abstract idea identified above regarding claim 1.
Step 2A Prong 2: This judicial exception is not integrated into a practical application.
In particular, the claim recites an additional element (“wherein the third node comprises: a search sub-node for determining candidate neural architectures for work sub-node; a work sub-node for determining the performance of the candidate neural architectures, training a performance predictor, and updating the embedding position; and a validation sub-node for performing a verification process based on a Merkel tree in a non-trusted environment to verify the data information broadcasted by other third nodes”). This is a recitation of a particular type or source of model/data to be used in performing the abstract idea. Limiting the abstract idea to a particular type or source of model/data is an attempt to limit the abstract idea to a particular field of use or technological environment, which does not integrate the abstract idea into a practical application. See MPEP 2106.05(h).
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
This is a recitation of a particular type or source of model/data to be used in performing the abstract idea. Limiting the abstract idea to a particular type or source of model/data is an attempt to limit the abstract idea to a particular field of use or technological environment, which does not amount to significantly more than the abstract idea. See MPEP 2106.05(h).
Regarding claim 8
The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: The claim recites a system; therefore, it falls into the statutory category of a machine.
Step 2A Prong 1:
The limitations of
“wherein … test the performance of the candidate neural architectures through the dataset or …”, as drafted, are processes that, under their broadest reasonable interpretation, cover performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, the limitations in the context of this claim encompass the user mentally thinking with a physical aid (e.g., pencil and paper).
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: This judicial exception is not integrated into a practical application.
The claim recites additional elements that are mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). In particular, the claim recites an additional element (“the work sub-node is further configured to”) – using a device and/or a model to process data. The device and the model in each step are recited at a high level of generality (i.e., as a generic computer performing a generic computer function of processing data) such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
In particular, the claim recites an additional element (“obtain the performance of the candidate neural architectures by collecting data information from other third nodes”) – the act of receiving data. The claim thereby adds insignificant extra-solution activity to the judicial exception – see MPEP 2106.05(g). The act of receiving data is recited at a high level of generality (i.e., as a generic act of receiving data) such that it amounts to no more than mere data gathering appended to the exception. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
As discussed above, with respect to integration of the abstract idea into a practical application, the additional elements of using a generic computer component to perform each step amount to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. See MPEP 2106.05(f). The claim is not patent eligible.
As discussed above, the claim recites the additional element of receiving data at a high level of generality, adding insignificant extra-solution activity – see MPEP 2106.05(g). However, the addition of insignificant extra-solution activity does not amount to an inventive concept, particularly when the activity is well-understood, routine, and conventional. See MPEP 2106.05(d)(II) – “Receiving or transmitting data over a network” or “Storing and retrieving information in memory”. Accordingly, this additional element does not provide an inventive concept, nor does it amount to significantly more than the abstract idea. Thus, the claim is not patent eligible.
Regarding claim 9
The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: The claim recites a method; therefore, it falls into the statutory category of a process.
Step 2A Prong 1:
The limitations of
“… for collaborative neural architecture search, comprising:
S1: determining … an initial neural architecture;
S2: embedding the initial neural architecture to an initial embedding position in a vector space …;
S3: …;
S4: …,
S5: …;
…;
S7: …”, as drafted, under their broadest reasonable interpretation, cover performance of the limitations in the mind but for the recitation of generic computer components. That is, nothing in the claim elements precludes the steps from practically being performed in the mind. For example, the limitations in the context of this claim encompass the user performing the recited steps mentally, with at most a physical aid (e.g., pencil and paper).
The limitations of
“S6: updating the embedding positions in the direction of the gradient provided … using gradient ascent, such that the updated embedding positions perform better than the initial embedding position;”, as drafted, under their broadest reasonable interpretation, cover performance of the limitation using mathematical relationships, mathematical formulas or equations, and/or mathematical calculations. That is, nothing in the claim element precludes the step from practically being performed as a mathematical calculation; gradient ascent is itself a mathematical procedure.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation based on mathematical relationships and/or mathematical formulas or equations and/or mathematical calculations, but for the recitation of generic computer components, then it falls within the “Mathematical concepts” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
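(Examiner's illustrative note: the gradient-ascent update of step S6, as recited, is the generic mathematical update sketched below. The step size and the predictor gradient are hypothetical assumptions for illustration only; the claim recites neither.)

```python
import numpy as np

# Hypothetical sketch of the generic gradient-ascent update recited in
# S6: z <- z + eta * grad_f(z).  The predictor gradient grad_f and the
# step size eta are illustrative assumptions, not claim elements.
def gradient_ascent_step(z, grad_f, eta=0.1):
    return z + eta * grad_f(z)

# Toy predictor f(z) = -(z - 3)^2 is maximized at z = 3.
grad_f = lambda z: -2.0 * (z - 3.0)
z = np.array([0.0])  # initial embedding position
for _ in range(100):
    z = gradient_ascent_step(z, grad_f)  # converges toward z = 3
```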
Step 2A Prong 2: This judicial exception is not integrated into a practical application.
In particular, the claim recites additional elements (“and training an initial neural architecture”, “training a performance predictor using the predictor training dataset”). These additional elements are recited at such a high level, without any details as to how a model is trained, that they amount to only the idea of a solution or outcome because they fail to recite details of how a solution to a problem is accomplished, and, therefore, represent no more than mere instructions to apply the judicial exception on a computer (see MPEP 2106.05(f)). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim recites additional elements that are mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). In particular, the claim recites additional elements (“using a variational autoencoder”, “by the performance predictor”) – using a device and/or a model to process data. The device and the model in each step are recited at a high level of generality (i.e., as a generic computer performing the generic computer function of processing data) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
In particular, the claim recites an additional element (“broadcasting the data information of the initial neural architecture”) – the act of transmitting data. The claim thereby adds insignificant extra-solution activity to the judicial exception – see MPEP 2106.05(g). The act of transmitting data is recited at a high level of generality (i.e., as a generic act of transmitting data) such that it amounts to no more than mere data output appended to the exception. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
In particular, the claim recites an additional element (“obtaining k sets of training data to form a predictor training dataset, wherein k is a positive integer”) – the act of receiving data. The claim thereby adds insignificant extra-solution activity to the judicial exception – see MPEP 2106.05(g). The act of receiving data is recited at a high level of generality (i.e., as a generic act of receiving data) such that it amounts to no more than mere data gathering appended to the exception. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
In particular, the claim recites an additional element (“repeating steps S1 to S6 L times, wherein L is a positive integer”) – the act of repeating. The claim thereby adds insignificant extra-solution activity to the judicial exception – see MPEP 2106.05(g). The act of repeating is recited at a high level of generality (i.e., as a generic repetition of the preceding steps) such that it amounts to no more than mere repetitive calculation appended to the exception. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
The additional elements regarding training are recited at such a high level, without any details as to how a model is trained, that they amount to only the idea of a solution or outcome because they fail to recite details of how a solution to a problem is accomplished, and, therefore, represent no more than mere instructions to apply the judicial exception on a computer (see MPEP 2106.05(f)). Accordingly, these additional elements do not amount to significantly more than the abstract idea. The claim is not patent eligible.
As discussed above, with respect to integration of the abstract idea into a practical application, the additional elements of using a generic computer component to perform each step amount to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. See MPEP 2106.05(f). The claim is not patent eligible.
As discussed above, the claim recites the additional element of transmitting data at a high level of generality, adding insignificant extra-solution activity – see MPEP 2106.05(g). However, the addition of insignificant extra-solution activity does not amount to an inventive concept, particularly when the activity is well-understood, routine, and conventional. See MPEP 2106.05(d)(II) – “Receiving or transmitting data over a network” or “Storing and retrieving information in memory”. Accordingly, this additional element does not provide an inventive concept, nor does it amount to significantly more than the abstract idea. Thus, the claim is not patent eligible.
As discussed above, the claim recites the additional element of receiving data at a high level of generality, adding insignificant extra-solution activity – see MPEP 2106.05(g). However, the addition of insignificant extra-solution activity does not amount to an inventive concept, particularly when the activity is well-understood, routine, and conventional. See MPEP 2106.05(d)(II) – “Receiving or transmitting data over a network” or “Storing and retrieving information in memory”. Accordingly, this additional element does not provide an inventive concept, nor does it amount to significantly more than the abstract idea. Thus, the claim is not patent eligible.
As discussed above, the claim recites the additional element of repeating at a high level of generality, adding insignificant extra-solution activity – see MPEP 2106.05(g). However, the addition of insignificant extra-solution activity does not amount to an inventive concept, particularly when the activity is well-understood, routine, and conventional. See MPEP 2106.05(d)(II) – “Performing repetitive calculations”. Accordingly, this additional element does not provide an inventive concept, nor does it amount to significantly more than the abstract idea. Thus, the claim is not patent eligible.
Regarding claim 10
The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: The claim recites a method; therefore, it falls into the statutory category of a process.
Step 2A Prong 1:
The limitations of
“S8: jumping to the embedding position of the top N preferred neural architectures in the vector space;
wherein N is a positive integer, and the preferred neural architectures are sorted by performance, and …”, as drafted, under their broadest reasonable interpretation, cover performance of the limitations in the mind. That is, nothing in the claim elements precludes the steps from practically being performed in the mind. For example, the limitations in the context of this claim encompass the user mentally sorting and selecting architectures, with at most a physical aid (e.g., pencil and paper).
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: This judicial exception is not integrated into a practical application.
In particular, the claim recites an additional element (“those with better performance have a lower index”). This is a recitation of a particular type or source of model/data to be used in performing the abstract idea. Limiting the abstract idea to a particular type or source of model/data is an attempt to limit the abstract idea to a particular field of use or technological environment, which does not integrate the abstract idea into a practical application. See MPEP 2106.05(h).
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
This is a recitation of a particular type or source of model/data to be used in performing the abstract idea. Limiting the abstract idea to a particular type or source of model/data is an attempt to limit the abstract idea to a particular field of use or technological environment, which does not amount to significantly more than the abstract idea. See MPEP 2106.05(h).
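(Examiner's illustrative note: the quoted convention, “those with better performance have a lower index,” amounts to no more than a descending sort followed by truncation to N entries. The sketch below, with hypothetical architecture names and scores, is within the limitation's scope.)

```python
# Hypothetical sketch: sort candidates by performance descending, so
# better-performing architectures receive lower indices, then keep top N.
candidates = [("arch_a", 0.91), ("arch_b", 0.87), ("arch_c", 0.95)]
preferred = sorted(candidates, key=lambda c: c[1], reverse=True)
top_n = preferred[:2]  # N = 2 -> [("arch_c", 0.95), ("arch_a", 0.91)]
```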
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 1, 9-10 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li et al. (A mining pool solution for novel proof-of-neural-architecture consensus) in view of Zhang et al. (Differentiable Neural Architecture Search in Equivalent Space with Exploration Enhancement).
Regarding claim 1
Li teaches
An apparatus for collaborative neural architecture search,
(Li [sec(s) I] “Fig. 1 shows the scenario when two miners search the neural network in two different subspaces. In the mining pool based on the PoDL consensus, each participant will suffer more from network bandwidth as the bottleneck for common distributed deep learning training. In this design, the mining pool will split the NAS searching space into multiple subspaces. Thus, the searching tasks will become relatively more independent.”;)
wherein the system consists of a collective set of independent computation nodes, wherein each of the nodes share a ledger that records the fitness of a set of mathematical models with respect to the dataset;
(Li [sec(s) I] “Fig. 1 shows the scenario when two miners search the neural network in two different subspaces. In the mining pool based on the PoDL consensus, each participant will suffer more from network bandwidth as the bottleneck for common distributed deep learning training. In this design, the mining pool will split the NAS searching space into multiple subspaces. Thus, the searching tasks will become relatively more independent.” [sec(s) III] “In this mining pool, the participants include miners and managers. The pool manager will receive the rewards once any of the participants find the block and the manager will distribute the rewards to each participant [19]. The pool manager normally hosts very powerful servers to maintain a stable connection and job distribution. Miners pay mining fees as a return to the good performing pool manager. The miners in the pool will contribute to the assigned tasks. In the mining pool based on NAS, the amount of the reward distributions will still depend on the ratio of computation power and contribution. But a weak miner may hold back the performance of the whole mining pool. Therefore, the Intuitive solution is that the manager will distribute easier jobs to weak miners and distribute harder jobs to strong miners. Therefore, all miners will be able to deliver useful results. … Mining pool participants: 1) the pool manager: the pool manager will split the searching space of the selected NAS task into multiple subspaces and assign the searching spaces to miners. 2) the miners: In the mining pool design, we separate miners into strong miners and weak miners. For strong miners, they can finish the search task in a given subspace. For weak miners, the search task cannot be finished due to the limitation of network bandwidths, hardware, etc.”;)
each of the nodes run the processes comprises: an Initialization process, search space division process, parameter tunning process, validation process and search process
(Li [sec(s) II] “PoDL: The consensus based on the deep learning algorithm divides each block time into two or more interval phases [4] [6]. In general, each block includes the initialization phase, the training phase, and the validation phase. In the initialization phase, all miners confirm the target task and evaluate the training setup, such as the target training epochs and the size of the dataset. In the training phase, miners train the confirmed target task and commit their model before the training phase ends. Here the miners submit the hash of their deep learning model, training results, and miners ID. The task publishers release training dataset and deep learning training source code. [4], [5] In the validation phase, the task publisher releases the test dataset to miners and full nodes, and each miner submits (1) the block header and the block that contains information describing the trained model on top of existing attributes, (2) the trained model, and (3) the accuracy of the trained model, to full nodes. The full nodes validate the submitted models [4], [5].” [sec(s) I] “Fig. 1 shows the scenario when two miners search the neural network in two different subspaces. In the mining pool based on the PoDL consensus, each participant will suffer more from network bandwidth as the bottleneck for common distributed deep learning training. In this design, the mining pool will split the NAS searching space into multiple subspaces. Thus, the searching tasks will become relatively more independent.” [sec(s) Abs] “This work adopts from exist Proof-of-Deep-Learning (PoDL) as the consensus and Neural Architecture Search (NAS) as the workload. The mining pool manager partitions the full searching space into subspaces and all miners contributes to the NAS task in the assigned tasks.”;)
wherein the initialization process is a process to maintain the supply of data and in the amid of the initialization process, each node broadcast the specification of the problem;
(Li [sec(s) II] “In the initialization phase, all miners confirm the target task and evaluate the training setup, such as the target training epochs and the size of the dataset. In the training phase, miners train the confirmed target task and commit their model before the training phase ends. Here the miners submit the hash of their deep learning model, training results, and miners ID. The task publishers release training dataset and deep learning training source code. [4], [5]” [sec(s) III] “Mining pool participants: 1) the pool manager: the pool manager will split the searching space of the selected NAS task into multiple subspaces and assign the searching spaces to miners. 2) the miners: In the mining pool design, we separate miners into strong miners and weak miners. For strong miners, they can finish the search task in a given subspace. For weak miners, the search task cannot be finished due to the limitation of network bandwidths, hardware, etc..” [sec(s) Abs] “This work adopts from exist Proof-of-Deep-Learning (PoDL) as the consensus and Neural Architecture Search (NAS) as the workload. The mining pool manager partitions the full searching space into subspaces and all miners contributes to the NAS task in the assigned tasks.”;)
wherein the specification of the problem comprises:
the types of neural architectures allow to be generated in the search process,
the evaluation method, the termination criteria of the parameter tunning,
the target dataset,
the training dataset, and
the test dataset;
(Li [sec(s) II] “In the initialization phase, all miners confirm the target task and evaluate the training setup, such as the target training epochs and the size of the dataset. In the training phase, miners train the confirmed target task and commit their model before the training phase ends. Here the miners submit the hash of their deep learning model, training results, and miners ID. The task publishers release training dataset and deep learning training source code. [4], [5] In the validation phase, the task publisher releases the test dataset to miners and full nodes, and each miner submits (1) the block header and the block that contains information describing the trained model on top of existing attributes, (2) the trained model, and (3) the accuracy of the trained model, to full nodes. The full nodes validate the submitted models [4], [5].” [sec(s) III] “Mining pool participants: 1) the pool manager: the pool manager will split the searching space of the selected NAS task into multiple subspaces and assign the searching spaces to miners. 2) the miners: In the mining pool design, we separate miners into strong miners and weak miners. For strong miners, they can finish the search task in a given subspace. For weak miners, the search task cannot be finished due to the limitation of network bandwidths, hardware, etc..” [sec(s) Abs] “This work adopts from exist Proof-of-Deep-Learning (PoDL) as the consensus and Neural Architecture Search (NAS) as the workload. The mining pool manager partitions the full searching space into subspaces and all miners contributes to the NAS task in the assigned tasks.”; e.g., “searching space” read(s) on “types of neural architectures”.)
(Note: Hereinafter, if a limitation has bold brackets (i.e. [·]) around claim languages, the bracketed claim languages indicate that they have not been taught yet by the current prior art reference but they will be taught by another prior art reference afterwards.)
wherein the search space division process is a process that is responsible for broadcast [a fully trained VAE (variational autoencoder) or a mathematical model that takes in a random vector and outputs a graph].
(Li [sec(s) III] “Mining pool participants: 1) the pool manager: the pool manager will split the searching space of the selected NAS task into multiple subspaces and assign the searching spaces to miners. 2) the miners: In the mining pool design, we separate miners into strong miners and weak miners. For strong miners, they can finish the search task in a given subspace. For weak miners, the search task cannot be finished due to the limitation of network bandwidths, hardware, etc..” [sec(s) I] “Fig. 1 shows the scenario when two miners search the neural network in two different subspaces. In the mining pool based on the PoDL consensus, each participant will suffer more from network bandwidth as the bottleneck for common distributed deep learning training. In this design, the mining pool will split the NAS searching space into multiple subspaces. Thus, the searching tasks will become relatively more independent.” [sec(s) Abs] “This work adopts from exist Proof-of-Deep-Learning (PoDL) as the consensus and Neural Architecture Search (NAS) as the workload. The mining pool manager partitions the full searching space into subspaces and all miners contributes to the NAS task in the assigned tasks.”;)
at least two mutually collaborative third nodes;
(Li [sec(s) III] “Mining pool participants: 1) the pool manager: the pool manager will split the searching space of the selected NAS task into multiple subspaces and assign the searching spaces to miners. 2) the miners: In the mining pool design, we separate miners into strong miners and weak miners. For strong miners, they can finish the search task in a given subspace. For weak miners, the search task cannot be finished due to the limitation of network bandwidths, hardware, etc..” [sec(s) I] “Fig. 1 shows the scenario when two miners search the neural network in two different subspaces. In the mining pool based on the PoDL consensus, each participant will suffer more from network bandwidth as the bottleneck for common distributed deep learning training. In this design, the mining pool will split the NAS searching space into multiple subspaces. Thus, the searching tasks will become relatively more independent.” [sec(s) Abs] “This work adopts from exist Proof-of-Deep-Learning (PoDL) as the consensus and Neural Architecture Search (NAS) as the workload. The mining pool manager partitions the full searching space into subspaces and all miners contributes to the NAS task in the assigned tasks.”;)
wherein the third node is used to determine a performance of multiple candidate neural architectures, train a [performance] predictor, and update [the embedding position based on the gradient direction provided by the performance trainer];
(Li [sec(s) II] “In the initialization phase, all miners confirm the target task and evaluate the training setup, such as the target training epochs and the size of the dataset. In the training phase, miners train the confirmed target task and commit their model before the training phase ends. Here the miners submit the hash of their deep learning model, training results, and miners ID. The task publishers release training dataset and deep learning training source code. [4], [5] In the validation phase, the task publisher releases the test dataset to miners and full nodes, and each miner submits (1) the block header and the block that contains information describing the trained model on top of existing attributes, (2) the trained model, and (3) the accuracy of the trained model, to full nodes. The full nodes validate the submitted models [4], [5].” [sec(s) III] “Mining pool participants: 1) the pool manager: the pool manager will split the searching space of the selected NAS task into multiple subspaces and assign the searching spaces to miners. 2) the miners: In the mining pool design, we separate miners into strong miners and weak miners. For strong miners, they can finish the search task in a given subspace. For weak miners, the search task cannot be finished due to the limitation of network bandwidths, hardware, etc. … Exploration and exploitation: The searching space will only be sent to strong miners. Once a solution is confirmed outperforms the current best results, the strong miners will share the corresponding hyperparameters with weak miners. Weak miners will only exploit the confirmed architecture” [sec(s) Abs] “This work adopts from exist Proof-of-Deep-Learning (PoDL) as the consensus and Neural Architecture Search (NAS) as the workload. The mining pool manager partitions the full searching space into subspaces and all miners contributes to the NAS task in the assigned tasks.”;)
wherein data information is shared among different third nodes, and the data information comprises the performance of candidate neural architectures.
(Li [fig(s) 1] “The results of NAS in full searching space, and subspace 1 to 9. The x-axis is the number of episodes and y-axis is the best rewards.” [sec(s) II] “Here the miners submit the hash of their deep learning model, training results, and miners ID. The task publishers release training dataset and deep learning training source code. [4], [5] In the validation phase, the task publisher releases the test dataset to miners and full nodes, and each miner submits (1) the block header and the block that contains information describing the trained model on top of existing attributes, (2) the trained model, and (3) the accuracy of the trained model, to full nodes. The full nodes validate the submitted models [4], [5].” [sec(s) V] “For a public accessible cryptocurrency blockchain system, earning tokens is the incentive for an individual miner to participates in mining.” [sec(s) IV] “Exploration and exploitation: The searching space will only be sent to strong miners. Once a solution is confirmed outperforms the current best results, the strong miners will share the corresponding hyperparameters with weak miners. Weak miners will only exploit the confirmed architecture”;)
However, Li does not appear to explicitly teach:
wherein the search space division process is a process that is responsible for broadcast [a fully trained VAE (variational autoencoder) or a mathematical model that takes in a random vector and outputs a graph].
wherein the variational autoencoder is used to embedding a neural architecture to a vector space so that neural architectures with similar structures have adjacent embedding positions
wherein the third node is used to determine a performance of multiple candidate neural architectures, train a [performance] predictor, and update [the embedding position based on the gradient direction provided by the performance trainer];
(Note: Hereinafter, if a limitation has one or more bold underlines, the one or more underlined claim languages indicate that they are taught by the current prior art reference, while the one or more non-underlined claim languages indicate that they have been taught already by one or more previous art references.)
Zhang teaches
wherein the search space division process is a process that is responsible for broadcast a fully trained VAE (variational autoencoder) or a mathematical model that takes in a random vector and outputs a graph.
(Zhang [sec(s) Abs] “Differently, this paper utilizes a variational graph autoencoder to injectively transform the discrete architecture space into an equivalently continuous latent space, to resolve the incongruence.” [sec(s) 4.2] “In this subsection, we conduct comparison experiments to demonstrate the effectiveness of our injective transformation, and all experiments are conducted without exploration enhancement that γ (in Eq. 3) is set as 0, nor relieving catastrophic forgetting that ε (in Eq. 7) is set as 0. As described in Section 3.1, we first adopt a graph autoencoder to injectively transform the discrete architecture into an equivalently continuous latent space, and then conduct differentiable optimization for architecture search as common differentiable NAS.” [sec(s) 3.1] “we encode the computation C that is the final output of the neural network into a continuous representation z. … Furthermore, we know that the graph encoder maps C to z injectively if the aggregation function G and the updating function U in the graph encoder are injective. … In our graph decoder, an MLP is first applied to the latent vector z to obtain the initial hidden state h0 which is fed to GRUd. Then the decoder constructs a DAG node by node based on the existing graph’s state. The detailed implementation of the variational graph autoencoder based on [38] for our differentiable NAScould be found in the Appendix A.”;)
wherein the variational autoencoder is used to embed a neural architecture to a vector space so that neural architectures with similar structures have adjacent embedding positions;
(Zhang [sec(s) Abs] “Differently, this paper utilizes a variational graph autoencoder to injectively transform the discrete architecture space into an equivalently continuous latent space, to resolve the incongruence.” [sec(s) 4.2] “In this subsection, we conduct comparison experiments to demonstrate the effectiveness of our injective transformation, and all experiments are conducted without exploration enhancement that γ (in Eq. 3) is set as 0, nor relieving catastrophic forgetting that ε (in Eq. 7) is set as 0. As described in Section 3.1, we first adopt a graph autoencoder to injectively transform the discrete architecture into an equivalently continuous latent space, and then conduct differentiable optimization for architecture search as common differentiable NAS.” [sec(s) 3.1] “we encode the computation C that is the final output of the neural network into a continuous representation z. … Furthermore, we know that the graph encoder maps C to z injectively if the aggregation function G and the updating function U in the graph encoder are injective. The above injectiveness indicates that, there is a one-to-one mapping from latent representation to an architecture with a graph autoencoder, and vice versa, and we could equivalently conduct differentiable optimization on the latent continuous space for architecture search. In our graph decoder, an MLP is first applied to the latent vector z to obtain the initial hidden state h0 which is fed to GRUd. Then the decoder constructs a DAG node by node based on the existing graph’s state. The detailed implementation of the variational graph autoencoder based on [38] for our differentiable NAS could be found in the Appendix A. … We describe the trained graph encoder E as a mapping E : Rm → Rn, m > n, and decoder D as mapping D : Rn → Rm that defines a parameterized manifold of dimension n, M ≡ D(Rn), and every architecture αi could be sampled with noise ξi through αi = D(αiθ) + ξi, where αiθ ∈ Rn. 
Assuming the decoding function D is smooth enough [28, 41], and using first-order Taylor expansion at a given point αi ∈ Rm, we have
[media_image1.png (greyscale): Eq. (4) of Zhang — the first-order Taylor expansion of the decoder D at αi], (4)”;)
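As purely illustrative technical context for the injectiveness relied upon above (a one-to-one mapping between discrete architectures and latent points, so that search can run in the latent space and decode back), a toy encoder/decoder pair is sketched below; the digit-packing encoding is invented for illustration and appears in neither Zhang nor the claims.

```python
# Toy illustration of an injective architecture <-> latent mapping:
# each DAG (as a sorted edge tuple) maps to a distinct latent point and
# decodes back exactly, so optimization can run in the latent space.

def encode(edges):
    # Hypothetical injective encoding: pack each edge (u, v), with u, v < 10,
    # into one two-digit group, then scale the integer into a "latent" float.
    code = 0
    for u, v in sorted(edges):
        code = code * 100 + u * 10 + v
    return code / 1e6

def decode(z):
    # Exact inverse of encode: unpack the digit groups back into edges.
    code = round(z * 1e6)
    edges = []
    while code:
        code, pair = divmod(code, 100)
        edges.append((pair // 10, pair % 10))
    return tuple(sorted(edges))

dag = ((0, 1), (1, 2), (0, 2))
z = encode(dag)
assert decode(z) == tuple(sorted(dag))  # round-trip is exact (injective)
```

Because the round trip is lossless, any point found by continuous optimization in the latent space corresponds to exactly one discrete architecture, which is the property the quoted passages of Zhang establish for the variational graph autoencoder.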
wherein the third node is used to determine a performance of multiple candidate neural architectures, train a performance predictor, and update the embedding position based on the gradient direction provided by the performance trainer;
(Zhang [algorithm 1] “Retrain α∗ and get the best performance on the test dataset Dtest” and “Sample batch of Dvalid, and update αθ based on Eq. (3);” [sec(s) 3.1] “we encode the computation C that is the final output of the neural network into a continuous representation z. … Furthermore, we know that the graph encoder maps C to z injectively if the aggregation function G and the updating function U in the graph encoder are injective. … Exploration Enhancement After transforming the discrete architecture into continuous space, existing differentiable NAS methods all conduct continuous optimization to update the continuous architecture representation αθ only along the gradient of validation performance based on Eq.2. Such a method would easily get into the rich-get-richer problem. To overcome this problem, we add the novelty into the gradient to enhance exploration to avoid local optima in architecture search, and update the architecture according” [sec(s) 4.3] “In the following, we investigate how exploration enhancement affects the performance of our E2NAS and the necessity of exploration in differentiable architecture search. In our E2NAS, a bigger γ enhances the exploration to avoid local optimal, and a smaller γ guarantees better solutions with higher validation performance.” [sec(s) 4.4] “The performance of architectures by inheriting weights in this curve is getting better with the supernet training, making the assumption in bilevel optimization based differentiable NAS hold true.”;)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system of Li with the embedding position of Zhang.
One of ordinary skill in the art would have been motivated to combine the references in order to improve the performance of differentiable One-Shot NAS and significantly enhance the exploration ability.
(Zhang [sec(s) 4] “The results of our E2NAS demonstrate that enhancing exploration helps to improve the performance of differentiable One-Shot NAS, where most γ settings all improve the performance of E2NAS. Our E2NAS is also very robust to γ, especially for dynamic γ, where all Sigγ achieve satisfying results. More importantly, γ significantly enhances the exploration ability of our E2NAS, and the best accuracy during the architecture search of our E2NAS (Sigγ(2)) reaches 94.29±0.07%, which greatly outperform our E2NAS without exploration enhancement (γ = 0). We could find that GDAS also achieves good performance in this dataset and greatly outperforms GDAS-A. One potential reason is that GDAS also introduces the exploration into the architecture search [11], which could improve the performance. Nevertheless, our method E2NAS (with Sigγ(10)) still outperforms GDAS, showing impressive results.”)
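The mapped limitation (updating an embedding position along the gradient direction supplied by a trained predictor) can be sketched, purely for illustration, as gradient ascent against a differentiable predictor; the quadratic `predict` surrogate, the step size, and the numeric gradient below are assumptions for this sketch and are not taken from Li or Zhang.

```python
# Toy sketch: move an architecture embedding along the gradient of a
# (hypothetical) performance predictor via gradient ascent.

def predict(z):
    # Illustrative smooth predictor: peak performance at z = (1.0, -2.0).
    return -((z[0] - 1.0) ** 2 + (z[1] + 2.0) ** 2)

def numeric_grad(f, z, eps=1e-6):
    # Central-difference gradient, avoiding any autodiff dependency.
    g = []
    for i in range(len(z)):
        zp, zm = list(z), list(z)
        zp[i] += eps
        zm[i] -= eps
        g.append((f(zp) - f(zm)) / (2 * eps))
    return g

def ascend(z, steps=200, lr=0.1):
    # Repeatedly step in the gradient direction supplied by the predictor.
    for _ in range(steps):
        g = numeric_grad(predict, z)
        z = [zi + lr * gi for zi, gi in zip(z, g)]
    return z

z0 = [0.0, 0.0]          # initial embedding position
z_star = ascend(z0)      # updated position scores higher under the predictor
assert predict(z_star) > predict(z0)
```

In Zhang's formulation the analogous update is applied to the continuous architecture representation αθ along the gradient of validation performance (Eq. (3)), with an added novelty term for exploration.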
Regarding claim 9
A method for collaborative neural architecture search, comprising:
(Li [sec(s) I] “Fig. 1 shows the scenario when two miners search the neural network in two different subspaces. In the mining pool based on the PoDL consensus, each participant will suffer more from network bandwidth as the bottleneck for common distributed deep learning training. In this design, the mining pool will split the NAS searching space into multiple subspaces. Thus, the searching tasks will become relatively more independent.”;)
S1: determining and training an initial neural architecture;
(Li [sec(s) II] “PoDL: The consensus based on the deep learning algorithm divides each block time into two or more interval phases [4] [6]. In general, each block includes the initialization phase, the training phase, and the validation phase. In the initialization phase, all miners confirm the target task and evaluate the training setup, such as the target training epochs and the size of the dataset. In the training phase, miners train the confirmed target task and commit their model before the training phase ends. Here the miners submit the hash of their deep learning model, training results, and miners ID. The task publishers release training dataset and deep learning training source code. [4], [5] In the validation phase, the task publisher releases the test dataset to miners and full nodes, and each miner submits (1) the block header and the block that contains information describing the trained model on top of existing attributes, (2) the trained model, and (3) the accuracy of the trained model, to full nodes. The full nodes validate the submitted models [4], [5].”;)
S3: broadcasting the data information of the initial neural architecture;
(Li [sec(s) II] “Here the miners submit the hash of their deep learning model, training results, and miners ID. The task publishers release training dataset and deep learning training source code. [4], [5] In the validation phase, the task publisher releases the test dataset to miners and full nodes, and each miner submits (1) the block header and the block that contains information describing the trained model on top of existing attributes, (2) the trained model, and (3) the accuracy of the trained model, to full nodes. The full nodes validate the submitted models [4], [5].” [sec(s) III] “Exploration and exploitation: The searching space will only be sent to strong miners. Once a solution is confirmed outperforms the current best results, the strong miners will share the corresponding hyperparameters with weak miners. Weak miners will only exploit the confirmed architecture.” and “Mining pool participants: 1) the pool manager: the pool manager will split the searching space of the selected NAS task into multiple subspaces and assign the searching spaces to miners. 2) the miners: In the mining pool design, we separate miners into strong miners and weak miners. For strong miners, they can finish the search task in a given subspace. For weak miners, the search task cannot be finished due to the limitation of network bandwidths, hardware, etc..”;)
S4: obtaining k sets of training data to form a predictor training dataset, wherein k is a positive integer;
(Li [sec(s) II] “In the initialization phase, all miners confirm the target task and evaluate the training setup, such as the target training epochs and the size of the dataset. In the training phase, miners train the confirmed target task and commit their model before the training phase ends. Here the miners submit the hash of their deep learning model, training results, and miners ID. The task publishers release training dataset and deep learning training source code. [4], [5] In the validation phase, the task publisher releases the test dataset to miners and full nodes, and each miner submits (1) the block header and the block that contains information describing the trained model on top of existing attributes, (2) the trained model, and (3) the accuracy of the trained model, to full nodes. The full nodes validate the submitted models [4], [5].” [sec(s) III] “Exploration and exploitation: The searching space will only be sent to strong miners. Once a solution is confirmed outperforms the current best results, the strong miners will share the corresponding hyperparameters with weak miners. Weak miners will only exploit the confirmed architecture.” [sec(s) IV] “The NAS [13] is adopted to find a convolutional neural networks architecture for classification tasks under a certain hardware constrain. We use CIFAR-10 dataset [20]. All experiments were deployed on the workstation with Intel i7-9900K CPU @ 3.60GHz, 32Gb RAM, GTX 1080 Ti.”;)
S5: training a [performance] predictor using the predictor training dataset;
(Li [sec(s) II] “In the initialization phase, all miners confirm the target task and evaluate the training setup, such as the target training epochs and the size of the dataset. In the training phase, miners train the confirmed target task and commit their model before the training phase ends. Here the miners submit the hash of their deep learning model, training results, and miners ID. The task publishers release training dataset and deep learning training source code. [4], [5] In the validation phase, the task publisher releases the test dataset to miners and full nodes, and each miner submits (1) the block header and the block that contains information describing the trained model on top of existing attributes, (2) the trained model, and (3) the accuracy of the trained model, to full nodes. The full nodes validate the submitted models [4], [5].” [sec(s) III] “Exploration and exploitation: The searching space will only be sent to strong miners. Once a solution is confirmed outperforms the current best results, the strong miners will share the corresponding hyperparameters with weak miners. Weak miners will only exploit the confirmed architecture.” [sec(s) IV] “The NAS [13] is adopted to find a convolutional neural networks architecture for classification tasks under a certain hardware constrain. We use CIFAR-10 dataset [20]. All experiments were deployed on the workstation with Intel i7-9900K CPU @ 3.60GHz, 32Gb RAM, GTX 1080 Ti.”;)
S7: repeating steps S1 [to] S6 L times, wherein L is a positive integer.
(Li [fig(s) 1] “The results of NAS in full searching space, and subspace 1 to 9. The x-axis is the number of episodes and y-axis is the best rewards.”;)
However, Li does not appear to explicitly teach:
S2: embedding the initial neural architecture to an initial embedding position in a vector space using a variational autoencoder;
S5: training a [performance] predictor using the predictor training dataset;
S6: updating the embedding positions in the direction of the gradient provided by the performance predictor using gradient ascent, such that the updated embedding positions perform better than the initial embedding position;
S7: repeating steps S1 [to] S6 L times, wherein L is a positive integer.
Zhang teaches
S2: embedding the initial neural architecture to an initial embedding position in a vector space using a variational autoencoder;
(Zhang [sec(s) Abs] “this paper utilizes a variational graph autoencoder to injectively transform the discrete architecture space into an equivalently continuous latent space, to resolve the incongruence.” [sec(s) 1] “For the incongruence in the relaxation transformation of differentiable NAS, we utilize a variational graph autoencoder with an asynchronous message passing scheme to transform the discrete architectures into an equivalent continuous space injectively. Because of the injectiveness, we could equivalently perform optimization in the continuous latent space with a solid theoretical foundation [33, 38].” [sec(s) 3.1] “As mentioned before, existing differentiable NAS methods usually adopt a simple continuous relaxation [22] to transform the discrete neural architectures (usually represented as DAGs) into a continuous space. As they could hardly guarantee that this transformation is injective, they suffer the problem of incongruence. For this problem, we adopt an asynchronous message passing scheme based graph neural network (GNN) to encode the neural architecture into an injective space. Different from encoding the graph in many GNNs, we encode the computation C that is the final output of the neural network into a continuous representation z.”;)
S5: training a performance predictor using the predictor training dataset;
(Zhang [algorithm 1] “Retrain α∗ and get the best performance on the test dataset Dtest” [sec(s) 3.1] “we encode the computation C that is the final output of the neural network into a continuous representation z. … Furthermore, we know that the graph encoder maps C to z injectively if the aggregation function G and the updating function U in the graph encoder are injective. … Exploration Enhancement After transforming the discrete architecture into continuous space, existing differentiable NAS methods all conduct continuous optimization to update the continuous architecture representation αθ only along the gradient of validation performance based on Eq.2. Such a method would easily get into the rich-get-richer problem. To overcome this problem, we add the novelty into the gradient to enhance exploration to avoid local optima in architecture search, and update the architecture according” [sec(s) 3.2] “As described in Section 2, the differentiable NAS is built upon One-Shot NAS, which trains numerous architectures with partially shared weights on a single dataset. Without losing generality, this paper also considers the typical scenario that only one architecture (a single path) in the supernet is trained in each step of architecture search. Now we simply define each step of supernet training, argminLtrain(αiθ,WA) = argminLtrain(WA(αi)), as a task, and the supernet is trained on multiple sequential tasks through a lifelong learning setting [8, 30] or a online multi-task learning setting [10].” [sec(s) 4.3] “In the following, we investigate how exploration enhancement affects the performance of our E2NAS and the necessity of exploration in differentiable architecture search. 
In our E2NAS, a bigger γ enhances the exploration to avoid local optimal, and a smaller γ guarantees better solutions with higher validation performance.” [sec(s) 4.4] “The performance of architectures by inheriting weights in this curve is getting better with the supernet training, making the assumption in bilevel optimization based differentiable NAS hold true.”;)
S6: updating the embedding positions in the direction of the gradient provided by the performance predictor using gradient ascent, such that the updated embedding positions perform better than the initial embedding position;
(Zhang [algorithm 1] “Retrain α∗ and get the best performance on the test dataset Dtest” [sec(s) 3.1] “we encode the computation C that is the final output of the neural network into a continuous representation z. … Furthermore, we know that the graph encoder maps C to z injectively if the aggregation function G and the updating function U in the graph encoder are injective. … Exploration Enhancement After transforming the discrete architecture into continuous space, existing differentiable NAS methods all conduct continuous optimization to update the continuous architecture representation αθ only along the gradient of validation performance based on Eq.2. Such a method would easily get into the rich-get-richer problem. To overcome this problem, we add the novelty into the gradient to enhance exploration to avoid local optima in architecture search, and update the architecture according” [sec(s) 4.3] “In the following, we investigate how exploration enhancement affects the performance of our E2NAS and the necessity of exploration in differentiable architecture search. In our E2NAS, a bigger γ enhances the exploration to avoid local optimal, and a smaller γ guarantees better solutions with higher validation performance.” [sec(s) 4.4] “The performance of architectures by inheriting weights in this curve is getting better with the supernet training, making the assumption in bilevel optimization based differentiable NAS hold true.”;)
S7: repeating steps S1 to S6 L times, wherein L is a positive integer.
(Zhang [algorithm 1] “while not done do” [sec(s) 4.2] “Apart from the test accuracy of the searched architecture in the last iteration, we further demonstrate the test accuracy of the best searched architecture in all iterations to present the exploration ability in differentiable One-Shot NAS (Best (%) in Table 2). We first investigate the effectiveness of our proposed injective transformation method.”;)
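Read as an algorithm, the S1–S7 sequence mapped above amounts to the loop sketched below; every function body here (the one-dimensional `embed`, the `true_performance` surrogate, the analytic `predictor_gradient`) is a hypothetical stand-in for illustration, not an implementation from Li or Zhang.

```python
# Toy walk-through of claimed steps S1-S7; all functions are stand-ins.

def embed(arch_id):
    # S2: stand-in "VAE" embedding of an architecture into a 1-D vector space.
    return float(arch_id) / 10.0

def true_performance(z):
    # Ground-truth surrogate used only to label predictor training data (S4).
    return 1.0 - (z - 0.7) ** 2

def predictor_gradient(z):
    # S5/S6: gradient of a predictor fit to that data; here the analytic
    # gradient of the surrogate stands in for a trained predictor.
    return -2.0 * (z - 0.7)

def search(L=50, lr=0.1):
    z = embed(arch_id=1)   # S1/S2: train the initial architecture, then embed
    for _ in range(L):     # S7: repeat L times
        # S3: broadcasting architecture data is omitted in this one-node toy.
        # S4: k (architecture, performance) pairs would be collected here.
        z = z + lr * predictor_gradient(z)   # S6: gradient ascent step
    return z

z_final = search()   # converges toward the surrogate's optimum at z = 0.7
```

The loop structure mirrors Zhang's Algorithm 1 ("while not done do"), where the architecture parameter is repeatedly updated and the best architecture over all iterations is retained.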
Li is combinable with Zhang for the same rationale as set forth above with respect to claim 1.
Regarding claim 10
The combination of Li and Zhang teaches claim 9.
Zhang further teaches
S8: jumping to the embedding position of the top N preferred neural architectures in the vector space;
(Zhang [algorithm 1] “Randomly initialize architecture parameter αθ and supernet weights WA(α)” and “Retrain α∗ and get the best performance” [sec(s) Abs] “this paper utilizes a variational graph autoencoder to injectively transform the discrete architecture space into an equivalently continuous latent space, to resolve the incongruence.” [sec(s) 4.1] “Our E2NAS with random seed 0 obtains a 94.22%, 73.13%, and 46.48% on CIFAR-10, CIFAR-100, and ImageNet, respectively, which are almost equal to the optimal point in NAS-Bench-201 dataset.” [sec(s) 1] “architectures with better performance in the early stage would be trained more frequently, and the updated weights further make these architectures having a higher probability of being sampled, which easily leads to a local optimal.” [sec(s) 3.1] “we adopt an asynchronous message passing scheme based graph neural network (GNN) to encode the neural architecture into an injective space. Different from encoding the graph in many GNNs, we encode the computation C that is the final output of the neural network into a continuous representation z. … While it is intractable to measure the novelty of architectures in the discrete space, we could calculate the probability density function of αiθ drawn from the distribution formulated by continuous architectures αθ in the archive A, which is also called as probabilistic novelty detection in latent space.” [sec(s) 4.3] “we investigate how exploration enhancement affects the performance of our E2NAS and the necessity of exploration in differentiable architecture search. In our E2NAS, a bigger γ enhances the exploration to avoid local optimal, and a smaller γ guarantees better solutions with higher validation performance.” [sec(s) 4.4] “The performance of architectures by inheriting weights in this curve is getting better with the supernet training, making the assumption in bilevel optimization based differentiable NAS hold true.”;)
wherein N is a positive integer, and the preferred neural architectures are sorted by performance, and those with better performance have a lower index.
(Zhang [algorithm 1] “Randomly initialize architecture parameter αθ and supernet weights WA(α)” and “Retrain α∗ and get the best performance” [sec(s) Abs] “this paper utilizes a variational graph autoencoder to injectively transform the discrete architecture space into an equivalently continuous latent space, to resolve the incongruence.” [sec(s) 4.1] “Our E2NAS with random seed 0 obtains a 94.22%, 73.13%, and 46.48% on CIFAR-10, CIFAR-100, and ImageNet, respectively, which are almost equal to the optimal point in NAS-Bench-201 dataset.” [sec(s) 1] “architectures with better performance in the early stage would be trained more frequently, and the updated weights further make these architectures having a higher probability of being sampled, which easily leads to a local optimal.” [sec(s) 3.1] “we adopt an asynchronous message passing scheme based graph neural network (GNN) to encode the neural architecture into an injective space. Different from encoding the graph in many GNNs, we encode the computation C that is the final output of the neural network into a continuous representation z. … While it is intractable to measure the novelty of architectures in the discrete space, we could calculate the probability density function of αiθ drawn from the distribution formulated by continuous architectures αθ in the archive A, which is also called as probabilistic novelty detection in latent space.” [sec(s) 4.3] “we investigate how exploration enhancement affects the performance of our E2NAS and the necessity of exploration in differentiable architecture search. In our E2NAS, a bigger γ enhances the exploration to avoid local optimal, and a smaller γ guarantees better solutions with higher validation performance.” [sec(s) 4.4] “The performance of architectures by inheriting weights in this curve is getting better with the supernet training, making the assumption in bilevel optimization based differentiable NAS hold true.”;)
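The claim-10 indexing convention (preferred architectures sorted by performance, with better performance at a lower index) reduces to a descending sort plus truncation; the candidate data below is illustrative only.

```python
# Illustrative: select the top-N preferred architectures, indexed so that
# better performance means a lower index (index 0 = best).

candidates = [
    ("arch-a", 0.91),
    ("arch-b", 0.88),
    ("arch-c", 0.94),
    ("arch-d", 0.90),
]

def top_n(archs, n):
    # Sort by performance, best first, and keep the first n entries.
    ranked = sorted(archs, key=lambda pair: pair[1], reverse=True)
    return ranked[:n]

preferred = top_n(candidates, n=2)
# preferred[0] is the best-performing architecture ("arch-c"),
# preferred[1] the second best ("arch-a").
```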
Li is combinable with Zhang for the same rationale as set forth above with respect to claim 1.
Claim(s) 2 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li et al. (A mining pool solution for novel proof-of-neural-architecture consensus) in view of Zhang et al. (Differentiable Neural Architecture Search in Equivalent Space with Exploration Enhancement), and further in view of Wen et al. (Neural Predictor for Neural Architecture Search).
Regarding claim 2
The combination of Li and Zhang teaches claim 1.
Li further teaches
wherein the initialization process is namely a first node and the search space division process is namely a second node;
(Li [sec(s) II] “PoDL: The consensus based on the deep learning algorithm divides each block time into two or more interval phases [4] [6]. In general, each block includes the initialization phase, the training phase, and the validation phase. In the initialization phase, all miners confirm the target task and evaluate the training setup, such as the target training epochs and the size of the dataset. In the training phase, miners train the confirmed target task and commit their model before the training phase ends. Here the miners submit the hash of their deep learning model, training results, and miners ID. The task publishers release training dataset and deep learning training source code. [4], [5] In the validation phase, the task publisher releases the test dataset to miners and full nodes, and each miner submits (1) the block header and the block that contains information describing the trained model on top of existing attributes, (2) the trained model, and (3) the accuracy of the trained model, to full nodes. The full nodes validate the submitted models [4], [5].” [sec(s) I] “Fig. 1 shows the scenario when two miners search the neural network in two different subspaces. In the mining pool based on the PoDL consensus, each participant will suffer more from network bandwidth as the bottleneck for common distributed deep learning training. In this design, the mining pool will split the NAS searching space into multiple subspaces. Thus, the searching tasks will become relatively more independent.” [sec(s) Abs] “This work adopts from exist Proof-of-Deep-Learning (PoDL) as the consensus and Neural Architecture Search (NAS) as the workload. The mining pool manager partitions the full searching space into subspaces and all miners contributes to the NAS task in the assigned tasks.”;)
wherein the first node is configured for providing a preset search space and dataset, and the second node is configured for generating [a variational autoencoder] and dividing the search space; wherein [the variational autoencoder is trained on an encoder training dataset] and broadcasted to the third nodes;
(Li [sec(s) II] “In the training phase, miners train the confirmed target task and commit their model before the training phase ends. Here the miners submit the hash of their deep learning model, training results, and miners ID. The task publishers release training dataset and deep learning training source code. [4], [5] In the validation phase, the task publisher releases the test dataset to miners and full nodes, and each miner submits (1) the block header and the block that contains information describing the trained model on top of existing attributes, (2) the trained model, and (3) the accuracy of the trained model, to full nodes. The full nodes validate the submitted models [4], [5].” [sec(s) III] “Mining pool participants: 1) the pool manager: the pool manager will split the searching space of the selected NAS task into multiple subspaces and assign the searching spaces to miners. 2) the miners: In the mining pool design, we separate miners into strong miners and weak miners. For strong miners, they can finish the search task in a given subspace. For weak miners, the search task cannot be finished due to the limitation of network bandwidths, hardware, etc..”;)
the distribution of the neural architectures in the vector space is selected from a Gaussian distribution with a standard deviation of 1 or the distribution specified by the first node.
(Li [sec(s) II] “In the initialization phase, all miners confirm the target task and evaluate the training setup, such as the target training epochs and the size of the dataset. In the training phase, miners train the confirmed target task and commit their model before the training phase ends. Here the miners submit the hash of their deep learning model, training results, and miners ID. The task publishers release training dataset and deep learning training source code. [4], [5] In the validation phase, the task publisher releases the test dataset to miners and full nodes, and each miner submits (1) the block header and the block that contains information describing the trained model on top of existing attributes, (2) the trained model, and (3) the accuracy of the trained model, to full nodes. The full nodes validate the submitted models [4], [5].” [sec(s) III] “Mining pool participants: 1) the pool manager: the pool manager will split the searching space of the selected NAS task into multiple subspaces and assign the searching spaces to miners. 2) the miners: In the mining pool design, we separate miners into strong miners and weak miners. For strong miners, they can finish the search task in a given subspace. For weak miners, the search task cannot be finished due to the limitation of network bandwidths, hardware, etc. … Exploration and exploitation: The searching space will only be sent to strong miners. Once a solution is confirmed outperforms the current best results, the strong miners will share the corresponding hyperparameters with weak miners. Weak miners will only exploit the confirmed architecture.” [sec(s) Abs] “The mining pool manager partitions the full searching space into subspaces and all miners contributes to the NAS task in the assigned tasks. The strong miners are assigned for exploration and the weak miners are assigned for exploitation.”;)
Zhang further teaches
wherein the first node is configured for providing a preset search space and dataset, and the second node is configured for generating a variational autoencoder and dividing the search space; wherein the variational autoencoder is trained on an encoder training dataset and broadcasted to the third nodes;
(Zhang [algorithm 1] “Trained encoder E and decoder D, training dataset Dtrain and validation dataset Dvalid” [sec(s) Abs] “Differently, this paper utilizes a variational graph autoencoder to injectively transform the discrete architecture space into an equivalently continuous latent space, to resolve the incongruence.” [sec(s) 4.2] “In this subsection, we conduct comparison experiments to demonstrate the effectiveness of our injective transformation, and all experiments are conducted without exploration enhancement that γ (in Eq. 3) is set as 0, nor relieving catastrophic forgetting that ε (in Eq. 7) is set as 0. As described in Section 3.1, we first adopt a graph autoencoder to injectively transform the discrete architecture into an equivalently continuous latent space, and then conduct differentiable optimization for architecture search as common differentiable NAS.” [sec(s) 3.2] “As described in Section 2, the differentiable NAS is built upon One-Shot NAS, which trains numerous architectures with partially shared weights on a single dataset. Without losing generality, this paper also considers the typical scenario that only one architecture (a single path) in the supernet is trained in each step of architecture search. Now we simply define each step of supernet training, argminLtrain(αiθ,WA) = argminLtrain(WA(αi)), as a task, and the supernet is trained on multiple sequential tasks through a lifelong learning setting [8, 30] or a online multi-task learning setting [10]. … In our E2NAS, we train the encoder E and decoder D”;)
wherein the encoder training dataset consists of several [randomly] generated neural architectures that meet the search space; and
(Zhang [algorithm 1] “Randomly initialize architecture parameter αθ and supernet weights WA(α)” and “Sample batch of Dtrain, decode αθ to get α based on D, get the complementary architecture αc, and update the supernet weights WA(α) based on Eq. (7), and add architecture α into A;” [table(s) 1] “The best single run of our E2NAS (with random seed 0)” [fig(s) 1] “Trajectory of validation accuracy of sampled architecture during supernet training on DARTS search space” [sec(s) 4.1] “The comparison results on NAS-Bench-201 with NAS baselines are demonstrated in Table 1, where we report the statistical results from independent search experiments with different random seeds (The random seeds for all experiments on NAS-Bench-201 are set as {0,1}.).” [sec(s) 4.2] “we consider a variant of GDAS, GDAS-A, which directly samples architectures through argmax during the supernet training.” [sec(s) 4.4] “Figure 1 (b) tracks the validation accuracy of the sampled architectures during the supernet training for differentiable One-Shot NAS methods on a common convolutional search space [11, 22].” [sec(s) 2] “One-Shot NAS uses a controller to sample discrete architectures from the search space for supernet training, and the most promising architecture α∗ is obtained through heuristic search methods based on the trained supernet.”;)
Li is combinable with Zhang for the same rationale as set forth above with respect to claim 1.
However, the combination of Li, Zhang does not appear to explicitly teach:
wherein the encoder training dataset consists of several [randomly] generated neural architectures that meet the search space; and
Wen teaches
wherein the encoder training dataset consists of several randomly generated neural architectures that meet the search space; and
(Wen [sec(s) Abs] “we train N random architectures to generate N (architecture, validation accuracy) pairs and use them to train a regression model that predicts accuracies for architectures.” [sec(s) 1] “To achieve this, the proposed Neural Predictor uses the following steps to perform an architecture search: (1) Build a predictor by training N random architectures to obtain N (architecture, validation accuracy) pairs. Use this data to train a regressor. (2) Quality prediction using the regression model over a large set of random architectures. Select the K most promising architectures for final validation. (3) Final validation of the top K architectures by training them. Then we select the architecture with the highest validation accuracy to deploy.”;)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system of Li and Zhang with the randomly generated neural architectures of Wen.
One of ordinary skill in the art would have been motivated to combine in order to obtain the architecture with the best validation accuracy for deployment.
(Wen [sec(s) 1] “With an infinite compute budget, a very simple but na¨ıve approach to architecture search would be to sample tons of random architectures, train and evaluate each one, and then select the architectures with the best validation set accuracies for deployment; … To achieve this, the proposed Neural Predictor uses the following steps to perform an architecture search: (1) Build a predictor by training N random architectures to obtain N (architecture, validation accuracy) pairs. Use this data to train a regressor. (2) Quality prediction using the regression model over a large set of random architectures. Select the K most promising architectures for final validation. (3) Final validation of the top K architectures by training them. Then we select the architecture with the highest validation accuracy to deploy.”)
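For clarity of record, the three-step predictor pipeline quoted from Wen may be sketched as follows. This is a minimal toy illustration only; `sample_architecture`, `train_and_evaluate`, and `fit_regressor` are hypothetical stand-ins and are not code from the cited reference.

```python
import random

def sample_architecture(rng):
    # Encode an architecture as a fixed-length vector of operation choices.
    return tuple(rng.randrange(5) for _ in range(6))

def train_and_evaluate(arch):
    # Placeholder for full training; returns a "validation accuracy" in [0, 1].
    return sum(arch) / (5 * len(arch))

def fit_regressor(pairs):
    # Trivial nearest-neighbor "regressor" over (architecture, accuracy) pairs.
    def predict(arch):
        nearest = min(pairs, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], arch)))
        return nearest[1]
    return predict

rng = random.Random(0)

# Step 1: train N random architectures to obtain (architecture, accuracy) pairs.
N = 20
pairs = [(a, train_and_evaluate(a)) for a in (sample_architecture(rng) for _ in range(N))]
predictor = fit_regressor(pairs)

# Step 2: predict quality over a large pool and keep the K most promising.
K = 3
pool = [sample_architecture(rng) for _ in range(500)]
top_k = sorted(pool, key=predictor, reverse=True)[:K]

# Step 3: fully validate the top K and select the best for deployment.
best = max(top_k, key=train_and_evaluate)
```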
Claim(s) 3-6 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li et al. (A mining pool solution for novel proof-of-neural-architecture consensus) in view of Zhang et al. (Differentiable Neural Architecture Search in Equivalent Space with Exploration Enhancement) and further in view of Liu et al. (DARTS: Differentiable Architecture Search)
Regarding claim 3
The combination of Li, Zhang teaches claim 1.
wherein the step of updating the embedding position based on the gradient direction provided by the performance trainer comprises: (See claim 1)
Zhang further teaches
determining the initial embedding position;
(Zhang [algorithm 1] “Randomly initialize architecture parameter αθ and supernet weights WA(α)” and “Sample batch of Dtrain, decode αθ to get α based on D, get the complementary architecture αc, and update the supernet weights WA(α) based on Eq. (7), and add architecture α into A;” [table(s) 1] “The best single run of our E2NAS (with random seed 0)” [fig(s) 1] [sec(s) 4.1] “The comparison results on NAS-Bench-201 with NAS baselines are demonstrated in Table 1, where we report the statistical results from independent search experiments with different random seeds (The random seeds for all experiments on NAS-Bench-201 are set as {0,1}.).” [sec(s) 4.2] “we consider a variant of GDAS, GDAS-A, which directly samples architectures through argmax during the supernet training.”;)
calculating the gradient direction given by the performance predictor through [backpropagation] algorithm, and
(Zhang [algorithm 1] “Retrain α∗ and get the best performance on the test dataset Dtest” [sec(s) 3] “Our novel approach consists of two key components. First, we develop an exploration enhancement module to overcome the rich-get-richer problem in a differentiable space. Second, we develop a architecture complementation loss function for relieving catastrophic forgetting. More details follow” [sec(s) 3.1] “we encode the computation C that is the final output of the neural network into a continuous representation z. … Furthermore, we know that the graph encoder maps C to z injectively if the aggregation function G and the updating function U in the graph encoder are injective. … Exploration Enhancement After transforming the discrete architecture into continuous space, existing differentiable NAS methods all conduct continuous optimization to update the continuous architecture representation αθ only along the gradient of validation performance based on Eq.2. Such a method would easily get into the rich-get-richer problem. To overcome this problem, we add the novelty into the gradient to enhance exploration to avoid local optima in architecture search, and update the architecture according” [sec(s) 4.3] “In the following, we investigate how exploration enhancement affects the performance of our E2NAS and the necessity of exploration in differentiable architecture search. In our E2NAS, a bigger γ enhances the exploration to avoid local optimal, and a smaller γ guarantees better solutions with higher validation performance.” [sec(s) 4.4] “The performance of architectures by inheriting weights in this curve is getting better with the supernet training, making the assumption in bilevel optimization based differentiable NAS hold true.”;)
obtaining the updated result of the initial embedding position through gradient ascent;
(Zhang [algorithm 1]; [sec(s) 3]; [sec(s) 3.1] “we encode the computation C that is the final output of the neural network into a continuous representation z. … Furthermore, we know that the graph encoder maps C to z injectively if the aggregation function G and the updating function U in the graph encoder are injective. … Exploration Enhancement After transforming the discrete architecture into continuous space, existing differentiable NAS methods all conduct continuous optimization to update the continuous architecture representation αθ only along the gradient of validation performance based on Eq.2. Such a method would easily get into the rich-get-richer problem. To overcome this problem, we add the novelty into the gradient to enhance exploration to avoid local optima in architecture search, and update the architecture according” [sec(s) 4.3] “In the following, we investigate how exploration enhancement affects the performance of our E2NAS and the necessity of exploration in differentiable architecture search. In our E2NAS, a bigger γ enhances the exploration to avoid local optimal, and a smaller γ guarantees better solutions with higher validation performance.” [sec(s) 4.4] “The performance of architectures by inheriting weights in this curve is getting better with the supernet training, making the assumption in bilevel optimization based differentiable NAS hold true.”;)
wherein the updated result is an optimized embedding position that performs better than the initial embedding position.
(Zhang [algorithm 1]; [sec(s) 3]; [sec(s) 3.1] “we encode the computation C that is the final output of the neural network into a continuous representation z. … Furthermore, we know that the graph encoder maps C to z injectively if the aggregation function G and the updating function U in the graph encoder are injective. … Exploration Enhancement After transforming the discrete architecture into continuous space, existing differentiable NAS methods all conduct continuous optimization to update the continuous architecture representation αθ only along the gradient of validation performance based on Eq.2. Such a method would easily get into the rich-get-richer problem. To overcome this problem, we add the novelty into the gradient to enhance exploration to avoid local optima in architecture search, and update the architecture according” [sec(s) 4.3] “In the following, we investigate how exploration enhancement affects the performance of our E2NAS and the necessity of exploration in differentiable architecture search. In our E2NAS, a bigger γ enhances the exploration to avoid local optimal, and a smaller γ guarantees better solutions with higher validation performance.” [sec(s) 4.4] “The performance of architectures by inheriting weights in this curve is getting better with the supernet training, making the assumption in bilevel optimization based differentiable NAS hold true.”;)
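The claimed update loop of claim 3 — take an initial embedding position, obtain a gradient direction from the performance predictor, and update by gradient ascent until a better-performing position is reached — may be illustrated by the following minimal numeric sketch. The quadratic "predictor" and all names below are hypothetical, chosen only so the gradient is available in closed form.

```python
TARGET = [0.7, -0.2]          # optimum of the toy predictor (hypothetical)

def predicted_performance(z):
    # Toy differentiable predictor f(z) = -||z - TARGET||^2.
    return -sum((zi - ti) ** 2 for zi, ti in zip(z, TARGET))

def gradient(z):
    # Closed-form df/dz, i.e. the direction backpropagation would provide.
    return [-2.0 * (zi - ti) for zi, ti in zip(z, TARGET)]

def update_embedding(z0, lr=0.1, steps=100):
    # Gradient *ascent*: move the embedding toward higher predicted performance.
    z = list(z0)
    for _ in range(steps):
        g = gradient(z)
        z = [zi + lr * gi for zi, gi in zip(z, g)]
    return z

z_init = [0.0, 0.0]
z_opt = update_embedding(z_init)
# The updated position scores strictly better than the initial position.
assert predicted_performance(z_opt) > predicted_performance(z_init)
```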
These further teachings of Zhang are combinable with Li for the same rationale as set forth above with respect to claim 1.
However, the combination of Li, Zhang does not appear to explicitly teach:
calculating the gradient direction given by the performance predictor through [backpropagation] algorithm, and
Liu teaches
calculating the gradient direction given by the performance predictor through backpropagation algorithm, and
(Liu [algorithm 1] [sec(s) 2.3] “Applying chain rule to the approximate architecture gradient (equation 6) yields ∇α Lval(w′, α) − ξ ∇²α,w Ltrain(w, α) ∇w′ Lval(w′, α) (7), where w′ = w − ξ∇w Ltrain(w, α) denotes the weights for a one-step forward model. The expression above contains an expensive matrix-vector product in its second term. Fortunately, the complexity can be substantially reduced using the finite difference approximation. Let ε be a small scalar and w± = w ± ε∇w′ Lval(w′, α). Then: ∇²α,w Ltrain(w, α) ∇w′ Lval(w′, α) ≈ [∇α Ltrain(w⁺, α) − ∇α Ltrain(w⁻, α)] / (2ε) (8). Evaluating the finite difference requires only two forward passes for the weights and two backward passes for α, and the complexity is reduced from O(|α||w|) to O(|α| + |w|)” [sec(s) B SEARCH WITH INCREASED DEPTH] “Moreover, searching with a deeper model might require different hyperparameters due to the increased number of layers to back-prop through” [sec(s) Abs] “This paper addresses the scalability challenge of architecture search by formulating the task in a differentiable manner. Unlike conventional approaches of applying evolution or reinforcement learning over a discrete and non-differentiable search space, our method is based on the continuous relaxation of the architecture representation, allowing efficient search of the architecture using gradient descent.”;)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system of Li and Zhang with the backpropagation of Liu.
One of ordinary skill in the art would have been motivated to combine in order to achieve a remarkable efficiency improvement (reducing the cost of architecture discovery to a few GPU days), which is attributed to the use of gradient-based optimization as opposed to non-differentiable search techniques.
(Liu [sec(s) 1] “• Through extensive experiments on image classification and language modeling tasks we show that gradient-based architecture search achieves highly competitive results on CIFAR-10 and outperforms the state of the art on PTB. This is a very interesting result, considering that so far the best architecture search methods used non-differentiable search techniques, e.g. based on RL (Zoph et al., 2018) or evolution (Real et al., 2018; Liu et al., 2018b). • We achieve remarkable efficiency improvement (reducing the cost of architecture discovery to a few GPU days), which we attribute to the use of gradient-based optimization as opposed to non-differentiable search techniques.”)
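Liu's finite-difference approximation quoted above (equation 8) can be checked on a toy scalar loss whose mixed second derivative is known in closed form. This is a sketch under stated assumptions: the loss below is illustrative only and is not Liu's actual training loss.

```python
# Toy training loss L_train(w, a) = 0.5*w**2 + a*w, whose exact mixed
# second derivative d^2 L / (da dw) equals 1 everywhere.
def grad_w(w, a):
    # dL_train/dw
    return w + a

def grad_a(w, a):
    # dL_train/da
    return w

def hvp_finite_difference(w, a, v, eps=1e-4):
    # Approximates (d^2 L / da dw) * v with two extra gradient evaluations
    # at w +/- eps*v, instead of forming the matrix-vector product directly.
    w_plus, w_minus = w + eps * v, w - eps * v
    return (grad_a(w_plus, a) - grad_a(w_minus, a)) / (2 * eps)

v = 3.0   # stands in for the validation gradient at the one-step weights
approx = hvp_finite_difference(w=0.5, a=-0.2, v=v)
exact = 1.0 * v   # mixed second derivative of the toy loss is exactly 1
```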
Regarding claim 4
The combination of Li, Zhang, Liu teaches claim 3.
Li further teaches
the identifier is assigned by the second node or generated automatically by the third node.
(Li [sec(s) II] “In the initialization phase, all miners confirm the target task and evaluate the training setup, such as the target training epochs and the size of the dataset. In the training phase, miners train the confirmed target task and commit their model before the training phase ends. Here the miners submit the hash of their deep learning model, training results, and miners ID. The task publishers release training dataset and deep learning training source code. [4], [5] In the validation phase, the task publisher releases the test dataset to miners and full nodes, and each miner submits (1) the block header and the block that contains information describing the trained model on top of existing attributes, (2) the trained model, and (3) the accuracy of the trained model, to full nodes. The full nodes validate the submitted models [4], [5].” [sec(s) III] “Mining pool participants: 1) the pool manager: the pool manager will split the searching space of the selected NAS task into multiple subspaces and assign the searching spaces to miners. 2) the miners: In the mining pool design, we separate miners into strong miners and weak miners. For strong miners, they can finish the search task in a given subspace. For weak miners, the search task cannot be finished due to the limitation of network bandwidths, hardware, etc..” [sec(s) Abs] “We are the first to demonstrate a mining pool solution for novel consensuses based on deep learning. This work adopts from exist Proof-of-Deep-Learning (PoDL) as the consensus and Neural Architecture Search (NAS) as the workload. The mining pool manager partitions the full searching space into subspaces and all miners contributes to the NAS task in the assigned tasks. The strong miners are assigned for exploration and the weak miners are assigned for exploitation.”;)
Zhang further teaches
wherein the initial embedding position is the embedding position of the initial neural architecture in the vector space, which is selected from the neural architecture assigned by the first node or a randomly generated neural architecture;
(Zhang [algorithm 1] “Randomly initialize architecture parameter αθ and supernet weights WA(α)” [sec(s) Abs] “this paper utilizes a variational graph autoencoder to injectively transform the discrete architecture space into an equivalently continuous latent space, to resolve the incongruence.” [sec(s) 4.1] “Our E2NAS with random seed 0 obtains a 94.22%, 73.13%, and 46.48% on CIFAR-10, CIFAR-100, and ImageNet, respectively, which are almost equal to the optimal point in NAS-Bench-201 dataset.” [sec(s) 1] “For the incongruence in the relaxation transformation of differentiable NAS, we utilize a variational graph autoencoder with an asynchronous message passing scheme to transform the discrete architectures into an equivalent continuous space injectively. Because of the injectiveness, we could equivalently perform optimization in the continuous latent space with a solid theoretical foundation [33, 38].” [sec(s) 3.1] “we adopt an asynchronous message passing scheme based graph neural network (GNN) to encode the neural architecture into an injective space. Different from encoding the graph in many GNNs, we encode the computation C that is the final output of the neural network into a continuous representation z. … While it is intractable to measure the novelty of architectures in the discrete space, we could calculate the probability density function of αiθ drawn from the distribution formulated by continuous architectures αθ in the archive A, which is also called as probabilistic novelty detection in latent space.”; Note that Li teaches “first node.”)
wherein the randomly generated neural architecture is generated randomly based on the identifier as the random seed and the setting of the search space; and
(Zhang [algorithm 1] “Randomly initialize architecture parameter αθ and supernet weights WA(α)” [sec(s) Abs] “this paper utilizes a variational graph autoencoder to injectively transform the discrete architecture space into an equivalently continuous latent space, to resolve the incongruence.” [sec(s) 4.1] “Our E2NAS with random seed 0 obtains a 94.22%, 73.13%, and 46.48% on CIFAR-10, CIFAR-100, and ImageNet, respectively, which are almost equal to the optimal point in NAS-Bench-201 dataset.” [sec(s) 1] “For the incongruence in the relaxation transformation of differentiable NAS, we utilize a variational graph autoencoder with an asynchronous message passing scheme to transform the discrete architectures into an equivalent continuous space injectively. Because of the injectiveness, we could equivalently perform optimization in the continuous latent space with a solid theoretical foundation [33, 38].” [sec(s) 3.1] “As mentioned before, existing differentiable NAS methods usually adopt a simple continuous relaxation [22] to transform the discrete neural architectures (usually represented as DAGs) into a continuous space. As they could hardly guarantee that this transformation is injective, they suffer the problem of incongruence. For this problem, we adopt an asynchronous message passing scheme based graph neural network (GNN) to encode the neural architecture into an injective space. Different from encoding the graph in many GNNs, we encode the computation C that is the final output of the neural network into a continuous representation z. … While it is intractable to measure the novelty of architectures in the discrete space, we could calculate the probability density function of αiθ drawn from the distribution formulated by continuous architectures αθ in the archive A, which is also called as probabilistic novelty detection in latent space.”;)
These further teachings of Zhang are combinable with Li and Liu for the same rationale as set forth above with respect to claim 1.
Regarding claim 5
The combination of Li, Zhang, Liu teaches claim 3.
Zhang further teaches
wherein the initial embedding position is the embedding position according to the top N preferred neural architectures in the vector space;
(Zhang [algorithm 1] “Randomly initialize architecture parameter αθ and supernet weights WA(α)” and “Retrain α∗ and get the best performance” [sec(s) Abs] “this paper utilizes a variational graph autoencoder to injectively transform the discrete architecture space into an equivalently continuous latent space, to resolve the incongruence.” [sec(s) 4.1] “Our E2NAS with random seed 0 obtains a 94.22%, 73.13%, and 46.48% on CIFAR-10, CIFAR-100, and ImageNet, respectively, which are almost equal to the optimal point in NAS-Bench-201 dataset.” [sec(s) 1] “architectures with better performance in the early stage would be trained more frequently, and the updated weights further make these architectures having a higher probability of being sampled, which easily leads to a local optimal.” [sec(s) 3.1] “we adopt an asynchronous message passing scheme based graph neural network (GNN) to encode the neural architecture into an injective space. Different from encoding the graph in many GNNs, we encode the computation C that is the final output of the neural network into a continuous representation z. … While it is intractable to measure the novelty of architectures in the discrete space, we could calculate the probability density function of αiθ drawn from the distribution formulated by continuous architectures αθ in the archive A, which is also called as probabilistic novelty detection in latent space.” [sec(s) 4.3] “we investigate how exploration enhancement affects the performance of our E2NAS and the necessity of exploration in differentiable architecture search. In our E2NAS, a bigger γ enhances the exploration to avoid local optimal, and a smaller γ guarantees better solutions with higher validation performance.” [sec(s) 4.4] “The performance of architectures by inheriting weights in this curve is getting better with the supernet training, making the assumption in bilevel optimization based differentiable NAS hold true.”;)
wherein N is a positive integer, and the preferred neural architectures are sorted according to the performance evaluation results, and neural architectures with better performance have lower numbers.
(Zhang [algorithm 1] “Randomly initialize architecture parameter αθ and supernet weights WA(α)” and “Retrain α∗ and get the best performance” [sec(s) Abs] “this paper utilizes a variational graph autoencoder to injectively transform the discrete architecture space into an equivalently continuous latent space, to resolve the incongruence.” [sec(s) 4.1] “Our E2NAS with random seed 0 obtains a 94.22%, 73.13%, and 46.48% on CIFAR-10, CIFAR-100, and ImageNet, respectively, which are almost equal to the optimal point in NAS-Bench-201 dataset.” [sec(s) 1] “architectures with better performance in the early stage would be trained more frequently, and the updated weights further make these architectures having a higher probability of being sampled, which easily leads to a local optimal.” [sec(s) 3.1] “we adopt an asynchronous message passing scheme based graph neural network (GNN) to encode the neural architecture into an injective space. Different from encoding the graph in many GNNs, we encode the computation C that is the final output of the neural network into a continuous representation z. … While it is intractable to measure the novelty of architectures in the discrete space, we could calculate the probability density function of αiθ drawn from the distribution formulated by continuous architectures αθ in the archive A, which is also called as probabilistic novelty detection in latent space.” [sec(s) 4.3] “we investigate how exploration enhancement affects the performance of our E2NAS and the necessity of exploration in differentiable architecture search. In our E2NAS, a bigger γ enhances the exploration to avoid local optimal, and a smaller γ guarantees better solutions with higher validation performance.” [sec(s) 4.4] “The performance of architectures by inheriting weights in this curve is getting better with the supernet training, making the assumption in bilevel optimization based differentiable NAS hold true.”;)
These further teachings of Zhang are combinable with Li and Liu for the same rationale as set forth above with respect to claim 1.
Regarding claim 6
The combination of Li, Zhang, Liu teaches claim 3.
wherein the step of training to obtain a performance predictor comprises: (See claim 1)
Li further teaches
obtaining k sets of training data to form a predictor training dataset;
(Li [sec(s) II] “In the initialization phase, all miners confirm the target task and evaluate the training setup, such as the target training epochs and the size of the dataset. In the training phase, miners train the confirmed target task and commit their model before the training phase ends. Here the miners submit the hash of their deep learning model, training results, and miners ID. The task publishers release training dataset and deep learning training source code. [4], [5] In the validation phase, the task publisher releases the test dataset to miners and full nodes, and each miner submits (1) the block header and the block that contains information describing the trained model on top of existing attributes, (2) the trained model, and (3) the accuracy of the trained model, to full nodes. The full nodes validate the submitted models [4], [5].” [sec(s) III] “Exploration and exploitation: The searching space will only be sent to strong miners. Once a solution is confirmed outperforms the current best results, the strong miners will share the corresponding hyperparameters with weak miners. Weak miners will only exploit the confirmed architecture.” [sec(s) IV] “The NAS [13] is adopted to find a convolutional neural networks architecture for classification tasks under a certain hardware constrain. We use CIFAR-10 dataset [20]. All experiments were deployed on the workstation with Intel i7-9900K CPU @ 3.60GHz, 32Gb RAM, GTX 1080 Ti.”;)
training the [performance] predictor based on the predictor training dataset;
(Li [sec(s) II] “In the initialization phase, all miners confirm the target task and evaluate the training setup, such as the target training epochs and the size of the dataset. In the training phase, miners train the confirmed target task and commit their model before the training phase ends. Here the miners submit the hash of their deep learning model, training results, and miners ID. The task publishers release training dataset and deep learning training source code. [4], [5] In the validation phase, the task publisher releases the test dataset to miners and full nodes, and each miner submits (1) the block header and the block that contains information describing the trained model on top of existing attributes, (2) the trained model, and (3) the accuracy of the trained model, to full nodes. The full nodes validate the submitted models [4], [5].” [sec(s) III] “Exploration and exploitation: The searching space will only be sent to strong miners. Once a solution is confirmed outperforms the current best results, the strong miners will share the corresponding hyperparameters with weak miners. Weak miners will only exploit the confirmed architecture.” [sec(s) IV] “The NAS [13] is adopted to find a convolutional neural networks architecture for classification tasks under a certain hardware constrain. We use CIFAR-10 dataset [20]. All experiments were deployed on the workstation with Intel i7-9900K CPU @ 3.60GHz, 32Gb RAM, GTX 1080 Ti.”;)
Zhang further teaches
training the performance predictor based on the predictor training dataset;
(Zhang [algorithm 1] “Retrain α∗ and get the best performance on the test dataset Dtest” [sec(s) 3.1] “we encode the computation C that is the final output of the neural network into a continuous representation z. … Furthermore, we know that the graph encoder maps C to z injectively if the aggregation function G and the updating function U in the graph encoder are injective. … Exploration Enhancement After transforming the discrete architecture into continuous space, existing differentiable NAS methods all conduct continuous optimization to update the continuous architecture representation αθ only along the gradient of validation performance based on Eq.2. Such a method would easily get into the rich-get-richer problem. To overcome this problem, we add the novelty into the gradient to enhance exploration to avoid local optima in architecture search, and update the architecture according” [sec(s) 3.2] “As described in Section 2, the differentiable NAS is built upon One-Shot NAS, which trains numerous architectures with partially shared weights on a single dataset. Without losing generality, this paper also considers the typical scenario that only one architecture (a single path) in the supernet is trained in each step of architecture search. Now we simply define each step of supernet training, argminLtrain(αiθ,WA) = argminLtrain(WA(αi)), as a task, and the supernet is trained on multiple sequential tasks through a lifelong learning setting [8, 30] or a online multi-task learning setting [10].” [sec(s) 4.3] “In the following, we investigate how exploration enhancement affects the performance of our E2NAS and the necessity of exploration in differentiable architecture search. In our E2NAS, a bigger γ enhances the exploration to avoid local optimal, and a smaller γ guarantees better solutions with higher validation performance.” [sec(s) 4.4] “The performance of architectures by inheriting weights in this curve is getting better with the supernet training, making the assumption in bilevel optimization based differentiable NAS hold true.”;)
wherein each set of training data comprises: candidate neural architecture and its corresponding performance, and the candidate neural architecture is sampled from a peripheral area of the initial embedding position.
(Zhang [fig(s) 1] “Trajectory of validation accuracy of sampled architecture during supernet training on DARTS search space” [algorithm 1] “Retrain α∗ and get the best performance on the test dataset Dtest” [sec(s) 3.1] “we encode the computation C that is the final output of the neural network into a continuous representation z. … Furthermore, we know that the graph encoder maps C to z injectively if the aggregation function G and the updating function U in the graph encoder are injective. … Exploration Enhancement After transforming the discrete architecture into continuous space, existing differentiable NAS methods all conduct continuous optimization to update the continuous architecture representation αθ only along the gradient of validation performance based on Eq.2. Such a method would easily get into the rich-get-richer problem. To overcome this problem, we add the novelty into the gradient to enhance exploration to avoid local optima in architecture search, and update the architecture according … every architecture αi could be sampled with noise ξi through αi = D(αiθ) + ξi, where αiθ ∈ Rn.” [sec(s) 4.3] “In the following, we investigate how exploration enhancement affects the performance of our E2NAS and the necessity of exploration in differentiable architecture search. In our E2NAS, a bigger γ enhances the exploration to avoid local optimal, and a smaller γ guarantees better solutions with higher validation performance.” [sec(s) 4.4] “Figure 1 (b) tracks the validation accuracy of the sampled architectures during the supernet training for differentiable One-Shot NAS methods on a common convolutional search space [11, 22]. … The performance of architectures by inheriting weights in this curve is getting better with the supernet training, making the assumption in bilevel optimization based differentiable NAS hold true.”;)
These further teachings of Zhang are combinable with Li and Liu for the same rationale as set forth above with respect to claim 1.
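The data-collection step of claim 6 — sample k candidate architectures from a peripheral area of the initial embedding position, evaluate each, and train the performance predictor on the resulting (architecture, performance) pairs (cf. Zhang's sampling with noise, αi = D(αiθ) + ξi) — may be illustrated as follows. This is a toy sketch; `evaluate` and the least-squares fit are hypothetical stand-ins, not code from the cited references.

```python
import random

rng = random.Random(0)

def evaluate(z):
    # Placeholder performance measure: peaks at z = 1.0.
    return -(z - 1.0) ** 2

z_init = 0.0
k = 16
# Sample k candidates from a noisy neighborhood of the initial embedding.
candidates = [z_init + rng.gauss(0.0, 0.5) for _ in range(k)]
train_data = [(z, evaluate(z)) for z in candidates]  # predictor training set

# Fit a 1-D least-squares line as a (deliberately crude) performance predictor.
n = len(train_data)
mx = sum(z for z, _ in train_data) / n
my = sum(y for _, y in train_data) / n
slope = (sum((z - mx) * (y - my) for z, y in train_data)
         / sum((z - mx) ** 2 for z, _ in train_data))
intercept = my - slope * mx

def predict(z):
    # Predicted performance at embedding position z.
    return slope * z + intercept
```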
Claim(s) 7-8 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li et al. (A mining pool solution for novel proof-of-neural-architecture consensus) in view of Zhang et al. (Differentiable Neural Architecture Search in Equivalent Space with Exploration Enhancement) and further in view of Liu et al. (Merkle Tree: A Fundamental Component of Blockchains, hereinafter Liu2021)
Regarding claim 7
The combination of Li, Zhang teaches claim 1.
Li teaches
wherein the third node comprises:
a search sub-node for determining candidate neural architectures for work sub-node;
(Li [sec(s) II] “In the initialization phase, all miners confirm the target task and evaluate the training setup, such as the target training epochs and the size of the dataset. In the training phase, miners train the confirmed target task and commit their model before the training phase ends. Here the miners submit the hash of their deep learning model, training results, and miners ID. The task publishers release training dataset and deep learning training source code. [4], [5] In the validation phase, the task publisher releases the test dataset to miners and full nodes, and each miner submits (1) the block header and the block that contains information describing the trained model on top of existing attributes, (2) the trained model, and (3) the accuracy of the trained model, to full nodes. The full nodes validate the submitted models [4], [5].” [sec(s) III] “Mining pool participants: 1) the pool manager: the pool manager will split the searching space of the selected NAS task into multiple subspaces and assign the searching spaces to miners. 2) the miners: In the mining pool design, we separate miners into strong miners and weak miners. For strong miners, they can finish the search task in a given subspace. For weak miners, the search task cannot be finished due to the limitation of network bandwidths, hardware, etc..” [sec(s) Abs] “We are the first to demonstrate a mining pool solution for novel consensuses based on deep learning. This work adopts from exist Proof-of-Deep-Learning (PoDL) as the consensus and Neural Architecture Search (NAS) as the workload. The mining pool manager partitions the full searching space into subspaces and all miners contributes to the NAS task in the assigned tasks. The strong miners are assigned for exploration and the weak miners are assigned for exploitation.”;)
a work sub-node for determining the [performance] of the candidate neural architectures, training a [performance] predictor, and [updating the embedding position]; and
(Li [sec(s) II] “In the initialization phase, all miners confirm the target task and evaluate the training setup, such as the target training epochs and the size of the dataset. In the training phase, miners train the confirmed target task and commit their model before the training phase ends. Here the miners submit the hash of their deep learning model, training results, and miners ID. The task publishers release training dataset and deep learning training source code. [4], [5] In the validation phase, the task publisher releases the test dataset to miners and full nodes, and each miner submits (1) the block header and the block that contains information describing the trained model on top of existing attributes, (2) the trained model, and (3) the accuracy of the trained model, to full nodes. The full nodes validate the submitted models [4], [5].” [sec(s) III] “Mining pool participants: 1) the pool manager: the pool manager will split the searching space of the selected NAS task into multiple subspaces and assign the searching spaces to miners. 2) the miners: In the mining pool design, we separate miners into strong miners and weak miners. For strong miners, they can finish the search task in a given subspace. For weak miners, the search task cannot be finished due to the limitation of network bandwidths, hardware, etc..” [sec(s) Abs] “We are the first to demonstrate a mining pool solution for novel consensuses based on deep learning. This work adopts from exist Proof-of-Deep-Learning (PoDL) as the consensus and Neural Architecture Search (NAS) as the workload. The mining pool manager partitions the full searching space into subspaces and all miners contributes to the NAS task in the assigned tasks. The strong miners are assigned for exploration and the weak miners are assigned for exploitation.”;)
a validation sub-node for performing a verification process based on [a Merkle tree] in a non-trusted environment to verify the data information broadcasted by other third nodes.
(Li [sec(s) II] “In the initialization phase, all miners confirm the target task and evaluate the training setup, such as the target training epochs and the size of the dataset. In the training phase, miners train the confirmed target task and commit their model before the training phase ends. Here the miners submit the hash of their deep learning model, training results, and miners ID. The task publishers release training dataset and deep learning training source code. [4], [5] In the validation phase, the task publisher releases the test dataset to miners and full nodes, and each miner submits (1) the block header and the block that contains information describing the trained model on top of existing attributes, (2) the trained model, and (3) the accuracy of the trained model, to full nodes. The full nodes validate the submitted models [4], [5].” [sec(s) III] “Mining pool participants: 1) the pool manager: the pool manager will split the searching space of the selected NAS task into multiple subspaces and assign the searching spaces to miners. 2) the miners: In the mining pool design, we separate miners into strong miners and weak miners. For strong miners, they can finish the search task in a given subspace. For weak miners, the search task cannot be finished due to the limitation of network bandwidths, hardware, etc..” [sec(s) Abs] “We are the first to demonstrate a mining pool solution for novel consensuses based on deep learning. This work adopts from exist Proof-of-Deep-Learning (PoDL) as the consensus and Neural Architecture Search (NAS) as the workload. The mining pool manager partitions the full searching space into subspaces and all miners contributes to the NAS task in the assigned tasks. The strong miners are assigned for exploration and the weak miners are assigned for exploitation.”;)
Zhang further teaches
a work sub-node for determining the performance of the candidate neural architectures, training a performance predictor, and updating the embedding position; and
(Zhang [algorithm 1] “Retrain α∗ and get the best performance on the test dataset Dtest” and “Sample batch of Dvalid, and update αθ based on Eq. (3);” [sec(s) 3.1] “we encode the computation C that is the final output of the neural network into a continuous representation z. … Furthermore, we know that the graph encoder maps C to z injectively if the aggregation function G and the updating function U in the graph encoder are injective. … Exploration Enhancement After transforming the discrete architecture into continuous space, existing differentiable NAS methods all conduct continuous optimization to update the continuous architecture representation αθ only along the gradient of validation performance based on Eq.2. Such a method would easily get into the rich-get-richer problem. To overcome this problem, we add the novelty into the gradient to enhance exploration to avoid local optima in architecture search, and update the architecture according” [sec(s) 4.3] “In the following, we investigate how exploration enhancement affects the performance of our E2NAS and the necessity of exploration in differentiable architecture search. In our E2NAS, a bigger γ enhances the exploration to avoid local optimal, and a smaller γ guarantees better solutions with higher validation performance.” [sec(s) 4.4] “The performance of architectures by inheriting weights in this curve is getting better with the supernet training, making the assumption in bilevel optimization based differentiable NAS hold true.”;)
Li is combinable with Zhang for the same rationale as set forth above with respect to claim 1.
However, the combination of Li, Zhang does not appear to explicitly teach:
a validation sub-node for performing a verification process based on [a Merkle tree] in a non-trusted environment to verify the data information broadcasted by other third nodes.
Liu2021 teaches
a validation sub-node for performing a verification process based on a Merkle tree in a non-trusted environment to verify the data information broadcasted by other third nodes.
(Liu2021 [sec(s) Abs] “With the increasing popularity of blockchain technology, Merkle trees (or hash trees) are playing significant roles in verifying and retrieving data for the implementation of blockchain for allowing efficient and secure verification of the contents of large data structures.” [sec(s) IV] “As is discussed above, it consumes plenty of time to verify a file at a time, especially when the file is divided into several parts, stored in distinct places. Instead of transmitting the entire Merkle tree, the server merely sends the block headers and adds some relevant nodes in Merkle trees in the process of verification or authentication. As a result of that, the efficiency and superiority of Merkle trees ensure its application in verification, like Simple Verification Payment in Bitcoin. Since Merkle trees are typically implemented as binary trees, only the computational complexity of binary form Merkle trees will be included in this section.” [sec(s) VI] “the overall investigative direction of Merkle trees is not just blockchain. It is also widely applied in the P2P network, credential authentication, and verification.”;)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system of Li, Zhang with the Merkle tree of Liu2021.
One of ordinary skill in the art would have been motivated to combine in order to provide more flexible manipulation for developers to create the blockchain based on the unique characteristic of a hash tree measured with the SHA-256 algorithm.
(Liu2021 [sec(s) VI] “The increasingly wide application of Merkle trees contributes to a promising prospect for blockchain. In summary, a Merkle tree is a hash tree measured with the SHA-256 algorithm. This unique characteristic leads to a more flexible manipulation for developers to create the blockchain. In the blockchain, characterized by decentralization, transactions appear to be transparent from the users.”)
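For illustration only (not part of the prosecution record), the Merkle-tree verification described by Liu2021, in which a verifier recomputes a root from a leaf and a short sibling path rather than receiving the entire tree, can be sketched as follows. This is a minimal, generic sketch using SHA-256; all function names are hypothetical and do not reflect any implementation in the cited references.

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Compute the Merkle root over a list of leaf byte strings."""
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:  # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves, index):
    """Collect the sibling hashes needed to verify leaves[index]."""
    level = [sha256(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        sibling = index ^ 1
        # record the sibling hash and whether it sits on the left
        proof.append((level[sibling], sibling < index))
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify(leaf, proof, root):
    """Recompute the root from a single leaf and its sibling path."""
    h = sha256(leaf)
    for sibling, is_left in proof:
        h = sha256(sibling + h) if is_left else sha256(h + sibling)
    return h == root
```

The verifier only needs the claimed leaf, a logarithmic number of sibling hashes, and the published root, which is the efficiency property Liu2021 attributes to Merkle trees in blockchain verification.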
Regarding claim 8
The combination of Li, Zhang, Liu2021 teaches claim 7.
Li further teaches
wherein the work sub-node is further configured to test the performance of the candidate neural architectures through the dataset or obtain the performance of the candidate neural architectures by collecting data information from other third nodes.
(Li [sec(s) II] “PoDL: The consensus based on the deep learning algorithm divides each block time into two or more interval phases [4] [6]. In general, each block includes the initialization phase, the training phase, and the validation phase. In the initialization phase, all miners confirm the target task and evaluate the training setup, such as the target training epochs and the size of the dataset. In the training phase, miners train the confirmed target task and commit their model before the training phase ends. Here the miners submit the hash of their deep learning model, training results, and miners ID. The task publishers release training dataset and deep learning training source code. [4], [5] In the validation phase, the task publisher releases the test dataset to miners and full nodes, and each miner submits (1) the block header and the block that contains information describing the trained model on top of existing attributes, (2) the trained model, and (3) the accuracy of the trained model, to full nodes. The full nodes validate the submitted models [4], [5].” [sec(s) I] “Fig. 1 shows the scenario when two miners search the neural network in two different subspaces. In the mining pool based on the PoDL consensus, each participant will suffer more from network bandwidth as the bottleneck for common distributed deep learning training. In this design, the mining pool will split the NAS searching space into multiple subspaces. Thus, the searching tasks will become relatively more independent.” [sec(s) Abs] “This work adopts from exist Proof-of-Deep-Learning (PoDL) as the consensus and Neural Architecture Search (NAS) as the workload. The mining pool manager partitions the full searching space into subspaces and all miners contributes to the NAS task in the assigned tasks.”;)
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SEHWAN KIM whose telephone number is (571)270-7409. The examiner can normally be reached Mon - Fri 9:00 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael J Huntley can be reached on (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SEHWAN KIM/Examiner, Art Unit 2129 3/8/2026