Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Specification
Applicant is reminded of the proper language and format for an abstract of the disclosure. In particular, the abstract should avoid using phrases which can be implied, such as "The disclosure concerns," "The disclosure defined by this invention," "The disclosure describes," etc.
Examiner respectfully suggests amending the Abstract to read similar to: "A method and an apparatus for generating a neural network, in which automatic searching for a network structure satisfying a preset constraint is enabled in a search space of network structures."
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-11 and 14-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Zhou et al., US Pre-Grant Publication No. 2019/0354837 (hereafter Zhou).
Regarding claim 1 and analogous claims 14 and 17:
Zhou teaches:
“A method for generating a neural network, comprising”: Zhou, paragraph 0048, “FIG. 2 depicts a general methodology that may be employed by a RENA framework embodiment [A method for generating a neural network], according to embodiments of the present disclosure”; Zhou, paragraph 0022, “In the following description, for purposes of explanation, specific details are set forth in order to provide an understanding of the disclosure. It will be apparent, however, to one skilled in the art that the disclosure can be practiced without these details. Furthermore, one skilled in the art will recognize that embodiments of the present disclosure, described below, may be implemented in a variety of ways, such as a process, an apparatus, a system, a device, or a method on a tangible computer-readable medium.”
“training a plurality of neural networks for a plurality of performance parameters to obtain a plurality of parameter values for each of the plurality of performance parameters”: Zhou, paragraph 0048, “In one or more embodiments, an initial neural network architecture configuration is mapped (205) into a representation, such as using a lookup table. In one or more embodiments, a policy network converts (210) that initial neural network architecture configuration representation into a network embedding. Then, in embodiments, the policy network uses (215) that network embedding to automatically generate adaptations to the neural network architecture configuration [a plurality of neural networks]. In one or more embodiments, the adapted neural network is trained [training] (220) to convergence, and the trained adapted neural network architecture may be evaluated (225) based upon one or more metrics (e.g., accuracy, memory footprint, power consumption, inference latency, etc.) [to obtain a plurality of parameter values for each of the plurality of performance parameters].”
“training a plurality of neural network predictors based on the parameter values and the neural networks”: Zhou, paragraph 0047, “In one or more embodiments, a value network 140 takes in network embedding of the generated target network 145 and data distributions to approximate the reward by ascertain metrics, such as network accuracy 150 and training time 155--although other metrics may also be determined. In one or more embodiments, the value network may predict target network accuracy and training time without actually running the target network till convergence. In one or more embodiments, both the accuracy network 150 and the training time network 155 are trainable neural networks that may be pre-trained or trained jointly with the policy network [training a plurality of neural network predictors based on the parameter values and the neural networks].”
“determining a target neural network using the trained neural network predictors”: Zhou, Fig. 1,
[showing Reward Engine 160 using trained neural network predictors Accuracy Network 150 and Training Time Network 155]; Zhou, paragraph 0047, “In one or more embodiments, the final reward engine 160 sets weights to various metrics, such as network accuracy, model size, and training time, which may be set according to user specification. The configurable reward engine 160 enables finding neural architectures with various resource constraints [determining a target neural network using the trained neural network predictors], such as memory size and GPU time. In one or more embodiments, a policy gradient 165 is applied to train the policy network.”
Regarding claim 2 and analogous claims 15 and 18:
Zhou teaches “the method according to claim 1.”
Zhou further teaches “wherein the neural network predictors comprise a main predictor and an auxiliary predictor for predicting different ones of the plurality of performance parameters for the neural networks, respectively”: Zhou, paragraph 0046, “In one or more embodiments, a value network 140 takes in network embedding of the generated target network and data distributions to approximate the reward by ascertain metrics, such as network accuracy 150 and training time 155--although other metrics may also be determined. In one or more embodiments, the value network may predict target network accuracy and training time without actually running the target network till convergence. In one or more embodiments, both the accuracy network 150 and the training time network 155 [comprise a main predictor and an auxiliary predictor for predicting different ones of the plurality of performance parameters for the neural networks, respectively] are trainable neural networks that may be pre-trained or trained jointly with the policy network.”
Regarding claim 3 and analogous claims 16 and 19:
Zhou teaches “the method according to claim 1.”
Zhou further teaches “determining a set of network structures, each network structure in the set of network structures characterizing a neural network, wherein training the plurality of neural networks comprises: selecting a plurality of network structures characterizing the plurality of neural networks from the set of network structures”: Zhou, paragraph 0047, “In one or more embodiments, the final reward engine 160 sets weights to various metrics, such as network accuracy, model size, and training time, which may be set according to user specification. The configurable reward engine 160 enables finding neural architectures [selecting a plurality of network structures characterizing the plurality of neural networks from the set of network structures] with various resource constraints, such as memory size and GPU time. In one or more embodiments, a policy gradient 165 is applied to train the policy network”; Zhou, paragraph 0048, “In one or more embodiments, an initial neural network architecture configuration is mapped (205) into a representation, such as using a lookup table. In one or more embodiments, a policy network converts (210) that initial neural network architecture configuration representation into a network embedding. Then, in embodiments, the policy network uses (215) that network embedding to automatically generate adaptations to the neural network architecture configuration [determining a set of network structures, each network structure in the set of network structures characterizing a neural network]. In one or more embodiments, the adapted neural network is trained (220) to convergence [training the plurality of neural networks], and the trained adapted neural network architecture may be evaluated (225) based upon one or more metrics (e.g., accuracy, memory footprint, power consumption, inference latency, etc.).”
Regarding claim 4 and analogous claim 20:
Zhou teaches “the method according to claim 3.”
Zhou further teaches “wherein the set of network structures comprises a network structure represented by a directed acyclic graph, wherein each node of the directed acyclic graph represents an operation, wherein each edge of the directed acyclic graph represents a connection relationship between two corresponding nodes of the directed acyclic graph”: Zhou, Fig. 8,
[showing a directed acyclic graph representing a network structure, where each node, e.g., “Convolution + RELU,” represents an operation and each of the edges represents a connection relationship between two corresponding nodes].
Regarding claim 5:
Zhou teaches “the method according to claim 4.”
Zhou further teaches “wherein the set of network structures further comprises a network structure represented by a one-dimensional vector”: Zhou, paragraph 0060, “FIG. 5 graphically shows an alternative depiction of an embedding network 500, where a layer embedding network 505 takes a layer description and maps layer features into multiple lookup tables, according to embodiments of the present disclosure. In one or more embodiments, lookup tables (e.g., lookup tables 510-x) transform the discrete feature space into trainable feature vectors. An LSTM network takes layer feature vectors (e.g., 515-x) and generates a layer embedding 520 [a network structure represented by a one-dimensional vector]. After multiple layer embedding have been produced, a network embedding LSTM network 525 processes the sequential information in these layer embeddings and generates a network embedding 535. In one or more embodiments, this network embedding 535 is used as by the policy network and by a value network.”
Regarding claim 6:
Zhou teaches “the method according to claim 3.”
Zhou further teaches “wherein determining the target neural network using
the trained neural network predictors comprises: performing multiple iterations using the trained neural network predictors to obtain a group of network structures, comprising, for each iteration:
determining a plurality of gradient structures based on the network structure obtained in a previous iteration using the trained neural network predictors, and obtaining a network structure for the iteration based on a network structure obtained in the previous iteration and the gradient structures”: Zhou, paragraph 0076, “In one or more embodiments, the policy network generates (1205) a batch of actions at which produce a series of child networks, which may be considered in evolutionary branches [performing multiple iterations using the trained neural network predictors] (e.g., branch 1125 in FIG. 11). In one or more embodiments, the initial network architecture (e.g., Arch. NN[ilo,x 1115) for the start of a branch may be the same for two or more branches ( e.g., it may be replicated for each branch from a single input---e.g., an initial network architecture configuration, or the best network architecture configuration from one or more prior episodes). Or, in one or more embodiments, it may be different for two or more branches. For example, in one or more embodiments, the starting network architecture for a branch may be: varied (e.g., randomly varied) from an initial architecture input 1110 (particularly, if this is the first episode); the N best network architectures [obtain a group of network structures] from one or more prior episodes; a set of N architecture randomly selected from the best Y network architectures from one or more prior episodes may be used [obtaining a network structure for the iteration based on a network structure obtained in the previous iteration and the gradient structures], etc. In one or more embodiments, at each step, the child networks are trained (1210) until convergence and a combination of performance and resource use are used (1215) as an immediate reward, as given in Eq. 3 (see also 1115 in FIG. 11). Rewards of a full episode (e.g., episode 1105 in FIG. 
11) may be accumulated to train the policy network using the policy gradient to get an updated policy network [determining a plurality of gradient structures based on the network structure obtained in a previous iteration using the trained neural network predictors] (e.g., updated policy network 1120).”
Regarding claim 7:
Zhou teaches “the method according to claim 6.”
Zhou further teaches “wherein determining the target neural network using the trained neural network predictors further comprises: selecting a network structure characterizing the target neural network from the group of network structures according to a predetermined rule”: Zhou, paragraph 0079, “To find neural architectures that meet multiple resource constraints, a reward based on the model performance may be penalized according to the extent of violating the constraints. Although a fixed hard penalty may be effective for some constraints, it may be challenging for the controller to learn from highly sparse rewards under tight resource constraints. Therefore, in one or more embodiments, a soft continuous penalization method may be used to enable finding architectures with high performance while still meeting all resource constraints [selecting a network structure characterizing the target neural network from the group of network structures according to a predetermined rule].”
Regarding claim 8:
Zhou teaches “the method according to claim 3.”
Zhou further teaches:
“wherein determining the target neural network using the trained neural network predictors comprises: performing multiple iterations to iteratively train the neural network predictors, comprising, for each iteration”: Zhou, paragraph 0077, “In one or more embodiments, the updated policy network is used for the next episode. The number of episodes may be user-selected or may be based upon one or more stop conditions ( e.g., runtime of RENA embodiment, number of iterations, convergence ( or difference between iteration is not changing more than a threshold, divergence, and/or performance of the neural network meets criteria) [performing multiple iterations to iteratively train the neural network predictors].”
“obtaining a group of network structures using the trained neural network predictors”: Zhou, paragraph 0048, “In one or more embodiments, an initial neural network architecture configuration is mapped (205) into a representation, such as using a lookup table. In one or more embodiments, a policy network converts (210) that initial neural network architecture configuration representation into a network embedding. Then, in embodiments, the policy network uses (215) that network embedding to automatically generate adaptations to the neural network architecture configuration [obtaining a group of network structures]. In one or more embodiments, the adapted neural network is trained (220) to convergence, and the trained adapted neural network architecture may be evaluated (225) based upon one or more metrics (e.g., accuracy, memory footprint, power consumption, inference latency, etc.). In one or more embodiments, a policy gradient method may be used (230) to compute a multi-objective reward that is feed back to the policy network to improve the policy network's ability to automatically generate a set of one or more best architectures. In one or more embodiments, a number of adapted neural network architectures may be processed in parallel per episode as part of the reinforcement step”; Zhou, Fig. 1,
[showing how trained neural network predictors, such as Accuracy Network 150 and Training Time Network 155, are used in the Value Network 140 to provide policy gradient data to influence policy in the Policy Network 110 to produce further network structures].
“training a neural network characterized by at least one network structure of the group of network structures for the plurality of performance parameters to obtain a group of parameter values for each of the plurality of performance parameters”: Zhou, paragraph 0048, “In one or more embodiments, an initial neural network architecture configuration is mapped (205) into a representation, such as using a lookup table. In one or more embodiments, a policy network converts (210) that initial neural network architecture configuration representation into a network embedding. Then, in embodiments, the policy network uses (215) that network embedding to automatically generate adaptations to the neural network architecture configuration. In one or more embodiments, the adapted neural network is trained (220) to convergence, and the trained adapted neural network architecture may be evaluated (225) based upon one or more metrics (e.g., accuracy, memory footprint, power consumption, inference latency, etc.) [training a neural network characterized by at least one network structure of the group of network structures for the plurality of performance parameters to obtain a group of parameter values for each of the plurality of performance parameters].”
“and training the neural network predictors based on at least the group of parameter values and the group of network structures”: Zhou, paragraph 0047, “In one or more embodiments, a value network 140 takes in network embedding of the generated target network 145 and data distributions to approximate the reward by ascertain metrics, such as network accuracy 150 and training time 155--although other metrics may also be determined. In one or more embodiments, the value network may predict target network accuracy and training time without actually running the target network till convergence. In one or more embodiments, both the accuracy network 150 and the training time network 155 are trainable neural networks that may be pre-trained or trained jointly with the policy network [training a plurality of neural network predictors based on the parameter values and the neural networks].”
Regarding claim 9:
Zhou teaches “the method according to claim 8.”
Zhou further teaches:
“wherein obtaining the group of network structures using the trained neural network predictors comprises: performing multiple iterations using the trained neural network predictors to obtain the group of network structures, comprising, for each iteration”: Zhou, paragraph 0077, “In one or more embodiments, the updated policy network is used for the next episode. The number of episodes may be user-selected or may be based upon one or more stop conditions ( e.g., runtime of RENA embodiment, number of iterations, convergence ( or difference between iteration is not changing more than a threshold, divergence, and/or performance of the neural network meets criteria) [performing multiple iterations]”; Zhou, Fig. 1,
[showing how trained neural network predictors, such as Accuracy Network 150 and Training Time Network 155, are used in the Value Network 140 to provide policy gradient data to influence policy in the Policy Network 110 to produce further network structures].
“determining a plurality of gradient structures based on the network structure obtained in a previous iteration using the trained neural network predictors, and”: Zhou, Fig. 1,
[showing trained neural network predictors, such as Accuracy Network 150 and Training Time Network 155, used in the Value Network 140 to provide policy gradient data to influence policy in the Policy Network 110 to produce further network structures, used from one iteration to the next]; Zhou, paragraph 0046, “In one or more embodiments, a value network 140 takes in network embedding of the generated target network 145 and data distributions to approximate the reward by ascertain metrics, such as network accuracy 150 and training time 155-although other metrics may also be determined. In one or more embodiments, the value network may predict target network accuracy and training time without actually running the target network till convergence. In one or more embodiments, both the accuracy network 150 and the training time network 155 are trainable neural networks that may be pre-trained or trained jointly with the policy network”; Zhou, paragraph 0089, “The parameters Θv if [sic] the value network is updated via gradient descent [determining a plurality of gradient structures] using
[gradient descent update equation].”
“obtaining a network structure for the iteration based on a network structure obtained in the previous iteration and the gradient structures”: Zhou, paragraph 0089, “The parameters Θv if [sic] the value network is updated via gradient descent [gradient structures] using
[gradient descent update equation]
”; Zhou, Fig. 1,
[showing trained neural network predictors, such as Accuracy Network 150 and Training Time Network 155, used in the Value Network 140, to provide policy gradient data to influence policy in the Policy Network 110 to produce further network structures, used from one iteration to the next].
Regarding claim 10:
Zhou teaches “the method according to claim 9.”
Zhou further teaches “wherein determining the target neural network using
the trained neural network predictors further comprises: selecting a network structure characterizing the target neural network from the group of network structures obtained in a last iteration for iteratively training the neural network predictors according to a predetermined rule”: Zhou, paragraph 0079, “To find neural architectures that meet multiple resource constraints, a reward based on the model performance may be penalized according to the extent of violating the constraints. Although a fixed hard penalty may be effective for some constraints, it may be challenging for the controller to learn from highly sparse rewards under tight resource constraints. Therefore, in one or more embodiments, a soft continuous penalization method may be used to enable finding architectures with high performance while still meeting all resource constraints [selecting a network structure characterizing the target neural network from the group of network structures … according to a predetermined rule].”
Regarding claim 11:
Zhou teaches “the method according to claim 6.”
Zhou further teaches “wherein obtaining the network structure for the iteration based on the network structure obtained in the previous iteration and the gradient structures comprises: assigning different weights to the gradient structures corresponding to different neural network predictors”: Zhou, paragraph 0047, “In one or more embodiments, the final reward engine 160 sets weights to various metrics, such as network accuracy, model size, and training time, which may be set according to user specification [assigning different weights to the gradient structures corresponding to different neural network predictors]. The configurable reward engine enables finding neural architectures with various resource constraints, such as memory size and GPU time. In one or more embodiments, a policy gradient is applied to train the policy network.”
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 12 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Zhou in view of Rao et al., US Pre-Grant Publication No. 2023/0091667 (hereafter Rao).
Regarding claim 12:
Zhou teaches “the method according to claim 6.”
Zhou further teaches:
“wherein obtaining the network structure for the iteration based on the network structure obtained in the previous iteration and the gradient structures comprises: modifying the network structure obtained in the previous iteration using the gradient structures”: Zhou, paragraph 0077, “In one or more embodiments, the updated policy network is used for the next episode. The number of episodes may be user-selected or may be based upon one or more stop conditions ( e.g., runtime of RENA embodiment, number of iterations, convergence ( or difference between iteration is not changing more than a threshold, divergence, and/or performance of the neural network meets criteria)”; Zhou, Fig. 1,
[showing how the gradient structures inside Accuracy Network 150 and Training Time Network 155, are used in the Value Network 140 to provide policy gradient data to influence policy in the Policy Network 110 to produce a modified network structure].
(bold only) “projecting the modified network structure to the set of network structures to obtain the network structure for the iteration in response to the modified network structure not belonging to the set of network structures”: Zhou, paragraph 0079, “To find neural architectures that meet multiple resource constraints, a reward based on the model performance may be penalized according to the extent of violating the constraints [projecting the modified network structure to the set of network structures to obtain the network structure for the iteration, interpreted as including mechanisms that steer the modifications towards meeting constraints]. Although a fixed hard penalty may be effective for some constraints, it may be challenging for the controller to learn from highly sparse rewards under tight resource constraints. Therefore, in one or more embodiments, a soft continuous penalization method may be used to enable finding architectures with high performance while still meeting all resource constraints.”
Zhou does not explicitly teach:
“determining whether the modified network structure belongs to the set of network structures”
(bold only) “projecting the modified network structure to the set of network structures to obtain the network structure for the iteration in response to the modified network structure not belonging to the set of network structures.”
Rao teaches:
“determining whether the modified network structure belongs to the set of network structures”: Rao, paragraph 0070, “At 322, it may be determined whether constraints or performance indicators are satisfied [determining whether the modified network structure belongs to the set of network structures]. In an embodiment, the circuitry 202 may be configured to determine whether the set of constraints or one or more performance indicators are satisfied for the candidate neural network.”
(bold only) “projecting the modified network structure to the set of network structures to obtain the network structure for the iteration in response to the modified network structure not belonging to the set of network structures”: Rao, paragraph 0072, “In an embodiment, the second neural network may be determined to be the candidate neural network, based on a determination that the evaluated one or more performance indicators are above the threshold values. In such a case, control may pass to 324. In another embodiment, the evaluated one or more performance indicators may be determined to be below the threshold values. In such a case, the control may pass to 326 [in response to the modified network structure not belonging to the set of network structures].”
Rao and Zhou are analogous arts as they are both related to neural architecture search. It would have been obvious to a person having ordinary skill in the art prior to the effective filing date of the claimed invention to have combined the constraint testing of Rao with the teachings of Zhou to arrive at the present invention, in order to enforce necessary constraints on generated models, as stated in Rao, paragraph 0003, “In some scenarios, developers may develop features or updates for devices based on requirements, such as a specific hardware specification, a device cost, or a device launch date. Such features or updates may not be developed and made available for other devices that don't meet the requirements.”
Regarding claim 13:
Zhou as modified by Rao teaches “the method according to claim 12.”
Rao further teaches “wherein projecting the modified network structure to the set of network structures to obtain the network structure for the iteration in response to the modified network structure not belonging to the set of network structures comprises: determining a network structure closest to the modified network structure from the set of network structures as the network structure for the iteration”: Rao, paragraph 0082, “At 330, a pruning operation may be executed. In an embodiment, the circuitry 202 may be configured to execute the pruning operation on weight parameters of the trained candidate neural network. The pruning operation may include an operation to modify (such as to compress) a neural network (such as the candidate neural network) by removing unwanted nodes of the candidate neural network arranged in a plurality of layers. Such unwanted nodes may include nodes that have minimal or no significance in training of the candidate neural network [determining a network structure closest to the modified network structure from the set of network structures as the network structure for the iteration]. For example, the nodes having have minimal or no significance weight ( e.g., weight, w ≈ 0) may be denoted as unwanted nodes as they may have no significance in training of the candidate neural network.”
Rao and Zhou are combinable for the rationale given under claim 12.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Zoph et al., US Pre-Grant Publication No. 2020/0265315, discloses a neural architecture search method in which neural networks are stored as directed acyclic graphs, each node representing an operation.
Arikawa et al., US Pre-Grant Publication No. 2023/0385603, discloses a neural architecture search method with a deployment constraint management unit, which steers search towards user-specified constraints.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to VINCENT SPRAUL whose telephone number is (703) 756-1511. The examiner can normally be reached M-F 9:00 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, MICHAEL HUNTLEY can be reached at (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/VAS/Examiner, Art Unit 2129
/MICHAEL J HUNTLEY/Supervisory Patent Examiner, Art Unit 2129