DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments with respect to claims 1-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 4, 5, 10 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Ghosh (US 2021/0103820), in view of Turbin et al (US 2011/0276526) and further in view of Pudipeddi et al (US 2021/0019151).
For claim 1, Ghosh teaches a computer program product for using weights and biases for a neural network in an array of processing elements in a core of an accelerator (e.g. paragraph 16: the accelerator is configured to perform training of a neural network), the computer program product comprising a computer readable storage medium having computer readable program code embodied therein that is executable to perform operations (e.g. paragraph 38: deep learning or other software processing), the operations comprising:
selecting a minibatch size of inference jobs batched to process in the accelerator (e.g. figure 3, paragraph 58: virtual minibatch (VMB));
processing a representation of a neural network to determine a set of weights and biases for the selected minibatch size to load into the core (e.g. paragraphs 16, 58: after a training sample has cycled through the neural network 12 (one iteration), are immediately used to update the neural network 12 and its weights W and biases…training samples equal to the minibatch size are evaluated through the neural network per iteration or per layer if pipelining is used);
loading the set of weights and biases into the core for use by the array of processing elements in the core of the accelerator (e.g. paragraph 16: The accelerator is configured to generate gradient updates of weights and biases of the neural network); and
using the weights and the biases in the processing elements for the neural network, loaded for the selected minibatch size, to apply to minibatches of inferences having minibatch sizes less than the selected minibatch size (e.g. Figure 3 shows that the size of VSMB is less than the size of VMB; paragraph 59: After VSMB training samples TSi are cycled through the neural network 12, the local gradient buffer ∇W is updated).
Ghosh does not further disclose: reusing the weights and the biases in the processing elements for the neural network; wherein a minibatch size references a number of inferences batched for inference processing.
Turbin et al teach: reusing the weights and the biases in the processing elements for the neural network (e.g. paragraph 192: the number of the iterations is reduced dramatically if an initial set of weights and biases are reused). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Turbin et al into the teaching of Ghosh to reuse the weights and biases in the processing elements for the neural network to minimize the training time (e.g. paragraph 192, Turbin et al).
Ghosh and Turbin et al do not further disclose wherein a minibatch size references a number of inferences batched for inference processing. Pudipeddi et al teach wherein a minibatch size references a number of inferences batched for inference processing (e.g. paragraph 40: A group of microbatches forms a minibatch, which is the term for the number of samples per update (for training) or the number served in every inference cycle (for inference)). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Pudipeddi et al into the teaching of Ghosh and Turbin et al to reuse the weights and biases in the processing elements for the neural network to minimize the training time.
Claims 10 and 16 are rejected for the same reasons as discussed in claim 1 above.
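For illustration only, the load-once and reuse limitation recited in claim 1 can be sketched as follows. This is a minimal sketch of the claim language, not an implementation taken from the application or from Ghosh, Turbin et al or Pudipeddi et al. The class name `Core`, its methods, and the single-neuron linear model are hypothetical and are assumed here only to make the example runnable.

```python
class Core:
    """Hypothetical stand-in for an accelerator core (assumption, not from any cited reference)."""

    def __init__(self):
        self.weights = None
        self.bias = None
        self.loaded_for = 0   # minibatch size the current weights were loaded for
        self.load_count = 0   # how many times weights were loaded into the core

    def load(self, weights, bias, minibatch_size):
        # Load one set of weights and a bias for the selected minibatch size.
        self.weights = list(weights)
        self.bias = bias
        self.loaded_for = minibatch_size
        self.load_count += 1

    def infer(self, minibatch):
        # Reuse the already-loaded weights for any minibatch that is
        # no larger than the selected minibatch size.
        if self.weights is None:
            raise RuntimeError("weights have not been loaded")
        if len(minibatch) > self.loaded_for:
            raise ValueError("minibatch exceeds the selected minibatch size")
        return [
            sum(w * x for w, x in zip(self.weights, sample)) + self.bias
            for sample in minibatch
        ]


core = Core()
core.load(weights=[1.0, 2.0], bias=0.5, minibatch_size=4)
first = core.infer([[1.0, 1.0], [2.0, 0.0]])   # minibatch of size 2
second = core.infer([[0.0, 3.0]])              # minibatch of size 1
```

In this sketch both smaller minibatches are served by a single call to `load`, which is the reuse behavior the claim limitation describes.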
For claim 4, Ghosh teaches that the operations of selecting a minibatch size and processing the representation of the neural network to determine the set of weights and biases are performed for a plurality of neural network models (e.g. paragraphs 16, 58: after a training sample has cycled through the neural network 12 (one iteration), are immediately used to update the neural network 12 and its weights W and biases…training samples equal to the minibatch size are evaluated through the neural network per iteration or per layer if pipelining is used).
Claim 5 is rejected for the same reasons as discussed in claim 1 above.
Claims 2-3, 11-12 and 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Ghosh, Turbin et al and Pudipeddi et al, as applied to claims 1, 10 and 16 above, and further in view of Langford et al (US 2017/0308789).
For claims 2, 11 and 17, Ghosh, Turbin et al and Pudipeddi et al do not further disclose determining an optimal minibatch size of inferences to input into the array of processing elements to maximize throughput within a latency constraint, wherein the selected minibatch size comprises the optimal minibatch size. Langford et al teach determining an optimal minibatch size of inferences to input into the array of processing elements to maximize throughput within a latency constraint, wherein the selected minibatch size comprises the optimal minibatch size (e.g. paragraph 38: the size may be selected to maximize both computation accuracy and execution efficiency of the algorithm). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Langford et al into the teaching of Ghosh, Turbin et al and Pudipeddi et al to maximize both computation accuracy and execution efficiency of the algorithm.
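For illustration only, the limitation of claims 2, 11 and 17 (an optimal minibatch size that maximizes throughput within a latency constraint) can be sketched as a search over candidate sizes. This is a minimal sketch of the claim language, not a method disclosed by Langford et al or any other cited reference. The function name `pick_minibatch_size`, the candidate list, and the linear latency model are all hypothetical assumptions.

```python
def pick_minibatch_size(candidates, latency_ms, max_latency_ms):
    """Return the candidate size with the highest throughput whose latency
    stays within max_latency_ms, or None if no candidate is feasible.

    latency_ms is a caller-supplied (hypothetical) model mapping a
    minibatch size to the time needed to process one minibatch.
    """
    best_size = None
    best_throughput = 0.0
    for size in candidates:
        latency = latency_ms(size)
        if latency > max_latency_ms:
            continue  # violates the latency constraint
        throughput = size / latency  # inferences per millisecond
        if best_size is None or throughput > best_throughput:
            best_size = size
            best_throughput = throughput
    return best_size


# Assumed latency model: fixed 2 ms overhead plus 0.5 ms per inference.
def example_latency_ms(size):
    return 2.0 + 0.5 * size


optimal = pick_minibatch_size([1, 2, 4, 8, 16, 32], example_latency_ms, max_latency_ms=8.0)
```

Under the assumed latency model, sizes 16 and 32 exceed the 8 ms limit, and among the remaining candidates size 8 gives the highest throughput.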
For claims 3, 12 and 18, Ghosh, Turbin et al and Pudipeddi et al do not further disclose receiving input data for inferences in a large minibatch having a size greater than the optimal minibatch size; and forming a plurality of minibatches having a size less than or equal to the optimal minibatch size including the inferences in the large minibatch, wherein the formed plurality of minibatches include at least one minibatch having the optimal minibatch size. Langford et al teach receiving input data for inferences in a large minibatch having a size greater than the optimal minibatch size; and forming a plurality of minibatches having a size less than or equal to the optimal minibatch size including the inferences in the large minibatch, wherein the formed plurality of minibatches include at least one minibatch having the optimal minibatch size (e.g. paragraphs 38-39: the DNN may have varying sizes due to differences in the number of units in various layers of the DNN. For example, a largest layer in the DNN may have a size that is ten times larger than that of the one or more smallest layers). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Langford et al into the teaching of Ghosh, Turbin et al and Pudipeddi et al to maximize both computation accuracy and execution efficiency of the algorithm.
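For illustration only, the limitation of claims 3, 12 and 18 (forming minibatches no larger than the optimal size from a larger minibatch) can be sketched as simple chunking. This is a minimal sketch of the claim language, not a method disclosed by any cited reference. The function name `split_into_minibatches` and the use of a plain list of inference inputs are hypothetical assumptions.

```python
def split_into_minibatches(inferences, optimal_size):
    """Split a large minibatch into consecutive minibatches of at most
    optimal_size inferences each. When len(inferences) > optimal_size,
    at least the first minibatch has exactly optimal_size elements."""
    if optimal_size <= 0:
        raise ValueError("optimal_size must be positive")
    return [
        inferences[start:start + optimal_size]
        for start in range(0, len(inferences), optimal_size)
    ]


large_minibatch = list(range(10))  # ten placeholder inference inputs
minibatches = split_into_minibatches(large_minibatch, optimal_size=4)
```

Here a large minibatch of ten inferences yields two minibatches of the optimal size and one smaller remainder, and every inference from the large minibatch appears in exactly one formed minibatch.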
Allowable Subject Matter
Claims 6-9, 13-15 and 19-20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DAQUAN ZHAO whose telephone number is (571)270-1119. The examiner can normally be reached M-Thur: 7:00 am-5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Thai Tran, can be reached at 571-272-7382. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
Email: daquan.zhao1@uspto.gov.
Phone: (571)270-1119
/DAQUAN ZHAO/Primary Examiner, Art Unit 2484