DETAILED ACTION
Claims 1-20 are presented for examination.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claim(s) 1-2 and 8-9 is/are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Verrilli et al. (US PG Pub No. 2020/0250545 A1).
Verrilli was disclosed in IDS dated 06/13/2023.
Regarding claim 1, Verrilli teaches a processor implemented method for accelerating machine learning on a computing device (Fig 3; [0041], "neural network acceleration architecture 300"; Fig 7; [0060], "method for accelerating machine learning on a computing device"), comprising: accessing a neural network ([0041], "operate a neural network");
splitting the neural network into N sub-neural networks ([0042], wherein "the neural network of the host application 320 is split across the first AIIA 330 and the second AIIA 340 … the intermediate inference request results 308 are routed between the first AIIA 330 and the second AIIA 340");
hosting the N sub-neural networks in M inference accelerators ([0042], wherein "the neural network of the host application 320 is hosted in the first AIIA 330 and the second AIIA 340.");
scheduling the N sub-neural networks in the M inference accelerators ([0043], wherein "the inference request result 306 is provided to the host processor 310 through a designated inference accelerator of the first AIIA 330 and the second AIIA 340." This designation amounts to a scheduling of the two sub-neural networks in the two accelerators.); and
executing the N sub-neural networks in the M inference accelerators ([0043], wherein "the inference request result 306 may be generated by the first AIIA 330, in which the inference request result 306 is based on the intermediate inference request results 308 received from the second AIIA 340.").
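For illustration only, the split, host, schedule, and execute steps mapped above can be sketched as follows. This is a minimal sketch under stated assumptions and does not characterize Verrilli's disclosure or the applicant's implementation: the `Accelerator` class, the round-robin placement, and the representation of a layer as a plain function are all hypothetical.

```python
from typing import Callable, List, Tuple

# Hypothetical: a "layer" is modeled as a function from a number to a number.
Layer = Callable[[float], float]


def split_network(layers: List[Layer], n: int) -> List[List[Layer]]:
    """Split an ordered list of layers into n contiguous sub-networks."""
    if not 1 <= n <= len(layers):
        raise ValueError("n must be between 1 and the number of layers")
    base, extra = divmod(len(layers), n)
    subnets, start = [], 0
    for i in range(n):
        size = base + (1 if i < extra else 0)
        subnets.append(layers[start:start + size])
        start += size
    return subnets


class Accelerator:
    """Hypothetical stand-in for an inference accelerator."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.hosted: List[List[Layer]] = []

    def host(self, subnet: List[Layer]) -> None:
        self.hosted.append(subnet)

    def execute(self, subnet: List[Layer], x: float) -> float:
        for layer in subnet:
            x = layer(x)
        return x


def schedule(subnets: List[List[Layer]],
             accelerators: List[Accelerator]) -> List[Tuple[Accelerator, List[Layer]]]:
    """Assumed policy: sub-network i is hosted on accelerator i mod M."""
    plan = []
    for i, subnet in enumerate(subnets):
        acc = accelerators[i % len(accelerators)]
        acc.host(subnet)
        plan.append((acc, subnet))
    return plan


def run(plan: List[Tuple[Accelerator, List[Layer]]], x: float) -> float:
    """Execute sub-networks in order, passing intermediate results along."""
    for acc, subnet in plan:
        x = acc.execute(subnet, x)
    return x
```

In this sketch the intermediate result produced by one accelerator is handed to the next, which loosely mirrors the routing of intermediate inference request results quoted above.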
Regarding claim 2, Verrilli teaches setting a data path and a control path of a first sub-neural network of the N sub-neural networks in a first inference accelerator of the M inference accelerators; and setting a data path and a control path of a second sub-neural network of the N sub-neural networks in a second inference accelerator of the M inference accelerators (Fig 4; [0047]).
Regarding claims 8-9, they are the medium claims of claims 1-2 above. Therefore, they are rejected for the same reasons as claims 1-2 above.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 5 and 12 is/are rejected under 35 U.S.C. 103 as being unpatentable over Verrilli et al. (US PG Pub No. 2020/0250545 A1).
Regarding claim 5, Verrilli does not teach assigning a unique port identification to each input port and output port of the first inference accelerator and the second inference accelerator.
However, it is old and well known to assign identifiers or labels to various components when designing/implementing accelerators as a network of accelerators. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to assign a unique port identification to each input port and output port of the first inference accelerator and the second inference accelerator. One would be motivated by the desire to ensure that data is routed correctly.
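As an illustration of the labeling practice referred to above, the following is a minimal sketch that assigns a distinct integer identifier to every input and output port of each accelerator. The function name, the `(accelerator, direction, index)` key, and the use of sequential integers are assumptions made for the example only.

```python
from itertools import count
from typing import Dict, List, Tuple

PortKey = Tuple[str, str, int]  # hypothetical key: (accelerator, "in"/"out", index)


def assign_port_ids(accelerators: List[str],
                    ports_per_direction: int = 1) -> Dict[PortKey, int]:
    """Give every input and output port a unique sequential identifier."""
    ids = count()
    table: Dict[PortKey, int] = {}
    for acc in accelerators:
        for direction in ("in", "out"):
            for idx in range(ports_per_direction):
                table[(acc, direction, idx)] = next(ids)
    return table
```

Because each identifier appears exactly once in the table, a router can resolve a destination port without ambiguity, which is the routing-correctness motivation stated above.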
Regarding claim 12, it is the medium claim of claim 5 above. Therefore, it is rejected for the same reasons as claim 5 above.
Claim(s) 6-7 is/are rejected under 35 U.S.C. 103 as being unpatentable over Verrilli et al. (US PG Pub No. 2020/0250545 A1) further in view of Zejda et al. (US Pat No. 12,093,806).
Regarding claims 6-7, Verrilli does not teach that N=M or N<M, where N and M are integers greater than one.
Zejda teaches splitting a neural network into different subgraphs and assigning the subgraphs to processing units (Fig 4; col 6 lines 56-62). Zejda further teaches that there may be a surplus of available processing units (col 8 lines 27-39) or that more subgraphs than processing units can be created (col 13 lines 9-29).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to set N=M or N<M, where N and M are integers greater than one. Where the general conditions of a claim are disclosed in the prior art, it is not inventive to discover the optimum or workable ranges by routine experimentation. In re Aller, 220 F.2d 454, 456, 105 USPQ 233, 235 (CCPA 1955); see MPEP 2144.05, II.
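The relationship between the number of sub-networks N and the number of accelerators M discussed above can be illustrated with a short sketch. It is not taken from Verrilli or Zejda; the `assign` function and its modulo placement rule are hypothetical and are used only to show the N=M case and the N<M case with surplus units.

```python
from typing import Dict, List, Tuple


def assign(n_subnets: int, m_accels: int) -> Tuple[Dict[int, int], List[int]]:
    """Map sub-network index -> accelerator index and list the idle accelerators."""
    if n_subnets < 2 or m_accels < 2:
        raise ValueError("N and M must be integers greater than one")
    # Assumed rule: sub-network i goes to accelerator i mod M.
    assignment = {i: i % m_accels for i in range(n_subnets)}
    used = set(assignment.values())
    idle = [a for a in range(m_accels) if a not in used]
    return assignment, idle
```

With N=M every accelerator hosts exactly one sub-network and none is idle; with N<M the surplus accelerators are reported as idle.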
Claim(s) 3-4, 10-11, and 13-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Verrilli et al. (US PG Pub No. 2020/0250545 A1) in view of Yang (US PG Pub No. 2020/0026997 A1).
Yang was disclosed in IDS dated 06/13/2023.
Regarding claim 3, Verrilli does not teach monitoring the first inference accelerator and the second inference accelerator; and dynamically adjusting the data path and/or the control path of the first sub-neural network and/or the second sub-neural network according to a predetermined performance criteria.
Yang teaches a method for accelerated machine learning wherein, when a first accelerator becomes available and has a higher priority than a second accelerator for performing a task currently being performed on the second AI accelerator (Figs. 10-12; [0101]-[0115]), for example a higher priority for low-power consumption, the task is assigned to the first accelerator and the information regarding the resources and communication schemes ([0030]-[0033]) required for performing the task is updated.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to implement the dynamic scheduling described in Yang, by using the switch device of Verrilli to update the communication scheme related to the task. For example, when AIIA 340 is computing intermediate results to be passed to AIIA 330 for computing final results based on the intermediate results and a third AIIA allowing lower power consumption than the AIIA 330 becomes available, the skilled artisan would configure the switch device to appropriately update the output port of AIIA 340 so that the intermediate results are routed to the third AIIA. One would be motivated by the desire to provide better control over power consumption and temperature of the inference accelerators.
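The combination rationale above can be illustrated with a minimal sketch in which a switch reroutes the intermediate results of one accelerator to a newly available, lower-power accelerator. Every name here (`AcceleratorStatus`, `Switch`, `maybe_reroute`, the `power_watts` field, and the accelerator labels) is hypothetical, and the selection rule is an assumption for the example rather than a description of either reference.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AcceleratorStatus:
    name: str
    power_watts: float  # hypothetical monitored metric
    available: bool


class Switch:
    """Hypothetical switch holding one output route per source accelerator."""

    def __init__(self) -> None:
        self.routes: Dict[str, str] = {}

    def set_route(self, source: str, destination: str) -> None:
        self.routes[source] = destination


def maybe_reroute(switch: Switch, source: str,
                  statuses: Dict[str, AcceleratorStatus]) -> Optional[str]:
    """Reroute the source's output if an available accelerator draws less power."""
    current = switch.routes[source]
    current_power = statuses[current].power_watts
    candidates = [
        s for s in statuses.values()
        if s.available
        and s.name not in (source, current)
        and s.power_watts < current_power
    ]
    if not candidates:
        return None  # criterion not met; keep the existing route
    best = min(candidates, key=lambda s: s.power_watts)
    switch.set_route(source, best.name)
    return best.name
```

Calling `maybe_reroute` periodically corresponds to the monitoring step, and the power comparison stands in for the predetermined performance criteria.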
Regarding claim 4, Verrilli and Yang do not teach identifying a peer output port in response to dynamically adjusting the data path and/or the control path; and waking the peer output port to dynamically adjust the data path and/or the control path.
However, it is old and well known to switch off unused peripherals in a computing device for reducing power consumption. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to switch on the output port to be used when updating a communication scheme.
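The power-saving practice relied on above can be sketched as follows: an output port that has been switched off is switched on when the communication scheme is updated to use it. The `OutputPort` class and `redirect_output` function are hypothetical names introduced for this example only.

```python
from typing import Dict


class OutputPort:
    """Hypothetical output port that can be powered down when unused."""

    def __init__(self, port_id: int) -> None:
        self.port_id = port_id
        self.awake = False

    def wake(self) -> None:
        self.awake = True

    def sleep(self) -> None:
        self.awake = False


def redirect_output(ports: Dict[int, OutputPort],
                    old_id: int, new_id: int) -> OutputPort:
    """Identify the peer port for the new route, wake it, and idle the old one."""
    new_port = ports[new_id]  # raises KeyError if the peer port is unknown
    if not new_port.awake:
        new_port.wake()
    if old_id != new_id:
        ports[old_id].sleep()  # the unused port is switched off to save power
    return new_port
```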
Regarding claims 10-11, they are the medium claims of claims 3-4 above. Therefore, they are rejected for the same reasons as claims 3-4 above.
Regarding claim 13, Verrilli teaches a method for
splitting a neural network into a first sub-neural network and a second sub-neural network (Fig. 4; [0060], wherein "a neural network is split between the first AIIA 430 and the second AIIA 440 … intermediate inference request results are routed between the first inference accelerator and the second inference accelerator.");
hosting the first sub-neural network in a first inference accelerator and the second sub-neural network in a second inference accelerator ([0060], wherein "a neural network is hosted in a first inference accelerator and a second inference accelerator");
monitoring the first inference accelerator and the second inference accelerator (Fig. 3; [0045], wherein "the switch device 302 supports peer-to-peer communication between the first AIIA 330 and the second AIIA 340 to enable direct memory access (DMA) transfer without intervention from the host processor 310." - this peer-to-peer communication through the switch device implies a monitoring by the switch device of the two accelerators.); and
Verrilli does not teach that the inference routing is dynamic and the method includes a step of performing an output redirection decision of the first sub-neural network and/or the second sub-neural network according to a predetermined performance criteria.
Yang, at Figs 10-12; [0101]-[0115], teaches a method for accelerated machine learning wherein, when a first accelerator becomes available and has a higher priority than a second accelerator for performing a task currently being performed on the second AI accelerator, for example a higher priority for low-power consumption, the task is assigned to the first accelerator and the information regarding the resources and communication schemes ([0030]-[0033]) required for performing the task is updated.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to implement the dynamic scheduling described in Yang, by using the switch device of Verrilli to update the communication scheme related to the task. For example, when AIIA 340 is computing intermediate results to be passed to AIIA 330 for computing final results based on the intermediate results and a third AIIA allowing lower power consumption than the AIIA 330 becomes available, the skilled artisan would configure the switch device to appropriately update the output port of AIIA 340 so that the intermediate results are routed to the third AIIA. One would be motivated by the desire to provide better control over power consumption and temperature of the inference accelerators.
Regarding claim 14, Verrilli and Yang do not teach determining whether a platform software redirection is detected prior to choosing an output port; and waking a peer output port in response to detecting the platform software redirection.
However, it is old and well known to switch off unused peripherals in a computing device for reducing power consumption. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to switch on the output port to be used when updating a communication scheme.
Regarding claim 15, Yang teaches dynamically scheduling the first sub-neural network in the first inference accelerator and the second sub-neural network in the second inference accelerator (Figs 10-12; [0101]-[0115]).
Regarding claim 16, Yang teaches adjusting a data path and a control path of the first sub-neural network in the first inference accelerator at runtime; and adjusting a data path and a control path of the second sub-neural network in the second inference accelerator at runtime (Figs 10-12; [0101]-[0115]).
Regarding claim 17, Yang teaches that the predetermined performance criteria comprises a power consumption ([0105]; [0112]), a temperature, and/or a load of the first inference accelerator and/or the second inference accelerator.
Regarding claims 18-20, they are the medium claims of claims 13-16 above. Therefore, they are rejected for the same reasons as claims 13-16 above.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIC C WAI whose telephone number is (571)270-1012. The examiner can normally be reached Monday - Friday 9-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Aimee Li, can be reached at (571) 272-4169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Eric C Wai/
Primary Examiner, Art Unit 2195