Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the manner in which the invention was made.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103(a) are summarized as follows:
Determining the scope and contents of the prior art.
Ascertaining the differences between the prior art and the claims at issue.
Resolving the level of ordinary skill in the pertinent art.
Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1 – 8, 10 – 17, 19 – 26 and 28 – 35 are rejected under 35 U.S.C. 103(a) as being unpatentable over Munshi et al. (hereinafter Munshi, US 2011/0285729) in view of Lee et al. (hereinafter Lee, US 2023/0086989).
Regarding claim 1, Munshi discloses:
a processor comprising:
circuitry to, in response to an application programming interface (API) call identifying a kernel comprising two or more groups of blocks of threads to be performed (see at least ph. [0062], which discloses that the processing logic updates the execution queue in response to API requests/calls, that these updates include using one or more compute kernel execution instances to perform the processing required by the system, and that a kernel may include a number of threads organized as a thread group (block); this applies to any number of groups/blocks of threads, including two or more),
cause a scheduling policy, indicated by calls of the API, to be used to schedule performance of the two or more groups of blocks of threads of the kernel (see at least ph. [0062], which discloses that the processing logic updates the execution queue in response to API requests/calls, including instances in which one or more compute kernel execution instances are scheduled for execution in the execution queue; these kernels are disclosed as including a number of threads that may be organized as a thread group/block, and this may be repeated any number of times, resulting in any number of groups of blocks of threads for any kernel, including two or more).
Munshi does not expressly disclose, however, Lee discloses:
one or more parameters identifying a kernel (see at least ph. [0054] for kernels represented by tuples with parameters identifying name of kernels).
It would have been obvious for a person of ordinary skill in the art at the time of filing to modify the teachings of Munshi with the teachings of Lee in order to implement a means to select a kernel for execution, so that the selected kernel may properly perform the tasks suited to that particular kernel.
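For illustration only, the arrangement relied upon in the rejection of claim 1 may be sketched as follows. The sketch is not taken from Munshi or Lee; the names Runtime, KernelLaunch, and launch, the kernel name, and the policy string are hypothetical, and a simple in-memory execution queue is assumed. It shows an API call whose parameters identify a kernel by name, list two or more groups of blocks, and indicate a scheduling policy, with the call resulting in an update to the execution queue.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class KernelLaunch:
    """One queued kernel execution instance (hypothetical structure)."""
    kernel_name: str   # parameter identifying the kernel
    groups: list       # two or more groups, each a list of block ids
    policy: str        # scheduling policy indicated by the API call


@dataclass
class Runtime:
    """Hypothetical runtime holding an execution queue."""
    execution_queue: deque = field(default_factory=deque)

    def launch(self, kernel_name, groups, policy="spread"):
        # The claim language requires two or more groups of blocks.
        if len(groups) < 2:
            raise ValueError("expected two or more groups of blocks")
        entry = KernelLaunch(kernel_name, [list(g) for g in groups], policy)
        # The API call updates the execution queue.
        self.execution_queue.append(entry)
        return entry


rt = Runtime()
entry = rt.launch("vector_add", groups=[[0, 1], [2, 3]], policy="spread")
```

In this sketch the policy is carried with the queued entry, so whatever later consumes the queue can see which policy the call indicated.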
Claim 19 is a computer system version of claim 1 and is rejected on the same grounds as claim 1, where it is noted that Munshi discloses the use of processors and memory storing instructions to perform the claimed features in at least Fig. 1 and this figure’s related descriptions in ph. [0030] – [0034].
Claim 28 is a machine-readable medium version of claim 1 and is rejected on the same grounds as claim 1, where it is noted that Munshi discloses the use of a medium storing instructions to be performed by one or more processors, as per the features of claim 28, in at least Fig. 1 and this figure’s related descriptions in ph. [0030] – [0034].
Regarding claims 2, 20 and 29, the rejections of claims 1, 19 and 28 are incorporated and Munshi discloses:
the scheduling policy is to apply to a group of two or more groups of blocks of threads, the two or more groups of blocks of threads being of a software kernel to be performed (at least ph. [0062] discloses that the API itself may specify the number of thread groups/blocks that are executed, where any number of blocks (groups of threads) may be so selected, including two or more, and that such a kernel may be added to the execution queue).
Regarding claims 3, 21 and 30, the rejections of claims 1, 19 and 28 are incorporated and Munshi discloses:
the scheduling policy is to apply to multiple partitions of multiple groups of one or more blocks of one or more threads (at least ph. [0062] discloses that different numbers of groups/blocks may be specified by the API, which indicates that the groups are distinct from one another (i.e., each is partitioned/separate from the other groups), and that the groups are to be scheduled for execution according to the scheduling of the execution queue).
Regarding claims 4, 22 and 31, the rejections of claims 1, 19 and 28 are incorporated and Munshi discloses:
the scheduling policy is to prioritize scheduling of one or more first groups of blocks of one or more threads over one or more second groups of blocks of one or more threads (at least ph. [0062] discloses a priority for kernel execution (which then applies to the API calls and their associated thread groups/blocks) and discloses that the compute kernel execution instance involved in such execution is associated with an event object that indicates an execution order relationship between that execution instance and other execution instances (which is then a prioritizing schedule)).
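For illustration only, the reading applied above, in which an execution order relationship between execution instances operates as a prioritizing schedule, may be sketched as follows. The sketch is not taken from Munshi; the function name execution_order and the instance names are hypothetical, and the event objects are assumed to reduce to a mapping from each instance to the instances it must wait for. The standard-library graphlib module (Python 3.9 or later) is used to derive the order.

```python
from graphlib import TopologicalSorter


def execution_order(wait_events):
    """Return an order in which every instance runs after those it waits on.

    wait_events maps an execution instance to the set of instances whose
    completion events it depends on (a hypothetical stand-in for event
    objects that record an execution order relationship).
    """
    return list(TopologicalSorter(wait_events).static_order())


# group_b waits on group_a, and group_c waits on group_b, so group_a is
# effectively prioritized over the other two.
order = execution_order({"group_b": {"group_a"}, "group_c": {"group_b"}})
```

Because the dependencies here form a single chain, only one order satisfies them; with independent instances, several orders would be valid.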
Regarding claims 5, 23 and 32, the rejections of claims 1, 19 and 28 are incorporated and Munshi discloses:
the scheduling policy is a spread policy (the scheduling described in at least ph. [0062] is disclosed in at least Fig. 3 and ph. [0039] as implementable across different physical computing devices. The different physical devices then perform the task together, so the shared processing is distributed among them. Any scheduling implemented to do so is a type of spread scheduling policy, that is, one in which the tasks of a computing job are ‘spread’ out among different processing nodes, which results in faster overall processing).
Regarding claims 6, 24 and 33, the rejections of claims 1, 19 and 28 are incorporated and Munshi discloses:
the scheduling policy is a load balancing policy (the scheduling described in at least ph. [0062] is disclosed in at least Fig. 3 and ph. [0039] as implementable across different physical computing devices. The different physical devices then perform the task together, so the shared processing is distributed among them. Any scheduling implemented to do so is a type of spread scheduling policy, that is, one in which the tasks of a computing job are ‘spread’ out among different processing nodes, which results in faster overall processing. Further embodiments of the reference disclose that this scheduling includes load balancing among the different physical computing devices).
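For illustration only, the distinction between distributing work across devices (a spread policy) and distributing it according to device load (a load balancing policy) may be sketched as follows. The sketch is not taken from Munshi; the function names spread and load_balance and the per-block cost figures are hypothetical. Round-robin placement is assumed for the spread policy and greedy least-loaded placement for the load balancing policy.

```python
import heapq


def spread(block_costs, num_devices):
    """Round-robin placement: block i goes to device i % num_devices."""
    assignment = [[] for _ in range(num_devices)]
    for i, cost in enumerate(block_costs):
        assignment[i % num_devices].append(cost)
    return assignment


def load_balance(block_costs, num_devices):
    """Greedy placement: each block goes to the least-loaded device so far."""
    heap = [(0, d) for d in range(num_devices)]  # (current load, device id)
    assignment = [[] for _ in range(num_devices)]
    for cost in block_costs:
        load, device = heapq.heappop(heap)
        assignment[device].append(cost)
        heapq.heappush(heap, (load + cost, device))
    return assignment


costs = [8, 1, 1, 1]            # invented per-block costs
by_spread = spread(costs, 2)    # [[8, 1], [1, 1]]
by_load = load_balance(costs, 2)  # [[8], [1, 1, 1]]
```

With these invented costs, both policies use both devices, but only the load balancing variant takes the cost of the first block into account when placing the remaining blocks.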
Regarding claims 7, 25 and 34, the rejections of claims 1, 19 and 28 are incorporated and Munshi discloses:
in response to the API call cause the scheduling policy to be applied when the one or more blocks of the one or more threads are to be performed (at least ph. [0062] discloses that API calls result in thread group(s) to be executed by a kernel, the execution of which is then scheduled, which indicates that executing the API call results in an appropriate scheduling policy being applied).
Regarding claims 8, 17, 26 and 35, the rejections of claims 1, 10, 19 and 28 are incorporated and Munshi discloses:
the scheduling policy is to affect scheduling of groups of blocks of one or more threads on multiprocessors of a graphics processing unit (GPU) (see at least ph. [0060] for targets of computer program executables including physical CPUs and GPUs (for processing), where in at least ph. [0062] these executables correspond to kernels with their own groups/blocks of threads, and ph. [0071], which indicates, within the computing system (and therefore its requisite circuitry), the use of an API that includes the number of threads and even a total number of threads and groups of threads executing in parallel, resulting in an indicated maximum number of blocks of threads to be scheduled (the total number specified establishes a maximum because it may not be exceeded); see also Table 2, which discloses the naming of a maximum work group size / maximum number of groups of threads).
Regarding claim 10, Munshi discloses:
a computer-implemented method comprising:
receiving an application programming interface (API) call comprising one or more parameters indicative of a software kernel comprising two or more groups of blocks of threads to be performed by a graphics processing unit (GPU) (see at least ph. [0060] for targets of computer program executables including physical CPUs and GPUs (for processing), where in at least ph. [0062] these executables correspond to kernels with their own groups/blocks of threads, and ph. [0071], which indicates, within the computing system (and therefore its requisite circuitry), the use of an API that includes the number of threads and even a total number of threads and groups of threads executing in parallel, resulting in an indicated maximum number of blocks of threads to be scheduled (the total number specified establishes a maximum because it may not be exceeded); see also Table 2, which discloses the naming of a maximum work group size / maximum number of groups of threads); and
in response to receiving the API call, causing a scheduling policy, indicated by the call of the API, to be used to schedule performance of the two or more groups of blocks of threads of a kernel (see at least ph. [0062], which discloses that the processing logic updates the execution queue in response to API requests/calls, including instances in which one or more compute kernel execution instances are scheduled for execution in the execution queue; these kernels are disclosed as including a number of threads that may be organized as a thread group/block, and this may be repeated any number of times, resulting in any number of groups of blocks of threads for any kernel, including two or more).
Munshi does not expressly disclose, however, Lee discloses:
one or more parameters identifying a kernel (see at least ph. [0054] for kernels represented by tuples with parameters identifying name of kernels).
It would have been obvious for a person of ordinary skill in the art at the time of filing to modify the teachings of Munshi with the teachings of Lee in order to implement a means to select a kernel for execution, so that the selected kernel may properly perform the tasks suited to that particular kernel.
Regarding claim 11, the rejection of claim 10 is incorporated and Munshi discloses:
the scheduling policy is to apply to a group of multiple groups of one or more blocks of one or more threads, the multiple groups being of a software kernel to be performed (at least ph. [0062] discloses that the API itself may specify the number of thread groups/blocks that are executed as/in the kernel).
Regarding claim 12, the rejection of claim 10 is incorporated and Munshi discloses:
the scheduling policy is to apply to multiple partitions of multiple groups of one or more blocks of one or more threads (at least ph. [0062] discloses that different numbers of groups/blocks may be specified by the API, which indicates that the groups are distinct from one another (i.e., each is partitioned/separate from the other groups), and that the groups are to be scheduled for execution according to the scheduling of the execution queue).
Regarding claim 13, the rejection of claim 10 is incorporated and Munshi discloses:
the scheduling policy is to prioritize scheduling of one or more first groups of blocks of one or more threads over one or more second groups of blocks of one or more threads (at least ph. [0062] discloses a priority for kernel execution (which then applies to the API calls and their associated thread groups/blocks) and discloses that the compute kernel execution instance involved in such execution is associated with an event object that indicates an execution order relationship between that execution instance and other execution instances (which is then a prioritizing schedule)).
Regarding claim 14, the rejection of claim 10 is incorporated and Munshi discloses:
the scheduling policy is a spread policy (the scheduling described in at least ph. [0062] is disclosed in at least Fig. 3 and ph. [0039] as implementable across different physical computing devices. The different physical devices then perform the task together, so the shared processing is distributed among them. Any scheduling implemented to do so is a type of spread scheduling policy, that is, one in which the tasks of a computing job are ‘spread’ out among different processing nodes, which results in faster overall processing).
Regarding claim 15, the rejection of claim 10 is incorporated and Munshi discloses:
the scheduling policy is a load balancing policy (the scheduling described in at least ph. [0062] is disclosed in at least Fig. 3 and ph. [0039] as implementable across different physical computing devices. The different physical devices then perform the task together, so the shared processing is distributed among them. Any scheduling implemented to do so is a type of spread scheduling policy, that is, one in which the tasks of a computing job are ‘spread’ out among different processing nodes, which results in faster overall processing. Further embodiments of the reference disclose that this scheduling includes load balancing among the different physical computing devices).
Regarding claim 16, the rejection of claim 10 is incorporated and Munshi discloses:
the scheduling policy determines whether two or more blocks are distributed to be performed on a single compute unit (at least ph. [0062] discloses an execution priority (therefore a scheduling policy) for kernels, whose respective groups/blocks of threads can be added as a compute kernel execution instance for a corresponding computer program executable; the priority/scheduling policy therefore determines when or if the kernel and its groups/blocks are processed/distributed to the system).
Claims 9, 27 and 36 are rejected under 35 U.S.C. 103(a) as being unpatentable over Munshi in view of Lee and further in view of Li et al. (hereinafter Li, US 2014/0208331).
Regarding claims 9, 27 and 36, the rejections of claims 1, 19 and 28 are incorporated and Munshi discloses:
in response to the API call, indicate the scheduling policy associated with the two or more groups of blocks of threads (see at least ph. [0062], which discloses that the processing logic updates the execution queue in response to API requests/calls, including instances in which one or more compute kernel execution instances are scheduled for execution in the execution queue; these kernels are disclosed as including a number of threads that may be organized as a thread group/block, and this may be repeated any number of times, resulting in any number of groups of blocks of threads for any kernel, including two or more).
Munshi and Lee do not expressly disclose, however, Li discloses:
in response to the function call, indicate the scheduling policy by returning a scheduling policy associated with the two or more groups of blocks of threads (see at least ph. [0069] for the calling of select_cores to select logical cores for thread groups (which may be done for any number of thread groups, including two or more) and the returning of a number of cores to be used (i.e., information about the number of cores) to the scheduler that called select_cores, which indicates at least the policy of the scheduler to utilize that many cores, as that number is, based on system information, apparently optimal).
It would have been obvious for a person of ordinary skill in the art at the time of filing to modify the teachings of Munshi, as modified by Lee, with the teachings of Li in order to implement a way to determine how many cores will be effective for processing a selected number of threads (organized in groups and/or blocks or other arrangements), so that those threads are processed effectively.
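For illustration only, the reading applied to Li above, in which a call returns a number of cores to the scheduler and that returned value indicates the policy to be used, may be sketched as follows. Only the name select_cores comes from the cited paragraph; the signature, the groups_per_core parameter, and the heuristic shown are assumptions made for this sketch and are not taken from Li.

```python
def select_cores(num_thread_groups, available_cores, groups_per_core=2):
    """Return the number of cores the calling scheduler should use.

    Hypothetical heuristic: aim for groups_per_core thread groups on each
    core, never exceed the cores available, and always use at least one.
    """
    needed = -(-num_thread_groups // groups_per_core)  # ceiling division
    return max(1, min(available_cores, needed))


# The scheduler calls select_cores and receives the core count back.
cores_small = select_cores(num_thread_groups=6, available_cores=8)
cores_large = select_cores(num_thread_groups=20, available_cores=8)
```

Under these assumptions the returned value is the only information the scheduler receives, which is why the rejection treats that value as indicating the policy.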
Claim 18 is rejected under 35 U.S.C. 103(a) as being unpatentable over Munshi in view of Lee and further in view of Li.
Regarding claim 18, the rejection of claim 10 is incorporated and Munshi discloses:
in response to the API call, indicate the scheduling policy associated with the two or more groups of blocks of threads (see at least ph. [0062], which discloses that the processing logic updates the execution queue in response to API requests/calls, including instances in which one or more compute kernel execution instances are scheduled for execution in the execution queue; these kernels are disclosed as including a number of threads that may be organized as a thread group/block, and this may be repeated any number of times, resulting in any number of groups of blocks of threads for any kernel, including two or more).
Munshi and Lee do not expressly disclose, however, Li discloses:
in response to the function call, indicate the scheduling policy by returning a scheduling policy associated with the two or more groups of blocks of threads (see at least ph. [0069] for the calling of select_cores to select logical cores for thread groups (which may be done for any number of thread groups, including two or more) and the returning of a number of cores to be used (i.e., information about the number of cores) to the scheduler that called select_cores, which indicates at least the policy of the scheduler to utilize that many cores, as that number is, based on system information, apparently optimal).
It would have been obvious for a person of ordinary skill in the art at the time of filing to modify the teachings of Munshi, as modified by Lee, with the teachings of Li in order to implement a way to determine how many cores will be effective for processing a selected number of threads (organized in groups and/or blocks or other arrangements), so that those threads are processed effectively.
Other References Cited Not Relied Upon
Marathe et al. (US 2012/0254875) discloses calling a specialized API to manage parallel thread execution.
Munshi et al. (US 2008/0276262) discloses API calls at runtime to include a number of threads that execute simultaneously in parallel.
Boone et al. (US 2022/0060862) discloses batches of API calls creating multiple threads for parallel execution of batches of API calls.
Response to Arguments
Applicant’s arguments have been fully considered but are moot in light of the new grounds of rejection.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CRAIG C DORAIS whose telephone number is (571)270-3371. The examiner can normally be reached M-F 9:00 am - 6:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre Vital, can be reached at (571)272-4215. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CRAIG C DORAIS/Primary Examiner, Art Unit 2198