Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 16 January 2026 has been entered.
Response to Arguments
Applicant’s arguments with respect to claims 1-31 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-31 are rejected under 35 U.S.C. 103 as being unpatentable over Johnson (US 2020/0081748) in view of Barrow-Williams (US 10,037,228).
Regarding claim 1, Johnson teaches: A central processing unit (CPU) comprising: circuitry to:
cause a graphics processing unit (GPU) to reserve a set of streaming multiprocessors (SMs) within the GPU (¶ 188, “Volta MPS provides control for MPS clients to specify what fraction of the GPU (GPU fraction 2108, GPU fraction 2110, and GPU fraction 2112) is necessary for execution”) to perform one or more software threads based (¶ 191, “the parallel processing unit 2200 is a multi-threaded processor that is implemented on one or more integrated circuit devices”), at least in part, on one or more indications of a number of SMs to be included in the set of SMs (Figure 21, ¶ 188, “GPU fraction 2108, GPU fraction 2110, and GPU fraction 2112”), wherein the one or more indications are provided by the CPU to the GPU via an application program interface (API) (¶ 202, “a host processor executes a driver kernel that implements an application programming interface (API) that enables one or more applications executing on the host processor to schedule operations for execution on the parallel processing unit 2200”) for one or more contexts for performing the one or more software threads (¶ 220, “define groups of threads explicitly at sub-block (e.g., as small as a single thread) . . . so that libraries and utility functions can synchronize safely within their local context without having to make assumptions about convergence”).
Johnson does not teach the following limitation; however, Barrow-Williams discloses: a number of SMs to be included in the set of SMs (col. 16:50-54, “‘deep allocation’ preferentially assigns CTAs associated with the same grid to a minimum number of different SMs 310 to generally maximize cache affinity for both TLB caching as well as data caching”).
It would have been obvious to a person having ordinary skill in the art, before the effective filing date of the claimed invention, to have applied the known technique of a number of SMs to be included in the set of SMs, as taught by Barrow-Williams, in the same way to the one or more indications, as taught by Johnson. Both inventions are in the field of GPU scheduling, and combining them would have predictably resulted in “improving GPU utilization and performance in certain applications,” as indicated by Barrow-Williams (col. 19:53-54).
Regarding claim 2, Johnson teaches: The CPU of claim 1, wherein the set of SMs is to be reserved, at least in part, by a multi-process service (MPS) serving one or more clients for performing the one or more software threads and a server to reserve the set of SMs (¶ 187, “a multi-process service environment 2100 using Volta Multi-Process Service (MPS 2118) is a feature of the Volta GV100 architecture enabling improved performance and isolation for multiple compute applications sharing the GPU”), the one or more clients providing the one or more indications (¶ 188, “Volta MPS provides control for MPS clients to specify what fraction of the GPU (GPU fraction 2108, GPU fraction 2110, and GPU fraction 2112) is necessary for execution”), wherein the circuitry is further to cause the GPU to reserve based, at least in part, on the one or more indications, a subset of the set of SMs within the GPU for the one or more contexts to perform the one or more software threads prior to scheduling performance of the one or more software threads by the GPU (¶ 188, “This control to restrict each client to only a fraction of the GPU execution resources reduces or eliminates head-of-line blocking where work from one MPS client may overwhelm GPU execution resources”).
Regarding claim 3, Johnson teaches: The CPU of claim 1, wherein the one or more indications comprise environment variable data set by one or more software programs for performing the one or more software threads (¶ 188, “Volta MPS provides control for MPS clients to specify what fraction of the GPU (GPU fraction 2108, GPU fraction 2110, and GPU fraction 2112) is necessary for execution”) and provided via the API for the one or more contexts (¶ 202, “An application may generate instructions (e.g., API calls) that cause the driver kernel to generate one or more tasks for execution by the parallel processing unit 2200”).
Regarding claim 4, Barrow-Williams teaches: The CPU of claim 1, wherein circuitry is further to combine the one or more contexts with one or more other contexts corresponding to one or more other software threads (col. 13:45-48, “A resource manager (RM) 454 within driver 103 is configured to pack the one or more thread programs, each assigned to a TMD 452, into one GPU context 450 for simultaneous execution within a single GPU context”).
Regarding claim 5, Barrow-Williams teaches: The CPU of claim 4, wherein the one or more contexts are combined based, at least in part, on many-to-one context mapping (col. 16:25-29, “a particular PA page is mapped into two or more VA spaces. In such a usage model, the PA page comprises a shared memory page having two or more different virtual address representations in corresponding execution contexts”).
Regarding claim 6, Johnson teaches: The CPU of claim 1, wherein one or more multi-process service (MPS) clients provide the one or more indications via the API for the one or more contexts for performing the one or more software threads to a MPS server (¶ 187, “Starting with Kepler GK110 GPUs, NVIDIA introduced a software-based multi-process service (MPS) and MPS server that allowed multiple different CPU processes (application contexts) to be combined into a single application context and run on the GPU, attaining higher GPU resource utilization”), the MPS server to manage access to the set of SMs by the one or more software threads (¶ 187, “The MPS 2118 may be used to implement improved thread convergence in accordance with the methods disclosed herein”).
Regarding claim 7, Johnson teaches: The CPU of claim 1, wherein the circuitry is further to execute a daemon process to communicate with one or more clients associated with the one or more software threads (¶ 187, “This process acts as the intermediary to submit work 2120 to the work queues 2122 inside the GPU 2124 for concurrent kernel execution”).
Claim 14 recites subject matter commensurate with claim 1. Therefore, it is rejected for the same reasons.
Regarding claim 19, Johnson teaches: the group of SMs comprises two or more cores and memory (¶ 215, “each of the SM 2500 modules may implement an L1 cache. The L1 cache is private memory that is dedicated to a particular SM 2500”).
Claims 8-13, 15-18, and 20-31 recite subject matter commensurate with claims 1-7, 14, and 19. Therefore, they are rejected for the same reasons.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JACOB D DASCOMB whose telephone number is (571) 272-9993. The examiner can normally be reached M-F 9:00-5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre Vital, can be reached at (571) 272-4215. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JACOB D DASCOMB/ Primary Examiner, Art Unit 2198