Prosecution Insights
Last updated: May 29, 2026
Application No. 18/250,708

Persistent Multi-Instance GPU Partitions

Non-Final OA §103
Filed: Apr 26, 2023
Priority: Dec 14, 2022 (nonprovisional of PCT/US2022/052887, +1 more)
Examiner: AYERS, MICHAEL W
Art Unit: 2195
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Rakuten Symphony Inc.
OA Round: 3 (Non-Final)
Grant Probability: 70% (Favorable)
OA Rounds: 3-4
Est. Remaining: 1m
With Interview: 99%

Examiner Intelligence

Grants 70% (above average)
Career Allowance Rate: 70% (205 granted / 292 resolved; +15.2% vs TC avg)
Interview Lift: +53.7% (strong; resolved cases with interview)
Typical timeline, Avg Prosecution: 3y 2m (21 currently pending)
Career history, Total Applications: 325 (across all art units)

Statute-Specific Performance

§101: 3.2% (-36.8% vs TC avg)
§103: 91.4% (+51.4% vs TC avg)
§102: 0.8% (-39.2% vs TC avg)
§112: 3.0% (-37.0% vs TC avg)
TC avg is the Tech Center average estimate. Based on career data from 292 resolved cases.

Office Action

§103
DETAILED ACTION

This office action is in response to claims and remarks filed 11 May 2026. Claims 1, 3-11, and 13-20 are pending.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 11 May 2026 has been entered.

Response to Arguments

Applicant's arguments filed 11 May 2026 have been fully considered but they are not persuasive. On pages 8-12, applicant argues the following in the remarks:

“The Cully and Ahn references are silent on any teaching of the following claimed concepts… “rebooting the node, wherein rebooting the node deletes the one or more GPU instances”… “Therefore Cully does not disclose, teach, or otherwise suggest… “rebooting the node, wherein rebooting the node deletes the one or more GPU instances”… “Applicant submits that the accelerator memory dump…as disclosed by Ahn, does not disclose, teach or otherwise suggest “the file” as claimed. Specifically, Applicant’s claimed feature recites ‘the accessing, by the node, a server with the file when the node completes the reboot’… Accordingly, independent claims 1, 11, and 20, at least as amended, are novel and non-obvious over Cully in view of Ahn.”

The examiner respectfully disagrees.
MPEP 2145(IV) states “One cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references.” Applicant’s argument attacks Cully for not teaching “rebooting the node…”; however, this aspect is taught by Ahn. Applicant’s argument then attacks Ahn for not teaching “the file”; however, this aspect is taught by Cully. As such, the combination of Cully and Ahn, when considered together and not individually, teaches the claim limitation at issue.

Applicant makes additional arguments directed to newly amended claim limitations. These arguments are moot because they do not specifically address the new reference (KURKURE, cited below) being used to reject those newly amended claim limitations in the current rejection.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3-4, 6-7, 9, 11-14, 16-17 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over CULLY et al. Pub. No.: US 2023/0393898 A1 (hereafter CULLY), in further view of KURKURE et al. Pub. No.: US 2021/0373972 A1 (hereafter KURKURE), in further view of AHN et al. Pub. No.: US 2023/0281081 A1 (hereafter AHN).
CULLY and AHN were cited previously Regarding claim 1, CULLY teaches the invention substantially as claimed, including: A method comprising: providing a node comprising one or more applications ([0022] Application 214 is co-executed by…process container 208 running in the acceptor node); providing a GPU ([0022] Process container 208 is spun up in pod VM 130 of host 120-1 because host 120-1 is equipped with…special hardware 166 (e.g., GPU)); dividing the GPU into one or more GPU instances, wherein each GPU instance is associated with at least one of the one or more applications ([0038] A hypervisor may provision a virtual hardware platform including a virtual GPU (i.e., virtual GPUs represent a “division” of the physical GPU) for a pod VM); saving partition data pertaining to the one or more GPU instances to a file…and saving the file to a database ([0038] When this pod VM is suspended, the entire GPU state (i.e., vGPU “partition data”) is saved in a backing file in storage (i.e., “database”)). …[delete] the one or more GPU instances ([0033] In step 602, in response to an instruction to suspend a workload, the pod VM controller suspends the workload by invoking a scheduler of its node to suspend execution of the workload and then evicting a portion or all of the executing image of the workload that is in memory to storage (e.g., shared storage 170), depending on how much memory resources need to be freed up. The amount of memory resources that need to be freed up may be indicated as a parameter of the instruction for suspending the workload. [0034] Orchestration service 105 adds the entry for the suspended workload to idle set 304 and removes the entry for the suspended workload from one of the active sets corresponding to the node where the workload is now idle. 
(i.e., evicting vGPU state from memory to storage when an associated workload is suspended removes or “deletes” the vGPU state)); accessing, by the node, a server with the file…retrieving the file comprising the partition data; creating new one or more GPU instances according to the partition data ([0038] The migration destination node restores the GPU state by reading the backing file from the location in storage indicated in the fetch description (i.e., destination node accesses shared storage server 170 to retrieve the GPU state for a new (“created”) vGPU in the destination node)); and associating the one or more applications with the new one or more GPU instances ([0039] The state of this workload is restored in the migration destination node by migrating the part of the workload state that is in the memory of the migration source node to the migration destination node and restoring the rest of the workload state from storage (i.e., workload of an application is associated with the restored vGPU state on the destination node)). While CULLY teaches migration of workloads between nodes, CULLY does not explicitly teach: saving partition data pertaining to the one or more GPU instances in a file including a mapping of the GPU instances to the one or more applications; associating the one or more applications with the new one or more GPU instances including remapping the one or more applications based on the mapping saved in the file on a one-to-one mapping basis. However, in analogous art that similarly discusses migrating workloads between nodes, KURKURE teaches: saving partition data pertaining to the one or more GPU instances in a file including a mapping of the GPU instances to the one or more applications ([0014] The GPUs 115 can be vGPU-enabled, or support vGPUs 151. For example, NVIDIA® vGPU solutions can allow multiple virtual machines 118, or workloads, to share a GPU 115 with a balance among performance, security and isolation. 
Each virtual machine 118 can be assigned to a vGPU 151 of the GPU 115. [0001] A virtual machine can include an operating system (OS) running one or more applications. [0020] Virtual machine data 128…can be stored in the data store 117. [0036] Virtual machine data 128 can represent information related to virtual machines 118. Virtual machine data 128 can include a record of all vGPU requests for the virtual machines 118. A vGPU request can include a graphics processing workload or graphics processing requirement of a virtual machine 118. Virtual machine data 128 can include an identifier or name of each virtual machine 118, and an identifier or location of a GPU 115 where a vGPU request or virtual machine 118 is being processed or executed (i.e., virtual machine data stored as a “file” in a data store reflects at least a mapping between an application executing on a VM and a vGPU of a particular GPU that processes the vGPU requests of the application)); associating the one or more applications with the new one or more GPU instances including remapping the one or more applications based on the mapping saved in the file on a one-to-one mapping basis ([0076] In step 530, the virtual machine scheduler 120 can migrate the virtual machine 118 to the destination GPU 115. This can include creating a vGPU 151 for the virtual machine 118 in conjunction with a hypervisor 135, and assigning the virtual machine 118 to the vGPU 151 (i.e., the migrated VM application is associated with a different vGPU instance, which changes the assignment, or “mapping” of the VM application to the new vGPU in the Virtual machine data 128 file)). 
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to have combined KURKURE’s teaching of mapping and remapping applications with GPU instances during workload migration, with CULLY’s teaching of workload migration between GPU instances, to realize, with a reasonable expectation of success, a system that migrates workloads between GPU instances, as in CULLY, by mapping and remapping applications and GPU instances in a file, as in KURKURE. A person having ordinary skill would have been motivated to make this combination to enable more efficient utilization of resources (KURKURE [0002]).

While CULLY teaches deleting GPU instances upon suspension and retrieving GPU instance data, CULLY and KURKURE do not explicitly teach: rebooting the node, wherein rebooting the node deletes the one or more GPU instances; accessing, by the node, a server… when the node completes the reboot.

However, in analogous art that similarly deletes GPU instances, AHN teaches: rebooting the node, wherein rebooting the node deletes the one or more GPU instances ([0068] In operation 5, when a checkpoint operation for a memory of an accelerator is completed and an accelerator (i.e., “GPU instance”) memory dump 433 has been generated (e.g., on-demand as discussed with reference to FIG. 3), an operation node 410 may restart or reset (i.e., “reboot”). In this case, a master node 420 may be involved in the restarting of the operation node 410, but implementations are not limited to the foregoing example (i.e., dumping accelerator memory upon node restart “deletes” the instance of the accelerator from the operation node). 
[0050] The accelerator 220 may include, for example, a graphics processing unit (GPU)) accessing, by the node, a server…with the file when the node completes the reboot ([0070] In operation 7, the accelerator memory dump 433 stored in the storage node 430 may be transmitted (i.e., “accessed”) to the host processor of the operation node 410. The accelerator memory dump 433 may include data checkpointed on demand in response to a failure occurring in the host processor (e.g., per operation 4 of FIGS. 3 and/or 5 ); It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to have combined AHN’s teaching of rebooting a node causing a GPU instance to be deleted, with CULLY and KURKURE’s teaching of causing a GPU instance to be deleted upon suspension, to realize, with a reasonable expectation of success, a system that deletes GPU instances from a node, as in CULLY and KURKURE, when the node is rebooted, as in AHN. A person having ordinary skill would have been motivated to make this combination to enable a system to recover from faults and failures that require operation node rebooting while maintaining GPU memory state. Regarding claim 3, CULLY further teaches: dividing the GPU into one or more GPU instances, saving partition data pertaining to the one or more GPU instances to a file, and saving the file to a database are performed by an automated agent ([0044] One or more embodiments of the present invention may be implemented as one or more computer programs (i.e., “automated agent”) or as one or more computer program modules embodied in computer-readable media). 
Regarding claim 4, CULLY further teaches: retrieving the file comprising the partition data, creating new one or more partitions according to the partition data, and associating the one or more applications with the new one or more GPU instances are performed by the automated agent ([0044] One or more embodiments of the present invention may be implemented as one or more computer programs (i.e., “automated agent”) or as one or more computer program modules embodied in computer-readable media). Regarding claim 6, AHN further teaches: the node provides a handshake to the automated agent upon completing the reboot such that the automated agent receives an indication to retrieve the file from the database ([0078] A host memory dump previously stored (e.g., checkpointed per a checkpoint interval of the host processor 510) in the storage 540 may be transmitted to the restarted host processor 510 and loaded into the memory thereof, and an accelerator memory dump may be transmitted to the accelerator 530 and loaded into the memory thereof)… CULLY further teaches: retrieve the file from the database and partition the GPU ([0038] The migration destination node restores the GPU state by reading the backing file from the location in storage indicated in the fetch description (i.e., destination node accesses shared storage server 170 to retrieve the GPU state for a new vGPU in the destination node)) Regarding claim 7, CULLY further teaches: the automated agent saves the partition data to the file each time there is a change to a number of the GPU instances, a configuration of the GPU instances, or a mapping of the GPU instances to the one or more applications ([0038] When this pod VM is suspended (i.e., suspending the pod VM suspends the vGPU assigned to that pod VM, thereby lowering the “number” of active GPU instances), the entire GPU state (i.e., vGPU “partition data”) is saved in a backing file in storage (i.e., “database”)). 
Regarding claim 9, CULLY further teaches: the partition data comprises one or more of a state of the GPU instances, a configuration of the GPU instances, metadata describing the GPU, or a ratio of the compute power in each GPU instance ([0038] When this pod VM is suspended, the entire GPU state (i.e., vGPU “partition data”) is saved in a backing file in storage (i.e., “database”)).

Regarding claims 11, 13-14, 16-17, they comprise limitations similar to those of claims 1, 3-4, 6, and 7, and are therefore rejected for at least similar rationale.

Regarding claim 20, it comprises limitations similar to those of claim 1, and is therefore rejected for similar rationale.

Claims 5 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over CULLY, in view of KURKURE, in view of AHN, as applied to claims 1 and 11 above, and in further view of DULUK JR. et al. Pub. No.: US 2023/0289212 A1 (hereafter DULUK).

Regarding claim 5, while CULLY discusses GPUs partitioned into virtual GPUs, CULLY, KURKURE, and AHN do not explicitly teach: the GPU is a plurality of GPUs and the plurality of GPUs each support one or more GPU instances, and wherein partition data of each GPU instance of each GPU is saved to the file.

However, in analogous art that similarly teaches physical GPU support of virtual GPUs, DULUK teaches: the GPU is a plurality of GPUs and the plurality of GPUs each support one or more GPU instances ([0036] In a virtualized environment that's powered by NVIDIA virtual GPUs, the NVIDIA virtual GPU (vGPU) software is installed at a virtualization layer along with a hypervisor. This software creates virtual GPUs that let every virtual machine (VM) share the physical GPU installed on the server. For more demanding workflows, a single VM can harness the power of multiple physical GPUs. 
For example, an installation can include many nodes, where each node may include several CPUs and several GPUs (i.e., multiple nodes each contain multiple GPUs which support multiple created virtual GPU “instances”)), and wherein partition data of each GPU instance of each GPU is saved to the file ([0037] HPC installations should be able to migrate a VM from one part of the installation to another. For example, when a node is taken down for maintenance, all the VMs on that node are migrated to different nodes…At the time of migration, the programs running on migrating VMs are preempted off the CPU(s) and GPU(s), memory images and context save buffers are moved to different places in the HPC installation (i.e., context save buffers represent “partition data” of the multiple virtual GPU instances of the GPUs are saved to buffer files)).

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to have combined DULUK’s teaching of a system where multiple GPUs create multiple virtual GPU instances that save context data in context save buffers, with CULLY, KURKURE, and AHN’s teaching of saving context data of a virtual GPU instance of a physical GPU, to realize, with a reasonable expectation of success, a system that saves virtual GPU instance context data, as in CULLY, KURKURE, and AHN, for a plurality of virtual GPU instances of a plurality of physical GPUs, as in DULUK. A person having ordinary skill would have been motivated to make this combination to enable a virtual machine to process more demanding workflows (DULUK [0036]).

Regarding claim 15, it comprises limitations similar to claim 5, and is therefore rejected for similar rationale.

Claims 8 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over CULLY, in view of KURKURE, in view of AHN, as applied to claims 1 and 11 above, and in further view of LIANG et al. Pub. No.: US 2023/0074456 A1 (hereafter LIANG).

Regarding claim 8, AHN further teaches: the partition data is periodically saved to the file according to a time period and saved to the database ([0055] Setting a checkpoint interval (checkpoint frequency) (i.e., “time period”) for periodic routine checkpointing of an accelerator (e.g., the accelerator 220) (i.e., periodic checkpointing saves accelerator data) having a relatively increased capacity, the FIT rate of only the accelerator (e.g., the accelerator 220) may be applied for determining its checkpoint interval).

While CULLY, KURKURE, and AHN discuss collecting partition data of vGPU partitions in a periodic manner, CULLY and AHN do not explicitly teach: the time period is specified by a user.

However, in analogous art that similarly periodically collects state data of a processor, LIANG teaches: the time period is specified by a user ([0008] an emulation system captures DUT data by receiving a determined number of clock cycle intervals to sample internal state signals (e.g., a predetermined number as specified by a user). The emulation system may then receive the DUT data, which includes internal state signals and primary input signals. The emulation system can sample the primary input signals on each clock cycle and sample the internal state signals on every determined number of clock cycles. The emulation system may then create, on each clock cycle, a header for a current sample of the DUT data. The header may include a time stamp of the current sample, a sample count to the current sample, a last sample pointer, a last sector pointer, and a last frame pointer. The emulation system may store, with each clock cycle, the current header of the current sample of the DUT data with the time stamp. 
The emulation system can store the internal state signal at each interval corresponding to the determined number of clock cycle intervals (i.e., user specifies time period (in number of clock cycle intervals) for collection of internal state data)).

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to have combined LIANG’s teaching of a user setting intervals for periodically collecting internal processor state data, with the combination of CULLY, KURKURE, and AHN’s teaching of periodically collecting internal processor state data in the form of vGPU partition data, to realize, with a reasonable expectation of success, a system that collects vGPU partition data periodically, as in CULLY, KURKURE and AHN, according to intervals set by a user, as in LIANG. A person having ordinary skill would have been motivated to make this combination to give users enhanced control over collection of internal state data while reducing processing and time cost (LIANG [0003]).

Regarding claim 18, it comprises limitations similar to those of claim 8, and is therefore rejected for similar rationale.

Claims 10 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over CULLY, in view of KURKURE, in view of AHN, as applied to claims 1 and 11 above, and in further view of CHEN Pub. No.: US 2007/0157012 A1 (hereafter CHEN).

Regarding claim 10, CULLY further teaches: accessing the server is performed by an automated agent ([0044] One or more embodiments of the present invention may be implemented as one or more computer programs (i.e., “automated agent”) or as one or more computer program modules embodied in computer-readable media).

While CULLY, KURKURE, and AHN teach rebooting a node, CULLY, KURKURE and AHN do not explicitly teach: wherein the automated agent provides a handshake to determine when the node finishes the reboot. 
However, in analogous art that similarly teaches rebooting/booting of a node, CHEN teaches: wherein the automated agent provides a handshake to determine when the node finishes the reboot ([0024] A handshake protocol may be established to facilitate communication between the CPUs 102, . . . , 106, the clients 108, . . . , 112, the memory 116 and the host CPU 114 during an initial boot of the multiple CPU system 100. For example, an initial boot of the multiple CPU system 100 may include the execution of one or more general processing instructions generated by the host processor 114. The general processing instructions may be executed in a determined sequence to complete the initial boot. For example, during an initial boot, the host processor 114 may generate GPIs 120 and 122 to clients 108 and 110. The GPIs 120 and 122 may comprise instructions for running one or more test patterns to the memory 116 so that the host CPU 114 may adjust an optimal clock rate prior to completion of the initial boot sequence. After the test patterns were generated and an optimal clock rate of the host CPU 114 configured, the initial boot sequence may be considered complete and normal data processing operation may be initiated within the multiple CPU system 100 (i.e., the handshake protocol enables a determination that the boot sequence is complete)). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to have combined CHEN’s teaching of a handshake protocol enabling a determination that a boot sequence is complete, with the combination of CULLY, KURKURE, and AHN’s teaching of rebooting a node, to realize, with a reasonable expectation of success, a system that reboots a node, as in CULLY, KURKURE, and AHN, and determines that the reboot is complete based on a handshake protocol, as in CHEN. 
A person having ordinary skill would have been motivated to make this combination to ensure that a sequence of actions in the handshake are executed to ensure that a node is correctly and completely booted.

Regarding claim 19, it comprises limitations similar to claim 10, and is therefore rejected for similar rationale.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL W AYERS whose telephone number is (571)272-6420. The examiner can normally be reached M-F 8:30-5 PM.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aimee Li can be reached at (571) 272-4169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MICHAEL W AYERS/
Primary Examiner, Art Unit 2195
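
For readers following the claim mapping, the sequence recited in claim 1 as quoted in the rejection (save partition data and an application mapping to a file, store it, reboot, retrieve it, recreate the instances, remap one-to-one) can be sketched roughly as below. This is an illustrative toy only, not the applicant's implementation or anything taken from the cited references: the `Node` class, the function names, the `compute_share` field, and the use of a dict as the "database" are all hypothetical.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical model: instance id -> configuration of that GPU instance.
    gpu_instances: dict = field(default_factory=dict)
    # Hypothetical model: application name -> instance id (one-to-one).
    app_to_instance: dict = field(default_factory=dict)


def save_partition_file(node, database, key):
    """Serialize partition data plus the app mapping and store it under key."""
    payload = {"instances": node.gpu_instances, "mapping": node.app_to_instance}
    database[key] = json.dumps(payload, sort_keys=True)


def reboot(node):
    """Model a reboot that deletes every GPU instance on the node."""
    node.gpu_instances = {}
    node.app_to_instance = {}


def restore_after_reboot(node, database, key):
    """Retrieve the file, recreate the instances, and remap applications."""
    payload = json.loads(database[key])
    mapping = payload["mapping"]
    # Enforce the one-to-one basis: no two applications share an instance.
    if len(set(mapping.values())) != len(mapping):
        raise ValueError("saved mapping is not one-to-one")
    # Every mapped instance must exist in the saved partition data.
    missing = [i for i in mapping.values() if i not in payload["instances"]]
    if missing:
        raise ValueError(f"mapping refers to unknown instances: {missing}")
    node.gpu_instances = dict(payload["instances"])
    node.app_to_instance = dict(mapping)


database = {}  # stand-in for the server-side database
node = Node(
    gpu_instances={"gi-0": {"compute_share": 0.5}, "gi-1": {"compute_share": 0.5}},
    app_to_instance={"app-a": "gi-0", "app-b": "gi-1"},
)
save_partition_file(node, database, "node-1")
reboot(node)
restore_after_reboot(node, database, "node-1")
print(node.app_to_instance)
```

The one-to-one check is where the sketch reflects the amended limitation that the rejection maps to KURKURE; the other steps correspond to the limitations mapped to CULLY and AHN.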

Prosecution Timeline

Apr 26, 2023: Application Filed
Oct 10, 2025: Non-Final Rejection mailed (§103)
Jan 08, 2026: Response Filed
Mar 11, 2026: Final Rejection mailed (§103)
May 11, 2026: Request for Continued Examination
May 12, 2026: Response after Non-Final Action
May 18, 2026: Non-Final Rejection mailed (§103, current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12613696: SYSTEM, APPARATUS AND METHOD FOR THROTTLING FUSION OF MICRO-OPERATIONS IN A PROCESSOR (4y 4m to grant; granted Apr 28, 2026)
Patent 12547446: Computing Device Control of a Job Execution Environment Based on Performance Regret of Thread Lifecycle Policies (4y 7m to grant; granted Feb 10, 2026)
Patent 12498950: SIGNAL PROCESSING DEVICE AND DISPLAY APPARATUS FOR VEHICLE USING SHARED MEMORY TO TRANSMIT ETHERNET AND CONTROLLER AREA NETWORK DATA BETWEEN VIRTUAL MACHINES (3y 8m to grant; granted Dec 16, 2025)
Patent 12493497: DETECTION AND HANDLING OF EXCESSIVE RESOURCE USAGE IN A DISTRIBUTED COMPUTING ENVIRONMENT (5y 2m to grant; granted Dec 09, 2025)
Patent 12461768: CONFIGURING METRIC COLLECTION BASED ON APPLICATION INFORMATION (3y 8m to grant; granted Nov 04, 2025)
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 70%
With Interview: 99% (+53.7%)
Median Time to Grant: 3y 2m (~1m remaining)
PTA Risk: High
Based on 292 resolved cases by this examiner. Grant probability derived from career allowance rate.
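
The headline figures can be sanity-checked with a little arithmetic. The sketch below assumes, without confirmation from the page, that the grant probability is simply granted divided by resolved cases, that the 3y 2m median is counted from the filing date, and that the remaining time is measured from the "Last updated" date; `add_months` is a hypothetical helper written for this check.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Return d shifted forward by a number of months, clamping the day."""
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


# Career allowance rate: 205 granted out of 292 resolved cases.
allowance_rate = round(205 / 292 * 100, 1)

# Assumed: 3y 2m (38 months) counted from the Apr 26, 2023 filing date.
projected_grant = add_months(date(2023, 4, 26), 38)

# Assumed: remaining time measured from the May 29, 2026 update date.
days_remaining = (projected_grant - date(2026, 5, 29)).days

print(allowance_rate, projected_grant, days_remaining)
```

Under those assumptions the rate comes to about 70.2%, and the projected date falls roughly four weeks after the update date, which is consistent with the displayed 70% and "~1m remaining".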
