Last updated: April 19, 2026
Application No. 18/250,708
Persistent Multi-Instance GPU Partitions

Final Rejection §103
Filed: Apr 26, 2023
Examiner: AYERS, MICHAEL W
Art Unit: 2195
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Rakuten Symphony Inc.
OA Round: 2 (Final)
Interview Optional

+56.2% interview lift. This examiner has a relatively high allow rate; a written response may suffice.
Based on 287 resolved cases, 2023–2026
Examiner Intelligence

AYERS, MICHAEL W
Career Allow Rate: 70% (above average); 200 granted / 287 resolved; +14.7% vs TC avg
Interview Lift: +56.2% (strong), comparing resolved cases with and without an interview
Avg Prosecution (typical timeline): 3y 4m; 37 currently pending
Total Applications (career history): 324 across all art units
Statute-Specific Performance

§101: 14.8% (-25.2% vs TC avg)
§103: 47.3% (+7.3% vs TC avg)
§102: 2.9% (-37.1% vs TC avg)
§112: 25.6% (-14.4% vs TC avg)
Tech Center average is an estimate. Based on career data from 287 resolved cases.
Office Action

§103
DETAILED ACTION
This office action is in response to claims and remarks filed 8 January 2026.
Claims 1, 3-11, and 13-20 are pending.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant’s arguments, see pages 7-9 of the remarks filed 8 January 2026, with respect to the claim objections and rejections made under 35 U.S.C. 112, 35 U.S.C. 101, and 35 U.S.C. 102 have been fully considered and are persuasive.  The objections and rejections have been withdrawn. 

Applicant’s arguments, see pages 9-12 of the remarks filed 8 January 2026, with respect to the rejections made under 35 U.S.C. 103 have been fully considered but are not persuasive.

On pages 9-12 of the remarks, applicant argues the following:
“Applicant respectfully submits that the cited Cully reference does not anticipate claim 1, at least as amended. Specifically, the Cully does not teach the following limitations and Ahn is relied upon to teach these limitations: rebooting the node, wherein rebooting the node deletes the one or more GPU instances; accessing, by the node, a server with the file when the node completes the reboot.
“Applicant respectfully submit that Ahn does not teach the above limitations…
“Applicant submits that the accelerator memory dump that ‘may include data checkpointed on demand in response to a failure occurring in the host processor,’ as disclosed by Ahn, does not teach or suggest ‘the file’ as claimed. Specifically, Applicant’s claimed feature recites ‘the accessing, by the node, a server with the file when the node completes the reboot.’ The ‘file’ as claimed included saved ‘partition data pertaining to the one or more GPU instances to a file.’ Applicant submits that the data checkpointed on demand in response to a failure occurring in the host processor, as disclosed by Ahn, does not teach or suggest partition data pertaining to the one or more GPU instances saved to a file. Therefore, Ahn does not teach or suggest ‘rebooting the node, wherein rebooting the node deletes the one or more GPU instances’ and ‘accessing, by the node, a server with the file when the node completes the reboot,’ as claimed.”

	The examiner respectfully disagrees. MPEP 2145 IV states:
“One cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references.”
	In our case, the rejection relies on CULLY’s teaching of saving a “file” of partition data of GPU instances (“[0038] When this pod VM is suspended, the entire GPU state (i.e., vGPU “partition data”) is saved in a backing file in storage (i.e., “database”)”). CULLY further retrieves the “file” when a GPU state is restored (“[0038] The migration destination node restores the GPU state by reading the backing file from the location in storage indicated in the fetch description (i.e., destination node accesses shared storage server 170 to retrieve the GPU state for a new (“created”) vGPU in the destination node)”). AHN was used to teach that saving and restoring of a file may occur when nodes reboot (“[0068] In operation 5, when a checkpoint operation for a memory of an accelerator is completed and an accelerator (i.e., “GPU instance”) memory dump 433 has been generated (e.g., on-demand as discussed with reference to FIG. 3), an operation node 410 may restart or reset (i.e., “reboot”). In this case, a master node 420 may be involved in the restarting of the operation node 410, but implementations are not limited to the foregoing example (i.e., dumping accelerator memory upon node restart “deletes” the instance of the accelerator from the operation node). [0050] The accelerator 220 may include, for example, a graphics processing unit (GPU)”, “[0070] In operation 7, the accelerator memory dump 433 stored in the storage node 430 may be transmitted (i.e., “accessed”) to the host processor of the operation node 410. The accelerator memory dump 433 may include data checkpointed on demand in response to a failure occurring in the host processor (e.g., per operation 4 of FIGS. 3 and/or 5)”).
Thus, by combining CULLY’s teaching of storing and retrieving a “file” of partition data of GPU instances, with AHN’s teaching of storing and retrieving files during reboot of a node, the combination of references teaches storing and retrieving of a “file” of partition data of GPU instances, as in CULLY, during reboot of a node, as in AHN.
Since the applicant argues that AHN fails to teach the claimed “file”, but does not consider that this limitation is taught by CULLY, the applicant’s argument fails to consider the combination of references, and attempts to attack the references individually. The applicant’s argument is therefore not persuasive.
	All other arguments in the remarks rely on this unpersuasive argument, and are therefore themselves unpersuasive.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3-4, 6-7, 9, 11, 13-14, 16-17, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over CULLY et al. Pub. No.: US 2023/0393898 A1 (hereafter CULLY), in view of AHN et al. Pub. No.: US 2023/0281081 A1 (hereafter AHN).

Regarding claim 1, CULLY teaches the invention substantially as claimed, including:
A method comprising:
providing a node comprising one or more applications ([0022] Application 214 is co-executed by…process container 208 running in the acceptor node);
providing a GPU ([0022] Process container 208 is spun up in pod VM 130 of host 120-1 because host 120-1 is equipped with…special hardware 166 (e.g., GPU));
dividing the GPU into one or more GPU instances, wherein each GPU instance is associated with at least one of the one or more applications ([0038] A hypervisor may provision a virtual hardware platform including a virtual GPU (i.e., virtual GPUs represent a “division” of the physical GPU) for a pod VM);
saving partition data pertaining to the one or more GPU instances to a file; and saving the file to a database ([0038] When this pod VM is suspended, the entire GPU state (i.e., vGPU “partition data”) is saved in a backing file in storage (i.e., “database”)). 
…[delete] the one or more GPU instances ([0033] In step 602, in response to an instruction to suspend a workload, the pod VM controller suspends the workload by invoking a scheduler of its node to suspend execution of the workload and then evicting a portion or all of the executing image of the workload that is in memory to storage (e.g., shared storage 170), depending on how much memory resources need to be freed up. The amount of memory resources that need to be freed up may be indicated as a parameter of the instruction for suspending the workload. [0034] Orchestration service 105 adds the entry for the suspended workload to idle set 304 and removes the entry for the suspended workload from one of the active sets corresponding to the node where the workload is now idle. (i.e., evicting vGPU state from memory to storage when an associated workload is suspended removes or “deletes” the vGPU state));
accessing, by the node, a server…retrieving the file comprising the partition data; creating new one or more GPU instances according to the partition data ([0038] The migration destination node restores the GPU state by reading the backing file from the location in storage indicated in the fetch description (i.e., destination node accesses shared storage server 170 to retrieve the GPU state for a new (“created”) vGPU in the destination node)); and
associating the one or more applications with the new one or more GPU instances ([0039] The state of this workload is restored in the migration destination node by migrating the part of the workload state that is in the memory of the migration source node to the migration destination node and restoring the rest of the workload state from storage (i.e., workload of an application is associated with the restored vGPU state on the destination node)). 

While CULLY teaches deleting GPU instances upon suspension and retrieving GPU instance data, CULLY does not explicitly teach:
rebooting the node, wherein rebooting the node deletes the one or more GPU instances
accessing, by the node, a server when the node completes the reboot;

However, in analogous art that similarly deletes GPU instances, AHN teaches:
rebooting the node, wherein rebooting the node deletes the one or more GPU instances
([0068] In operation 5, when a checkpoint operation for a memory of an accelerator is completed and an accelerator (i.e., “GPU instance”) memory dump 433 has been generated (e.g., on-demand as discussed with reference to FIG. 3 ), an operation node 410 may restart or reset (i.e., “reboot”). In this case, a master node 420 may be involved in the restarting of the operation node 410, but implementations are not limited to the foregoing example (i.e., dumping accelerator memory upon node restart “deletes” the instance of the accelerator from the operation node). [0050] The accelerator 220 may include, for example, a graphics processing unit (GPU))
accessing, by the node, a server with the file when the node completes the reboot ([0070] In operation 7, the accelerator memory dump 433 stored in the storage node 430 may be transmitted (i.e., “accessed”) to the host processor of the operation node 410. The accelerator memory dump 433 may include data checkpointed on demand in response to a failure occurring in the host processor (e.g., per operation 4 of FIGS. 3 and/or 5));

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to have combined AHN’s teaching of rebooting a node causing a GPU instance to be deleted, with CULLY’s teaching of causing a GPU instance to be deleted upon suspension, to realize, with a reasonable expectation of success, a system that deletes GPU instances from a node, as in CULLY, when the node is rebooted, as in AHN. A person having ordinary skill would have been motivated to make this combination to enable a system to recover from faults and failures that require operation node rebooting while maintaining GPU memory state.
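
For orientation only, the sequence of steps recited in claim 1 (save partition data to a file, reboot, retrieve the file, recreate the instances, re-associate the applications) can be sketched as below. This is a minimal illustration under stated assumptions, not the applicant's implementation and not code from CULLY or AHN: the `Node` class, the dictionary standing in for the "database", and names such as `gi-0` and `app-a` are all hypothetical.

```python
import json


class Node:
    """Hypothetical node model; not taken from the application or the cited art."""

    def __init__(self):
        self.gpu_instances = {}  # instance id -> configuration
        self.app_bindings = {}   # application name -> instance id

    def reboot(self):
        # Per the claim language, rebooting the node deletes the GPU instances.
        self.gpu_instances.clear()
        self.app_bindings.clear()


def save_partition_data(node, database, key):
    """Serialize the partition data to a 'file' and store it in the 'database'."""
    payload = {"instances": node.gpu_instances, "bindings": node.app_bindings}
    database[key] = json.dumps(payload, sort_keys=True)


def restore_partition_data(node, database, key):
    """Retrieve the 'file', recreate the instances, and re-associate applications."""
    payload = json.loads(database[key])
    node.gpu_instances = payload["instances"]
    node.app_bindings = payload["bindings"]


database = {}  # stand-in for the server-side store (assumption)
node = Node()
node.gpu_instances = {"gi-0": {"compute_slices": 3}, "gi-1": {"compute_slices": 4}}
node.app_bindings = {"app-a": "gi-0", "app-b": "gi-1"}

save_partition_data(node, database, "node-1")
node.reboot()
instances_after_reboot = dict(node.gpu_instances)
restore_partition_data(node, database, "node-1")
```

The point of the sketch is the ordering: the file must exist outside the node before the reboot, because nothing on the node survives it in this model.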

Regarding claim 3, CULLY further teaches:
dividing the GPU into one or more GPU instances, saving partition data pertaining to the one or more GPU instances to a file, and saving the file to a database are performed by an automated agent ([0044] One or more embodiments of the present invention may be implemented as one or more computer programs (i.e., “automated agent”) or as one or more computer program modules embodied in computer-readable media). 

Regarding claim 4, CULLY further teaches:
retrieving the file comprising the partition data, creating new one or more partitions according to the partition data, and associating the one or more applications with the new one or more GPU instances are performed by the automated agent ([0044] One or more embodiments of the present invention may be implemented as one or more computer programs (i.e., “automated agent”) or as one or more computer program modules embodied in computer-readable media).  

Regarding claim 6, AHN further teaches:
the node provides a handshake to the automated agent upon completing the reboot such that the automated agent receives an indication to retrieve the file from the database ([0078]  A host memory dump previously stored (e.g., checkpointed per a checkpoint interval of the host processor 510) in the storage 540 may be transmitted to the restarted host processor 510 and loaded into the memory thereof, and an accelerator memory dump may be transmitted to the accelerator 530 and loaded into the memory thereof)…

CULLY further teaches:
retrieve the file from the database and partition the GPU ([0038] The migration destination node restores the GPU state by reading the backing file from the location in storage indicated in the fetch description (i.e., destination node accesses shared storage server 170 to retrieve the GPU state for a new vGPU in the destination node))

Regarding claim 7, CULLY further teaches:
the automated agent saves the partition data to the file each time there is a change to a number of the GPU instances, a configuration of the GPU instances, or a mapping of the GPU instances to the one or more applications ([0038] When this pod VM is suspended (i.e., suspending the pod VM suspends the vGPU assigned to that pod VM, thereby lowering the “number” of active GPU instances), the entire GPU state (i.e., vGPU “partition data”) is saved in a backing file in storage (i.e., “database”)).  
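
The claim 7 limitation (saving whenever the number, configuration, or application mapping of the GPU instances changes) differs from a save that happens only on suspension. A minimal sketch of change-triggered saving follows, assuming a hypothetical `ChangeTriggeredSaver` that compares a hash of the current partition state against the last saved one; none of these names come from the application or from CULLY.

```python
import hashlib
import json


def fingerprint(state):
    """Stable hash of the partition state (count, configuration, and mapping)."""
    encoded = json.dumps(state, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class ChangeTriggeredSaver:
    """Hypothetical agent that saves only when the partition state changes."""

    def __init__(self):
        self.last_fingerprint = None
        self.saved_files = []  # stand-in for files written to a database

    def observe(self, state):
        current = fingerprint(state)
        if current == self.last_fingerprint:
            return False  # nothing changed, so no save
        self.saved_files.append(json.dumps(state, sort_keys=True))
        self.last_fingerprint = current
        return True


saver = ChangeTriggeredSaver()
state = {"instances": {"gi-0": {"slices": 3}}, "bindings": {"app-a": "gi-0"}}
first = saver.observe(state)    # initial state is saved
second = saver.observe(state)   # unchanged state is not saved again
state["bindings"]["app-b"] = "gi-0"
third = saver.observe(state)    # a mapping change triggers a new save
```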

Regarding claim 9, CULLY further teaches:
the partition data comprises one or more of a state of the GPU instances, a configuration of the GPU instances, metadata describing the GPU, or a ratio of the compute power in each GPU instance ([0038] When this pod VM is suspended, the entire GPU state (i.e., vGPU “partition data”) is saved in a backing file in storage (i.e., “database”)). 

Regarding claims 11, 13-14, and 16-17, they comprise limitations similar to those of claims 1, 3-4, 6, and 7, and are therefore rejected for at least similar rationale.

Regarding claim 20, it comprises limitations similar to those of claim 1, and is therefore rejected for similar rationale.

Claims 5 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over CULLY, in view of AHN, as applied to claims 1 and 11 above, and in further view of DULUK JR. et al. Pub. No.: US 2023/0289212 A1 (hereafter DULUK).

Regarding claim 5, while CULLY discusses GPUs partitioned into virtual GPUs, CULLY and AHN do not explicitly teach:
the GPU is a plurality of GPUs and the plurality of GPUs each support one or more GPU instances, and wherein partition data of each GPU instance of each GPU is saved to the file. 

However, in analogous art that similarly teaches physical GPU support of virtual GPUs, DULUK teaches:
the GPU is a plurality of GPUs and the plurality of GPUs each support one or more GPU instances ([0036] In a virtualized environment that's powered by NVIDIA virtual GPUs, the NVIDIA virtual GPU (vGPU) software is installed at a virtualization layer along with a hypervisor. This software creates virtual GPUs that let every virtual machine (VM) share the physical GPU installed on the server. For more demanding workflows, a single VM can harness the power of multiple physical GPUs. For example, an installation can include many nodes, where each node may include several CPUs and several GPUs (i.e., multiple nodes each contain multiple GPUs which support multiple created virtual GPU “instances”)), and wherein partition data of each GPU instance of each GPU is saved to the file ([0037] HPC installations should be able to migrate a VM from one part of the installation to another. For example, when a node is taken down for maintenance, all the VMs on that node are migrated to different nodes…At the time of migration, the programs running on migrating VMs are preempted off the CPU(s) and GPU(s), memory images and context save buffers are moved to different places in the HPC installation (i.e., context save buffers, which represent “partition data” of the multiple virtual GPU instances of the GPUs, are saved to buffer files)).

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to have combined DULUK’s teaching of a system where multiple GPUs create multiple virtual GPU instances that save context data in context save buffers, with CULLY and AHN’s teaching of saving context data of a virtual GPU instance of a physical GPU, to realize, with a reasonable expectation of success, a system that saves virtual GPU instance context data, as in CULLY and AHN, for a plurality of virtual GPU instances of a plurality of physical GPUs, as in DULUK. A person having ordinary skill would have been motivated to make this combination to enable a virtual machine to process more demanding workflows (DULUK [0036]).
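
To make the claim 5 limitation concrete, the sketch below writes the partition data of every instance of every GPU into one file and reads it back. It is an illustration under assumptions only: the JSON layout, the file name `partitions.json`, and the identifiers `gpu-0`, `gi-0`, and `slices` are hypothetical and are not drawn from the application or from DULUK.

```python
import json
import os
import tempfile

# Hypothetical layout: physical GPU id -> list of per-instance partition records.
gpus = {
    "gpu-0": [{"id": "gi-0", "slices": 3}],
    "gpu-1": [{"id": "gi-0", "slices": 2}, {"id": "gi-1", "slices": 2}],
}


def save_all(partition_data, path):
    """Write the partition data of every instance of every GPU to a single file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(partition_data, handle, indent=2, sort_keys=True)


def load_all(path):
    """Read the single file back into the same nested structure."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


path = os.path.join(tempfile.mkdtemp(), "partitions.json")
save_all(gpus, path)
restored = load_all(path)
instance_count = sum(len(instances) for instances in restored.values())
```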

Regarding claim 15, it comprises limitations similar to claim 5, and is therefore rejected for similar rationale.

Claims 8 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over CULLY, in view of AHN, as applied to claims 1 and 11 above, and in further view of LIANG et al. Pub. No.: US 2023/0074456 A1 (hereafter LIANG).

Regarding claim 8, AHN further teaches:
the partition data is periodically saved to the file according to a time period and saved to the database ([0055] Setting a checkpoint interval (checkpoint frequency) (i.e., “time period”) for periodic routine checkpointing of an accelerator (e.g., the accelerator 220) (i.e., periodic checkpointing saves accelerator data) having a relatively increased capacity, the FIT rate of only the accelerator (e.g., the accelerator 220) may be applied for determining its checkpoint interval)

While CULLY and AHN discuss collecting partition data of vGPU partitions in a periodic manner, CULLY and AHN do not explicitly teach:
the time period is specified by a user

	However, in analogous art that similarly periodically collects state data of a processor, LIANG teaches:
the time period is specified by a user ([0008] an emulation system captures DUT data by receiving a determined number of clock cycle intervals to sample internal state signals (e.g., a predetermined number as specified by a user). The emulation system may then receive the DUT data, which includes internal state signals and primary input signals. The emulation system can sample the primary input signals on each clock cycle and sample the internal state signals on every determined number of clock cycles. The emulation system may then create, on each clock cycle, a header for a current sample of the DUT data. The header may include a time stamp of the current sample, a sample count to the current sample, a last sample pointer, a last sector pointer, and a last frame pointer. The emulation system may store, with each clock cycle, the current header of the current sample of the DUT data with the time stamp. The emulation system can store the internal state signal at each interval corresponding to the determined number of clock cycle intervals (i.e., user specifies time period (in number of clock cycle intervals) for collection of internal state data)).

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to have combined LIANG’s teaching of a user setting intervals for periodically collecting internal processor state data, with the combination of CULLY and AHN’s teaching of periodically collecting internal processor state data in the form of vGPU partition data, to realize, with a reasonable expectation of success, a system that collects vGPU partition data periodically, as in CULLY and AHN, according to intervals set by a user, as in LIANG. A person having ordinary skill would have been motivated to make this combination to give users enhanced control over collection of internal state data while reducing processing and time cost (LIANG [0003]).
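
The claim 8 limitation turns on a save period that a user specifies. A minimal sketch of that behavior follows, assuming a hypothetical `PeriodicSaver` driven by an injected clock value so that it can be exercised without sleeping; the class, its parameters, and the 30-second period are illustrative assumptions, not details from the application, AHN, or LIANG.

```python
class PeriodicSaver:
    """Hypothetical saver that fires once per user-specified period."""

    def __init__(self, period_s, save_fn):
        if period_s <= 0:
            raise ValueError("period_s must be positive")
        self.period_s = period_s  # the user-specified time period
        self.save_fn = save_fn    # callback that persists the partition data
        self.last_saved = None

    def tick(self, now_s):
        """Call with the current time; saves when a full period has elapsed."""
        due = self.last_saved is None or now_s - self.last_saved >= self.period_s
        if due:
            self.save_fn(now_s)
            self.last_saved = now_s
        return due


saves = []  # records the times at which a save happened
saver = PeriodicSaver(period_s=30.0, save_fn=saves.append)
for t in range(0, 100, 10):  # simulated clock: 0, 10, ..., 90 seconds
    saver.tick(float(t))
```

With a 30-second period and a tick every 10 seconds, the sketch saves at 0, 30, 60, and 90 seconds.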

Regarding claim 18, it comprises limitations similar to those of claim 8, and is therefore rejected for similar rationale.

Claims 10 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over CULLY, in view of AHN, as applied to claims 1 and 11 above, and in further view of CHEN Pub. No.: US 2007/0157012 A1 (hereafter CHEN).

Regarding claim 10, CULLY further teaches:
accessing the server is performed by an automated agent ([0044] One or more embodiments of the present invention may be implemented as one or more computer programs (i.e., “automated agent”) or as one or more computer program modules embodied in computer-readable media), 

While CULLY and AHN teach rebooting a node, CULLY and AHN do not explicitly teach:
wherein the automated agent provides a handshake to determine when the node finishes the reboot. 

However, in analogous art that similarly teaches rebooting/booting of a node, CHEN teaches:
wherein the automated agent provides a handshake to determine when the node finishes the reboot ([0024] A handshake protocol may be established to facilitate communication between the CPUs 102, . . . , 106, the clients 108, . . . , 112, the memory 116 and the host CPU 114 during an initial boot of the multiple CPU system 100. For example, an initial boot of the multiple CPU system 100 may include the execution of one or more general processing instructions generated by the host processor 114. The general processing instructions may be executed in a determined sequence to complete the initial boot. For example, during an initial boot, the host processor 114 may generate GPIs 120 and 122 to clients 108 and 110. The GPIs 120 and 122 may comprise instructions for running one or more test patterns to the memory 116 so that the host CPU 114 may adjust an optimal clock rate prior to completion of the initial boot sequence. After the test patterns were generated and an optimal clock rate of the host CPU 114 configured, the initial boot sequence may be considered complete and normal data processing operation may be initiated within the multiple CPU system 100 (i.e., the handshake protocol enables a determination that the boot sequence is complete)). 

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to have combined CHEN’s teaching of a handshake protocol enabling a determination that a boot sequence is complete, with the combination of CULLY and AHN’s teaching of rebooting a node, to realize, with a reasonable expectation of success, a system that reboots a node, as in CULLY and AHN, and determines that the reboot is complete based on a handshake protocol, as in CHEN. A person having ordinary skill would have been motivated to make this combination to ensure that the sequence of actions in the handshake is executed, so that a node is correctly and completely booted.
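
One way to picture the claim 10 limitation, in which the automated agent provides a handshake to determine when the node finishes the reboot, is an agent that repeatedly offers a token and treats an echoed token as the completion signal. The sketch below is an assumption-laden illustration: `RebootingNode`, `wait_for_reboot`, the token value, and the polling limit are hypothetical and do not reflect CHEN's protocol or the applicant's design.

```python
class RebootingNode:
    """Hypothetical node that ignores handshakes until its boot has finished."""

    def __init__(self, polls_until_ready):
        self._remaining = polls_until_ready

    def handshake(self, token):
        if self._remaining > 0:
            self._remaining -= 1
            return None  # still booting: no reply
        return token     # boot complete: echo the agent's token


def wait_for_reboot(node, token, max_polls):
    """Agent side: return the attempt number that succeeded, or None on timeout."""
    for attempt in range(1, max_polls + 1):
        if node.handshake(token) == token:
            return attempt
    return None


attempts = wait_for_reboot(RebootingNode(polls_until_ready=2), "nonce-123", max_polls=5)
timed_out = wait_for_reboot(RebootingNode(polls_until_ready=10), "nonce-123", max_polls=3)
```

Only after `wait_for_reboot` returns an attempt number would the agent go on to retrieve the file; a `None` result models a node that never completed its reboot within the polling budget.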

Regarding claim 19, it comprises limitations similar to claim 10, and is therefore rejected for similar rationale.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
URJAN ANANDAKUMAR et al. Patent No.: US 12,066,964 B1 discloses modular controllers and hardware accelerators that are held in standby to facilitate failure recovery and improve server availability including interruption and restarting of executing applications.

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
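
The reply windows described above reduce to calendar-month arithmetic from the mailing date. The sketch below works that arithmetic under two assumptions that the action itself does not confirm: that the mailing date equals the Mar 09, 2026 final-rejection date shown in this page's timeline, and that no weekend or holiday adjustment applies. It also does not model the advisory-action scenario. The helper `add_months` is hypothetical.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Add calendar months, clamping to the last day of a shorter month."""
    index = d.month - 1 + months
    year, month = d.year + index // 12, index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


# Assumption: mailing date taken from the page timeline, not from the action text.
mailing_date = date(2026, 3, 9)

two_month_mark = add_months(mailing_date, 2)        # first-reply threshold
shortened_period_end = add_months(mailing_date, 3)  # THREE-MONTH shortened period
statutory_maximum = add_months(mailing_date, 6)     # SIX-MONTH outer limit
```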

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL W AYERS whose telephone number is (571)272-6420. The examiner can normally be reached M-F 8:30-5 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aimee Li can be reached at (571) 272-4169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/MICHAEL W AYERS/Primary Examiner, Art Unit 2195
Prosecution Timeline

Apr 26, 2023: Application Filed
Oct 09, 2025: Non-Final Rejection, §103
Jan 08, 2026: Response Filed
Mar 09, 2026: Final Rejection, §103 (current)
Precedent Cases

Applications granted by this same examiner with similar technology

17/350,448 (Patent 12547446): Computing Device Control of a Job Execution Environment Based on Performance Regret of Thread Lifecycle Policies. 2y 5m to grant; granted Feb 10, 2026.
17/755,040 (Patent 12498950): SIGNAL PROCESSING DEVICE AND DISPLAY APPARATUS FOR VEHICLE USING SHARED MEMORY TO TRANSMIT ETHERNET AND CONTROLLER AREA NETWORK DATA BETWEEN VIRTUAL MACHINES. 2y 5m to grant; granted Dec 16, 2025.
17/023,444 (Patent 12493497): DETECTION AND HANDLING OF EXCESSIVE RESOURCE USAGE IN A DISTRIBUTED COMPUTING ENVIRONMENT. 2y 5m to grant; granted Dec 09, 2025.
17/673,119 (Patent 12461768): CONFIGURING METRIC COLLECTION BASED ON APPLICATION INFORMATION. 2y 5m to grant; granted Nov 04, 2025.
18/326,870 (Patent 12423149): LOCK-FREE WORK-STEALING THREAD SCHEDULER. 2y 5m to grant; granted Sep 23, 2025.
Study what changed to get past this examiner. Based on 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 70%
With Interview (+56.2%): 99%
Median Time to Grant: 3y 4m
PTA Risk: Moderate
Based on 287 resolved cases by this examiner. Grant probability derived from career allow rate.