Prosecution Insights
Last updated: April 19, 2026
Application No. 18/642,821

GPU MEMORY POOL MANAGER FOR VIRTUAL SHARED GPU MEMORY POOLING

Status: Non-Final OA (§103)
Filed: Apr 23, 2024
Examiner: LIU, ZHENGXI
Art Unit: 2611
Tech Center: 2600 — Communications
Assignee: DELL PRODUCTS, L.P.
OA Round: 1 (Non-Final)

Grant Probability: 64% (Moderate)
OA Rounds: 1-2
To Grant: 3y 4m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 64% (grants 64% of resolved cases; 225 granted / 354 resolved; +1.6% vs TC avg)
Interview Lift: strong, +40.1% for resolved cases with interview
Typical Timeline: 3y 4m avg prosecution; 31 currently pending
Career History: 385 total applications across all art units

Statute-Specific Performance

§101: 13.2% (-26.8% vs TC avg)
§103: 61.3% (+21.3% vs TC avg)
§102: 5.1% (-34.9% vs TC avg)
§112: 15.7% (-24.3% vs TC avg)
Tech Center averages are estimates • Based on career data from 354 resolved cases

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Compact Prosecution

With respect to claim interpretation, the Examiner has provided notes regarding “[BRI on the record]” throughout the Office Action, so that the record is clear about the scope of the claimed invention and about the basis for the Examiner’s analyses. A clear record of the claim interpretation could expedite examination by allowing it to focus on Applicant’s inventive concept and its comparison with related prior art. If there are disagreements, Applicant may present an alternative interpretation based on MPEP 2111. The Examiner will adopt Applicant’s interpretation on the record if Applicant’s interpretation is reasonable and/or the arguments are persuasive. Applicant may amend claims relying on the Examiner’s claim interpretation provided on the record.

Claim Objections

Claims 1 and 11 are objected to because of the following minor informalities: Claim 1 recites “the physical memory of each of the two or more GPUs,” and the antecedent basis is unclear. Claim 1 has introduced “allocating a physical memory of the GPU”; however, that introduction does not sufficiently serve as the antecedent basis for “the physical memory of each of the two or more GPUs.” The Examiner suggests Applicant consider the following amendment to address the informality: “the two or more GPUs’ physical memories.” Claim 11 recites a similar limitation with the same informality, and Claim 11 is objected to.
Claims 9 and 19 are objected to because of the following minor informalities: Claim 9 recites, “wherein the GMP manager is configured to allocate portions of the VSGMP preferentially wherein unallocated portions of the VSGMP comprising physical memory contributed by a VM are allocated to the VM before allocating any unallocated portions contributed by another VM.” Appropriate correction is needed. The Examiner suggests Applicant consider the following amendment to address the informality: “wherein the GMP manager is configured to allocate portions of the VSGMP preferentially, and wherein unallocated portions of the VSGMP comprising physical memory contributed by a VM are allocated to the VM before allocating any unallocated portions contributed by another VM.” Claim 19 recites a similar limitation with the same informality, and Claim 19 is objected to.

Claims 7-9 and 17-19 are objected to because of the following minor informalities: these claims recite “a VM” and “the VM.” However, their parent claims, Claims 1 and 11, have already introduced “a VM.” Therefore, the antecedent basis for “the VM” in these dependent claims should be clarified. Further, Applicant should clarify whether the “a VM” introduced in Claims 7-9 and 17-19 is the same as or different from the “a VM” introduced in the independent claims. Appropriate corrections are required.

Claims 5 and 15 are objected to because of the following minor informalities: these claims recite “from a VM based on.” However, their parent claims, Claims 1 and 11, have already introduced “a VM.” It is unclear whether the “VM” of Claims 5 and 15 refers to the “VM” of Claims 1 and 11 or to another VM. It is requested that Applicant put on the record whether the “VM” of Claims 5 and 15 refers to the “VM” of Claims 1 and 11. If not, the “VM” of Claims 5 and 15 could be any VM anywhere making GPU memory requests.
However, Claims 1 and 11 recite, “a GPU memory resource accessible to VMs running in the information handling system.”

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Johnson (US 20180089881 A1) in view of Koker et al. (US 20220137967 A1).

Regarding Claim 1, Johnson teaches A method for managing graphics processing units (GPUs) (“a plurality of graphics processing units (GPUs) to be shared by a plurality of virtual machines (VMs) within a virtualized execution environment; a shared memory to be shared between the plurality of VMs and GPUs executed within the virtualized graphics execution environment; . . ..” Johnson Abstract.), the method comprising: responsive to detecting a GPU assignment, comprising an assignment of a GPU, selected from a group of two or more GPUs (Johnson fig.
15 1531 1532) included in an information handling system (the overall system illustrated by Johnson fig. 15), to a virtual machine (VM) (Johnson fig. 15 1501 1502) associated with the information handling system ([media_image1.png] With respect to fig. 15, Johnson explains, “Virtual machines 1501-1502 are abstracted by the virtualization software 1510 running on a physical machine which may include a host system memory 1550, multiple CPUs (not shown) and multiple GPUs 1531-1531.” Johnson ¶ 134. “Virtual machines (VMs) running on a physical host may use one or more graphics processing units (GPUs) to perform graphics operations. Hypervisor software manages how the GPU can be used by the VMs.” Johnson ¶ 123. Johnson teaches GPU assignment to a virtual machine, stating “The GPU on the host PCI bus can be directly assigned to one VM and used only by that VM.” Johnson ¶ 124. The GPU is selected from a group of “multiple GPUs 1531-1531” (Johnson ¶ 134). Johnson teaches allocating GPU resources responsive to detecting the GPU assignment, stating “In one embodiment, the entire GPU resources for a particular GPU can be assigned to a specific VM (e.g., GPU 1531 may be fully assigned to VM 1501). There may be no fixed allocations of GPU resources because the GPU scheduler or KMD host software determines the GPU resources dynamically as needed. In one embodiment, GPU memory 1550 is allocated and managed completely by the host software and mapped into the guest environment as needed using host-based memory management software.” Johnson ¶ 157. If a GPU is not assigned to a VM, none of the resources associated with the GPU would be allocated to the assigned VM.), performing GPU allocation operations including: allocating one or more non-memory resources of the GPU exclusively to the VM; and allocating a physical memory of the GPU (Johnson fig. 15 memory 1550) to a GPU memory pool (GMP) manager (Johnson fig.
15 memory interface unit 1540) communicatively coupled to each of the two or more GPUs (Johnson fig. 15 1501 1502) (Johnson teaches allocating the assigned GPU’s resources to the VM, stating “In one embodiment, the entire GPU resources for a particular GPU can be assigned to a specific VM (e.g., GPU 1531 may be fully assigned to VM 1501). There may be no fixed allocations of GPU resources because the GPU scheduler or KMD host software determines the GPU resources dynamically as needed. In one embodiment, GPU memory 1550 is allocated and managed completely by the host software and mapped into the guest environment as needed using host-based memory management software.” Johnson ¶ 157. Johnson further states, “The GPU on the host PCI bus can be directly assigned to one VM and used only by that VM.” Johnson ¶ 124. All or some of the non-memory resources of the assigned GPU are allocated exclusively to the receiving VM when the GPU is “directly assigned to one VM and used only by that VM.” Johnson ¶ 124. These could include GPU computation resources, which are differentiated from storage resources, e.g., memory. The physical memory is mapped to “GPU memory 1550.” The GPU memory pool (GMP) manager is mapped to “A memory interface unit 1540 [that] provides the GPUs with access to the shared system memory 1550.” Johnson ¶ 136.), wherein the GMP manager is configured to: allow a GPU memory resource (Johnson fig. 15 GPU memory 1550) accessible to VMs running in the information handling system (“a shared memory to be shared between the plurality of VMs and GPUs executed within the virtualized graphics execution environment. . ..” Johnson Abstract.); and execute GPU memory transactions from any of the VMs [BRI on the record] With respect to “memory transactions,” the Examiner is reading the limitation to cover memory accessing, memory allocation, and memory configuration.
This interpretation is in light of the specification: [0034] In at least some embodiments, GMP manager 111 is configured to detect GPU memory transactions from VMs 120 and to execute, complete, or otherwise perform GPU memory transactions via VSGMP 115. GPU memory transactions may include GPU read/write transactions and GPU allocation and/or configuration transactions. Spec. ¶ 34. Johnson teaches accessing GPU memory from a VM, stating “a shared memory to be shared between the plurality of VMs and GPUs executed within the virtualized graphics execution environment. . ..” Johnson Abstract.).

Johnson does not explicitly disclose abstracting a virtualized shared GPU memory pool (VSGMP), encompassing the physical memory of each of the two or more GPUs; or executing GPU memory transactions via the VSGMP. Koker teaches abstracting a virtualized shared GPU memory pool (VSGMP), encompassing the physical memory of each of the two or more GPUs; and executing GPU memory transactions via the VSGMP (Koker teaches abstracting a virtual memory address space, encompassing the physical memory of GPUs, stating “As illustrated in FIG. 4F, in one optional implementation a unified memory addressable via a common virtual memory address space used to access the physical processor memories 401-402 and GPU memories 420-423 is employed. . . . The entire virtual/effective memory space (sometimes referred to as the effective address space) may thereby be distributed across each of the processor memories 401-402 and GPU memories 420-423, allowing any processor or GPU to access any physical memory with a virtual address mapped to that memory.” Koker ¶ 154. [media_image2.png] There are two alternative ways to combine Johnson and Koker: (a) a region of Koker’s virtualized memory address space as shown in fig. 4 could be allocated to create Johnson’s shared GPU memory fig. 15 1550, and (b) the unified memory only comprises memories of the participating GPUs.
Therefore, after the combination, we have a virtualized shared GPU memory pool as explained. Further, after the combination, the MIU (Johnson fig. 15 1540), mapped to the GPU memory pool manager, will be adapted to manage Johnson in view of Koker’s memory system. Koker also teaches memory transactions via the VSGMP, stating “A first portion of the virtual/effective address space may be allocated to the processor memory 401, a second portion to the second processor memory 402, a third portion to the GPU memory 420, and so on. The entire virtual/effective memory space (sometimes referred to as the effective address space) may thereby be distributed across each of the processor memories 401-402 and GPU memories 420-423, allowing any processor or GPU to access any physical memory with a virtual address mapped to that memory.” Koker ¶ 154.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Koker’s memory pool with Johnson. One of ordinary skill in the art would be motivated to share resources from multiple computation components and make access to the shared resources easier and consistent. “The entire virtual/effective memory space (sometimes referred to as the effective address space) may thereby be distributed across each of the processor memories 401-402 and GPU memories 420-423, allowing any processor or GPU to access any physical memory with a virtual address mapped to that memory.” Koker ¶ 154.

Claim 11 and Claim 1 are substantially similar. The rejection analysis based on Johnson and Koker for Claim 1 is also applied to Claim 11. In addition, Claim 11 recites An information handling system, comprising: a central processing unit (CPU) (Johnson fig. 1 107); two or more graphics processing units (GPUs) (Johnson fig. 1 108, 112); and system memory (Johnson fig. 1 120), accessible to the CPU (data and control flows indicated in Johnson fig.
1), and including processor-executable instructions (Johnson fig. 1 109) that, when executed by the CPU (Johnson fig. 1 107), cause the system to perform GPU management operations (“a plurality of graphics processing units (GPUs) to be shared by a plurality of virtual machines (VMs) within a virtualized execution environment; a shared memory to be shared between the plurality of VMs and GPUs executed within the virtualized graphics execution environment; . . ..” Johnson Abstract) including, . . ..

Claims 2-4 and 12-14 are rejected under 35 U.S.C. 103 as being unpatentable over Johnson in view of Koker as applied to Claims 1 and 11, in further view of Park et al. (“Ballooning Graphics Memory Space in Full GPU Virtualization Environments”) (hereinafter Park-1).

Regarding Claim 2, Johnson in view of Koker teaches The method of claim 1. Johnson in view of Koker does not explicitly disclose wherein the GMP manager is configured to: perform GPU memory allocations, allocating portions of the VSGMP to a particular VM, in response to at least some GPU memory transactions. Park-1 teaches wherein the GMP manager is configured to: perform GPU memory allocations, allocating portions of the VSGMP to a particular VM, in response to at least some GPU memory transactions (Park-1 discloses the problem associated with static allocation of GPU memory to a VM, stating “Although elasticity is one of the major benefits in this environment, the allocation of GPU memory is still static in the sense that after the GPU memory is allocated to a VM, it is not possible to change the memory size at runtime.” Park Abstract. Park-1 provides a solution of dynamic allocation of GPU memory, mapped to GPU memory allocations, stating “We implemented the gBalloon by modifying the gVirt, a full GPU virtualization solution for Intel’s integrated GPUs.
Benchmarking results show that the gBalloon dynamically adjusts the GPU memory size at runtime, which improves the performance by up to 8% against the gVirt with 384MB of high global graphics memory and 32% against the gVirt with 1024MB of high global graphics memory.” [media_image3.png] After the combination of Johnson in view of Koker and Park-1, the dynamic allocation (Park-1) is applied to Johnson in view of Koker’s VSGMP. Because “The GPU on the host PCI bus can be directly assigned to one VM and used only by that VM” (Johnson ¶ 124), the memory allocated to the GPU is allocated to a particular VM. Park-1 teaches dynamic allocation in response to tracked failed memory object transactions, mapped to GPU memory transactions, stating “To reduce this overhead, the gBalloon detects the VMs’ lack of GPU memory by tracing the number of memory object returns at runtime and reduces the ballooned area of other VMs for the required amount of memory space so that the VM with the lack of memory can use additional GPU memory.” Park 3.1 GPU Memory Expansion Strategy.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Park-1’s teachings with Johnson in view of Koker. One of ordinary skill in the art would be motivated to better utilize computing resources that include memories. Park-1 tries to solve the following problem, stating “Although elasticity is one of the major benefits in this environment, the allocation of GPU memory is still static in the sense that after the GPU memory is allocated to a VM, it is not possible to change the memory size at runtime.” Park Abstract.
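The gBalloon-style behavior Park-1 describes (detecting a VM's shortage from failed-allocation counts at runtime, then shrinking other VMs' idle ballooned regions to grow the starved VM's share) can be sketched roughly as follows. This is an illustrative model only, not code from any cited reference; all identifiers (`VMShare`, `rebalance`) and numbers are hypothetical.

```python
# Illustrative sketch of gBalloon-style dynamic GPU memory rebalancing:
# when a VM's allocation-failure count crosses a threshold, reclaim idle
# memory from other VMs' shares and grant it to the starved VM.
# All names and numbers are hypothetical.

class VMShare:
    def __init__(self, name, size_mb, used_mb):
        self.name = name
        self.size_mb = size_mb      # current share of the pooled GPU memory
        self.used_mb = used_mb      # memory actually in use
        self.alloc_failures = 0     # failed allocations observed at runtime

def rebalance(shares, starved, need_mb):
    """Shrink other VMs' idle ("ballooned") memory to cover the shortfall."""
    reclaimed = 0
    for s in shares:
        if s is starved or reclaimed >= need_mb:
            continue
        idle = s.size_mb - s.used_mb
        take = min(idle, need_mb - reclaimed)
        s.size_mb -= take
        reclaimed += take
    starved.size_mb += reclaimed
    return reclaimed

vms = [VMShare("vm1", 384, 380), VMShare("vm2", 384, 100)]
vms[0].alloc_failures = 5
if vms[0].alloc_failures > 3:   # shortage detected via runtime failure count
    rebalance(vms, vms[0], need_mb=64)
```

After the rebalance, vm1's share grows by the 64 MB taken from vm2's idle region; total pooled memory is unchanged, matching the zero-sum ballooning idea in Park-1.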
Regarding Claim 3, Johnson in view of Koker and Park-1 teaches The method of claim 2, wherein the GMP manager is configured to: maintain GMP information indicative of an amount of physical memory contributed to the VSGMP by each of the VMs (Koker teaches physical memory contributed by each GPU, stating “As illustrated in FIG. 4F, in one optional implementation a unified memory addressable via a common virtual memory address space used to access the physical processor memories 401-402 and GPU memories 420-423 is employed. . . . The entire virtual/effective memory space (sometimes referred to as the effective address space) may thereby be distributed across each of the processor memories 401-402 and GPU memories 420-423, allowing any processor or GPU to access any physical memory with a virtual address mapped to that memory.” Koker ¶ 154. [media_image2.png] Each of the GPUs could correspond to a VM, because “The GPU on the host PCI bus can be directly assigned to one VM and used only by that VM.” Johnson ¶ 124. Koker teaches mapping physical memory and virtual memory through use of the addresses of contributed memory, stating “An instruction can access any of a local, shared, or global address space by specifying an address within a unified address space. The address mapping unit 256 can be used to translate addresses in the unified address space into a distinct memory address that can be accessed by the load/store units 266.” Koker ¶ 83. The physical addresses of contributed GPU memory track the amount of physical memory.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Koker’s memory pool with Johnson. One of ordinary skill in the art would be motivated to share resources from multiple computation components and make access to the shared resources easier and consistent.
“The entire virtual/effective memory space (sometimes referred to as the effective address space) may thereby be distributed across each of the processor memories 401-402 and GPU memories 420-423, allowing any processor or GPU to access any physical memory with a virtual address mapped to that memory.” Koker ¶ 154.

Regarding Claim 4, Johnson in view of Koker and Park-1 teaches The method of claim 2, wherein the GMP manager is configured to: maintain mapping information indicative of portions of the VSGMP allocated to each of two or more VMs (Park-1 teaches maintaining mapping information, stating “Figure 1 shows the memory mapping and management structure between global graphics memory and system memory. In the gVirt (modified gVirt in which the gScale’s features are added), part of the low global graphics memory is shared by all vGPUs, and the high global graphics memory is divided into 64MB slots that can also be shared among the vGPUs. The virtual address of the global graphics memory is converted into a physical address through the physical GTT.” Park 2.1. Overview of gVirt. [media_image4.png], showing memory slots allocated to GPUs, which correspond to VMs, because “The GPU on the host PCI bus can be directly assigned to one VM and used only by that VM.” Johnson ¶ 124.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Park-1’s teachings with Johnson in view of Koker. One of ordinary skill in the art would be motivated to better utilize computing resources that include memories. Park-1 tries to solve the following problem, stating “Although elasticity is one of the major benefits in this environment, the allocation of GPU memory is still static in the sense that after the GPU memory is allocated to a VM, it is not possible to change the memory size at runtime.” Park Abstract.

Claims 12-14 and Claims 2-4 are substantially similar.
The rejection analyses based on Johnson and Koker and Park-1 for Claims 2-4 are also applied to Claims 12-14.

Claims 5-6 and 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over Johnson in view of Koker as applied to Claims 1 and 11, in further view of Wu et al. (US 20250285206 A1) and Lee et al. (US 20180246911 A1).

Regarding Claim 5, Johnson in view of Koker teaches The method of claim 1. Johnson in view of Koker does not explicitly disclose wherein the GMP manager is configured to grant or deny GPU memory requests from a VM based, at least in part, on: an amount of unallocated memory within the VSGMP; and an amount of memory indicated in the GPU memory request. Wu teaches wherein the GMP manager is configured to grant or deny GPU memory requests from a VM based, at least in part, on the recited factors (Wu teaches granting or denying GPU memory requests, stating “As compared to the art known by the inventor(s), the disclosed GPU-resource management method can determine whether to assign corresponding physical memory according to the type of the memory request, and enables adaptive adjustment according to exact attribute information of the physical memory.” Wu ¶ 40. Johnson teaches the GPU memory request is from a virtual machine (VM), stating “The GPU on the host PCI bus can be directly assigned to one VM and used only by that VM.” Johnson ¶ 124. Here, the disclosed “memory request” is a GPU memory request, because “Preferably, the API proxy manages the processes of the GPUs and allocates the resources of the GPUs through: creating a memory pool; in response to an incoming memory request, determining whether to assign a physical memory for the memory request according to type of the memory request;” Wu ¶¶ 35-37. Wu teaches making memory allocation decisions based on unallocated memory, stating “FIG. 2 is a detailed flowchart of memory allocation in a GPU-sharing method for serverless inference loads according to a preferred mode of the present disclosure.
Specifically, in the step of memory allocation, after the API proxy process starts, a memory pool is created. . . . For an accessing task, it is first to determine whether physical memory has been allocated, and if not, physical memory is provisionally applied for and the mapping table is updated. A physical address acquired from the mapping table is used. Afterward, it is further to determine whether the access count of this part of memory has reached the upper limit, and if yes, this part of memory is released and the mapping table is updated.” Wu ¶ 117. After Johnson in view of Koker is combined with Wu, the unallocated memory is within Johnson in view of Koker’s VSGMP.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Wu’s dynamic allocation of GPU memory with Johnson in view of Koker. One of ordinary skill in the art would be motivated to avoid GPU memory conflict or reduce GPU memory sharing so as to enhance memory performance.

However, Johnson in view of Koker and Wu does not explicitly disclose granting or denying memory requests based on an amount of memory indicated in a memory request and an amount of unallocated memory. Lee teaches granting or denying memory requests based on an amount of memory indicated in a memory request and an amount of unallocated memory (“For example, a policy may cause a memory allocator to evaluate how much total unallocated memory remains available, and grant or deny the requested allocation based on the priority of the requesting process. When the total available unallocated memory falls below a threshold, the memory allocator may deny a requested allocation of memory to statement execution related processes, . . ..” Lee ¶ 89.
[media_image5.png] Lee explains, “The secondary database system, at 730, generates an aggregated statement memory consumption value indicating the total amount of memory used by, or allocated to, processes handling execution of statements in the secondary database system.” Lee ¶ 101. The estimated consumption value corresponds to an amount of memory indicated in a memory request.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Lee’s teachings with Johnson in view of Koker and Wu. One of ordinary skill in the art would be motivated to prevent allocating memory that the system cannot provide, thereby reducing the possibility of system errors.

Regarding Claim 6, Johnson in view of Koker and Wu and Lee teaches The method of claim 5, wherein the GMP manager is configured to deny GPU memory requests that, if granted, would reduce the amount of unallocated memory below a threshold minimum unallocated memory (“For example, a policy may cause a memory allocator to evaluate how much total unallocated memory remains available, and grant or deny the requested allocation based on the priority of the requesting process. When the total available unallocated memory falls below a threshold, the memory allocator may deny a requested allocation of memory to statement execution related processes, . . ..” Lee ¶ 89. The disclosed “threshold” is mapped to the threshold minimum unallocated memory.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Lee’s teachings with Johnson in view of Koker and Wu. One of ordinary skill in the art would be motivated to prevent allocating memory that the system cannot provide, thereby reducing the possibility of system errors, and/or to reserve memory space for higher priority activities.
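The threshold policy mapped from Lee in Claims 5-6 (deny any request that, if granted, would push unallocated pool memory below a reserved floor) reduces to a single comparison. A minimal sketch, with hypothetical names and numbers:

```python
# Minimal sketch of a threshold-based grant/deny policy for a pooled memory
# manager: grant only if the pool would keep at least a minimum amount of
# unallocated memory after the grant. Names and values are hypothetical.

def grant_request(unallocated_mb, requested_mb, min_unallocated_mb):
    """Return True to grant, False to deny the request."""
    return unallocated_mb - requested_mb >= min_unallocated_mb

# Pool has 1024 MB unallocated and must keep a 256 MB floor in reserve:
print(grant_request(1024, 512, 256))   # 512 MB would remain -> grant (True)
print(grant_request(1024, 800, 256))   # 224 MB would remain -> deny (False)
```

The decision considers only the amount requested and the amount unallocated, mirroring the two factors recited in Claim 5; the floor plays the role of Lee's "threshold."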
“For example, a policy may cause a memory allocator to evaluate how much total unallocated memory remains available, and grant or deny the requested allocation based on the priority of the requesting process. When the total available unallocated memory falls below a threshold, the memory allocator may deny a requested allocation of memory to statement execution related processes, . . ..” Lee ¶ 89.

Claims 15-16 and Claims 5-6 are substantially similar. The rejection analyses based on Johnson and Koker, Wu, and Lee for Claims 5-6 are also applied to Claims 15-16.

Claims 7-9 and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Johnson in view of Koker as applied to Claim 1, in further view of Wu et al. (US 20250285206 A1).

Regarding Claim 7, Johnson in view of Koker teaches The method of claim 1. Johnson in view of Koker does not explicitly disclose wherein the GMP manager is configured to grant or deny GPU memory requests from a VM without regard to an identity of the VM. Wu teaches wherein the GMP manager is configured to grant or deny GPU memory requests from a VM without regard to an identity of the VM (Wu teaches granting or denying GPU memory requests based on the type of the memory request, stating “As compared to the art known by the inventor(s), the disclosed GPU-resource management method can determine whether to assign corresponding physical memory according to the type of the memory request, and enables adaptive adjustment according to exact attribute information of the physical memory.” Wu ¶ 40. Johnson teaches the GPU memory request is from a virtual machine (VM), stating “The GPU on the host PCI bus can be directly assigned to one VM and used only by that VM.” Johnson ¶ 124.
Here, the disclosed “memory request” is a GPU memory request, because “Preferably, the API proxy manages the processes of the GPUs and allocates the resources of the GPUs through: creating a memory pool; in response to an incoming memory request, determining whether to assign a physical memory for the memory request according to type of the memory request;” Wu ¶¶ 35-37. Wu teaches that decisions based on the memory request type are made without regard to an identity of the memory requester, stating “Preferably, the step of ‘determining whether to assign a physical memory for the memory request according to type of the memory request’ includes: where the memory request is of the type of memory accessing, accessing a mapping to acquire the physical memory; and otherwise assigning a virtual address for the memory request, and inserting the assigned virtual address into the mapping table.” Wu ¶¶ 45-47. Further, Koker also teaches a virtual address system without regard to an identity of the memory requester, stating “As illustrated in FIG. 4F, in one optional implementation a unified memory addressable via a common virtual memory address space used to access the physical processor memories 401-402 and GPU memories 420-423 is employed. . . . The entire virtual/effective memory space (sometimes referred to as the effective address space) may thereby be distributed across each of the processor memories 401-402 and GPU memories 420-423, allowing any processor or GPU to access any physical memory with a virtual address mapped to that memory.” Koker ¶ 154.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Koker’s memory pool with Johnson. One of ordinary skill in the art would be motivated to share resources from multiple computation components and make access to the shared resources easier and consistent.
“The entire virtual/effective memory space (sometimes referred to as the effective address space) may thereby be distributed across each of the processor memories 401-402 and GPU memories 420-423, allowing any processor or GPU to access any physical memory with a virtual address mapped to that memory.” Koker ¶ 154. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Wu’s dynamic allocation of GPU memory with Johnson in view of Koker. One of ordinary skill in the art would be motivated to avoid GPU memory conflict or reduce GPU memory sharing so as to enhance memory performance.

Regarding Claim 8, Johnson in view of Koker and Wu teaches The method of claim 1, wherein the GMP manager is configured to grant or deny GPU memory requests from a VM without regard to the amount of physical memory contributed to the VSGMP by the VM (Wu teaches granting or denying GPU memory requests based on the type of the memory request, stating “As compared to the art known by the inventor(s), the disclosed GPU-resource management method can determine whether to assign corresponding physical memory according to the type of the memory request, and enables adaptive adjustment according to exact attribute information of the physical memory.” Wu ¶ 40. Johnson teaches the GPU memory request is from a virtual machine (VM), stating “The GPU on the host PCI bus can be directly assigned to one VM and used only by that VM.” Johnson ¶ 124. Here, the disclosed “memory request” is a GPU memory request, because “Preferably, the API proxy manages the processes of the GPUs and allocates the resources of the GPUs through: creating a memory pool; in response to an incoming memory request, determining whether to assign a physical memory for the memory request according to type of the memory request;” Wu ¶¶ 35-37.
Wu teaches that the memory request type determination could be made without regard to the amount of physical memory contributed by the memory requester, stating “Preferably, the step of ‘determining whether to assign a physical memory for the memory request according to type of the memory request’ includes: where the memory request is of the type of memory accessing, accessing a mapping to acquire the physical memory; and otherwise assigning a virtual address for the memory request, and inserting the assigned virtual address into the mapping table.” Wu ¶¶ 45-47.

Further, Koker also teaches a virtual address system without regard to the amount of physical memory contributed by the memory requester, stating “As illustrated in FIG. 4F, in one optional implementation a unified memory addressable via a common virtual memory address space used to access the physical processor memories 401-402 and GPU memories 420-423 is employed. . . . The entire virtual/effective memory space (sometimes referred to as the effective address space) may thereby be distributed across each of the processor memories 401-402 and GPU memories 420-423, allowing any processor or GPU to access any physical memory with a virtual address mapped to that memory.” Koker ¶ 154.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Koker’s memory pool with Johnson. One of ordinary skill in the art would be motivated to share resources from multiple computation components and make access to the shared resources easier and more consistent. “The entire virtual/effective memory space (sometimes referred to as the effective address space) may thereby be distributed across each of the processor memories 401-402 and GPU memories 420-423, allowing any processor or GPU to access any physical memory with a virtual address mapped to that memory.” Koker ¶ 154.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Wu’s dynamic allocation of GPU memory with Johnson in view of Koker. One of ordinary skill in the art would be motivated to avoid GPU memory conflict or reduce GPU memory sharing so as to enhance memory performance.

Regarding Claim 9, Johnson in view of Koker teaches The method of claim 1. Johnson in view of Koker does not explicitly disclose wherein the GMP manager is configured to allocate portions of the VSGMP preferentially wherein unallocated portions of the VSGMP comprising physical memory contributed by a VM are allocated to the VM before allocating any unallocated portions contributed by another VM.

Wu teaches wherein the GMP manager is configured to allocate portions of the VSGMP preferentially wherein unallocated portions of the VSGMP comprising physical memory contributed by a VM are allocated to the VM before allocating any unallocated portions contributed by another VM (Wu teaches allocating portions of the shared memory, stating “As compared to the art known by the inventor(s), the disclosed GPU-resource management method can determine whether to assign corresponding physical memory according to the type of the memory request, and enables adaptive adjustment according to exact attribute information of the physical memory.” Wu ¶ 40. Wu teaches such allocation based on unallocated portions of the shared memory, stating “FIG. 2 is a detailed flowchart of memory allocation in a GPU-sharing method for serverless inference loads according to a preferred mode of the present disclosure. Specifically, in the step of memory allocation, after the API proxy process starts, a memory pool is created. . . . For an accessing task, it is first to determine whether physical memory has been allocated, and if not, physical memory is provisionally applied for and the mapping table is updated.” Wu ¶ 117.
After Johnson in view of Koker is combined with Wu, such allocation is done preferentially. “The memory page is then retrieved according to the data access structure implemented by the masking structure 2825. In a particular implementation, the data access structure directs the memory access from a closest memory to a GPU to improve memory access efficiency.” Koker ¶ 379. “In particular, the masking structure may cause pages to be retrieved from a nearest GPU memory element, rather than from other elements of the shared memory.” Koker ¶ 381. The preferential allocation is based on the location of the unallocated portions of the shared memory. A GPU’s memory is closest to the GPU, and therefore, unallocated portions of the shared memory contributed by a VM are allocated to the VM. “The GPU on the host PCI bus can be directly assigned to one VM and used only by that VM.” Johnson ¶ 124.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Koker’s preferential treatment of memory based on closeness. One of ordinary skill in the art would be motivated to share resources from multiple computation components and make access to the shared resources easier and more consistent. The preferential treatment would enhance the speed of GPU memory access. “The memory page is then retrieved according to the data access structure implemented by the masking structure 2825. In a particular implementation, the data access structure directs the memory access from a closest memory to a GPU to improve memory access efficiency.” Koker ¶ 379.

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Wu’s dynamic allocation of GPU memory with Johnson in view of Koker. One of ordinary skill in the art would be motivated to avoid GPU memory conflict or reduce GPU memory sharing so as to enhance memory performance.
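The preferential-allocation reading above can be illustrated with a small sketch in which the VSGMP is modeled as a list of per-VM contributed segments. This is a hedged illustration under that assumption, not code from the application or the cited references; the function and field names are hypothetical:

```python
# Illustrative sketch (hypothetical names) of allocating from the virtual
# shared GPU memory pool (VSGMP) preferentially: unallocated portions
# contributed by the requesting VM are tried before portions contributed
# by other VMs, mirroring the closest-memory-first idea read onto Koker.

def allocate_from_vsgmp(segments, requester_vm, size):
    """segments: list of {"owner": vm_name, "free": bytes_free}.
    Returns the owner of the segment the allocation was satisfied from,
    or None if the request is denied."""
    own = [s for s in segments if s["owner"] == requester_vm]
    other = [s for s in segments if s["owner"] != requester_vm]
    for segment in own + other:          # requester-contributed first
        if segment["free"] >= size:
            segment["free"] -= size
            return segment["owner"]      # whose contribution was used
    return None                          # deny: pool cannot satisfy it
```

Under this model, a request falls back to another VM's contribution only after the requester's own contributed portions are exhausted, which is the ordering the claim 9 limitation recites.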
Claims 17-19 and Claims 7-9 are substantially similar. The rejection analyses based on Johnson, Koker, and Wu for Claims 7-9 are also applied to Claims 17-19.

Claims 10 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Johnson in view of Koker as applied to Claims 1 and 11, in further view of PARK (US 20180198618 A1) (hereinafter Park-2).

Regarding Claim 10, Johnson in view of Koker teaches The method of claim 1. Johnson in view of Koker does not explicitly disclose wherein the GMP manager runs within a hypervisor enabled by a lightweight secure operating system (LSOS).

Park-2 teaches wherein the GMP manager runs within a hypervisor enabled by a lightweight secure operating system (LSOS) (“The secure execution environment provision apparatus 200 for a mobile cloud may be composed of a general execution unit, by which a general mobile operating system is running based on the hypervisor, and a secure execution unit, by which a lightweight embedded operating system is running in consideration of the characteristics of apparatuses.” Park-2 ¶ 56. [Park-2 figure omitted: a “secure execution unit” (820) running on a hypervisor (830).] The “lightweight embedded operating system” running within the “secure execution unit” (820) is mapped to the lightweight secure operating system. The “secure execution unit” (820) runs on a hypervisor (830). After Johnson in view of Koker is combined with Park-2, Johnson in view of Koker’s GMP manager runs within Park-2’s hypervisor.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Park-2’s secure execution unit with Johnson in view of Koker. One of ordinary skill in the art would be motivated to enhance the security and/or efficiency of the system.
“The secure execution environment provision apparatus 200 for a mobile cloud may be composed of a general execution unit, by which a general mobile operating system is running based on the hypervisor, and a secure execution unit, by which a lightweight embedded operating system is running in consideration of the characteristics of apparatuses.” Park-2 ¶ 56.

Claim 20 and Claim 10 are substantially similar. The rejection analyses based on Johnson, Koker, and Park-2 for Claim 10 are also applied to Claim 20.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: Yeh et al. (“KubeShare: A Framework to Manage GPUs as First-Class and Shared Resources in Container Cloud”).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ZHENGXI LIU, whose telephone number is (571) 270-7509. The examiner can normally be reached M-F, 9 AM - 5 PM.

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kee Tung, can be reached at (571) 272-7794. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format.
For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ZHENGXI LIU/
Primary Examiner

Prosecution Timeline

Apr 23, 2024: Application Filed
Nov 28, 2025: Non-Final Rejection (§103)
Feb 17, 2026: Interview Requested
Mar 18, 2026: Examiner Interview Summary
Mar 18, 2026: Applicant Interview (Telephonic)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602865: METHODS FOR DEPTH CONFLICT MITIGATION IN A THREE-DIMENSIONAL ENVIRONMENT (granted Apr 14, 2026; 2y 5m to grant)
Patent 12599463: COLOR MANAGEMENT PROCESS FOR CUSTOMIZED DENTAL RESTORATIONS (granted Apr 14, 2026; 2y 5m to grant)
Patent 12597402: INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND PROGRAM FOR APPLICATION WINDOW HAVING FIRST DISPLAY MODE AND SECOND DISPLAY MODE (granted Apr 07, 2026; 2y 5m to grant)
Patent 12567193: PARTICLE RENDERING METHOD AND APPARATUS (granted Mar 03, 2026; 2y 5m to grant)
Patent 12561929: METHOD AND ELECTRONIC DEVICE FOR PROVIDING INFORMATION RELATED TO PLACING OBJECT IN SPACE (granted Feb 24, 2026; 2y 5m to grant)
Based on this examiner's 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 64%
With Interview: 99% (+40.1%)
Median Time to Grant: 3y 4m
PTA Risk: Low
Based on 354 resolved cases by this examiner. Grant probability derived from career allow rate.
