Prosecution Insights
Last updated: April 19, 2026
Application No. 18/388,799

FLEXIBLE GPU RESOURCE SCHEDULING METHOD IN LARGE-SCALE CONTAINER OPERATION ENVIRONMENT

Status: Non-Final OA (§103)
Filed: Nov 10, 2023
Examiner: CHEN, SHIN HON
Art Unit: 2431
Tech Center: 2400 — Computer Networks
Assignee: Korea Electronics Technology Institute
OA Round: 1 (Non-Final)
Grant Probability: 87% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 2y 10m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 87% — above average (690 granted / 797 resolved; +28.6% vs TC avg)
Interview Lift: +13.4% for resolved cases with interview (moderate)
Typical Timeline: 2y 10m average prosecution
Currently Pending: 32
Total Applications: 829 (career history, across all art units)

Statute-Specific Performance

§101: 12.4% (-27.6% vs TC avg)
§103: 43.3% (+3.3% vs TC avg)
§102: 25.2% (-14.8% vs TC avg)
§112: 3.7% (-36.3% vs TC avg)
vs TC avg = difference from the Tech Center average estimate • Based on career data from 797 resolved cases
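The "vs TC avg" deltas are simple differences from a single Tech Center baseline: subtracting each delta from its rate gives 40.0% for every statute. A quick check of the figures above (the 40.0% baseline is inferred from the numbers, not stated on the page):

```python
# The page's per-statute rates and their "vs TC avg" deltas.
rates  = {"§101": 12.4, "§103": 43.3, "§102": 25.2, "§112": 3.7}
deltas = {"§101": -27.6, "§103": 3.3, "§102": -14.8, "§112": -36.3}

# Implied Tech Center average for each statute: rate minus delta.
implied_tc_avg = {s: round(rates[s] - deltas[s], 1) for s in rates}
print(implied_tc_avg)  # every statute implies the same 40.0% baseline
```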

Office Action

§103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. Claims 1-11 have been examined.

Information Disclosure Statement

The information disclosure statement (IDS) submitted on 10/16/24 is being considered by the examiner.

Examiner's Comment

Claim 10 recites "computer-readable recording medium," which implies the non-transitory nature of the structure to store data. However, Applicant is encouraged to positively recite "non-transitory computer-readable recording medium" to affirm the 35 U.S.C. 101 requirement with respect to non-transitory media.

Claim Interpretation

The following is a quotation of 35 U.S.C. 112(f):

(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:

An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:

(A) the claim limitation uses the term "means" or "step" or a term used as a substitute for "means" that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term "means" or "step" or the generic placeholder is modified by functional language, typically, but not always, linked by the transition word "for" (e.g., "means for") or another linking word or phrase, such as "configured to" or "so that"; and
(C) the term "means" or "step" or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.

Use of the word "means" (or "step") in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. Absence of the word "means" (or "step") in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.

Claim limitations in this application that use the word "means" (or "step") are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word "means" (or "step") are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.

This application includes one or more claim limitations that do not use the word "means," but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitations are: "communication unit" and "processor" in claim 11. Support for the communication unit can be found in paragraphs [0035]-[0036] and [0050]-[0055] of the Specification. Because these claim limitations are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, they are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof. If applicant does not intend to have these limitations interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 9, 10 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Sivathanu U.S. 2022/0318052 in view of Ni U.S. 2023/0109368 (hereinafter Ni).

As per claims 1, 10 and 11, Sivathanu discloses a cloud management method/computer-readable recording medium/system comprising: a communication unit configured to collect data for allocating GPU resources in a large-scale container operating environment (Sivathanu: [0028]: monitor workloads that are currently running and hardware capacity that is currently available anywhere around the world in the cloud for scheduling GPU services); and a processor configured to: generate a multi-metric based on the collected data, to set a scheduling priority for the generated pod (Sivathanu: [0028]: scale up a job, i.e. create a new pod by monitoring GPU utilization/collected data); and to perform a scheduling operation for allocating GPU resources according to the set scheduling priority (Sivathanu: [0028]-[0029]: scheduler allocates resources based on utilization…scheduling priority; [0066]: new workload may be associated with a higher priority or tier than the current workload).

Sivathanu discloses scaling a job up or down by monitoring/tracking workloads, including pods (Sivathanu: [0028]: monitor workloads to scale up or down a job; [0058]: preparing schedules corresponding to workloads including jobs, models, and/or pods). Sivathanu does not explicitly disclose generating a new pod based on the multi-metric. However, Ni discloses autoscaling the number of pods based on GPU metrics (Ni: [0034]-[0035]: autoscale number of pods based on GPU metrics). It would have been obvious to one having ordinary skill in the art to generate new pods based on GPU metrics in the process of scaling up an AI workload because generating new pods is well known in the art for Kubernetes systems.

As per claim 2, Sivathanu as modified discloses the cloud management method of claim 1. Sivathanu further discloses wherein the step of setting the scheduling priority comprises, when a new pod is generated, setting a scheduling priority for the generated pod by reflecting a priority set by a user and a number of times of trying rescheduling (Sivathanu: [0028]: scale up a job, i.e. generate new pod; [0033]: establish priority; [0110]-[0112]: priority-based scheduling depends on pass value, i.e. how many rounds or number of times of rescheduling).

As per claim 9, Sivathanu as modified discloses the cloud management method of claim 1.
Sivathanu as modified further discloses wherein the step of collecting data comprises collecting GPU resources comprising GPU utilization, GPU memory, GPU clock, GPU architecture, GPU core, GPU power, GPU temperature, GPU process resource, GPU NVLink pair, GPU return, and GPU assignment (Sivathanu: [0028]: scheduler monitors GPU utilization to collect data; [0066]: scale up or down based on resource utilization; Ni: [0034]-[0035]).

Claims 3-5 and 8 are rejected under 35 U.S.C. 103 as being unpatentable over Sivathanu in view of Ni and further in view of Zhang et al. U.S. 2022/0291956 (hereinafter Zhang).

As per claim 3, Sivathanu as modified discloses the cloud management method of claim 1. Sivathanu does not explicitly disclose wherein the step of performing the scheduling operation comprises, when performing the scheduling operation, performing a node filtering operation, a GPU filtering operation, a node scoring operation, and a GPU scoring operation. However, Zhang discloses filtering and scoring nodes and GPUs according to requests (Zhang: [0022]-[0029]; [0048]: multiple tasks may share the GPUs based on utilization of GPU resources in the distributed container cluster). It would have been obvious to one having ordinary skill in the art to filter nodes and score GPUs to ensure the load balance of nodes in the cluster, enhance the utilization of GPU resources in the distributed container cluster, better meet the scheduling requirements, and allow containers to complete tasks faster.

As per claim 4, Sivathanu as modified discloses the cloud management method of claim 3. Sivathanu as modified further discloses wherein the step of performing the scheduling operation comprises, when performing the GPU filtering operation and the GPU scoring operation, reflecting a number of GPU requests set by a user and a requested GPU memory capacity (Zhang: [0022]-[0029]; [0048]). The same rationale applies here as above in rejecting claim 3.

As per claim 5, Sivathanu as modified discloses the cloud management method of claim 4. Sivathanu as modified further discloses wherein the step of performing the scheduling operation comprises: determining whether the number of GPU requests set by the user is physically satisfiable; when it is determined that the number of GPU requests is physically satisfiable, performing a GPU filtering operation and a GPU scoring operation with respect to an available GPU; and allocating GPU resources based on a result of the GPU filtering operation and the GPU scoring operation (Zhang: [0022]-[0029]: select optimal nodes based on GPU scores; [0048]: scheduling containers to the most adaptive node based on the metric state, free memory and allocation of graphics cards at the node). The same rationale applies here as above in rejecting claim 3.

As per claim 8, Sivathanu as modified discloses the cloud management method of claim 5. Sivathanu further discloses wherein the step of performing the scheduling operation comprises, when it is determined that the number of GPU requests is physically unsatisfiable, identifying a pre-set user policy, and, when multi-node allocation is allowed, allocating a GPU over multiple nodes to satisfy the number of GPU requests (Sivathanu: [0099]-[0101]; [0114]: GPUs are allocated across multiple servers for a large job).

Claims 6-7 are rejected under 35 U.S.C. 103 as being unpatentable over Sivathanu in view of Ni, further in view of Zhang, and further in view of Garg et al. U.S. 2021/0011773 (hereinafter Garg).

As per claim 6, Sivathanu as modified discloses the cloud management method of claim 5.
Sivathanu does not explicitly disclose wherein the step of performing the scheduling operation comprises, when it is determined that a total number of GPU requests set for a plurality of pods, respectively, is physically unsatisfiable, identifying a partitionable GPU memory, partitioning one GPU memory into a plurality of GPU memories, and allocating the plurality of partitioned GPU memories to a plurality of pods to allow the plurality of pods to share one physical GPU device. However, Garg teaches or at least suggests the limitations (Garg: [0015]: partitioning GPU memory to support vGPU profiles associated with different workloads/VMs). It would have been obvious to one having ordinary skill in the art to partition GPU memories for multiple instances to fully utilize the capacity of GPUs, as is well known in the scheduling process.

As per claim 7, Sivathanu as modified discloses the cloud management method of claim 5. Sivathanu as modified further discloses wherein the step of performing the scheduling operation comprises, when it is determined that the number of GPU requests is physically unsatisfiable, identifying a partitionable GPU memory, partitioning one GPU memory into a plurality of GPU memories, and allocating a part or all of the plurality of partitioned GPU memories to the pod (Garg: [0015]: partition GPU memory to support vGPU profiles associated with different workloads/VMs). The same rationale applies here as above in rejecting claim 6.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Zhang et al. U.S. 2023/0266999 discloses a Kubernetes scheduling system that schedules pods by filtering, sorting and scoring nodes based on optimization algorithms. Morano U.S. 2025/0267192 discloses mounting persistent data volumes in multiple bundle applications. Srikanta et al. U.S. 11,989,586 discloses scaling up computing resource allocations for execution of containerized applications. Zhang U.S. 2024/0095082 discloses a method for multiple services to share the same GPU. Duluk et al. U.S. 2023/0288471 discloses virtualizing hardware processing resources in a processor. Cho et al. U.S. 2023/0089925 discloses assigning jobs to heterogeneous GPUs. Baillargeon U.S. 2023/0072358 discloses tenant resource optimization in clouds. Hu et al. U.S. 2023/0037293 discloses a method of hybrid centralized distributive scheduling on shared physical hosts. Xu et al. U.S. 2022/0276899 discloses resource scheduling based on the GPU topology relationship of a cluster. Frey et al. U.S. 11,310,342 discloses a method for optimizing a software allocation to shared resources based on a dynamic mapping of resource relationships. Sivaraman et al. U.S. 2021/0216375 discloses workload placement for virtual GPU enabled systems. Bahramshahry et al. U.S. 2020/0026579 discloses a method for implementing a scheduler and workload manager that identifies and consumes global virtual resources. O'Neal et al. U.S. 2019/0317821 discloses demand-based utilization of cloud computing resources.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHIN HON (ERIC) CHEN, whose telephone number is (571) 272-3789. The examiner can normally be reached Monday to Thursday, 9am-7pm. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Lynn Feild, can be reached at 571-272-2092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center.
Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/SHIN-HON (ERIC) CHEN/
Primary Examiner, Art Unit 2431

Prosecution Timeline

Nov 10, 2023 — Application Filed
Feb 05, 2026 — Non-Final Rejection, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12598227
SYSTEMS AND METHODS FOR CONTROLLING SIGN-ON TO WEB APPLICATIONS
2y 5m to grant • Granted Apr 07, 2026
Patent 12592109
BUILDING EQUIPMENT ACCESS MANAGEMENT SYSTEM WITH DYNAMIC ACCESS CODE GENERATION TO UNLOCK EQUIPMENT CONTROL PANELS
2y 5m to grant • Granted Mar 31, 2026
Patent 12587528
DATA MASKING
2y 5m to grant • Granted Mar 24, 2026
Patent 12585804
APPROACHES OF ENFORCING DATA SECURITY, COMPLIANCE, AND GOVERNANCE IN SHARED INFRASTRUCTURES
2y 5m to grant • Granted Mar 24, 2026
Patent 12574382
PROVIDING SECURITY WITH DYNAMIC PRIVILEGE LEVEL ASSIGNMENT IN A HYBRID-CLOUD STACK
2y 5m to grant • Granted Mar 10, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 87%
With Interview: 99% (+13.4%)
Median Time to Grant: 2y 10m
PTA Risk: Low
Based on 797 resolved cases by this examiner. Grant probability derived from career allow rate.
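The page does not say how the "with interview" figure is computed, but the numbers are consistent with applying the +13.4% interview lift as a relative (multiplicative) boost on the career allow rate; an additive 13.4 points on 87% would exceed 100%. A sketch under that assumption (the reconstruction is mine, not documented by the page):

```python
# Assumed reconstruction of the projection arithmetic.
granted, resolved = 690, 797
base = granted / resolved        # career allow rate, displayed as 87%
lift = 0.134                     # "+13.4%" interview lift, taken as relative
with_interview = round(base, 2) * (1 + lift)
print(f"base {base:.0%}, with interview {with_interview:.0%}")  # 87%, 99%
```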
