Last updated: May 29, 2026
Application No. 18/152,528
TECHNIQUES FOR BALANCING DYNAMIC INFERENCING BY MACHINE LEARNING MODELS

Non-Final OA §103
Filed: Jan 10, 2023
Examiner: NGUYEN, AN-AN NGOC
Art Unit: 2195
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Nvidia Corporation
OA Round: 3 (Non-Final)
Interview Optional

An interview has already been conducted in this application's prosecution history. This examiner has a 71% grant rate with a +66.7% interview lift. Since an interview has already been tried, a written response with narrowed claims, based on precedent claim evolution patterns, is recommended.
Based on 7 resolved cases, 2023–2026
Examiner Intelligence

NGUYEN, AN-AN NGOC
Career allowance rate: 71% (5 granted / 7 resolved), above average, +16.4% vs TC avg
Interview lift: +66.7% (strong)
Average prosecution: 3y 4m
Currently pending: 16
Statute-Specific Performance

§101: 2.3% (-37.7% vs TC avg)
§103: 95.4% (+55.4% vs TC avg)
§102: 2.3% (-37.7% vs TC avg)
Based on career data from 7 resolved cases
Office Action

DETAILED ACTION
1.	Claims 1, 11, and 20 are currently amended.
2.	Claims 1-20 are pending.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
3.	A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on February 17, 2026 has been entered.

Response to Arguments
4.	Regarding 35 U.S.C. 101 Rejections:
	Applicant’s amendments and arguments with respect to the 35 U.S.C. 101 rejections of the invention have been fully considered and are persuasive. The 35 U.S.C. 101 rejections have been withdrawn.

5.	Regarding Prior Art Rejections:
Applicant’s amendments and arguments regarding claims 1, 11, and 20 have been considered and are not persuasive. The rejections under 35 U.S.C. 103 are maintained. Additionally, the claims are rejected under a new ground of rejection necessitated by the amendment.

6.	Applicant argues in remarks:
	Amended claim 1 recites the limitations of allocating one or more computational resources to the plurality of tasks based on the one or more available computational resources and one or more performance requirements associated with the plurality of tasks. Amended claim 1 further recites that the one or more performance requirements provide that a first task is subject to an average performance requirement and that the one or more performance requirements allow performance of the first task to fall below a minimum performance requirement when insufficient computational resources are available to perform the plurality of tasks. Lastly, amended claim 1 recites that allocating the one or more computational resources comprises decreasing an allocation of computational resources to a first task to provide performance below the minimum performance level during a first time interval and increasing the allocation of computational resources to the first task to provide performance above the average performance requirement during a second time interval. None of the cited references disclose these limitations. Therefore, no combination of the cited references can teach each and every limitation of amended claim 1. 
In the rejections, the Examiner acknowledges that the Ross reference fails to disclose the prior version of the above claim limitations, and, instead, relies on the Sivathanu reference in the rejections. See Final Office Action at pp. 4, 18-19. Applicant respectfully traverses with respect to the amended claim language. 
Sivathanu discloses the general idea of executing tasks according to different preemption priority tiers, where tasks having the highest preemption priority tier are the least likely to be preempted, and tasks having the lowest preemption priority tier are the most likely to be preempted. See Sivathanu at [0029], [0030]. 
In the rejections, the Examiner maps the first task, recited in prior claim 1, to a task having a lowest preemption priority tier, disclosed in Sivathanu; and the allocation of one or more computational resources, recited in prior claim 1, to the preemption of a task, disclosed in Sivathanu. See Final Office Action at pp. 4, 18-19. Based on these claim mappings, to teach or suggest the above limitations of amended claim 1, Sivathanu would have to disclose that a set of performance requirements provide that the task having a lowest preemption priority tier is subject to an average performance requirement, that the set of performance requirements allows the task to be preempted during a first time interval, and that increasing the allocation to the task to provide performance above the average performance requirement during a second time interval. Importantly, Sivathanu contains no such teachings. Rather, as discussed above, Sivathanu contains only general teachings about the preemption of different tasks during execution. Notably, Sivathanu contains no specific teachings and does not suggest anything about tasks being subject to an average performance requirement or about preempting a task during a first time interval and allocating more computational resources to provide performance above the average performance requirement during a second time interval, as the amended claim language now expressly recites. Sivathanu is silent in this regard. In view of at least these distinctions, Applicant submits that Sivathanu cannot be properly interpreted as teaching the above limitations of amended claim 1.

7.	The amendments change the overall scope of the claims. Therefore, new art, and a new combination thereof, has been introduced to address the amended scope.
8.	Additionally, Examiner respectfully disagrees with Applicant that Sivathanu does not teach “tasks being subject to an average performance requirement or about preempting a task during a first time interval and allocating more computational resources to provide performance above the average performance requirement during a second time interval.” Sivathanu teaches:
 [0039] Depending on load, a job may get more than the minimum resources required by tier standards for some job hours. For example, it may get N GPUs for a given job hour (instead of the minimum resources of N*f). In such cases, the job may accumulate debt, which can be redeemed by the scheduler in subsequent job hours. For example, if the job got N GPU hours in the first job hour (instead of N*f), during the second job hour, the job may have a tier standard resource requirement of (N*f−slack), where slack is N*(1−f) (i.e., the cumulative excess capacity it got so far). In the 80% example, the job only needs N*0.6 GPU hours to meet tier standards. The dynamic priority within a job hour is thus computed based on the slack available for the job to meet its tier standard requirements.
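The slack arithmetic in the quoted paragraph [0039] can be checked with a short worked example. The sketch below is illustrative only: the function name `next_hour_requirement` and its parameters are hypothetical and do not come from Sivathanu. It assumes slack is the cumulative difference between GPU hours received and the tier standard N*f, and that the requirement is floored at zero; the quoted text does not address either point.

```python
def next_hour_requirement(n_gpus: float, f: float, received: list[float]) -> float:
    """GPU hours needed in the next job hour to stay on the tier standard.

    n_gpus   -- N, the number of GPUs the job requests
    f        -- tier fraction (0.8 in the quoted "80% example")
    received -- GPU hours actually granted in each earlier job hour
    """
    standard = n_gpus * f  # tier standard per job hour (N*f)
    # Assumption: slack is the running sum of (received - standard).
    slack = sum(hours - standard for hours in received)
    # Assumption: the requirement never goes negative.
    return max(0.0, standard - slack)


# Quoted example: the job got N GPU hours in the first job hour with f = 0.8,
# so slack = N*(1 - f) and the second-hour requirement is N*f - slack = N*0.6.
print(next_hour_requirement(10, 0.8, [10]))  # approximately 6.0 for N = 10
```

With no prior job hours the function returns the plain tier standard N*f, which matches the "minimum resources required by tier standards" in the quoted text.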
9.	Here, the average performance requirement and the first and second time intervals are shown. Each tier has a standard, which is analogous to an average performance requirement. Moreover, resources change from one time period to the next in order to meet tier standards. For example, if a job gets extra resources during one hour, it will get fewer resources the next hour while still meeting the tier standard, which corresponds to the average performance requirement. Therefore, Sivathanu remains relied upon for portions of the rejection, and Leach has been added to address the newly added scope of the amended claims. Leach teaches:
[0068] In accordance with another example, average QoS (QoS_mean) can be tracked during runtime. If the average QoS is less that the paid-for QoS (QoS_paid) for a particular job, the actual QoS for the job can be increased until the next “calculation cycle” during which a new/adjusted EQ rating is determined. If the average QoS exceeds the paid-for QoS for a particular job during runtime, QoS for that job is decreased, again until the next calculation cycle. In this way, the average QoS remains in-line with the paid-for QoS by the time the application/process is done executing, thereby enabling the paid-for QoS to be guaranteed. It should be understood that IaaS resource manager 112 (described above), may control reconfiguration of system resources, and can act based on specified policies provided by a system administrator, such as paid-for-QoS. As also described above, IaaS resource manager 112 may measure CPU, memory, storage, and network usage and traffic data. IaaS resource manager 112 may decide when to switch resource configurations (e.g., memory, processor, etc.) for particular software applications (e.g., to improve image processing, to improve user experience, etc.). By virtue of reconfiguring system resources, desired QoS can be achieved, or can be accounted for (in the event payments/credits are to be made).
10.	This also encompasses the idea of throttling resources by increasing them during one time period and decreasing them in another time period in order to meet an average performance requirement. The QoS level is analogous with the average performance requirement. 
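The feedback behavior attributed to Leach [0068] can be sketched as a per-cycle adjustment rule. This is a minimal illustration under stated assumptions, not Leach's implementation: the function `adjust_qos`, the fixed `step` size, and the use of a simple arithmetic mean over observed samples are all hypothetical choices made for the example.

```python
def adjust_qos(samples: list[float], qos_paid: float,
               current: float, step: float) -> float:
    """Return the QoS level to apply until the next calculation cycle.

    samples  -- QoS values observed so far during runtime
    qos_paid -- the paid-for QoS the running average should match
    current  -- the QoS level currently being delivered
    step     -- hypothetical fixed adjustment per calculation cycle
    """
    if not samples:
        return current  # nothing observed yet, so leave the level unchanged
    qos_mean = sum(samples) / len(samples)
    if qos_mean < qos_paid:
        return current + step  # running average is low: raise QoS
    if qos_mean > qos_paid:
        return current - step  # running average is high: lower QoS
    return current


# Average of 75 is below the paid-for 90, so the level is raised from 80 to 85.
print(adjust_qos([70, 80], qos_paid=90, current=80, step=5))
```

Applied once per calculation cycle, a rule of this shape raises QoS in periods that follow a shortfall and lowers it in periods that follow an excess, which is the increase/decrease pattern the paragraph above maps onto the claimed first and second time intervals.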
11.	Additionally, claims 2-10 and 12-19 depend from and further limit amended claims 1, 11, and 20 and are therefore also rejected under 35 U.S.C. 103. The full rejection can be found in the 35 U.S.C. 103 rejection section below.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

12.	Claims 1-2, 6-7, 10-12, and 15-20 are rejected under 35 U.S.C. 103 as being unpatentable over Ross et al. US 11138522 B1 in view of Leach et al. US 20240045726 A1.

13.	With regard to claim 1, Ross teaches:

	A computer-implemented method for allocating computational resources when executing trained machine learning models (Col. 1, lines 17-18, This specification relates to allocating resources for machine learning model tasks; Col. 3, lines 55-59, The processing system (100) is an example of a system implemented as computer programs on one or more computers in one or more locations, in which the systems, components, and techniques described below are implemented; Examiner’s Note: The present invention is a way to allocate resources for machine learning models. The process is done by a system implemented as computer programs. Together, this shows that this is the same as a computer implemented method for allocating computational resources when executing trained machine learning models.), the method comprising:

determining one or more available computational resources that are usable by one or more trained machine learning models to perform a plurality of tasks (Col. 1, lines 30-40, This specification describes technologies for allocating resources for machine learning model tasks. These technologies generally involve determining, at compile-time of the machine learning model executable binaries, the amount of resources that a machine learning model will use during execution on a special purpose machine learning model processor. By knowing the amount of resources a machine learning model will use, a processing system can efficiently schedule machine learning model executable binaries to execute on special purpose machine learning model processors; Col. 5, lines 33-36, An allocation engine (109) identifies the special purpose machine learning model processors (117) and available resources in a datacenter that the processing system (100) can allocate to the machine learning model; Examiner’s Note: The allocation engine determines available resources that can be allocated to the machine learning model. This is analogous with determining one or more available computational resources that are usable by one or more trained machine learning models to perform tasks.);

allocating one or more computational resources to the plurality of tasks based on the one or more available computational resources and one or more performance requirements associated with the plurality of tasks, the one or more performance requirements provide that a first task is subject to an average performance requirement, and the one or more performance requirements allow performance of the first task to fall below a minimum performance requirement when insufficient computational resources are available to perform the plurality of tasks, and allocating the one or more computational resources comprises decreasing an allocation of computational resources to the first task to provide performance below the minimum performance level during a first time interval and increasing the allocation of computational resources to the first task to provide performance above the average performance requirement during a second time interval (Col. 5, lines 37-40, The processing system (100) then allocates the special purpose processors (117) and other resources to the computational graph representation of the machine learning model for execution; Examiner’s Note: The processing system allocates resources to the machine learning model for execution. This is analogous with allocating computational resources based on the one or more available computational resources.); and

causing execution of one or more processors to execute the one or more trained machine learning models to perform the first task according to the average performance requirement using at least a portion of the one or more computational resources allocated to the plurality of tasks (Col. 5, lines 33-44, An allocation engine (109) identifies the special purpose machine learning model processors (117) and available resources in a datacenter that the processing system (100) can allocate to the machine learning model. The processing system (100) then allocates the special purpose processors (117) and other resources to the computational graph representation of the machine learning model for execution. The allocated special purpose machine learning model processors (117) then execute the operations of the computational dataflow graph representation to complete the machine learning task of the machine learning model; Examiner’s Note: The machine learning model uses the allocated resources to complete the machine learning task.).

Although Ross teaches the allocation of one or more resources to a plurality of tasks based on resource availability and performance requirements, Ross fails to explicitly teach the one or more performance requirements provide that a first task is subject to an average performance requirement, and the one or more performance requirements allow performance of the first task to fall below a minimum performance requirement when insufficient computational resources are available to perform the plurality of tasks, and allocating the one or more computational resources comprises decreasing an allocation of computational resources to the first task to provide performance below the minimum performance level during a first time interval and increasing the allocation of computational resources to the first task to provide performance above the average performance requirement during a second time interval.

However, in analogous art, Leach teaches:

the one or more performance requirements provide that a first task is subject to an average performance requirement, and the one or more performance requirements allow performance of the first task to fall below a minimum performance requirement when insufficient computational resources are available to perform the plurality of tasks, and allocating the one or more computational resources comprises decreasing an allocation of computational resources to the first task to provide performance below the minimum performance level during a first time interval and increasing the allocation of computational resources to the first task to provide performance above the average performance requirement during a second time interval ([0068] In accordance with another example, average QoS (QoS_mean) can be tracked during runtime. If the average QoS is less that the paid-for QoS (QoS_paid) for a particular job, the actual QoS for the job can be increased until the next “calculation cycle” during which a new/adjusted EQ rating is determined. If the average QoS exceeds the paid-for QoS for a particular job during runtime, QoS for that job is decreased, again until the next calculation cycle. In this way, the average QoS remains in-line with the paid-for QoS by the time the application/process is done executing, thereby enabling the paid-for QoS to be guaranteed. It should be understood that IaaS resource manager 112 (described above), may control reconfiguration of system resources, and can act based on specified policies provided by a system administrator, such as paid-for-QoS. As also described above, IaaS resource manager 112 may measure CPU, memory, storage, and network usage and traffic data. IaaS resource manager 112 may decide when to switch resource configurations (e.g., memory, processor, etc.) for particular software applications (e.g., to improve image processing, to improve user experience, etc.). By virtue of reconfiguring system resources, desired QoS can be achieved, or can be accounted for (in the event payments/credits are to be made).).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Ross with the teachings of Leach where the one or more performance requirements provide that a first task is subject to an average performance requirement, and the one or more performance requirements allow performance of the first task to fall below a minimum performance requirement when insufficient computational resources are available to perform the plurality of tasks, and allocating the one or more computational resources comprises decreasing an allocation of computational resources to the first task to provide performance below the minimum performance level during a first time interval and increasing the allocation of computational resources to the first task to provide performance above the average performance requirement during a second time interval. Ross teaches provisioning resources to machine learning model tasks based on resource availability and requirements. Similarly, Leach teaches provisioning resources based on a desired QoS (quality of service) associated with performance of an application executed by the resources (Abstract). Leach specifically teaches an expected QoS based on a customer’s pricing level. This means that the customer’s system has to reflect the QoS that was paid for. This QoS acts as an average performance requirement because the performance of an application has to meet the QoS that a customer has paid for. In order to balance out resources and ensure the QoS is met, average QoS (QoS_mean) can be tracked during runtime. If the average QoS is less than the paid-for QoS (QoS_paid) for a particular job, the actual QoS for the job can be increased until the next “calculation cycle” during which a new/adjusted EQ rating is determined. If the average QoS exceeds the paid-for QoS for a particular job during runtime, QoS for that job is decreased, again until the next calculation cycle. In this way, the average QoS remains in-line with the paid-for QoS by the time the application/process is done executing, thereby enabling the paid-for QoS to be guaranteed. It should be understood that IaaS resource manager 112 (described above), may control reconfiguration of system resources, and can act based on specified policies provided by a system administrator, such as paid-for-QoS ([0068]). By virtue of reconfiguring system resources, desired QoS can be achieved, or can be accounted for (in the event payments/credits are to be made), as discussed in Leach ([0068]).

14.	With regard to claim 2, Ross further teaches:

wherein the one or more computational resources are allocated to the plurality of tasks based on one or more target performance requirements associated with the plurality of tasks (Col. 1, lines 30-40, This specification describes technologies for allocating resources for machine learning model tasks. These technologies generally involve determining, at compile-time of the machine learning model executable binaries, the amount of resources that a machine learning model will use during execution on a special purpose machine learning model processor. By knowing the amount of resources a machine learning model will use, a processing system can efficiently schedule machine learning model executable binaries to execute on special purpose machine learning model processors; Col. 5, lines 13-25, Each computational graph representation of a machine learning model executing on the special purpose machine learning model processor has certain resource requirements that may be accounted for. Examples of such resource requirements include a number of operations to be performed, the amount of storage the machine learning model requires to execute, and the amount of input/output (IO) the model requires to communicate information the model will use during execution. The processing system (100) allocates resources of the special purpose machine learning model processor based on the determined amount of resources required by the executable binary (212); Examiner’s Note: Each ML model has certain resource requirements needed that relate to the number of operations needed to be performed (target performance requirements). The processing system allocates resources based on the number of resources needed by the ML model to execute its tasks. This is analogous with allocating resources based on one or more target performance requirements associated with the plurality of tasks.).

15.	With regard to claim 6, Leach further teaches:

wherein allocating the one or more computational resources to the plurality of tasks comprises:
computing one or more performance averages associated with the plurality of tasks ([0019] A QoS rating can be periodically recalculated during runtime of an application/service to initially deploy workloads, and add/remove resources as needed to guarantee paid-for QoS while maximizing resource efficiency.; [0068] In accordance with another example, average QoS (QoS_mean) can be tracked during runtime.); and

allocating the one or more computational resources to the plurality of tasks based on the one or more performance averages and one or more minimum performance requirements associated with the plurality of tasks ([0068] In accordance with another example, average QoS (QoS_mean) can be tracked during runtime. If the average QoS is less that the paid-for QoS (QoS_paid) for a particular job, the actual QoS for the job can be increased until the next “calculation cycle” during which a new/adjusted EQ rating is determined. If the average QoS exceeds the paid-for QoS for a particular job during runtime, QoS for that job is decreased, again until the next calculation cycle. In this way, the average QoS remains in-line with the paid-for QoS by the time the application/process is done executing, thereby enabling the paid-for QoS to be guaranteed. It should be understood that IaaS resource manager 112 (described above), may control reconfiguration of system resources, and can act based on specified policies provided by a system administrator, such as paid-for-QoS. As also described above, IaaS resource manager 112 may measure CPU, memory, storage, and network usage and traffic data. IaaS resource manager 112 may decide when to switch resource configurations (e.g., memory, processor, etc.) for particular software applications (e.g., to improve image processing, to improve user experience, etc.). By virtue of reconfiguring system resources, desired QoS can be achieved, or can be accounted for (in the event payments/credits are to be made).).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Ross with the teachings of Leach wherein allocating the one or more computational resources to the plurality of tasks comprises: computing one or more performance averages associated with the plurality of tasks; and allocating the one or more computational resources to the plurality of tasks based on the one or more performance averages and one or more minimum performance requirements associated with the plurality of tasks. Ross teaches provisioning resources to machine learning model tasks based on resource availability and requirements. Similarly, Leach teaches provisioning resources based on a desired QoS (quality of service) associated with performance of an application executed by the resources (Abstract). Leach specifically teaches an expected QoS based on a customer’s pricing level. This means that the customer’s system has to reflect the QoS that was paid for. This QoS acts as an average performance requirement because the performance of an application has to meet the QoS that a customer has paid for. In order to balance out resources and ensure the QoS is met, average QoS (QoS_mean) can be tracked during runtime. If the average QoS is less than the paid-for QoS (QoS_paid) for a particular job, the actual QoS for the job can be increased until the next “calculation cycle” during which a new/adjusted EQ rating is determined. If the average QoS exceeds the paid-for QoS for a particular job during runtime, QoS for that job is decreased, again until the next calculation cycle. In this way, the average QoS remains in-line with the paid-for QoS by the time the application/process is done executing, thereby enabling the paid-for QoS to be guaranteed. It should be understood that IaaS resource manager 112 (described above), may control reconfiguration of system resources, and can act based on specified policies provided by a system administrator, such as paid-for-QoS ([0068]). By virtue of reconfiguring system resources, desired QoS can be achieved, or can be accounted for (in the event payments/credits are to be made), as discussed in Leach ([0068]).

16.	With regard to claim 7, Leach further teaches:

wherein allocating the one or more computational resources to the plurality of tasks further comprises decreasing one or more computational resources allocated to at least one task ([0068] If the average QoS exceeds the paid-for QoS for a particular job during runtime, QoS for that job is decreased, again until the next calculation cycle. In this way, the average QoS remains in-line with the paid-for QoS by the time the application/process is done executing, thereby enabling the paid-for QoS to be guaranteed.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Ross with the teachings of Leach wherein allocating the one or more computational resources to the plurality of tasks further comprises decreasing one or more computational resources allocated to at least one task. Ross teaches provisioning resources to machine learning model tasks based on resource availability and requirements. Similarly, Leach teaches provisioning resources based on a desired QoS (quality of service) associated with performance of an application executed by the resources (Abstract). Leach specifically teaches an expected QoS based on a customer’s pricing level. This means that the customer’s system has to reflect the QoS that was paid for. This QoS acts as an average performance requirement because the performance of an application has to meet the QoS that a customer has paid for. In order to balance out resources and ensure the QoS is met, average QoS (QoS_mean) can be tracked during runtime. [...] If the average QoS exceeds the paid-for QoS for a particular job during runtime, QoS for that job is decreased, again until the next calculation cycle. In this way, the average QoS remains in-line with the paid-for QoS by the time the application/process is done executing, thereby enabling the paid-for QoS to be guaranteed. It should be understood that IaaS resource manager 112 (described above), may control reconfiguration of system resources, and can act based on specified policies provided by a system administrator, such as paid-for-QoS ([0068]). By virtue of reconfiguring system resources, desired QoS can be achieved, or can be accounted for (in the event payments/credits are to be made), as discussed in Leach ([0068]).

17.	With regard to claim 10, Ross further teaches:

wherein causing the one or more trained machine learning models to perform the plurality of tasks using the one or more computational resources comprises either transmitting an indication of the one or more computational resources to the one or more trained machine learning models or configuring the one or more trained machine learning models based on the one or more computational resources (Col. 3, lines 37-53, An example processing system executes machine learning models on special purpose machine learning model processors. Each special purpose machine learning model processor is a custom programmable artificial intelligence (AI) accelerator built for machine learning applications. The processor has a deterministic instruction set architecture (ISA) and can be tailored for multi-dimensional array flow operations. In some implementations, the multi-dimensional array flow operations can be implemented using a software library for numerical computation using data flow graph. As a result of compiling with a deterministic ISA, the processing system determines, at compile-time, the amount of resources that a machine learning model will use during execution on special purpose machine learning model processors. Using this knowledge, the processing system can allocate resources to a particular machine learning model for execution; Examiner’s Note: ML models are compiled based on the amount of resources needed.).

18.	Claim 11 is rejected under the same rationale as claim 1 above.

19.	Claim 12 is rejected under the same rationale as claim 2 above.

20.	Claim 15 is rejected under the same rationale as claim 6 above.

21.	Claim 16 is rejected under the same rationale as claim 7 above.

22.	With regard to claim 17, Ross teaches the one or more non-transitory computer-readable media of claim 11 and Ross further teaches:

wherein the one or more computational resources includes at least one of an execution time, a system memory, or an energy (Col. 3, lines 29-32, Machine learning tasks are computationally intensive and usually require numerous resources for execution. Such resources include input/output (IO), memory, and operations; Examiner’s Note: The resources include memory.).

23.	With regard to claim 18, Ross teaches the one or more non-transitory computer-readable media of claim 11 and Ross further teaches:

wherein the one or more performance requirements include one or more accuracy requirements associated with the one or more tasks (Col. 7 lines 61-67 – Col. 8, lines 1-2; Examiner’s Note: The customer can specify the level of precision for a calculation resulting from a machine learned model operation.).

24.	With regard to claim 19, Ross teaches the one or more non-transitory computer-readable media of claim 11 and Ross further teaches:

wherein the one or more trained machine learning models include one or more trained dynamic deep neural networks (Col. 4, lines 39-41, The operations represented in the computational graph are neural network operations or operations for a different kind of machine learning model; Examiner’s Note: The operations are neural network operations of a machine learning model.).

25.	Regarding claim 20, it is rejected under the same rationale as claim 1 above.

26.	Claims 3-5 and 13-14 are rejected under 35 U.S.C. 103 as being unpatentable over Ross et al. US 11138522 B1 and Leach et al. US 20240045726 A1, as applied in claim 1, in further view of Sivathanu et al. US 20220318052 A1. 

27.	With regard to claim 3, Ross and Leach teach the computer-implemented method of claim 1 but fail to explicitly teach wherein allocating the one or more computational resources to the plurality of tasks comprises, if one or more additional computational resources are available after allocating the one or more computational resources based on one or more target performance requirements, allocating the one or more additional computational resources to the plurality of tasks based on one or more priorities associated with the plurality of tasks.

However, in analogous art, Sivathanu teaches:

wherein allocating the one or more computational resources to the plurality of tasks comprises, if one or more additional computational resources are available after allocating the one or more computational resources based on one or more target performance requirements, allocating the one or more additional computational resources to the plurality of tasks based on one or more priorities associated with the plurality of tasks ([0174]; [0175]; [0176]; Examiner’s Note: The processor first identifies a subset of workloads that have the highest priority and allocates resources to them. After that, the processor identifies a second subset of workloads with a second priority and allocates any spare resources to that second subset. Then it does the same for a third subset. The extra resources are thus given out based on priority. The multiple subsets of workloads indicate a plurality of tasks.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Ross and Leach with the teachings of Sivathanu wherein allocating the one or more computational resources to the plurality of tasks comprises, if one or more additional computational resources are available after allocating the one or more computational resources based on one or more target performance requirements, allocating the one or more additional computational resources to the plurality of tasks based on one or more priorities associated with the plurality of tasks. This allows resources to be allocated based on a scale-up priority requirement of the high priority tier, as discussed in Sivathanu ([0174]).
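
For illustration only, the limitation discussed for claim 3 (allocate to target performance requirements first, then distribute any remaining resources by priority) can be sketched as below. This is a minimal sketch and is not taken from Ross, Leach, Sivathanu, or the application. The `Task` fields, the function name `allocate_with_spare`, the abstract resource "units", and the convention that a lower number means a higher priority are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int  # assumption: lower number means higher priority
    target: int    # units needed to meet the target performance requirement
    ceiling: int   # most units the task can usefully consume (hypothetical)


def allocate_with_spare(tasks, capacity):
    """Grant every task its target, then hand out spare units by priority."""
    allocation = {t.name: t.target for t in tasks}
    spare = capacity - sum(allocation.values())
    if spare < 0:
        raise ValueError("capacity cannot cover the target allocations")
    # Spare units go to the highest-priority task first, then flow downward.
    for task in sorted(tasks, key=lambda t: t.priority):
        extra = min(task.ceiling - task.target, spare)
        allocation[task.name] += extra
        spare -= extra
    return allocation, spare


tasks = [
    Task("track", priority=2, target=3, ceiling=5),
    Task("detect", priority=1, target=4, ceiling=8),
    Task("log", priority=3, target=1, ceiling=4),
]
allocation, spare = allocate_with_spare(tasks, capacity=14)
print(allocation, spare)
```

In this hypothetical run the targets consume 8 of 14 units, and the 6 spare units go to `detect` and then `track` before anything reaches `log`.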

28.	With regard to claim 4, Sivathanu further teaches:

wherein allocating the one or more computational resources to the plurality of tasks comprises, if insufficient computational resources are available to allocate the one or more computational resources based on one or more target performance requirements, decreasing the one or more computational resources allocated to the plurality of tasks based on one or more priorities associated with the plurality of tasks ([0029] For example, if a job is submitted with the highest tier level, indicating a best-capacity tier, the job is run with the least preemption, the equivalent of running on dedicated cloud resources. If a job is submitted at a middle tier, there is some preemption or migration experienced that may “slow” the job somewhat but drive efficiencies and improving the overall utilization of the fixed pool of resources; Examiner’s Note: Middle-tier tasks are allocated fewer resources than highest-tier tasks.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Ross and Leach with the teachings of Sivathanu wherein allocating the one or more computational resources to the plurality of tasks comprises, if insufficient computational resources are available to allocate the one or more computational resources based on one or more target performance requirements, decreasing the one or more computational resources allocated to the plurality of tasks based on one or more priorities associated with the plurality of tasks. Decreasing the resources allocated to tasks based on their priorities drives efficiencies and improves the overall utilization of the fixed pool of resources, as discussed in Sivathanu ([0029]).

29.	With regard to claim 5, Sivathanu further teaches:

wherein allocating the one or more computational resources to the plurality of tasks further comprises, if insufficient computational resources are available after decreasing the one or more computational resources allocated to the plurality of tasks, further decreasing the one or more computational resources allocated to at least one task for which averaged performance over a plurality of time periods is permitted ([0029] If the job is submitted at the lowest tier, the job is preempted frequently, providing the experience similar to spot virtual machines (VMs), but with the guarantee that the job will be completed, albeit not necessarily at the fastest pace; Examiner’s Note: Lowest-tier tasks are allocated even fewer resources than middle-tier tasks.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Ross and Leach with the teachings of Sivathanu wherein allocating the one or more computational resources to the plurality of tasks further comprises, if insufficient computational resources are available after decreasing the one or more computational resources allocated to the plurality of tasks, further decreasing the one or more computational resources allocated to at least one task for which averaged performance over a plurality of time periods is permitted. Further decreasing the resources allocated to tasks based on their priorities drives efficiencies and improves the overall utilization of the fixed pool of resources while still ensuring that tasks are guaranteed to complete, even if at a slower pace, as discussed in Sivathanu ([0029]).
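
For illustration only, one possible reading of the two-stage reduction recited in claims 4 and 5 is sketched below: resources are first reduced by priority, and any remaining shortfall is then taken from tasks that tolerate averaged performance. This is a sketch under stated assumptions, not the method of Sivathanu or of the application. The `floor` and `averaging_ok` fields, the function name `shrink_to_fit`, and the rule that the lowest-priority task is cut first are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int       # assumption: lower number means higher priority
    target: int         # units needed for the target performance requirement
    floor: int          # hypothetical minimum for acceptable performance
    averaging_ok: bool  # True if performance may be averaged over time periods


def shrink_to_fit(tasks, capacity):
    """Reduce allocations until they fit in capacity; return (allocation, unmet)."""
    allocation = {t.name: t.target for t in tasks}
    deficit = sum(allocation.values()) - capacity
    lowest_first = sorted(tasks, key=lambda t: t.priority, reverse=True)
    # Stage 1: cut by priority, lowest priority first, but not below the floor.
    for task in lowest_first:
        if deficit <= 0:
            break
        cut = min(allocation[task.name] - task.floor, deficit)
        allocation[task.name] -= cut
        deficit -= cut
    # Stage 2: cut further only for tasks that tolerate averaged performance.
    for task in lowest_first:
        if deficit <= 0:
            break
        if task.averaging_ok:
            cut = min(allocation[task.name], deficit)
            allocation[task.name] -= cut
            deficit -= cut
    return allocation, max(deficit, 0)


tasks = [
    Task("detect", priority=1, target=6, floor=5, averaging_ok=False),
    Task("track", priority=2, target=4, floor=3, averaging_ok=True),
    Task("log", priority=3, target=3, floor=2, averaging_ok=True),
]
allocation, unmet = shrink_to_fit(tasks, capacity=8)
print(allocation, unmet)
```

In this hypothetical run the first stage trims each task to its floor and still leaves a shortfall of 2 units, which the second stage takes entirely from `log`, the lowest-priority task that permits averaging.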

30.	Regarding claim 13, it is rejected under the same rationale as claim 3 above.

31.	Regarding claim 14, it is rejected under the same rationale as claim 4 above.

32.	Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Ross et al. US 11138522 B1 and Leach et al. US 20240045726 A1, as applied in claim 1, in further view of Allen et al. US 11134013 B1.

33.	With regard to claim 8, Ross and Leach teach the computer-implemented method of claim 1 but fail to explicitly teach wherein allocating the one or more computational resources to the plurality of tasks comprises querying a look-up table that associates the one or more performance requirements with amounts of the one or more computational resources required by the one or more trained machine learning models to achieve the one or more performance requirements.

However, in analogous art, Allen teaches:

wherein allocating the one or more computational resources to the plurality of tasks comprises querying a look-up table that associates the one or more performance requirements with amounts of the one or more computational resources required by the one or more trained machine learning models to achieve the one or more performance requirements (Col. 24, lines 1-8; Fig. 8A; Fig. 8B; Col. 26, lines 12-35; Examiner’s Note: The tables list jobs together with the number of cores and nodes associated with them, their queue status, and their job ID, which shows their performance requirements. The resources are allocated when a policy violation (performance requirement) is detected, or when there is a maximum average performance degradation (performance averages).).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Ross and Leach with the teachings of Allen wherein allocating the one or more computational resources to the plurality of tasks comprises querying a look-up table that associates the one or more performance requirements with amounts of the one or more computational resources required by the one or more trained machine learning models to achieve the one or more performance requirements. This allows access to a job’s performance requirements and other relevant information, as discussed in Allen (Col. 24, lines 1-8; Fig. 8A; Fig. 8B).
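
For illustration only, a look-up table that associates a performance requirement with the amount of a resource a model needs to meet it might be queried as sketched below. The table contents, the model name `detector`, the use of accuracy as the requirement and memory as the resource, and the function name `required_memory_mb` are all hypothetical. None of this is drawn from Allen's tables or from the application.

```python
import bisect

# Hypothetical table: per model, rows of (achievable accuracy, memory in MB),
# sorted by accuracy in ascending order.
LOOKUP_TABLE = {
    "detector": [(0.80, 256), (0.90, 512), (0.95, 1024)],
}


def required_memory_mb(model, min_accuracy):
    """Return the smallest tabulated memory amount meeting min_accuracy."""
    rows = LOOKUP_TABLE[model]
    accuracies = [accuracy for accuracy, _ in rows]
    # Find the first row whose accuracy is at least the requirement.
    index = bisect.bisect_left(accuracies, min_accuracy)
    if index == len(rows):
        raise LookupError("no tabulated configuration meets the requirement")
    return rows[index][1]


print(required_memory_mb("detector", 0.85))
print(required_memory_mb("detector", 0.90))
```

A requirement that falls between two rows resolves to the next row up, so the returned amount is never less than what the table says is needed.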

34.	Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Ross et al. US 11138522 B1, Leach et al. US 20240045726 A1, and Allen et al. US 11134013 B1, as applied in claim 8, in further view of Cadambi et al. US 20120124591 A1.

35.	With regard to claim 9, Ross, Leach, and Allen teach the computer-implemented method of claim 8 but fail to explicitly teach further comprising updating the look-up table based on amounts of the one or more computational resources used by the one or more trained machine learning models to perform the plurality of tasks. 

However, in analogous art, Cadambi teaches:

further comprising updating the look-up table based on amounts of the one or more computational resources used by the one or more trained machine learning models to perform the plurality of tasks ([0030] History table 215 stores the details of recently completed tasks of each application 110. Each entry of history table 215 may include executed user requests, resources allocated, and the actual time taken by the allocated resources to execute the user requests. History table 215 is updated each time an application task is completed; [0031]; Examiner’s Note: After the resources are allocated, the history table (look-up table) is updated. The information in the history table includes resource allocation request sizes.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Ross, Leach, and Allen with the teachings of Cadambi to update the look-up table based on amounts of the one or more computational resources used by the one or more trained machine learning models to perform the plurality of tasks. This allows for the most up-to-date reflection of available resources, the analysis of resource requests, and the ability to make estimations based on request sizes, as discussed in Cadambi ([0031]).
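
For illustration only, updating a look-up table from observed resource usage each time a task completes can be sketched as below. The exponential moving average is a technique chosen for this example and is not described in the cited passages, which refer only to a history table that is updated when an application task completes. The key layout, the smoothing factor `alpha`, and the function name `record_usage` are assumptions.

```python
def record_usage(table, key, observed_mb, alpha=0.5):
    """Blend a newly observed usage figure into the table entry for key."""
    previous = table.get(key)
    if previous is None:
        table[key] = float(observed_mb)  # first observation seeds the entry
    else:
        # Exponential moving average: newer observations weigh more.
        table[key] = (1 - alpha) * previous + alpha * observed_mb
    return table[key]


usage_table = {}
key = ("detector", 0.90)  # hypothetical (model, accuracy requirement) key
record_usage(usage_table, key, 500)
record_usage(usage_table, key, 600)
print(usage_table[key])
```

With `alpha=0.5` the entry after the two observations is the midpoint of 500 and 600; a smaller `alpha` would make the table slower to react to a single unusual run.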

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AN-AN N NGUYEN whose telephone number is (571)272-6147. The examiner can normally be reached Monday-Friday 8:00-5:00 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, AIMEE LI, can be reached at (571) 272-4169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/AN-AN NGOC NGUYEN/Examiner, Art Unit 2195
/Aimee Li/Supervisory Patent Examiner, Art Unit 2195
Prosecution Timeline

Jul 02, 2025: Non-Final Rejection mailed (§103)
Sep 22, 2025: Response Filed
Dec 03, 2025: Final Rejection mailed (§103)
Jan 26, 2026: Response after Non-Final Action
Feb 09, 2026: Applicant Interview (Telephonic)
Feb 17, 2026: Request for Continued Examination
Feb 25, 2026: Response after Non-Final Action
Apr 03, 2026: Non-Final Rejection mailed (§103, current)
Precedent Cases

Applications granted by this same examiner with similar technology

17/971,391 (Patent 12561130): MAINTENANCE MODE IN HCI ENVIRONMENT. 3y 4m to grant; granted Feb 24, 2026.
17/839,943 (Patent 12511156): CREDIT-BASED SCHEDULING USING LOAD PREDICTION. 3y 6m to grant; granted Dec 30, 2025.
Based on the 2 most recent grants.
Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 71%
Grant Probability With Interview (+66.7%): 99%
Median Time to Grant: 3y 4m (~0m remaining)
PTA Risk: High
Based on 7 resolved cases by this examiner. Grant probability derived from career allowance rate.