Prosecution Insights
Last updated: April 19, 2026
Application No. 17/811,895

RESOURCE AND WORKLOAD SCHEDULING

Final Rejection (§103, §112)
Filed: Jul 12, 2022
Examiner: HU, SELINA ELISA
Art Unit: 2193
Tech Center: 2100 — Computer Architecture & Software
Assignee: International Business Machines Corporation
OA Round: 4 (Final)
Grant Probability: 67% (Favorable)
Predicted OA Rounds: 5-6
Predicted Time to Grant: 3y 3m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 67%, above average (2 granted / 3 resolved; +11.7% vs TC avg)
Interview Lift: +100.0% across resolved cases with interview
Avg Prosecution: 3y 3m (32 currently pending)
Total Applications: 35 across all art units

Statute-Specific Performance

§101: 24.4% (-15.6% vs TC avg)
§103: 53.5% (+13.5% vs TC avg)
§102: 12.0% (-28.0% vs TC avg)
§112: 10.1% (-29.9% vs TC avg)
Tech Center averages are estimates • Based on career data from 3 resolved cases

Office Action

Rejections under §103 and §112
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This Office action is in response to applicant’s amendment filed on 01/22/2026. Claims 1, 5-9, 13-17, 19-22, and 24-26 are pending and examined. Claims 2-4, 10-12, 18, and 23 are cancelled.

Response to Arguments

Applicant’s arguments filed 01/22/2026 with respect to 35 U.S.C. 103 have been fully considered, but they are not persuasive. Applicant argued that the cited prior art does not teach the amended claim limitations of “sending, by the respective one of the plurality of hosts, a message to a workload scheduler, wherein the message includes a request for additional tasks based on an effectiveness of the boost action” and “sending, by the respective one of the plurality of hosts, a message to the workload scheduler, wherein the message includes a request to stop sending tasks to the respective one of the plurality of hosts based on an ineffectiveness of the next boost action.” Additionally, applicant argues that the prior art references of record, Mitra and Miao, would not have been obvious to combine before the effective filing date of the claimed invention by one of ordinary skill in the art, and that “the ‘resource analyzer’ described in Saillet is distinguishable from the daemon claimed by Applicant.” The examiner respectfully disagrees; see the § 103 rejections below for a detailed analysis pertaining to the amended claims. As explained previously, Saillet alone does not disclose the daemon claimed by the applicant, and therefore Saillet’s “resource analyzer” alone cannot be equated to the claimed daemon.
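The two limitations in dispute describe a simple host-to-scheduler feedback message: ask for more tasks when the boost action was effective, ask the scheduler to stop sending tasks when the next boost action is ineffective. As a rough illustration only (the message names, `HostMessage` type, and `host_feedback` helper are hypothetical, not taken from the claims or the record), that behavior could be sketched as:

```python
from dataclasses import dataclass

# Hypothetical message kinds -- illustrative names, not from the claims.
MORE_TASKS = "request_additional_tasks"
STOP_TASKS = "request_stop_tasks"

@dataclass
class HostMessage:
    host_id: str
    kind: str  # MORE_TASKS or STOP_TASKS

def host_feedback(host_id: str, boost_effective: bool) -> HostMessage:
    """Build the message a host would send to the workload scheduler:
    request additional tasks when the boost action proved effective,
    otherwise request that the scheduler stop dispatching tasks here."""
    kind = MORE_TASKS if boost_effective else STOP_TASKS
    return HostMessage(host_id=host_id, kind=kind)
```

The point of contention is thus whether the prior art shows the *host* originating such a message, not merely a scheduler observing host state.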
The previously cited prior art of Moussaoui is interpreted to disclose implementing a daemon, wherein the daemon iteratively determines and stores a spare resource and a current resource consumption of each of a plurality of hosts for a plurality of workloads, and wherein the daemon determines the spare resource and the current resource consumption for each of the plurality of hosts based on metadata collected for each host, wherein the metadata includes the current resource consumption by each task of the plurality of tasks or each application for each of the plurality of hosts. The combination of Saillet in view of Moussaoui is interpreted to disclose the claimed daemon. Therefore, it would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which said subject matter pertains to combine Saillet with Moussaoui because the daemon can provide current information on whether or not a request for resource allocation can be granted based on comparing the current resource status to the resource allocation rules. This allows an application running in real time, which requires low latency and high throughput, to be advantageously scheduled. Additionally, with regard to the combination of Saillet with Mitra and Miao, as previously stated, it would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which said subject matter pertains to combine Saillet with Mitra and Miao because the iterative reinforcement learning process used for learning an optimum scheduling policy improves scheduling distribution of resource requests initiated by applications on a shared compute infrastructure; furthermore, because reinforcement learning can be computationally expensive to train, these generative and discriminative models allow a lower overhead via imitation learning.
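The daemon behavior attributed to Moussaoui (iteratively determining and storing the spare resource and current consumption of each host from per-task metadata) can be sketched as a sampling loop. This is a minimal illustration under assumed names (`ResourceDaemon`, `sample_hosts`) and a single scalar resource, not Moussaoui's actual implementation:

```python
from typing import Dict, List

def sample_hosts(host_metadata: Dict[str, List[float]],
                 capacity: Dict[str, float]) -> Dict[str, dict]:
    """One sampling interval: sum per-task consumption metadata for each
    host, derive spare = capacity - current, and return the snapshot."""
    snapshot = {}
    for host, task_usage in host_metadata.items():
        current = sum(task_usage)              # current resource consumption
        snapshot[host] = {
            "current": current,
            "spare": capacity[host] - current,  # spare resource on this host
        }
    return snapshot

class ResourceDaemon:
    """Illustrative daemon: each tick() is one sampling interval whose
    snapshot is stored, building the historical consumption record."""
    def __init__(self, capacity: Dict[str, float]):
        self.capacity = capacity
        self.history: List[dict] = []          # one stored sample per interval

    def tick(self, host_metadata: Dict[str, List[float]]) -> None:
        self.history.append(sample_hosts(host_metadata, self.capacity))
```

The stored `history` corresponds to the claimed "historical resource consumption" from which averages are later computed.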
Lastly, with regard to the newly amended limitations, Saillet in view of Mitra, Miao, Moussaoui, and Kochunni is interpreted to disclose the limitations. For example, Saillet’s resources still being available for a certain period of time after executing a task in a certain resource correlates to a boost action being effective. The smaller tasks at the end of the queue filling available time in the available resources, which can be triggered through sending a control signal, correlates to sending a request for additional tasks based on an effectiveness of the boost action. While Saillet does not explicitly teach that the respective one of the plurality of hosts [sends] a message to the workload scheduler, hosts [sending] a message to a workload scheduler is a popular method of workload and resource scheduling, as evidenced by Mitra. Mitra’s self-learning application scheduler receiving incoming resource requests initiated by applications on a shared compute infrastructure correlates to hosts sending a message to a workload scheduler. Therefore, it would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which said subject matter pertains to combine Saillet with Mitra because the iterative reinforcement learning process used for learning an optimum scheduling policy improves scheduling distribution of resource requests initiated by applications on a shared compute infrastructure. Additionally, a customizable reinforced learning-based reward or penalty helps to teach a reinforced learning agent desirable properties of the system and avoid resource interference, thereby improving the learning process. Additionally, the prior art of Kochunni is interpreted to further disclose the newly amended limitations.
In Kochunni, a requested resource that is not available in response to the API call from the job manager sends an indication that it is not available, which is interpreted as sending, by the respective one of the plurality of hosts, a message to the workload scheduler. The API call being a blocking call and the requested resource being unavailable correlate to an ineffectiveness of the next boost action. The job manager waiting until the requested resource is available again before requesting the resource to process the next job correlates to the message including a request to stop sending tasks to the respective one of the plurality of hosts based on an ineffectiveness of the next boost action. Therefore, it would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which said subject matter pertains to combine Saillet with Kochunni because sequential approaches for resource allocation can be implemented using blocking calls that allow a requested resource to indicate it cannot be immediately provided and therefore cause the job manager to wait until it is available. Job managers can then determine what kinds of resources are available in the system, determine what kinds of resources are requested, and allocate a portion of the available resources based on these determinations.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b): (b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention. The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph: The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
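Kochunni's blocking-call pattern, as characterized above, amounts to a resource request that waits until the resource is released, with a timeout (or immediate failure) serving as the "unavailable" indication that makes the caller hold off. A minimal sketch, with all names hypothetical and using a plain event as the resource's availability flag:

```python
import threading

class Resource:
    """Illustrative host-side resource supporting a blocking acquire.
    When unavailable within the timeout, the call returns False -- the
    'indication that it is not available' described for Kochunni."""
    def __init__(self):
        self._free = threading.Event()
        self._free.set()                     # starts out available

    def acquire_blocking(self, timeout=None) -> bool:
        # Blocking call: waits until the resource is available, then
        # claims it; on timeout, returns the unavailability indication.
        if not self._free.wait(timeout):
            return False
        self._free.clear()
        return True

    def release(self) -> None:
        self._free.set()                     # job manager may proceed
```

A job manager using `acquire_blocking` naturally stops issuing new work to that resource until `release` runs, which is the waiting behavior the rejection maps onto the claimed "request to stop sending tasks."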
Claim 26 recites the limitation "the next boost" in “wherein a boost indicator for the one or more new hosts indicates that the next boost is valid.” There is insufficient antecedent basis for this limitation in the claim, as the term “next boost” is not referenced earlier in the claim or in claim 1.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. Claims 1, 5-9, 13-17, 19-22, and 24-26 are rejected under 35 U.S.C. 103 as being unpatentable over Saillet et al. (U.S. Patent Application Publication No. US 20200379803 A1), hereinafter “Saillet,” and further in view of Mitra et al. (U.S. Patent Application Publication No. US 20200257968 A1), hereinafter “Mitra,” Miao et al. (Multi-Agent Reinforcement Learning for Edge Resource Management with Reconstructed Environment), hereinafter “Miao,” Moussaoui et al. (U.S. Patent Application Publication No. US 20220357995 A1), hereinafter “Moussaoui,” and Kochunni et al. (U.S. Patent Application Publication No. US 20160283274 A1), hereinafter “Kochunni.” With regard to claim 9, Saillet teaches: A computer system for workload scheduling (Fig.
8 and paragraphs 122-123, system 816, “a system (816) for determining workflow execution timing based on resource availability, according to another example of principles described herein.” The system for determining workflow execution timing based on resource availability corresponds to a computer system for workload scheduling), comprising: one or more processors (Fig. 8 and paragraphs 122-123, processing unit 818, “For example, the computing system (816) may be a desktop computer, a laptop computer, a server, or any other such device that includes processors and hardware components.” The computing system including processors correlates to one or more processors), one or more computer-readable memories (Fig. 2 and paragraph 72, “To achieve its desired functionality, the system (202) includes various components. Each component may include a combination of hardware and program instructions to perform a designated function... For example, each of the components may include a processor and memory.” The system including various components such as hardware and program instructions which may include a processor and memory correlates to one or more computer-readable memories), one or more computer-readable tangible storage medium (Fig. 8, paragraph 123, computer-readable storage medium 820 and paragraph 28-29, “The system (816) includes a processing unit (818) and a computer-readable storage medium (820). 
The computer-readable storage medium (820) may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.” The system including a computer-readable storage medium corresponds to one or more computer-readable tangible storage mediums), and program instructions stored on at least one of the one or more tangible storage medium for execution by at least one of the one or more processors via at least one of the one or more memories (Fig. 8, paragraph 123, “The system (816) includes a processing unit (818) and a computer-readable storage medium (820). The computer-readable storage medium (820) may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.” The system including a computer-readable storage medium with a processing unit corresponds to program instructions stored on at least one tangible storage medium for execution by one or more processors), wherein the computer system is capable of performing a method comprising: wherein each of the plurality of workloads is comprised of a plurality of tasks (Paragraph 42, “the resource usage of each workflow is computed by doing a static analysis of the stages involved in the workflow. That is, each workflow is made up of a sequence of stages, e.g., a filter, a sorting stage, a transform operator. 
Each of these stages has a known behavior regarding its resource usage, depending on the operations it implements, the input throughput, the throughput at which it can output its data to the next stage, etc.” The workflow being made up of a sequence of stages correlates to a plurality of workloads comprising a plurality of tasks); determining an average resource consumption of the plurality of hosts based on a historical resource consumption associated with the plurality of workloads (Fig. 1, paragraph 54, 66, block 102, “For the previous workflows, the workflow analyzer may have stored an amount of resources used over time for those workflows. Accordingly, using known usage data of previous workflows having similar characteristics, the workflow analyzer can determine (block 102) the expected usage of each computing resource… workflows are first ordered based on their expected average resource usage with the most expensive workflow evaluated first.” The workflow analyzer stores a number of resources used over time for multiple workflows with similar characteristics and uses the expected average to order workflows, which corresponds to determining an average resource consumption of the plurality of hosts based on historical resource consumption associated with the plurality of workloads), determining a boost action based on the spare resource and the average resource consumption (Fig. 
1, paragraphs 65-66, block 103, “determining (block 103) a time of execution of each workflow may include simulating each possible execution order of the multiple workflows to determine an execution order that maximizes system resource usage… workflows are first ordered based on their expected average resource usage with the most expensive workflow evaluated first.” The time of execution of each workflow being based on the average resource usage and system resource usage correlates to determining a boost action based on the spare resource and average resource consumption), the boost action comprising a number of tasks among the plurality of tasks assigned to a respective one of the plurality of hosts (Fig. 1, paragraphs 65-66, block 103, “determining (block 103) a time of execution of each workflow may include simulating each possible execution order of the multiple workflows to determine an execution order that maximizes system resource usage… workflows are first ordered based on their expected average resource usage with the most expensive workflow evaluated first.” The time of execution of each workflow which includes multiple stages correlates to the boost action comprising a number of tasks to be assigned to the plurality of hosts); and dispatching the number of tasks to the respective one of the plurality of hosts based on the boost action (Fig. 1, paragraph 71, block 103, “With the time of execution of each workflow determined (block 103), the system may then effectuate the execution order. That is, the system depicted in FIG. 
2 for example, not only determines the order of execution, or a timing of execution of each workflow, but it also carries out the execution.” The time of execution for each workflow determined and executed correlates to dispatching the number of tasks to the plurality of hosts based on the boost action); sending, by the system, a message to a system resource, wherein the message includes a request for additional tasks based on an effectiveness of the boost action (Paragraphs 66 and 71, “This approach has the advantage that by keeping workflows that consume less resources at the end of the queue, they may better fill small available time in the available resources, rather than if the small jobs are executed at the beginning and a percentage of the resources are available, but cannot be utilized because remaining jobs consume more resources… With the time of execution of each workflow determined (block 103), the system may then effectuate the execution order. That is, the system depicted in FIG. 2 for example, not only determines the order of execution, or a timing of execution of each workflow, but it also carries out the execution. In some examples, this includes triggering the software that performs the data processing operation, such as sending a control signal to the system resource being consumed.” Resources still being available for a certain period of time after executing a task in a certain resource correlate to a boost action being effective. The smaller tasks at the end of the queue filling available time in the available resources which can be triggered through sending a control signal correlates to sending a request for additional tasks based on an effectiveness of the boost action): wherein the task execution efficiency is greater than a first threshold (Paragraph 74, “In one example, associated with each resource is a monitor which detects activity at the computing resource. For example, a CPU monitor may count the CPU operations to execute. 
That is, each CPU core can execute a certain number of operations per unit of time… When the CPU has capacity for executing operations but has nothing to do (because for instance the managed processes are waiting), it is idle. The percentage of time the CPU is idle may indicate the free capacity of the CPU.” The CPU having capacity for executing operations but having nothing to do and executing zero operations in an idle state correlate to the first threshold. The CPU monitor counting the number of CPU operations being executed in a non-idle state correlates to the task execution efficiency being greater than a first threshold) and the host resource utilization remains in a predefined range of resource utilization defined by a customer as part of a resource plan (Paragraph 70, “In some examples, determining (block 103) a time of execution of each workflow may be further based on a maximum usage for each computing resource. For example, the system or a user, may set a maximum threshold above which computing resource usage is not to exceed. Such a maximum threshold may define overcommitment of a computing resource.” The maximum usage for each computing resource having a maximum threshold indicates that all values below the threshold are acceptable and therefore correlates to the host resource utilization remaining in a predefined range of resource utilization. The system or user setting the maximum threshold correlates to the predefined range of resource utilization being defined by a customer as part of a resource plan), wherein the task execution efficiency is defined as a number of tasks per unit of time (Paragraph 74, “In one example, associated with each resource is a monitor which detects activity at the computing resource. For example, a CPU monitor may count the CPU operations to execute. 
That is, each CPU core can execute a certain number of operations per unit of time… When the CPU has capacity for executing operations but has nothing to do (because for instance the managed processes are waiting), it is idle. The percentage of time the CPU is idle may indicate the free capacity of the CPU.” The CPU monitor counting the CPU operations executed over a unit of time correlates to the task execution efficiency defined as the number of tasks per unit time); the next boost action comprising a second number of tasks among the plurality of tasks assigned to the respective one of the plurality of hosts (Fig. 2, paragraph 116, scheduler 208, “It is then determined if the system, in its current state, can accommodate the workflow. If yes, the scheduler (FIG. 2, 208) schedules it and goes back to pick and analyze another one.” The system accommodating the workflow and the scheduler scheduling the workflow and picking another workflow corresponds to the next boost action comprising a second number of tasks); dispatching the second number of tasks to the respective one of the plurality of hosts based on the next boost action (Fig. 2, paragraph 116, method 500, “So, the system (FIG. 2, 202) identifies which stage can be run next, giving preference to the one on which other stages are dependent. With the stages identified, the system (FIG. 2, 200) determines exactly what are the different stages in the workflow and the system (FIG. 2, 202) also sees what is the input to the workflow… It is then determined if the system, in its current state, can accommodate the workflow. If yes, the scheduler (FIG. 2, 208) schedules it and goes back to pick and analyze another one.” The system identifying which stages can be run next, determining if it can accommodate the workflow, and scheduling the workflow correlates to dispatching a second number of tasks to the respective one of the plurality of hosts based on the next boost action).
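The two gating conditions mapped above — task execution efficiency, defined as a number of tasks per unit of time, exceeding a first threshold, and host resource utilization staying within a customer-defined range from the resource plan — reduce to a simple predicate. The function names and threshold values here are illustrative assumptions, not from the claims or references:

```python
def task_execution_efficiency(tasks_completed: int,
                              elapsed_seconds: float) -> float:
    """The claimed definition: number of tasks per unit of time."""
    return tasks_completed / elapsed_seconds

def next_boost_allowed(tasks_completed: int, elapsed_seconds: float,
                       utilization: float, efficiency_threshold: float,
                       util_range: tuple) -> bool:
    """Permit the next boost action only while efficiency exceeds the
    first threshold and utilization stays in the customer-defined range."""
    low, high = util_range
    eff = task_execution_efficiency(tasks_completed, elapsed_seconds)
    return eff > efficiency_threshold and low <= utilization <= high
```

Either condition failing would, on the claimed scheme, make the boost ineffective and trigger the stop-sending-tasks message.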
reclaiming, in response to determining from the boost indicator that the next boost action is not effective, the number of tasks dispatched to the plurality of hosts (Fig. 3, paragraphs 89-90, block 303, “it is determined (block 303) whether computing resource usage exceeds an available amount. This may be performed for each computing resource. For example, it may be determined whether at a particular point in time, t0, and for any point throughout an execution period of the workflow, usage of any of the computing resources being analyzed exceeds a threshold amount for that particular resource… By comparison, when the expected usage of any computing resource over a period of time is projected to exceed the available amount of a computing resource (block 303, determination YES), the scheduler (FIG. 2, 208) introduces (block 305) a delay into the execution of that workflow. The delay may be indeterminate. That is, the scheduler (FIG. 2, 208) in one example may bump the workflow to a lower point in the queue to be re-evaluated at a later point in time. In another example, the delay may be determinate. That is, the scheduler (FIG. 2, 208), rather than executing the workflow at a time, t0, may determine a future point in time, i.e., t1-tn, at which to execute the workflow.” Determining whether computing resource usage exceeds an available amount for each computing resource correlates to determining from the boost indicator that the next boost action is not effective. The scheduler bumping the workflow to a lower point in the queue to be re-evaluated at a later point in time or determining a future point in time to execute the workflow in response to the expected usage of a computing resource exceeding the available amount of computing resource correlates to reclaiming the number of tasks dispatched to a plurality of hosts in response to determining that the next boost action is not effective). 
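Saillet's delay mechanism, as characterized above — bumping a workflow to a lower point in the queue when its expected usage would exceed the available resource, to be re-evaluated later — can be sketched as a single scheduling pass over a queue. The single scalar resource and the names below are simplifying assumptions:

```python
from collections import deque
from typing import Deque, List, Tuple

def schedule_round(queue: Deque[Tuple[str, float]],
                   available: float) -> List[str]:
    """One pass: dispatch workflows whose expected usage fits the
    available resource; bump the rest to the back of the queue for
    re-evaluation later (Saillet's indeterminate delay)."""
    dispatched, requeued = [], []
    for _ in range(len(queue)):
        name, expected_usage = queue.popleft()
        if expected_usage > available:
            requeued.append((name, expected_usage))  # delayed workflow
        else:
            available -= expected_usage
            dispatched.append(name)
    queue.extend(requeued)  # lower point in the queue
    return dispatched
```

Under the rejection's mapping, the requeued (delayed) workflows play the role of tasks "reclaimed" when the boost indicator shows the next boost action is not effective.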
Saillet does not explicitly teach that a reinforced learning model determines a boost action, that the reinforced model is comprised of a generative and discriminative model, that the historical resource consumption is determined following a sampling interval of the daemon iteratively determining and storing the spare resource and the current resource consumption, and that the respective one of the plurality of hosts [sends] a message to the workload scheduler. However, reinforced learning models are a popular method of determining boost actions based on spare and average resource consumption as evidenced by Miao below (Fig. 2, section IV, paragraphs 3-4, algorithm 1). Additionally, generative and discriminative models are a popular type of model included in reinforced learning models as evidenced by Miao below (Fig. 2, section IV, paragraphs 3-4, algorithm 1). Sampling intervals of a daemon which iteratively determine and store spare and current resource consumption are a popular method of obtaining spare and current resource consumption data used for determining historical resource consumption as evidenced by Moussaoui below (Moussaoui: Paragraphs 120, 123 and 136). Lastly, hosts [sending] a message to a workload scheduler is a popular method of workload and resource scheduling as evidenced by Mitra (Paragraph 41, “FIG. 
1A depicts a block diagram illustrating an example computing platform 100a including a self-learning application scheduler 122 operable to utilize a reinforcement learning agent (RL-Agent) 123 to efficiently schedule incoming resource requests 105, e.g., jobs or services, initiated by applications on a shared compute infrastructure 130, according to some implementations” The self-learning application scheduler receiving incoming resource requests initiated by applications on a shared compute infrastructure correlates to hosts sending a message to a workload scheduler) Saillet does not explicitly teach: implementing a daemon, wherein the daemon iteratively determines and stores a spare resource and a current resource consumption of each of a plurality of hosts for a plurality of workloads, and wherein the daemon determines the spare resource and the current resource consumption for each of the plurality of hosts based on metadata collected for each host, wherein the metadata includes the current resource consumption by each task of the plurality of tasks or each application for each of the plurality of hosts; determining a reward based on a change in a task execution efficiency after the boost action and a host resource utilization via the discriminative model; determining, a next boost action via the generative model based on positive feedback as the reward with the boost action as a state of the generative model, determining a boost indicator that indicates whether the next boost action is effective for the plurality of workloads, sending, by the respective one of the plurality of hosts, a message to the workload scheduler, wherein the message includes a request to stop sending tasks to the respective one of the plurality of hosts based on an ineffectiveness of the next boost action. 
However, Moussaoui teaches: implementing a daemon, wherein the daemon iteratively determines and stores a spare resource and a current resource consumption of each of a plurality of hosts for a plurality of workloads (Paragraphs 120, 123 and 136, “In one or more embodiments, the resource management unit 302 may further be configured to configure 314 the resource allocation process 303b2 with updated resource allocation. This advantageously allows storing locally to the computing environment 303 information regarding allocation and status of resources allocated to an application program of interest… Therefore, in some embodiments, the resource allocation process 303b2 may advantageously be used to store, locally to the cluster in which an instance of the application program of interest is running, detailed information regarding resources (e.g. CPU resources and/or memory resources) allocated to run the instance of the application program, and possibly information regarding status of resources (e.g. CPU resources and/or memory resources) of the computing machine on which the instance of the application program is running. Such status information may for example comprise information on which CPU resource has already been allocated, and/or which CPU resource is free of allocation… the resource management control unit 402 may be configured for obtaining, for example from a resource allocation daemon process running on the computing machine on which the Cluster comprising the Pod in which the instance of the application program is running, current resource allocation (e.g. CPU resource currently allocated to the Pod) and current resource status (e.g. 
status of CPU resources among allocated resources and available resources.” The resource management control unit obtaining current resource allocation and current resource status, which includes available resources, from the resource allocation daemon correlates to a daemon iteratively determining the spare and current resource consumption on each of the plurality of hosts. The resource management control unit storing information regarding allocation and status of resources allocated to an application program of interest locally correlates to storing a spare resource and current resource consumption of each of a plurality of hosts for a plurality of workloads). and wherein the daemon determines the spare resource and the current resource consumption for each of the plurality of hosts based on metadata collected for each host, wherein the metadata includes the current resource consumption by each task of the plurality of tasks or each application for each of the plurality of hosts (Paragraphs 123 and 136, “Therefore, in some embodiments, the resource allocation process 303b2 may advantageously be used to store, locally to the cluster in which an instance of the application program of interest is running, detailed information regarding resources (e.g. CPU resources and/or memory resources) allocated to run the instance of the application program, and possibly information regarding status of resources (e.g. CPU resources and/or memory resources) of the computing machine on which the instance of the application program is running. 
Such status information may for example comprise information on which CPU resource has already been allocated, and/or which CPU resource is free of allocation… the resource management control unit 402 may be configured for obtaining, for example from a resource allocation daemon process running on the computing machine on which the Cluster comprising the Pod in which the instance of the application program is running, current resource allocation (e.g. CPU resource currently allocated to the Pod) and current resource status (e.g. status of CPU resources among allocated resources and available resources.” The resource management control unit obtaining current resource allocation and current resource status, which includes available resources and the status of resources, for computing machines which the instance of the application program is running from the resource allocation daemon correlates to a daemon determining the spare and current resource consumption on each of the plurality of hosts based on metadata collected for each host. The detailed information including current CPU and/or memory resource allocation of the computing machine on which the instance of the application program is running correlates to the metadata including the current resource consumption for each application for each of the plurality of hosts); Additionally, Mitra teaches: determining a reward based on a change in a task execution efficiency after the boost action (Fig. 7, paragraph 61, “The reward/penalty generation module 126 is configured to calculate a reward or penalty based on the observed state of the shared compute infrastructure. For example, the reward/penalty generation module 126 can determine a change in the state of the shared compute infrastructure 130 occurring as a result of performing the scheduling action and responsively calculate a reward or penalty based on the change in state. 
As discussed herein, the reward or penalty can be a summation of multiple components including at least a resource contention component, a resource over utilization component, and a scheduling delay component.” The reward generation module calculating a reward based on a change in state of the shared compute infrastructure as a result of performing the scheduling action correlates to determining a reward based on a change in task execution efficiency after the boost action) and a host resource utilization (Paragraph 44, “More specifically, the RL-Agent 123 interacts with compute infrastructure 130 to learn an optimized policy that reduces application slowdown by taking scheduling actions A.sub.t and observing how those scheduling actions A.sub.t affect the state S.sub.t of the system. The observed state S.sub.t of the system comes with an associated reward (or penalty) when the system achieves (or does not achieve) the desirable properties, e.g., resource contention among applications, scheduling delay, etc.” The RL-Agent interacting with compute infrastructure to learn an optimized policy and having an observed state associated with a reward when the system achieves desirable properties such as resource contention among applications correlates to determining a reward based on a host resource utilization); determining a next boost action based on positive feedback as the reward with the boost action (Fig. 7, paragraphs 61-62, “the reward/penalty generation module 126 can determine a change in the state of the shared compute infrastructure 130 occurring as a result of performing the scheduling action and responsively calculate a reward or penalty based on the change in state… The scheduling action determination module (policy network) 127 is configured to select one or more machines of multiple machines of the shared compute infrastructure 130 on which to schedule the incoming resource requests initiated by the applications based on a scheduling policy. 
The scheduling action determination module (policy network) 127 is further configured to iteratively learn or refine the scheduling policy based on the calculated reward or penalty to maximize an expected future reward or minimize an expected future penalty.” The scheduling action determination module using the calculated reward, which can be positive as the future reward is maximized, to iteratively learn and scheduling incoming resource requests correlates to determining a next boost action based on positive feedback as the reward with the boost action), determining a boost indicator that indicates whether the next boost action is effective for the plurality of workloads (Fig. 7, paragraph 95, resource contention component 710, resource over-utilization component 720, scheduling delay component 730, “More specifically, the example of FIG. 7 illustrates calculating a total reinforcement learning based penalty (negative reward) 740 defined by the combination of a resource contention component 710, a resource over-utilization component 720, and a scheduling delay (or wait) component 730.” The elements 710, 720, and 730 used to calculate the reward correlate to the boost indicator that indicates whether the next boost action is effective for a plurality of workloads). Saillet in view of Mitra fails to disclose wherein the reward is determined via a discriminative model and determining the next boost action via a generative model. However, Miao teaches: determining a reward via the discriminative model (Miao: Fig. 2, section IV, paragraphs 3-4, algorithm 1, “The discriminator is updated to distinguish the trajectories. Then, we update the generator so that the generated trajectory is indistinguishable. 
The method is to use the logarithm of negative discriminator output as the reward function, and update the network with PPO algorithm.” The discriminator using the logarithm of negative discriminator output as the reward function correlates to determining the reward via a discriminative model); and determining a next action via the generative model based on the reward with the action as a state of the generative model (Miao: Fig. 2, section IV, paragraphs 3-4, algorithm 1, “Thus, we can train the policies jointly with a generative adversarial framework to optimize the reward function of the users. In this framework, we introduce a generator and a discriminator. The generator runs PPO to generate trajectories, while the discriminator tries to discriminate the expert trajectory from the generated trajectory... Then, we update the generator so that the generated trajectory is indistinguishable.” The generator being used to optimize the reward function and generate trajectories correlates to the generative model determining a next action based on the reward. The generator being updated so the generated trajectory is indistinguishable correlates to the action being a state of the generative model). 
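For illustration, the discriminator-as-reward mechanism Miao describes follows the common GAIL-style formulation. A minimal numeric sketch (the logistic stand-in for the learned discriminator and the sign convention are illustrative assumptions, not taken from Miao):

```python
import math

def discriminator(score: float) -> float:
    """Toy stand-in for a learned discriminator: maps a scalar score for a
    (state, action) pair to the probability that it came from the generator
    rather than the expert (logistic squashing)."""
    return 1.0 / (1.0 + math.exp(-score))

def gail_reward(score: float) -> float:
    """Reward for the generator (policy) derived from the discriminator
    output: -log D(s, a).  The reward grows as the discriminator becomes
    unable to tell the generated trajectory from the expert trajectory
    (D near 0 means 'looks expert-like')."""
    return -math.log(discriminator(score))

# A pair the discriminator confidently flags as generated earns little
# reward, while one it mistakes for expert behaviour earns a large reward,
# pushing the policy toward expert-like (indistinguishable) trajectories.
low = gail_reward(4.0)    # D near 1: clearly generated
high = gail_reward(-4.0)  # D near 0: indistinguishable from expert
assert high > low
```

In the full method, the policy network would then be updated with PPO using these rewards, closing the loop the quoted passage describes.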
Additionally, Kochunni teaches: sending, by the respective one of the plurality of hosts, a message to the workload scheduler, wherein the message includes a request to stop sending tasks to the respective one of the plurality of hosts based on an ineffectiveness of the next boost action (Paragraph 264, “In some embodiments, a job manager (or any other component configured to schedule and execute jobs and/or allocate resources for the jobs) may determine what kind of resources are available in the system, determine what kind of resources are requested by the job, and allocate a portion of the available resources for the job based on the determinations… For example, the job manager may make an application program interface (API) call to request a specific resource for a particular job. The call may return with the requested resource (e.g., a pointer to the resource) or some other expected result. However, if the requested resource cannot immediately be provided, the call, if it is a blocking call, would cause the job manager to wait until the requested resource can be provided. Once the requested resource becomes available, the call returns the requested resource and the job manager proceeds to process the next job.” The requested resource not being available in response to the API call from the job manager involves sending an indication it is not available and correlates to sending, by the respective one of the plurality of hosts, a message to the workload scheduler. The API call being a blocking call and the requested resource being unavailable correlates to an ineffectiveness of the next boost action. The job manager waiting until the requested resource is available again before requesting the resource to process the next job correlates to the message includes a request to stop sending tasks to the respective one of the plurality of hosts based on an ineffectiveness of the next boost action). 
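The blocking-call behavior Kochunni describes in paragraph 264 can be sketched with a toy resource pool (the class and method names are hypothetical; only the wait-until-available semantics come from the reference):

```python
import threading

class ResourcePool:
    """Toy resource pool illustrating the blocking-call pattern: if the
    requested resource cannot immediately be provided, the call blocks the
    job manager until another job releases the resource."""

    def __init__(self, units: int):
        self._free = units
        self._cond = threading.Condition()

    def acquire(self) -> None:
        with self._cond:
            while self._free == 0:   # resource not immediately available
                self._cond.wait()    # job manager waits here
            self._free -= 1

    def release(self) -> None:
        with self._cond:
            self._free += 1
            self._cond.notify()      # wake a waiting job manager

pool = ResourcePool(units=1)
pool.acquire()                       # first job takes the only unit
done = []

def second_job():
    pool.acquire()                   # blocks until the unit is released
    done.append("ran")
    pool.release()

t = threading.Thread(target=second_job)
t.start()
pool.release()                       # first job finishes; waiter proceeds
t.join()
assert done == ["ran"]
```

The blocked `acquire` corresponds to the job manager waiting until the requested resource can be provided before processing the next job.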
Therefore, it would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which said subject matter pertains to combine Saillet with implementing a daemon, wherein the daemon iteratively determines and stores a spare resource and a current resource consumption of each of a plurality of hosts for a plurality of workloads and wherein the daemon determines the spare resource and the current resource consumption for each of the plurality of hosts based on metadata collected for each host, wherein the metadata includes the current resource consumption by each task of the plurality of tasks or each application for each of the plurality of hosts as taught by Moussaoui because the daemon can provide current information on whether or not the request for resource allocation can be granted based on comparing the current resource status to the resource allocation rules. This can allow an application running in real time which requires low latency and high throughput to be advantageously scheduled (Moussaoui: paragraphs 138 and 151). Additionally, it would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which said subject matter pertains to combine Saillet with sending, by the respective one of the plurality of hosts, a message to a workload scheduler, determining a reward based on a change in a task execution efficiency after the boost action; and determining a next boost action based on the reward with the boost action, determining a boost indicator that indicates whether the next boost action is effective for the plurality of workloads as taught by Mitra because the iterative reinforcement learning process used for learning an optimum scheduling policy improves scheduling distribution of resource requests initiated by applications on a shared compute infrastructure. 
Additionally, a customizable reinforcement learning-based reward or penalty helps to teach a reinforcement learning agent desirable properties of the system and avoid resource interference (Mitra: Fig. 7, paragraphs 63 and 95-96), thereby improving the learning process. It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which said subject matter pertains to combine Saillet with determining a reward via a discriminative model; and determining a next action via a generative model based on the reward with the action as a state of the generative model as taught by Miao. Combining the generation of the boost action of Saillet using a generative model and generating the reward taught by Mitra using a discriminative model would have been obvious because reinforcement learning models can be computationally expensive to train, and these models allow a lower overhead via imitation learning (Miao: Section I, paragraph 3), thereby improving the learning process. It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which said subject matter pertains to combine Saillet with sending, by the respective one of the plurality of hosts, a message to the workload scheduler, wherein the message includes a request to stop sending tasks to the respective one of the plurality of hosts based on an ineffectiveness of the next boost action as taught by Kochunni because sequential approaches for resource allocation can be implemented using blocking calls that allow a requested resource to indicate it cannot be immediately provided and therefore cause the job manager to wait until it is available. Job managers can then determine what kinds of resources are available in the system, determine what kind of resources are requested, and allocate a portion of the available resources based on these determinations (Kochunni: paragraph 264). 
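For illustration, the three-component penalty of Mitra's Fig. 7, relied on throughout this rationale, can be sketched numerically (the component forms are hypothetical; only the three-part combination of contention, over-utilization, and scheduling delay comes from Mitra):

```python
def over_utilization_penalty(demand: float, capacity: float) -> float:
    """Penalty for scheduling more resource requests than a machine can
    handle: zero while demand fits, growing with the shortfall."""
    return max(0.0, demand - capacity)

def total_penalty(contention: float, demand: float, capacity: float,
                  scheduling_delay: float) -> float:
    """Total reinforcement-learning penalty (negative reward) as the
    combination of a resource contention component, a resource
    over-utilization component, and a scheduling delay component."""
    return contention + over_utilization_penalty(demand, capacity) + scheduling_delay

# A machine that fits its load and schedules promptly incurs no penalty,
# so the corresponding reward (its negation) is maximal.
assert total_penalty(0.0, 0.8, 1.0, 0.0) == 0.0
# Overcommitting the machine by 0.5 units adds a 0.5 penalty.
assert total_penalty(0.0, 1.5, 1.0, 0.0) == 0.5
```

A policy network would iteratively refine its scheduling decisions to minimize this expected penalty, which is the learning loop the combination rationale invokes.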
With regards to Claim 1, the system of Claim 9 performs the same steps as the method of Claim 1, and Claim 1 is therefore rejected using the same art and rationale set forth above in the rejection of claim 9. With regards to Claim 17, the system of Claim 9 performs the same steps as the product of Claim 17, and Claim 17 is therefore rejected using the same art and rationale set forth above in the rejection of claim 9. With regards to claim 13, Saillet in view of Mitra, Miao, Moussaoui and Kochunni teach the computer system of Claim 9 as referenced above. Saillet further teaches: determining whether a host resource utilization for the next boost action is greater than a threshold (Fig. 1, paragraph 70, block 103, “In some examples, determining (block 103) a time of execution of each workflow may be further based on a maximum usage for each computing resource. For example, the system or a user, may set a maximum threshold above which computing resource usage is not to exceed. Such a maximum threshold may define overcommitment of a computing resource.” Determining the time of execution based on if the computing resource usage exceeds a threshold correlates to determining whether a host resource utilization for the next boost action is greater than a threshold); and in response to determining that the host resource utilization for the next boost action is greater than the threshold, determining the boost action to be an invalid boost action (Fig. 1, paragraph 66, block 103, “If executing the first workflow in the queue would lead to an overcommitment, it is bumped and the next workflow in the queue is analyzed.” The first workflow leading to an overcommitment due to resource usage and bumped from the queue correlates to determining the boost action to be an invalid boost action). With regards to Claim 5, Saillet in view of Mitra, Miao, Moussaoui and Kochunni teach the method of Claim 1 as referenced above. 
The system of Claim 13 performs the same steps as the method of Claim 5, and Claim 5 is therefore rejected using the same art and rationale set forth above in the rejection of claim 13. With regards to Claim 14, Saillet in view of Mitra, Miao, Moussaoui and Kochunni teach the system of Claim 9 as referenced above. Mitra further teaches: determining a boost indicator that indicates whether the boost action is effective for the plurality of workloads (Fig. 7, paragraph 95, resource contention component 710, resource over-utilization component 720, scheduling delay component 730, “More specifically, the example of FIG. 7 illustrates calculating a total reinforcement learning based penalty (negative reward) 740 defined by the combination of a resource contention component 710, a resource over-utilization component 720, and a scheduling delay (or wait) component 730.” The elements 710, 720, and 730 used to calculate the reward correlate to the boost indicator that indicates whether the boost action is effective for a plurality of workloads). Therefore, it would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which said subject matter pertains to combine Saillet with determining a boost indicator that indicates whether the boost action is effective for the plurality of workloads as taught by Mitra because a customizable reinforcement learning-based reward or penalty helps to teach a reinforcement learning agent desirable properties of the system and avoid resource interference (Mitra: Fig. 7, paragraphs 95-96), thereby improving the learning process. With regards to Claim 6, Saillet in view of Mitra, Miao, Moussaoui and Kochunni teach the method of Claim 1 as referenced above. The system of Claim 14 performs the same steps as the method of Claim 6, and Claim 6 is therefore rejected using the same art and rationale set forth above in the rejection of claim 14. 
With regards to Claim 19, Saillet in view of Mitra, Miao, Moussaoui and Kochunni teach the product of Claim 17 as referenced above. The system of Claim 14 performs the same steps as the product of Claim 19, and Claim 19 is therefore rejected using the same art and rationale set forth above in the rejection of claim 14. With regards to Claim 15, Saillet in view of Mitra, Miao, Moussaoui and Kochunni teach the system of Claim 14 as referenced above. Mitra further teaches: wherein the boost indicator (Mitra: Fig. 7, paragraph 95, resource contention component 710, resource over-utilization component 720, scheduling delay component 730, “More specifically, the example of FIG. 7 illustrates calculating a total reinforcement learning based penalty (negative reward) 740 defined by the combination of a resource contention component 710, a resource over-utilization component 720, and a scheduling delay (or wait) component 730.” The elements 710, 720, and 730 used to calculate the reward correlate to the boost indicator) comprises at least one of: a number of workloads completed per unit time; and a difference between a resource scheduled by the boost action and an actual resource utilization (Mitra: Fig. 7, paragraph 98, resource over-utilization component 720, “The resource over-utilization component 720 is a penalty designed to prevent scheduling of more resource requests than can be handled by a machine. More specifically, the resource over-utilization component 720 introduces a penalty when a machine is not able to meet the resource requirements of resource requests scheduled on that machine.” The total reinforcement learning penalty consisting of a resource over-utilization component which prevents the scheduling of more resource requests than the machine can handle correlates to the boost indicator comprising a difference between a resource scheduled by the boost action and the actual resource utilization). 
Therefore, it would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which said subject matter pertains to combine Saillet with wherein the boost indicator comprises at least one of: a number of workloads completed per unit time; and a difference between a resource scheduled by the boost action and an actual resource utilization as taught by Mitra because a customizable reinforcement learning-based reward or penalty helps to teach a reinforcement learning agent desirable properties of the system and avoid resource interference (Mitra: Fig. 7, paragraphs 95-96), thereby improving the learning process. With regards to Claim 7, Saillet in view of Mitra, Miao, Moussaoui and Kochunni teach the method of Claim 6 as referenced above. The system of Claim 15 performs the same steps as the method of Claim 7, and Claim 7 is therefore rejected using the same art and rationale set forth above in the rejection of claim 15. With regards to claim 16, Saillet in view of Mitra, Miao, Moussaoui and Kochunni teach the system of Claim 14 as referenced above. Saillet further teaches: in response to determining from the boost indicator that the boost action is not effective, at least one of (Fig. 3, paragraph 89, block 303, “it is determined (block 303) whether computing resource usage exceeds an available amount. This may be performed for each computing resource. For example, it may be determined whether at a particular point in time, t0, and for any point throughout an execution period of the workflow, usage of any of the computing resources being analyzed exceeds a threshold amount for that particular resource.” Determining whether computing resource usage exceeds an available amount for each computing resource correlates to determining from the boost indicator that the boost action is not effective): stopping to dispatch one or more of the plurality of tasks to the plurality of hosts (Fig. 
3, paragraph 90, block 305, “when the expected usage of any computing resource over a period of time is projected to exceed the available amount of a computing resource (block 303, determination YES), the scheduler (FIG. 2, 208) introduces (block 305) a delay into the execution of that workflow. The delay may be indeterminate.” The scheduler introducing an indefinite delay into the execution of the workflow corresponds to stopping to dispatch one or more of the plurality of tasks to the plurality of hosts); and reclaiming the number of tasks dispatched to the plurality of hosts. With regards to Claim 8, Saillet in view of Mitra, Miao, Moussaoui and Kochunni teach the method of Claim 6 as referenced above. The system of Claim 16 performs the same steps as the method of Claim 8, and Claim 8 is therefore rejected using the same art and rationale set forth above in the rejection of claim 16. With regards to Claim 20, Saillet in view of Mitra, Miao, Moussaoui and Kochunni teach the product of Claim 19 as referenced above. The system of Claim 16 performs the same steps as the product of Claim 20, and Claim 20 is therefore rejected using the same art and rationale set forth above in the rejection of claim 16. With regards to Claim 21, Saillet in view of Mitra, Miao, Moussaoui and Kochunni teach the method of Claim 1 as referenced above. 
Saillet further teaches: determining, by the one or more processors, the host resource utilization is less than a second threshold (Paragraph 8, “scheduling a workflow consuming a large amount of the first computing resource to start at the particular point in time so long as a resulting computing resource usage over time does not peak to greater than a threshold amount.” Scheduling a workflow so long as the resulting computing resource usage over time does not exceed a threshold amount correlates to determining the host resource utilization is less than a threshold); determining, by the one or more processors, the boost action to be an invalid boost action (Fig. 1, paragraph 66, block 103, “If executing the first workflow in the queue would lead to an overcommitment, it is bumped and the next workflow in the queue is analyzed.” The first workflow leading to an overcommitment due to resource usage and bumped from the queue correlates to determining the boost action to be an invalid boost action); and updating, by the one or more processors, the next boost action based on the host resource utilization (Fig. 2, paragraph 116, scheduler 208, “It is then determined if the system, in its current state, can accommodate the workflow. If yes, the scheduler (FIG. 2, 208) schedules it and goes back to pick and analyze another one.” The system accommodating the workflow and the scheduler scheduling the workflow and picking another workflow using the same analysis which includes computing resource usage corresponds to updating the next boost action based on the host resource utilization). 
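The two-threshold logic mapped above for Claim 21 can be sketched as a small predicate (the threshold values and result labels are illustrative, not taken from the claim language or the references):

```python
def evaluate_boost(utilization: float, efficiency_change: float,
                   first_threshold: float, second_threshold: float) -> str:
    """Sketch of the Claim 21 logic as mapped above: the boost action is
    invalid when host resource utilization falls below the second
    threshold; otherwise it is effective when the change in task execution
    efficiency exceeds the first threshold."""
    if utilization < second_threshold:
        return "invalid"        # bumped; the next boost action is updated
    if efficiency_change > first_threshold:
        return "effective"
    return "re-evaluate"

assert evaluate_boost(0.2, 0.5, first_threshold=0.1, second_threshold=0.3) == "invalid"
assert evaluate_boost(0.6, 0.5, first_threshold=0.1, second_threshold=0.3) == "effective"
assert evaluate_boost(0.6, 0.05, first_threshold=0.1, second_threshold=0.3) == "re-evaluate"
```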
Saillet does not explicitly teach: determining, by the one or more processors, the change in the task execution efficiency is greater than a first threshold. However, Mitra teaches: determining, by the one or more processors, the change in the task execution efficiency is greater than a first threshold (Paragraph 52, “When the combined resource demands from a machine by all the co-scheduled resource requests exceed a threshold, e.g., machine's physical capacity or CPU utilization capacity, the execution of the resource requests, e.g., jobs or services, can crash the machine or severely slowdown the machine (e.g., due to memory thrashing or CPU starvation).” Comparing the combined resource demands from the co-scheduled resource requests to a threshold which includes the execution of the resource requests correlates to determining the change in task execution efficiency is greater than a first threshold). Therefore, it would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which said subject matter pertains to combine Saillet with determining, by the one or more processors, the change in the task execution efficiency is greater than a first threshold as taught by Mitra because any crash or slowdown degrades the user experience. Ensuring that the resource requirements are met through comparison of the execution of resource requests over time helps to avoid these scenarios (Mitra: paragraph 52). With regards to Claim 22, Saillet in view of Mitra, Miao, Moussaoui and Kochunni teach the method of Claim 1 as referenced above. Saillet further teaches: wherein dispatching the number of tasks to the respective one of the plurality of hosts based on the boost action (Fig. 1, paragraph 71, block 103, “With the time of execution of each workflow determined (block 103), the system may then effectuate the execution order. That is, the system depicted in FIG. 
2 for example, not only determines the order of execution, or a timing of execution of each workflow, but it also carries out the execution.” The time of execution for each workflow determined and executed correlates to dispatching the number of tasks to the plurality of hosts based on the boost action), further comprises: Saillet does not explicitly teach: implementing, by the one or more processors, the workload scheduler and a resource scheduler at the plurality of hosts, wherein the workload scheduler and the resource scheduler are implemented at a separated node or at a virtualized layer over nodes of the plurality of hosts. However, Mitra teaches: implementing, by the one or more processors, the workload scheduler (Paragraph 39, “As used herein, the term “reinforcement learning agent” refers to a reinforcement learning-based agent that iteratively learns an optimum scheduling policy for efficiently predicting on which of one or more shared resources to schedule incoming resource requests for minimizing resource contention.” The reinforcement learning agent which predicts which of the one or more shared resources to schedule incoming resource requests correlates to the workload scheduler) and a resource scheduler (Paragraph 38, “As used herein, the term “self-learning application scheduler” refers to a scheduler that uses machine learning algorithms and artificial intelligence to determine how to allocate shared computing resources among applications.” The self-learning application scheduler that determines how to allocate shared computing resources correlates to a resource scheduler) at the plurality of hosts, wherein the workload scheduler and the resource scheduler are implemented at a separated node or at a virtualized layer over nodes of the plurality of hosts (Fig. 
1A, 10 and 12, paragraphs 41 and 55-56, “As discussed herein, the applications can be containerized, e.g., encapsulated by one or more containers, or run directly on the shared compute infrastructure or hardware… As shown in the example of FIG. 1A, the RL-Agent 123 acts on compute infrastructure 130 (environment or system) which can include multiple compute nodes or processing units… The self-learning application scheduler 122 can include or be executed on any system or collection of systems configured to perform the scheduling actions discussed herein… Such systems may employ one or more virtual machines, containers, or any other type of virtual computing resource in the context of improving application performance orchestration on a platform of which computing system 1201 of FIG. 12 is representative.” The RL-Agent acting on multiple compute nodes and the self-learning application scheduler being executed on any system or collection of systems employing virtual computing resources correlates to the workload and resource scheduler implemented at a separated node or virtualized layer over nodes of the plurality of hosts). Therefore, it would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which said subject matter pertains to combine Saillet with implementing, by the one or more processors, a workload scheduler and a resource scheduler at the plurality of hosts, wherein the workload scheduler and the resource scheduler are implemented at a separated node or at a virtualized layer over nodes of the plurality of hosts as taught by Mitra because workload and resource schedulers can utilize reinforcement learning to efficiently predict which of the shared resources to schedule incoming requests for, which minimizes resource contention and improves application performance for a better overall user experience. 
The implementation through nodes or a virtualized layer over nodes allows clustering software to be installed on each of the servers to perform administrative tasks such as load balancing, determining node failures, and assigning failover duty. (Mitra: paragraphs 38 and 40). With regards to Claim 24, Saillet in view of Mitra, Miao, Moussaoui and Kochunni teach the method of Claim 1 as referenced above. Saillet further teaches: wherein the system is configured to generate boost actions (Fig. 1, paragraphs 65-66, block 103, “determining (block 103) a time of execution of each workflow may include simulating each possible execution order of the multiple workflows to determine an execution order that maximizes system resource usage… workflows are first ordered based on their expected average resource usage with the most expensive workflow evaluated first.” The time of execution of each workflow being based on the average resource usage and system resource usage correlates to generating a boost action) and wherein the system further receives as inputs, a name of each application (Paragraph 41, “Specifically, in order to optimize the execution time of each workflow, the system includes a component to predict the expected resource usage during the different points of the execution time of the running workflows, as well as queued workflows. This component is added to the workload manager. An optimization component is also added that will search which of the queued workflows can be started at which point of time to ensure an optimal utilization of the resources, that is to ensure maximum utilization without overcommitment. 
In one example, this includes a system that receives, in a queue, a list of workflows to execute and determines the optimal time to execute each workflow to maximize the utilization of the system without overcommitting it.” The system receiving a list of workflows to be executed includes an identifier to differentiate or separate each workflow in the list and therefore correlates to a system receiving a name of each application as input), pending tasks (Paragraphs 41-42, “Specifically, in order to optimize the execution time of each workflow, the system includes a component to predict the expected resource usage during the different points of the execution time of the running workflows, as well as queued workflows. This component is added to the workload manager. An optimization component is also added that will search which of the queued workflows can be started at which point of time to ensure an optimal utilization of the resources, that is to ensure maximum utilization without overcommitment. In one example, this includes a system that receives, in a queue, a list of workflows to execute and determines the optimal time to execute each workflow to maximize the utilization of the system without overcommitting it… the resource usage of each workflow is computed by doing a static analysis of the stages involved in the workflow. That is, each workflow is made up of a sequence of stages, e.g., a filter, a sorting stage, a transform operator. 
Each of these stages has a known behavior regarding its resource usage, depending on the operations it implements, the input throughput, the throughput at which it can output its data to the next stage, etc.” The system receiving a list of workflows to be executed, where each of the workflows are made up of a sequence of stages correlates to a system receiving pending tasks as input), the spare resource for each of the plurality of hosts (Paragraph 61, “The method (100) also includes determining (block 103) a time of execution of each workflow in the queue based on the available amount of each of the multiple computing resources over time and the expected usage of each computing resource to execute each workflow in the queue. That is, once the available amount of computing resources over a period of time is determined (block 101) and expected usage over time of each computing resource to execute a particular job is determined (block 102), the system can determine (block 103) a time of execution by comparing these two pieces of data.” The available amount of each of the multiple computing resources given to the system correlates to the system receiving a spare resource for each of the plurality of hosts as input), the historical resource consumption for each of the plurality of hosts (Paragraph 61, “The method (100) also includes determining (block 103) a time of execution of each workflow in the queue based on the available amount of each of the multiple computing resources over time and the expected usage of each computing resource to execute each workflow in the queue. 
That is, once the available amount of computing resources over a period of time is determined (block 101) and expected usage over time of each computing resource to execute a particular job is determined (block 102), the system can determine (block 103) a time of execution by comparing these two pieces of data.” The expected usage of each of the multiple computing resources given to the system correlates to the system receiving the historical resource consumption for each of the plurality of hosts as input), Saillet does not explicitly teach that the boost actions are generated by a generative model and that the inputs are received by a generative model. However, reinforcement learning models including generative models are a popular method of generating actions as evidenced by Miao above (Fig. 2, section IV, paragraphs 3-4, algorithm 1). Additionally, receiving inputs to generate actions based on the input is a popular method of generating actions from a generative model’s input state as evidenced by Miao above (Fig. 2, section IV, paragraphs 2-4, algorithm 1). Moussaoui further teaches: and wherein the system further receives as inputs, a host name for each of the plurality of hosts (Paragraph 110, “In such cases, the resource management unit 302 may be configured to obtain 312 from the cluster management node 303a identifiers of Pods and Containers in which the instance of the application program has been scheduled to run.” The resource management unit obtaining identifiers of the pods and containers scheduled to run the application program correlates to the system receiving a host name for each of the plurality of hosts as input). Miao further teaches: and the discriminative model is configured to evaluate the actions (Fig. 2, section IV, paragraphs 3-4, algorithm 1, “Thus, we can train the policies jointly with a generative adversarial framework to optimize the reward function of the users. In this framework, we introduce a generator and a discriminator. 
The generator runs PPO to generate trajectories, while the discriminator tries to discriminate the expert trajectory from the generated trajectory. The discriminator is updated to distinguish the trajectories. Then, we update the generator so that the generated trajectory is indistinguishable. The method is to use the logarithm of negative discriminator output as the reward function, and update the network with PPO algorithm.” The discriminator calculating a reward function using the logarithm of negative discriminator output, which is calculated based on the generated action from the generative model, correlates to the discriminative model evaluating the action), Miao does not explicitly teach that the action is a boost action. However, boost actions are a popular type of action generated and evaluated by models as evidenced by Mitra above (Fig. 7, paragraphs 61-62). Therefore, it would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which said subject matter pertains to combine Saillet with the system further receiving as inputs a host name for each of the plurality of hosts as taught by Moussaoui because each application may be allocated multiple resources for execution such as pods. Obtaining all the associated information for respective system physical resources where an application is scheduled to run can be used to request information on currently allocated resources to the corresponding pods and containers from resource allocation processes (Moussaoui: paragraph 110). 
It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which said subject matter pertains to combine Saillet with "and the discriminative model is configured to evaluate the actions" as taught by Miao, because reinforcement learning models can be computationally expensive to train, and these models allow a lower overhead via imitation learning (Miao: Section I, paragraph 3), thereby improving the learning process.

With regards to Claim 25, Saillet in view of Mitra, Miao, Moussaoui and Kochunni teach the method of Claim 1 as referenced above. Moussaoui further teaches: wherein the daemon continues to sample the metadata collected for each host until the daemon determines that the sampling interval has been reached (Paragraphs 123 and 136-137, "Therefore, in some embodiments, the resource allocation process 303b2 may advantageously be used to store, locally to the cluster in which an instance of the application program of interest is running, detailed information regarding resources (e.g. CPU resources and/or memory resources) allocated to run the instance of the application program, and possibly information regarding status of resources (e.g. CPU resources and/or memory resources) of the computing machine on which the instance of the application program is running. Such status information may for example comprise information on which CPU resource has already been allocated, and/or which CPU resource is free of allocation… the resource management control unit 402 may be configured for obtaining, for example from a resource allocation daemon process running on the computing machine on which the Cluster comprising the Pod in which the instance of the application program is running, current resource allocation (e.g. CPU resource currently allocated to the Pod) and current resource status (e.g. status of CPU resources among allocated resources and available resources… For example, the Linux command «lscpu», which retrieves CPU architecture information from sysfs and /proc/cpuinfo, may in some embodiments be used to retrieve via the resource allocation daemon a list of all the CPU nodes of the server on which the Pod is running." The detailed information including current CPU and/or memory resource allocation of the computing machine on which the instance of the application program is running correlates to the daemon continuing to sample the metadata for each host. The resource management control unit obtaining current resource allocation and current resource status for computing machines on which the instance of the application program is running, which can be stopped once the application or pod is no longer running, from the resource allocation daemon correlates to a daemon sampling the metadata until the daemon determines the sampling interval has been reached).

Therefore, it would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which said subject matter pertains to combine Saillet with "wherein the daemon continues to sample the metadata collected for each host until the daemon determines that the sampling interval has been reached" as taught by Moussaoui, because the daemon can provide current information on whether or not the request for resource allocation can be granted based on comparing the current resource status to the resource allocation rules. This can allow an application running in real time which requires low latency and high throughput to be advantageously scheduled. Daemons can also be used to provide relevant status and resource allocation information for currently allocated resources in response to client requests for resource allocation (Moussaoui: paragraphs 133, 135-136, 138 and 151).
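The daemon behavior mapped above, sampling per-host metadata until a sampling interval is reached, amounts to a bounded polling loop. The sketch below is a minimal illustration; the function names, the metadata shape, and the polling period are all assumptions, not drawn from Moussaoui.

```python
import time

def sample_until_interval(read_metadata, interval_s: float, period_s: float = 0.5):
    """Poll read_metadata() until interval_s seconds have elapsed.

    read_metadata is assumed to return a dict of per-host metrics, e.g.
    {"host-a": {"cpu_alloc": 0.4, "mem_alloc": 0.7}}. The loop exits when
    the daemon determines the sampling interval has been reached.
    """
    samples = []
    deadline = time.monotonic() + interval_s
    while time.monotonic() < deadline:
        samples.append(read_metadata())  # collect one snapshot per host
        time.sleep(period_s)
    return samples

# Hypothetical usage with a stubbed metadata reader and a short interval.
snapshots = sample_until_interval(
    lambda: {"host-a": {"cpu_alloc": 0.4}}, interval_s=0.1, period_s=0.02)
assert len(snapshots) >= 1
```

A monotonic clock is used for the deadline so that wall-clock adjustments on the host cannot stretch or shrink the sampling interval.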
With regards to Claim 26, Saillet in view of Mitra, Miao, Moussaoui and Kochunni teach the method of Claim 1 as referenced above.

Saillet further teaches: dispatching the second number of tasks to one or more new hosts (Fig. 2, paragraph 116, method 500, "So, the system (FIG. 2, 202) identifies which stage can be run next, giving preference to the one on which other stages are dependent. With the stages identified, the system (FIG. 2, 200) determines exactly what are the different stages in the workflow and the system (FIG. 2, 202) also sees what is the input to the workflow… It is then determined if the system, in its current state, can accommodate the workflow. If yes, the scheduler (FIG. 2, 208) schedules it and goes back to pick and analyze another one." The system identifying which stages can be run next, determining if it can accommodate the workflow and scheduling the workflow correlates to dispatching a second number of tasks to the one or more new hosts), based on the task execution efficiency being greater than the first threshold (Paragraphs 69-70, 74, "That is, the system may determine which workflow may be executed at a particular point in time, t0, and if not executable at that point in time without overcommitting a resource, determine when each of the workflows can be executed without overcommitting resources… For example, CPU usage, memory usage, and I/O usage may be defined as "overcommitted" when more than 80% of that resource is used… In one example, associated with each resource is a monitor which detects activity at the computing resource. For example, a CPU monitor may count the CPU operations to execute. That is, each CPU core can execute a certain number of operations per unit of time… When the CPU has capacity for executing operations but has nothing to do (because for instance the managed processes are waiting), it is idle. The percentage of time the CPU is idle may indicate the free capacity of the CPU." The CPU having capacity for executing operations but having nothing to do, executing zero operations in an idle state, correlates to the first threshold. The CPU monitor counting the number of CPU operations being executed in a non-idle state, where the CPU may still have some free capacity but is below the CPU usage overcommitment threshold, correlates to the task execution efficiency being greater than a first threshold) and the host resource utilization remaining constant (Paragraphs 69-70 and 77, "That is, the system may determine which workflow may be executed at a particular point in time, t0, and if not executable at that point in time without overcommitting a resource, determine when each of the workflows can be executed without overcommitting resources. In some examples, determining (block 103) a time of execution of each workflow may be further based on a maximum usage for each computing resource. For example, the system or a user, may set a maximum threshold above which computing resource usage is not to exceed. Such a maximum threshold may define overcommitment of a computing resource... An example where the resource analyzer (204) determines availability per resource and over time allows for greater resource utilization. That is, as described above, rather than making a determination of an availability of all computing resources based on the availability of one particular computing resource, such as a CPU, the present system (202) makes a determination of availability on a per-resource level and over time so that each computing resource is individually analyzed for usage and thus usage of that particular computing resource can be maximized." The system analyzing each computing resource to maximize the usage of a particular computing resource, which would bring the usage to its associated maximum threshold, correlates to the host resource utilization remaining constant).
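Saillet's quoted overcommitment rule, under which a resource is "overcommitted" when more than 80% of it is in use, amounts to a simple per-resource admission check. The sketch below is a hedged illustration: the 0.8 ceiling comes from the quoted passage, while the function name and dict-based interface are assumptions.

```python
OVERCOMMIT_THRESHOLD = 0.8  # "more than 80% of that resource is used"

def can_schedule(current: dict, expected: dict,
                 ceiling: float = OVERCOMMIT_THRESHOLD) -> bool:
    """Return True if adding the workflow's expected usage keeps every
    resource (cpu, memory, io, ...) at or below the overcommitment
    ceiling. Each dict maps resource name -> fractional utilization
    in [0, 1]; the check is applied per resource, as in Saillet's
    per-resource availability analysis.
    """
    return all(current.get(r, 0.0) + expected[r] <= ceiling for r in expected)

# A workflow needing 30% CPU fits on a host at 40% CPU (0.7 <= 0.8)...
assert can_schedule({"cpu": 0.4}, {"cpu": 0.3})
# ...but not on one already at 60% (0.9 > 0.8).
assert not can_schedule({"cpu": 0.6}, {"cpu": 0.3})
```

If the check fails at time t0, the quoted system would instead search forward in time for the earliest point at which no resource is overcommitted.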
Mitra further teaches: wherein a boost indicator for the one or more new hosts indicates that the next boost is valid (Fig. 7, paragraph 95, resource contention component 710, resource over-utilization component 720, scheduling delay component 730, "More specifically, the example of FIG. 7 illustrates calculating a total reinforcement learning based penalty (negative reward) 740 defined by the combination of a resource contention component 710, a resource over-utilization component 720, and a scheduling delay (or wait) component 730." The elements 710, 720, and 730 used to calculate the reward correlate to the boost indicator that indicates whether the next boost is valid).

Therefore, it would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which said subject matter pertains to combine Saillet with "wherein a boost indicator for the one or more new hosts indicates that the next boost is valid" as taught by Mitra, because the iterative reinforcement learning process used for learning an optimum scheduling policy improves scheduling distribution of resource requests initiated by applications on a shared compute infrastructure. Additionally, a customizable reinforcement-learning-based reward or penalty helps to teach a reinforcement learning agent desirable properties of the system and avoid resource interference (Mitra: Fig. 7, paragraphs 63 and 95-96), thereby improving the learning process.

Prior Art Made of Record

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Anghel et al. (U.S. Patent Application Publication No. US 2016/0299785 A1); teaching a method of selecting one or more computing entities to process computational tasks by a scheduling entity based on various factors, which can include the number of instructions, average CPU usage, and average memory usage.
The method further modifies and combines multiple computing entity parameters to improve a machine learning process used by the scheduling entity.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SELINA HU whose telephone number is (571) 272-5428. The examiner can normally be reached Monday-Friday, 8:30-5:30.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Chat Do, can be reached at (571) 272-3721. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center.
Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/SELINA ELISA HU/
Examiner, Art Unit 2193

/Chat C Do/
Supervisory Patent Examiner, Art Unit 2193

Prosecution Timeline

Jul 12, 2022
Application Filed
Mar 19, 2025
Non-Final Rejection — §103, §112
May 09, 2025
Interview Requested
Jun 10, 2025
Examiner Interview Summary
Jun 10, 2025
Applicant Interview (Telephonic)
Jun 13, 2025
Response Filed
Jul 08, 2025
Final Rejection — §103, §112
Aug 04, 2025
Interview Requested
Aug 28, 2025
Applicant Interview (Telephonic)
Sep 05, 2025
Response after Non-Final Action
Oct 07, 2025
Examiner Interview Summary
Oct 13, 2025
Request for Continued Examination
Oct 16, 2025
Response after Non-Final Action
Oct 28, 2025
Non-Final Rejection — §103, §112
Dec 12, 2025
Interview Requested
Jan 16, 2026
Applicant Interview (Telephonic)
Jan 16, 2026
Examiner Interview Summary
Jan 22, 2026
Response Filed
Feb 13, 2026
Final Rejection — §103, §112
Mar 06, 2026
Interview Requested

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12585485
Warm migrations for virtual machines in a cloud computing environment
2y 5m to grant Granted Mar 24, 2026
Patent 12563114
CONTENT INITIALIZATION METHOD, ELECTRONIC DEVICE AND STORAGE MEDIUM
2y 5m to grant Granted Feb 24, 2026


Prosecution Projections

5-6
Expected OA Rounds
67%
Grant Probability
99%
With Interview (+100.0%)
3y 3m
Median Time to Grant
High
PTA Risk
Based on 3 resolved cases by this examiner. Grant probability derived from career allow rate.
