DETAILED ACTION
This Action is responsive to Claims filed 07/29/2025.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of the Claims
Claims 1, 8, 11, and 18 have been amended. Claims 1-20 are pending.
Response to Amendment
The amendments to claims 8 and 18 have overcome the objections to minor informalities.
Response to Arguments
Applicant's arguments, see Pages 10-11, filed 07/29/2025, with respect to the 35 U.S.C. 101 Rejection of claims 1-20 have been fully considered but they are not persuasive.
The Applicant argues that the newly amended limitations point toward a specific improvement to the functioning of a computer or to another technological field, in light of the Specification. The Examiner respectfully disagrees. The newly amended limitations serve only to further refine the data manipulated by the mental-process steps identified as the abstract idea. The improvements cited by the Applicant are a direct result of the determination of an optimum infrastructure allocation and its related steps. Per MPEP 2106.05(a)(II), the asserted improvement must come from a specific structure or additional element, not from the mental-process steps that constitute the abstract idea. See the updated 35 U.S.C. 101 Rejection below.
Applicant’s arguments, see Pages 11-14, filed 07/29/2025, with respect to the 35 U.S.C. 103 Rejection of claims 1-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 101
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more, and because the claims as a whole, considering all claim elements both individually and in combination, do not amount to significantly more than the abstract idea. See Alice Corporation Pty. Ltd. v. CLS Bank International, et al., 573 U.S. 208 (2014). In determining whether the claims are subject matter eligible, the Examiner applies the 2019 USPTO Patent Eligibility Guidelines. (2019 Revised Patent Subject Matter Eligibility Guidance, 84 Fed. Reg. 50, Jan. 7, 2019.)
Step 1:
Claims 1-10 recite a method for running multiple iterations of a computing workload, which falls under the statutory category of a process. Claims 11-20 recite a computer-readable storage medium for running multiple iterations of a computing workload, which falls under the statutory category of a manufacture.
Step 2A - Prong 1:
Claim 1 recites an abstract idea. The limitations “…generate a respective infrastructure allocation for the computing workload, wherein a reward function of the reinforcement learning process generates a respective reward for each initial infrastructure allocation;”, “…generate a total reward for each infrastructure allocation;”, and “identifying an optimum infrastructure allocation for the computing workload based on total rewards from multiple iterations;” recite steps that, under the broadest reasonable interpretation, cover a mental process including an observation, evaluation, judgment, or opinion that could be performed in the human mind or with the aid of pencil and paper. These limitations therefore fall within the mental process grouping.
Generating an infrastructure allocation and a reward function output are steps practically performed within the human mind or with the aid of pen and paper. Generating or summing a total reward is practically performed within the human mind or with the aid of pen and paper. Identifying an optimum infrastructure allocation based on the generated total reward is practically performed within the human mind or with the aid of pen and paper.
Step 2A - Prong 2:
The additional elements of claim 1 do not integrate the abstract idea into a practical application. The claim recites the additional elements “a computing workload”, “computing resources”, and “a service level agreement average”, which are recognized as generic computer components recited at a high level of generality. Although these components hold and execute instructions to perform the abstract idea itself, this does not serve to integrate the abstract idea into a practical application, as it merely amounts to instructions to "apply it." (See MPEP 2106.05(f), indicating mere instructions to apply an abstract idea do not amount to integrating the abstract idea into a practical application.)
The additional elements of “a reinforcement learning process”, “infrastructure allocation”, “a reward function”, “an accumulator map voting process”, and “band of region” are recognized as not being generic computer components; however, they are recited at a high level of generality and are found to generally link the abstract idea to a particular technological environment or field of use.
Furthermore, the limitations “running multiple iterations of a computing workload;”, “running an accumulator map voting process to generate a total reward for each initial infrastructure allocation;”, “assigning computing resources to the computing workload based on the optimum infrastructure allocation, wherein the optimum infrastructure allocation is identified when the service level agreement average is positioned within the band of region encompassing the service level agreement average;”, and “executing the computing workload using the assigned computing resources and, based on a set of all computing resources available at a time of the executing, wherein the assigned computing resources are optimal for execution of the computing workload.” merely amount to instructions to "apply it." (See MPEP 2106.05(f), indicating mere instructions to apply an abstract idea do not amount to integrating the abstract idea into a practical application.)
Step 2B:
The only limitations on the performance of the described method are those reciting “a computing workload”, “computing resources”, and “a service level agreement average”. These elements are insufficient to transform a judicial exception into a patentable invention because the recited elements are considered insignificant extra-solution activity (generic computer system, processing resources, linking the judicial exception to a particular technological environment). The claim thus recites computing components only at a high level of generality, such that it amounts to no more than mere instructions to apply the exception using generic computer components; mere instructions to apply an exception using a generic computer component cannot provide an inventive concept (see MPEP 2106.05(f)).
Additionally, the claimed limitation “running multiple iterations of a computing workload;” is acknowledged to be well-understood, routine, conventional activity (see, e.g., court recognized WURC examples in MPEP 2106.05(d)(II)(ii)).
Furthermore, the limitations of “running multiple iterations of a computing workload;”, “running an accumulator map voting process to generate a total reward for each initial infrastructure allocation;”, “assigning computing resources to the computing workload based on the optimum infrastructure allocation, wherein the optimum infrastructure allocation is identified when the service level agreement average is positioned within the band of region encompassing the service level agreement average;”, and “executing the computing workload using the assigned computing resources and, based on a set of all computing resources available at a time of the executing, wherein the assigned computing resources are optimal for execution of the computing workload.” are found to be mere instructions to apply the abstract idea; mere instructions to apply an exception using a generic computer component cannot provide an inventive concept (see MPEP 2106.05(f)).
Taken alone or in ordered combination, these additional elements do not amount to significantly more than the above-identified abstract idea. There is no indication that the combination of elements improves the functioning of a computer or improves any other technology. Their collective functions merely provide conventional computer implementation.
For the reasons above, claim 1 is rejected as being directed to non-patentable subject matter under § 101. This rejection applies equally to independent claim 11, which recites a computer-readable storage medium having stored therein instructions executable by one or more hardware processors to perform operations; it is noted that claim 11 recites generic computer components (storage medium, hardware processors) at a high level of generality.
Dependent Claims:
The limitations of the dependent claims, but for those addressed below, merely set forth further refinements of the abstract idea without changing the analysis already presented.
Claim 2 (Claim 12) merely recites refinements to the scope of the additional elements of claim 1.
Claim 3 (Claim 13) merely recites refinements of the “reward” and “computing workload” additional elements.
Claim 4 (Claim 14) is found to recite mere instructions to apply the aforementioned “assigned computing resources” additional element. This limitation has been evaluated under Step 2A – Prong 2 and re-evaluated under Step 2B and found to be well-understood, routine, and conventional activity (see, e.g., court-recognized WURC examples in MPEP 2106.05(d)(II)(ii)).
Claim 5 (Claim 15) recites additional elements of claim 1 and further refines the “reward” additional element.
Claim 6 (Claim 16) recites additional elements of claim 1 and further refines the “reward” additional element.
Claim 7 (Claim 17) recites additional elements of claim 1 and further refines the “reward” additional element.
Claim 8 (Claim 18) merely recites refinements to the scope of the additional elements of claim 1.
Claim 9 (Claim 19) recites additional elements of claim 1 as well as further refinements of the “reward function” additional element. The additional element “a reward band” has been analyzed under Step 2A – Prong 2, re-evaluated under Step 2B, and found to generally link the abstract idea of claim 1 to a particular technological environment or field of use.
Claim 10 (Claim 20) merely refines the “reward band” additional element.
Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claim(s) 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Tzortzatos et al. (US 11,663,039 B2), hereinafter Tzortzatos, further in view of Rajeswaran Chockalingapuramravindran (US 2022/0108165 A1), hereinafter Rajeswaran.
In regards to claim 1: The present invention claims: “A method for improving efficiency of computing resource usage, comprising: running multiple iterations of a computing workload;” Tzortzatos teaches a workload management system that repetitively predicts the workload of a system and enacts actions (Column 3 and Fig. 2).
“for each iteration of the computing workload, using a reinforcement learning process to generate a respective infrastructure allocation for the computing workload,” Tzortzatos teaches “After the action has been determined, the method 300 includes enacting the action for the system, as shown in block 306. Enacting the action includes the system allocating system resources for users or workloads that are accessing the system.” (Page 11, Column 6, Lines 38-42). See also Page 11, Column 5, Lines 61-66 for the reinforcement learning process generating a state with an allocation based on a workload.
“wherein a reward function of the reinforcement learning process generates a respective reward for each initial infrastructure allocation, and the respective infrastructure allocation for the computing workload at one iteration is different from the respective infrastructure allocation at another iteration different from the one iteration;” Tzortzatos teaches “After taking an action a, the system environment provides a description of its current states (e.g., a vector covering selected system parameters) and a reward r is provided. The reward r can be a quantification of state changes occurring as a result of the action a (e.g., the evaluation of a system performance metric).” (Page 10, Column 3, Lines 47-52; see also Page 11, Column 6, Lines 38-42 for resource allocation). Tzortzatos also teaches “A non-limiting example of the method includes determining, by a machine learning model, a predicted workload for a system and a current system state of the system, determining an action to be enacted for the system based at least in part on the predicted workload and the current system state, enacting the action for the system, evaluating a state of the system after the action has been enacted, determining a reward for the machine learning model based at least in part on the state of the system after the action has been enacted,” (Column 2, Lines 7-16, mapping to iterations being necessarily different as the state of the system changes).
“assigning computing resources to the computing workload based on the optimum infrastructure allocation…” Tzortzatos teaches that their system assigns the resources based on the optimized reward, throughout column 3 and Figure 2.
“and executing the computing workload using the assigned computing resources and, based on a set of all computing resources available at a time of the executing, wherein the assigned computing resources are optimal for execution of the computing workload.” Tzortzatos, in at least Column 3, teaches how the reward function is optimized in order to execute states or actions to meet SLA expectations or workload demands of the action or state.
“running an accumulator map voting process to generate a total reward for each initial infrastructure allocation based on a service level agreement average and a band of region…” Tzortzatos teaches “Exploitation is done by selecting an action that is beneficial in terms of reward which is often referred to as "greedy". A greedy strategy would be to take the action that is promising the most reward according to the trained model. Another strategy would be to evaluate a serial of actions, accumulate the rewards, and select the path the promises the most reward in the end. This latter strategy includes a whole tree structure of possible actions and to which algorithms can utilized to implement such strategies.” (Page 10, Column 4, Lines 55-63).
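For purposes of illustration only, the "accumulate the rewards, and select the path [that] promises the most reward" strategy described in the Tzortzatos passage above can be sketched as follows; the allocation names and reward values are the editor's hypothetical examples and appear in no cited reference:

```python
# Illustrative sketch only: accumulate per-iteration rewards for each
# candidate infrastructure allocation, then select the allocation whose
# total reward is greatest. Allocation names and values are hypothetical.
def total_rewards(per_iteration_rewards):
    """Sum the reward earned by each allocation across all iterations."""
    totals = {}
    for rewards in per_iteration_rewards:          # one dict per iteration
        for allocation, reward in rewards.items():
            totals[allocation] = totals.get(allocation, 0.0) + reward
    return totals

def best_allocation(per_iteration_rewards):
    """Return the allocation with the highest accumulated reward."""
    totals = total_rewards(per_iteration_rewards)
    return max(totals, key=totals.get)

# Example: three iterations scoring two candidate allocations.
history = [
    {"2-cpu": 0.5, "4-cpu": 1.0},
    {"2-cpu": -0.5, "4-cpu": 1.0},
    {"2-cpu": 1.0, "4-cpu": 0.0},
]
# best_allocation(history) returns "4-cpu" (total reward 2.0 vs. 1.0)
```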
While Tzortzatos teaches “…accumulate the rewards, and select the path the promises the most reward in the end.” (Page 10, Column 4, total reward) and “Service level management 84 provides cloud computing resource allocation and management such that required service levels are met. Service Level Agreement (SLA) planning and fulfillment 85 provide pre-arrangement for, and procurement of, cloud computing resources for which a future requirement is anticipated in accordance with an SLA.” (Page 13, Column 9, service level agreement average), Tzortzatos fails to explicitly teach “and a band of region;” however, Rajeswaran teaches the optimum reward following a plotted curve in Fig. 5. The slope at any given point on the curve is the reward per [0059]. Rajeswaran further teaches resource reallocation may occur dynamically ([0058]). Further, paragraphs [0059] and [0060] teach that as the slope of the reward curve flattens, then the allocation of resources may cap. It would be an obvious result of a combination of Tzortzatos and Rajeswaran that such resource allocation would dynamically allocate, reallocate, or deallocate resources to keep a reward at least minimally meeting the parameters of an SLA without overallocating the resources (band of region).
Rajeswaran teaches the need to optimize a reward structure for deep neural networks to improve their efficiency in parallelized environments ([0018]-[0021]). It would have been obvious to one of ordinary skill in the art at the time of the application’s filing to combine the workload management system of Tzortzatos, containing a reinforcement learning model, with the optimized reward structure for deep neural networks of Rajeswaran to achieve greater efficiency within the system.
“identifying an optimum infrastructure allocation for computing workload based on total rewards from multiple iterations;…wherein the optimum infrastructure allocation is identified when the service level agreement average is positioned within the band of region encompassing the service level agreement average” Because Tzortzatos reads on the use of an SLA to allocate computing resources, and Rajeswaran reads on a reward band, i.e., keeping a reward function between an underallocated minimum and an overallocated maximum, it would be an obvious result of their combination that the optimum infrastructure allocation is the one identified when the band of region falls within the parameters of the SLA (“identifying an optimum infrastructure allocation…” and “when the service level agreement average is positioned within the band of region encompassing the service level agreement average.”).
While the combination of Tzortzatos and Rajeswaran reasonably reads on the generic recitation of keeping a reward in a “band of region” for resource allocation, the combination does not explicitly teach the values recited in:
“…wherein the total reward for each allocation is zero when an execution time is less than the band of region, a positive value when the execution time is within the band of region, and a negative value when the execution time is greater than the band of region;” However, Dominguez, in a similar field of endeavor of reinforcement learning, teaches “Overall, tuning your reward function parameters is an important part of optimizing your AWS DeepRacer’s performance. By defining your goals clearly, experimenting with different reward values, and using negative rewards to discourage undesirable behavior, you can fine-tune your reward function to help your DeepRacer achieve the best possible results on the track.” (mapping to the use of positive, negative, and zero reward function outputs being used in reinforcement learning).
A cursory search indicates the use of positive, negative, and zero reward values to encourage or discourage actions, as Dominguez indicates, would have been known in the art at the time of the Applicant’s filing. As indicated by Dominguez, the tuning of a reward function to include such values or intricacies can improve performance in reinforcement learning models. It would have been obvious to one of ordinary skill in the art at the time of the Applicant’s filing for a reward function allocating computing resources based on an SLA to output a discouraging value for slowness, a net zero value for exceeding computing resources (also indicated by the plateau Rajeswaran recites), and a positive value for maintaining a balance between execution time and the SLA.
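Purely as an illustrative sketch (the band endpoints and reward magnitudes below are hypothetical and appear in no cited reference or claim), the banded reward discussed above, zero when execution finishes faster than the band of region, positive within it, and negative when slower, could take the following form:

```python
# Hypothetical sketch of a banded total-reward function: execution faster
# than the band means resources were over-allocated (zero reward), within
# the band is the desired allocation (positive reward), and slower than
# the band means the allocation was inadequate (negative reward).
def band_reward(execution_time, band_low, band_high):
    if execution_time < band_low:
        return 0.0    # faster than the band: wasted resources
    if execution_time <= band_high:
        return 1.0    # within the band: desired allocation
    return -1.0       # slower than the band: inadequate allocation

# band_reward(1.0, 2.0, 4.0) returns 0.0
# band_reward(3.0, 2.0, 4.0) returns 1.0
# band_reward(5.0, 2.0, 4.0) returns -1.0
```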
In regards to claim 2: The present invention claims: “wherein the assigned computing resources overlap, but are not the same as, a set of computing resources that make up the optimum infrastructure allocation.” Tzortzatos teaches “In some embodiments of the invention, the reward model could also be a separately designed, trained, and deployed machine learning model. That is to say, how to compute the reward can be externalized as a user specific issue that every user might want to define/solve. In that case, the reward is calculated based on the state of the system responsive to an action taken in the historical SMF data. Exploration is usually done to a fixed percentage (e.g., 10% of the time (or every 10th step) an action is taken at random).” (Page 10, Column 4, Lines 46-55). Rajeswaran also teaches resource allocation following the curve shown in Figure 5 and [0059], with various tasks experiencing processing speedups and plateauing as resources are assigned to them. It would be obvious to one of ordinary skill in the art that one may slightly over-allocate resources in meeting an SLA or workload demand, or in sharing processors most efficiently.
In regards to claim 3: The present invention claims: “wherein one of the rewards has a value that indicates a relation between an execution time of the computing workload and an execution time specified by a service level agreement, and the execution time of the computing workload is the time taken for execution of the computing workload by the initial infrastructure allocation to which the reward value corresponds.” Tzortzatos teaches “and a reward r is provided. The reward r can be a quantification of state changes occurring as a result of the action a (e.g., the evaluation of a system performance metric).” (Page 10, Column 3, Lines 49-52). See also Lines 49-51 of Column 4 for “That is to say, how to compute the reward can be externalized as a user specific issue that every user might want to define/solve.” See Page 13, Column 9, Line 25 for reference to a service level agreement (SLA).
In regards to claim 4: The present invention claims: “wherein use of the assigned computing resources in the execution of the computing workload minimizes computing resource wastage.” Tzortzatos teaches “The calculated reward 208 can be determined based on these customer goals where the evaluated state of the system 206 can be evaluated against these goals. The reward would be indicative on whether the predicted workload 212 and the action 204 resulted in a system state that reflects these customer defined goals.” (Page 11, Column 5, Lines 53-58) Both Tzortzatos and Rajeswaran indicate using a reward function to optimally allocate processing resources to various workloads (Tzortzatos Column 3, Rajeswaran [0022], mapping such optimization to either the maximization of computing resource or minimization of similar waste).
In regards to claim 5: The present invention claims: “a reward value of zero for an initial infrastructure allocation indicates that computing resources included in that initial infrastructure allocation exceed the computing resources needed to execute the computing workload in a manner that meets requirements of a service level agreement.” While Tzortzatos does teach a reinforcement learning workload management system with rewards, it fails to explicitly teach the aforementioned limitation. However, Rajeswaran teaches “As indicated in Equation 2, the reward value for each of the tasks may be calculated by determining the slope of the corresponding curves (e.g., 502, 504, and 506).” ([0059], mapping the calculation of slope on a reward graph to reward values and indication of their meeting requirements.) See Fig. 5 for reward curves. A 0 slope indicates over-allocation of processors (exceed the computing resources needed) ([0059]).
In regards to claim 6: The present invention claims: “a negative reward value for an initial infrastructure allocation indicates that computing resources included in that initial infrastructure allocation are inadequate to execute the computing workload in a manner that meets requirements of a service level agreement.” See above for how the slope of the reward graph in Fig. 5 of Rajeswaran above shows the reward of a given allocation. (mapping a negative slope on this graph indicating a slowdown, therefore a slowdown in performance)
In regards to claim 7: The present invention claims: “a reward value for an initial infrastructure allocation is at a maximum at a point between a zero reward value and a negative reward value.” See above, the initial slope of the graph would be 0 (too much allocation), or negative (too little allocation), before the curve could form a positive slope.
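As an illustration of the slope-as-reward reading of Rajeswaran's Fig. 5 discussed in the rejections of claims 5-7 above, the speedup values below are invented by the editor: the local slope of a hypothetical speedup curve is positive while adding processors still helps, and flattens to zero once allocation is excessive:

```python
# Hypothetical speedup curve (invented values): total speedup observed
# with 1 through 6 processors assigned to a workload.
speedup = [1.0, 1.8, 2.4, 2.7, 2.8, 2.8]

def slope_reward(curve, n_procs):
    """Reward for the step from n_procs to n_procs + 1 processors,
    taken as the local slope of the speedup curve."""
    return curve[n_procs] - curve[n_procs - 1]

# Early steps yield a positive slope (under-allocated: more processors help);
# slope_reward(speedup, 5) yields 0.0 (plateau: further processors are wasted).
```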
In regards to claim 8: The present invention claims: “the assigned computing resources are more than are needed for the execution of the computing workload.” Tzortzatos teaches “In some embodiments of the invention, the reward model could also be a separately designed, trained, and deployed machine learning model. That is to say, how to compute the reward can be externalized as a user specific issue that every user might want to define/solve. In that case, the reward is calculated based on the state of the system responsive to an action taken in the historical SMF data. Exploration is usually done to a fixed percentage (e.g., 10% of the time (or every 10th step) an action is taken at random).” (Page 10, Column 4, Lines 46-55). Rajeswaran also teaches resource allocation following the curve shown in Figure 5 and [0059], with various tasks experiencing processing speedups and plateauing as resources are assigned to them. It would be obvious to one of ordinary skill in the art that one may slightly over-allocate resources in meeting an SLA or workload demand, or in sharing processors most efficiently.
In regards to claim 9: The present invention claims: “a plot of the reward function comprises a reward band that includes a range of reward values, and each of the reward values in the reward band corresponds to an initial infrastructure allocation that is capable of executing the computing workload according to a requirement specified in a service level agreement.” See Rajeswaran Fig. 5. The slope at any given point on the curve is the reward per [0059].
In regards to claim 10: The present invention claims: “the reward band includes a positive reward value, a maximum reward value, and a negative reward value.” See the rejection of claim 9 above, the slope of a curve may be positive, zero, or negative.
In regards to claim 11: Claim 11 recites similar limitations to claim 1, with the exception of “A computer readable storage medium having stored therein instructions that are executable by one or more hardware processors to perform operations for improving efficiency of computing resource usage, the operations comprising:” See Tzortzatos, Page 15, Column 13, Line 12. Therefore, claim 11 is similarly rejected.
In regards to claims 12-20: Claims 12-20 recite similar limitations to claims 2-10, with the exception of the computer-readable storage medium of claim 11. Therefore, claims 12-20 are similarly rejected.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GRIFFIN T BEAN whose telephone number is (703)756-1473. The examiner can normally be reached M - F 7:30 - 4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li Zhen, can be reached at (571) 272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/GRIFFIN TANNER BEAN/Examiner, Art Unit 2121
/Li B. Zhen/Supervisory Patent Examiner, Art Unit 2121