DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Interpretation
Claims 9, 16 and 20 recite the limitations “one or more of grid carbon intensity, weather, and time” and “one or more of allocated workload, unallocated workload, temperature of the data center, energy consumption by the data center, and energy stored by the data center”. The plain meaning of the phrase “one or more of A and B” is “one or more of A and one or more of B” (for more details, please see Ex parte Jung, Appeal No. 2016-008290 (PTAB Mar. 22, 2017) and/or SuperGuide Corp. v. DirecTV Enters., Inc., 358 F.3d 870 (Fed. Cir. 2004)). If Applicant intends to claim one or more of the listed elements in the alternative, the limitations can be changed to “one or more of grid carbon intensity, weather, [[and]] or time” and “one or more of allocated workload, unallocated workload, temperature of the data center, energy consumption by the data center, [[and]] or energy stored by the data center”.
Claim Objections
In claim 17, the word “and” after “aggregate the rewards” should be deleted because it is redundant.
In claim 19, the word “assign” should be amended to “assigning” to correct the grammatical error.
Appropriate correction is required.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 17-20 are rejected under 35 U.S.C. 101 because the claimed invention is not directed to any of the statutory categories of subject matter.
Claims 17-20 recite a system comprising a digital twin and a plurality of reinforcement learning agents. Under their broadest reasonable interpretation, the claimed “digital twin” and “reinforcement learning agents” are software per se, which does not fall within any of the statutory categories of subject matter. Therefore, claims 17-20 are rejected under 35 U.S.C. 101. For details, please refer to MPEP 2106.03(I).
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claim 17 is rejected under 35 U.S.C. 102(a)(1) as being anticipated by SU (CN 112966431 A, hereinafter “SU”).
Regarding claim 17, SU teaches:
A system, comprising:
a digital twin (FIG. 4 and [0079] and [0101]: task scheduling reinforcement learning model and temperature control reinforcement learning model are built to mimic the operations of the system as a digital twin) of a data center ([0001]), the digital twin comprising a plurality of subsystem models corresponding to a plurality of subsystems of the data center ([0079] and [0101]: scheduling subsystem model and cooling subsystem model) and configured to simulate states for each subsystem model ([0081] and [0102]) and determine rewards for each subsystem model ([0089] and [0106]); and
a plurality of reinforcement learning agents ([0077] and [0078]: “scheduling_agent” and “cooling_agent”) interfaced with the digital twin and configured to:
receive the simulated states and the rewards from a digital twin ([0081] and [0102], [0089] and [0106]),
aggregate the rewards ([0112] and [0129]),
determine actions for each subsystem model based on the aggregated rewards ([0129]), and
generate control signals for each subsystem according to the determined actions, wherein the data center is configured based on the control signals ([0136]).
SU teaches specifically (underlining added by the Examiner for emphasis):
[media_image1.png: greyscale image, 578 × 660, reproduced from SU by the Examiner]
[0001] The invention belongs to the technical field of data center energy consumption management, and specifically relates to a data center energy consumption joint optimization method, system, medium and equipment.
[0077] S1, build a data center multi-agent environment.
[0078] It is assumed that there are precision air conditioners, several servers, and several tasks waiting to be executed in the data center environment. Assuming that all servers belong to the same cluster, a task scheduling agent scheduling_agent is responsible for the assignment of tasks to machines in the cluster, and the temperature control agent cooling_agent in the precision air conditioner is responsible for adjusting the temperature to cool and heat the server.
[0079] S2, establish a task scheduling reinforcement learning model.
[0081] S201. Establish a state space of scheduling_agent.
[0089] S203, design the reward function of scheduling_agent.
[0101] S3, build a temperature control reinforcement learning model.
[0102] S301. Establish a state space for cooling_agent.
[0106] S303, design the reward function of cooling_agent.
[0112] S4. Training a joint control model based on reinforcement learning of heterogeneous multi-agents, as shown in the training part of Figure 4.
[0129] S5. Use the trained energy consumption joint optimization model to realize joint optimization of scheduling_agent and cooling_agent with the goal of minimizing overall energy consumption in a dynamic data center environment.
[0136] The optimization module uses the scheduling_agent and cooling_agent trained by the joint control model to execute the action strategy aimed at reducing its own energy consumption based on their respective observation information, while ensuring the balance of the dynamic data center environment and minimizing the overall energy consumption.
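For illustration, a minimal sketch of the two-agent arrangement SU describes in [0077]-[0078] and FIG. 4: a digital-twin environment simulates a state and a reward per subsystem model, and one reinforcement learning agent per subsystem selects an action. All names, dynamics, and reward shapes below are hypothetical and do not appear in SU.

    import random

    class DataCenterTwin:
        """Hypothetical digital twin: simulates a state and a reward per
        subsystem model (cf. SU [0081], [0089], [0102], [0106])."""

        def __init__(self):
            self.temp = 24.0   # simulated server inlet temperature (deg C)
            self.queue = 10    # simulated number of tasks waiting to be scheduled

        def step(self, actions):
            # actions: {"scheduling": tasks to assign, "cooling": temperature delta}
            self.queue = max(0, self.queue - actions["scheduling"] + random.randint(0, 2))
            self.temp += 0.1 * actions["scheduling"] + actions["cooling"]
            states = {"scheduling": self.queue, "cooling": self.temp}
            rewards = {
                "scheduling": -float(self.queue),    # fewer waiting tasks is better
                "cooling": -abs(self.temp - 24.0),   # stay near the temperature setpoint
            }
            return states, rewards

    class Agent:
        """Stand-in for scheduling_agent / cooling_agent (SU [0078])."""

        def __init__(self, name, action_space):
            self.name, self.action_space = name, action_space

        def act(self, state, reward):
            return random.choice(self.action_space)  # a trained policy would go here

    twin = DataCenterTwin()
    agents = {"scheduling": Agent("scheduling_agent", [0, 1, 2]),
              "cooling": Agent("cooling_agent", [-1.0, 0.0, 1.0])}
    actions = {"scheduling": 1, "cooling": 0.0}
    for _ in range(3):                               # a few simulated control steps
        states, rewards = twin.step(actions)
        actions = {name: agent.act(states[name], rewards[name])
                   for name, agent in agents.items()}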
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 2 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over SU in view of TRAUT (US 20210334696 A1, hereinafter as “TRAUT”).
Regarding claim 1, SU teaches:
A method, comprising:
obtaining state data representative of states of the system ([0081-0082] and [0102]: get state data for task scheduling agent “scheduling_agent” and temperature control agent “cooling_agent”), the system comprising a plurality of subsystems (FIG. 4 and [0078]: subsystems “scheduling_agent” and “cooling_agent”);
receiving, by a plurality of reinforcement learning agents ([0077] and [0078]: “scheduling_agent” and “cooling_agent”), reward data ([0089] and [0106]: reward functions are received by “scheduling_agent” and “cooling_agent”) from a digital twin ([0079] and [0101]: task scheduling reinforcement learning model and temperature control reinforcement learning model are built to mimic the operations of the system as a digital twin) of the system that simulates operations of the system, the reward data comprising a plurality of rewards each associated with a subsystem of the plurality of subsystems (the reward function in [0096] is associated with “scheduling_agent”, and the reward function in [0108] is associated with “cooling_agent”);
determining a plurality of actions, by the plurality of reinforcement learning agents, based on the state data and each of the plurality of rewards ([0112] and [0129]: “S5. Use the trained energy consumption joint optimization model to realize joint optimization of scheduling_agent and cooling_agent with the goal of minimizing overall energy consumption in a dynamic data center environment”), wherein each reinforcement learning agent of the plurality of reinforcement learning agents is associated with a subsystem of the plurality of subsystems ([0078]: “scheduling_agent” is associated with assigning tasks, and “cooling_agent” is associated with adjusting the temperature of the server); and
transition the system to updated states according to the plurality of actions ([0136]: “The optimization module uses the scheduling_agent and cooling_agent trained by the joint control model to execute the action strategy aimed at reducing its own energy consumption based on their respective observation information, while ensuring the balance of the dynamic data center environment and minimizing the overall energy consumption”).
SU teaches all the limitations except that each reinforcement learning agent assigns a weight to a reward of the plurality of rewards corresponding to an associated subsystem of the plurality of subsystems that is greater than weights assigned to rewards corresponding to other subsystems of the plurality of subsystems.
However, TRAUT teaches in an analogous art:
assigns a weight to a reward of the plurality of rewards that is greater than weights assigned to other rewards ([0037]: “computer-translating the plurality of sub-goals into the shaped reward function may include automatically computer translating each sub-goal into a sub-goal specific reward function, and computer-composing the resulting plurality of sub-goal specific reward functions into the shaped reward function. As a non-limiting example, computer-composing the plurality of sub-goal specific reward functions into the shaped reward function may include evaluating each sub-goal specific reward function and defining the shaped reward function output as a sum of the outputs of the sub-goal specific reward functions. Alternately or additionally to an approach based on summing the sub-goal specific reward functions, any other suitable approach to composing reward functions into a shaped reward function may be employed. As a non-limiting example, instead of directly summing sub-goal specific reward functions, sub-goal specific reward functions may be summed in a weighted combination, wherein each sub-goal has a different weight”. This teaches that each reward function is assigned a different weight for the combination, so the weight assigned to one reward function is greater than the weights assigned to the other reward functions).
Since SU teaches each RL agent corresponds to a subsystem associated with a reward, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified SU based on the teaching of TRAUT, to make the method wherein each reinforcement learning agent of the plurality of reinforcement learning agents assigns a weight to a reward of the plurality of rewards corresponding to an associated subsystem of the plurality of subsystems that is greater than weights assigned to rewards corresponding to other subsystems of the plurality of subsystems. One of ordinary skill in the art would have been motivated to make this modification in order to “achieve suitable performance, and/or improve final performance of a trained reinforcement machine learning computer system”, as TRAUT teaches in [0017].
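For illustration, a minimal sketch of the weighted reward combination TRAUT describes in [0037], applied as in the rejection above: each agent sums the per-subsystem rewards with a larger weight on the reward of its own associated subsystem. The function name, weights, and reward values are hypothetical.

    def aggregate_rewards(rewards, own_subsystem, own_weight=0.7):
        """Weighted sum of per-subsystem rewards, with the agent's own
        subsystem's reward weighted more heavily (cf. TRAUT [0037])."""
        others = [k for k in rewards if k != own_subsystem]
        other_weight = (1.0 - own_weight) / max(len(others), 1)
        return own_weight * rewards[own_subsystem] + sum(
            other_weight * rewards[k] for k in others)

    rewards = {"scheduling": -4.0, "cooling": -1.5}
    r_sched = aggregate_rewards(rewards, "scheduling")  # 0.7*(-4.0) + 0.3*(-1.5) = -3.25
    r_cool = aggregate_rewards(rewards, "cooling")      # 0.7*(-1.5) + 0.3*(-4.0) = -2.25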
Regarding claim 2, SU-TRAUT teaches all the limitations of the base claim from which claim 2 depends.
SU further teaches:
the system is a data center ([0001]).
Claim 10 recites a system performing the operational steps of the method of claim 1 with patentably the same limitations. Therefore, claim 10 is rejected for the same reasons recited in the rejection of claim 1.
Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over SU in view of WENZEL (US 20220390137 A1, hereinafter as “WENZEL”).
Regarding claim 18, SU teaches all the limitations of the base claim from which claim 18 depends.
SU further teaches:
the plurality of subsystems comprises a cooling subsystem ([0103]: cooling_agent).
SU teaches all the limitations except a load shifting subsystem and an energy storage subsystem.
However, WENZEL teaches in an analogous art:
the plurality of subsystems comprises a load shifting subsystem, and an energy storage subsystem (FIG. 5 and [0097]: “economic controller 510 can use battery unit 302 to perform load shifting by drawing electricity from energy grid 414 when energy prices are low and/or when the power consumed by powered CEF components 402 is low. The electricity can be stored in battery unit 302 and discharged later when energy prices are high and/or the power consumption of powered CEF components 402 is high”. This teaches a system comprising a load shifting subsystem and an energy storage subsystem).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified SU based on the teaching of WENZEL, to make the system wherein the plurality of subsystems also comprises a load shifting subsystem and an energy storage subsystem. One of ordinary skill in the art would have been motivated to do this modification in order to “reduce the cost of electricity consumed … and can smooth momentary spikes in the electric demand”, as WENZEL teaches in [0097].
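For illustration, a minimal sketch of the price-driven load shifting and storage behavior WENZEL describes in [0097]: charge the battery when grid prices are low and discharge when they are high. The function name, thresholds, power levels, and prices below are hypothetical.

    def battery_action(price_per_kwh, low=0.08, high=0.15):
        """Return a charge (+) or discharge (-) decision in kW for one
        interval (cf. WENZEL [0097]); thresholds are hypothetical."""
        if price_per_kwh <= low:
            return +50.0   # cheap grid energy: store it (load shifting)
        if price_per_kwh >= high:
            return -50.0   # expensive grid energy: serve load from storage instead
        return 0.0         # otherwise hold the current charge

    hourly_prices = [0.06, 0.07, 0.12, 0.18, 0.20, 0.09]
    schedule = [battery_action(p) for p in hourly_prices]
    # schedule == [50.0, 50.0, 0.0, -50.0, -50.0, 0.0]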
Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over SU in view of TRAUT.
Regarding claim 19, SU teaches all the limitations of the base claim from which claim 19 depends.
SU further teaches:
each reinforcement learning agent of the plurality of reinforcement learning agents is associated with a subsystem model of the plurality of subsystem models ([0078]: “scheduling_agent” is associated with assigning tasks, and “cooling_agent” is associated with adjusting the temperature of the server).
SU teaches all the limitations except that aggregating the rewards comprises, by each reinforcement learning agent of the plurality of reinforcement learning agents, assigning a weight to a reward of the rewards determined by an associated subsystem model that is greater than weights assigned to rewards determined by the other subsystem models.
However, TRAUT teaches in an analogous art:
assigning a weight to a reward of the rewards that is greater than weights assigned to other rewards ([0037]: “computer-translating the plurality of sub-goals into the shaped reward function may include automatically computer translating each sub-goal into a sub-goal specific reward function, and computer-composing the resulting plurality of sub-goal specific reward functions into the shaped reward function. As a non-limiting example, computer-composing the plurality of sub-goal specific reward functions into the shaped reward function may include evaluating each sub-goal specific reward function and defining the shaped reward function output as a sum of the outputs of the sub-goal specific reward functions. Alternately or additionally to an approach based on summing the sub-goal specific reward functions, any other suitable approach to composing reward functions into a shaped reward function may be employed. As a non-limiting example, instead of directly summing sub-goal specific reward functions, sub-goal specific reward functions may be summed in a weighted combination, wherein each sub-goal has a different weight”. This teaches that each reward function is assigned a different weight for the combination, so the weight assigned to one reward function is greater than the weights assigned to the other reward functions).
Since SU teaches each RL agent corresponds to a subsystem associated with a reward, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified SU based on the teaching of TRAUT, to make the system wherein aggregating the rewards comprises, by each reinforcement learning agent of the plurality of reinforcement learning agents, assigning a weight to a reward of the rewards determined by an associated subsystem model that is greater than weights assigned to rewards determined by the other subsystem models. One of ordinary skill in the art would have been motivated to make this modification in order to “achieve suitable performance, and/or improve final performance of a trained reinforcement machine learning computer system”, as TRAUT teaches in [0017].
Claims 3, 5-8 and 11-15 are rejected under 35 U.S.C. 103 as being unpatentable over SU in view of TRAUT, and in further view of WENZEL.
Regarding claim 3, SU-TRAUT teaches all the limitations of the base claim from which claim 3 depends.
SU further teaches:
the plurality of subsystems comprises a cooling subsystem ([0103]: cooling_agent).
SU-TRAUT teaches all the limitations except a load shifting subsystem and an energy storage subsystem.
However, WENZEL teaches in an analogous art:
the plurality of subsystems comprises a load shifting subsystem, and an energy storage subsystem (FIG. 5 and [0097]: “economic controller 510 can use battery unit 302 to perform load shifting by drawing electricity from energy grid 414 when energy prices are low and/or when the power consumed by powered CEF components 402 is low. The electricity can be stored in battery unit 302 and discharged later when energy prices are high and/or the power consumption of powered CEF components 402 is high”. This teaches a system comprising a load shifting subsystem and an energy storage subsystem).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified SU-TRAUT based on the teaching of WENZEL, to make the method wherein the plurality of subsystems also comprises a load shifting subsystem and an energy storage subsystem. One of ordinary skill in the art would have been motivated to do this modification in order to “reduce the cost of electricity consumed … and can smooth momentary spikes in the electric demand”, as WENZEL teaches in [0097].
Regarding claim 5, SU-TRAUT-WENZEL teaches all the limitations of the base claim from which claim 5 depends.
SU further teaches:
an action determined by a reinforcement learning agent associated with the cooling subsystem is dependent on an action determined by a reinforcement learning agent associated with the load shifting subsystem and an action determined by a reinforcement learning agent associated with the energy storage subsystem ([0112]-[0129]: “S5. Use the trained energy consumption joint optimization model to realize joint optimization of scheduling_agent and cooling_agent with the goal of minimizing overall energy consumption in a dynamic data center environment”. SU teaches that the multiple RL agents are interdependent; therefore, an action determined by the RL agent associated with the cooling subsystem depends on actions determined by the RL agents associated with the load shifting and energy storage subsystems).
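For illustration, a minimal sketch of the interdependence relied on above: under joint optimization (SU [0112]-[0129]), each agent's action can depend on the other agents' latest actions, here by folding them into the cooling agent's observation. The function name, coefficients, and values are hypothetical.

    def cooling_action(temp, allocated_kw, charge_kw, setpoint=24.0):
        """Cooling adjustment that depends on the load shifting and energy storage
        agents' latest actions: more allocated load or charging means more heat to offset."""
        heat_load = 0.001 * allocated_kw + 0.0005 * max(charge_kw, 0.0)
        return -(temp - setpoint) - heat_load if temp > setpoint else 0.0

    delta_t = cooling_action(temp=25.2, allocated_kw=400.0, charge_kw=50.0)
    # delta_t < 0: cool harder because the other agents added load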
Regarding claim 6, SU-TRAUT-WENZEL teaches all the limitations of the base claim from which claim 6 depends.
SU further teaches:
an action determined by a reinforcement learning agent associated with the cooling subsystem is a cooling setpoint for the system ([0105]: “action is expressed as ΔT, which represents the range of temperature adjustment”).
WENZEL further teaches:
an action determined by a reinforcement learning agent associated with the load shifting subsystem is an allocation of workload ([0097]: “economic controller 510 can use battery unit 302 to perform load shifting by drawing electricity from energy grid 414 when energy prices are low and/or when the power consumed by powered CEF components 402 is low”), and wherein an action determined by a reinforcement learning agent associated with the energy storage subsystem is a charge or discharge amount ([0097]: “The electricity can be stored in battery unit 302 and discharged later when energy prices are high and/or the power consumption of powered CEF components 402 is high”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have further modified SU-TRAUT based on the teaching of WENZEL, to make the method wherein an action determined by a reinforcement learning agent associated with the load shifting subsystem is an allocation of workload, and wherein an action determined by a reinforcement learning agent associated with the energy storage subsystem is a charge or discharge amount. One of ordinary skill in the art would have been motivated to do this modification in order to “reduce the cost of electricity consumed … and can smooth momentary spikes in the electric demand”, as WENZEL teaches in [0097].
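For illustration, a minimal sketch of the composite action space implied by the combination: SU's cooling agent emits a temperature adjustment ΔT ([0105]), while WENZEL-style load shifting and energy storage agents emit a workload allocation and a charge/discharge amount ([0097]). The dataclass and field names are hypothetical.

    from dataclasses import dataclass, field

    @dataclass
    class JointAction:
        delta_t: float                                  # cooling setpoint adjustment, deg C (SU [0105])
        allocation: dict = field(default_factory=dict)  # server -> tasks assigned (load shifting)
        charge_kw: float = 0.0                          # >0 charge, <0 discharge (WENZEL [0097])

    action = JointAction(delta_t=-0.5,
                         allocation={"server_1": 3, "server_2": 1},
                         charge_kw=25.0)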
Regarding claim 7, SU-TRAUT-WENZEL teaches all the limitations of the base claim from which claim 7 depends.
SU further teaches:
an action determined by a reinforcement learning agent associated with the load shifting subsystem is dependent on an action determined by a reinforcement learning agent associated with the cooling subsystem and an action determined by a reinforcement learning agent associated with the energy storage subsystem ([0112]-[0129]: “S5. Use the trained energy consumption joint optimization model to realize joint optimization of scheduling_agent and cooling_agent with the goal of minimizing overall energy consumption in a dynamic data center environment”. SU teaches that the multiple RL agents are interdependent; therefore, an action determined by the RL agent associated with the load shifting subsystem depends on actions determined by the RL agents associated with the cooling and energy storage subsystems).
Regarding claim 8, SU-TRAUT-WENZEL teaches all the limitations of the base claim from which claim 8 depends.
SU further teaches:
an action determined by a reinforcement learning agent associated with the energy storage subsystem is dependent on an action determined by a reinforcement learning agent associated with the cooling subsystem and an action determined by a reinforcement learning agent associated with the load shifting subsystem ([0112]-[0129]: “S5. Use the trained energy consumption joint optimization model to realize joint optimization of scheduling_agent and cooling_agent with the goal of minimizing overall energy consumption in a dynamic data center environment”. SU teaches that the multiple RL agents are interdependent; therefore, an action determined by the RL agent associated with the energy storage subsystem depends on actions determined by the RL agents associated with the cooling and load shifting subsystems).
Claim 11 recites a system performing the operational steps of the methods of claims 2 and 3 with patentably the same limitations. Therefore, claim 11 is rejected for the same reasons recited in the rejections of claims 2 and 3.
Claims 12, 13, 14 and 15 recite a system performing the operational steps of the methods of claims 5, 6, 7 and 8, respectively, with patentably the same limitations. Therefore, claims 12, 13, 14 and 15 are rejected for the same reasons recited in the rejections of claims 5, 6, 7 and 8, respectively.
Allowable Subject Matter
Claims 4, 9, 16 and 20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHARLES CAI whose telephone number is (571)272-7192. The examiner can normally be reached on M-F 8-5 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamini Shah can be reached on 571-272-2279. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CHARLES CAI/Primary Patent Examiner, Art Unit 2115