DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office Action is in response to claims filed 12/16/2025.
Claims 1-20 are pending.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 12/16/2025 has been entered.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 2, 6-13, 15-18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Malaya (US 2022/0318056 A1) in view of Karnik (US 2003/0176982 A1).
Regarding Claim 1, Malaya teaches:
A device, comprising: one or more circuits, including hardware, to dynamically adjust a load profile of one or more processing devices processing a workload in a bulk-synchronous mode, “Full-scale workloads, such as workloads used in machine learning training applications, sometimes include periods of heavy power loads on devices followed by periods where the same devices are idle. For example, parallel computing workloads often include periods of synchronized high-powered computation (on the order of seconds) and low-powered communication of computed results (on the order of seconds)” [Malaya ¶ 7]. “In various embodiments, each of computing devices 102-106 includes one or more processors such as a parallel processor (e.g., vector processors, graphics processing units (GPUs), general-purpose GPUs (GPGPUs), non-scalar processors, highly-parallel processors, artificial intelligence (AI) processors, inference engines, machine learning processors, other multithreaded processing units, and the like)” [Malaya ¶ 10]. Examiner notes that ¶ 29 of the specification discusses examples of bulk-synchronous workloads, which include workloads for artificial intelligence.
wherein, in response to power consumed by the one or more processing devices dropping below a first power threshold, “In this manner, the GPU 202 hardware autonomously starts running power burn workloads when the hardware itself detects changes in power usage falling below the predetermined power floor threshold (first power threshold)” [Malaya ¶ 25].
the one or more circuits are to dynamically adjust the load profile by controlling an on-die current (burn) sink circuit to: draw current to raise the power consumed by the one or more processing devices to the first power threshold; “For example and with respect to FIG. 2, in various embodiments, the command processor 204 is capable of launching multiple different kernels of target power workload (e.g., designed to burn 100 W, 200 W, 300 W, 400 W, and 500 W of power) on the design of the GPU 202 and for targeting particular target power usage levels. In various embodiments, the SMU 210 instructs the GPU 202 to operate at that threshold power floor level until the command processor 204 is instructed to begin processing productive workloads” [Malaya ¶ 30]. “At block 410, based on receiving the power dip condition signal from the command processor 204 submits work to target a certain power load, whether that is at the previous level or a floor level” [Malaya ¶ 38]. “In some embodiments, the system driver 212 defines and communicates to the SMU 210 an amount of time that the SMU 210 needs to ensure that the power floor is maintained subsequent to a power dip condition” [Malaya ¶ 23]. “The method of claim 2, wherein assigning one or more target power workloads includes the workload scheduler assigning one or more target power workloads to raise the power draw by the processor device to meet or exceed the target power draw” [Malaya Claim 5].
and gradually reduce the drawn current to cause a corresponding reduction in the power consumed by the one or more processing devices to gradually drop from the first power threshold. “Additionally, it is not necessary to indefinitely perform target power workloads for the purposes of power burn. In some embodiments, the command processor 204 and the SMU 210 gradually decrease power usage by running a first power burn kernel of a higher power usage (e.g., a target power workload designed to power burn at 500 W) for a first period of time, switching to a second power burn kernel of a lower power usage (e.g., a target power workload designed to power burn at 400 W), and so forth, thereby decreasing the amount of power consumed by GPU 202 over time” [Malaya ¶ 24].
Malaya fails to explicitly teach controlling an on-die current sink.
However, Karnik teaches the one or more circuits are to dynamically adjust the load profile by controlling an on-die current sink “Programmable current sinks may be used on the die for controlled repeatable testing” [Karnik ¶ 21]. “As can be seen from FIG. 3, the load current produced by the staggered sink elements may closely approximate the test current waveform of FIG. 2 shown as a dashed line in FIG. 3. As discussed below, the sink elements may be turned on and off gradually to produce a smooth ramp more closely approximating the dashed line” [Karnik ¶ 27].
Karnik is considered to be analogous art to the claimed invention because it is in the same field of system power management. Malaya runs power burn kernels at the hardware to consume target amounts of power; these can be combined with the on-die current sink of Karnik to imitate a workload’s power consumption. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Malaya to incorporate the teachings of Karnik and include controlling an on-die current sink. Doing so would improve the system’s ability to regulate power usage. “Increased power noise (from switching currents and power saving modes) and narrowing of the voltage regulation window (from reduced operating voltages) have pushed designers to explore on-die voltage regulation and power distribution techniques to achieve successful chip level and system level designs” [Karnik ¶ 2].
Regarding Claim 2, Malaya in view of Karnik teaches the device of claim 1, as referenced above. Malaya further teaches wherein the one or more circuits are to begin dynamically adjusting the load profile in response to power consumed by the one or more processing devices exceeding a second power threshold. “The method of claim 3, wherein identifying the power dip condition includes a determination by the power monitor of the power draw by the processor device falling below the threshold power floor level at a rate exceeding the threshold rate of power change (second power threshold)” [Malaya Claim 4]. “A method, comprising: communicating a power dip condition to a workload scheduler of a processor device in response to identifying the power dip condition; and assigning, based at least in part on the power dip condition, one or more target power workloads for execution at the processor device, wherein each of the one or more target power workloads is associated with a known power load” [Malaya Claim 1].
Regarding Claim 6, Malaya in view of Karnik teaches the device of claim 1 as referenced above. Malaya further teaches wherein the one or more circuits are controlled by firmware of the one or more processing devices. “As shown, the GPU 202 includes a command processor 204 that receives commands in a command stream from, for example, a corresponding device driver (not shown) and coordinates processing within the GPU 202. In various embodiments, the device driver includes software, firmware, hardware, or any combination thereof” [Malaya ¶ 14]. “In various embodiments, the system driver 212 reserves memory locations at the local memory module 208 for storing that target power workload code and further identifies commands that GPU firmware (e.g., command processor 204) needs for running target power workloads” [Malaya ¶ 20]. “For example, in various embodiments, the operations of block 410 include firmware controllers utilizing power reduction techniques such as dithering or clock switching to lower things over time-as the GPU 202 continues to be idle, firmware controls use a variety of lower power kernels to walk operations down to idle without encountering a fast power dip” [Malaya ¶ 43].
Regarding Claim 7, Malaya in view of Karnik teaches the device of claim 1 as referenced above. Malaya further teaches:
wherein the one or more circuits raise the power consumed by the one or more processing devices “The method of claim 2, wherein assigning one or more target power workloads includes the workload scheduler assigning one or more target power workloads to raise the power draw by the processor device to meet or exceed the target power draw” [Malaya Claim 5].
by injecting additional work after the workload. “In addition, it is difficult to prevent large dips in power consumption (resulting from dip in current as computations end, causing voltage spikes) as even higher power states cannot easily maintain a similar amount of dynamic power usage as a fully loaded processor (e.g., having a 500 W GPU board drawing maximum power for seconds or minutes after the GPU chip becomes idle and stops processing productive workloads would be difficult)” [Malaya ¶ 12]. “A workload scheduler (such as the command processor) is provided with a plurality of power burn kernels of known power draw (e.g., target power workloads). In some embodiments, a target power workload includes an idle power burn kernel used when there is no driver-initiated productive workloads” [Malaya ¶ 38].
Regarding Claim 8, Malaya in view of Karnik teaches the device of claim 1 as referenced above. Malaya further teaches wherein the one or more processing devices comprise a plurality of Graphics Processing Units (GPUs). “For example, in various embodiments, the computing devices 102-106 includes processing units such as graphics processing units (GPUs), central processing units (CPUs), field programmable gate arrays (FPGAs), and the like on the same board or on separate carrier boards that are connected to each other via a backplane” [Malaya ¶ 9]. “In various embodiments, each of computing devices 102-106 includes one or more processors such as a parallel processor (e.g., vector processors, graphics processing units (GPUs), general-purpose GPUs (GPGPUs), non-scalar processors, highly-parallel processors, artificial intelligence (AI) processors, inference engines, machine learning processors, other multithreaded processing units, and the like)” [Malaya ¶ 10].
Regarding Claim 9, Malaya teaches:
A cluster manager, comprising: at least one processor; and memory including instructions that when executed by the at least one processor cause the at least one processor to: “As shown, the GPU 202 includes a command processor 204 that receives commands in a command stream from, for example, a corresponding device driver (not shown) and coordinates processing within the GPU 202 … In various embodiments, the GPU 202 also includes a local memory 208 (e.g., on-chip RAM) for storing register data, storing workload code, and the like. In various embodiments, the GPU 202 also implements various system monitoring and power saving functions” [Malaya ¶¶ 14-16].
determine, based on one or more power delivery specifications, one or more load profiles for one or more processing devices “For example, in some embodiments, the GPU 202 includes a microcontroller 210 such as a system management unit (SMU) that is configured for enforcing numerous protection mechanisms (e.g., chip-wide and device block thermal constraints, di/dt (rate of change of the charge current (i) over time (t)) limits, voltage droop mitigation, etc.), in addition to performing a plethora of dynamic performance optimizations” [Malaya ¶ 16]. “In various embodiments, a job scheduler (not shown) or other management tool communicates a power floor and a unit of time that the power floor should be maintained (power delivery specification) via a system driver 212. The system driver 212 communicates the power floor and unit of time information to the SMU 210 through a number of techniques such as memory-mapped input/output (MMIO) or in-memory mailboxes” [Malaya ¶ 18]. “The command processor 204, upon receiving a power dip condition signal 212, launches target power workload(s) to the CUs 206. As described in more detail below, target power workloads are designed to burn predetermined, fixed amounts of power to enable reaching and maintaining a target power draw” [Malaya ¶ 19]. “The system driver 212 communicates information regarding one or more target power workloads including pre-compiled kernels of known power load (load profile) (such as double-precision matrix-matrix multiplication (DGEMM) kernels) and stores the target power workload code at the local memory module 208 of the GPU 202” [Malaya ¶ 20].
that process a workload in a bulk- synchronous mode; “Full-scale workloads, such as workloads used in machine learning training applications, sometimes include periods of heavy power loads on devices followed by periods where the same devices are idle. For example, parallel computing workloads often include periods of synchronized high-powered computation (on the order of seconds) and low-powered communication of computed results (on the order of seconds)” [Malaya ¶ 7]. “In various embodiments, each of computing devices 102-106 includes one or more processors such as a parallel processor (e.g., vector processors, graphics processing units (GPUs), general-purpose GPUs (GPGPUs), non-scalar processors, highly-parallel processors, artificial intelligence (AI) processors, inference engines, machine learning processors, other multithreaded processing units, and the like)” [Malaya ¶ 10]. Examiner notes that ¶ 29 of the specification discusses examples of bulk-synchronous workloads, which include workloads for artificial intelligence.
and send the one or more load profiles to the one or more processing devices, “The system driver 212 communicates information regarding one or more target power workloads including pre-compiled kernels of known power load (such as double-precision matrix-matrix multiplication (DGEMM) kernels) and stores the target power workload code at the local memory module 208 of the GPU 202” [Malaya ¶ 20]. “The command processor 204, upon receiving a power dip condition signal 212, launches target power workload(s) to the CUs 206. As described in more detail below, target power workloads are designed to burn predetermined, fixed amounts of power to enable reaching and maintaining a target power draw” [Malaya ¶ 19].
wherein at least one load profile of the one or more load profiles causes, in response to power consumed by the one or more processing devices dropping below a first power threshold, “In this manner, the GPU 202 hardware autonomously starts running power burn workloads when the hardware itself detects changes in power usage falling below the predetermined power floor threshold (first power threshold)” [Malaya ¶ 25].
one or more on-die current (burn) sink circuits to: draw current to raise the power consumed by the one or more processing devices to the first power threshold; “For example and with respect to FIG. 2, in various embodiments, the command processor 204 is capable of launching multiple different kernels of target power workload (e.g., designed to burn 100 W, 200 W, 300 W, 400 W, and 500 W of power) on the design of the GPU 202 and for targeting particular target power usage levels. In various embodiments, the SMU 210 instructs the GPU 202 to operate at that threshold power floor level until the command processor 204 is instructed to begin processing productive workloads” [Malaya ¶ 30]. “At block 410, based on receiving the power dip condition signal from the command processor 204 submits work to target a certain power load, whether that is at the previous level or a floor level” [Malaya ¶ 38]. “In some embodiments, the system driver 212 defines and communicates to the SMU 210 an amount of time that the SMU 210 needs to ensure that the power floor is maintained subsequent to a power dip condition” [Malaya ¶ 23]. “The method of claim 2, wherein assigning one or more target power workloads includes the workload scheduler assigning one or more target power workloads to raise the power draw by the processor device to meet or exceed the target power draw” [Malaya Claim 5].
and gradually reduce the drawn current to cause a corresponding reduction in the power consumed by the one or more processing devices to gradually drop from the first power threshold. “Additionally, it is not necessary to indefinitely perform target power workloads for the purposes of power burn. In some embodiments, the command processor 204 and the SMU 210 gradually decrease power usage by running a first power burn kernel of a higher power usage (e.g., a target power workload designed to power burn at 500 W) for a first period of time, switching to a second power burn kernel of a lower power usage (e.g., a target power workload designed to power burn at 400 W), and so forth, thereby decreasing the amount of power consumed by GPU 202 over time” [Malaya ¶ 24].
Malaya fails to explicitly teach one or more on-die current sink circuits to: draw current.
However, Karnik teaches one or more on-die current sink circuits to: draw current “Programmable current sinks may be used on the die for controlled repeatable testing” [Karnik ¶ 21]. “As can be seen from FIG. 3, the load current produced by the staggered sink elements may closely approximate the test current waveform of FIG. 2 shown as a dashed line in FIG. 3. As discussed below, the sink elements may be turned on and off gradually to produce a smooth ramp more closely approximating the dashed line” [Karnik ¶ 27].
Karnik is considered to be analogous art to the claimed invention because it is in the same field of system power management. Malaya runs power burn kernels at the hardware to consume target amounts of power; these can be combined with the on-die current sink of Karnik to imitate a workload’s power consumption. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Malaya to incorporate the teachings of Karnik and include one or more on-die current sink circuits to: draw current. Doing so would improve the system’s ability to regulate power usage. “Increased power noise (from switching currents and power saving modes) and narrowing of the voltage regulation window (from reduced operating voltages) have pushed designers to explore on-die voltage regulation and power distribution techniques to achieve successful chip level and system level designs” [Karnik ¶ 2].
Regarding Claim 10, Malaya in view of Karnik teaches the cluster manager of claim 9 as referenced above. Malaya further teaches wherein the one or more processing devices comprise a plurality of processing devices. “For example, in various embodiments, the computing devices 102-106 includes processing units such as graphics processing units (GPUs), central processing units (CPUs), field programmable gate arrays (FPGAs), and the like on the same board or on separate carrier boards that are connected to each other via a backplane” [Malaya ¶ 9].
Regarding Claim 11, Malaya in view of Karnik teaches the cluster manager of claim 10 as referenced above. Malaya further teaches wherein the plurality of processing devices comprise a plurality of Graphics Processing Units (GPUs). “For example, in various embodiments, the computing devices 102-106 includes processing units such as graphics processing units (GPUs), central processing units (CPUs), field programmable gate arrays (FPGAs), and the like on the same board or on separate carrier boards that are connected to each other via a backplane” [Malaya ¶ 9]. “In various embodiments, each of computing devices 102-106 includes one or more processors such as a parallel processor (e.g., vector processors, graphics processing units (GPUs), general-purpose GPUs (GPGPUs), non-scalar processors, highly-parallel processors, artificial intelligence (AI) processors, inference engines, machine learning processors, other multithreaded processing units, and the like)” [Malaya ¶ 10].
Regarding Claim 12, Malaya in view of Karnik teaches the cluster manager of claim 10 as referenced above. Malaya further teaches wherein additional work is injected to at least some of the plurality of processing devices after the workload is processed “In addition, it is difficult to prevent large dips in power consumption (resulting from dip in current as computations end, causing voltage spikes) as even higher power states cannot easily maintain a similar amount of dynamic power usage as a fully loaded processor (e.g., having a 500 W GPU board drawing maximum power for seconds or minutes after the GPU chip becomes idle and stops processing productive workloads would be difficult)” [Malaya ¶ 12]. “A workload scheduler (such as the command processor) is provided with a plurality of power burn kernels of known power draw (e.g., target power workloads). In some embodiments, a target power workload includes an idle power burn kernel used when there is no driver-initiated productive workloads” [Malaya ¶ 38].
to control their respective load profiles. “The method of claim 2, wherein assigning one or more target power workloads includes the workload scheduler assigning one or more target power workloads to raise the power draw by the processor device to meet or exceed the target power draw” [Malaya Claim 5].
Regarding Claim 13, Malaya in view of Karnik teaches the cluster manager of claim 9, as referenced above. Malaya further teaches:
wherein the at least one load profile corresponds to a ramp-down load profile “In other embodiments, the SMU 210 and command processor 204 instruct the CUs 206 to process the same target power workload (load profile) while walking down the voltage-frequency points at which the GPU 202 operates (e.g., gradually decreases power usage (ramp-down) by running the same power-burn kernel), such as by utilizing power management techniques including dynamic voltage and frequency scaling (DVFS) to dynamically adjust an operating voltage and frequency point (referred to as a "p-state") across GPU 202 components during run time” [Malaya ¶ 24].
applied at an end of the workload. “At time=T0, the processing device is in a low power draw state due to, for example, computations for a previous productive workload ending and causing power draw by the processing device to fall below a threshold power floor level (labeled as Ptarget in FIG. 3).” [Malaya ¶ 28]. “At time=T1, the power monitor of the processing device (e.g., SMU 210 or other power monitoring firmware at the GPU 202 of FIG. 2) detects the power dip condition by observing that power utilized by the GPU 202 dips below the power floor over a specified period of time (e.g., di/dt time rate of change of current consumption, such as over a few milliseconds). Accordingly, the SMU 210 notifies the command processor 204 of the power dip condition and the command processor 204 begins launching target power workloads (precompiled kernels for execution to generate dynamic work by producing results to be discarded for the purposes of burning power) to the CUs 206.” [Malaya ¶ 29].
Regarding Claim 15, Malaya teaches:
A Graphics Processing Unit (GPU), “Additionally, although described with respect to FIG. 2 in the context of a GPU 202 device, those skilled in the art will recognize that the concepts disclosed herein are applicable to various processors, including datacenters with heterogeneous system architectures (HSA) where compute systems are expected to be used for a variety of compute intensive models using combinations of any of the following: CPUs, GPUs, FPGAs, custom ASICs, and the like” [Malaya ¶ 26].
comprising: an on-die current (burn) sink circuit “The GPU 202 includes a plurality of compute units (CUs) 206 that are generally configured to execute sets of instructions (e.g., computer programs) that manipulate the circuitry of the GPU 202 to carry out defined tasks” [Malaya ¶ 15]. “The command processor 204, upon receiving a power dip condition signal 212, launches target power workload(s) to the CUs 206. As described in more detail below, target power workloads are designed to burn predetermined, fixed amounts of power to enable reaching and maintaining a target power draw” [Malaya ¶ 19].
and one or more circuits, including hardware, to dynamically adjust a load profile for the GPU “For example and with respect to FIG. 2, in various embodiments, the command processor 204 is capable of launching multiple different kernels of target power workload (e.g., designed to burn 100 W, 200 W, 300 W, 400 W, and 500 W of power) on the design of the GPU 202 and for targeting particular target power usage levels. In various embodiments, the SMU 210 instructs the GPU 202 to operate at that threshold power floor level until the command processor 204 is instructed to begin processing productive workloads” [Malaya ¶ 30].
when the GPU is operated in a bulk-synchronous mode with one or more other GPUs, “Full-scale workloads, such as workloads used in machine learning training applications, sometimes include periods of heavy power loads on devices followed by periods where the same devices are idle. For example, parallel computing workloads often include periods of synchronized high-powered computation (on the order of seconds) and low-powered communication of computed results (on the order of seconds)” [Malaya ¶ 7]. “In various embodiments, each of computing devices 102-106 includes one or more processors such as a parallel processor (e.g., vector processors, graphics processing units (GPUs), general-purpose GPUs (GPGPUs), non-scalar processors, highly-parallel processors, artificial intelligence (AI) processors, inference engines, machine learning processors, other multithreaded processing units, and the like)” [Malaya ¶ 10]. Examiner notes that ¶ 29 of the specification discusses examples of bulk-synchronous workloads, which include workloads for artificial intelligence.
wherein, in response to power consumed by the GPU dropping below a first power threshold, “In this manner, the GPU 202 hardware autonomously starts running power burn workloads when the hardware itself detects changes in power usage falling below the predetermined power floor threshold (first power threshold)” [Malaya ¶ 25].
the one or more circuits are to dynamically adjust the load profile by controlling the on-die current (burn) sink circuit to: raise the power consumed by the GPU to the first power threshold “For example and with respect to FIG. 2, in various embodiments, the command processor 204 is capable of launching multiple different kernels of target power workload (e.g., designed to burn 100 W, 200 W, 300 W, 400 W, and 500 W of power) on the design of the GPU 202 and for targeting particular target power usage levels. In various embodiments, the SMU 210 instructs the GPU 202 to operate at that threshold power floor level until the command processor 204 is instructed to begin processing productive workloads” [Malaya ¶ 30]. “At block 410, based on receiving the power dip condition signal from the command processor 204 submits work to target a certain power load, whether that is at the previous level or a floor level” [Malaya ¶ 38]. “In some embodiments, the system driver 212 defines and communicates to the SMU 210 an amount of time that the SMU 210 needs to ensure that the power floor is maintained subsequent to a power dip condition” [Malaya ¶ 23]. “The method of claim 2, wherein assigning one or more target power workloads includes the workload scheduler assigning one or more target power workloads to raise the power draw by the processor device to meet or exceed the target power draw” [Malaya Claim 5].
and gradually reduce the drawn current to cause a corresponding reduction in the power consumed by the GPU to gradually drop from the first power threshold. “Additionally, it is not necessary to indefinitely perform target power workloads for the purposes of power burn. In some embodiments, the command processor 204 and the SMU 210 gradually decrease power usage by running a first power burn kernel of a higher power usage (e.g., a target power workload designed to power burn at 500 W) for a first period of time, switching to a second power burn kernel of a lower power usage (e.g., a target power workload designed to power burn at 400 W), and so forth, thereby decreasing the amount of power consumed by GPU 202 over time” [Malaya ¶ 24].
Malaya fails to explicitly teach an on-die current sink circuit.
However, Karnik teaches an on-die current sink circuit “Programmable current sinks may be used on the die for controlled repeatable testing” [Karnik ¶ 21]. “As can be seen from FIG. 3, the load current produced by the staggered sink elements may closely approximate the test current waveform of FIG. 2 shown as a dashed line in FIG. 3. As discussed below, the sink elements may be turned on and off gradually to produce a smooth ramp more closely approximating the dashed line” [Karnik ¶ 27].
Karnik is considered to be analogous art to the claimed invention because it is in the same field of system power management. Malaya runs power burn kernels at the hardware to consume target amounts of power; these can be combined with the on-die current sink of Karnik to imitate a workload’s power consumption. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Malaya to incorporate the teachings of Karnik and include an on-die current sink circuit. Doing so would improve the system’s ability to regulate power usage. “Increased power noise (from switching currents and power saving modes) and narrowing of the voltage regulation window (from reduced operating voltages) have pushed designers to explore on-die voltage regulation and power distribution techniques to achieve successful chip level and system level designs” [Karnik ¶ 2].
Regarding Claim 16, Malaya in view of Karnik teaches the GPU of claim 15 as referenced above. Malaya further teaches wherein the one or more circuits receive information for the load profile from a cluster manager that manages the GPU and the one or more other GPUs. “In various embodiments, a job scheduler (not shown) or other management tool communicates a power floor and a unit of time that the power floor should be maintained via a system driver 212 (cluster manager). The system driver 212 communicates the power floor and unit of time information to the SMU 210 through a number of techniques such as memory-mapped input/output (MMIO) or in-memory mailboxes” [Malaya ¶ 18]. “For example, in various embodiments, the workload scheduler logic is implemented at least in part at a host processor driver or at a processor core external to the GPU 202. Additionally, although described with respect to FIG. 2 in the context of a GPU 202 device, those skilled in the art will recognize that the concepts disclosed herein are applicable to various processors, including datacenters with heterogeneous system architectures (HSA) where compute systems are expected to be used for a variety of compute intensive models using combinations of any of the following: CPUs, GPUs, FPGAs, custom ASICs, and the like” [Malaya ¶ 26].
Regarding Claim 17, Malaya in view of Karnik teaches the GPU of claim 16, as referenced above.
Malaya further teaches wherein the information comprises the first power threshold. “In various embodiments, a job scheduler (not shown) or other management tool communicates a power floor (first power threshold) and a unit of time that the power floor should be maintained via a system driver 212. The system driver 212 communicates the power floor and unit of time information to the SMU 210 through a number of techniques such as memory-mapped input/output (MMIO) or in-memory mailboxes” [Malaya ¶ 18]. “In some embodiments, each of one or more device (e.g., a GPU 202 of FIG. 2) includes a control system by which power and compute loads are monitored and varied to enable reaching and maintaining a target power draw in accordance with a user specified threshold power level and/or rate of change in power usage” [Malaya ¶ 36].
Regarding Claim 18, Malaya in view of Karnik teaches the GPU of claim 17, as referenced above. Malaya further teaches wherein the information comprises slope information that governs how the one or more circuits dynamically adjust the load profile. “In various embodiments, a job scheduler (not shown) or other management tool communicates a power floor and a unit of time that the power floor should be maintained via a system driver 212. The system driver 212 communicates the power floor and unit of time information to the SMU 210 through a number of techniques such as memory-mapped input/output (MMIO) or in-memory mailboxes” [Malaya ¶ 18]. “In other embodiments, the power floor instruction is a user setting that does not specify particular target power usage levels but instead specifies the rate at which power usage of the GPU changes over a period of time. For example, in some embodiments, the power floor instruction includes a predetermined di/dt rate of change (slope information). This information is then provided to each of one or more processing devices (which may be CPUs, GPUs, ASICs, and the like) so they have a corresponding target power draw (or 'power load' as used interchangeably throughout this disclosure)” [Malaya ¶ 33].
Regarding Claim 20, Malaya in view of Karnik teaches the GPU of claim 17, as referenced above. Malaya further teaches:
wherein the information comprises a second power threshold, “In some embodiments, each of one or more device (e.g., a GPU 202 of FIG. 2) includes a control system by which power and compute loads are monitored and varied to enable reaching and maintaining a target power draw in accordance with a user specified threshold power level and/or rate of change in power usage (second power threshold)” [Malaya ¶ 36].
wherein the one or more circuits begin adjusting the load profile in response to power consumed by the GPU exceeding the second power threshold. “The method of claim 3, wherein identifying the power dip condition includes a determination by the power monitor of the power draw by the processor device falling below the threshold power floor level at a rate exceeding the threshold rate of power change” [Malaya Claim 4]. “A method, comprising: communicating a power dip condition to a workload scheduler of a processor device in response to identifying the power dip condition; and assigning, based at least in part on the power dip condition, one or more target power workloads for execution at the processor device, wherein each of the one or more target power workloads is associated with a known power load” [Malaya Claim 1].
Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Malaya (US 2022/0318056 A1) in view of Karnik (US 2003/0176982 A1) in view of Aronovich (US 2023/0176918 A1).
Regarding Claim 3, Malaya in view of Karnik teaches the device of claim 1, as referenced above. Malaya further teaches wherein the power consumed dropping below the first power threshold is caused at least in part by an end of the workload being processed. “At time=T0, the processing device is in a low power draw state due to, for example, computations for a previous productive workload ending and causing power draw by the processing device to fall below a threshold power floor level (labeled as Ptarget in FIG. 3).” [Malaya ¶ 28].
Malaya in view of Karnik fails to explicitly teach a workload release at an end of the workload being processed.
However, Aronovich teaches a workload release at an end of the workload being processed. “In some scenarios, scheduler 112 performs processes 202-224 in response to one or more changing conditions within cloud computing environment 50, including, but not limited to, submission or completion of a workload, a change in required completion time or resource requirements of pending or current workloads, when a workload releases a resource from use…” [Aronovich ¶ 59 Fig. 4A and 4B Examiner notes the figures include the processes 202-224 which include adjustments to the workload].
Aronovich is considered to be analogous to the claimed invention because it is in the same field of processing resource allocation. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Malaya in view of Karnik to incorporate the teachings of Aronovich and include a workload release at an end of the workload being processed. Doing so would allow for further accuracy in planning for workload resource consumption. “In some embodiments, scheduler 112 determines, concurrently, how many resources are required for each workload (active or pending) to meet its specified completion time” [Aronovich ¶ 64].
Claims 4 and 5 are rejected under 35 U.S.C. 103 as being unpatentable over Malaya (US 2022/0318056 A1) in view of Karnik (US 2003/0176982 A1) in view of Cheng (US 2017/0090548 A1).
Regarding Claim 4, Malaya in view of Karnik teaches the device of claim 1 as referenced above. Malaya further teaches wherein the load profile is dynamically adjusted. “Referring now to FIG. 2, illustrated is a block diagram illustrating dynamic system load power management for smoothing system power usage at a GPU in accordance with some embodiments” [Malaya ¶ 9].
Malaya in view of Karnik fails to teach in response to detecting a workload ramp-up at a beginning of the workload being processed.
However, Cheng teaches in response to detecting a workload ramp-up at a beginning of the workload being processed. “Turning to FIG. 6, an exemplary load demand/ target profile 600 is provided. In this example, three different target values are requested. The optimal feedforward calculation is performed at the beginning of each load ramp upon detecting a new target. Accordingly, upon receiving the first target value (i.e., Target 1), at time T2, since the boiler/ turbine process was previously operating in a steady state manner (i.e., during time T1), the present state information of the boiler/turbine process is used to calculate the optimal feedforward profile” [Cheng ¶ 72 Fig. 6 Examiner notes time T2 of figure 6 depicts a workload ramp-up at the beginning of the workload]. “At a step 510, upon performing the optimal feedforward profile calculation while taking the current estimated state into account, feedforward and feedback control signals are combined and are used to control the power generation system in accordance with the concurrent and continuous state estimations” [Cheng ¶ 69].
Cheng is considered to be analogous to the claimed invention because it is in the same field of system power management. While Cheng focuses on control of a power generating plant, the detection of a workload ramp-up at the beginning of the workload can be applied to the processors of Malaya. Both systems monitor workloads and, based on these observations, adjust to conform to system power requirements. Additionally, the start of a new computing workload would have a corresponding ramp-up period alongside a necessitated ramp-up of processing power. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Malaya in view of Karnik to incorporate the teachings of Cheng and include that the system is responsive to detecting a workload ramp-up at a beginning of the workload being processed. Doing so would allow for further system consideration of a diversity of workloads. “More particularly, this control method may be used in any process plant or control system that receives numerous set point changes and which controls slow reacting equipment, and additionally may be used to produce feedforward control signals or other types of control signals in these or other environments” [Cheng ¶ 75].
Regarding Claim 5, Malaya in view of Karnik teaches the device of claim 1 as referenced above. Malaya further teaches wherein the load profile is dynamically adjusted. “Referring now to FIG. 2, illustrated is a block diagram illustrating dynamic system load power management for smoothing system power usage at a GPU in accordance with some embodiments” [Malaya ¶ 9].
Malaya in view of Karnik fails to teach in response to predicting at least one of a workload release at an end of the workload being processed and a workload ramp-up at a beginning of the workload being processed.
However, Cheng teaches in response to predicting at least one of a workload release at an end of the workload being processed and a workload ramp-up at a beginning of the workload being processed. “The power plant controller may also incorporate a feedforward (or anticipative) controller which foresees (predicts) future changes and provides quick action to increase or decrease the output power in response to an expected change in a load demand profile, which may be made either locally or by a remote dispatch (e.g., by the grid manager)” [Cheng ¶ 8]. “Turning to FIG. 6, an exemplary load demand/ target profile 600 is provided. In this example, three different target values are requested. The optimal feedforward calculation is performed at the beginning of each load ramp upon detecting a new target. Accordingly, upon receiving the first target value (i.e., Target 1), at time T2, since the boiler/ turbine process was previously operating in a steady state manner (i.e., during time T1), the present state information of the boiler/turbine process is used to calculate the optimal feedforward profile” [Cheng ¶ 72 Fig. 6 Examiner notes time T2 of figure 6 depicts a workload ramp-up at the beginning of the workload]. “At a step 510, upon performing the optimal feedforward profile calculation while taking the current estimated state into account, feedforward and feedback control signals are combined and are used to control the power generation system in accordance with the concurrent and continuous state estimations” [Cheng ¶ 69].
Cheng is considered to be analogous to the claimed invention because it is in the same field of system power management. While Cheng focuses on control of a power generating plant, the detection of a workload ramp-up at the beginning of the workload can be applied to the processors of Malaya. Both systems monitor workloads and, based on these observations, adjust to conform to system power requirements. Additionally, the start of a new computing workload would have a corresponding ramp-up period alongside a necessitated ramp-up of processing power. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Malaya in view of Karnik to incorporate the teachings of Cheng and include that the system is responsive to predicting at least one of a workload release at an end of the workload being processed and a workload ramp-up at a beginning of the workload being processed. Doing so would allow for further system consideration of a diversity of workloads. “More particularly, this control method may be used in any process plant or control system that receives numerous set point changes and which controls slow reacting equipment, and additionally may be used to produce feedforward control signals or other types of control signals in these or other environments” [Cheng ¶ 75].
Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Malaya (US 2022/0318056 A1) in view of Karnik (US 2003/0176982 A1) in view of Cudak (US 9,442,770 B1).
Regarding Claim 14, Malaya in view of Karnik teaches the cluster manager of claim 13, as referenced above. Malaya in view of Karnik fails to teach wherein the one or more load profiles comprises a ramp-up load profile applied at a beginning of the workload.
However, Cudak teaches wherein the one or more load profiles comprises a ramp-up load profile applied at a beginning of the workload. “The second execution profile 420 includes a pre-execution time period 422 during which substantially no amount of a resource of a device is used. The second execution profile 420 includes a ramp up time period 424 during which the second execution profile 420 transitions from using substantially no amount of the resource of the device to using some amount of the resource of the device” [Cudak Col. 13 Lines 18-22 Fig. 4B Examiner notes that figure 4B depicts a profile with a ramp up period 424 at the beginning of the workload which is further illuminated by the preceding pre-execution time period 422].
Cudak is considered to be analogous to the claimed invention because it is in the same field of allocation of processing resources. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Malaya in view of Karnik to incorporate the teachings of Cudak and include that the one or more load profiles comprises a ramp-up load profile applied at a beginning of the workload. Doing so would allow for further detail in evaluations of system utilization. “The apparatus includes a comparison module that compares an execution profile of each job of the multiple jobs with a resource of a device of the computer system. The execution profile of each job includes an amount of a resource of a device used by the respective job over time” [Cudak Col. 1 Lines 29-32].
Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Malaya (US 2022/0318056 A1) in view of Karnik (US 2003/0176982 A1) in view of Hansen (US 2023/0120165 A1).
Regarding Claim 19, Malaya in view of Karnik teaches the GPU of claim 16, as referenced above. Malaya further teaches wherein the information is based on a maximum power swing “These system-wide power swings (which can be on the magnitude of megawatts power over the course of a few seconds of time or fractions of a second) resulting from rapid changes in power demand risk damaging local power transformers, system circuitry, and also up-stream power transformers and generators” [Malaya ¶ 7]. “For example, in some embodiments, the GPU 202 includes a microcontroller 210 such as a system management unit (SMU) that is configured for enforcing numerous protection mechanisms (e.g., chip-wide and device block thermal constraints, di/dt (rate of change of the charge current (i) over time (t)) limits, voltage droop mitigation, etc.), in addition to performing a plethora of dynamic performance optimizations” [Malaya ¶ 16].
Malaya in view of Karnik fails to teach wherein the information is based on a maximum power swing of a power provider.
However, Hansen teaches wherein the information is based on a maximum power swing of a power provider. “Ramp rate in this context may be defined as the change in power output of a RES facility or RES-BESS facility (e.g., PV +S facility) in a given time interval (e.g., change per minute or change per hour)” [Hansen ¶ 162 Examiner notes this description of ramp rate is in accordance with the description of power swing given in ¶ 42 of the specification]. “If the plant production is less than the current RES production, then curtailment may be applied to make sure that the RES plant output does not violate the ramp limit… Firstly, if there is a sudden increase in RES production, this logic will control plant production so that total output increases in steps of power that are less than equal to the ramp rate up limit (maximum power swing)” [Hansen ¶ 163].
Hansen is considered to be analogous to the claimed invention because it is in the same field of system power control and allocation. While Hansen focuses on control of a power generating plant, the maximum power swing of a power provider can be applied to the management information of the processors of Malaya because both systems seek to account for power constraints in their control signals. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Malaya in view of Karnik to incorporate the teachings of Hansen and include that the information is based on a maximum power swing of a power provider. Doing so would allow for system consideration of both the cost and operating parameters of the power source. “In some embodiments, assigning the score to each of the plurality of control algorithms comprises calculating the score for each of the plurality of control algorithms based on one or more variables associated with (i) one or more operating parameters of the power plant and/or (ii) energy market prices” [Hansen ¶ 6].
Response to Arguments
Applicant's arguments filed 12/16/25 have been fully considered but they are not persuasive. Applicant argues in substance:
I. Amended claim 1 further emphasizes embodiments according to Fig. 3, where power consumption of the processing device dropping below a threshold activates an on-die current sink circuit which draws current to immediately raise the power consumption of the processing device back to the same threshold at time t3, before controlling the current sink circuit to gradually draw less current to cause a corresponding reduction in the power consumption of the processing device. The cited art does not disclose or suggest claim 1.
Whatmough and Harwani are directed to power management for processing workloads but appear silent with respect to raising the power consumed by a processor when the power consumed falls below a threshold. Malaya is directed to power management for processing workloads and describes performing a "burn" to raise consumed power. In particular, Fig. 3 of Malaya shows raising power consumption of a GPU in response to power consumed dropping below a power floor (i.e., Ptarget), but as shown in the figure, both the total GPU power (solid line) and the power consumed by the burn (dotted line) rise above the power floor, Ptarget. By contrast, claim 1 recites controlling an on-die current sink circuit to "draw current to raise the power consumed by the one or more processing devices to the first power threshold," and then "gradually reduce the drawn current to cause a corresponding reduction in the power consumed by the one or more processing devices to gradually drop from the first power threshold."
Because Whatmough, Harwani, and Malaya, taken alone or in combination, do not disclose or suggest each and every feature of claim 1, these documents do not anticipate or render obvious claim 1. The other cited references have not been relied upon to remedy the deficiencies of Whatmough, Harwani, and Malaya and do not appear to do so. Accordingly, claim 1 and its associated dependent claims are patentable over the cited art.
Examiner respectfully disagrees. As detailed in the rejection above, Malaya teaches “draw current to raise the power consumed by the one or more processing devices to the first power threshold” [Malaya ¶ 30, 38, and claim 5] and “gradually reduce the drawn current to cause a corresponding reduction in the power consumed by the one or more processing devices to gradually drop from the first power threshold” [Malaya ¶ 24]. Malaya states: “At block 410, based on receiving the power dip condition signal from the command processor 204 submits work to target a certain power load, whether that is at the previous level or a floor level” [Malaya ¶ 38]. “The method of claim 2, wherein assigning one or more target power workloads includes the workload scheduler assigning one or more target power workloads to raise the power draw by the processor device to meet or exceed the target power draw” [Malaya Claim 5]. The embodiments of Malaya allow for raising the power consumed to the first power threshold, which is the power floor threshold, as well as for raising the power above the first power threshold. The arguments have been considered but were not found to be persuasive.
Applicant’s further arguments with respect to claim(s) 1, 9, and 15 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
II. Claim 2 is further patentable at least because the cited art has not been shown to disclose or suggest the features added to this claim.
Examiner respectfully disagrees. Malaya teaches wherein the one or more circuits are to begin dynamically adjusting the load profile in response to power consumed by the one or more processing devices exceeding a second power threshold [Malaya Claim 4]. The threshold rate of change of power consumption of Malaya is a second power threshold, which is used to determine when to adjust the load profile of the processors.
Conclusion
Examiner respectfully requests that, in response to this Office action, support be shown for language added to any original claims on amendment and for any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line number(s) in the specification and/or drawing figure(s). This will assist Examiner in prosecuting the application.
When responding to this Office Action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections. See 37 CFR 1.111(c).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ARI F RIGGINS whose telephone number is (571)272-2772. The examiner can normally be reached Monday-Friday 7:00AM-4:30PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bradley Teets can be reached on (571) 272-3338. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/A.F.R./Examiner, Art Unit 2197
/BRADLEY A TEETS/Supervisory Patent Examiner, Art Unit 2197