DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-17 and 20-22 are pending.
The issues underlying the objections to the specification have been corrected, and the objections are withdrawn.
The issues underlying the 35 U.S.C. 112 rejections have been corrected, and the rejections are withdrawn.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 1-3, 10-11, 12-15, and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Bircher (US 20120297232) in view of Paul et al. (US 20160085219).
Regarding claim 1, Bircher teaches
An information processing system, comprising:
a processor; (Fig. 1 (105 – processor))
a non-transitory storage medium comprising instructions executable by the processor to instantiate a region-aware power/energy regulator (regulator) configured to, during runtime of an application being executed by a given processor:
periodically identify a region of the application which is currently being executed by the given processor out of a plurality of regions of the application, each of the regions comprising (A) a function, routine, sub-program, loop, sub-routine, or equivalent, or (B) a section of contiguous memory of predetermined size; ([0014], “the power management unit may also be configured to calculate a real-time frequency sensitivity value of each application executing on each of the two or more processing units” and [0035-54], “Power management unit 220 in one embodiment may cause a processing unit 211 to operate at P-state P0 responsive to a corresponding frequency sensitivity value exceeding a certain high threshold. The high threshold may take on any of a variety of values, depending on the operating conditions and requirements of the application and processing unit 211. Operation in P-state P0 may be utilized for processing workloads that are compute-bounded (i.e., have a high frequency sensitivity value). … The low threshold may take on any of a variety of values, depending on the operating conditions and requirements of the application and processing unit 211. P-state P4 may be used with memory-bounded workloads as well as with other tasks that are not time-sensitive or frequency-sensitive.”, where a real-time analysis of an application is interpreted as identifying a region of an application)
measure instructions per second (IPS) of the given processor during execution of the identified region; ([0098], “Power management unit 520 may be configured to monitor one or more processing units (not shown) using memory controller bandwidth unit 502, committed instructions per second (CIPS) unit 504, and instructions-per-cycle (IPC) unit 506. Units 502-506 are representative of any number of hardware performance counters which may provide information to decision unit 508 regarding one or more processing units.”)
determine a compute-boundedness parameter of the identified region based on the measured IPS; (Figs. 6-7, [0053-54], “Power management unit 220 in one embodiment may cause a processing unit 211 to operate at P-state P0 responsive to a corresponding frequency sensitivity value exceeding a certain high threshold. The high threshold may take on any of a variety of values, depending on the operating conditions and requirements of the application and processing unit 211. Operation in P-state P0 may be utilized for processing workloads that are compute-bounded (i.e., have a high frequency sensitivity value). … Power management unit 220 may cause a processing unit 211 to operating in P-state P4 responsive to a corresponding frequency sensitivity value that is less than a low threshold value. The low threshold may take on any of a variety of values, depending on the operating conditions and requirements of the application and processing unit 211. P-state P4 may be used with memory-bounded workloads as well as with other tasks that are not time-sensitive or frequency-sensitive.” And [0108], “ For example, frequency sensitivity calculation unit 704 may be coupled to receive count values generated from various hardware performance counters, including IPC, CIPS, memory controller bandwidth, branch mispredictions, instructions issued, cache hits and misses, instruction executions, pipeline stalls, and/or one or more other metrics.” And [0110], “The real-time frequency sensitivity value calculated and tracked by frequency sensitivity calculation unit 704 may be determined based on a previously created formula.” Where frequency sensitivity value is interpreted as a compute-boundedness based on a measured IPS because the frequency sensitivity value indicates memory bounded/computed bounded workloads)
determine an optimal frequency for the identified region based on the compute-boundedness parameter thereof; and (Fig. 4 (420 and 425), [0053-54], “Power management unit 220 in one embodiment may cause a processing unit 211 to operate at P-state P0 responsive to a corresponding frequency sensitivity value exceeding a certain high threshold. The high threshold may take on any of a variety of values, depending on the operating conditions and requirements of the application and processing unit 211. Operation in P-state P0 may be utilized for processing workloads that are compute-bounded (i.e., have a high frequency sensitivity value). … Power management unit 220 may cause a processing unit 211 to operating in P-state P4 responsive to a corresponding frequency sensitivity value that is less than a low threshold value. The low threshold may take on any of a variety of values, depending on the operating conditions and requirements of the application and processing unit 211. P-state P4 may be used with memory-bounded workloads as well as with other tasks that are not time-sensitive or frequency-sensitive.” And [0114], “Based on the calculated real-time frequency sensitivity score for a particular processor core, a decision may be made to adjust the clock frequency for that particular processor core. For example, if the real-time frequency sensitivity score is above a first high threshold as determined by threshold comparator 706, the frequency may be increased for the core. If the real-time frequency sensitivity score is below a second low threshold, the frequency may be decreased for the core. If the real-time frequency sensitivity score is in between the first and second thresholds, then the current frequency may be maintained.”)
instruct the given processor to set a frequency of the given processor to the optimal frequency for a remainder execution of the identified region. ([0114], “Based on the calculated real-time frequency sensitivity score for a particular processor core, a decision may be made to adjust the clock frequency for that particular processor core. For example, if the real-time frequency sensitivity score is above a first high threshold as determined by threshold comparator 706, the frequency may be increased for the core. If the real-time frequency sensitivity score is below a second low threshold, the frequency may be decreased for the core. If the real-time frequency sensitivity score is in between the first and second thresholds, then the current frequency may be maintained.”)
Bircher does not teach but Paul teaches
each of the regions comprising (A) a function, routine, sub-program, loop, sub-routine, or equivalent, or (B) a section of contiguous memory of predetermined size; ([0018], “An application phase may correspond to an application kernel, which refers to a particular portion of an application defined by the programmer, such as a function, a subroutine, a code block, and the like. … Application phases may also have different thermal properties or characteristics. For example, different application phases may induce different thermal rise times in the processor cores 101-112 or the GPU 120, may have different thermal intensities, or may exhibit different thermal profiles when executed on the different processor cores 101-112 or the GPU 120”)
determine a compute-boundedness parameter of the identified region based on the measured IPS; ([0018], “ Each application phase may run for a different duration, exhibit different mixes of active events and idle events, and have different computational intensities or be more or less memory bounded.” And [0023], “ For example, a processor core 101 that is executing a computationally intensive application phase may retire a relatively large number of instructions per cycle and may therefore dissipate a larger amount of heat. The processor core 101 may therefore exhibit a high thermal density or thermal sensitivity. For another example, an application phase that is memory bounded may exhibit relatively short active periods interspersed with relatively long idle periods and may therefore dissipate a smaller amount of heat.”, [0024], “For another example, the thermal density or thermal sensitivity of the processor core 101 may increase (or decrease) in response to a change in the performance state that causes the operating voltage or frequency of the processor core 101 to increase (or decrease).”, [0041], “The thermal impact predictor 405 may also receive input 416 indicating characteristics of the application or application phase. In some embodiments, the application characteristics include information indicating whether the application or application phase is computationally intensive or memory bounded.”)
determine an optimal frequency for the identified region based on the compute-boundedness parameter thereof; ([0023], “The thermal density or the thermal sensitivity of components such as the processor cores 101-112 or the GPU 120 may also depend on whether the workload or workloads being executed by the processor cores 101-112 or the GPU 120 are computationally intensive or memory bounded. For example, a processor core 101 that is executing a computationally intensive application phase may retire a relatively large number of instructions per cycle and may therefore dissipate a larger amount of heat.”, [0041], “The thermal impact predictor 405 may also receive input 416 indicating characteristics of the application or application phase. In some embodiments, the application characteristics include information indicating whether the application or application phase is computationally intensive or memory bounded.” And [0025], “Some embodiments of the SMU 130 may be used to manage thermal and power conditions in the processing device 100 according to policies set by the operating system and using information that may be provided to the SMU 130 by the operating system, such as a thermal history associated with an application being executed by one of the components of the processing device 100, thermal sensitivities of the components, and a layout of the components in the processing device 100, as discussed herein. The SMU 130 may therefore be able to control power supplied to entities such as the processor cores 101-112 or the GPU 120, as well as adjusting operating points of the processor cores 101-112 or the GPU 120, e.g., by changing an operating frequency or an operating voltage supplied to the processor cores 101-112 or the GPU 120.”)
Bircher and Paul are analogous art. Paul is cited to teach a similar concept of power management related to compute-bound and memory-bound activity based on regions of the application. Bircher teaches that both compute-bounded and memory-bounded activity can be optimized for power/performance by adjusting the frequency of the cores/processor based on an evaluation of the regions of the application being executed. Paul teaches that compute-bounded and memory-bounded activity can be optimized on a per-region basis, where a region may be a function or a sub-routine, and that a distinct processor (i.e., a GPU) may be used to perform these operations. Based on Paul, it would have been obvious before the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Bircher to identify the regions as functions or sub-routines and to optimize the frequency on a per-function or per-sub-routine basis. Furthermore, being able to use a GPU for certain application functions or sub-routines improves on Bircher by allowing the power/performance/heat of the system to be optimized. To one of ordinary skill in the art before the effective filing date of the invention, it would have been advantageous to make this modification in order to optimize the power/performance/heat of the system, for example because “Redistribution of the application phases (which may also be referred to as load-balancing) may reduce some or all of the thermal density peaks 201-205 or reduce the likelihood of thermal emergencies and the processing device 100.” (Paul, [0031]).
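For illustration only, the three-way threshold rule quoted above from Bircher [0114] (increase, decrease, or maintain the clock frequency) can be sketched as follows. The function name, the default threshold values, and the fixed 100 MHz step are assumptions made for this sketch; Bircher does not specify them.

```python
def next_frequency_mhz(sensitivity, current_mhz,
                       low_threshold=0.3, high_threshold=0.7, step_mhz=100):
    """Return the next core clock frequency from a frequency sensitivity score.

    Hypothetical helper: thresholds and step size are illustrative assumptions.
    """
    if low_threshold > high_threshold:
        raise ValueError("low_threshold must not exceed high_threshold")
    if sensitivity > high_threshold:
        # Score above the high threshold (compute-bounded): raise the frequency.
        return current_mhz + step_mhz
    if sensitivity < low_threshold:
        # Score below the low threshold (memory-bounded): lower the frequency,
        # but never below one step.
        return max(step_mhz, current_mhz - step_mhz)
    # Score between the two thresholds: keep the current frequency.
    return current_mhz
```

Under these assumptions, a score of 0.9 at 2000 MHz yields 2100 MHz, a score of 0.1 yields 1900 MHz, and a score of 0.5 leaves the frequency unchanged.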
Regarding claim 10, Bircher teaches wherein the given processor executing the identified region is the processor. ([0059], “described herein may be configured to reduce the clock frequency in response to calculating a low frequency sensitivity value for a processor executing an application, such as a memory-bounded workload.”)
Regarding claim 11, Bircher teaches the information processing system comprises a compute node of a high-performance compute (HPC) system. (Fig. 1 (100 -computer system) and (105 – processor))
Regarding claim 12, Paul teaches wherein the given processor executing the identified region is distinct from the processor. ([0018], “The processor cores 101-112 and the GPU 120 can perform operations such as executing instructions from an application or a phase of an application. As used herein, the term “application phase” refers to a portion of an application that can be scheduled for execution on a component of the processing device 100 independently of scheduling other portions, or other application phases, of the application.” And [0026], “The thermal impact of the application phase may depend on characteristics of the application phase such as whether the application phase is computationally intensive or memory bounded, the thermodynamics of the components (e.g., the distinct GPU 120 may be more thermally efficient than the processor cores 101-112)”, where the GPU can be used when compute bound for an application phase/region instead of a processor core.)
Regarding claim 13, Bircher teaches wherein the information processing system comprises a high-performance compute (HPC) system (Figs. 1 and 2), the processor is part of a system controller node of the HPC system (Fig. 2 (North Bridge 212/Power management Unit 220)), and the given processor is part of a compute node of the HPC system (Fig. 2 (core 211)).
As to claims 14 and 20, Bircher and Paul teach these claims according to the reasoning provided in claim 1.
Claims 4-5 are also rejected as incorporating the deficiencies of the claims that they are dependent upon.
Claim(s) 2-5 and 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Bircher and Paul in view of Lee et al. (US 20170205863).
Regarding claim 2, Bircher teaches wherein measuring the IPS of the given processor during execution of the identified region comprises setting the frequency of the given processor to a high frequency and measuring the IPS to determine a high-frequency IPS, and setting the frequency of the given processor to a low frequency and measuring the IPS to determine a low-frequency IPS, and wherein the compute-boundedness parameter is determined based on the high-frequency IPS and the low-frequency IPS without a regression equation obtained from characterization of the given processor. ([0081], “the performance of a pre-defined workload `i` may be calculated at two clock frequencies (Freq1 and Freq2, where Freq1 is greater than Freq2), and then the frequency sensitivity may be calculated based on the performance calculations” and [0053-54], “a corresponding frequency sensitivity value exceeding a certain high threshold. The high threshold may take on any of a variety of values, depending on the operating conditions and requirements of the application and processing unit 211. Operation in P-state P0 may be utilized for processing workloads that are compute-bounded (i.e., have a high frequency sensitivity value). … Power management unit 220 may cause a processing unit 211 to operating in P-state P4 responsive to a corresponding frequency sensitivity value that is less than a low threshold value. The low threshold may take on any of a variety of values, depending on the operating conditions and requirements of the application and processing unit 211. P-state P4 may be used with memory-bounded workloads as well as with other tasks that are not time-sensitive or frequency-sensitive.”)
Bircher and Paul do not teach but Lee teaches
wherein the compute-boundedness parameter is determined based on the high-frequency IPS and the low-frequency IPS without a regression equation obtained from characterization of the given processor. ([0131], “a simple classification heuristic model (H1), two parameters may be used to make decisions, e.g., instruction per second (IPS) and memory bandwidth (memBW). IPS is an indicator of core utilization while memBW directly shows memory occupancy. Workloads with low IPS and high memBW are classified as memory-bound, while high IPS and low memBW workloads are classified as compute-bound. The thresholds of high and low are adjustable, e.g., by a user in some embodiments. The decision strategy may be as follows: enable a weak power configuration (fewer cores/threads and lower voltage/frequency) for memory-bound workloads and enable a strong power configuration (which may be a baseline configuration) for compute-bound workloads.” And [0134], “In a machine learning model, runtime statistics may be included as attributes. A multi-dimensional record is a collection of average statistical values for all attributes during a sampling time period. The most energy efficient configuration for each time interval is assigned a label with the information of cores, threads, voltage, and frequency. The model predicts the next optimal power configuration.” Where neither simple heuristic model nor a machine learning model is a regression equation)
Bircher, Paul, and Lee are analogous art. Lee is cited to teach a similar concept of power management related to compute-bound and memory-bound activity. Bircher and Paul teach that both compute-bounded and memory-bounded activity can be optimized for power/performance by adjusting the frequency of the cores/processor based on an evaluation of the regions of the application being executed. Lee teaches that models other than a regression equation can be used to calculate compute-boundedness using a high and a low frequency. Based on Lee, it would have been obvious before the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Bircher and Paul to calculate compute-boundedness using a high and a low frequency with such other models rather than with a regression equation. Furthermore, being able to use other equations improves on Bircher and Paul by allowing the power/performance/heat of the system to be optimized. To one of ordinary skill in the art before the effective filing date of the invention, it would have been advantageous to make this modification because “[b]ased on the particular model logic implemented (and self-learning performed during processor lifetime), many memory bound and compute bound workloads can realize optimal performance and energy efficiency with fewer active compute resources and/or lower frequencies.” (Lee, [0171]).
Regarding claim 3, Bircher and Paul do not teach but Lee teaches wherein determining the compute-boundedness parameter of the identified region comprises evaluating an equation which relates the high frequency, the low frequency, the high-frequency IPS and the low-frequency IPS as input variables to the compute-boundedness parameter as an output variable. ([0131], “a simple classification heuristic model (H1), two parameters may be used to make decisions, e.g., instruction per second (IPS) and memory bandwidth (memBW). IPS is an indicator of core utilization while memBW directly shows memory occupancy. Workloads with low IPS and high memBW are classified as memory-bound, while high IPS and low memBW workloads are classified as compute-bound. The thresholds of high and low are adjustable, e.g., by a user in some embodiments. The decision strategy may be as follows: enable a weak power configuration (fewer cores/threads and lower voltage/frequency) for memory-bound workloads and enable a strong power configuration (which may be a baseline configuration) for compute-bound workloads.”)
Bircher, Paul, and Lee are analogous art. Lee is cited to teach a similar concept of power management related to compute-bound and memory-bound activity. Bircher and Paul teach that both compute-bounded and memory-bounded activity can be optimized for power/performance by adjusting the frequency of the cores/processor based on an evaluation of the regions of the application being executed. Lee teaches that models other than a regression equation can be used to calculate compute-boundedness using a high and a low frequency. Based on Lee, it would have been obvious before the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Bircher and Paul to calculate compute-boundedness using a high and a low frequency with such other models rather than with a regression equation. Furthermore, being able to use other equations improves on Bircher and Paul by allowing the power/performance/heat of the system to be optimized. To one of ordinary skill in the art before the effective filing date of the invention, it would have been advantageous to make this modification because “[b]ased on the particular model logic implemented (and self-learning performed during processor lifetime), many memory bound and compute bound workloads can realize optimal performance and energy efficiency with fewer active compute resources and/or lower frequencies.” (Lee, [0171]).
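As a hedged illustration of the kind of equation recited in claim 3, the sketch below relates a high frequency, a low frequency, and the IPS measured at each to a single compute-boundedness value. The specific formula (relative IPS gain divided by relative frequency gain) and the function name are assumptions chosen for this sketch; the formula is not taken from the claims, Bircher, Paul, or Lee.

```python
def compute_boundedness(f_high_hz, f_low_hz, ips_high, ips_low):
    """Estimate how strongly IPS tracks clock frequency.

    Assumed formula: (ips_high/ips_low - 1) / (f_high/f_low - 1).
    A value near 1.0 means IPS scales with frequency (compute-bound);
    a value near 0.0 means IPS barely changes (memory-bound).
    """
    if f_high_hz <= f_low_hz:
        raise ValueError("f_high_hz must be greater than f_low_hz")
    if ips_low <= 0:
        raise ValueError("ips_low must be positive")
    freq_gain = f_high_hz / f_low_hz - 1.0   # relative frequency increase
    ips_gain = ips_high / ips_low - 1.0      # relative throughput increase
    return ips_gain / freq_gain
```

With this assumed formula, doubling the frequency from 1 GHz to 2 GHz while IPS also doubles gives 1.0, and the same frequency change with unchanged IPS gives 0.0.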
Regarding claim 4, Bircher and Paul do not teach but Lee teaches wherein determining optimal frequency for the identified region comprises evaluating an equation which relates the compute boundedness parameter and a performance degradation parameter as independent variables to the optimal frequency as a dependent variable, and wherein the performance degradation parameter is indicative of a level of performance degradation relative to a default performance that will be accepted by the regulator in determining the optimal frequency. ([0028], “an intelligent multi-core power management controller for a processor is provided that learns workload characteristics on-the-fly and dynamically adjusts power configurations to provide optimal performance per energy.”, [0137], “ The calculated performance and energy are used to preprocess the best energy efficient configurations, while a performance drop constraint may be enforced to filter out too much performance sacrifice.” and [0171], “Based on the particular model logic implemented (and self-learning performed during processor lifetime), many memory bound and compute bound workloads can realize optimal performance and energy efficiency with fewer active compute resources and/or lower frequencies.”)
Bircher, Paul, and Lee are analogous art. Lee is cited to teach a similar concept of power management related to compute-bound and memory-bound activity. Bircher and Paul teach that both compute-bounded and memory-bounded activity can be optimized for power/performance by adjusting the frequency of the cores/processor based on an evaluation of the regions of the application being executed. Lee teaches that models other than a regression equation can be used to calculate compute-boundedness using a high and a low frequency. Based on Lee, it would have been obvious before the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Bircher and Paul to calculate compute-boundedness using a high and a low frequency with such other models rather than with a regression equation. Furthermore, being able to use other equations improves on Bircher and Paul by allowing the power/performance/heat of the system to be optimized. To one of ordinary skill in the art before the effective filing date of the invention, it would have been advantageous to make this modification because “[b]ased on the particular model logic implemented (and self-learning performed during processor lifetime), many memory bound and compute bound workloads can realize optimal performance and energy efficiency with fewer active compute resources and/or lower frequencies.” (Lee, [0171]).
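The performance drop constraint quoted from Lee [0137] can be pictured with the following minimal sketch: candidate configurations whose performance falls too far below a baseline are filtered out, and the most energy-efficient remaining candidate is chosen. The tuple layout, the function name, and the use of performance per unit energy as the efficiency metric are assumptions for illustration, not Lee's implementation.

```python
def pick_configuration(candidates, baseline_perf, max_drop):
    """Choose the most energy-efficient configuration within a performance drop limit.

    candidates: list of (name, performance, energy) tuples (hypothetical layout).
    max_drop:   accepted fractional performance loss relative to baseline_perf,
                e.g. 0.10 accepts up to a 10% loss.
    """
    floor = baseline_perf * (1.0 - max_drop)
    # Filter out configurations that sacrifice too much performance.
    allowed = [c for c in candidates if c[1] >= floor]
    if not allowed:
        raise ValueError("no configuration satisfies the performance drop constraint")
    # Assumed efficiency metric: performance per unit of energy.
    return max(allowed, key=lambda c: c[1] / c[2])[0]


configs = [("strong", 100.0, 50.0), ("medium", 95.0, 40.0), ("weak", 70.0, 20.0)]
choice_tight = pick_configuration(configs, baseline_perf=100.0, max_drop=0.10)
choice_loose = pick_configuration(configs, baseline_perf=100.0, max_drop=0.50)
```

In this example, a 10% limit excludes the "weak" configuration and selects "medium", while a 50% limit admits and selects "weak".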
Regarding claim 5, Bircher and Paul do not teach but Lee teaches wherein the regulator is configured to receive user input specifying the performance degradation parameter. ([0131], “For a simple classification heuristic model (H1), two parameters may be used to make decisions, e.g., instruction per second (IPS) and memory bandwidth (memBW). IPS is an indicator of core utilization while memBW directly shows memory occupancy. Workloads with low IPS and high memBW are classified as memory-bound, while high IPS and low memBW workloads are classified as compute-bound. The thresholds of high and low are adjustable, e.g., by a user in some embodiments.”)
Bircher, Paul, and Lee are analogous art. Lee is cited to teach a similar concept of power management related to compute-bound and memory-bound activity. Bircher and Paul teach that both compute-bounded and memory-bounded activity can be optimized for power/performance by adjusting the frequency of the cores/processor based on an evaluation of the regions of the application being executed. Lee teaches that models other than a regression equation can be used to calculate compute-boundedness using a high and a low frequency. Based on Lee, it would have been obvious before the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Bircher and Paul to calculate compute-boundedness using a high and a low frequency with such other models rather than with a regression equation. Furthermore, being able to use other equations improves on Bircher and Paul by allowing the power/performance/heat of the system to be optimized. To one of ordinary skill in the art before the effective filing date of the invention, it would have been advantageous to make this modification because “[b]ased on the particular model logic implemented (and self-learning performed during processor lifetime), many memory bound and compute bound workloads can realize optimal performance and energy efficiency with fewer active compute resources and/or lower frequencies.” (Lee, [0171]).
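The simple classification heuristic (H1) quoted from Lee [0131], with its user-adjustable thresholds, can be sketched as follows. The function names and the handling of workloads that fit neither class (left "unclassified" and given the strong baseline configuration) are assumptions for this sketch; the quoted passage does not address that case.

```python
def classify_workload(ips, mem_bw, ips_threshold, mem_bw_threshold):
    """Classify a workload from IPS and memory bandwidth, per the quoted H1 heuristic.

    The two thresholds are supplied by the caller, mirroring the
    user-adjustable thresholds described in the quoted passage.
    """
    high_ips = ips >= ips_threshold
    high_bw = mem_bw >= mem_bw_threshold
    if not high_ips and high_bw:
        return "memory-bound"      # low IPS, high memory bandwidth
    if high_ips and not high_bw:
        return "compute-bound"     # high IPS, low memory bandwidth
    return "unclassified"          # assumption: the quoted text does not cover this case


def power_configuration(label):
    """Map a workload class to a power configuration.

    Weak for memory-bound and strong for compute-bound follow the quoted text;
    using strong (baseline) for unclassified workloads is an assumption.
    """
    return "weak" if label == "memory-bound" else "strong"
```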
As to claim 15, Bircher, Paul, and Lee teach this claim according to the reasoning provided in claim 2.
Claim(s) 6, 13, and 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Bircher and Paul in view of Chidambaram et al. (US 20230195199).
Regarding claim 6, Bircher teaches setting the frequency of the cores/processor based on IPS but does not discuss determining or setting an uncore frequency.
Chidambaram teaches wherein the regulator is configured to determine an optimal uncore frequency for the identified region based on the compute-boundedness parameter thereof; and instruct the given processor to set the uncore frequency of the given processor to the optimal frequency during execution of the identified region. (Fig. 28, [0296], “the core scalability value 2924 may indicate the expected change in performance for a given change of frequency in the core domain, and may be calculated as the ratio between the IPS value 2922 and the frequency change in the core domain. The uncore scalability value 2926 may indicate the expected change in performance for a given change of frequency in the uncore domain, and may be calculated as the ratio between the IPS value 2922 and the frequency change in the uncore domain.”, [0288], “The WL hints may be provided as inputs for determining the uncore and memory frequencies at block 2835 (discussed above). For example, referring to FIG. 26, the processor PMC 2640 may generate WL hints that indicate the workload profile across all processing cores 2620. The WL hints may indicate that a processing core 2620 is core bound, uncore bound, memory bound, or is at an ideal balance between domains.”, and [0308], “The hint H3 may be generated in response to a determination that the workload of the processing core is memory bound (e.g., the total uncore gain is larger than the total core gain, and one of the following is true: there is a high cache miss rate, there is high memory bandwidth, or there is high memory congestion). Accordingly, the hint H3 may be associated with a reduction to the core frequency, a reduction to the uncore frequency, and an increase to the memory frequency.”)
Bircher, Paul, and Chidambaram are analogous art. Chidambaram is cited to teach a similar concept of power management related to compute-bound and memory-bound activity. Bircher teaches that both compute-bounded and memory-bounded activity can be optimized for power/performance by adjusting the frequency of the cores/processor based on an evaluation of the regions of the application being executed. Chidambaram teaches that, in addition to optimizing for compute-bounded and memory-bounded activity, the uncore frequency can be optimized based on the workload/region of the application being executed. Based on Chidambaram, it would have been obvious before the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Bircher and Paul to adjust the uncore frequency based on the compute-boundedness. Furthermore, being able to adjust the uncore frequency based on the compute-boundedness improves on Bircher and Paul by allowing the power/performance of the system to be optimized. To one of ordinary skill in the art before the effective filing date of the invention, it would have been advantageous to make this modification because “The circuitry may determine the frequency budget distribution based at least in part on total core gain and total uncore gain values. In this manner, some embodiments may provide a frequency budget distribution that improves the performance of heterogenous workloads of the cores. Further, some embodiments may allow the frequency budget distribution to be determined with a relatively short response time, and may therefore allow the processor to take advantage of dynamic workloads of short duration.” (Chidambaram, [0042]).
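The memory-bound condition and the associated frequency adjustments quoted from Chidambaram [0308] for hint H3 can be sketched as below. The function name, the boolean inputs, and the dictionary returned are hypothetical; only the condition and the three adjustment directions come from the quoted passage, and no other hint is modeled.

```python
def h3_hint(total_core_gain, total_uncore_gain,
            high_cache_miss_rate, high_memory_bandwidth, high_memory_congestion):
    """Return the H3 frequency adjustments if the workload looks memory bound.

    Returns None when the quoted memory-bound condition is not met.
    """
    memory_pressure = (high_cache_miss_rate
                       or high_memory_bandwidth
                       or high_memory_congestion)
    if total_uncore_gain > total_core_gain and memory_pressure:
        # Quoted H3 response: reduce core and uncore frequency, raise memory frequency.
        return {"core": "decrease", "uncore": "decrease", "memory": "increase"}
    return None
```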
As to claim 16, Bircher, Paul, and Chidambaram teach this claim according to the reasoning provided in claim 6.
Claim(s) 7-9 and 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Bircher and Paul in view of Bodas et al. (US 20170285717).
Regarding claim 7, Bircher and Paul teach executing application regions but do not specifically teach using MPI calls during the execution of the application regions.
Bodas teaches wherein the regulator is configured to determine if the identified region is of a Message Passing Interface (MPI) call type and, in response to determining the identified region is of the MPI call type, determine an expected duration of the identified region MPI call. ([0010], “inter-core messaging unit 104 adheres to the message passing interface (MPI) protocol. When core 102-1 reaches a synchronization point, it calls a wait routine. For example, it may call MPI-wait from the inter-core messaging unit 104. In one embodiment, responsive to the call of the wait messaging routine, the power management agent 114-1 transitions core 102-1 into a lower power state. This may take the form of reducing core and/or its power domain power by employing whatever applicable power saving technology such as DVFS (dynamic voltage frequency scaling), gating, parking, offlining, throttling, non-active states, or standby states.”)
Bircher, Paul, and Bodas are analogous art. Bodas is cited to teach a similar concept of power management. Bircher teaches that both compute and memory bounded activity can be optimized for power/performance by adjusting the frequency of cores/processor based on the evaluation of regions of applications being executed. Bodas teaches that, based on an MPI-wait, the core may be transitioned into a low power state (specifically reducing the frequency). Based on Bodas, it would have been obvious before the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Bircher and Paul to adjust the core frequency on an indication of an MPI-wait. Furthermore, being able to reduce the core’s frequency based on the MPI-wait improves on Bircher and Paul by being able to improve the power/performance of the system. To one of ordinary skill in the art before the effective filing date of the invention it would have been advantageous to make this modification to improve the power/performance of the system.
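For illustration only, the behavior quoted from Bodas at [0010], in which a call to a wait routine causes a power management agent to transition the calling core into a lower power state, can be sketched as follows. This is a simplified model under stated assumptions and not the reference's implementation: the classes `Core` and `PowerManagementAgent`, the function `mpi_wait`, and the frequency values are hypothetical, and frequency reduction stands in for the DVFS option named in the quotation.

```python
class Core:
    """Hypothetical model of a core with a single adjustable frequency."""

    def __init__(self, core_id: int, freq_mhz: int):
        self.core_id = core_id
        self.freq_mhz = freq_mhz


class PowerManagementAgent:
    """Hypothetical agent that lowers a core's frequency when it waits."""

    def __init__(self, low_freq_mhz: int):
        self.low_freq_mhz = low_freq_mhz

    def on_wait_called(self, core: Core) -> None:
        # Only ever lower the frequency; a core already at or below the
        # low-power frequency is left unchanged.
        core.freq_mhz = min(core.freq_mhz, self.low_freq_mhz)


def mpi_wait(core: Core, agent: PowerManagementAgent) -> None:
    # Stand-in for the wait routine called at a synchronization point; the
    # actual message-passing wait is omitted. Only the power transition that
    # is responsive to the call is modeled.
    agent.on_wait_called(core)


core = Core(core_id=1, freq_mhz=2400)
mpi_wait(core, PowerManagementAgent(low_freq_mhz=1200))
```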
As to claim 17, Bircher, Paul, and Bodas teach these claims according to the reasoning provided in claim 7.
Claim(s) 8-9 is/are rejected under 35 U.S.C. 103 as being unpatentable over Bircher, Paul, and Bodas further in view of Arora et al. (US 20140380329)
Regarding claim 8, Bircher, Paul, and Bodas do not teach but Arora teaches wherein the regulator is configured to, if the identified region is of the MPI call type and the expected duration thereof will not exceed a specified threshold; omit the measuring of the IPS, the determining of the compute-boundedness parameter, the determining of the optimal frequency, and the instructing of the given processor to set the frequency to the optimal frequency and instruct the given processor to set the frequency of the given processor to a predetermined high frequency. (Fig. 4, [0022], “"Compute-boundedness" indicates whether the processor is performing computations or not. Thus, compute-bound of work load exhausts computational resources. As an example, a percentage (e.g., 50%) of peak instructions per cycle (IPC) may be used as a predetermined threshold to determine whether a significant benefit is derived by sprinting a new workload without exhausting the remaining thermal capacity of the processor.”, [0023], “The processor may be able to support more available end users, or boost the frequency of different units within the processor because it is bound by the availability of those units.”, and [0002], “ method and apparatus are described for determining when to sprint a multi-core processor, (e.g., when a processor having sufficient thermal capacity should be run at a higher frequency (e.g., 5 or 10 GHZ, rather than 2 GHZ)), and how long to sprint the processor.”)
Bircher, Paul, Bodas, and Arora are analogous art. Arora is cited to teach a similar concept of system management related to compute bound and memory bound activity. Bircher and Paul teach that both compute and memory bounded activity can be optimized for power/performance by adjusting the frequency of cores/processor based on the evaluation of regions of applications being executed. Arora teaches using compute boundedness and thermal availability to determine whether to run the system at a high frequency (i.e. sprinting). Based on Arora, it would have been obvious before the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Bircher, Paul and Bodas to determine whether the system is compute bound using a threshold and, when the system is compute bound, set the processor(s) to a high frequency to enable sprinting. Furthermore, being able to sprint the processor improves on Bircher, Paul, and Bodas by being able to improve the system’s performance when compute bound by increasing the frequency of the processor while not exceeding the thermal limitations. To one of ordinary skill in the art before the effective filing date of the invention it would have been advantageous to make this modification because “A method and apparatus for making efficient sprinting decisions in a multi-core processor would be desirable in order to improve energy efficiency and sprinting effectiveness.”, [0004]
Regarding claim 9, Paul teaches wherein the regulator is configured to, if the identified region is of the MPI call type and the expected duration thereof will exceed the specified threshold, set the frequency of the given processor according to the determined optimal frequency for the region. ([0041], “The thermal impact predictor 405 may also receive input 416 indicating characteristics of the application or application phase. In some embodiments, the application characteristics include information indicating whether the application or application phase is computationally intensive or memory bounded.” and [0025], “Some embodiments of the SMU 130 may be used to manage thermal and power conditions in the processing device 100 according to policies set by the operating system and using information that may be provided to the SMU 130 by the operating system, such as a thermal history associated with an application being executed by one of the components of the processing device 100, thermal sensitivities of the components, and a layout of the components in the processing device 100, as discussed herein. The SMU 130 may therefore be able to control power supplied to entities such as the processor cores 101-112 or the GPU 120, as well as adjusting operating points of the processor cores 101-112 or the GPU 120, e.g., by changing an operating frequency or an operating voltage supplied to the processor cores 101-112 or the GPU 120.”)
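For illustration only, the branching recited in claims 8 and 9 as quoted above can be sketched as a single selection function. This is a minimal sketch of the claim language and not an implementation from the application or any cited reference: the name `select_frequency`, its parameters, the callable `determine_optimal_freq` (standing in for the measuring of the IPS, the determining of the compute-boundedness parameter, and the determining of the optimal frequency), and the numeric values are all assumptions.

```python
from typing import Callable


def select_frequency(
    is_mpi_call: bool,
    expected_duration_s: float,
    threshold_s: float,
    high_freq_mhz: int,
    determine_optimal_freq: Callable[[], int],
) -> int:
    # Claim 8 branch: an MPI-call region whose expected duration will not
    # exceed the threshold omits the measurement steps entirely and uses the
    # predetermined high frequency.
    if is_mpi_call and expected_duration_s <= threshold_s:
        return high_freq_mhz
    # Claim 9 branch (and, by assumption, non-MPI regions): run the
    # measurement steps and use the determined optimal frequency.
    return determine_optimal_freq()


calls = []


def measure_and_determine() -> int:
    # Hypothetical placeholder for the omitted or performed measurement steps.
    calls.append("measured")
    return 1800


short_mpi = select_frequency(True, 0.001, 0.010, 3000, measure_and_determine)
long_mpi = select_frequency(True, 0.500, 0.010, 3000, measure_and_determine)
```

In this sketch the short MPI region never invokes `measure_and_determine`, which mirrors the "omit" language of claim 8, while the long MPI region invokes it once.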
Response to Arguments
Applicant’s arguments with respect to claim(s) 1, 14, and 20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHERI L. HARRINGTON whose telephone number is (571)270-0468. The examiner can normally be reached M-F, 7:30a-4p.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jaweed Abbaszadeh, can be reached at 571-270-1640. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CHERI L HARRINGTON/ Examiner, Art Unit 2176
February 2, 2026
/JAWEED A ABBASZADEH/ Supervisory Patent Examiner, Art Unit 2176