The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .
DETAILED ACTION
Specification
The title of the invention is not descriptive. A new title is required that is clearly indicative of the invention to which the claims are directed. The following title is suggested: SYSTEM CACHE MANAGEMENT USING OBSERVER SETS ACROSS MULTI-LEVEL CACHES IN HETEROGENOUS MULTI-CORE PROCESSORS.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103(a) are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-7, 9-19 are rejected under 35 U.S.C. 103 as being unpatentable over Heirman (US PGPUB # 20190303294) in view of Tune (US PGPUB # 20230418765). With respect to independent claims 1, 14 Heirman/Tune discloses: An apparatus [Heirman fig 2] comprising:
a system cache to store one or more cachelines to be evicted from a processor cache [LLC plus idle core MLC is/are system level cache storing cache lines evicted from CPU-private caches MLC - Heirman 0022-0023, 0025-0026, fig 2]; and
logic circuitry to determine whether to store the one or more cachelines in the system cache based at least in part on comparison of a threshold value with a hit rate associated with the one or more cachelines [at least one LLC slice (e.g., slice 0) is extended to keep a counter of the total number of MLC misses, and a counter of how many times an MLC miss was satisfied by one of the unused cores' MLCs. Each time Algorithm 1 is run (e.g., each time a core wakes up or goes to sleep), these counters are read to compute an extended cache hit ratio … when the extended cache hit ratio is above a certain threshold (e.g. 10%), the MLC of unused cores remain active. In another embodiment, when the extended cache hit ratio is below the threshold, the extended cache is disabled. To ensure a hit ratio can be measured in the coming time interval, and potentially decide to enable the extended cache again - Heirman 0075-0076],
wherein the hit rate is to be determined based on re-use behavior of an observer set of data [Another approach can be to provide “set-duelling” (observer vs non-observer set) hardware which uses a first replacement policy for a first group of sets of entries in a set-associative cache, uses a second replacement policy for a second group of sets of entries, and monitors cache hit rate or other performance indicators for the first and second groups of sets, to determine which of the first and second replacement policies is performing better. The better-performing replacement policy is then used for remaining sets of the cache – Tune 0032].
Heirman does not explicitly disclose wherein the hit rate is to be determined based on re-use behavior of an observer set of data. Nevertheless in the same field of endeavor, Tune teaches means for cache replacement control wherein an approach can be to provide “set-duelling” hardware which uses a first replacement policy for a first group of sets of entries in a set-associative cache, uses a second replacement policy for a second group of sets of entries, and monitors cache hit rate or other performance indicators for the first and second groups of sets, to determine which of the first and second replacement policies is performing better. The better-performing replacement policy is then used for remaining sets of the cache (Tune 0032).
Therefore, Heirman/Tune teaches all limitations of the instant claim(s).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to determine the hit rate based on re-use behavior of an observer set of data in the invention of Heirman as taught by Tune because it would be advantageous for optimizing and controlling efficiency of cache replacement (Tune 0030-0032).
With respect to dependent claims 2, 15 Heirman/Tune discloses wherein the logic circuitry is to apply the hit rate from the observer set of data to a plurality of follower sets of data [The better-performing replacement policy is then used for remaining sets (follower sets) of the cache - Tune 0030-0032].
With respect to dependent claims 3, 16 Heirman/Tune discloses wherein the processor cache is one of a Mid-Level Cache (MLC) and a Last Level Cache (LLC) [Heirman 0022, fig 2].
With respect to dependent claims 4, 17 Heirman/Tune discloses wherein the one or more cachelines are to be evicted from the LLC in response to a determination that the LLC lacks capacity to store the one or more cachelines [Once in the LLC, using (pseudo) Least Recently Used (LRU) replacement, each LLC set can be seen as a first in, first out (FIFO) queue that accepts new lines at the front, pushing existing lines as they age towards the back of the queue until they fall out and an eviction occurs—or until they receive a hit in which case the line is removed from the LLC (and installed in the requesting core's FLC and/or MLC) – Heirman 0072; cache controller 303 determines that cache line B is a victim from SF/LLC 304 to make space for cache line A (prior to cache line A being stored in SF/LLC 304 - Heirman 0061].
With respect to dependent claims 5, 18 Heirman/Tune discloses wherein the one or more cachelines are to be evicted from the MLC based at least in part on a determination that the one or more cachelines have a probability of re-use from the LLC below a certain threshold [victim cache entry is selected based on re-reference interval prediction (RRIP) values for a candidate set of cache entries. The RRIP value for a given cache entry is indicative of a relative priority with which the given cache entry is to be selected as the victim cache entry - Tune abstract; each RRIP value 38 has two bits and therefore there are four different RRIP values possible for each cache entry. FIG. 3 is a state diagram showing the transitions between RRIP values which may occur for a particular cache entry. In this example, the four states possible for the RRIP value of a particular cache entry are: RRIPV=0, RRIPV=1, RRIPV=2 and RRIPV=3 (where RRIPV=0 indicates the lowest priority for eviction and RRIPV=3 indicates the highest priority for eviction). It will be appreciated that other examples could have RRIP values with a larger number of bits so that there are more than four states. Also, other examples may represent the highest priority for eviction using the lowest numeric value of the RRIP value 38, and represent the lowest priority for eviction using the RRIP value with the highest numeric value – Tune 0093].
With respect to dependent claims 6 Heirman/Tune discloses wherein the one or more cachelines are to be evicted from the MLC based at least in part on a determination that the one or more cachelines have a probability of re-use from the LLC below a certain threshold and are not stored in the LLC [miss, no allocation Tune 0096, fig 3] [evictions into LLC 250 may be predicated on some algorithm that may decide to drop some block altogether, such as a dead block predictor – Heirman 0033].
With respect to dependent claims 7, 19 Heirman/Tune discloses wherein the one or more cachelines are to be dropped at the LLC in response to a determination that the number of requests received at the LLC exceed a select threshold value [evictions into LLC 250 may be predicated on some algorithm that may decide to drop some block altogether, such as a dead block predictor – Heirman 0033; target entries corresponding to core C are filled in as normal while the other entries (pertaining to unused cores that are fully disabled) are filled with a value that, when selected by the LLC slice, cause the victim cache line to be dropped (or written back to DRAM if dirty) rather than being sent to a now fully disabled MLC - Heirman 0077] [Heirman’s drop functionality in view of performance-driven configuration as disclosed in 0035 of Tune teaches a broadest reasonable interpretation of dropping requests during high traffic].
With respect to dependent claims 9 Heirman/Tune discloses a processor, having a plurality of processor cores, coupled to the processor cache, wherein the plurality of processor cores comprise one or more efficiency-oriented processor cores and one or more performance-oriented processor cores [Tune fig 8].
With respect to dependent claims 10 Heirman/Tune discloses wherein the system cache is to store the one or more cachelines from the processor cache coupled to the one or more performance-oriented processor cores [Tune fig 8].
With respect to dependent claims 11 Heirman/Tune discloses wherein the system cache is to be store cachelines for both the one or more efficiency-oriented processor cores and the one or more performance- oriented processor cores [Tune fig 8].
With respect to dependent claims 12 Heirman/Tune discloses wherein the logic circuitry is to be coupled between the system cache and a main memory fabric [Heirman fig 1-2].
With respect to dependent claims 13 Heirman/Tune discloses wherein a System on Chip comprises the logic circuitry, the processor cache, and the system cache [Heirman fig 1-2, 10].
Claims 8, 20 are rejected under 35 U.S.C. 103 as being unpatentable over Heirman/Tune further in view of Pugsley (US PGPUB # 20180095895).
With respect to dependent claims 8, 20 Heirman/Tune does not explicitly teach the limitations of the instant claim. Nevertheless in the same field of endeavor Pugsley teaches means for cache replacement using conservative set dueling and non-overlapping (mutually exclusive) sets (Pugsley 0150, 0157, 0161), so that the combination of Heirman/Tune/Pugsley teaches wherein an observer set to be used for the LLC is to be mutually exclusive from any observer set that is to be used for the MLC [The better-performing replacement policy is then used for remaining sets (follower sets) of the cache - Tune 0030-0032] [non-overlapping (mutually exclusive) sets - Pugsley 0150, 0157, 0161]. It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to use mutually exclusive sets in the invention of Heirman/Tune as taught by Pugsley because it would be advantageous for optimizing selection of cache replacement policies (Pugsley 0150. 0157. 0161. 0190).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Hady US PGPUB # 20160188466 teaches a multi-core processor providing heterogeneous processor cores and a shared cache. The processor cores 20 include heterogeneous cores, that is, architecturally different processor cores (or types of processor cores). For example, the processor cores 20 may include one or more special purpose processor cores and/or at least one central processing unit (CPU) core.
YIN US PGPUB # 20210109861 teaches: A cache of a processor includes a cache controller to implement a cache management policy for the insertion and replacement of cache lines of the cache. The cache management policy assigns replacement priority levels to each cache line of at least a subset of cache lines in a region of the cache based on a comparison of a number of accesses to a cache set having a way that stores a cache line since the cache line was last accessed to a reuse distance determined for the region of the cache, wherein the reuse distance represents an average number of accesses to a given cache set of the region between accesses to any given cache line of the cache set.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARWAN AYASH whose telephone number is (571)270-1179. The examiner can normally be reached 9a-530p M-R.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Rocio del Mar Perez-Velez can be reached on 571-270-5935. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Marwan Ayash/Examiner, Art Unit 2133
/ROCIO DEL MAR PEREZ-VELEZ/Supervisory Patent Examiner, Art Unit 2133