DETAILED ACTION
1. This Office Action is taken in response to Applicants’ Amendments and Remarks filed on 4/3/2026 regarding application 18/637,710 filed on 4/17/2024.
Claims 1-20 are pending for consideration.
2. Response to Amendments and Remarks
Applicants’ amendments and remarks have been fully and carefully considered, with the Examiner’s response set forth below.
(1) In response to the amendments and remarks, an updated claim analysis has been made, with additional, newly identified reference(s). Refer to the corresponding sections of the following Office Action for details.
3. Examiner’s Note
(1) In the case of amending the Claimed invention, Applicant is respectfully requested to indicate the portion(s) of the specification which dictate(s) the structure relied on for proper interpretation and also to verify and ascertain the metes and bounds of the claimed invention. This will assist in expediting compact prosecution. MPEP 714.02 recites: “Applicant should also specifically point out the support for any amendments made to the disclosure. See MPEP § 2163.06. An amendment which does not comply with the provisions of 37 CFR 1.121(b), (c), (d), and (h) may be held not fully responsive. See MPEP § 714.” Amendments not pointing to specific support in the disclosure may be deemed as not complying with provisions of 37 C.F.R. 1.131(b), (c), (d), and (h) and therefore held not fully responsive. Generic statements such as “Applicants believe no new matter has been introduced” may be deemed insufficient.
(2) Examiner has cited particular columns/paragraph and line numbers in the references applied to the claims above for the convenience of the applicant. Although the specified citations are representative of the teachings of the art and are applied to specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested from the applicant in preparing responses, to fully consider the references in entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner.
Double Patenting
4. The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the "right to exclude" granted by a patent and to prevent possible harassment by multiple assignees. See In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and, In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the conflicting application or patent is shown to be commonly owned with this application. See 37 CFR 1.130(b).
Effective January 1, 1994, a registered attorney or agent of record may sign a terminal disclaimer. A terminal disclaimer signed by the assignee must fully comply with 37 CFR 3.73(b).
5. Claims 1-20 are provisionally rejected under the judicially created doctrine of obvious-type double patenting as being unpatentable over independent claims 1-22 of US Patent Application 18/748,359. Although not all of the conflicting claims are exactly identical, they are extremely similar and are not patentably distinct from each other as shown in the example below:
18/637,710
18/748,359
A method for performing adaptive prefetch in a memory system, the method comprising: receiving a first data request including a first address of a first data page stored in a memory device; and performing a lookup operation using an adaptive prefetch table, to determine whether a prefetch operation should be performed based on the received first data request, wherein the adaptive prefetch table includes an entry for the first data page corresponding to the received first data request, and wherein the entry for the first data page includes an address of a predicted next page, and a weight factor of the predicted next page
1. A method for performing early prefetch in a memory system, the method comprising: receiving a data request including an address of a data page stored in a memory device; storing the data request at a first position in a request queue; accessing the data request stored at the first position and performing a first lookup operation for the data request; updating the device request queue, which includes moving the data request stored at the first position to a second position in the request queue; and accessing the data request stored at the second position and performing a second lookup operation for the data request.
The method of claim 1, wherein performing the first lookup operation for the data request comprises: generating a hash value of the address of the data page; and searching for the hash value in a hash table including a plurality of hash values corresponding to data stored in a device cache.
The method of claim 2, further comprising: generating a request for data at the address of the data page stored in the memory device, based on a search miss for the hash value in the hash table; and sending the request to the memory device.
The method of claim 3, further comprising: receiving, from the memory device, the data at the address of the data page stored in the memory device; and storing the received data in the device cache.
The method of claim 3, further comprising sending the address of the data page to an adaptive prefetch table, wherein the adaptive prefetch table includes an entry for the data page, and wherein with the entry for the data page includes an address of a predicted next page, and a weight factor of the predicted next page.
The method of claim 5, further comprising performing a third lookup operation using the adaptive prefetch table, to determine whether a prefetch operation should be performed based on the data request.
The method of claim 6, further comprising performing the prefetch operation for the predicted next page, based on the weight factor of the predicted next page being greater than or equal to a prefetch threshold.
The method of claim 7, further comprising: generating a prefetch data request for the address of the predicted next page; and sending the prefetch data request to the memory device.
The method of claim 8, further comprising: receiving, from the memory device, data for the address of the predicted next page; and storing the received data in a prefetch cache.
The method of claim 1, wherein performing the second lookup operation for the data request comprises performing a device cache lookup for the data request.
The method of claim 1, wherein the first position in the request queue is located at a tail of the request queue, and wherein the second position in the request queue is located at a head of the request queue.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
6. Claim 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Mashimo et al. (US Patent 20230376420, hereinafter Mashimo), and in view of
Al-Sukhni al. (US Patent 20060248279, hereinafter Al-Sukhni)
As to claim 1, Mashimo teaches A method for performing adaptive prefetch in a memory system [A method includes recording a first set of consecutive memory access deltas, where each of the consecutive memory access deltas represents a difference between two memory addresses accessed by an application, updating values in a prefetch training table based on the first set of memory access deltas, and predicting one or more memory addresses for prefetching responsive to a second set of consecutive memory access deltas and based on values in the prefetch training table (abstract); Al-Sukhni also teaches this limitation – memory system as shown in figure 2], the method comprising:
receiving a first data request including a first address of a first data page stored in a memory device [as shown in figures 4A, 5A, 6A, and 7A of the Mashimo reference, each entry of the LPB (303) corresponds to a “first data page,” with the region ID (404) corresponds to the page address of the first data page, and each of the deltas (412-415) corresponding to the delta address of a candidate of predicted next page with respect to the current data page, ”including” the actual predicted next page;
as shown in figure 4A, where the last memory address accessed (403) and the associated page address (401) are recorded; … Memory access patterns are represented as sequences of deltas, where a delta is the difference between two consecutively generated memory addresses to the same memory region (e.g., a memory page). The prefetcher tracks memory access patterns per memory region (e.g. 4 KB memory pages) independent of the instructions that generate the memory accesses … (¶ 0020-0021); As previously described, the LPB 303 contains multiple entries, each including a region identifier 401, the last memory address 402 accessed in the identified region, and a set of n+1 memory access delta values captured in the region. Lookups in the LPB 303 are performed using the region ID 401, such as a page identifier of a cache miss demand request address (¶ 0063); Al-Sukhni also teaches this limitation – During processor operations, load store unit 207 generates a data address information 150 for the requested data for an instruction. Each generated address makes up an address of the data address stream. In one embodiment, load store unit 207 accesses the L1 cache 213 to determine if the data is present. If the data is not present in the L1 cache 213, load store unit 207 will request the data from the L2 cache (or other cache levels in other embodiments). If the data is not present in any of the caches, then load store unit 207 requests the data from memory 221 via BIU 212. If the data is not present in memory 221, then a request for the data from another memory (e.g. hard disk drive, CD drive etc.) is generated e.g. via an exception or interrupt. Load store unit 207 may perform other operations in data retrieval not described herein. The accessed data is then forwarded to the execution units and/or written to the appropriate cache (¶ 0030)]; and
performing a lookup operation using an adaptive prefetch table [as shown in figures 4A, 5A, 6A, and 7A of the Mashimo reference, each entry of the LPB (303) corresponds to a “first data page,” with the region ID (404) corresponds to the page address of the first data page, and each of the deltas (412-415) corresponding to the delta address of a candidate of predicted next page with respect to the current data page, ”including” the actual predicted next page; as shown in figure 3, where the prefetcher (300) includes a prefetch training table, figure 3, 305, a local pattern buffer (303), and a correlation table (304); At block 451, the prefetch logic 301 receives a virtual memory address that is generated by the processor core 230 when executing the application 232 (e.g., a L1 data cache miss demand address). In an alternative embodiment, a physical memory address is used instead of a virtual memory address. The prefetcher 300 performs region-local classification; accordingly, a lookup is performed in the LPB 303 using the region identifier 401. Since the region is a 4 KB memory page, the region ID 401 is the page number address for the received memory address. The region ID 401 is used to perform a lookup in the LPB 303, as provided at block 453 … (¶ 0043-0046); At block 463, if the lookup of the hash in the COT 304 results in a hit, the process 450 continues at block 467. At block 467, the hash engine 302 calculates a hash of the most recently saved n−m delta values in the LPB 303, and the prefetch logic 301 performs a lookup of the hash in the prefetch training table 305, which is implemented as PHT 410 … (¶ 0051); At block 509, the n−m deltas 412-415 in the PHT 410 entry and the address of the most recent memory access are used to calculate n-m addresses for prefetching. The n-m prefetch requests are then issued to the memory system for the calculated addresses. The n-m addresses for prefetching are calculated sequentially, since each of the delta values is calculated based on the previous memory address. From block 509, the process 500 returns to block 451. The process 500 thus repeats during execution of the application 232 to determine for each memory access whether the most recent pattern of delta values matches a previously recorded pattern, and to predict addresses for prefetching based on the matching pattern (¶ 0060); In the method, the values selected from the prefetch training table include one or more of the first set of consecutive memory access deltas. The method also includes predicting one or more prefetch addresses based on the selected values and a memory address of a most recent memory access by the application. In the method, the prefetch training table includes a correlation weight table, and the values from the prefetch training table represent weights in the correlation weight table … (¶ 0093-0098); Al-Sukhni also teaches a prefetch unit – prefetch unit, figure 2, 209 , and figure 3, 209; In the embodiments described above, prefetch unit 209 requests translation information from a TLB. However, in other embodiments, translation information may be obtained from other types of translation sources in a data processing system. For example, in other embodiments, the translation information requested by the prefetch unit may be obtained from a page table in memory 221 (¶ 0090)], to determine whether a prefetch operation should be performed based on the received first data request [At block 509, the n−m deltas 412-415 in the PHT 410 entry and the address of the most recent memory access are used to calculate n-m addresses for prefetching. The n-m prefetch requests are then issued to the memory system for the calculated addresses. The n-m addresses for prefetching are calculated sequentially, since each of the delta values is calculated based on the previous memory address. From block 509, the process 500 returns to block 451. The process 500 thus repeats during execution of the application 232 to determine for each memory access whether the most recent pattern of delta values matches a previously recorded pattern, and to predict addresses for prefetching based on the matching pattern (¶ 0060); as shown in figure 7B, steps 763 and 765; FIG. 7B illustrates the prefetch prediction process 750, according to an embodiment. The process 750 begins at block 451. At block 451, a new physical or virtual memory address being accessed is sent to the prefetcher 300. At block 453, the region ID 401 (e.g., the page address of the received memory address) is used to perform a lookup in the LPB 303 … If the highest accumulated weight value in the SRV 306 does not exceed a minimum accumulated weight threshold, then at 763 no prefetch is performed and the process 750 returns to block 451. If the highest accumulated weight value does exceed the minimum accumulated weight threshold, then at block 765, the prefetch logic 301 calculates an address for prefetching by adding the most likely future access delta 701 (i.e., the index of the highest valued register in the SRV 306) to the most recently accessed memory address 402. A prefetch request is issued for the calculated prefetch address (¶ 0078-0083)], wherein the adaptive prefetch table includes an entry for the first data page corresponding to the received first data request [as shown in figures 4A, 5A, 6A, and 7A of the Mashimo reference, each entry of the LPB (303) corresponds to a “first data page,” with the region ID (404) corresponds to the page address of the first data page, and each of the deltas (412-415) corresponding to the delta address of a candidate of predicted next page with respect to the current data page, ”including” the actual predicted next page; as shown in figure 4A, where the last memory address accessed (403) and the associated page address (401) are recorded; … Memory access patterns are represented as sequences of deltas, where a delta is the difference between two consecutively generated memory addresses to the same memory region (e.g., a memory page). The prefetcher tracks memory access patterns per memory region (e.g. 4 KB memory pages) independent of the instructions that generate the memory accesses … (¶ 0020-0021); At block 509, the n−m deltas 412-415 in the PHT 410 entry and the address of the most recent memory access are used to calculate n-m addresses for prefetching. The n-m prefetch requests are then issued to the memory system for the calculated addresses. The n-m addresses for prefetching are calculated sequentially, since each of the delta values is calculated based on the previous memory address. From block 509, the process 500 returns to block 451. The process 500 thus repeats during execution of the application 232 to determine for each memory access whether the most recent pattern of delta values matches a previously recorded pattern, and to predict addresses for prefetching based on the matching pattern (¶ 0060); As previously described, the LPB 303 contains multiple entries, each including a region identifier 401, the last memory address 402 accessed in the identified region, and a set of n+1 memory access delta values captured in the region. Lookups in the LPB 303 are performed using the region ID 401, such as a page identifier of a cache miss demand request address (¶ 0063)], and
wherein the entry for the first data page includes an address of a predicted next page, and a weight factor of the predicted next page [Mashimo teaches predicting prefetch address using a Page-Local Delta-based Prefetcher (PLDP), with a weight factor – as shown in figures 4A, 5A, 6A, and 7A of the Mashimo reference, each entry of the LPB (303) corresponds to a “first data page,” with the region ID (404) corresponds to the page address of the first data page, and each of the deltas (412-415) corresponding to the delta address of a candidate of predicted next page with respect to the current data page, ”including” the actual predicted next page; In one embodiment, the prefetcher tracks memory access patterns per memory page, and is thus referred to as a Page-Local Delta-based Prefetcher (PLDP). Delta sequences are recorded per memory region because this address partitioning when searching for patterns in memory traffic provides the highest coverage of memory traffic across a variety of workloads. While the embodiments in the following description operate with 4 KB memory pages, alternative embodiments operate with other region sizes in either the physical or virtual memory address space. In one embodiment, a Multiple-Distance Correlation data Prefetcher (MDCP) is a delta-based prefetcher that records multiple memory access delta patterns per memory region and predicts future access patterns by, for each current delta value, incrementing weight values corresponding to each of multiple preceding delta values observed and their distances from the current delta value in the delta sequence. Given a current set of delta values, an address for prefetching is predicted by accumulating the weights for each of multiple possible next delta values, where the weights to be accumulated are selected based on the current delta values and their respective distances in the delta sequence from the delta value being predicted (¶ 0020-0021);
Al-Sukhni more expressively teaches the aspect of “predicted next page,” where the corresponding “weight factor” is the “confidence level/value” -- Prefetching across a page boundary in a data processing system. The system determines whether a prefetch will cross a page boundary of memory, and if so, it determines whether a translation source has an entry corresponding to the virtual address of the prefetch. If the translation source has an entry corresponding the virtual address, a physical address of the virtual address is used to prefetch the information (abstract); Accessing data from memory 221 (and the L2 cache) requires multiple cycles as compared to accessing data from L1 cache. Accordingly, it may be desirable to prefetch the data. In one embodiment, prefetching includes predicting the address of the data to be accessed in subsequent processing operations of the processor pipeline and retrieving such data before needed for those processing operations. In some embodiments, the retrieved data is written to L1 cache 213 (¶ 0031); Each prefetch engine (e.g. 311) attempts to detect a strided stream in the data addresses associated with its allocated stream identifier. In one embodiment, each prefetch engine includes a detection circuit 316 that detects strided patterns by comparing consecutive data access addresses associated with the allocated stream identifier. The prediction circuit 321, generates predicted addresses of future data accesses of the strided stream based on the detected stride from detection circuit 316 and the current data address. The predictions are provided to the select prefetch engine circuit 309, wherein the predictions are provided to LSU 207 via PFQ 210. LSU 207 uses the predictions to prefetch the data and store the data in L1 cache 213. In other embodiments, the prefetched data may be stored in other memory locations or temporary buffers. In the embodiment shown, each prefetch engine includes a confidence circuit 317 that generates a confidence value for the stream identifier being tracked by the prefetch engine. In one embodiment, the confidence circuit 317 monitors the success of the prefetches associated with the stream identifier and adjusts the confidence level based on factors e.g. as to whether the prefetched data is used, whether there was a load miss, or whether there was a prefetch evict. In one embodiment, confidence level may have 5 active levels where the highest level is indicative of 4 consecutive prefetch uses (See FIG. 5). In one embodiment, the confidence level is increased with every used data prefetch up to maximum confidence level and decremented with each load miss or prefetch evict. In one embodiment, a confidence level of zero would indicate an unallocated prefetch engine … (¶ 0045-0047); However, when the predicted prefetch physical address crosses a memory page boundary, the physical address of the next page in memory is unknown to the prefetch engine (¶ 0080)]; and
wherein the weight factor of the predicted next page is an integer value that is increased by 1 in response to the predicted next page being requested in a next data request, and decreased by 1 in response to the predicted next page not being requested in the next data request [this limitation is taught by Al-Sukhni – as shown in figure 5, where the confidence level (which is the corresponding “weight factor”) is incremented by 1 level when there is a “prefetch hit;” and decremented by one level when there is a “load miss” or “PE eviction;” In the embodiment shown, each prefetch engine includes a confidence circuit 317 that generates a confidence value for the stream identifier being tracked by the prefetch engine. In one embodiment, the confidence circuit 317 monitors the success of the prefetches associated with the stream identifier and adjusts the confidence level based on factors e.g. as to whether the prefetched data is used, whether there was a load miss, or whether there was a prefetch evict. In one embodiment, confidence level may have 5 active levels where the highest level is indicative of 4 consecutive prefetch uses (See FIG. 5). In one embodiment, the confidence level is increased with every used data prefetch up to maximum confidence level and decremented with each load miss or prefetch evict. In one embodiment, a confidence level of zero would indicate an unallocated prefetch engine (¶ 0046); In one embodiment, confidence circuit 317 determines the confidence of the prefetching by implementing a confidence level scheme for a stream identifier. In one embodiment, the confidence level is adjusted based on a number of factors such as an indication whether a prefetch was used by the processor (PF used), whether there was a load miss (e.g. whether a load instruction associated with the stream identifier had an address that was not in the L1 cache) … (¶ 0059)].
Regarding claim 1, Mashimo teaches predicting prefetch address using a Page-Local Delta-based Prefetcher (PLDP), with a weight factor [In one embodiment, the prefetcher tracks memory access patterns per memory page, and is thus referred to as a Page-Local Delta-based Prefetcher (PLDP). Delta sequences are recorded per memory region because this address partitioning when searching for patterns in memory traffic provides the highest coverage of memory traffic across a variety of workloads. While the embodiments in the following description operate with 4 KB memory pages, alternative embodiments operate with other region sizes in either the physical or virtual memory address space. In one embodiment, a Multiple-Distance Correlation data Prefetcher (MDCP) is a delta-based prefetcher that records multiple memory access delta patterns per memory region and predicts future access patterns by, for each current delta value, incrementing weight values corresponding to each of multiple preceding delta values observed and their distances from the current delta value in the delta sequence. Given a current set of delta values, an address for prefetching is predicted by accumulating the weights for each of multiple possible next delta values, where the weights to be accumulated are selected based on the current delta values and their respective distances in the delta sequence from the delta value being predicted (¶ 0020-0021)], but does not teach predicting an address of a next page, and incrementing or decrementing a weight factor of the predicted next page, depending whether the predicted next page is accessed.
However, Al-Sukhni teaches predicting an address of a next page, and incrementing or decrementing a weight factor of the predicted next page, depending whether the predicted next page is accessed [the corresponding “weight factor” is the “confidence level/value” -- Prefetching across a page boundary in a data processing system. The system determines whether a prefetch will cross a page boundary of memory, and if so, it determines whether a translation source has an entry corresponding to the virtual address of the prefetch. If the translation source has an entry corresponding the virtual address, a physical address of the virtual address is used to prefetch the information (abstract); Accessing data from memory 221 (and the L2 cache) requires multiple cycles as compared to accessing data from L1 cache. Accordingly, it may be desirable to prefetch the data. In one embodiment, prefetching includes predicting the address of the data to be accessed in subsequent processing operations of the processor pipeline and retrieving such data before needed for those processing operations. In some embodiments, the retrieved data is written to L1 cache 213 (¶ 0031); Each prefetch engine (e.g. 311) attempts to detect a strided stream in the data addresses associated with its allocated stream identifier. In one embodiment, each prefetch engine includes a detection circuit 316 that detects strided patterns by comparing consecutive data access addresses associated with the allocated stream identifier. The prediction circuit 321, generates predicted addresses of future data accesses of the strided stream based on the detected stride from detection circuit 316 and the current data address. The predictions are provided to the select prefetch engine circuit 309, wherein the predictions are provided to LSU 207 via PFQ 210. LSU 207 uses the predictions to prefetch the data and store the data in L1 cache 213. In other embodiments, the prefetched data may be stored in other memory locations or temporary buffers. as shown in figure 5, where the confidence level (which is the corresponding “weight factor”) is incremented by 1 level when there is a “prefetch hit;” and decremented by one level when there is a “load miss” or “PE eviction;” In the embodiment shown, each prefetch engine includes a confidence circuit 317 that generates a confidence value for the stream identifier being tracked by the prefetch engine. In one embodiment, the confidence circuit 317 monitors the success of the prefetches associated with the stream identifier and adjusts the confidence level based on factors e.g. as to whether the prefetched data is used, whether there was a load miss, or whether there was a prefetch evict. In one embodiment, confidence level may have 5 active levels where the highest level is indicative of 4 consecutive prefetch uses (See FIG. 5). In one embodiment, the confidence level is increased with every used data prefetch up to maximum confidence level and decremented with each load miss or prefetch evict. In one embodiment, a confidence level of zero would indicate an unallocated prefetch engine (¶ 0046) … (¶ 0045-0047); In one embodiment, confidence circuit 317 determines the confidence of the prefetching by implementing a confidence level scheme for a stream identifier. In one embodiment, the confidence level is adjusted based on a number of factors such as an indication whether a prefetch was used by the processor (PF used), whether there was a load miss (e.g. whether a load instruction associated with the stream identifier had an address that was not in the L1 cache) … (¶ 0059)].
Therefore, it would have been obvious for one of ordinary skills in the art before the effective filing date of the claimed invention to predict an address of a next page, and incrementing or decrementing a weight factor of the predicted next page, depending whether the predicted next page is accessed, as expressively demonstrated by Al-Sukhni, and to incorporate it into the existing scheme disclosed by Mashimo, because Al-Sukhni teaches doing so eliminates the access latency [Data prefetching can be used to overcome memory latency in data processing systems. One type of data prefetching is stride prefetching. Stride prefetching is based on detecting regular access patterns in the data address stream. Prefetch addresses can be generated based on those detected strides (¶ 0005)].
As to claim 2, Mashimo in view of Al-Sukhni teaches The method of claim 1, further comprising performing the prefetch operation for the predicted next page, based on the weight factor of the predicted next page satisfying a prefetch threshold [Mashimo -- At block 761, the prefetch logic 301 identifies a register in the SRV 306 having the highest accumulated value … If the highest accumulated weight value in the SRV 306 does not exceed a minimum accumulated weight threshold, then at 763 no prefetch is performed and the process 750 returns to block 451. If the highest accumulated weight value does exceed the minimum accumulated weight threshold, then at block 765, the prefetch logic 301 calculates an address for prefetching by adding the most likely future access delta 701 (i.e., the index of the highest valued register in the SRV 306) to the most recently accessed memory address 402. A prefetch request is issued for the calculated prefetch address (¶ 0083); Al-Sukhni teaches predicting next page -- as shown in figure 6; FIG. 6 is a flow chart of operations of data processing system 201 in obtaining a physical address translation of a request from a prefetch unit. In 601, prediction circuit 321 computes the next prefetch address. In 602, a determination is made by predication circuit 321 of whether the next data address of the strided stream will cross a page boundary. If no in 602, then prediction circuit 321 generates the next prefetch address (physical address) at the appropriate time in that the next prefetch address is known … (¶ 0082-0084); Each prefetch engine (e.g. 311) attempts to detect a strided stream in the data addresses associated with its allocated stream identifier. In one embodiment, each prefetch engine includes a detection circuit 316 that detects strided patterns by comparing consecutive data access addresses associated with the allocated stream identifier. The prediction circuit 321, generates predicted addresses of future data accesses of the strided stream based on the detected stride from detection circuit 316 and the current data address. The predictions are provided to the select prefetch engine circuit 309, wherein the predictions are provided to LSU 207 via PFQ 210. LSU 207 uses the predictions to prefetch the data and store the data in L1 cache 213. In other embodiments, the prefetched data may be stored in other memory locations or temporary buffers. as shown in figure 5, where the confidence level (which is the corresponding “weight factor”) is incremented by 1 level when there is a “prefetch hit;” and decremented by one level when there is a “load miss” or “PE eviction;” In the embodiment shown, each prefetch engine includes a confidence circuit 317 that generates a confidence value for the stream identifier being tracked by the prefetch engine. In one embodiment, the confidence circuit 317 monitors the success of the prefetches associated with the stream identifier and adjusts the confidence level based on factors e.g. as to whether the prefetched data is used, whether there was a load miss, or whether there was a prefetch evict. In one embodiment, confidence level may have 5 active levels where the highest level is indicative of 4 consecutive prefetch uses (See FIG. 5). In one embodiment, the confidence level is increased with every used data prefetch up to maximum confidence level and decremented with each load miss or prefetch evict. In one embodiment, a confidence level of zero would indicate an unallocated prefetch engine (¶ 0046) … (¶ 0045-0047); In one embodiment, confidence circuit 317 determines the confidence of the prefetching by implementing a confidence level scheme for a stream identifier. In one embodiment, the confidence level is adjusted based on a number of factors such as an indication whether a prefetch was used by the processor (PF used), whether there was a load miss (e.g. whether a load instruction associated with the stream identifier had an address that was not in the L1 cache) … (¶ 0059)].
As to claim 3, Mashimo in view of Al-Sukhni teaches The method of claim 2, wherein performing the prefetch operation for the predicted next page comprises: generating a prefetch data request for the address of the predicted next page; and sending the prefetch data request to the memory device [Mashimo -- … If the highest accumulated weight value does exceed the minimum accumulated weight threshold, then at block 765, the prefetch logic 301 calculates an address for prefetching by adding the most likely future access delta 701 (i.e., the index of the highest valued register in the SRV 306) to the most recently accessed memory address 402. A prefetch request is issued for the calculated prefetch address. In one embodiment, the prefetch logic 301 generates more than one prefetch request by recursively accessing the CWT 605 one or more times according to the process 750 using successive predicted delta values in place of the most recent delta value in the LPB 303 … (¶ 0083-0084); Al-Sukhni teaches predicting next page -- as shown in figure 6; FIG. 6 is a flow chart of operations of data processing system 201 in obtaining a physical address translation of a request from a prefetch unit. In 601, prediction circuit 321 computes the next prefetch address. In 602, a determination is made by predication circuit 321 of whether the next data address of the strided stream will cross a page boundary. If no in 602, then prediction circuit 321 generates the next prefetch address (physical address) at the appropriate time in that the next prefetch address is known … (¶ 0082-0084)].
As to claim 4, Mashimo in view of Al-Sukhni teaches The method of claim 3, further comprising: receiving, from the memory device, data for the address of the predicted next page; and storing the received data in a prefetch cache [Mashimo -- as shown in figure 2, L1 cache (201), L2 cache (202), and L3 cache (203); In one embodiment, a delta-based prefetcher exploits the correlation between a repeatable memory access pattern and program meta-data, such as a memory instruction's program counter (PC), to accurately predict future memory access patterns and prefetch data into the cache in a timely fashion … (¶ 0020); Hardware prefetchers 221-223 are associated with cache levels 201-203, respectively, and generate prefetch requests for their associated cache levels or cache levels lower than their associated cache levels … (¶ 0029); Al-Sukhni – as shown in figures 2, 3, and 6].
As to claim 5, Mashimo in view of Al-Sukhni teaches The method of claim 2, further comprising: receiving a prefetch stride value indicating a number of pages to prefetch in response to the first data request; and performing additional prefetch operations for predicted next pages, until the number of pages to prefetch is reached or a weight factor of one of the predicted next pages is 0 [Mashimo -- as shown in figure 4A, where the local pattern buffer (LPB, 303) comprises a plurality of deltas (401-407), and the prefetch operations is performed until the number of deltas are all enumerated; The prefetch logic 301 accumulates the weights in a sum register vector 306 by adding the selected weight values for the delta value in respective registers of the sum register vector 306. The index of the register having the highest accumulated weight value represents the most likely next delta, according to the history of observed delta values. This predicted delta is used by the prefetch logic 301 to calculate the next memory address for prefetching. FIG. 4A illustrates data structures used by the delta-based prefetcher, according to an embodiment. These include the local pattern buffer (LPB) 303, the pattern history table (PHT) 410, and the correlation table (COT) 304. In one embodiment, these structures are implemented in memory in the prefetcher 300; in alternative embodiments, some or all of the structures 303, 304, and 410 reside in one or more other memory devices outside of the prefetcher 300 … (¶ 0036-0040); In one embodiment, the prefetch logic 301 generates more than one prefetch request by recursively accessing the CWT 605 one or more times according to the process 750 using successive predicted delta values in place of the most recent delta value in the LPB 303. For example, if n=3 and the prior delta values saved in the LPB 303 entry are A, B, and C (with C being the most recent delta), the CWT is first accessed by these deltas A, B, and C to predict a future access delta X. Then, B, C, and X are used to continue prefetch request generation by predicting a future delta value that follows X. This process continues to generate multiple addresses for prefetching, and a prefetch request is generated for each of the predicted addresses (¶ 0084); Al-Sukhni teaches predicting next page – as shown in figure 6; FIG. 6 is a flow chart of operations of data processing system 201 in obtaining a physical address translation of a request from a prefetch unit. In 601, prediction circuit 321 computes the next prefetch address. In 602, a determination is made by predication circuit 321 of whether the next data address of the strided stream will cross a page boundary. If no in 602, then prediction circuit 321 generates the next prefetch address (physical address) at the appropriate time in that the next prefetch address is known … (¶ 0082-0084)].
As to claim 6, Mashimo in view of Al-Sukhni teaches The method of claim 1, further comprising: receiving a second data request including a second address of a second data page stored in the memory device; comparing the second address of the second data page to the address of the predicted next page; and updating the entry for the first data page, based on the comparison [Mashimo -- At block 453, if the region ID 401 lookup results in a miss in the LPB 303, an entry in the LPB 303 is created for the region ID 401, as provided at block 455. If the LPB 303 is full, an existing entry is evicted from the LPB 303 according to a replacement policy. The new entry in the LPB 303 includes the region ID 401, a program counter 402 of the memory instruction originating the access, and the address accessed in the identified memory region … After an entry has been created for the region ID 401, lookups of the region ID 401 in the LPB 303 result in a hit. When a hit occurs at block 453, then at block 457 the prefetch logic 301 calculates a delta value from the difference between the new last accessed memory address (received at 451) and the previous last memory address (previously recorded in field 403 of the LPB 303) accessed in the same region … (¶ 0044-0045); At block 451, the prefetch logic 301 receives the address of a memory access by the processor core 230 and looks up the region ID 401 (e.g., the page address) of the received memory address in the LPB 303, as provided at block 453. If the lookup results in a miss, then a new entry is allocated in the LPB 303, as provided at block 455, and the received address is saved as the last address 402 in the LPB 303 … (¶ 0067); Al-Sukhni teaches predicting next page -- as shown in figure 6; FIG. 6 is a flow chart of operations of data processing system 201 in obtaining a physical address translation of a request from a prefetch unit. In 601, prediction circuit 321 computes the next prefetch address. In 602, a determination is made by predication circuit 321 of whether the next data address of the strided stream will cross a page boundary. If no in 602, then prediction circuit 321 generates the next prefetch address (physical address) at the appropriate time in that the next prefetch address is known … (¶ 0082-0084)].
As to claim 7, Mashimo in view of Al-Sukhni teaches The method of claim 6, wherein updating the entry for the first data page, based on the comparison comprises increasing the weight factor of the predicted next page, based on the second address of the second data page corresponding to the address of the predicted next page [Mashimo -- In one embodiment, a Multiple-Distance Correlation data Prefetcher (MDCP) is a delta-based prefetcher that records multiple memory access delta patterns per memory region and predicts future access patterns by, for each current delta value, incrementing weight values corresponding to each of multiple preceding delta values observed and their distances from the current delta value in the delta sequence. Given a current set of delta values, an address for prefetching is predicted by accumulating the weights for each of multiple possible next delta values, where the weights to be accumulated are selected based on the current delta values and their respective distances in the delta sequence from the delta value being predicted (¶ 0021); Al-Sukhni teaches predicting next page -- as shown in figure 6; FIG. 6 is a flow chart of operations of data processing system 201 in obtaining a physical address translation of a request from a prefetch unit. In 601, prediction circuit 321 computes the next prefetch address. In 602, a determination is made by predication circuit 321 of whether the next data address of the strided stream will cross a page boundary. If no in 602, then prediction circuit 321 generates the next prefetch address (physical address) at the appropriate time in that the next prefetch address is known … (¶ 0082-0084); Each prefetch engine (e.g. 311) attempts to detect a strided stream in the data addresses associated with its allocated stream identifier. In one embodiment, each prefetch engine includes a detection circuit 316 that detects strided patterns by comparing consecutive data access addresses associated with the allocated stream identifier. The prediction circuit 321, generates predicted addresses of future data accesses of the strided stream based on the detected stride from detection circuit 316 and the current data address. The predictions are provided to the select prefetch engine circuit 309, wherein the predictions are provided to LSU 207 via PFQ 210. LSU 207 uses the predictions to prefetch the data and store the data in L1 cache 213. In other embodiments, the prefetched data may be stored in other memory locations or temporary buffers. as shown in figure 5, where the confidence level (which is the corresponding “weight factor”) is incremented by 1 level when there is a “prefetch hit;” and decremented by one level when there is a “load miss” or “PE eviction;” In the embodiment shown, each prefetch engine includes a confidence circuit 317 that generates a confidence value for the stream identifier being tracked by the prefetch engine. In one embodiment, the confidence circuit 317 monitors the success of the prefetches associated with the stream identifier and adjusts the confidence level based on factors e.g. as to whether the prefetched data is used, whether there was a load miss, or whether there was a prefetch evict. In one embodiment, confidence level may have 5 active levels where the highest level is indicative of 4 consecutive prefetch uses (See FIG. 5). In one embodiment, the confidence level is increased with every used data prefetch up to maximum confidence level and decremented with each load miss or prefetch evict. In one embodiment, a confidence level of zero would indicate an unallocated prefetch engine (¶ 0046) … (¶ 0045-0047); In one embodiment, confidence circuit 317 determines the confidence of the prefetching by implementing a confidence level scheme for a stream identifier. In one embodiment, the confidence level is adjusted based on a number of factors such as an indication whether a prefetch was used by the processor (PF used), whether there was a load miss (e.g. whether a load instruction associated with the stream identifier had an address that was not in the L1 cache) … (¶ 0059)].
As to claim 8, Mashimo in view of Al-Sukhni teaches The method of claim 6, wherein updating the entry for the first data page, based on the comparison comprises decreasing the weight factor of the predicted next page, based on the second address of the second data page being distinct from the address of the predicted next page [Mashimo -- In one embodiment, the prefetch logic 301 updates weights in the CWT 605 based on whether the weights successfully predict addresses for prefetching. If prefetched data is evicted from the cache before it is used, then each set of weights used to generate the prefetch are decremented in the CWT 605 … If the deltas mismatch, the CWT 605 is accessed by the deltas used to predict the mispredicted delta and decrement the weights corresponding to the mispredicted delta … (¶ 0085-0086); Al-Sukhni teaches predicting next page -- as shown in figure 5, where the confidence level (which is the corresponding “weight factor”) is incremented by 1 level when there is a “prefetch hit;” and decremented by one level when there is a “load miss” or “PE eviction;” In the embodiment shown, each prefetch engine includes a confidence circuit 317 that generates a confidence value for the stream identifier being tracked by the prefetch engine. In one embodiment, the confidence circuit 317 monitors the success of the prefetches associated with the stream identifier and adjusts the confidence level based on factors e.g. as to whether the prefetched data is used, whether there was a load miss, or whether there was a prefetch evict. In one embodiment, confidence level may have 5 active levels where the highest level is indicative of 4 consecutive prefetch uses (See FIG. 5). In one embodiment, the confidence level is increased with every used data prefetch up to maximum confidence level and decremented with each load miss or prefetch evict. In one embodiment, a confidence level of zero would indicate an unallocated prefetch engine (¶ 0046) … (¶ 0045-0047); In one embodiment, confidence circuit 317 determines the confidence of the prefetching by implementing a confidence level scheme for a stream identifier. In one embodiment, the confidence level is adjusted based on a number of factors such as an indication whether a prefetch was used by the processor (PF used), whether there was a load miss (e.g. whether a load instruction associated with the stream identifier had an address that was not in the L1 cache) … (¶ 0059)].
As to claim 9, Mashimo in view of Al-Sukhni teaches The method of claim 1, further comprising: receiving a second data request including a second address of a second data page stored in the memory device; and updating the predicted next page to be the second address of the second data page, based on the weight factor of the predicted next page [Mashimo -- At block 453, if the region ID 401 lookup results in a miss in the LPB 303, an entry in the LPB 303 is created for the region ID 401, as provided at block 455. If the LPB 303 is full, an existing entry is evicted from the LPB 303 according to a replacement policy. The new entry in the LPB 303 includes the region ID 401, a program counter 402 of the memory instruction originating the access, and the address accessed in the identified memory region … After an entry has been created for the region ID 401, lookups of the region ID 401 in the LPB 303 result in a hit. When a hit occurs at block 453, then at block 457 the prefetch logic 301 calculates a delta value from the difference between the new last accessed memory address (received at 451) and the previous last memory address (previously recorded in field 403 of the LPB 303) accessed in the same region … (¶ 0044-0045); At block 451, the prefetch logic 301 receives the address of a memory access by the processor core 230 and looks up the region ID 401 (e.g., the page address) of the received memory address in the LPB 303, as provided at block 453. If the lookup results in a miss, then a new entry is allocated in the LPB 303, as provided at block 455, and the received address is saved as the last address 402 in the LPB 303 … (¶ 0067); Al-Sukhni teaches predicting next page -- as shown in figure 6; FIG. 6 is a flow chart of operations of data processing system 201 in obtaining a physical address translation of a request from a prefetch unit. In 601, prediction circuit 321 computes the next prefetch address. In 602, a determination is made by predication circuit 321 of whether the next data address of the strided stream will cross a page boundary. If no in 602, then prediction circuit 321 generates the next prefetch address (physical address) at the appropriate time in that the next prefetch address is known … (¶ 0082-0084); Each prefetch engine (e.g. 311) attempts to detect a strided stream in the data addresses associated with its allocated stream identifier. In one embodiment, each prefetch engine includes a detection circuit 316 that detects strided patterns by comparing consecutive data access addresses associated with the allocated stream identifier. The prediction circuit 321, generates predicted addresses of future data accesses of the strided stream based on the detected stride from detection circuit 316 and the current data address. The predictions are provided to the select prefetch engine circuit 309, wherein the predictions are provided to LSU 207 via PFQ 210. LSU 207 uses the predictions to prefetch the data and store the data in L1 cache 213. In other embodiments, the prefetched data may be stored in other memory locations or temporary buffers. as shown in figure 5, where the confidence level (which is the corresponding “weight factor”) is incremented by 1 level when there is a “prefetch hit;” and decremented by one level when there is a “load miss” or “PE eviction;” In the embodiment shown, each prefetch engine includes a confidence circuit 317 that generates a confidence value for the stream identifier being tracked by the prefetch engine. In one embodiment, the confidence circuit 317 monitors the success of the prefetches associated with the stream identifier and adjusts the confidence level based on factors e.g. as to whether the prefetched data is used, whether there was a load miss, or whether there was a prefetch evict. In one embodiment, confidence level may have 5 active levels where the highest level is indicative of 4 consecutive prefetch uses (See FIG. 5). In one embodiment, the confidence level is increased with every used data prefetch up to maximum confidence level and decremented with each load miss or prefetch evict. In one embodiment, a confidence level of zero would indicate an unallocated prefetch engine (¶ 0046) … (¶ 0045-0047); In one embodiment, confidence circuit 317 determines the confidence of the prefetching by implementing a confidence level scheme for a stream identifier. In one embodiment, the confidence level is adjusted based on a number of factors such as an indication whether a prefetch was used by the processor (PF used), whether there was a load miss (e.g. whether a load instruction associated with the stream identifier had an address that was not in the L1 cache) … (¶ 0059)].
As to claim 10, Mashimo in view of Al-Sukhni teaches The method of claim 9, further comprising setting the weight of the updated predicted next page to a predetermined initial value [Rowlands -- The prefetch distance can be managed using a confidence value associated with the pattern. The confidence value, in effect, is a measure of how often the pattern is observed or, equivalently, the number of elements that make up the pattern. The confidence value, and hence the prefetch distance, may initially be zero; that is, prefetching might not begin as soon as an apparent pattern is detected … (¶ 0008); Al-Sukhni teaches predicting next page -- In the embodiment shown, each prefetch engine includes a confidence circuit 317 that generates a confidence value for the stream identifier being tracked by the prefetch engine. In one embodiment, the confidence circuit 317 monitors the success of the prefetches associated with the stream identifier and adjusts the confidence level based on factors e.g. as to whether the prefetched data is used, whether there was a load miss, or whether there was a prefetch evict. In one embodiment, confidence level may have 5 active levels where the highest level is indicative of 4 consecutive prefetch uses (See FIG. 5). In one embodiment, the confidence level is increased with every used data prefetch up to maximum confidence level and decremented with each load miss or prefetch evict. In one embodiment, a confidence level of zero would indicate an unallocated prefetch engine (¶ 0046)].
As to claim 11, it recites substantially the same limitations as in claim 1, and is rejected for the same reasons set forth in the analysis of claim 1. Refer to “As to claim 1” presented earlier in this Office Action for details.
As to claim 12, it recites substantially the same limitations as in claim 2, and is rejected for the same reasons set forth in the analysis of claim 2. Refer to “As to claim 2” presented earlier in this Office Action for details.
As to claim 13, it recites substantially the same limitations as in claim 3, and is rejected for the same reasons set forth in the analysis of claim 3. Refer to “As to claim 3” presented earlier in this Office Action for details.
As to claim 14, it recites substantially the same limitations as in claim 4, and is rejected for the same reasons set forth in the analysis of claim 4. Refer to “As to claim 4” presented earlier in this Office Action for details.
As to claim 15, it recites substantially the same limitations as in claim 5, and is rejected for the same reasons set forth in the analysis of claim 5. Refer to “As to claim 5” presented earlier in this Office Action for details.
As to claim 16, it recites substantially the same limitations as in claim 6, and is rejected for the same reasons set forth in the analysis of claim 6. Refer to “As to claim 6” presented earlier in this Office Action for details.
As to claim 17, it recites substantially the same limitations as in claim 7, and is rejected for the same reasons set forth in the analysis of claim 7. Refer to “As to claim 7” presented earlier in this Office Action for details.
As to claim 18, it recites substantially the same limitations as in claim 8, and is rejected for the same reasons set forth in the analysis of claim 8. Refer to “As to claim 8” presented earlier in this Office Action for details.
As to claim 19, it recites substantially the same limitations as in claim 9, and is rejected for the same reasons set forth in the analysis of claim 9. Refer to “As to claim 9” presented earlier in this Office Action for details.
As to claim 20, it recites substantially the same limitations as in claim 10, and is rejected for the same reasons set forth in the analysis of claim 10. Refer to “As to claim 10” presented earlier in this Office Action for details.
Conclusion
7. Claims 1-20 are rejected as explained above.
8. Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHENG JEN TSAI whose telephone number is 571-272-4244. The examiner can normally be reached on Monday-Friday, 9-6.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Reginald Bragdon can be reached on 571-272-4204. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
/SHENG JEN TSAI/Primary Examiner, Art Unit 2139