DETAILED ACTION
1. This Office Action is taken in response to Applicants’ Amendments and Remarks filed on 1/20/2026 regarding application 18/491,274 filed on 10/20/2023.
Claims 1-20 are pending for consideration.
2. Response to Amendments and Remarks
Applicants’ amendments and remarks have been fully and carefully considered, with the Examiner’s response set forth below.
(1) Applicant contends that, regarding currently amended claim 1, Favor’s write combining buffer (WCB) fails to teach “a single update cache operation,” because “write-buffer merging [is] dependent on buffer contents, not a pre-execution determination that data units satisfy a condition permitting concurrent storage in a single cache update operation” (see pages 8-10 of Applicant’s Remarks). The Examiner respectfully disagrees.
First, the amended limitation merely recites “in a cache update operation,” and is otherwise silent as to the scope and details of “a cache update operation.” As such, the term “a cache update operation” must be given its broadest reasonable interpretation consistent with the MPEP guidelines. Thus, within the context of claim 1, any operation that changes or modifies the contents of the cache memory would qualify as “a cache update operation,” including writing/storing data into the cache memory.
Second, with respect to “a cache update operation,” Favor teaches that the data stored in the write combining buffer (WCB) is written into an L2 cache memory after a timeout period, hence updating the L2 cache memory [Favor teaches writing the data stored in the WCB into the L2 cache, hence updating the L2 cache memory -- cache-controller execution logic as shown in figures 1 and 32; … When possible, the WCB 109 combines, or merges, multiple write requests into a single entry of the WCB 109 such that the WCB 109 may make a potentially larger single write request to the L2 cache 107 that encompasses the store data of multiple store operations that have spatially-locality … The WCB 109 may combine the four store operations into a single entry and perform a single write request to the L2 cache 107 of the fifteen bytes at address A … (¶ 0073); … In one embodiment, each WCB entry 2401 also includes a timeout value (not shown) that is initially set to zero and that is periodically incremented (or alternatively initially set to a predetermined value and periodically decremented). When the timeout value of an entry (i.e., the oldest entry) exceeds a predetermined value (or alternatively reaches zero), the WCB 109 requests the DTLB 141 to translate the write VA 2411 of the oldest entry 2401 into the write PA 2613 as described above with respect to block 2814, and the WCB 109 pushes the entry 2401 out of the WCB 109 to the L2 cache 107 per block 2816 (¶ 0188)].
Therefore, at least for this reason, Favor indeed teaches “a cache update operation” via a WCB.
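For clarity of record, the merging behavior cited above may be illustrated with the following non-limiting sketch. The entry width, addresses, and function name are illustrative assumptions (Favor’s example in ¶ 0073 uses 32-byte WCB entries); the sketch is not language from Favor.

```python
ENTRY_WIDTH = 32  # assumed WCB entry width in bytes, per Favor's example

def try_merge(stores):
    """Merge (address, size) store operations into one WCB entry if they all
    fall within a single ENTRY_WIDTH-aligned entry; return the merged
    (base_address, total_bytes) for a single write request, else None."""
    base = stores[0][0] - (stores[0][0] % ENTRY_WIDTH)  # aligned entry base
    lo = min(addr for addr, _ in stores)
    hi = max(addr + size for addr, size in stores)
    if lo >= base and hi <= base + ENTRY_WIDTH:
        return (lo, hi - lo)  # one write request covering all store data
    return None

# Favor's example (¶ 0073): an 8-byte store at aligned address A, 4 bytes at
# A+8, 2 bytes at A+12, and 1 byte at A+14 combine into a single 15-byte
# write request to the L2 cache.
A = 0x100
merged = try_merge([(A, 8), (A + 8, 4), (A + 12, 2), (A + 14, 1)])
# merged == (0x100, 15)
```

As the sketch shows, the single combined write request is itself an operation that modifies the contents of the L2 cache, i.e., “a cache update operation” under the broadest reasonable interpretation.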
(2) In response to the amendments and remarks, an updated claim analysis has been made. Refer to the corresponding sections of the following Office Action for details.
3. Examiner’s Note
(1) In the case of amending the claimed invention, Applicant is respectfully requested to indicate the portion(s) of the specification which dictate(s) the structure relied upon for proper interpretation and also to verify and ascertain the metes and bounds of the claimed invention. This will assist in expediting compact prosecution. MPEP 714.02 recites: “Applicant should also specifically point out the support for any amendments made to the disclosure. See MPEP § 2163.06. An amendment which does not comply with the provisions of 37 CFR 1.121(b), (c), (d), and (h) may be held not fully responsive. See MPEP § 714.” Amendments that do not point to specific support in the disclosure may be deemed not to comply with the provisions of 37 CFR 1.121(b), (c), (d), and (h) and therefore be held not fully responsive. Generic statements such as “Applicants believe no new matter has been introduced” may be deemed insufficient.
(2) Examiner has cited particular columns/paragraphs and line numbers in the references applied to the claims above for the convenience of the applicant. Although the specified citations are representative of the teachings of the art and are applied to specific limitations within the individual claims, other passages and figures may apply as well. In preparing responses, Applicant is respectfully requested to fully consider each reference in its entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
4. Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Favor et al. (US Patent Application Publication 2022/0358052, hereinafter Favor) in view of Jain (US Patent Application Publication 2019/0114255).
As to claim 1, Favor teaches A method of controlling a cache memory [Cache memories in microprocessors may have a significant impact on their performance … (¶ 0002)], comprising:
receiving a first store request [The LSU 117 includes a write combining buffer (WCB) 109 that buffers write requests sent by the LSU 117 to the DTLB 141 and to the L2 cache 107 … The write request is buffered in the WCB 109. Eventually, at a relatively low priority, the store data associated with the write request will be written to the L2 cache 107. However, entries of the write combining buffer 109 are larger (e.g., 32 bytes) than the largest load and store operations (e.g., eight bytes). When possible, the WCB 109 combines, or merges, multiple write requests into a single entry of the WCB 109 such that the WCB 109 may make a potentially larger single write request to the L2 cache 107 that encompasses the store data of multiple store operations that have spatially-locality … (¶ 0073); Jain also teaches this limitation -- Storage devices, and methods for use therewith, are described herein. Such storage devices can include flash memory, random access memory (RAM), and a memory controller in communication therewith. To improve write performance, the memory controller is configured to store first and second data, corresponding to consecutive unaligned first and second write commands received within a threshold amount of time of one another from a host, sequentially relative to one another within the flash memory (abstract)];
determining whether one or more subsequent store requests are received within a threshold time after receiving the first store request [Favor teaches a timeout period, which is the corresponding “threshold time,” after which data in the write combining buffer (WCB) is stored into the L2 cache -- … In one embodiment, each WCB entry 2401 also includes a timeout value (not shown) that is initially set to zero and that is periodically incremented (or alternatively initially set to a predetermined value and periodically decremented). When the timeout value of an entry (i.e., the oldest entry) exceeds a predetermined value (or alternatively reaches zero), the WCB 109 requests the DTLB 141 to translate the write VA 2411 of the oldest entry 2401 into the write PA 2613 as described above with respect to block 2814, and the WCB 109 pushes the entry 2401 out of the WCB 109 to the L2 cache 107 per block 2816 (¶ 0188);
Jain more expressively teaches this limitation -- Storage devices, and methods for use therewith, are described herein. Such storage devices can include flash memory, random access memory (RAM), and a memory controller in communication therewith. To improve write performance, the memory controller is configured to store first and second data, corresponding to consecutive unaligned first and second write commands received within a threshold amount of time of one another from a host, sequentially relative to one another within the flash memory (abstract); Thereafter, if the memory controller 122 receives a next write command (which will be referred as a 2nd write command) within a predetermined amount of time (also referred to as a threshold amount of time), then the memory controller 122 will determine whether the 2nd write command was intended by the host 102 to cause the 2nd data to be stored in the non-volatile memory 124 sequentially relative to the 1st data … In the above discussed example, it was assumed that the next command that the memory controller 122 received from the host 102 after the 1st write command was also a write command (i.e., the 2nd write command), that the memory controller 122 determined from the 2nd write command that the host 102 wanted to store the 2nd data in the non-volatile memory 124 sequentially relative to the 1st data, and that the 2nd write command was received within the threshold amount of time … Similarly, if a next command was not received within the threshold amount of time, then then the tail portion of the 1st data (which was being stored in the controller RAM 206, and more specifically the TRAM buffer 218) would instead be post-padded (e.g., with dummy data) and then randomly stored by the memory controller 122 within the non-volatile memory 124 … (¶ 0066-0070)]; executing the first store request by storing a respective data unit associated with the first store request to a cache line of the cache memory in a cache update 
operation selected by cache-controller execution logic [Favor teaches writing the data stored in the WCB into the L2 cache, hence updating the L2 cache memory -- cache-controller execution logic as shown in figures 1 and 32; … When possible, the WCB 109 combines, or merges, multiple write requests into a single entry of the WCB 109 such that the WCB 109 may make a potentially larger single write request to the L2 cache 107 that encompasses the store data of multiple store operations that have spatially-locality … The WCB 109 may combine the four store operations into a single entry and perform a single write request to the L2 cache 107 of the fifteen bytes at address A … (¶ 0073); … In one embodiment, each WCB entry 2401 also includes a timeout value (not shown) that is initially set to zero and that is periodically incremented (or alternatively initially set to a predetermined value and periodically decremented). When the timeout value of an entry (i.e., the oldest entry) exceeds a predetermined value (or alternatively reaches zero), the WCB 109 requests the DTLB 141 to translate the write VA 2411 of the oldest entry 2401 into the write PA 2613 as described above with respect to block 2814, and the WCB 109 pushes the entry 2401 out of the WCB 109 to the L2 cache 107 per block 2816 (¶ 0188)] based on determining that no further store requests have been received within the threshold time [Favor teaches a timeout period, which is the corresponding “threshold time,” after which data in the write combining buffer (WCB) is stored into the L2 cache; thus, the data associated with the first request stored in the WCB will be written into the L2 cache if no further store requests are received within the timeout period -- … In one embodiment, each WCB entry 2401 also includes a timeout value (not shown) that is initially set to zero and that is periodically incremented (or alternatively initially set to a predetermined value and periodically decremented).
When the timeout value of an entry (i.e., the oldest entry) exceeds a predetermined value (or alternatively reaches zero), the WCB 109 requests the DTLB 141 to translate the write VA 2411 of the oldest entry 2401 into the write PA 2613 as described above with respect to block 2814, and the WCB 109 pushes the entry 2401 out of the WCB 109 to the L2 cache 107 per block 2816 (¶ 0188);
Jain more expressively teaches this limitation -- Storage devices, and methods for use therewith, are described herein. Such storage devices can include flash memory, random access memory (RAM), and a memory controller in communication therewith. To improve write performance, the memory controller is configured to store first and second data, corresponding to consecutive unaligned first and second write commands received within a threshold amount of time of one another from a host, sequentially relative to one another within the flash memory (abstract); Thereafter, if the memory controller 122 receives a next write command (which will be referred as a 2nd write command) within a predetermined amount of time (also referred to as a threshold amount of time), then the memory controller 122 will determine whether the 2nd write command was intended by the host 102 to cause the 2nd data to be stored in the non-volatile memory 124 sequentially relative to the 1st data … In the above discussed example, it was assumed that the next command that the memory controller 122 received from the host 102 after the 1st write command was also a write command (i.e., the 2nd write command), that the memory controller 122 determined from the 2nd write command that the host 102 wanted to store the 2nd data in the non-volatile memory 124 sequentially relative to the 1st data, and that the 2nd write command was received within the threshold amount of time … Similarly, if a next command was not received within the threshold amount of time, then then the tail portion of the 1st data (which was being stored in the controller RAM 206, and more specifically the TRAM buffer 218) would instead be post-padded (e.g., with dummy data) and then randomly stored by the memory controller 122 within the non-volatile memory 124 … (¶ 0066-0070)];
based on determining that one or more further store requests are received within the threshold time [Favor teaches a timeout period, which is the corresponding “threshold time,” after which data in the write combining buffer (WCB) is stored into the L2 cache -- … In one embodiment, each WCB entry 2401 also includes a timeout value (not shown) that is initially set to zero and that is periodically incremented (or alternatively initially set to a predetermined value and periodically decremented). When the timeout value of an entry (i.e., the oldest entry) exceeds a predetermined value (or alternatively reaches zero), the WCB 109 requests the DTLB 141 to translate the write VA 2411 of the oldest entry 2401 into the write PA 2613 as described above with respect to block 2814, and the WCB 109 pushes the entry 2401 out of the WCB 109 to the L2 cache 107 per block 2816 (¶ 0188);
Jain more expressively teaches this limitation -- Storage devices, and methods for use therewith, are described herein. Such storage devices can include flash memory, random access memory (RAM), and a memory controller in communication therewith. To improve write performance, the memory controller is configured to store first and second data, corresponding to consecutive unaligned first and second write commands received within a threshold amount of time of one another from a host, sequentially relative to one another within the flash memory (abstract); Thereafter, if the memory controller 122 receives a next write command (which will be referred as a 2nd write command) within a predetermined amount of time (also referred to as a threshold amount of time), then the memory controller 122 will determine whether the 2nd write command was intended by the host 102 to cause the 2nd data to be stored in the non-volatile memory 124 sequentially relative to the 1st data … In the above discussed example, it was assumed that the next command that the memory controller 122 received from the host 102 after the 1st write command was also a write command (i.e., the 2nd write command), that the memory controller 122 determined from the 2nd write command that the host 102 wanted to store the 2nd data in the non-volatile memory 124 sequentially relative to the 1st data, and that the 2nd write command was received within the threshold amount of time … Similarly, if a next command was not received within the threshold amount of time, then then the tail portion of the 1st data (which was being stored in the controller RAM 206, and more specifically the TRAM buffer 218) would instead be post-padded (e.g., with dummy data) and then randomly stored by the memory controller 122 within the non-volatile memory 124 … (¶ 0066-0070)], determining whether two or more store requests received within the threshold time are designated for storage in a same cache line [… When possible, the WCB 109 
combines, or merges, multiple write requests into a single entry of the WCB 109 such that the WCB 109 may make a potentially larger single write request to the L2 cache 107 that encompasses the store data of multiple store operations that have spatially-locality. The merging, or combining, is possible when the starting physical memory address and size of two or more store operations align and fall within a single entry of the WCB 109. For example, assume a first 8-byte store operation to 32-byte aligned physical address A, a second 4-byte store operation to physical address A+8, a third 2-byte store operation to physical address A+12, and a fourth 1-byte store operation to physical address A+14. The WCB 109 may combine the four store operations into a single entry and perform a single write request to the L2 cache 107 of the fifteen bytes at address A … (¶ 0073); … In the example embodiment of FIGS. 24 and 25 in which W is four and C is six, there are four possible write blocks and the combination of the write PAP 2404 and write PA[5:4] 2406 is a proxy for the write physical block address within the L2 cache 107, although other embodiments are contemplated as stated above. That is, the write block within the cache line is determined by the write PA[5:4] 2406. Because W is less than or equal to C, each store data 2402 combined into the write data 2402 of a WCB entry 2401 has the same write physical line address and belongs within the same cache line and has the same write physical block address and belongs within the same write block. In one embodiment, W is equal to C, i.e., the width of a WCB entry 2401 is the same as a cache line, in which case the write PA [5:4] bits 2406 are not needed to specify a write block within a cache line … (¶ 0165-0166];
determining that data units associated with the two or more store requests satisfy a condition permitting concurrent storage in a single cache update operation [the combined data units must have a length that is equal to, or smaller than, the length of a cache line to be concurrently stored in a single cache update operation -- … When possible, the WCB 109 combines, or merges, multiple write requests into a single entry of the WCB 109 such that the WCB 109 may make a potentially larger single write request to the L2 cache 107 that encompasses the store data of multiple store operations that have spatially-locality. The merging, or combining, is possible when the starting physical memory address and size of two or more store operations align and fall within a single entry of the WCB 109. For example, assume a first 8-byte store operation to 32-byte aligned physical address A, a second 4-byte store operation to physical address A+8, a third 2-byte store operation to physical address A+12, and a fourth 1-byte store operation to physical address A+14. The WCB 109 may combine the four store operations into a single entry and perform a single write request to the L2 cache 107 of the fifteen bytes at address A (¶ 0073); … That is, the write block within the cache line is determined by the write PA[5:4] 2406. Because W is less than or equal to C, each store data 2402 combined into the write data 2402 of a WCB entry 2401 has the same write physical line address and belongs within the same cache line and has the same write physical block address and belongs within the same write block. In one embodiment, W is equal to C, i.e., the width of a WCB entry 2401 is the same as a cache line, in which case the write PA [5:4] bits 2406 are not needed to specify a write block within a cache line (¶ 0165); The WCB 109 compares the store PAP 1304 of the store instruction being committed with the write PAP 2404 of each WCB entry 2401 (e.g., at block 2802 of FIG.
28) and requires a match as a necessary condition for combining the store instruction with a WCB entry 2401. In embodiments in which the width of the write data 2402 of a WCB entry 2401 is less than the width of a cache line (e.g., as in the embodiment of FIGS. 24 through 26), the WCB 109 compares the store PA[54] 1306 of the store instruction being committed with the write PA[5:4] 2406 of each WCB entry 2401 and requires a match as a necessary condition for combining the store instruction with a WCB entry 2401 … (¶ 0175)]; and
concurrently storing the respective data units associated with the two or more store requests to the same cache line of the cache memory in a single cache update operation selected by cache-controller execution logic [cache-controller execution logic as shown in figures 1 and 32] based on determining that the respective data units associated with the two or more store requests are designated for storage in the same cache line [… The merging, or combining, is possible when the starting physical memory address and size of two or more store operations align and fall within a single entry of the WCB 109 … (¶ 0073); … Because conventional high-performance superscalar processors are designed to execute multiple (N) store instructions per clock cycle, i.e., concurrently, each of the concurrently executed store instructions needs to be able to CAM against the load queue at the same time. This requires N CAM ports in the load queue. For example, a conventional high-performance superscalar processor might execute 4 store instructions concurrently, in which case the load queue requires at least 4 CAM ports, which may imply a significant amount of power consumption and area … (¶ 0215); … That is, the write block within the cache line is determined by the write PA[5:4] 2406. Because W is less than or equal to C, each store data 2402 combined into the write data 2402 of a WCB entry 2401 has the same write physical line address and belongs within the same cache line and has the same write physical block address and belongs within the same write block. 
In one embodiment, W is equal to C, i.e., the width of a WCB entry 2401 is the same as a cache line, in which case the write PA [5:4] bits 2406 are not needed to specify a write block within a cache line … The store PA[5:4] 1306 of each of the store instructions combined into a WCB entry 2401 is identical since, in order to be combined, the store data 1302 of each of the store instructions must be written to the same write block within the same cache line of the L2 cache 107, i.e., have the same store physical block address. Thus, the WCB entry 2401 is able to include a single write PA[5:4] 2406 to hold the identical store PA[5:4] 1304 of all of the combined store instructions (¶ 0165-0166)] and satisfy the condition permitting concurrent storage in a single cache update operation, and storing the respective data units associated with the two or more store requests in separate cache update operations based on determining that none of the two or more store requests are designated for storage in the same cache line or fail to satisfy the condition permitting concurrent storage in a single cache update operation [… When possible, the WCB 109 combines, or merges, multiple write requests into a single entry of the WCB 109 such that the WCB 109 may make a potentially larger single write request to the L2 cache 107 that encompasses the store data of multiple store operations that have spatially-locality. The merging, or combining, is possible when the starting physical memory address and size of two or more store operations align and fall within a single entry of the WCB 109. For example, assume a first 8-byte store operation to 32-byte aligned physical address A, a second 4-byte store operation to physical address A+8, a third 2-byte store operation to physical address A+12, and a fourth 1-byte store operation to physical address A+14. 
The WCB 109 may combine the four store operations into a single entry and perform a single write request to the L2 cache 107 of the fifteen bytes at address A … (¶ 0073); … That is, the write block within the cache line is determined by the write PA[5:4] 2406. Because W is less than or equal to C, each store data 2402 combined into the write data 2402 of a WCB entry 2401 has the same write physical line address and belongs within the same cache line and has the same write physical block address and belongs within the same write block. In one embodiment, W is equal to C, i.e., the width of a WCB entry 2401 is the same as a cache line, in which case the write PA [5:4] bits 2406 are not needed to specify a write block within a cache line (¶ 0165); The WCB 109 compares the store PAP 1304 of the store instruction being committed with the write PAP 2404 of each WCB entry 2401 (e.g., at block 2802 of FIG. 28) and requires a match as a necessary condition for combining the store instruction with a WCB entry 2401. In embodiments in which the width of the write data 2402 of a WCB entry 2401 is less than the width of a cache line (e.g., as in the embodiment of FIGS. 24 through 26), the WCB 109 compares the store PA[54] 1306 of the store instruction being committed with the write PA[5:4] 2406 of each WCB entry 2401 and requires a match as a necessary condition for combining the store instruction with a WCB entry 2401 … (¶ 0175)].
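For clarity of record, the overall decision flow of claim 1, as mapped above, may be summarized in the following non-limiting sketch. The cache-line size, function names, and the fit-within-one-line condition are the Examiner’s illustrative assumptions, not disclosure from Favor or Jain.

```python
LINE_BYTES = 64  # assumed cache-line size in bytes

def plan_updates(requests, received_within_threshold):
    """Given (address, size) store requests, return a list of cache update
    operations: requests designated for the same cache line whose data units
    fit within that line are grouped into a single update; otherwise each
    request is stored in a separate update operation."""
    if not received_within_threshold or len(requests) == 1:
        # no further requests within the threshold time:
        # execute each store request in its own cache update operation
        return [[r] for r in requests]
    groups = {}
    for addr, size in requests:
        groups.setdefault(addr // LINE_BYTES, []).append((addr, size))
    updates = []
    for reqs in groups.values():
        total = sum(size for _, size in reqs)
        if len(reqs) > 1 and total <= LINE_BYTES:
            updates.append(reqs)               # single cache update operation
        else:
            updates.extend([r] for r in reqs)  # separate update operations
    return updates
```

For example, two small stores to the same 64-byte line combine into one update, while a store to a different line is written in its own update.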
Regarding claim 1, Favor does not explicitly teach determining whether one or more subsequent store requests are received within a threshold time after receiving the first store request.
However, Favor does teach assigning a timeout value to each request entry of the WCB, and writing an entry into the L2 cache when the timeout expires [… In one embodiment, each WCB entry 2401 also includes a timeout value (not shown) that is initially set to zero and that is periodically incremented (or alternatively initially set to a predetermined value and periodically decremented). When the timeout value of an entry (i.e., the oldest entry) exceeds a predetermined value (or alternatively reaches zero), the WCB 109 requests the DTLB 141 to translate the write VA 2411 of the oldest entry 2401 into the write PA 2613 as described above with respect to block 2814, and the WCB 109 pushes the entry 2401 out of the WCB 109 to the L2 cache 107 per block 2816 (¶ 0188)].
Hence, Favor effectively uses the timeout value as a time threshold for combining with a second write request.
Further, Jain explicitly teaches determining whether one or more subsequent store requests are received within a threshold time after receiving the first store request [Storage devices, and methods for use therewith, are described herein. Such storage devices can include flash memory, random access memory (RAM), and a memory controller in communication therewith. To improve write performance, the memory controller is configured to store first and second data, corresponding to consecutive unaligned first and second write commands received within a threshold amount of time of one another from a host, sequentially relative to one another within the flash memory (abstract); Thereafter, if the memory controller 122 receives a next write command (which will be referred as a 2nd write command) within a predetermined amount of time (also referred to as a threshold amount of time), then the memory controller 122 will determine whether the 2nd write command was intended by the host 102 to cause the 2nd data to be stored in the non-volatile memory 124 sequentially relative to the 1st data … In the above discussed example, it was assumed that the next command that the memory controller 122 received from the host 102 after the 1st write command was also a write command (i.e., the 2nd write command), that the memory controller 122 determined from the 2nd write command that the host 102 wanted to store the 2nd data in the non-volatile memory 124 sequentially relative to the 1st data, and that the 2nd write command was received within the threshold amount of time … Similarly, if a next command was not received within the threshold amount of time, then then the tail portion of the 1st data (which was being stored in the controller RAM 206, and more specifically the TRAM buffer 218) would instead be post-padded (e.g., with dummy data) and then randomly stored by the memory controller 122 within the non-volatile memory 124 … (¶ 0066-0070)].
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to determine whether one or more subsequent store requests are received within a threshold time after receiving the first store request, as implicitly suggested by Favor and explicitly disclosed by Jain, in order to process the first request in a timely manner rather than waiting indefinitely for the next request to arrive.
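The timing rationale may be summarized by the following non-limiting sketch: a buffered first request is held only for a threshold time; a subsequent request arriving within that window may combine with it, otherwise the buffered entry is written out. The timestamps, threshold value, and function name are illustrative assumptions, not language from either reference.

```python
def disposition(t_first, t_next, threshold):
    """Return 'combine' if the next store request arrives within the
    threshold time after the first, else 'flush' (the first request is
    written out rather than waiting indefinitely for a next request)."""
    if t_next is not None and (t_next - t_first) <= threshold:
        return "combine"
    return "flush"

# A next request 2 time units after the first, with a threshold of 5, may
# combine; with no next request, or one arriving after the threshold, the
# buffered entry is flushed to the cache.
```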
As to claim 2, Favor in view of Jain teaches The method of claim 1, wherein: the condition permitting concurrent storage in a single cache update operation is based on determining that a total number of data bits of the respective data units associated with the two or more store requests is less than or equal to a threshold number of data bits [Favor -- … That is, the write block within the cache line is determined by the write PA[5:4] 2406. Because W is less than or equal to C, each store data 2402 combined into the write data 2402 of a WCB entry 2401 has the same write physical line address and belongs within the same cache line and has the same write physical block address and belongs within the same write block. In one embodiment, W is equal to C, i.e., the width of a WCB entry 2401 is the same as a cache line, in which case the write PA [5:4] bits 2406 are not needed to specify a write block within a cache line (¶ 0165); The WCB 109 compares the store PAP 1304 of the store instruction being committed with the write PAP 2404 of each WCB entry 2401 (e.g., at block 2802 of FIG. 28) and requires a match as a necessary condition for combining the store instruction with a WCB entry 2401. In embodiments in which the width of the write data 2402 of a WCB entry 2401 is less than the width of a cache line (e.g., as in the embodiment of FIGS. 24 through 26), the WCB 109 compares the store PA[54] 1306 of the store instruction being committed with the write PA[5:4] 2406 of each WCB entry 2401 and requires a match as a necessary condition for combining the store instruction with a WCB entry 2401 … (¶ 0175)].
As to claim 3, Favor in view of Jain teaches The method of claim 2, wherein: the threshold number of data bits corresponds to a maximum number of data bits in the same cache line of the cache memory [Favor -- … That is, the write block within the cache line is determined by the write PA[5:4] 2406. Because W is less than or equal to C, each store data 2402 combined into the write data 2402 of a WCB entry 2401 has the same write physical line address and belongs within the same cache line and has the same write physical block address and belongs within the same write block. In one embodiment, W is equal to C, i.e., the width of a WCB entry 2401 is the same as a cache line, in which case the write PA [5:4] bits 2406 are not needed to specify a write block within a cache line (¶ 0165); The WCB 109 compares the store PAP 1304 of the store instruction being committed with the write PAP 2404 of each WCB entry 2401 (e.g., at block 2802 of FIG. 28) and requires a match as a necessary condition for combining the store instruction with a WCB entry 2401. In embodiments in which the width of the write data 2402 of a WCB entry 2401 is less than the width of a cache line (e.g., as in the embodiment of FIGS. 24 through 26), the WCB 109 compares the store PA[54] 1306 of the store instruction being committed with the write PA[5:4] 2406 of each WCB entry 2401 and requires a match as a necessary condition for combining the store instruction with a WCB entry 2401 … (¶ 0175)].
As to claim 4, Favor in view of Jain teaches The method of claim 1, wherein the condition permitting concurrent storage in a single cache update operation is based on the two or more store requests being associated with contiguous offset locations [Favor -- as shown in figures 23-25; … Furthermore, a program may perform a burst of small store instructions that specify addresses that are substantially sequential in nature. If each of these small store data is written individually to the cache, each tying up the entire wide cache bus even though only a single byte is being written on the bus, then the bus resources may be used inefficiently and congestion may occur at the cache, which may have a significant negative performance impact (¶ 0157); … That is, the address bits PA[5:3] 2906 specify the offset of an eight byte-aligned eight-byte data word within a 64-byte-aligned 64-byte memory line … (¶ 0194)].
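The same-line and contiguous-offset conditions discussed above can be sketched abstractly as follows. This model is illustrative only; the 64-byte line size echoes Favor's PA[5:0] offset field, but the function names and addresses are hypothetical.

```python
# Illustrative model (hypothetical): stores are candidates for a single
# cache update when they fall within the same cache line and occupy
# contiguous offset locations within that line.

LINE_BYTES = 64  # 64-byte line, consistent with a PA[5:0] byte offset

def same_line(addr_a, addr_b, line_bytes=LINE_BYTES):
    """True if both byte addresses map to the same cache line."""
    return addr_a // line_bytes == addr_b // line_bytes

def contiguous(addr_a, size_a, addr_b):
    """True if the second store begins where the first store ends."""
    return addr_a + size_a == addr_b

# A 4-byte store at 0x100 followed by a store at 0x104: same line, contiguous.
assert same_line(0x100, 0x104) and contiguous(0x100, 4, 0x104)
# Stores at 0x13C and 0x140 straddle a 64-byte line boundary.
assert not same_line(0x13C, 0x140)
```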
As to claim 5, Favor in view of Jain teaches The method of claim 1, wherein: the two or more store requests are associated with non-contiguous offset locations [Favor -- as shown in figures 23-25; … The write request is buffered in the WCB 109 … When possible, the WCB 109 combines, or merges, multiple write requests into a single entry of the WCB 109 such that the WCB 109 may make a potentially larger single write request to the L2 cache 107 that encompasses the store data of multiple store operations that have spatially-locality … (¶ 0073)].
As to claim 6, Favor in view of Jain teaches The method of claim 5, further comprising: storing an offset value for each respective data unit associated with the two or more store requests, wherein the offset value for each respective data unit corresponds to a location of the respective data unit in the same cache line [Favor -- as shown in figures 23-25].
As to claim 7, Favor in view of Jain teaches The method of claim 6, further comprising: storing a data size value for at least one respective data unit associated with the two or more store requests, wherein the data size value corresponds to a number of bits in the at least one respective data unit [Favor -- as shown in figures 23-25].
As to claim 8, Favor in view of Jain teaches The method of claim 1, wherein: the cache memory comprises level one (L1) cache memory [Favor -- The back-end 130 includes a level-1 (L1) data cache 103, a level-2 (L2) cache 107, a register files 105, a plurality of execution units (EU) 114, and load and store queues (LSQ) 125 … When a store Op is committed, the store data held in the associated store queue 125 entry is written into the L1 data cache 103 at the store address held in the store queue 125 entry … (¶ 0071)].
As to claim 9, Favor in view of Jain teaches The method of claim 8, wherein: the cache memory further comprises level two (L2) cache memory [Favor -- The back-end 130 includes a level-1 (L1) data cache 103, a level-2 (L2) cache 107, a register files 105, a plurality of execution units (EU) 114, and load and store queues (LSQ) 125 … The LSU 117 includes a write combining buffer (WCB) 109 that buffers write requests sent by the LSU 117 to the DTLB 141 and to the L2 cache 107 … (¶ 0071-0073)].
As to claim 10, Favor in view of Jain teaches The method of claim 9, further comprising: concurrently updating both the L1 cache memory and the L2 cache memory during the single cache update operation [Favor -- The LSU 117 includes a write combining buffer (WCB) 109 that buffers write requests sent by the LSU 117 to the DTLB 141 and to the L2 cache 107. In one embodiment, the L1 data cache 103 is a virtually-indexed virtually-tagged write-through cache. In the case of a store operation, when there are no older operations that could cause the store operation to be aborted, the store operation is ready to be committed, and the store data is written into the L1 data cache 103. The LSU 117 also generates a write request to “write-through” the store data to the L2 cache 107 and update the DTLB 141, e.g., to set a page dirty, or page modified, bit. The write request is buffered in the WCB 109. Eventually, at a relatively low priority, the store data associated with the write request will be written to the L2 cache 107 … (¶ 0073); The dPAP 209 is all or a portion of a physical address proxy (PAP), e.g., PAP 699 of FIG. 6. As described herein, the L2 cache 107 is inclusive of the L1 data cache 103. That is, each cache line of memory allocated into the L1 data cache 103 is also allocated into the L2 cache 107, and when the L2 cache 107 evicts the cache line, the L2 cache 107 also causes the L1 data cache 103 to evict the cache line. A PAP is a forward pointer to the unique entry in the L2 cache 107 (e.g., L2 entry 401 of FIG. 4) that holds a copy of the cache line held in the entry 201 of the L1 data cache 103 … (¶ 0093); FIG. 6 illustrates aspects of processing of a snoop request 601 by the cache subsystem 600, which is also described in FIG. 8, to ensure cache coherency between the L2 cache 107, L1 data cache 103 and other caches of a system that includes the core 100 of FIG. 1, such as a multi-processor or multi-core system … (¶ 0107)].
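The write-through behavior quoted above, in which store data written to the L1 data cache is also propagated to the inclusive L2 cache, can be sketched in simplified form. This is an illustrative model only; the dictionary-based caches and the function name are hypothetical and do not reflect Favor's actual hardware structures.

```python
# Illustrative model (hypothetical): a write-through update in which the
# same store data is written to both the L1 and the inclusive L2 cache.

l1_cache = {}
l2_cache = {}

def write_through(line_addr, data):
    """Update L1 and L2 together, modeling write-through to an inclusive L2."""
    l1_cache[line_addr] = data
    l2_cache[line_addr] = data  # L2 is inclusive of L1

write_through(0x40, b"\xaa" * 64)
assert l1_cache[0x40] == l2_cache[0x40]
```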
As to claim 11, Favor in view of Jain teaches The method of claim 1, wherein: the respective data units associated with the two or more store requests are each comprised of a same number of data bits [Favor -- as shown in figures 23-25].
As to claim 12, Favor in view of Jain teaches The method of claim 1, wherein: the given cache line has a data block length of 128 bits; and the respective data units associated with the two or more store requests are each comprised of 64 bits [Favor -- as shown in figures 23-25; … The merging, or combining, is possible when the starting physical memory address and size of two or more store operations align and fall within a single entry of the WCB 109 … (¶ 0073); … That is, the write block within the cache line is determined by the write PA[5:4] 2406. Because W is less than or equal to C, each store data 2402 combined into the write data 2402 of a WCB entry 2401 has the same write physical line address and belongs within the same cache line and has the same write physical block address and belongs within the same write block. In one embodiment, W is equal to C, i.e., the width of a WCB entry 2401 is the same as a cache line, in which case the write PA [5:4] bits 2406 are not needed to specify a write block within a cache line (¶ 0165); … In the embodiment of FIG. 6, the PAP 699 is thirteen bits, whereas the physical memory line address is 46 bits, for a saving of 33 bits per entry of the L1 data cache 103, although other embodiments are contemplated in which the different bit savings are enjoyed (¶ 0108)].
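The claim-12 dimensions can be illustrated with a short sketch: two 64-bit data units packed into a single 128-bit block for one write. The packing function and sample values are hypothetical and serve only to show that two 64-bit units exactly fill a 128-bit block.

```python
# Illustrative sketch (hypothetical values): merging two 64-bit data units
# into one 128-bit cache-line block for a single cache update.

BLOCK_BITS = 128
UNIT_BITS = 64

def merge_units(low_unit, high_unit):
    """Pack two 64-bit units into one 128-bit block value."""
    assert low_unit < (1 << UNIT_BITS) and high_unit < (1 << UNIT_BITS)
    return (high_unit << UNIT_BITS) | low_unit

block = merge_units(0x1122334455667788, 0x99AABBCCDDEEFF00)
# Both units are recoverable from the single 128-bit block.
assert block >> UNIT_BITS == 0x99AABBCCDDEEFF00
assert block & ((1 << UNIT_BITS) - 1) == 0x1122334455667788
assert block < (1 << BLOCK_BITS)
```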
As to claim 13, it recites substantially the same limitations as in claim 1, and is rejected for the same reasons set forth in the analysis of claim 1. Refer to “As to claim 1” presented earlier in this Office Action for details.
As to claim 14, it recites substantially the same limitations as in claim 2, and is rejected for the same reasons set forth in the analysis of claim 2. Refer to “As to claim 2” presented earlier in this Office Action for details.
As to claim 15, it recites substantially the same limitations as in claim 7, and is rejected for the same reasons set forth in the analysis of claim 7. Refer to “As to claim 7” presented earlier in this Office Action for details.
As to claim 16, it recites substantially the same limitations as in claim 8, and is rejected for the same reasons set forth in the analysis of claim 8. Refer to “As to claim 8” presented earlier in this Office Action for details.
As to claim 17, it recites substantially the same limitations as in claim 9, and is rejected for the same reasons set forth in the analysis of claim 9. Refer to “As to claim 9” presented earlier in this Office Action for details.
As to claim 18, it recites substantially the same limitations as in claim 10, and is rejected for the same reasons set forth in the analysis of claim 10. Refer to “As to claim 10” presented earlier in this Office Action for details.
As to claim 19, it recites substantially the same limitations as in claim 11, and is rejected for the same reasons set forth in the analysis of claim 11. Refer to “As to claim 11” presented earlier in this Office Action for details.
As to claim 20, it recites substantially the same limitations as in claim 12, and is rejected for the same reasons set forth in the analysis of claim 12. Refer to “As to claim 12” presented earlier in this Office Action for details.
Conclusion
5. Claims 1-20 are rejected as explained above.
6. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
7. Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHENG JEN TSAI whose telephone number is 571-272-4244. The examiner can normally be reached on Monday-Friday, 9-6.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Reginald Bragdon can be reached on 571-272-4204. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
/SHENG JEN TSAI/Primary Examiner, Art Unit 2139
February 3, 2026