Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on October 29, 2025, has been entered.
Claim Status
Claim 1 remains canceled. Claims 2, 11, and 20 have been amended. Claims 2-21 remain pending and are ready for examination.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 2-3, 5-13, and 15-21 are rejected under 35 U.S.C. 103 as being unpatentable over Loh et al. (US Publication No. 2013/0138892, "Loh") in view of Kim et al. (US Publication No. 2017/0168931, "Kim"), and further in view of Garcia Guirado (US Publication No. 2019/0079874, "Garcia").
Regarding claim 2, Loh teaches A memory module comprising: a main memory storing data; a cache memory to store at least a portion of the data from the main memory, the cache memory comprising a plurality of sets of cache lines each comprising a plurality of cache storage locations; (Loh paragraph [0043], Turning now to FIG. 3, a generalized block diagram of one embodiment of a computing system 300 utilizing a three-dimensional (3D) DRAM is shown. Circuitry and logic described earlier are numbered identically. The computing system 300 may utilize three-dimensional (3D) packaging, such as a System in Package (SiP) as described earlier. The computing system 300 may include a SiP 310. In one embodiment, the SiP 310 may include the processing unit 220 described earlier and a 3D DRAM 330 that communicate through low-latency interconnect 340. The in-package low-latency interconnect 340 may be horizontal and/or vertical with shorter lengths than long off-chip interconnects when a SiP is not used. Loh Figure 3; Reference #330, 340, 220. Note that the processing unit 220 is separated from the 3D DRAM 330 (i.e., memory module) via the interconnect 340 (in this case, a low-latency interconnect). Also see Loh paragraph [0074], the memory request may be sent to main memory. The main memory may include an off-chip non-integrated DRAM and/or an off-chip disk memory. If the tag comparisons determine a tag hit occurs (conditional block 516), then in block 520, read or write operations are performed on a corresponding cache line in the row buffer. The main memory device is used to store data, among various other functions. Note that the memory module containing the main memory is coupled to the interconnect, see Loh paragraph [0005], The 3D packaging, known as System in Package (SiP) or Chip Stack multi-chip module (MCM), saves space by stacking separate chips in a single package. Components within these layers communicate using on-chip signaling, whether vertically or horizontally. This signaling provides reduced interconnect signal delay over known two-dimensional planar layout circuits) and a memory controller coupled to the cache memory, wherein the memory controller is configured to: (Loh paragraph [0031], If a cache miss occurs, such as a requested block is not found in a respective one of the cache memory subsystems 124a-124b or in the shared cache memory subsystem 128, then a read request may be generated and transmitted to the memory controller 130. The memory controller 130 may translate an address corresponding to the requested block and send a read request to the off-chip DRAM 170 through the memory bus 150. The off-chip DRAM 170 may be filled with data from the off-chip disk memory 162 through the I/O controller and bus 160 and the memory bus 150. Loh paragraph [0037], The memory controller 130 may include control circuitry for interfacing to the memory channels and following a corresponding protocol. Additionally, the memory controller 130 may include request queues for queuing memory requests) compare first tag data from a received read request to second tag data from the cache memory, wherein the first tag data identifies a first cache line in a first set of cache lines in the cache memory, (Loh paragraphs [0011-0012], In one embodiment, a computing system includes a processing unit and an integrated dynamic random access memory (DRAM). Examples of the processing unit include a general-purpose microprocessor, a graphics processing unit (GPU), an accelerated processing unit (APU), and so forth.
The integrated DRAM may be a three-dimensional (3D) DRAM and may be included in a System-in-Package (SiP) with the processing unit. The processing unit may utilize the 3D DRAM as a cache. [0012] In various embodiments, the 3D DRAM may store both a tag array and a data array. Each row of the multiple rows in the memory array banks of the 3D DRAM may store one or more cache tags and one or more corresponding cache lines indicated by the one or more cache tags. In response to receiving a memory request from the processing unit, the 3D DRAM may perform a memory access according to the received memory request on a given cache line indicated by a cache tag within the received memory request. Performing the memory access may include a single read of a respective row of the multiple rows storing the given cache line. Rather than utilizing multiple DRAM transactions, a single, complex DRAM transaction may be used to reduce latency and power consumption. The memory request can be considered a read request, which is associated with first tag data that is used to identify a specific portion of the cache, i.e., the "given cache line", as stated in the reference) and if the second tag data matches the first tag data, initiate an action with respect to the first cache line in the cache memory (Loh paragraphs [0060-0061], A sequence of steps 1-7 is shown in FIG. 4 for accessing tags, status information and data corresponding to cache lines stored in a 3D DRAM. When the memory array bank 430 is used as a cache storing both a tag array and a data array within a same row, an access sequence different from a sequence utilizing steps 1-7 for a given row of the rows 432a-432k may have a large latency. For example, a DRAM access typically includes an first activation or opening stage, a stage that copies the contents of an entire row into the row buffer, a tag read stage, a tag comparison stage, a data read or write access stage that includes a column access, a first precharge or closing stage, a second activation or opening stage, a stage that copies the contents of the entire row again into the row buffer, a tag read stage, a tag comparison stage, an update stage for status information corresponding to the matching tag, and a second precharge or closing stage. The two separate tags are compared to one another. If the two tags match, then the memory access request that was previously described may be executed, which can involve a plurality of actions as well as cache lines, see Loh paragraph [0061], Continuing with the access steps within the memory array bank 430, one or more additional precharge and activation stages may be included after each access of the row buffer if other data stored in other rows are accessed in the meantime. Rather than utilize multiple DRAM transactions for a single cache access, the sequence of steps 1-7, may be used to convert a cache access into a single DRAM transaction. Each of the different DRAM operations, such as activation/open, column access, read, write, and precharge/close, has a different respective latency), the first set of cache lines storing the first cache line identified by the first tag data from the received read request (Loh paragraphs [0011-0012], quoted above; the memory request can be considered a read request associated with first tag data that identifies a specific portion of the cache, i.e., the "given cache line").
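For illustration of the tag-comparison flow mapped above, the following is a minimal sketch of a set-associative lookup in which first tag data derived from a read request is compared against second tag data stored in the cache, and an action is initiated on the matching cache line. All names, sizes, and structure are illustrative assumptions, not drawn from Loh.

```python
# Minimal sketch (not Loh's circuitry): a set-associative lookup in which tag
# data carried by a read request is compared against tag data stored in the
# cache, and an action is initiated on the matching cache line.

SETS = 4          # number of sets of cache lines
WAYS = 2          # cache storage locations per set
LINE_BYTES = 64   # bytes per cache line

class CacheLine:
    def __init__(self):
        self.valid = False
        self.tag = None
        self.data = bytes(LINE_BYTES)

cache = [[CacheLine() for _ in range(WAYS)] for _ in range(SETS)]

def split_address(addr):
    """Derive (first tag data, set index) from a request address."""
    set_index = (addr // LINE_BYTES) % SETS
    tag = addr // (LINE_BYTES * SETS)
    return tag, set_index

def lookup(addr):
    """Compare request tag against stored tags; return the hit line or None."""
    first_tag, set_index = split_address(addr)
    for line in cache[set_index]:                # second tag data read from the cache
        if line.valid and line.tag == first_tag:
            return line                          # tag hit: initiate action on this line
    return None                                  # tag miss: forward to main memory

if __name__ == "__main__":
    tag, idx = split_address(0x1040)
    cache[idx][0].valid, cache[idx][0].tag = True, tag   # preload one line
    print("hit" if lookup(0x1040) else "miss")           # -> hit
    print("hit" if lookup(0x2040) else "miss")           # different tag -> miss
```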
Loh does not teach wherein the second tag data is stored in a second cache line of a different set of cache lines in the cache memory than the first set of cache lines, wherein the different set of cache lines comprises the second cache line storing cache tag data for the plurality of sets of cache lines including the first set of cache lines, and wherein the different set of cache lines comprises at least one additional cache line storing cache data.
However, Kim teaches wherein the second tag data is stored in a second cache line of a different set of cache lines in the cache memory than the first set of cache lines, wherein the different set of cache lines comprises the second cache line storing cache tag data for the plurality of sets of cache lines including the first set of cache lines (Kim paragraph [0010], at least one dynamic random access memory (DRAM) used as a cache of the at least one nonvolatile memory, and a memory module control device configured to control the nonvolatile memory controller and the at least one DRAM and configured to output tag information to the at least one DRAM. The at least one DRAM stores a tag corresponding to cache data and compares the stored tag with the tag information from the memory module control device to determine whether a hit/miss is generated with respect to the cache, through the tag comparison. The cache lines of a particular DRAM may be used to store cache tag data for its own cache lines as well as cache lines of different sets (i.e., different DRAMs), such as seen in Figure 2, Ref #331, with a designated DRAM for storing cache tag data. Also see Kim paragraph [0049], At least one DRAM 331 of the plurality of first DRAMs 330-1 and the plurality of second DRAMs 330-2 may store a tag corresponding to a cache line and compare stored tag information with input tag information. The remaining DRAMs may be implemented to store cache data corresponding to the tag. Hereinafter, a DRAM, which stores tags, may be referred to as “tag DRAM”, and each of the remaining DRAMs may be referred to as “data DRAM”. The at least one DRAM 331 may be a tag DRAM. DRAM 332 may be a data DRAM).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to combine the teachings of Loh with those of Kim. Kim teaches a DRAM that can store cache tag information for its own cache lines as well as for other cache lines, which can improve the efficiency of accessing the tag information for various operations, such as hit/miss detection (Kim paragraphs [0044-0045], The at least one cache DRAM 330 may perform a cache function of the at least one nonvolatile memory 310. The at least one cache DRAM 330 may store a tag corresponding to cache data or generate a match signal indicating a cache hit or a cache miss through tag comparison. The computing system 10 according to some embodiments of the inventive concepts may use the nonvolatile memory module 300 having the cache DRAM 330 as a working memory, thereby achieving a lower cost and higher capacity and performance than those of a conventional computing system).
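The tag DRAM/data DRAM arrangement attributed to Kim above can be pictured with the following sketch, in which one dedicated region stores the tags consulted for hit/miss determination while separate regions store the cache data. The layout and names are assumptions made for illustration, not Kim's implementation.

```python
# Illustrative sketch only (names and layout are assumptions, not Kim's
# implementation): one "tag DRAM" region holds the tags for cache lines that
# live in separate "data DRAM" regions, and hit/miss is decided by comparing
# the incoming tag information against the stored tag.

NUM_DATA_SETS = 8

tag_dram = {}                                          # set_index -> stored tag
data_dram = [bytes(64) for _ in range(NUM_DATA_SETS)]  # cache data kept elsewhere

def access(set_index, incoming_tag):
    """Return (is_hit, data). The tag store is consulted, not the data store."""
    stored_tag = tag_dram.get(set_index)
    if stored_tag == incoming_tag:              # comparison performed at the tag store
        return True, data_dram[set_index]       # cache hit: data DRAM supplies the line
    return False, None                          # cache miss signaled to the controller

tag_dram[3] = 0xABC
hit, line = access(3, 0xABC)
print(hit)                   # True: tag comparison succeeded at the tag DRAM
print(access(3, 0xDEF)[0])   # False: tag mismatch -> miss
```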
Loh in view of Kim does not teach wherein the different set of cache lines comprises at least one additional cache line storing cache data.
However, Garcia teaches wherein the different set of cache lines comprises at least one additional cache line storing cache data (Garcia Fig. 1; see Ref #160, 165 and 170 in cache storage. The plurality of different cache lines can include cache tag data, as well as additional cache data, see Garcia paragraph [0024], In a second embodiment, there is provided a cache storage comprising cache lines to store data entries representing data which can be retrieved from the cache storage when a storage access instruction contains a storage identifier which corresponds with a tag associated with a cache line, [0025] wherein a said tag associated with a said cache line comprises an individual tag portion associated with the single said cache line, and a shareable tag portion which may be associated with a plurality of cache lines, said individual tag portion comprising a pointer to a shareable tag storage location comprising the shareable tag portion, [0026] the cache storage configured to: [0027] allow re-use of a selected shareable tag storage location and thus update a first shareable tag portion comprised therein to a second shareable tag portion; [0028] identify one or more cache lines associated with individual tag portions comprising a pointer to the selected shareable tag storage location; and [0029] set a given cache line status for each of the identified cache lines, wherein the given cache line status: [0030] a) allows a cache line to continue to be used in relation to a storage access instruction received before said given cache line status was set; and [0031] b) inhibits the cache line from being used in relation to a storage access instruction received after the given cache line status is set).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to combine the teachings of Loh and Kim with those of Garcia. Garcia teaches using a plurality of cache lines in a cache storage to access cache tag and cache storage data, which can allow more effective cache management, such as clearing cache lines or detecting particular subsets of cache lines (i.e., see Garcia paragraphs [0020-0021], In some examples, the method comprises: receiving a storage access instruction corresponding to data represented by a data entry stored in one of the identified cache lines; and signaling a cache miss based on said cache line having the given cache line status. In some examples, the method comprises, for each cache line of the cache storage, storing in a data structure: a first data value representing whether or not a respective cache line is associated with the selected shareable tag storage location; and a second data value representing whether or not the respective cache line is associated with a pending storage access instruction).
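The shareable tag portion described in the Garcia passages above can be illustrated by the following sketch, in which each cache line's individual tag portion carries a pointer into a table of shareable tag portions, and reuse of a shareable slot inhibits the lines that depend on it. Field names and bit layout are illustrative assumptions, not Garcia's design.

```python
# Hedged sketch of the shareable-tag idea attributed to Garcia above: each
# cache line keeps an individual tag portion plus a pointer into a small table
# of shareable tag portions; reusing a shareable slot marks dependent lines so
# later accesses treat them as misses. All names here are illustrative.

shareable_tags = {0: 0x7FFF_0000}               # slot -> shareable (upper) tag bits

class Line:
    def __init__(self, individual_tag, slot):
        self.individual_tag = individual_tag    # low tag bits, unique to this line
        self.slot = slot                        # pointer to a shareable tag slot
        self.stale = False                      # set when the slot is reused

lines = [Line(0x12, 0), Line(0x34, 0)]

def reuse_slot(slot, new_upper_bits):
    """Reuse a shareable tag slot and flag every line that points at it."""
    shareable_tags[slot] = new_upper_bits
    for ln in lines:
        if ln.slot == slot:
            ln.stale = True                     # inhibits use for future accesses

def matches(ln, full_tag):
    if ln.stale:                                # reused slot: behave as a miss
        return False
    return (shareable_tags[ln.slot] | ln.individual_tag) == full_tag

print(matches(lines[0], 0x7FFF_0012))  # True before the slot is reused
reuse_slot(0, 0x7FFF_1000)
print(matches(lines[0], 0x7FFF_0012))  # False after reuse
```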
Claim 11 is the corresponding method claim to module/system claim 2. It is rejected with the same references and rationale.
Regarding claim 3, Loh in view of Kim in further view of Garcia teaches The memory module of claim 2, wherein the cache memory comprises a set associative cache implemented on a dynamic random access memory (DRAM) device (Loh paragraph [0008], Utilizing DRAM access mechanisms while storing and accessing the tags and data of the additional cache in the integrated DRAM dissipates a lot of power. In addition, these mechanisms consume a lot of bandwidth, especially for a highly associative on-package cache, and consume too much time as the tags and data are read out in a sequential manner. Therefore, the on-package DRAM provides a lot of extra data storage, but cache and DRAM access mechanisms are inefficient).
Claim 12 is the corresponding method claim to module/system claim 3. It is rejected with the same references and rationale.
Regarding claim 5, Loh in view of Kim in further view of Garcia teaches The memory module of claim 2, wherein the memory controller is further configured to: send the second tag data read from the cache memory to a cache controller coupled to the memory module over an interconnect (Kim claim 11, The nonvolatile memory module of claim 9, wherein, when a read request is received, a second match signal indicating the cache hit corresponding to the read request is generated by reading a second tag from the tag array and comparing the read second tag with second tag information received with the second tag, and second cache data corresponding to the read request is read from the data array in response to the second match signal. Second tag data is read from the cache memory to the controller, which is used for all read operations, see Kim paragraph [0056], FIG. 3 is a block diagram for conceptually illustrating the tag DRAM 331 and the data DRAM 332 of FIG. 2. Referring to FIG. 3, the tag DRAM 331 and the data DRAM 332 may include the same elements, for example, memory cell arrays 331-1 and 332-1, tag comparison circuits 331-5 and 332-5, and multiplexers (Mux Circuit) 331-6 and 332-6. In some embodiments, each of the tag DRAM 331 and the data DRAM 332 may include a dual port DRAM. The dual port DRAM may include input/output ports respectively corresponding to different kinds of devices, for example, data buffer/nonvolatile memory controller. A data path of the dual port DRAM may be connected to a first external device, for example, a data buffer, or a second external device, for example, a nonvolatile memory controller, based on the selection of the multiplexer, that is, multiplexers, 331-6 or 332-6).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to combine the teachings of Loh with those of Kim and Garcia. Kim teaches sending second tag data from the cache memory to the cache controller, to later be used in a comparison, which allows the memory system to compare and contrast the two separate tags for improving memory reliability and consistency (Kim paragraphs [0049-0050], At least one DRAM 331 of the plurality of first DRAMs 330-1 and the plurality of second DRAMs 330-2 may store a tag corresponding to a cache line and compare stored tag information with input tag information. The remaining DRAMs may be implemented to store cache data corresponding to the tag. Hereinafter, a DRAM, which stores tags, may be referred to as "tag DRAM", and each of the remaining DRAMs may be referred to as "data DRAM". The at least one DRAM 331 may be a tag DRAM. DRAM 332 may be a data DRAM. [0050] In some embodiments, the tag DRAM 331 may store a 4-byte tag. In some embodiments, the tag DRAM 331 may store tags in a 2-way, 1:8 direct mapping scheme. The tag may include location information about cache data stored in the data DRAMs and dirty/clear information indicating validity of cache data. In some embodiments, the tag may include an error correction value for error correction. Thus, the tag DRAM 331 may further include an error correction circuit for correcting an error. The memory module control device 350 may provide tag information to the DRAM 330-2).
Claim 13 is the corresponding method claim to module/system claim 5. It is rejected with the same references and rationale.
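The claim 5 limitation mapped above can be illustrated with the following sketch, in which the memory module reads the stored second tag data and sends it over an interconnect to a cache controller that performs the comparison on its side. The interfaces shown are assumptions for illustration, not Kim's dual-port design.

```python
# Hedged sketch (assumed interfaces, not Kim's dual-port design): the memory
# module reads the stored second tag data and sends it over an interconnect to
# a cache controller, which performs the comparison on its side.

def module_read_tag(tag_store, set_index):
    """Memory-module side: read the stored tag and place it on the interconnect."""
    return tag_store[set_index]                 # second tag data leaves the module

def controller_compare(received_tag, first_tag):
    """Cache-controller side: compare the received tag with the request tag."""
    return received_tag == first_tag

tag_store = {2: 0x55}
interconnect_payload = module_read_tag(tag_store, 2)
print(controller_compare(interconnect_payload, 0x55))  # True -> hit
```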
Regarding claim 6, Loh in view of Kim in further view of Garcia teaches The memory module of claim 2, wherein to initiate the action with respect to the first cache line, the memory controller is configured to: prepare a portion of the cache memory corresponding to the first cache line for a second read request to be subsequently received from a cache controller (Loh paragraph [0029], Each of the cache memory subsystems 124a-124b and 128 may include a cache memory, or cache array, connected to a corresponding cache controller. The cache memory subsystems 124a-124b and 128 may be implemented as a hierarchy of caches. Caches located nearer the processor cores 122a-122b (within the hierarchy) may be integrated into the processor cores 122a-122b, if desired. This level of the caches may be a level-one (L1) of a multi-level hierarchy. In one embodiment, the cache memory subsystems 124a-124b each represent L2 cache structures, and the shared cache memory subsystem 128 represents an L3 cache structure. In another embodiment, cache memory subsystems 114 each represent L1 cache structures, and shared cache subsystem 118 represents an L2 cache structure. Other embodiments are possible and contemplated. Multiple read requests can be sent from the cache controller, see Loh paragraph [0064], A cache tag may be used to determine which of the multiple cache lines are being accessed within a selected row. For example, in a 30-way set-associative cache organization, when the row 432a is selected, the cache tag values stored in the fields 434a-434d may be used to determine which one of the 30 cache lines stored in fields 438a-438d is being accessed. The cache tag stored in field 412 within the address 410 may be used in comparison logic to locate a corresponding cache line of the multiple cache lines stored in the row buffer 440).
Claim 15 is the corresponding method claim to module/system claim 6. It is rejected with the same references and rationale.
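The "preparing" recited in claim 6 can be pictured with the following sketch, under the assumption that preparation takes the form of activating the row holding the first cache line into a row buffer so that a subsequently received read request is served from the open row. This is an illustrative model, not the claimed controller logic.

```python
# Minimal sketch (an assumption about how "preparing" a line might look): after
# a tag hit, the row holding the cache line is activated into a row buffer so a
# second, subsequent read request is served quickly.

rows = {5: b"cache-line-bytes"}   # row index -> row contents
row_buffer = {"row": None, "data": None}

def prepare(row_index):
    """Activate/open the row ahead of an expected follow-up request."""
    row_buffer["row"] = row_index
    row_buffer["data"] = rows[row_index]

def read(row_index):
    """A later read is a fast row-buffer hit if the row was prepared."""
    if row_buffer["row"] == row_index:
        return row_buffer["data"]               # served from the open row
    prepare(row_index)                          # otherwise open it now (slow path)
    return row_buffer["data"]

prepare(5)             # done when the first tag comparison hits
print(read(5))         # subsequent read finds the row already open
```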
Regarding claim 7, Loh in view of Kim in further view of Garcia teaches The memory module of claim 2, wherein to initiate the action with respect to the first cache line, the memory controller is configured to: read the first cache line from the cache memory before a second read request is received from a cache controller (Loh paragraph [0062], During sequence 1, a memory request from a processing unit may be received by a 3D DRAM. The memory request may have traversed horizontal or vertical short low-latency interconnect routes available through a 3D integrated fabrication process. A portion of a complete address is shown as address 410. The fields 412 and 414 may store a cache tag and a page index, respectively. Other portions of the complete address may include one or more of a channel index, a bank index, a sub array index, and so forth to identify the memory array bank 430 within the 3D DRAM. During sequence 2, a given row of the rows 432a-432k may be selected from other rows by the page index 414. The first read request is sent and completed before the subsequent read requests are received and acted upon by the memory system).
Claim 16 is the corresponding method claim to module/system claim 7. It is rejected with the same references and rationale.
Regarding claim 8, Loh in view of Kim in further view of Garcia teaches The memory module of claim 7, wherein to initiate the action with respect to the first cache line, the memory controller is further configured to: send the first cache line to the cache controller without receiving the second read request from the cache controller (Loh claims 17-19, The method as recited in claim 16, wherein performing the memory access with a single read of the respective row storing the given cache line includes updating the metadata based on the memory access. 18. The method as recited in claim 15, further comprising sending within the memory request the first cache tag in addition to a DRAM address identifying the respective row. 19. The method as recited in claim 15, wherein the DRAM is a three-dimensional (3D) integrated circuit (IC). The first cache line (i.e., the given cache line associated with the first single read), is sent to the controller individually, without any other future read requests. The interconnect is used to send the cache data).
Claim 17 is the corresponding method claim to module/system claim 8. It is rejected with the same references and rationale.
Regarding claim 9, Loh in view of Kim in further view of Garcia teaches The memory module of claim 7, wherein to initiate the action with respect to the first cache line, the memory controller is further configured to: receive the second read request from the cache controller, and send the first cache line to the cache controller (Loh paragraph [0060], A sequence of steps 1-7 is shown in FIG. 4 for accessing tags, status information and data corresponding to cache lines stored in a 3D DRAM. When the memory array bank 430 is used as a cache storing both a tag array and a data array within a same row, an access sequence different from a sequence utilizing steps 1-7 for a given row of the rows 432a-432k may have a large latency. For example, a DRAM access typically includes an first activation or opening stage, a stage that copies the contents of an entire row into the row buffer, a tag read stage, a tag comparison stage, a data read or write access stage that includes a column access, a first precharge or closing stage, a second activation or opening stage, a stage that copies the contents of the entire row again into the row buffer, a tag read stage, a tag comparison stage, an update stage for status information corresponding to the matching tag, and a second precharge or closing stage. A second read request can be sent via the cache controller to send a first cache line, also see Loh paragraph [0031], If a cache miss occurs, such as a requested block is not found in a respective one of the cache memory subsystems 124a-124b or in the shared cache memory subsystem 128, then a read request may be generated and transmitted to the memory controller 130. The memory controller 130 may translate an address corresponding to the requested block and send a read request to the off-chip DRAM 170 through the memory bus 150. The off-chip DRAM 170 may be filled with data from the off-chip disk memory 162 through the I/O controller and bus 160 and the memory bus 150).
Claim 18 is the corresponding method claim to module/system claim 9. It is rejected with the same references and rationale.
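Claims 8 and 9, treated above, recite two delivery modes for the early-read first cache line: sending it without receiving the second read request, and sending it in response to that request. The following sketch contrasts the two modes under assumed names; it is illustrative only, not the claimed controller behavior.

```python
# Sketch of the two delivery modes discussed for claims 8 and 9 above, under
# assumed names: "push" sends the already-read first cache line without
# waiting for a second read request; "pull" holds it until that request arrives.

import queue

line_ready = b"first-cache-line"      # read early, per claim 7
to_controller = queue.Queue()

def push_mode():
    """Claim 8 style: send immediately, no second request needed."""
    to_controller.put(line_ready)

def pull_mode(requests):
    """Claim 9 style: send only after the second read request is received."""
    req = requests.get()              # blocks until the cache controller asks
    if req == "read":
        to_controller.put(line_ready)

push_mode()
print(to_controller.get())            # line arrives unsolicited

pending = queue.Queue()
pending.put("read")                   # second read request from the controller
pull_mode(pending)
print(to_controller.get())            # line arrives in response
```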
Regarding claim 10, Loh in view of Kim in further view of Garcia teaches The memory module of claim 2, wherein the memory controller is further configured to: if the second tag data does not match the first tag data, return an indication of a cache miss to a cache controller (Loh paragraphs [0072-0074], In block 504, the processing unit may determine a given memory request misses within a cache memory subsystem within the processing unit. In block 506, the processing unit may send an address corresponding to the given memory request to an in-package integrated DRAM cache, such as the 3D DRAM. The address may include a non-translated cache tag in addition to a DRAM address translated from a corresponding cache address used within the processing unit to access on-chip caches. In block 508, control logic within the 3D DRAM may identify a given row corresponding to the address within the memory array banks in the 3D DRAM. In block 510, control logic within the 3D DRAM may activate and open the given row. In block 512, the contents of the given row may be copied and stored in a row buffer. In block 514, the tag information in the row buffer may be compared with tag information in the address. The steps described in blocks 506-512 may correspond to the sequences 1-4 described earlier regarding FIG. 4. If the tag comparisons determine a tag hit does not occur (conditional block 516), then in block 518, the memory request may be sent to main memory. The main memory may include an off-chip non-integrated DRAM and/or an off-chip disk memory. If the tag comparisons determine a tag hit occurs (conditional block 516), then in block 520, read or write operations are performed on a corresponding cache line in the row buffer. When the data being compared results in a difference (i.e., not matching), the memory can return a cache miss).
Claims 19 and 21 are the corresponding method and device claims to module/system claim 10. They are rejected with the same references and rationale.
Regarding claim 20, Loh teaches A device comprising: a cache memory comprising a plurality of sets of cache lines each comprising a plurality of cache storage locations; and a memory controller coupled to the cache memory, (Loh paragraphs [0011-0012], quoted above with respect to claim 2; the memory request can be considered a read request associated with first tag data that identifies a specific portion of the cache, i.e., the "given cache line") wherein the memory controller is configured to: compare first tag data from a received write request to second tag data from the cache memory, wherein the first tag data identifies a first cache line in a first set of cache lines in the cache memory, (Loh paragraphs [0060-0061], quoted above with respect to claim 2; the two separate tags are compared to one another, and if the two tags match, the memory access request may be executed, which can involve a plurality of actions as well as cache lines) if the second tag data matches the first tag data and the first cache line is not already marked as dirty, modify a dirty status indicator for the first cache line before a second write request is received from a cache controller coupled to the device over an interconnect; (Loh paragraphs [0060-0061], quoted above; note in particular the update stage for status information corresponding to the matching tag, performed within the single DRAM transaction) receive the second write request from the cache controller over the interconnect; and perform a write operation on the first cache line (Loh paragraph [0022], A corresponding cache fill line with the requested block may be conveyed from the off-chip DRAM 170 to a corresponding one of the cache memory subsystems 124a-124b in order to complete the original read or write request. The cache fill line may be placed in one or more levels of caches. In addition, the cache fill line may be placed within a corresponding set within the cache. If there are no available ways within the corresponding set, then typically a Least Recently Used (LRU) algorithm determines which way within the set is to have its data evicted and replaced by the cache fill line data. Typically, allocation refers to storing a cache fill line fetched from a lower level of the cache hierarchy into a way of a particular cache subsequent a cache miss to the particular cache.
Read/write operations may be received from a controller and can be implemented on a specific cache row/column/line, as seen in Loh paragraph [0067], During sequence 5, a given one of the multiple cache lines stored in the row buffer 440 is selected based on the tag comparison result. This column access is based on information stored in the received address and stored in the row buffer 440, such as the cache tags in fields 444a-444d and in the cache line state information stored in the field 446. The selected given cache line is read or written based on the received memory request. In one embodiment, an offset value may be stored in the received address and may be used to indicate a specific byte or word within the selected cache line to be accessed. The read or write operations operate directly on contents stored in the row buffer 440).
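The dirty-status sequence recited in claim 20 can be pictured with the following minimal sketch, under the assumption that the dirty indicator is updated upon the tag match, before the second write request carrying the data arrives. Names and structure are illustrative, not Loh's circuitry.

```python
# A minimal, assumed sketch of the dirty-status flow recited in claim 20 (not
# Loh's circuitry): on a tag match, the line's dirty indicator is updated
# before the second write request arrives, then the write itself is performed.

class Line:
    def __init__(self, tag):
        self.tag = tag
        self.dirty = False
        self.data = bytearray(64)

line = Line(tag=0x16)

def handle_first_write(first_tag):
    """Tag compare plus early dirty-bit update ahead of the data transfer."""
    if line.tag == first_tag and not line.dirty:
        line.dirty = True             # mark dirty before the second request

def handle_second_write(offset, payload):
    """The later write request carries the data actually written to the line."""
    line.data[offset:offset + len(payload)] = payload

handle_first_write(0x16)
handle_second_write(0, b"new-bytes")
print(line.dirty, bytes(line.data[:9]))   # True b'new-bytes'
```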
Loh does not teach wherein the second tag data is stored in a second cache line of a different set of cache lines in the cache memory than the first set of cache lines, the second cache line of the different set of cache lines comprising cache tag data for the plurality of sets of cache lines including the first set of cache lines, wherein the different set of cache lines comprises at least one additional cache line storing cache data.
However, Kim teaches wherein the second tag data is stored in a second cache line of a different set of cache lines in the cache memory than the first set of cache lines, the second cache line of the different set of cache lines comprising cache tag data for the plurality of sets of cache lines including the first set of cache lines, for the reasons given in the rejection of claim 2 above (Kim paragraphs [0010] and [0049]). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to combine the teachings of Loh with those of Kim for the same rationale stated in the rejection of claim 2 (Kim paragraphs [0044-0045]).
Loh in view of Kim does not teach wherein the different set of cache lines comprises at least one additional cache line storing cache data. However, Garcia teaches this limitation for the reasons given in the rejection of claim 2 above (Garcia Fig. 1, Ref #160, 165 and 170; Garcia paragraphs [0024-0031]). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to combine the teachings of Loh and Kim with those of Garcia for the same rationale stated in the rejection of claim 2 (Garcia paragraphs [0020-0021]).
Claims 4 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Loh in view of Kim and further in view of Garcia as applied to claims 2 and 11 above, and further in view of Nale (US Publication No. 2019/0102313, "Nale").
Regarding claim 4, Loh in view of Kim in further view of Garcia and further in view of Nale teaches The memory module of claim 2, wherein the read request comprises a tag read request identified by a corresponding identifier (Nale paragraph [0013], Various embodiments described herein include a memory controller that can store a copy of a portion of a critical chunk in a spare lane such that the entire critical chunk can be provided to a CPU using one half of a cache line. In some embodiments, the memory controller may utilize the spare lane to store an entire critical chunk in each half of a cache line. For example, the critical chunk may include metadata (e.g., a cache tag or a Read ID) stored in both halves of a cache line. In such examples, the metadata that is normally in the first half of the cache line may be copied or mapped to spare lane bits associated with the second half of the cache line and metadata that is normally in the second half of the cache line may be copied or mapped to spare lane bits associated with the first half of the cache line. The read/access request includes metadata (i.e., an identifier) which indicates whether or not a tag is present for the read request target).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to combine the teachings of Loh, Kim, and Garcia with those of Nale. Nale teaches using an identifier to indicate whether or not the first read request is a tag read request, which allows the system to more easily identify and classify when a tag read request occurs versus a normal read, resulting in improved memory performance (Nale paragraph [0013], quoted in part above; In embodiments, the memory controller may allow critical chunk operations to be used in 2LM and/or DDR-T2 environments when a spare lane is implemented by the memory controller and the DDR-T interface. In one or more embodiments, the memory controller may store the critical chunk in the same locations in the two halves of the cache line such that the memory controller does not have to multiplex (MUX) the data depending on which half comes first. In various embodiments, the memory controller may arrange the bits in a critical chunk separately, such as depending on which half of a cache line is requested. In these and other ways the memory controller may enable reliable and efficient critical chunk operation to achieve improved memory performance, such as by reducing the overall number of memory operations required to provide a critical chunk to a CPU, resulting in several technical effects and advantages).
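The identifier-bearing read request attributed to Nale above can be illustrated with the following sketch, in which request metadata (an assumed read ID plus a tag-read flag) lets the controller distinguish tag read requests from ordinary data reads. The field names are assumptions for illustration, not Nale's format.

```python
# Illustrative sketch only (field names assumed, not Nale's format): a read
# request carries metadata identifying it as a tag read, so the controller can
# route tag reads and ordinary data reads differently.

from dataclasses import dataclass

@dataclass
class ReadRequest:
    address: int
    read_id: int          # identifier for matching responses to requests
    is_tag_read: bool     # marks this request as a tag read request

def dispatch(req: ReadRequest) -> str:
    if req.is_tag_read:
        return f"tag read {req.read_id}: fetch tag metadata for {hex(req.address)}"
    return f"data read {req.read_id}: fetch cache line at {hex(req.address)}"

print(dispatch(ReadRequest(0x1040, read_id=7, is_tag_read=True)))
print(dispatch(ReadRequest(0x1040, read_id=8, is_tag_read=False)))
```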
Claim 14 is the corresponding method claim to module/system claim 4. It is rejected with the same references and rationale.
Response to Arguments
Applicant's arguments, see pages 1-4 (numbered pages 7-10), filed September 29, 2025, with respect to the rejection(s) of claim(s) 2-21 under 35 U.S.C. 103 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground(s) of rejection is made in view of Loh et al. (US Publication No. 2013/0138892, "Loh") in view of Kim et al. (US Publication No. 2017/0168931, "Kim"), and further in view of Garcia Guirado (US Publication No. 2019/0079874, "Garcia").
The teachings of Garcia have been added to the rejection under 35 U.S.C. 103 to address the newly added limitation of the independent claims. Garcia is used to explicitly disclose using a plurality of different/distinct cache lines which can comprise cache tag data corresponding to cache line data, as described in the rejection above. In light of the above references and rationale, the rejection under 35 U.S.C. 103 is maintained.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Usui (US Publication No. 2018/0004669, "Usui") teaches storing cache line tag data in a plurality of different cache lines (see Usui Fig. 3, showing sets of cache lines storing a plurality of cache tag data; also see Usui paragraphs [0061-0063], The data position information P indicates a position of a piece of cache data related to a piece of tag data concerned. The tag information is address data. Though each of the pieces of cache data Data0 to Data3 are shown as variable-length data in FIGS. 3 and 4, each piece of cache data is stored in fixed-length units in the present embodiment. That is, the cache data storage area Da is managed in divided fixed-length units. The data position information P is configured with a plurality of bits in the bitmap format indicating storage positions of a plurality of pieces of data divided in the fixed lengths. Each cache data/cache line can have associated tag data stored in a different cache line/area, see Usui paragraph [0070], As described above, each piece of cache data is divided in fixed lengths and stored in the tag/data memory 24. Each piece of tag data includes the data position information P indicating a position of each of a plurality of divided areas in the storage area ma. Note that a data configuration of the data position information P may not be in the bitmap format).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JONAH C KRIEGER whose telephone number is (571)272-3627. The examiner can normally be reached Monday - Friday 8 AM - 5 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kenneth Lo can be reached on (571) 272-9774. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/J.C.K./ Examiner, Art Unit 2136
/KENNETH M LO/ Supervisory Patent Examiner, Art Unit 2136