DETAILED ACTION
Claims 1-25 are presented for examination.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-4, 9-13, 15, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Paul et al. (US 2022/0206850) in view of Gupta et al. (US 9,880,933).
With respect to claim 1, Paul et al. teaches receiving, at a memory controller, a read request for first data (see paragraph 35; power management logic includes memory latency monitor logic that detects memory access latency associated with memory load requests issued by a central processing compute unit during runtime);
determining, by the memory controller, a usage of a first memory having the first data (see paragraph 35; the power management logic includes memory bandwidth monitoring logic that detects memory bandwidth levels associated with other of the plurality of compute units); and
sending, from the memory controller, the first data to a processing core (see paragraph 62; memory management hub 240 is bidirectionally connected to data fabric 518 for generating such memory accesses and receiving read data returned from the memory system).
Paul et al. does not explicitly teach conducting exactly one of: (i) sending, by the memory controller based on the usage of the first memory satisfying a latency criteria, a first memory read request for the first data to the first memory; (ii) sending, by the memory controller based on the usage of the first memory not satisfying the latency criteria, a second memory read request for the first data to a second memory, wherein the second memory is a cache for the first memory.
However, Gupta et al. teaches techniques for processing storage I/O (input/output) read requests in a system implementing a separate distributed buffer cache system and a separate distributed storage system for read requests exceeding a read latency threshold… the current pending time for a storage I/O read request may be evaluated in order to determine whether a latency time threshold for a storage I/O read request is exceeded, as indicated at 1220. A latency time threshold for a storage I/O read request may be set or determined based on a throughput or other service guarantee that may still be met if alternative means of obtaining data specified in a storage I/O read request are performed. For example, if the storage I/O read request does exceed the latency time threshold for the storage I/O read request, then the storage I/O read request may be sent directly to the distributed storage system (see column 33, lines 7-59).
It would have been obvious to a person having ordinary skill in the art to which said subject matter pertains, before the effective filing date of the claimed invention, to have modified the method taught by Paul et al. to include the above-mentioned features in order to improve the performance of the device (see Gupta, column 33, lines 7-22).
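For illustration only, the two-way routing recited in claim 1 may be sketched as follows; the function name and the percentage-based model of the latency criteria are hypothetical and not drawn from the record:

```python
# Hypothetical sketch of the routing recited in claim 1: exactly one read
# request is issued, selected by whether the usage of the first memory
# satisfies a latency criteria (modeled here as usage below a threshold).

def route_read(usage_pct: float, threshold_pct: float) -> str:
    """Return the memory that receives the read request for the first data."""
    if usage_pct < threshold_pct:
        # (i) latency criteria satisfied: read from the first memory
        return "first_memory"
    # (ii) latency criteria not satisfied: read from the second memory,
    # which serves as a cache for the first memory
    return "second_memory"
```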
With respect to claim 2, Paul et al. teaches requesting, by the memory controller, the usage of the first memory from a first memory module of the first memory (see paragraphs 54, 71 and 74; power management logic 302 provides memory latency monitoring for one or more of the compute units); and
providing, by the first memory module, the usage of the first memory to the memory controller (see paragraph 35; the power management logic includes memory bandwidth monitoring logic that detects memory bandwidth levels associated with other of the plurality of compute units).
Paul et al. does not teach wherein the first memory comprises a double data rate (“DDR”) memory.
However, Gupta et al. teaches wherein system memories 2020 may be implemented using any suitable memory technology (e.g., one or more of cache, static random-access memory (SRAM), DRAM, RDRAM, EDO RAM, DDR RAM) (see column 41, lines 59-67).
It would have been obvious to a person having ordinary skill in the art to which said subject matter pertains, before the effective filing date of the claimed invention, to have modified the method taught by Paul et al. to include the above-mentioned features in order to improve the performance of the device (see Gupta, column 33, lines 7-22).
With respect to claim 3, Paul et al. teaches wherein a first access latency of the first memory is less than a second access latency of the second memory when the latency criteria is satisfied (see paragraph 74; each of the bandwidth detectors provides bandwidth metrics such as bandwidth level data 550 to the power management logic 302. Similarly, the latency detector 540 provides measured information in the form of metrics such as count data indicating the number of late loads that were encountered shown as latency count data 552. In some implementations the SOC 306 has a similar latency and bandwidth detector arrangement as SOC 304 such that the socket bandwidth detector 530 block represents a latency detector in SOC 306 providing measured latency information from the SOC 306 for the PML, so that latency information from SOC 306 can be compared to latency information from latency detector 540).
With respect to claim 4, Paul et al. does not teach wherein the first access latency of the first memory is greater than the second access latency of the second memory at least some of a time when the latency criteria is not satisfied.
However, Gupta et al. teaches wherein the current pending time for a storage I/O read request may be evaluated in order to determine whether a latency time threshold for a storage I/O read request is exceeded, as indicated at 1220. A latency time threshold for a storage I/O read request may be set or determined based on a throughput or other service guarantee that may still be met if alternative means of obtaining data specified in a storage I/O read request are performed. For example, if the storage I/O read request does exceed the latency time threshold for the storage I/O read request, then the storage I/O read request may be sent directly to the distributed storage system, as indicated at 1230 (see column 33, lines 24-59).
It would have been obvious to a person having ordinary skill in the art to which said subject matter pertains, before the effective filing date of the claimed invention, to have modified the method taught by Paul et al. to include the above-mentioned features in order to improve the performance of the device (see Gupta, column 33, lines 7-22).
With respect to claim 9, Paul et al. teaches wherein the usage of the first memory comprises a transient memory bandwidth utilization of the first memory (see paragraph 84; each of the latency detector and bandwidth detectors serve as an independent optimization unit that measures actual latency data or data traffic metrics and/or predicted latency and/or predicted bandwidth usage, and looks for its own performance state).
With respect to claim 10, Paul et al. teaches wherein the latency criteria comprises a percentage usage of the first memory being less than a threshold percentage (see paragraph 90; for up hysteresis, a single observation over a programmable threshold (e.g., over a threshold of 50% of current DPM state bandwidth) is used, however any suitable threshold can be employed. For down hysteresis, multiple consecutive observations below the threshold of the next lower PState are used to prevent dithering).
With respect to claim 11, Paul et al. teaches wherein the threshold percentage is based on an increase in access latency that defines the latency criteria when the usage of the first memory exceeds the threshold percentage (see paragraphs 74 and 90; for up hysteresis, a single observation over a programmable threshold (e.g., over a threshold of 50% of current DPM state bandwidth) is used).
With respect to claim 12, Paul et al. teaches monitoring, for each of a plurality of usage values for the first memory, a respective access latency value (see paragraph 74; latency detector 540 provides measured information in the form of metrics such as count data indicating the number of late loads that were encountered shown as latency count data 552).
With respect to claim 13, Paul et al. teaches wherein the latency criteria is based on the monitored usage values and respective access latency values (see paragraph 74; latency detector 540 provides measured information in the form of metrics such as count data indicating the number of late loads that were encountered shown as latency count data 552. In some implementations the SOC 306 has a similar latency and bandwidth detector arrangement as SOC 304 such that the socket bandwidth detector 530 block represents a latency detector in SOC 306 providing measured latency information from the SOC 306 for the PML, so that latency information from SOC 306 can be compared to latency information from latency detector 540).
With respect to claim 15, Paul et al. does not teach determining, by the memory controller, whether the first data is in the second memory; and sending, by the memory controller, a first memory read request for the first data to the first memory if the first data is not in the second memory, even if the latency criteria is not satisfied.
However, Gupta et al. teaches wherein if the buffer cache node has a valid buffer cache entry (a hit), then the requested data pages 733 may be returned to the client-side/cache storage service driver. If the buffer cache node has an invalid cache entry (e.g., a miss, write-in-progress, or unavailable storage node), then storage service driver 736 for a buffer cache node 738 may send a read request 755 to specific storage nodes 735, which may return the requested data page 753 (see column 23, lines 34-47).
It would have been obvious to a person having ordinary skill in the art to which said subject matter pertains, before the effective filing date of the claimed invention, to have modified the method taught by Paul et al. to include the above-mentioned features in order to improve the performance of the device (see Gupta, column 33, lines 7-22).
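For illustration only, the presence check of claim 15 can be layered onto the same hypothetical routing model: a miss in the second memory forces the read to go to the first memory regardless of the latency criteria. All names are hypothetical and not part of the record:

```python
# Hypothetical sketch of claim 15: if the first data is absent from the
# second memory (the cache), the memory controller must read the first
# memory, even when the latency criteria is not satisfied.

def route_read_with_presence(usage_pct: float, threshold_pct: float,
                             in_second_memory: bool) -> str:
    if not in_second_memory:
        return "first_memory"    # miss: only the first memory has the data
    if usage_pct < threshold_pct:
        return "first_memory"    # latency criteria satisfied
    return "second_memory"       # criteria not satisfied: use the cache
```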
With respect to claim 17, Paul et al. teaches one or more non-transitory computer-readable media storing instructions, which when executed by one or more processors cause the one or more processors to conduct a method for routing requests to memory (see paragraph 104; non-transitory computer-readable storage medium), the method comprising:
receiving, at a memory controller, a read request for first data (see paragraph 35; power management logic includes memory latency monitor logic that detects memory access latency associated with memory load requests issued by a central processing compute unit during runtime);
determining, by the memory controller, a usage of a first memory having the first data (see paragraph 35; the power management logic includes memory bandwidth monitoring logic that detects memory bandwidth levels associated with other of the plurality of compute units); and
sending, from the memory controller, the first data to a processing core (see paragraph 62; memory management hub 240 is bidirectionally connected to data fabric 518 for generating such memory accesses and receiving read data returned from the memory system).
Paul et al. does not explicitly teach conducting exactly one of: (i) sending, by the memory controller based on the usage of the first memory satisfying a latency criteria, a first memory read request for the first data to the first memory; (ii) sending, by the memory controller based on the usage of the first memory not satisfying the latency criteria, a second memory read request for the first data to a second memory, wherein the second memory is a cache for the first memory.
However, Gupta et al. teaches techniques for processing storage I/O (input/output) read requests in a system implementing a separate distributed buffer cache system and a separate distributed storage system for read requests exceeding a read latency threshold… the current pending time for a storage I/O read request may be evaluated in order to determine whether a latency time threshold for a storage I/O read request is exceeded, as indicated at 1220. A latency time threshold for a storage I/O read request may be set or determined based on a throughput or other service guarantee that may still be met if alternative means of obtaining data specified in a storage I/O read request are performed. For example, if the storage I/O read request does exceed the latency time threshold for the storage I/O read request, then the storage I/O read request may be sent directly to the distributed storage system (see column 33, lines 7-59).
It would have been obvious to a person having ordinary skill in the art to which said subject matter pertains, before the effective filing date of the claimed invention, to have modified the method taught by Paul et al. to include the above-mentioned features in order to improve the performance of the device (see Gupta, column 33, lines 7-22).
Claims 5-8, 14, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Paul et al. (US 2022/0206850) and Gupta et al. (US 9,880,933) as applied to claims 1-3 and 12-13 above, and further in view of Doshi et al. (US 10,564,972).
With respect to claim 5, Paul et al. and Gupta et al. do not teach wherein the second memory comprises high bandwidth memory (“HBM”) memory and wherein a second memory module interfaces between the memory controller and the second memory to provide the first data to the memory controller.
However, Doshi et al. teaches wherein Node 3 includes two processors 1961-1962 coupled to standard double data rate (DDR) memory 1963-1964 and/or high bandwidth memory (HBM) 1965-1966 which may be mapped to a portion of the virtual address space shared by Node 0 and Node 3 (see column 29, lines 57-65).
It would have been obvious to a person having ordinary skill in the art to which said subject matter pertains, before the effective filing date of the claimed invention, to have modified the method taught by Paul et al. and Gupta et al. to include the above-mentioned features in order to improve performance (see Doshi, column 11, lines 56-57).
With respect to claim 6, Paul et al. and Gupta et al. do not teach wherein the read request is provided to the memory controller based on one or more lower level caches not having the first data.
However, Doshi et al. teaches wherein the access sequence would begin with core 114.sub.1 sending out a Read for Ownership (RFO) message and first “snooping” (i.e., checking) its local L1 and L2 caches to see if the requested cache line is currently present in either of those caches. In this example, producer 200 desires to access the cache line so its data can be modified, and thus the RFO is used rather than a Read request. The presence of a requested cache line in a cache is referred to as a “hit,” while the absence is referred to as a “miss.” This is done using well-known snooping techniques, and the determination of a hit or miss for information maintained by each cache identifying the addresses of the cache lines that are currently present in that cache (see column 8, lines 18-30).
It would have been obvious to a person having ordinary skill in the art to which said subject matter pertains, before the effective filing date of the claimed invention, to have modified the method taught by Paul et al. and Gupta et al. to include the above-mentioned features in order to improve performance (see Doshi, column 11, lines 56-57).
With respect to claim 7, Paul et al. and Gupta et al. do not teach wherein the one or more lower level caches comprise a level one cache, a level two cache, and a level three cache.
However, Doshi et al. teaches wherein in addition to snooping a core's local L1 and L2 caches, the core will also snoop L3 cache 108. If the processor employs an architecture under which the L3 cache is inclusive, meaning that a cache line that exists in L1 or L2 for any core also exists in the L3, the core knows the only valid copy of the cache line is in system memory if the L3 snoop results in a miss (see column 8, lines 39-51).
It would have been obvious to a person having ordinary skill in the art to which said subject matter pertains, before the effective filing date of the claimed invention, to have modified the method taught by Paul et al. and Gupta et al. to include the above-mentioned features in order to improve performance (see Doshi, column 11, lines 56-57).
With respect to claim 8, Paul et al. and Gupta et al. do not teach wherein a first access latency of the one or more lower level caches is at least about an order of magnitude less than a second access latency of the first memory and a third access latency of the second memory.
However, Doshi et al. teaches discussion with respect to movement from L1 to LLC is applicable to other movement such as L1 to L2, L2 to L3, MLC to LLC, etc. The CLDEMOTE instruction allows the software to provide application-level knowledge to hardware for optimizations. By proactively pushing data to the LLC that is closer to the consumer, the communication latency is reduced by more than 2x (see column 11, lines 49-56).
It would have been obvious to a person having ordinary skill in the art to which said subject matter pertains, before the effective filing date of the claimed invention, to have modified the method taught by Paul et al. and Gupta et al. to include the above-mentioned features in order to improve performance (see Doshi, column 11, lines 56-57).
With respect to claim 14, Paul et al. and Gupta et al. do not teach dynamically modifying the latency criteria based on the monitored usage values and the respective access latency values.
However, Doshi et al. teaches wherein the choice of the type of memory tier on which to perform these auto-writeback operations may be dynamic. For example, the decision may be determined on the basis of a configuration parameter initialized at system restart, set by a Machine State Register (MSR) during system operation, and/or based on threshold values such as a bandwidth or average latency to the target memory tier (see column 29, lines 18-27).
It would have been obvious to a person having ordinary skill in the art to which said subject matter pertains, before the effective filing date of the claimed invention, to have modified the method taught by Paul et al. and Gupta et al. to include the above-mentioned features in order to improve performance (see Doshi, column 11, lines 56-57).
With respect to claim 16, Paul et al. and Gupta et al. do not teach determining, by the memory controller, whether a data line of the second memory is clean; and sending, by the memory controller, a second memory read request for the first data to the second memory if the data line is not clean, even if the latency criteria is satisfied.
However, Doshi et al. teaches wherein demotions within the L1, L2 and LLC caches are handled using the following techniques. For example, in one embodiment, if the cache line is hosted in the L1 cache and is being demoted to the L2 cache, the core 1901a stores the line in a clean mode in the L2 cache and generates a message to the cache/memory management circuitry 2020-2022 (see column 31, lines 20-39).
It would have been obvious to a person having ordinary skill in the art to which said subject matter pertains, before the effective filing date of the claimed invention, to have modified the method taught by Paul et al. and Gupta et al. to include the above-mentioned features in order to improve performance (see Doshi, column 11, lines 56-57).
Claims 18-25 are rejected under 35 U.S.C. 103 as being unpatentable over Paul et al. (US 2022/0206850) in view of Gupta et al. (US 9,880,933) and Doshi et al. (US 10,564,972).
With respect to claim 18, Paul et al. teaches monitoring usage information of a memory (see paragraph 35; the power management logic includes memory bandwidth monitoring logic that detects memory bandwidth levels associated with other of the plurality of compute units);
monitoring a read access latency of the memory, the read access latency being associated with the usage information (see paragraph 35; power management logic includes memory latency monitor logic that detects memory access latency associated with memory load requests issued by a central processing compute unit during runtime);
determining a usage of a first memory having first data (see paragraph 35; the power management logic includes memory bandwidth monitoring logic that detects memory bandwidth levels associated with other of the plurality of compute units); and
sending the first data to a processing core (see paragraph 62; memory management hub 240 is bidirectionally connected to data fabric 518 for generating such memory accesses and receiving read data returned from the memory system).
Paul et al. does not teach determining a usage threshold based at least in part on the usage information and the read access latency; and conducting exactly one of: (i) sending, based on the usage of the first memory not satisfying the usage threshold, a first memory read request for the first data to the first memory; (ii) sending, based on the usage of the first memory satisfying the usage threshold, a second memory read request for the first data to a second memory.
However, Gupta et al. teaches techniques for processing storage I/O (input/output) read requests in a system implementing a separate distributed buffer cache system and a separate distributed storage system for read requests exceeding a read latency threshold… the current pending time for a storage I/O read request may be evaluated in order to determine whether a latency time threshold for a storage I/O read request is exceeded, as indicated at 1220. A latency time threshold for a storage I/O read request may be set or determined based on a throughput or other service guarantee that may still be met if alternative means of obtaining data specified in a storage I/O read request are performed. For example, if the storage I/O read request does exceed the latency time threshold for the storage I/O read request, then the storage I/O read request may be sent directly to the distributed storage system (see column 33, lines 7-59).
It would have been obvious to a person having ordinary skill in the art to which said subject matter pertains, before the effective filing date of the claimed invention, to have modified the method taught by Paul et al. to include the above-mentioned features in order to improve the performance of the device (see Gupta, column 33, lines 7-22).
Paul et al. and Gupta et al. do not teach determining a usage threshold based at least in part on the usage information and the read access latency.
However, Doshi et al. teaches wherein the choice of the type of memory tier on which to perform these auto-writeback operations may be dynamic. For example, the decision may be determined on the basis of a configuration parameter initialized at system restart, set by a Machine State Register (MSR) during system operation, and/or based on threshold values such as a bandwidth or average latency to the target memory tier (see column 29, lines 18-27).
It would have been obvious to a person having ordinary skill in the art to which said subject matter pertains, before the effective filing date of the claimed invention, to have modified the method taught by Paul et al. and Gupta et al. to include the above-mentioned features in order to improve performance (see Doshi, column 11, lines 56-57).
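For illustration only, the dynamic usage threshold of claim 18 may be sketched as being derived from monitored (usage, latency) observations; the function name, sample layout, and latency limit are hypothetical and not drawn from the record:

```python
# Hypothetical sketch of claim 18: derive a usage threshold from monitored
# usage information and the read access latency associated with it.

def derive_usage_threshold(samples, latency_limit_ns, current_threshold):
    """samples: (usage_pct, read_latency_ns) pairs gathered by monitoring.

    The threshold becomes the lowest observed usage at which the read
    access latency exceeded the limit; absent such an observation, the
    current threshold is kept.
    """
    over_limit = [usage for usage, latency in samples
                  if latency > latency_limit_ns]
    return min(over_limit) if over_limit else current_threshold
```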
With respect to claim 19, Paul et al. teaches wherein: satisfying the usage threshold is based at least in part on the usage of the first memory being above the usage threshold (see paragraph 90; for up hysteresis, a single observation over a programmable threshold (e.g., over a threshold of 50% of current DPM state bandwidth) is used, however any suitable threshold can be employed. For down hysteresis, multiple consecutive observations below the threshold of the next lower PState is used to prevent dithering); and
not satisfying the usage threshold is based at least in part on the usage of the first memory being below the usage threshold (see paragraph 90; for up hysteresis, a single observation over a programmable threshold (e.g., over a threshold of 50% of current DPM state bandwidth) is used, however any suitable threshold can be employed. For down hysteresis, multiple consecutive observations below the threshold of the next lower PState is used to prevent dithering).
With respect to claim 20, Paul et al. and Gupta et al. do not teach wherein: the usage threshold is based at least in part on an increase of the read access latency from a low value within a latency range to a high value within the latency range.
However, Doshi et al. teaches wherein there is a lot of VM and network stack related software overhead involved in this case that prevents the packet throughput from reaching the bandwidth upper bound of the host platform's memory system (see column 6, lines 10-18).
It would have been obvious to a person having ordinary skill in the art to which said subject matter pertains, before the effective filing date of the claimed invention, to have modified the method taught by Paul et al. and Gupta et al. to include the above-mentioned features in order to improve performance (see Doshi, column 11, lines 56-57).
With respect to claim 21, Paul et al. does not teach determining a first read access latency value at a first time based at least in part on monitoring the read access latency of the memory, wherein the usage threshold is based at least in part on the first read access latency value; determining a second read access latency value at a second time based at least in part on monitoring the read access latency of the memory.
However, Gupta et al. teaches wherein platform 300 may be configured to collect, monitor and/or aggregate a variety of storage service system operational metrics, such as metrics reflecting the rates and types of requests received from clients 350, bandwidth utilized by such requests, system processing latency for such requests, system component utilization (e.g., network bandwidth and/or storage utilization within the storage service system), rates and types of errors resulting from requests, characteristics of stored and requested data pages or records thereof (e.g., size, data type, etc.), or any other suitable metrics. In some embodiments such metrics may be used by system administrators to tune and maintain system components (see column 13, lines 46-67 and column 14, lines 1-12).
It would have been obvious to a person having ordinary skill in the art to which said subject matter pertains, before the effective filing date of the claimed invention, to have modified the method taught by Paul et al. to include the above-mentioned features in order to improve the performance of the device (see Gupta, column 33, lines 7-22).
Paul et al. and Gupta et al. do not teach adjusting the usage threshold based at least in part on the second read access latency value.
However, Doshi et al. teaches wherein the choice of the type of memory tier on which to perform these auto-writeback operations may be dynamic. For example, the decision may be determined on the basis of a configuration parameter initialized at system restart, set by a Machine State Register (MSR) during system operation, and/or based on threshold values such as a bandwidth or average latency to the target memory tier (see column 29, lines 18-27).
It would have been obvious to a person having ordinary skill in the art to which said subject matter pertains, before the effective filing date of the claimed invention, to have modified the method taught by Paul et al. and Gupta et al. to include the above-mentioned features in order to improve performance (see Doshi, column 11, lines 56-57).
With respect to claim 22, Paul et al. does not teach wherein: the usage threshold is pre-programmed.
However, Gupta et al. teaches wherein: the usage threshold is pre-programmed (see column 33, lines 24-59; a latency time threshold for a storage I/O read request may be set or determined based on a throughput or other service guarantee that may still be met if alternative means of obtaining data specified in a storage I/O read request are performed).
It would have been obvious to a person having ordinary skill in the art to which said subject matter pertains, before the effective filing date of the claimed invention, to have modified the method taught by Paul et al. to include the above-mentioned features in order to improve the performance of the device (see Gupta, column 33, lines 7-22).
With respect to claim 23, Paul et al. teaches wherein: the usage threshold is based at least in part on a hysteresis of the memory (see paragraph 90; programmable hysteresis thresholds are used to provide up and down hysteresis).
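For illustration only, the up/down hysteresis paraphrased from Paul's paragraph 90 can be sketched as follows; the class name, state labels, and down-count are hypothetical and not drawn from the record:

```python
# Hypothetical sketch of the hysteresis described in Paul (paragraph 90):
# a single observation over the threshold moves the state up, while several
# consecutive observations below it are required to move the state down,
# preventing dithering.

class HysteresisThreshold:
    def __init__(self, threshold: float, down_count: int = 3):
        self.threshold = threshold
        self.down_count = down_count
        self.state = "low"
        self._below = 0

    def observe(self, value: float) -> str:
        if value > self.threshold:
            self.state = "high"          # up hysteresis: one observation
            self._below = 0
        else:
            self._below += 1
            if self._below >= self.down_count:
                self.state = "low"       # down hysteresis: consecutive lows
        return self.state
```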
With respect to claim 24, Paul et al. does not teach wherein the usage information is associated with the first memory and the method further comprises: monitoring second usage information associated with the second memory, wherein the usage threshold is based at least in part on the second usage information.
However, Gupta et al. teaches wherein platform 300 may be configured to collect, monitor and/or aggregate a variety of storage service system operational metrics, such as metrics reflecting the rates and types of requests received from clients 350, bandwidth utilized by such requests, system processing latency for such requests, system component utilization (e.g., network bandwidth and/or storage utilization within the storage service system), rates and types of errors resulting from requests, characteristics of stored and requested data pages or records thereof (e.g., size, data type, etc.), or any other suitable metrics. In some embodiments such metrics may be used by system administrators to tune and maintain system components (see column 13, lines 57-67 and column 14, lines 1-12).
It would have been obvious to a person having ordinary skill in the art to which said subject matter pertains, before the effective filing date of the claimed invention, to have modified the method taught by Paul et al. to include the above-mentioned features in order to improve the performance of the device (see Gupta, column 33, lines 7-22).
With respect to claim 25, Paul et al. does not teach wherein: the usage of the first memory satisfying or not satisfying the usage threshold is based at least in part on a difference between the usage information and the second usage information.
However, Gupta et al. teaches wherein some examples of reserve compute capacity may include purchased opportunities to utilize compute nodes, excess capacity of nodes to deal with vagaries of demand fluctuation for compute service, and/or that capacity may be added in larger units than closely tracks the increased use of compute nodes (see column 37, lines 58-67 and column 38, lines 1-7).
It would have been obvious to a person having ordinary skill in the art to which said subject matter pertains, before the effective filing date of the claimed invention, to have modified the method taught by Paul et al. to include the above-mentioned features in order to improve the performance of the device (see Gupta, column 33, lines 7-22).
Response to Arguments
Applicant's arguments filed 01/22/2026 have been fully considered but they are not persuasive.
Applicant’s representative argues, in pages 7-11, that Paul et al. does not teach exactly one of: (i) sending, by the memory controller based on the usage of the first memory satisfying a latency criteria, a first memory read request for the first data to the first memory; (ii) sending, by the memory controller based on the usage of the first memory not satisfying the latency criteria, a second memory read request for the first data to a second memory, wherein the second memory is a cache for the first memory, as recited in claims 1 and 17-18. Applicant’s representative argues that the claims require that a "memory read request for the first data" is sent to only one memory "based on the usage of the first memory" either "satisfying" or "not satisfying a latency criteria."
In response: The examiner disagrees. The claim language as presented only requires one of the two memory read requests to be sent based on the usage of the memory either satisfying or not satisfying a latency criteria. Gupta et al. teaches techniques for processing storage I/O (input/output) read requests in a system implementing a separate distributed buffer cache system and a separate distributed storage system for read requests exceeding a read latency threshold… the current pending time for a storage I/O read request may be evaluated in order to determine whether a latency time threshold for a storage I/O read request is exceeded, as indicated at 1220. A latency time threshold for a storage I/O read request may be set or determined based on a throughput or other service guarantee that may be still met if alternative means of obtaining data specified in a storage I/O read request are performed. For example, if the storage I/O read request does exceed the latency time threshold for the storage I/O read request, then the storage I/O read request may be sent directly to the distributed storage system (i.e., the read request is sent directly to memory when the latency threshold is exceeded) (see column 33, lines 7-59).
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ARACELIS RUIZ whose telephone number is (571)270-1038. The examiner can normally be reached Monday-Friday 11:00am-7:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Reginald G. Bragdon can be reached at (571)272-4204. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ARACELIS RUIZ/ Primary Examiner, Art Unit 2139