Last updated: May 29, 2026
Application No. 18/439,647
Performance Optimization for Loading Data in Memory Services Configured on Storage Capacity of a Data Storage Device

Non-Final OA §102 §103 §112
Filed: Feb 12, 2024
Priority: Feb 15, 2023 (provisional 63/485,137)
Examiner: SAVLA, ARPAN P
Art Unit: 2137
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Micron Technology, Inc.
OA Round: 2 (Non-Final)
Interview Optional

Interview lift (+9.1%) is below the 15.0% threshold. A written response is recommended.
Based on 318 resolved cases, 2023–2026
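The recommendation above reads as a simple threshold rule. A minimal sketch follows, assuming the rule is "recommend an interview only when the lift reaches the threshold"; the function name and the treatment of the exact-threshold case are assumptions, and only the 9.1% and 15.0% figures come from the page.

```python
# Hypothetical helper: only the 9.1% lift and 15.0% threshold come from the page.
def recommend_response(interview_lift_pct: float, threshold_pct: float = 15.0) -> str:
    """Return the suggested response type for a given interview lift."""
    # Assumption: a lift exactly at the threshold counts as meeting it.
    if interview_lift_pct >= threshold_pct:
        return "interview"
    return "written response"

recommendation = recommend_response(9.1)
print(recommendation)
```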
Examiner Intelligence

SAVLA, ARPAN P
Career Allowance Rate: grants 58% of resolved cases (186 granted / 318 resolved), +3.5% vs TC avg
Interview Lift: +9.1% (moderate; resolved cases with vs. without interview)
Typical timeline: 4y 3m avg prosecution, 3 currently pending
Career history: 339 total applications across all art units
Statute-Specific Performance

§101: 3.2% (-36.8% vs TC avg)
§103: 79.1% (+39.1% vs TC avg)
§102: 11.2% (-28.8% vs TC avg)
§112: 4.7% (-35.3% vs TC avg)
The Tech Center average is an estimate. Based on career data from 318 resolved cases.
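As a quick consistency check on the statute figures above, the Tech Center average implied by each row can be recovered by subtracting the reported difference from the examiner's figure. The sketch below assumes each "vs TC avg" value is a percentage-point difference (examiner figure minus Tech Center average); the page does not state this, so treat it as an assumption.

```python
# Figures copied from the statute rows above: (examiner %, difference vs TC avg).
statute_stats = {
    "§101": (3.2, -36.8),
    "§103": (79.1, 39.1),
    "§102": (11.2, -28.8),
    "§112": (4.7, -35.3),
}

def implied_tc_average(rate_pct: float, delta_pct: float) -> float:
    # Assumption: delta = examiner figure - Tech Center average, in points.
    return round(rate_pct - delta_pct, 1)

implied = {statute: implied_tc_average(*vals) for statute, vals in statute_stats.items()}
print(implied)
```

Under that assumption every row points to the same Tech Center estimate, which suggests a single flat baseline is used for all four statutes.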
Office Action

§102 §103 §112
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. Claims 1-20 are pending in this action.

Claim Objections
The previous objections are withdrawn in view of the amendment.

Claim Rejections - 35 USC § 112
The previous 112(b) rejection is withdrawn in view of the amendment.


Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-4, 6, 7, 10, 11, 13, 16, and 18-20 are rejected under 35 U.S.C. 102(a)(1) and 102(a)(2) as being anticipated by HORWICH (US 20210374080 A1), hereinafter “Horwich”.
With regards to claim 1, Horwich teaches the following:
providing, by a memory sub-system over a connection from a host interface of the memory sub-system to a host system using a first protocol of cache-coherent memory access, memory services in a memory space addressable using memory addresses; (¶0076  As shown, in some embodiments, the dedicated bus 105 is a Computer Express Link (CXL) bus 305 and CMX device 100 is implemented as a CXL memory expansion device or a CXL card to be inserted into a CXL expansion slot of the computer system 101. ... CXL bus 305 is a high-speed CPU-to-device and CPU-to-memory interconnect or link based on the CXL protocol, including sub-protocols CXL.io, CXL.cache and CXL.memory, which can be used concurrently.; 
¶0077 As shown in FIG. 3A, in some embodiments, bus interface 122 is configured to interface with the CXL bus 305 and includes a physical layer 304 …;
¶0072 FIG. 1 is a block diagram of computer system 101 including a host computer (or host) 110 and a coherent memory expansion device (CMX device) 100 coupled to the host via a dedicated bus 105 (e.g., a CXL bus) …;
¶0073 FIG. 1 also shows that CMXC 120 includes a bus interface 122 configured to interface with the host via the dedicated bus 105, and control logic (e.g., logic circuitry) 125 coupled to the bus interface 122 and configurable to control communication of commands (or requests) and data between the CPU and local memory 130, and between local memory 130 and NVM 140, and to maintain coherency of the device cache 127 and other caches (e.g., CPU cache 113) in the computer system 101, and the coherency of a memory space mapped to at least part of the local memory 130.; 
¶0081 In some embodiments, a processor core 112-1, 112-2, . . . , 112-n may access physical memory by paging (e.g., having a page moved in and out of memory), where a page is the smallest partition of memory mapped by the processor from a virtual address to a physical address and may include multiple cache lines.;
¶0085 FIG. 4 illustrates various memory spaces in computer system 101 in accordance with some embodiments. As shown, the memory spaces include a coherent host memory space 410 provided by host memory 116, a coherent device memory space 420 provided by local memory 130 and a private memory space 450 also provided by local memory 130. Memory spaces 410 and 420 are in a coherent memory space 400 accessible by the host 110. In some embodiments, a controller memory buffer (CMB) including submission queues 432 and completion queues 434 occupies a CMB space 430 in the coherent device memory space 420. In some embodiments, coherent memory space 400 includes cache lines, e.g., cache lines 421 and cache lines 422, for storing demand and predictive data and other application data. In some embodiments, private memory space 450 is hidden from the host 110 so that it is accessible by control logic 125 but not by the CPU 112. Private memory space 450 can be used to store speculative read data, as discussed further below.
The claimed memory sub-system is interpreted to be the coherent memory expansion device (CMX device) 100. The CXL.mem or CXL.cache protocols are interpreted as the claimed first protocol of cache-coherent memory access. The CXL bus 305 is interpreted as the claimed host interface. The claimed host system is interpreted as host 110. The claimed connection to the host system is interpreted to be the bus (bus 105 and CXL bus 305), which is described as a high-speed CPU-to-device and CPU-to-memory interconnect. The memory spaces (host memory 410, device memory 420, and private memory 450) are interpreted to be possible memory spaces for the claimed memory space. The claimed memory services provided in the memory space are interpreted to include storing the submission queues and their data, including demand, predictive, and speculative data. The cache lines of the memory space are interpreted to be addressable using memory addresses (¶0081), which are interpreted to be the claimed memory addresses used to address memory services in the memory space.)
providing, by the memory sub-system over the connection using a second protocol of storage access, storage services in a storage space addressable using logical block addresses; (¶0007 In some embodiments, the memory expansion device is coupled to the host via a Computer Express Link (CXL) bus, wherein the interface circuitry provides a CXL interface between the control logic and the CXL bus, and wherein the first coherent destination memory space is accessible by the host using a CXL protocol.
¶0076 As shown, in some embodiments, the dedicated bus 105 is a Computer Express Link (CXL) bus 305 and CMX device 100 is implemented as a CXL memory expansion device or a CXL card to be inserted into a CXL expansion slot of the computer system 101. ... CXL bus 305 is a high-speed CPU-to-device and CPU-to-memory interconnect or link based on the CXL protocol, including sub-protocols CXL.io, CXL.cache and CXL.memory, which can be used concurrently.;
¶0087 In some embodiments, the standard fields 610 include a command field for a command 611 (e.g., an NVMe read or write command), one or more fields for payload specification 613 specifying a payload 630 in the NVM subsystem 340 associated with the command, and one or more fields for memory location specification 615 specifying cache lines in a coherent memory space where the payload is to be transferred to or from. In some embodiments, customizable fields 620 include one or more fields 620 for communicating one or more hints that can be used to improve performance during data transfers. In some embodiments, the payload 630 corresponds to a plurality of logical blocks at corresponding logical block addresses (LBA-1, LBA-2, . . . , LBA-n) in the NVM 130 and can be specified by an LBA of a starting logical block (e.g., LBA-1) and a number of logical blocks n starting at the starting logical block.;
¶0099 In some embodiments, each logical block in the payload corresponds to one or more of the pending cache lines. In some embodiments, the pending cache lines correspond to cache lines (e.g., cache lines 421) in a coherent destination memory space accessible by the host 110, which could be the coherent memory space 420 provided by local memory 130, or, when local memory 130 is not available or provided, the coherent memory space 410 corresponding to host memory 116.;
The logical block addresses correspond to the NVM 130 storage space, the local memory 130 storage space 420, or the host memory 116 storage space 410. The memory expansion device (the memory sub-system) is coupled to the host via a Computer Express Link (CXL) bus, and the memory space is accessible by the host using a CXL protocol (¶0007). The CXL.io protocol is interpreted as the claimed second protocol of storage access.)
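To make the distinction between the two claimed services concrete, the following is a minimal sketch of one device that exposes a byte-addressable memory space and a block-addressable storage space. It is a toy model only, not Horwich's design or the applicant's; the class name, method names, and the 512-byte block size are assumptions.

```python
BLOCK_SIZE = 512  # assumed logical block size in bytes

class DualServiceDevice:
    """Toy model: one device, two address spaces (illustrative only)."""

    def __init__(self, memory_bytes: int, num_blocks: int):
        self.memory = bytearray(memory_bytes)            # memory space: byte addresses
        self.blocks = [bytes(BLOCK_SIZE)] * num_blocks   # storage space: LBAs

    # Memory service, standing in for cache-coherent loads and stores.
    def load(self, address: int, length: int) -> bytes:
        return bytes(self.memory[address:address + length])

    def store(self, address: int, data: bytes) -> None:
        self.memory[address:address + len(data)] = data

    # Storage service, standing in for block read and write commands.
    def read_blocks(self, start_lba: int, count: int) -> bytes:
        return b"".join(self.blocks[start_lba:start_lba + count])

    def write_block(self, lba: int, data: bytes) -> None:
        if len(data) != BLOCK_SIZE:
            raise ValueError("a logical block must be exactly BLOCK_SIZE bytes")
        self.blocks[lba] = bytes(data)

device = DualServiceDevice(memory_bytes=4096, num_blocks=8)
device.store(0x10, b"hello")                    # addressed by memory address
device.write_block(3, b"\xab" * BLOCK_SIZE)     # addressed by logical block address
```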
receiving, in the memory sub-system, first information about a request to load a first data item from a first memory address in the memory space hosted in a first memory of the memory sub-system; (¶0087 FIG. 6 is a diagram illustrating a submission 600 from the host 110 in accordance with some embodiments. … In some embodiments, the standard fields 610 include a command field for a command 611 (e.g., an NVMe read or write command), one or more fields for payload specification 613 specifying a payload 630 in the NVM subsystem 340 associated with the command, and one or more fields for memory location specification 615 specifying cache lines in a coherent memory space where the payload is to be transferred to or from. In some embodiments, customizable fields 620 include one or more fields 620 for communicating one or more hints that can be used to improve performance during data transfers. In some embodiments, the payload 630 corresponds to a plurality of logical blocks at corresponding logical block addresses (LBA-1, LBA-2, . . . , LBA-n) in the NVM 130 and can be specified by an LBA of a starting logical block (e.g., LBA-1) and a number of logical blocks n starting at the starting logical block.;
¶0073 As shown in FIG. 1, the CMX device 100 includes a coherent memory expansion controller (CMXC) 120 (which includes cache memory or device cache 127), and may further include or has access to local memory 130 (e.g., DDR DRAM), and/or non-volatile memory (NVM) 140 (e.g., NAND Flash memory).; 
¶0081 In some embodiments, a processor core 112-1, 112-2, . . . , 112-n may access physical memory by paging (e.g., having a page moved in and out of memory), where a page is the smallest partition of memory mapped by the processor from a virtual address to a physical address and may include multiple cache lines.; 
¶0095 In some embodiments, as shown in FIG. 8B, in response to a second submission 802 for predictive read, control logic 125 is configured to transfer a payload 821 specified in submission 802 from the NVM 140 to the device memory 420. Subsequent read/write operations 823 related to the payload 821 can be between the CPU and the device memory 420 via the CXL.mem protocol.;
In light of the applicant’s specification (¶0218-¶0226), the first information is seen as an access hint. The claimed first information is interpreted as a hint in the customizable fields 620 in FIG. 6. The subsequent read/write operations related to the payload, which are sent between the CPU and the device memory 420 via the CXL.mem protocol (¶0095), are interpreted to be the claimed request to load a first data item. The claimed first data item is interpreted to be the plurality of logical blocks specified in the payload. The one or more fields for payload specification 613 specify a payload 630 in the NVM subsystem 340 associated with the command; the payload corresponds to a plurality of logical blocks at corresponding logical block addresses (LBA-1, LBA-2, . . . , LBA-n) in the NVM 130 and can be specified by an LBA of a starting logical block (e.g., LBA-1). The starting logical block (e.g., LBA-1) is interpreted to be a first memory address. The NVM is interpreted as the claimed first memory.)
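The submission layout relied on above (standard fields plus customizable hint fields, with the payload given as a starting LBA and a block count) can be pictured with a small data structure. This is a sketch under assumptions: the field names, the hint key, and the example values are hypothetical and are not taken from Horwich or from the NVMe specification.

```python
from dataclasses import dataclass, field

@dataclass
class Submission:
    # Standard fields, loosely modeled on fields 611, 613, and 615 of quoted ¶0087.
    command: str                 # e.g. "read" or "write"
    start_lba: int               # payload specification: starting logical block
    num_blocks: int              # payload specification: number of logical blocks n
    cache_line_addrs: list       # memory location specification
    # Customizable fields, loosely modeled on fields 620. Key names are hypothetical.
    hints: dict = field(default_factory=dict)

    def payload_lbas(self) -> list:
        """Expand (starting LBA, n) into the list of logical block addresses."""
        return list(range(self.start_lba, self.start_lba + self.num_blocks))

sub = Submission("read", start_lba=100, num_blocks=4,
                 cache_line_addrs=[0x1000, 0x1040], hints={"predictive": True})
print(sub.payload_lbas())  # [100, 101, 102, 103]
```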
configuring, by the memory sub-system, a portion of a second memory of the memory sub-system as a cache memory; (¶0078 As also shown in FIG. 3A, cache memory 127 may include a controller memory buffer (CMB) cache 327A and a demand read cache 327B, local memory 130 may include one or more DRAM modules or units, e.g., DRAM modules 130A, 130B, memory controller 126 may include one or more memory controllers, e.g., memory controllers 336A, 336B, coupled, respectively, to the one or more DRAM modules 130A, 130B, and NVM media controller 128 may include or is coupled to associated NVM command queues 328.;
¶0085 As shown, the memory spaces include a coherent host memory space 410 provided by host memory 116, a coherent device memory space 420 provided by local memory 130 and a private memory space 450 also provided by local memory 130. Memory spaces 410 and 420 are in a coherent memory space 400 accessible by the host 110. In some embodiments, a controller memory buffer (CMB) including submission queues 432 and completion queues 434 occupies a CMB space 430 in the coherent device memory space 420.;
The claimed second memory is interpreted to be the device cache 127 and the local memory 130. This includes memory spaces 420 and 450.)
retrieving, based on the first information and before the request being transmitted to the memory sub-system over the connection using the first protocol of cache-coherent memory access, the first data item from the first memory; and (¶0091 In some embodiments, the cNVMe controller 322 includes registers 322R corresponding, respectively, to the head/tail pointers in the CMB, and cache controller(s) 318 is further configured to alert the cNVMe controller 322 when a new submission is written into the CMB or mirrored in the CMB cache 327A by, for example, writing into a corresponding register 322R of the cNVMe controller 322. In response, cNVMe controller 322 would read the NVMe submission from the CMB cache 327A and start transfers of data to or from the NVM 140 by, for example, issuing NVM read or write commands to the NVM subsystem 340, and instructing the DMA channel 320 to move the data between the different memory and storage resources, in accordance with the NVMe submission.; 
¶0095 In some embodiments, as shown in FIG. 8B, in response to a second submission 802 for predictive read, control logic 125 is configured to transfer a payload 821 specified in submission 802 from the NVM 140 to the device memory 420. Subsequent read/write operations 823 related to the payload 821 can be between the CPU and the device memory 420 via the CXL.mem protocol.;
When a submission is a predictive read, the CMX device retrieves the payload from the NVM and caches the payload in the device memory. The submission (FIG. 6) is interpreted to have the first information (hints) in a field (element 620) of the submission. The statement “Subsequent read/write operations 823 related to the payload 821 can be between the CPU and the device memory 420 via the CXL.mem protocol” is interpreted to mean that the retrieving of the payload happens before the request (read/write operations) is transmitted to the memory sub-system over the connection using the first protocol of cache-coherent memory access, hence the wording “subsequent”.)
caching, in the cache memory and based on the first information, the first data item retrieved from the first memory. (¶0091 In some embodiments … In response, cNVMe controller 322 would read the NVMe submission from the CMB cache 327A and start transfers of data to or from the NVM 140 by, for example, issuing NVM read or write commands to the NVM subsystem 340, and instructing the DMA channel 320 to move the data between the different memory and storage resources, in accordance with the NVMe submission.; 
¶0095 In some embodiments, as shown in FIG. 8B, in response to a second submission 802 for predictive read, control logic 125 is configured to transfer a payload 821 specified in submission 802 from the NVM 140 to the device memory 420. Subsequent read/write operations 823 related to the payload 821 can be between the CPU and the device memory 420 via the CXL.mem protocol.;
When a submission is a predictive read, the CMX device retrieves the payload from the NVM and caches the payload in the device memory. The submission (FIG. 6) is interpreted to have the first information (hints) in a field (element 620) of the submission. The transfer of a payload 821 specified in submission 802 from the NVM 140 to the device memory 420 is interpreted as caching the first data item retrieved from the first memory.)
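The ordering the rejection depends on (hint first, transfer from the slow memory to the fast memory second, demand access last) can be sketched as follows. This is a toy model with assumed names and a dictionary standing in for each memory; it illustrates hint-driven prefetching in general and is not a reconstruction of Horwich's control logic.

```python
class PrefetchingDevice:
    """Toy model of hint-driven prefetch; names and structure are assumptions."""

    def __init__(self, nvm: dict):
        self.nvm = nvm            # slow first memory: LBA -> data
        self.device_memory = {}   # fast second memory used as a cache: LBA -> data
        self.nvm_reads = 0        # counts accesses to the slow memory

    def _read_nvm(self, lba):
        self.nvm_reads += 1
        return self.nvm[lba]

    def submit_predictive_read(self, start_lba: int, num_blocks: int) -> None:
        # The hint arrives first: move the payload from NVM into device memory.
        for lba in range(start_lba, start_lba + num_blocks):
            self.device_memory[lba] = self._read_nvm(lba)

    def load(self, lba: int):
        # Later demand access: served from device memory if already prefetched.
        if lba in self.device_memory:
            return self.device_memory[lba]
        data = self._read_nvm(lba)
        self.device_memory[lba] = data
        return data

dev = PrefetchingDevice({lba: f"block-{lba}" for lba in range(8)})
dev.submit_predictive_read(start_lba=2, num_blocks=2)
reads_after_hint = dev.nvm_reads
value = dev.load(3)   # no additional NVM read is needed for a prefetched block
```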
With regards to claim 2, Horwich teaches wherein the connection is according to a standard of computer express link (CXL). (¶0076 As shown, in some embodiments, the dedicated bus 105 is a Computer Express Link (CXL) bus 305 and CMX device 100 is implemented as a CXL memory expansion device or a CXL card to be inserted into a CXL expansion slot of the computer system 101. ... CXL bus 305 is a high-speed CPU-to-device and CPU-to-memory interconnect or link based on the CXL protocol, including sub-protocols CXL.io, CXL.cache and CXL.memory, which can be used concurrently.)
With regards to claim 3, Horwich further teaches:
receiving, in the memory sub-system, second information about a second memory address in the memory space; and (¶0087 FIG. 6 is a diagram illustrating a submission 600 from the host 110 in accordance with some embodiments. ... In some embodiments, the standard fields 610 include a command field for a command 611 (e.g., an NVMe read or write command), one or more fields for payload specification 613 specifying a payload 630 in the NVM subsystem 340 associated with the command, and one or more fields for memory location specification 615 specifying cache lines in a coherent memory space where the payload is to be transferred to or from. In some embodiments, customizable fields 620 include one or more fields 620 for communicating one or more hints that can be used to improve performance during data transfers.;
¶0082 In some embodiments, the SDM software 201 initiates data transfers into and out of the NVM 140 by writing submissions into one or more submission queues in a controller memory buffer (CMB) on the CMX device 100, the CMX device 100 indicates completion of the submissions by writing completions into one or more completion queues in the CMB.; 
It is within the scope of the invention for there to be more than one submission from the host, hence the submission queue. In light of the applicant’s specification (¶0218-¶0226), the second information is interpreted to be an access hint. The second information is interpreted as a hint in the customizable fields 620 of the submission 600 in FIG. 6. The one or more fields for memory location specification are interpreted to contain a form of memory addressing.)
writing, by the memory sub-system based on the second information, a second data item cached in the cache memory into the first memory for the caching of the first data item.
(¶0091 In some embodiments, the cNVMe controller 322 includes registers 322R corresponding, respectively, to the head/tail pointers in the CMB, and cache controller(s) 318 is further configured to alert the cNVMe controller 322 when a new submission is written into the CMB or mirrored in the CMB cache 327A by, for example, writing into a corresponding register 322R of the cNVMe controller 322. In response, cNVMe controller 322 would read the NVMe submission from the CMB cache 327A and start transfers of data to or from the NVM 140 by, for example, issuing NVM read or write commands to the NVM subsystem 340, and instructing the DMA channel 320 to move the data between the different memory and storage resources, in accordance with the NVMe submission.; 
Under the broadest reasonable interpretation, the claim is interpreted as writing a second data item from the cache memory in the second memory to the first memory (which was used when caching the first data item). The submission can transfer data to the NVM subsystem (containing the NVM) from different memory and storage resources (interpreted to include the cache memory), thereby “writing” the data.)
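The claim 3 limitation describes writing a cached item back to the first memory so that the first data item can be cached. A minimal sketch of that idea follows, assuming a fixed-capacity cache and an optional hint that names the item to write back; the class, the hint parameter, and the oldest-entry fallback are all assumptions made for illustration.

```python
from collections import OrderedDict

class WriteBackCache:
    """Toy fixed-capacity cache in front of a slower backing store (assumed design)."""

    def __init__(self, backing: dict, capacity: int):
        self.backing = backing        # stands in for the first memory (e.g. NVM)
        self.capacity = capacity
        self.lines = OrderedDict()    # stands in for the cache in the second memory

    def write_back(self, addr) -> None:
        # Write a cached item into the backing store and drop it from the cache.
        self.backing[addr] = self.lines.pop(addr)

    def cache(self, addr, evict_hint=None) -> None:
        if addr in self.lines:
            return  # already cached; keep the cached (possibly modified) copy
        if len(self.lines) >= self.capacity:
            # Make room first: prefer the hinted victim, else the oldest entry.
            victim = evict_hint if evict_hint in self.lines else next(iter(self.lines))
            self.write_back(victim)
        self.lines[addr] = self.backing[addr]

backing = {0: "a", 1: "b", 2: "c"}
cache = WriteBackCache(backing, capacity=2)
cache.cache(0)
cache.cache(1)
cache.lines[1] = "b-modified"    # the cached copy diverges from the backing store
cache.cache(2, evict_hint=1)     # item 1 is written back to make room for item 2
```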
With regards to claim 4, Horwich teaches wherein the first information and the second information are received, from the host system over the connection using the first protocol of cache-coherent memory access, in a queue configured in the second memory. (¶0085 Memory spaces 410 and 420 are in a coherent memory space 400 accessible by the host 110. In some embodiments, a controller memory buffer (CMB) including submission queues 432 and completion queues 434 occupies a CMB space 430 in the coherent device memory space 420. In some embodiments, coherent memory space 400 includes cache lines, e.g., cache lines 421 and cache lines 422, for storing demand and predictive data and other application data.
The CMB cache stores the submissions (interpreted to contain first/second information) in a submission queue inside a CMB space. The CMB cache is interpreted to be part of the second memory. The host sends the submission over the CXL bus (interpreted as the connection) which is capable of using the first protocol.)
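The quoted passages describe submission queues with head and tail pointers held in a buffer on the device. A generic ring-buffer sketch of such a queue is shown below; the slot layout, the one-empty-slot convention, and all names are assumptions rather than details from Horwich.

```python
class SubmissionQueue:
    """Toy ring buffer with head/tail pointers; the layout is an assumption."""

    def __init__(self, depth: int):
        self.slots = [None] * depth
        self.head = 0   # next entry the device consumes
        self.tail = 0   # next free slot the host writes

    def is_empty(self) -> bool:
        return self.head == self.tail

    def is_full(self) -> bool:
        # One slot stays unused so that full and empty are distinguishable.
        return (self.tail + 1) % len(self.slots) == self.head

    def host_submit(self, entry) -> None:
        if self.is_full():
            raise BufferError("submission queue is full")
        self.slots[self.tail] = entry
        self.tail = (self.tail + 1) % len(self.slots)

    def device_fetch(self):
        if self.is_empty():
            return None
        entry = self.slots[self.head]
        self.head = (self.head + 1) % len(self.slots)
        return entry

sq = SubmissionQueue(depth=4)
sq.host_submit({"command": "read", "hint": "first information"})
sq.host_submit({"command": "write", "hint": "second information"})
first = sq.device_fetch()
second = sq.device_fetch()
```

Entries come out in the order the host wrote them, so both pieces of information can travel through the same queue.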
With regards to claim 6, Horwich teaches wherein the first information includes a read command configured to be optional for execution by the memory sub-system;
(¶0087 FIG. 6 is a diagram illustrating a submission 600 from the host 110 in accordance with some embodiments. ... In some embodiments, the standard fields 610 include a command field for a command 611 (e.g., an NVMe read or write command), one or more fields for payload specification 613 specifying a payload 630 in the NVM subsystem 340 associated with the command, and one or more fields for memory location specification 615 specifying cache lines in a coherent memory space where the payload is to be transferred to or from. In some embodiments, customizable fields 620 include one or more fields 620 for communicating one or more hints that can be used to improve performance during data transfers.)
and when executed by the memory sub-system, the read command causes the retrieving of the first data item from the first memory into the cache memory. 
(¶0082 In some embodiments, the SDM software 201 initiates data transfers into and out of the NVM 140 by writing submissions into one or more submission queues in a controller memory buffer (CMB) on the CMX device 100, the CMX device 100 indicates completion of the submissions by writing completions into one or more completion queues in the CMB.; 
¶0091 In some embodiments, the cNVMe controller 322 includes registers 322R corresponding, respectively, to the head/tail pointers in the CMB, and cache controller(s) 318 is further configured to alert the cNVMe controller 322 when a new submission is written into the CMB or mirrored in the CMB cache 327A by, for example, writing into a corresponding register 322R of the cNVMe controller 322. In response, cNVMe controller 322 would read the NVMe submission from the CMB cache 327A and start transfers of data to or from the NVM 140 by, for example, issuing NVM read or write commands to the NVM subsystem 340, and instructing the DMA channel 320 to move the data between the different memory and storage resources, in accordance with the NVMe submission.;
The command 611 is interpreted to include a read command. The submission containing the read command and hints is used when transferring data from the NVM (interpreted as the first memory) to the different memory and storage resources (interpreted to include the cache memory).)
With regards to claim 7, Horwich teaches wherein the second information includes a write command configured to be optional for execution by the memory sub-system; (¶0087 FIG. 6 is a diagram illustrating a submission 600 from the host 110 in accordance with some embodiments. ... In some embodiments, the standard fields 610 include a command field for a command 611 (e.g., an NVMe read or write command), one or more fields for payload specification 613 specifying a payload 630 in the NVM subsystem 340 associated with the command, and one or more fields for memory location specification 615 specifying cache lines in a coherent memory space where the payload is to be transferred to or from. In some embodiments, customizable fields 620 include one or more fields 620 for communicating one or more hints that can be used to improve performance during data transfers.)
and when executed by the memory sub-system, the write command causes the writing of the second data item from the cache memory into the first memory. (¶0082  In some embodiments, the SDM software 201 initiates data transfers into and out of the NVM 140 by writing submissions into one or more submission queues in a controller memory buffer (CMB) on the CMX device 100, the CMX device 100 indicates completion of the submissions by writing completions into one or more completion queues in the CMB.; 
¶0091 In some embodiments, the cNVMe controller 322 includes registers 322R corresponding, respectively, to the head/tail pointers in the CMB, and cache controller(s) 318 is further configured to alert the cNVMe controller 322 when a new submission is written into the CMB or mirrored in the CMB cache 327A by, for example, writing into a corresponding register 322R of the cNVMe controller 322. In response, cNVMe controller 322 would read the NVMe submission from the CMB cache 327A and start transfers of data to or from the NVM 140 by, for example, issuing NVM read or write commands to the NVM subsystem 340, and instructing the DMA channel 320 to move the data between the different memory and storage resources, in accordance with the NVMe submission.
The command 611 is interpreted to include a write command. The submission containing the write command and hints is used when transferring data to the NVM (interpreted as the first memory) from the different memory and storage resources (interpreted to include the cache memory).)
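Claims 6 and 7 recite read and write commands that are "optional for execution" by the memory sub-system. One way to picture an optional command is as a request the device may skip without error. The sketch below assumes a simple policy (skip optional commands whenever the device is busy); the policy, the flag name, and the function are hypothetical and come from neither the application nor Horwich.

```python
def process_hint_commands(commands, busy: bool):
    """Split commands into executed and skipped lists.

    Assumption: an 'optional' command is one the device may drop without error,
    here whenever it reports itself busy. The policy is purely illustrative.
    """
    executed, skipped = [], []
    for cmd in commands:
        if cmd.get("optional", False) and busy:
            skipped.append(cmd)
        else:
            executed.append(cmd)
    return executed, skipped

cmds = [
    {"op": "read", "optional": True},    # prefetch-style hint
    {"op": "write", "optional": True},   # write-back-style hint
    {"op": "read", "optional": False},   # ordinary command, always executed
]
executed, skipped = process_hint_commands(cmds, busy=True)
```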
With regards to claim 10, Horwich teaches the following:
a host interface operable on a connection to a host system according to a storage access protocol and a cache-coherent memory access protocol; (¶0076 As shown, in some embodiments, the dedicated bus 105 is a Computer Express Link (CXL) bus 305 and CMX device 100 is implemented as a CXL memory expansion device or a CXL card to be inserted into a CXL expansion slot of the computer system 101. ... CXL bus 305 is a high-speed CPU-to-device and CPU-to-memory interconnect or link based on the CXL protocol, including sub-protocols CXL.io, CXL.cache and CXL.memory, which can be used concurrently.; 
¶0073 FIG. 1 also shows that CMXC 120 includes a bus interface 122 configured to interface with the host via the dedicated bus 105, and control logic (e.g., logic circuitry) 125 coupled to the bus interface 122 and configurable to control communication of commands (or requests) and data between the CPU and local memory 130, and between local memory 130 and NVM 140, and to maintain coherency of the device cache 127 and other caches (e.g., CPU cache 113) in the computer system 101, and the coherency of a memory space mapped to at least part of the local memory 130.; 
The CXL.mem or CXL.cache protocols are interpreted as the cache-coherent memory access protocol. The CXL.io is interpreted as the storage access protocol. The bus interface is interpreted as the host interface. The connection to a host system is interpreted to be the CXL bus which is described as a high-speed CPU-to-device and CPU-to-memory interconnect.)
a first memory configured to provide a non-volatile storage capacity of the memory sub-system, wherein at least a portion of the first memory is configured to implement a memory device attached via the connection to the host system; (¶0078 As shown in ... As also shown in FIG. 3A, cache memory 127 may include a controller memory buffer (CMB) cache 327A and a demand read cache 327B, local memory 130 may include one or more DRAM modules or units, e.g., DRAM modules 130A, 130B, memory controller 126 may include one or more memory controllers, e.g., memory controllers 336A, 336B, coupled, respectively, to the one or more DRAM modules 130A, 130B, and NVM media controller 128 may include or is coupled to associated NVM command queues 328. In some embodiments, the combination of NVM media controller 128, its associated NVM command queues 328 and NVM 140 is sometimes referred to herein as an NVM subsystem 340.
The NVM 140 is interpreted to be the first memory. The NVM subsystem 340, which contains the NVM media controller 128, the NVM command queues 328, and the NVM 140, is interpreted as the memory device. As FIG. 3A and FIG. 1 show, the NVM subsystem is part of the CMX device (interpreted to be the memory sub-system), which is connected to the connection.)
a second memory faster than the first memory; and (¶0073 As shown in FIG. 1, the CMX device 100 includes a coherent memory expansion controller (CMXC) 120 (which includes cache memory or device cache 127), and may further include or has access to local memory 130 (e.g., DDR DRAM), and/or non-volatile memory (NVM) 140 (e.g., NAND Flash memory).;
¶0078 As also shown in FIG. 3A, cache memory 127 may include a controller memory buffer (CMB) cache 327A and a demand read cache 327B, local memory 130 may include one or more DRAM modules or units, e.g., DRAM modules 130A, 130B, memory controller 126 may include one or more memory controllers, e.g., memory controllers 336A, 336B, coupled, respectively, to the one or more DRAM modules 130A, 130B, and NVM media controller 128 may include or is coupled to associated NVM command queues 328.;
¶0085 As shown, the memory spaces include a coherent host memory space 410 provided by host memory 116, a coherent device memory space 420 provided by local memory 130 and a private memory space 450 also provided by local memory 130. Memory spaces 410 and 420 are in a coherent memory space 400 accessible by the host 110. In some embodiments, a controller memory buffer (CMB) including submission queues 432 and completion queues 434 occupies a CMB space 430 in the coherent device memory space 420.;
The second memory is interpreted to be the device cache 127 and the local memory 130. This includes memory spaces 420 and 450. The local memory is DRAM and the first memory (NVM) is NAND Flash memory; the second memory is therefore interpreted to be “faster” than the first memory because DRAM is faster than NAND Flash memory.)
a controller configured to: (¶0078 As shown in FIG. 3A, control logic 125 in CMXC 120 includes a CXL bridge 310, a device coherency engine (DCOH) 312, a bias table 314, a snooping unit 316, one or more cache controllers 318, a direct memory access (DMA) channel 320 including one or more DMA engines, and a coherent NVMe (cNVMe) controller 322. As also shown in FIG. 3A, cache memory 127 may include a controller memory buffer (CMB) cache 327A and a demand read cache 327B, local memory 130 may include one or more DRAM modules or units, e.g., DRAM modules 130A, 130B, memory controller 126 may include one or more memory controllers, e.g., memory controllers 336A, 336B, coupled, respectively, to the one or more DRAM modules 130A, 130B, and NVM media controller 128 may include or is coupled to associated NVM command queues 328. In some embodiments, the combination of NVM media controller 128, its associated NVM command queues 328 and NVM 140 is sometimes referred to herein as an NVM subsystem 340.
Horwich discloses multiple controllers, which are interpreted to be the claimed controller.)
receive hints about the host system accessing the memory device over the connection using the cache-coherent memory access protocol; (¶0087 FIG. 6 is a diagram illustrating a submission 600 from the host 110 in accordance with some embodiments. ... In some embodiments, the standard fields 610 include a command field for a command 611 (e.g., an NVMe read or write command), one or more fields for payload specification 613 specifying a payload 630 in the NVM subsystem 340 associated with the command, and one or more fields for memory location specification 615 specifying cache lines in a coherent memory space where the payload is to be transferred to or from. In some embodiments, customizable fields 620 include one or more fields 620 for communicating one or more hints that can be used to improve performance during data transfers.;
¶0073 As shown in FIG. 1, the CMX device 100 includes a coherent memory expansion controller (CMXC) 120 (which includes cache memory or device cache 127), and may further include or has access to local memory 130 (e.g., DDR DRAM), and/or non-volatile memory (NVM) 140 (e.g., NAND Flash memory). 
When the host sends a submission, the submission is interpreted to contain hints pertaining to the host accessing the memory device. The hints are included in the customizable fields 620 of the submission 600. Looking at FIG. 3A, the host could send the submission through the CXL bus 305 connection using the cache-coherent memory access protocol (CXL.mem or CXL.cache) to the controller of the memory sub-system (CXL device).)
identify, based on the hints, a page of memory in the memory device to be accessed by the host system; and (¶0091 In some embodiments, the cNVMe controller 322 includes registers 322R corresponding, respectively, to the head/tail pointers in the CMB, and cache controller(s) 318 is further configured to alert the cNVMe controller 322 when a new submission is written into the CMB or mirrored in the CMB cache 327A by, for example, writing into a corresponding register 322R of the cNVMe controller 322. In response, cNVMe controller 322 would read the NVMe submission from the CMB cache 327A and start transfers of data to or from the NVM 140 by, for example, issuing NVM read or write commands to the NVM subsystem 340, and instructing the DMA channel 320 to move the data between the different memory and storage resources, in accordance with the NVMe submission.; 
¶0081 In some embodiments, a processor core 112-1, 112-2, . . . , 112-n may access physical memory by paging (e.g., having a page moved in and out of memory), where a page is the smallest partition of memory mapped by the processor from a virtual address to a physical address and may include multiple cache lines.;
When the submission (containing hints) reaches the CMB, the controllers command the NVM subsystem to retrieve data (interpreted to include a page of memory in the memory device).)
cache, based on the access hints, a content of the page in the second memory prior to receiving a request, from the host system over the connection using the cache-coherent memory access protocol, to load a data item from a memory address in the page. (¶0081 In some embodiments, a processor core 112-1, 112-2, . . . , 112-n may access physical memory by paging (e.g., having a page moved in and out of memory), where a page is the smallest partition of memory mapped by the processor from a virtual address to a physical address and may include multiple cache lines.;
¶0087 In some embodiments, the standard fields 610 include a command field for a command 611 (e.g., an NVMe read or write command), one or more fields for payload specification 613 specifying a payload 630 in the NVM subsystem 340 associated with the command, and one or more fields for memory location specification 615 specifying cache lines in a coherent memory space where the payload is to be transferred to or from. In some embodiments, customizable fields 620 include one or more fields 620 for communicating one or more hints that can be used to improve performance during data transfers. In some embodiments, the payload 630 corresponds to a plurality of logical blocks at corresponding logical block addresses (LBA-1, LBA-2, . . . , LBA-n) in the NVM 130 and can be specified by an LBA of a starting logical block (e.g., LBA-1) and a number of logical blocks n starting at the starting logical block.;
¶0089 In addition to demand read (e.g., an operation to resolve page fault at the host), CMX device 100 also facilitates predictive read (e.g., an operation to load a payload in a coherent memory space 410 or 420 based on prediction that the payload may be needed in a predictive time frame) and speculative read (e.g., an operation to load a payload in the private memory space 450 based on speculation that the payload may be needed in a speculative time frame). In some embodiments, control logic 125 is configured to process a submission from the host 110 with a certain priority based on whether the submission is for demand read, predictive read, or speculative read.
¶0090 In some embodiments, as shown in FIG. 7, CMB cache 327A is synchronized with the CMB space 430 and includes one or more synchronized (or mirrored) submission queues 731, 732, 733, corresponding, respectively, to the one or more submission queues, e.g., demand queue 531, predictive queue 532, speculative queue …
¶0095 In some embodiments, as shown in FIG. 8B, in response to a second submission 802 for predictive read, control logic 125 is configured to transfer a payload 821 specified in submission 802 from the NVM 140 to the device memory 420. Subsequent read/write operations 823 related to the payload 821 can be between the CPU and the device memory 420 via the CXL.mem protocol.
¶0091 In some embodiments, the cNVMe controller 322 includes registers 322R corresponding, respectively, to the head/tail pointers in the CMB, and cache controller(s) 318 is further configured to alert the cNVMe controller 322 when a new submission is written into the CMB or mirrored in the CMB cache 327A by, for example, writing into a corresponding register 322R of the cNVMe controller 322. In response, cNVMe controller 322 would read the NVMe submission from the CMB cache 327A and start transfers of data to or from the NVM 140 by, for example, issuing NVM read or write commands to the NVM subsystem 340, and instructing the DMA channel 320 to move the data between the different memory and storage resources, in accordance with the NVMe submission.;
When the submission (containing hints (¶0087)) is for a predictive read (¶0089), the predictive read is configured to transfer a payload (interpreted to contain the claimed content of the page) from the NVM (interpreted as the first memory) into the device memory 420 (interpreted to be part of the second memory) (¶0095). This can be interpreted as caching, based on the access hints, a content of the page in the second memory. After the content of the page is cached, any subsequent read/write operations (interpreted to be the claimed request) related to the payload (interpreted to contain the claimed data item from a memory address in the page) will be between the CPU (interpreted to be in the host system) and the device memory (interpreted to be the second memory) using the CXL.mem protocol (¶0095). This can be interpreted as receiving a request, from the host system over the connection using the cache-coherent memory access protocol, to load a data item from a memory address in the page after the caching of the content of the page. The connection can be seen in FIG. 8B as the CXL Bus 305.)
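For illustration only, the interpretation above can be sketched as a toy model. This is a minimal sketch under stated assumptions, not Horwich's implementation: the names `Submission`, `ToyDevice`, `submit`, and `load` are hypothetical, the memories are modeled as plain dictionaries keyed by LBA, and demand reads are not modeled. It shows a predictive-read submission moving a payload from the NVM into the device memory before the host issues any load for that payload.

```python
from dataclasses import dataclass, field


@dataclass
class Submission:
    """Hypothetical stand-in for a submission with standard and hint fields."""
    command: str      # e.g. "read"
    start_lba: int    # starting logical block of the payload
    num_blocks: int   # number of logical blocks in the payload
    hint: str         # "demand", "predictive", or "speculative"


@dataclass
class ToyDevice:
    """Toy model: dictionaries stand in for NVM and DRAM (an assumption)."""
    nvm: dict                                            # slower first memory
    device_memory: dict = field(default_factory=dict)    # coherent space
    private_memory: dict = field(default_factory=dict)   # private space

    def submit(self, sub: Submission) -> None:
        # Pick the destination from the hint; demand reads are not modeled.
        if sub.hint == "predictive":
            target = self.device_memory
        elif sub.hint == "speculative":
            target = self.private_memory
        else:
            return
        for lba in range(sub.start_lba, sub.start_lba + sub.num_blocks):
            target[lba] = self.nvm[lba]

    def load(self, lba: int):
        # Return (data, served_from_device_memory).
        if lba in self.device_memory:
            return self.device_memory[lba], True
        return self.nvm[lba], False


dev = ToyDevice(nvm={i: f"block{i}".encode() for i in range(8)})
dev.submit(Submission("read", start_lba=2, num_blocks=3, hint="predictive"))
data, from_fast_memory = dev.load(3)  # served from device memory
```

In this sketch the later `load` corresponds to the claimed request: it arrives after the payload has already been placed in the faster memory.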
With regards to claim 11, Horwich teaches wherein the controller is further configured to:
allocate a portion of the second memory as a buffer memory; (¶0078 As shown in ...  As also shown in FIG. 3A, cache memory 127 may include a controller memory buffer (CMB) cache 327A and a demand read cache 327B, local memory 130 may include one or more DRAM modules or units, e.g., DRAM modules 130A, 130B, memory controller 126 may include one or more memory controllers, e.g., memory controllers 336A, 336B, coupled, respectively, to the one or more DRAM modules 130A, 130B, and NVM media controller 128 may include or is coupled to associated NVM command queues 328. 
Figure 3A shows a controller memory buffer (CMB) cache 327A inside of the cache 127; the CMB is interpreted to be the claimed buffer memory.)
enter at least a portion of the hints in a queue configured in the buffer memory in response to requests from the host system over the connection, the hints including memory addresses specified in the requests. (¶0085 FIG. 4 illustrates … In some embodiments, a controller memory buffer (CMB) including submission queues 432 and completion queues 434 occupies a CMB space 430 in the coherent device memory space 420.; 
¶0017 In some embodiments, the memory expansion device further comprises a controller memory buffer (CMB) including submission queues, accessible by the host, the submission queues including at least a first submission queue for queuing submissions of the first priority and at least a second submission queue for queuing submissions of the second priority, wherein the first submission is queued in the first submission queue, and the second submission is queued in the second submission queue.;
The CMB is interpreted to have a space in memory for queues which contain the submissions which further include the addresses, commands (requests), and hints.)
With regards to claim 13, Horwich teaches wherein at least a portion of the hints are entered by the host system into a storage access queue. (¶0085 FIG. 4 illustrates … In some embodiments, a controller memory buffer (CMB) including submission queues 432 and completion queues 434 occupies a CMB space 430 in the coherent device memory space 420. 
¶0017 In some embodiments, the memory expansion device further comprises a controller memory buffer (CMB) including submission queues, accessible by the host, the submission queues including at least a first submission queue for queuing submissions of the first priority and at least a second submission queue for queuing submissions of the second priority, wherein the first submission is queued in the first submission queue, and the second submission is queued in the second submission queue.
As interpreted above in the 112(b) rejections section, the limitation “a portion of the hints are entered by the host system into a storage access queue” in claim 13 is interpreted as the “a portion of the hints in a queue” of claim 11.
The CMB has a space in memory for queues which contain the submissions, which include the addresses, commands (requests), and hints. The submissions are sent by the host through the CXL bus. The queue is interpreted to be a storage access queue.)
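For illustration only, the queue arrangement described above can be sketched as follows. This is a minimal sketch under stated assumptions: the class `ToyCMB` and its methods are hypothetical, each submission is modeled as a dictionary carrying an address and a hint, and the service order demand, then predictive, then speculative is an assumption made for the example (the cited paragraphs state only that submissions are processed with a priority based on their type).

```python
from collections import deque

# Assumed service order for the example; not stated in the cited passages.
PRIORITY_ORDER = ("demand", "predictive", "speculative")


class ToyCMB:
    """Toy controller memory buffer holding one submission queue per type."""

    def __init__(self):
        self.queues = {kind: deque() for kind in PRIORITY_ORDER}

    def enqueue(self, kind, submission):
        # Host side: place a submission (address, command, hint) in a queue.
        self.queues[kind].append(submission)

    def next_submission(self):
        # Controller side: take the oldest entry of the highest-priority
        # non-empty queue, or None when every queue is empty.
        for kind in PRIORITY_ORDER:
            if self.queues[kind]:
                return kind, self.queues[kind].popleft()
        return None


cmb = ToyCMB()
cmb.enqueue("speculative", {"addr": 0x3000, "hint": "may be needed"})
cmb.enqueue("demand", {"addr": 0x1000, "hint": "page fault"})
first_kind, first_submission = cmb.next_submission()  # the demand entry
```

The host-side `enqueue` corresponds to hints being entered into the queue, and the controller-side `next_submission` corresponds to the controller retrieving them.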
With regards to claim 14, Horwich teaches wherein the storage access queue is configured in the memory device attached via the connection to the host system; (¶0092 In some embodiments, as shown in FIG. 7, the NVM queues 328 include one or more NVM command queues, e.g., NVM command queues 751, 752, 753, corresponding, respectively, to the one or more submission queues, e.g., demand queue 531, predictive queue 532, speculative queue 533, in the CMB, or to the one or more mirrored submission queues in the CMB cache 327A.
The NVM command queues 328 (inside of the memory device) correspond to the one or more submission queues (e.g., demand queue 531, predictive queue 532, speculative queue 533) and can therefore be interpreted as storing the submissions (which contain the portion of the hints) inside a queue of the memory device.)
the host system is configured to enter hints into the storage access queue over the connection using the cache-coherent memory access protocol; (¶0087 FIG. 6 is a diagram illustrating a submission 600 from the host 110 in accordance with some embodiments. ... In some embodiments, the standard fields 610 include a command field for a command 611 (e.g., an NVMe read or write command), one or more fields for payload specification 613 specifying a payload 630 in the NVM subsystem 340 associated with the command, and one or more fields for memory location specification 615 specifying cache lines in a coherent memory space where the payload is to be transferred to or from. In some embodiments, customizable fields 620 include one or more fields 620 for communicating one or more hints that can be used to improve performance during data transfers.;
¶0085 FIG. 4 illustrates … In some embodiments, a controller memory buffer (CMB) including submission queues 432 and completion queues 434 occupies a CMB space 430 in the coherent device memory space 420. 
The host is sending the submission through the CXL bus which is able to use the cache-coherent memory access protocol. The CMB has a space in memory for queues which contain the submissions, which include the addresses, commands (requests), and hints.)
and the controller is configured to retrieve hints from the storage access queue without using the connection. (¶0083 In some embodiments, … cNVMe controller 322 is further configured to facilitate movement of data between the NVM subsystem 340 and device cache 127 and/or local memory 130 using the DMA channel 320.
Because the cache (device cache 127) and the memory device (NVM subsystem 340) are connected to the controller (see FIG. 3A) without the need for the connection (CXL bus), it is interpreted that the CXL bus is not needed to send the hints from the CMB to the controller. The DMA channel is used when moving data between the NVM, cache, and local memory, which shows that the connection is not needed when sending hints to the controller.)
With regards to claim 16, Horwich teaches comprising:
establishing a connection between a host system and a memory sub-system, the connection operable according to a storage access protocol and a cache-coherent memory access protocol; (¶0076  As shown, in some embodiments, the dedicated bus 105 is a Computer Express Link (CXL) bus 305 and CMX device 100 is implemented as a CXL memory expansion device or a CXL card to be inserted into a CXL expansion slot of the computer system 101. ... CXL bus 305 is a high-speed CPU-to-device and CPU-to-memory interconnect or link based on the CXL protocol, including sub-protocols CXL.io, CXL.cache and CXL.memory, which can be used concurrently.; 
¶0073 FIG. 1 also shows that CMXC 120 includes a bus interface 122 configured to interface with the host via the dedicated bus 105, and control logic (e.g., logic circuitry) 125 coupled to the bus interface 122 and configurable to control communication of commands (or requests) and data between the CPU and local memory 130, and between local memory 130 and NVM 140, and to maintain coherency of the device cache 127 and other caches (e.g., CPU cache 113) in the computer system 101, and the coherency of a memory space mapped to at least part of the local memory 130. 
The CXL.mem or CXL.cache protocols are interpreted as the cache-coherent memory access protocol. The CXL.io is interpreted as the storage access protocol. The connection to a host system is interpreted to be the bus which is described as a high-speed CPU-to-device and CPU-to-memory interconnect.)
attaching a first memory of the memory sub-system over the connection to the host system as a memory device accessible via the cache-coherent memory access protocol; (¶0078 As shown in ... As also shown in FIG. 3A, cache memory 127 may include a controller memory buffer (CMB) cache 327A and a demand read cache 327B, local memory 130 may include one or more DRAM modules or units, e.g., DRAM modules 130A, 130B, memory controller 126 may include one or more memory controllers, e.g., memory controllers 336A, 336B, coupled, respectively, to the one or more DRAM modules 130A, 130B, and NVM media controller 128 may include or is coupled to associated NVM command queues 328. In some embodiments, the combination of NVM media controller 128, its associated NVM command queues 328 and NVM 140 is sometimes referred to herein as an NVM subsystem 340.
The NVM subsystem 340 is interpreted as a memory device that includes the NVM 140, which is interpreted to be the first memory. FIG. 3A shows that the NVM subsystem is part of the CMX device (interpreted to be the memory sub-system), which is connected to the connection.)
entering, in a storage access queue, a plurality of hints about the host system accessing the memory device over the connection using the cache-coherent memory access protocol; and (¶0087 FIG. 6 is a diagram illustrating a submission 600 from the host 110 in accordance with some embodiments. ... In some embodiments, the standard fields 610 include a command field for a command 611 (e.g., an NVMe read or write command), one or more fields for payload specification 613 specifying a payload 630 in the NVM subsystem 340 associated with the command, and one or more fields for memory location specification 615 specifying cache lines in a coherent memory space where the payload is to be transferred to or from. In some embodiments, customizable fields 620 include one or more fields 620 for communicating one or more hints that can be used to improve performance during data transfers.;
¶0085 FIG. 4 illustrates … In some embodiments, a controller memory buffer (CMB) including submission queues 432 and completion queues 434 occupies a CMB space 430 in the coherent device memory space 420. 
¶0017 In some embodiments, the memory expansion device further comprises a controller memory buffer (CMB) including submission queues, accessible by the host, the submission queues including at least a first submission queue for queuing submissions of the first priority and at least a second submission queue for queuing submissions of the second priority, wherein the first submission is queued in the first submission queue, and the second submission is queued in the second submission queue.
The host sends the submission through the CXL bus, which is able to use the cache-coherent memory access protocol. The CMB has a space in memory for queues which contain the submissions, which include the addresses, commands (requests), and hints. The submissions are stored in a submission queue, which is located in the CMB. The submission queue is interpreted to be the storage access queue.)
caching, by the memory sub-system based on the plurality of hints, a first page of the memory device in a second memory of the memory sub-system prior to the host system accessing a memory address in the first page. (¶0085 As shown, the memory spaces include a coherent host memory space 410 provided by host memory 116, a coherent device memory space 420 provided by local memory 130 and a private memory space 450 also provided by local memory 130. Memory spaces 410 and 420 are in a coherent memory space 400 accessible by the host 110. In some embodiments, a controller memory buffer (CMB) including submission queues 432 and completion queues 434 occupies a CMB space 430 in the coherent device memory space 420.;
¶0087 FIG. 6 is a diagram illustrating a submission 600 from the host 110 in accordance with some embodiments. ... In some embodiments, the standard fields 610 include a command field for a command 611 (e.g., an NVMe read or write command), one or more fields for payload specification 613 specifying a payload 630 in the NVM subsystem 340 associated with the command, and one or more fields for memory location specification 615 specifying cache lines in a coherent memory space where the payload is to be transferred to or from. In some embodiments, customizable fields 620 include one or more fields 620 for communicating one or more hints that can be used to improve performance during data transfers.; 
¶0081 In some embodiments, a processor core 112-1, 112-2, . . . , 112-n may access physical memory by paging (e.g., having a page moved in and out of memory), where a page is the smallest partition of memory mapped by the processor from a virtual address to a physical address and may include multiple cache lines.;
¶0090 In some embodiments, as shown in FIG. 7, CMB cache 327A is synchronized with the CMB space 430 and includes one or more synchronized (or mirrored) submission queues 731, 732, 733, corresponding, respectively, to the one or more submission queues, e.g., demand queue 531, predictive queue 532, speculative queue 533, in the CMB.;
¶0095 In some embodiments, as shown in FIG. 8B, in response to a second submission 802 for predictive read, control logic 125 is configured to transfer a payload 821 specified in submission 802 from the NVM 140 to the device memory 420. Subsequent read/write operations 823 related to the payload 821 can be between the CPU and the device memory 420 via the CXL.mem protocol.;
The second memory is interpreted to include the device cache 127 and local memory 130. The memory space 420 and private memory space 450 are interpreted to be part of the second memory's memory space. The submissions (interpreted to contain hints) are sent to the memory sub-system, and the memory sub-system starts the transfer of data, based on the submission, from the NVM 140 to the device memory 420 (Figure 8B shows the transfer from NVM to local memory). The payload is seen to include the first page of data, and any subsequent read/write operation can be done between the second memory and the host via the cache-coherent protocol. Horwich's reference to "subsequent" operations is interpreted to mean that the host has not accessed the memory before the payload is moved.)

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 5 and 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over HORWICH (US 20210374080 A1) in view of Sonksen (US 11868282 B1), hereinafter “Sonksen”.
With regards to claim 5, Horwich teaches wherein the first information and the second information are received, …, over the connection using the second protocol of storage access. (¶0076 As shown, in some embodiments, the dedicated bus 105 is a Computer Express Link (CXL) bus 305 and CMX device 100 is implemented as a CXL memory expansion device or a CXL card to be inserted into a CXL expansion slot of the computer system 101. ... CXL bus 305 is a high-speed CPU-to-device and CPU-to-memory interconnect or link based on the CXL protocol, including sub-protocols CXL.io, CXL.cache and CXL.memory, which can be used concurrently.; 
¶0073 FIG. 1 also shows that CMXC 120 includes a bus interface 122 configured to interface with the host via the dedicated bus 105, and control logic (e.g., logic circuitry) 125 coupled to the bus interface 122 and configurable to control communication of commands (or requests) and data between the CPU and local memory 130, and between local memory 130 and NVM 140, and to maintain coherency of the device cache 127 and other caches (e.g., CPU cache 113) in the computer system 101, and the coherency of a memory space mapped to at least part of the local memory 130.;
Fig. 1 shows the host memory 116 connected to the CMX device 100 using the connection. The connection uses CXL protocols CXL.io, CXL.cache, and CXL.memory to send and receive data; therefore, it is interpreted that data sent from the host side 110, including the host memory 116, uses the CXL protocols. It is further interpreted that the host memory uses the CXL.io protocol when sending data such as the first and second information.)
Horwich does not teach:
… from a storage access queue configured in a memory of the host system …
However, Sonksen does teach:
… from a storage access queue configured in a memory of the host system … (Col. 5 Lines 51-53: Host device 102 includes host memory 112 which houses IOCB request queue 122 which, as drawn includes 256 slots numbered 0, . . . , 255. 
Sonksen teaches a queue in the host memory. If the first information and the second information were sent from a queue in the host memory, using the Horwich reference, then it would have to use one of the CXL protocols, such as the CXL.io protocol, to send data over the CXL bus (interpreted as the connection).)
Horwich and Sonksen are analogous art because they are from the same field of endeavor, that being data storage devices. It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the method disclosed by Horwich to have the first and second information be sent, from the storage access queue in a memory of the host system, to the controller using the second protocol of storage access using the teaching of Sonksen. The motivation would have been to solve the problem of memory capacity and bandwidth gaps in order to keep up with the increase in processor speed. (¶0003 Emerging applications, such … The increasing processor power places increasing demand on memory capacity and memory speed or bandwidth, which unfortunately do not increase at the same rate. Often, higher memory speed means lower memory capacity, and, as memory capacity increases to keep up with the increase in processor speed, memory latency, which is a measure of how long it takes to complete a memory operation, is also increasing at a rate of about 1.1 times every two years. Thus, solving the problem of memory capacity and bandwidth gaps is critical in the performance of data processing systems.)

With regards to claim 15, Horwich teaches wherein …;
and the controller is configured to retrieve hints from the storage access queue over the connection using the storage access protocol. (¶0076 As shown, in some embodiments, the dedicated bus 105 is a Computer Express Link (CXL) bus 305 and CMX device 100 is implemented as a CXL memory expansion device or a CXL card to be inserted into a CXL expansion slot of the computer system 101. ... CXL bus 305 is a high-speed CPU-to-device and CPU-to-memory interconnect or link based on the CXL protocol, including sub-protocols CXL.io, CXL.cache and CXL.memory, which can be used concurrently.; 
¶0073 FIG. 1 also shows that CMXC 120 includes a bus interface 122 configured to interface with the host via the dedicated bus 105, and control logic (e.g., logic circuitry) 125 coupled to the bus interface 122 and configurable to control communication of commands (or requests) and data between the CPU and local memory 130, and between local memory 130 and NVM 140, and to maintain coherency of the device cache 127 and other caches (e.g., CPU cache 113) in the computer system 101, and the coherency of a memory space mapped to at least part of the local memory 130.;
Fig. 1 shows the host memory 116 connected to the CMX device 100 using the connection. The connection uses CXL protocols CXL.io, CXL.cache, and CXL.memory to send and receive data; therefore, it is interpreted that data sent from the host side 110, including the host memory 116, uses the CXL protocols. It is further interpreted that the host memory uses the CXL.io protocol when sending data such as the hints to the controller.)
Horwich does not teach:
… the storage access queue is configured in a memory of the host system … 
However, Sonksen does teach:
… the storage access queue is configured in a memory of the host system … (Col. 5 Lines 51-53: Host device 102 includes host memory 112 which houses IOCB request queue 122 which, as drawn includes 256 slots numbered 0, . . . , 255. 
Sonksen teaches a queue in the host memory. If the first information and the second information were sent from a queue in the host memory, using the Horwich reference, then it would have to use one of the CXL protocols, such as the CXL.io protocol, to send data over the CXL bus.)
Horwich and Sonksen are analogous art because they are from the same field of endeavor, that being data storage devices. It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the method disclosed by Horwich to have the hints be sent, from the storage access queue in a memory of the host system, to the controller using the storage access protocol using the teaching of Sonksen. The motivation would have been to solve the problem of memory capacity and bandwidth gaps in order to keep up with the increase in processor speed. (¶0003 Emerging applications, such … The increasing processor power places increasing demand on memory capacity and memory speed or bandwidth, which unfortunately do not increase at the same rate. Often, higher memory speed means lower memory capacity, and, as memory capacity increases to keep up with the increase in processor speed, memory latency, which is a measure of how long it takes to complete a memory operation, is also increasing at a rate of about 1.1 times every two years. Thus, solving the problem of memory capacity and bandwidth gaps is critical in the performance of data processing systems.)

Claim(s) 8 is/are rejected under 35 U.S.C. 103 as being unpatentable over HORWICH (US 20210374080 A1) in view of NGUYEN (US 20230057633 A1), hereinafter “Nguyen”, further in view of Roberts (US 20210390053 A1), hereinafter “Roberts”.
With regards to claim 8, Horwich teaches all the limitations of claim 3, from which claim 8 depends. Further, Horwich teaches the first information (¶0087) and accesses to memory addresses in memory space (¶0085). Horwich further teaches predicting, based on the first information, that the payload may be needed in a predictive time frame (¶0089 In addition to demand read (e.g., an operation to resolve page fault at the host), CMX device 100 also facilitates predictive read (e.g., an operation to load a payload in a coherent memory space 410 or 420 based on prediction that the payload may be needed in a predictive time frame) … In some embodiments, control logic 125 is configured to process a submission from the host 110 with a certain priority based on whether the submission is for demand read, predictive read, or speculative read.
A predictive read is interpreted to carry the hints of the submission (first information) because the predictive read is a type of submission.)
Horwich does not teach wherein the first information is configured to identify a pattern of accesses to memory addresses in the memory space.
However, Nguyen does teach wherein the first information is configured to identify a pattern of accesses to memory addresses in the memory space; (¶0070 An example of a pseudocode definition for a procedure for sending one or more indications (e.g., hints) to a prefetcher may be as follows: send_prefetch_hint (const void*prefetcher, size_t producerid, size_t consumer_id, const void*buffer_ptr, size_t size, string access_pattern);
¶0071 Access_pattern: can be sequential, random, or determined at runtime;
The first information is seen as an access hint, therefore the indication (e.g., hints) in this reference is interpreted to be the first information with a pattern of accesses including a sequential access pattern.) 
Horwich and Nguyen are analogous art because they are from the same field of endeavor, that being data storage devices. It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the method disclosed by Horwich to have the first information configured to identify a pattern of accesses using the teaching of Nguyen. The motivation would have been to reduce latency and improve throughput. (Nguyen: ¶0024 In some embodiments, this may reduce or eliminate the involvement of a host which may be a bottleneck in transferring data between devices. Depending on the implementation details, prefetching data and transferring it to a consumer device may reduce access latency and/or synchronization overhead, and/or may enable data input and/or output (I/O) operations to overlap with data processing operations at the consumer device, thereby improving throughput.)
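For illustration only, the use of a sequential access pattern as a hint can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Nguyen or Horwich: the function `predict_next_addresses` is hypothetical, and simple constant-stride detection is substituted here as one possible way of recognizing a sequential pattern from a history of accessed addresses.

```python
def predict_next_addresses(history, count=2):
    """Guess the next addresses when recent accesses share a constant stride.

    `history` is a list of previously accessed addresses, oldest first.
    Returns an empty list when no sequential pattern is visible.
    """
    if len(history) < 3:
        return []  # too few accesses to call anything a pattern
    strides = [later - earlier for earlier, later in zip(history, history[1:])]
    stride = strides[-1]
    # Require a non-zero stride repeated over the last two steps.
    if stride == 0 or any(s != stride for s in strides[-2:]):
        return []
    return [history[-1] + stride * i for i in range(1, count + 1)]


# Sequential accesses one 4 KiB page apart yield the next two page addresses.
candidates = predict_next_addresses([0x1000, 0x2000, 0x3000])
```

Addresses returned by such a routine would be candidates for loading into the faster memory before the host requests them; an empty result corresponds to a random pattern for which no prefetch is attempted.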
Furthermore, Horwich does not teach the method further comprising: predicting, based on the first information, a timing of the request to load the first data item from the first memory address; 
and scheduling, based on the timing predicted using the first information, the retrieving of the first data item from the first memory. 
However, Roberts does teach the method further comprising: predicting, based on the first information, a timing of the request to load the first data item from the first memory address; (¶0036 Continuing the example operations, the prefetch engine 118 can determine (e.g., predict), based at least in part on the prefetching configuration 304, one or more memory addresses of the backing memory 120 that may be requested by the host device. For example, the prefetch engine 118 can use a trained neural network, such as the RNN described herein, to predict memory addresses that are likely to be requested before the memory addresses actually are requested. This determination (e.g., prediction) uses as inputs, the ongoing series of memory address requests from the host device. In other words, the memory addresses of the backing memory 120 that may be requested by the host device 102 are memory addresses that, from a probabilistic perspective based on the prefetching configuration, will be (or are likely to be) requested by the host device within some future timeframe—e.g., in accordance with operational patterns of code being executed. The future timeframe can include or pertain to a period during which the predicted access occurs and before the prefetched data is replaced in the intermediate memory. The prefetch engine 118 can then write or load data associated with the one or more predicted memory addresses of the backing memory 120 into the intermediate memory based on the prediction.
The backing memory is the first memory and the intermediate memory is the second memory/cache. The "the memory addresses of the backing memory 120 that may be requested by the host device 102 are memory addresses that … will be (or are likely to be) requested by the host device within some future time frame" is interpreted as a timing of the request to load the first data item from the first memory address. The timing being the anticipated timeframe.)
and scheduling, based on the timing predicted using the first information, the retrieving of the first data item from the first memory. (¶0036 Continuing the example operations, the prefetch engine 118 can determine (e.g., predict), based at least in part on the prefetching configuration 304, one or more memory addresses of the backing memory 120 that may be requested by the host device. For example, the prefetch engine 118 can use a trained neural network, such as the RNN described herein, to predict memory addresses that are likely to be requested before the memory addresses actually are requested. This determination (e.g., prediction) uses as inputs, the ongoing series of memory address requests from the host device. In other words, the memory addresses of the backing memory 120 that may be requested by the host device 102 are memory addresses that, from a probabilistic perspective based on the prefetching configuration, will be (or are likely to be) requested by the host device within some future timeframe—e.g., in accordance with operational patterns of code being executed. The future timeframe can include or pertain to a period during which the predicted access occurs and before the prefetched data is replaced in the intermediate memory. The prefetch engine 118 can then write or load data associated with the one or more predicted memory addresses of the backing memory 120 into the intermediate memory based on the prediction.
The prefetch engine writing/loading data associated with the one or more predicted memory addresses from the backing memory into the intermediate memory based on the prediction is interpreted as the scheduling the retrieving of the first data item from the first memory.)
Horwich and Roberts are analogous art because they are from the same field of endeavor, that being data storage devices. It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the method disclosed by Horwich to predict a timing of the request to load the first data item from the first memory address and to schedule the retrieving of the first data item from the first memory using the teaching of Roberts. The motivation would have been to reduce memory latency so the host device can access data faster. (Roberts: ¶0001 Prefetchers are circuits … When the prefetcher is configured properly, this can reduce memory latency, which can be useful because lower latency allows programs and applications that are running on the host device to access data faster.)

With regards to claim 9, Horwich teaches the method of claim 8, …
Horwich does not teach:
wherein the pattern of accesses includes sequential accesses.
However, Nguyen does teach:
wherein the pattern of accesses includes sequential accesses. (¶0070 An example of a pseudocode definition for a procedure for sending one or more indications (e.g., hints) to a prefetcher may be as follows: send_prefetch_hint (const void*prefetcher, size_t producer_id, size_t consumer_id, const void*buffer_ptr, size_t size, string access_pattern);
¶0071 Access_pattern: can be sequential, random, or determined at runtime;
The first information is seen as an access hint, therefore the indication (e.g., hints) in this reference is interpreted to be the first information with a pattern of accesses including a sequential access pattern.) 
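For illustration only, the hint procedure quoted from Nguyen ¶0070 and ¶0071 can be modeled as a small data structure carrying an access-pattern field. The sketch below is not taken from Nguyen or from the claims: the names `AccessPattern`, `PrefetchHint`, and `next_addresses`, the field types, the 64-byte line size, and the stride logic are all assumptions introduced here to show how a sequential hint might be consumed.

```python
from dataclasses import dataclass
from enum import Enum


class AccessPattern(Enum):
    # Values mirror the options listed in the quoted ¶0071.
    SEQUENTIAL = "sequential"
    RANDOM = "random"
    RUNTIME = "determined at runtime"


@dataclass(frozen=True)
class PrefetchHint:
    # Field names follow the quoted pseudocode signature; types are assumed.
    producer_id: int
    consumer_id: int
    buffer_addr: int
    size: int
    access_pattern: AccessPattern


def next_addresses(hint: PrefetchHint, last_addr: int,
                   line: int = 64, count: int = 4) -> list[int]:
    """Hypothetical helper: candidate addresses to prefetch after last_addr."""
    if hint.access_pattern is not AccessPattern.SEQUENTIAL:
        # No prediction is attempted for random or runtime-determined patterns.
        return []
    end = hint.buffer_addr + hint.size
    candidates = (last_addr + line * i for i in range(1, count + 1))
    # Keep only candidates that fall inside the hinted buffer.
    return [addr for addr in candidates if addr < end]
```

Under these assumptions, a sequential hint over a 256-byte buffer yields the following lines of that buffer, while a random hint yields no candidates.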

Claim(s) 12 is/are rejected under 35 U.S.C. 103 as being unpatentable over HORWICH (US 20210374080 A1) as applied to claim 10 above, and further in view of Roberts (US 20210390053 A1).
With regards to claim 12, Horwich teaches wherein the controller is further configured to …
Horwich does not teach:
predict a time of the request based on the hints 
However, Roberts does teach:
predict a time of the request based on the hints (¶0036 Continuing the example operations, the prefetch engine 118 can determine (e.g., predict), based at least in part on the prefetching configuration 304, one or more memory addresses of the backing memory 120 that may be requested by the host device. For example, the prefetch engine 118 can use a trained neural network, such as the RNN described herein, to predict memory addresses that are likely to be requested before the memory addresses actually are requested. This determination (e.g., prediction) uses as inputs, the ongoing series of memory address requests from the host device. In other words, the memory addresses of the backing memory 120 that may be requested by the host device 102 are memory addresses that, from a probabilistic perspective based on the prefetching configuration, will be (or are likely to be) requested by the host device within some future timeframe—e.g., in accordance with operational patterns of code being executed. The future timeframe can include or pertain to a period during which the predicted access occurs and before the prefetched data is replaced in the intermediate memory. The prefetch engine 118 can then write or load data associated with the one or more predicted memory addresses of the backing memory 120 into the intermediate memory based on the prediction.;
The prefetch engine (configured inside the controller) is configured to predict a timeframe (predicts a time) where the host will request a part of memory (the request) based on the operational patterns of code being executed (hints).)
Horwich and Roberts are analogous art because they are from the same field of endeavor, that being data storage devices. It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the memory sub-system disclosed by Horwich to have the controller configured to predict a time of the request based on the hints using the teaching of Roberts. The motivation would have been to reduce memory latency so the host device can access data faster. (Roberts: ¶0001 Prefetchers are circuits … When the prefetcher is configured properly, this can reduce memory latency, which can be useful because lower latency allows programs and applications that are running on the host device to access data faster.)

Claim(s) 17-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over HORWICH (US 20210374080 A1) as applied to claim 16 above, and further in view of NGUYEN (US 20230057633 A1).
With regards to claim 17, Horwich teaches wherein the connection is a compute express link connection; 
and the plurality of hints include …
Horwich does not teach:
… an identification of a pattern of memory addresses accessed by the host system.
However, Nguyen does teach:
… an identification of a pattern of memory addresses accessed by the host system. (¶0070 An example of a pseudocode definition for a procedure for sending one or more indications (e.g., hints) to a prefetcher may be as follows: send_prefetch_hint (const void*prefetcher, size_t producer_id, size_t consumer_id, const void*buffer_ptr, size_t size, string access_pattern);
¶0071 Access_pattern: can be sequential, random, or determined at runtime;
The plurality of hints is interpreted to be the indication (e.g., hints). The access pattern in the prefetch hint is interpreted to be an identification of a pattern of memory addresses accessed by the host system.)
Horwich and Nguyen are analogous art because they are from the same field of endeavor, that being data storage devices. It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the non-transitory computer storage medium disclosed by Horwich to have the plurality of hints include an identification of a pattern of memory addresses accessed by the host system using the teaching of Nguyen. The motivation would have been to reduce latency and improve throughput. (Nguyen: ¶0024 In some embodiments, this may reduce or eliminate the involvement of a host which may be a bottleneck in transferring data between devices. Depending on the implementation details, prefetching data and transferring it to a consumer device may reduce access latency and/or synchronization overhead, and/or may enable data input and/or output (I/O) operations to overlap with data processing operations at the consumer device, thereby improving throughput.)
With regards to claim 18, Horwich teaches wherein the plurality of hints include a first command configured to be optional for execution in the memory sub-system; (¶0087 FIG. 6 is a diagram illustrating a submission 600 from the host 110 in accordance with some embodiments. ... In some embodiments, the standard fields 610 include a command field for a command 611 (e.g., an NVMe read or write command), one or more fields for payload specification 613 specifying a payload 630 in the NVM subsystem 340 associated with the command, and one or more fields for memory location specification 615 specifying cache lines in a coherent memory space where the payload is to be transferred to or from. In some embodiments, customizable fields 620 include one or more fields 620 for communicating one or more hints that can be used to improve performance during data transfers.;
¶0091 In some embodiments, ... In response, cNVMe controller 322 would read the NVMe submission from the CMB cache 327A and start transfers of data to or from the NVM 140 by, for example, issuing NVM read or write commands to the NVM subsystem 340, and instructing the DMA channel 320 to move the data between the different memory and storage resources, in accordance with the NVMe submission.;
The submission includes a command for execution in the memory sub-system.)
and when executed, the first command causes the memory sub-system to swap a content of a second page from the second memory to the first memory. (¶0091 In some embodiments, ... In response, cNVMe controller 322 would read the NVMe submission from the CMB cache 327A and start transfers of data to or from the NVM 140 by, for example, issuing NVM read or write commands to the NVM subsystem 340, and instructing the DMA channel 320 to move the data between the different memory and storage resources, in accordance with the NVMe submission.
The controller is able to transfer data (interpreted to include content of a second page) from different memory and storage resources (interpreted to include second memory) to the NVM 140 (interpreted to be first memory) based on the submission.)
With regards to claim 19, Horwich teaches wherein the plurality of hints include a second command configured to be optional for execution in the memory sub-system; (¶0087 FIG. 6 is a diagram illustrating a submission 600 from the host 110 in accordance with some embodiments. ... In some embodiments, the standard fields 610 include a command field for a command 611 (e.g., an NVMe read or write command), one or more fields for payload specification 613 specifying a payload 630 in the NVM subsystem 340 associated with the command, and one or more fields for memory location specification 615 specifying cache lines in a coherent memory space where the payload is to be transferred to or from. In some embodiments, customizable fields 620 include one or more fields 620 for communicating one or more hints that can be used to improve performance during data transfers.;
There can be more than one submission which can have different commands (e.g. read/write) for execution in the memory sub-system.)
and when executed, the second command causes the memory sub-system to swap a content of the first page into the second memory from the first memory. (¶0091 In some embodiments, ... In response, cNVMe controller 322 would read the NVMe submission from the CMB cache 327A and start transfers of data to or from the NVM 140 by, for example, issuing NVM read or write commands to the NVM subsystem 340, and instructing the DMA channel 320 to move the data between the different memory and storage resources, in accordance with the NVMe submission.;
The controller is able to transfer data (interpreted to include content of the first page) to different memory and storage resources (interpreted to include second memory) from the NVM 140 (interpreted to be first memory) based on the submission.)
With regards to claim 20, Horwich teaches wherein the storage access queue is configured in the second memory of the memory sub-system; (¶0085 In some embodiments, a controller memory buffer (CMB) including submission queues 432 and completion queues 434 occupies a CMB space 430 in the coherent device memory space 420.;
The CMB space inside the memory space 420 is interpreted to be part of the second memory.)
the host system is configured to enter the plurality of hints into the storage access queue over the connection using the cache-coherent memory access protocol; (¶0087 FIG. 6 is a diagram illustrating a submission 600 from the host 110 in accordance with some embodiments. ... In some embodiments, the standard fields 610 include a command field for a command 611 (e.g., an NVMe read or write command), one or more fields for payload specification 613 specifying a payload 630 in the NVM subsystem 340 associated with the command, and one or more fields for memory location specification 615 specifying cache lines in a coherent memory space where the payload is to be transferred to or from. In some embodiments, customizable fields 620 include one or more fields 620 for communicating one or more hints that can be used to improve performance during data transfers.;
¶0090 In some embodiments, as shown in FIG. 7, CMB cache 327A is synchronized with the CMB space 430 and includes one or more synchronized (or mirrored) submission queues 731, 732, 733, corresponding, respectively, to the one or more submission queues, e.g., demand queue 531, predictive queue 532, speculative queue 533, in the CMB.;
¶0076 CXL bus 305 is a high-speed CPU-to-device and CPU-to-memory interconnect or link based on the CXL protocol, including sub-protocols CXL.io, CXL.cache and CXL.memory, which can be used concurrently.;
The hints are stored in the submissions and the submissions are stored in the submission queue. The submissions are sent by the host through the CXL bus and are therefore interpreted to be using the CXL protocols including CXL.mem and CXL.cache.)
and the memory sub-system is configured to retrieve the plurality of hints from the storage access queue without using the connection. (¶0083 In some embodiments, … cNVMe controller 322 is further configured to facilitate movement of data between the NVM subsystem 340 and device cache 127 and/or local memory 130 using the DMA channel 320.;
Because the CMB is connected to the Controller (see FIG. 3A) without the need of the connection (CXL bus), it is interpreted that the CXL bus is unneeded to send the hints from the CMB to the controller. The DMA channel is used when moving data between the NVM, cache, and local memory, therefore it shows that the connection is unneeded when sending hints to the controller.)
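For illustration only, the submission and queue arrangement described in the passages quoted from Horwich (¶0087, ¶0089, ¶0090) can be sketched as a submission record with standard fields and hint fields, placed into one of three queues by read type. The sketch below is not Horwich's implementation: the class names `Submission` and `SubmissionQueues`, the field types, and in particular the priority order demand, then predictive, then speculative are assumptions made here, since the quoted text states only that priority depends on the submission type.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional

# Assumed service order; the quoted text does not state which type goes first.
QUEUE_ORDER = ("demand", "predictive", "speculative")


@dataclass
class Submission:
    # Standard fields named after the quoted ¶0087; types are assumed.
    command: str            # e.g., "read" or "write"
    payload: str            # payload specification
    memory_location: int    # cache lines in the coherent memory space
    # Customizable fields for hints; keys are hypothetical.
    hints: dict = field(default_factory=dict)


class SubmissionQueues:
    """Hypothetical model of demand, predictive, and speculative queues."""

    def __init__(self) -> None:
        self._queues = {name: deque() for name in QUEUE_ORDER}

    def submit(self, kind: str, submission: Submission) -> None:
        # Raises KeyError for an unknown queue kind.
        self._queues[kind].append(submission)

    def next_submission(self) -> Optional[Submission]:
        # Serve the first non-empty queue in the assumed priority order.
        for name in QUEUE_ORDER:
            if self._queues[name]:
                return self._queues[name].popleft()
        return None
```

Under the assumed order, a demand read submitted after a speculative read is still served first, and an empty set of queues returns `None`.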
	
Response to Arguments
Applicant's arguments filed 08/19/2025 have been fully considered but they are not persuasive.
In response to Applicant's argument that the references fail to show certain features of the invention, it is noted that the features upon which Applicant relies (i.e., “a memory sub-system configured to use a portion of its fast memory as a cache memory for a host system accessing a portion of its storage capacity via a cache-coherent memory access protocol”) are not recited in the rejected claim(s). Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
Applicant's arguments fail to comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references.
Applicant's arguments do not comply with 37 CFR 1.111(c) because they do not clearly point out the patentable novelty which he or she thinks the claims present in view of the state of the art disclosed by the references cited or the objections made.

Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Arpan P. Savla whose telephone number is (571)272-1077. The examiner can normally be reached M-F, 10AM-6PM ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, John Cottingham can be reached at 571-272-1400. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/Arpan P. Savla/
Supervisory Patent Examiner, Art Unit 2137
Prosecution Timeline

Feb 12, 2024: Application Filed
May 21, 2025: Non-Final Rejection mailed (§102, §103, §112)
Aug 19, 2025: Response Filed
Jan 12, 2026: Final Rejection mailed (§102, §103, §112)
Mar 11, 2026: Response after Non-Final Action
Apr 13, 2026: Request for Continued Examination
Apr 25, 2026: Response after Non-Final Action
Precedent Cases

Applications granted by this same examiner with similar technology

17/385,106 (Patent 12602178): MEMORY ALLOCATION METHOD AND DEVICE, AND ELECTRONIC APPARATUS. 4y 8m to grant; granted Apr 14, 2026
17/682,111 (Patent 12541460): MEMORY TRANSACTION QUEUE BYPASS BASED ON CONFIGURABLE ADDRESS AND BANDWIDTH CONDITIONS. 3y 11m to grant; granted Feb 03, 2026
17/160,144 (Patent 11455109): AUTOMATIC WORDLINE STATUS BYPASS MANAGEMENT. 1y 8m to grant; granted Sep 27, 2022
16/474,384 (Patent 11435928): CALCULATION PROCESSING APPARATUS AND INFORMATION PROCESSING SYSTEM. 3y 2m to grant; granted Sep 06, 2022
16/995,295 (Patent 11429307): APPARATUS AND METHOD FOR PERFORMING GARBAGE COLLECTION IN A MEMORY SYSTEM. 2y 0m to grant; granted Aug 30, 2022
Study what changed to get past this examiner. Based on 5 most recent grants.
Prosecution Projections

2-3
Expected OA Rounds
58%
Grant Probability
68%
With Interview (+9.1%)
4y 3m (~2y 0m remaining)
Median Time to Grant
Moderate
PTA Risk
Based on 318 resolved cases by this examiner. Grant probability derived from career allowance rate.