Prosecution Insights
Last updated: April 19, 2026
Application No. 18/214,460

APPLICATION PROGRAMMING INTERFACE TO CAUSE INFORMATION TO BE READ FROM A LOCATION

Non-Final OA: §103, Double Patenting
Filed: Jun 26, 2023
Examiner: YUAN, PETER LI
Art Unit: 2197
Tech Center: 2100 — Computer Architecture & Software
Assignee: Nvidia Corporation
OA Round: 1 (Non-Final)
Grant Probability: Favorable
OA Rounds: 1-2
To Grant: 3y 3m

Examiner Intelligence

Career Allow Rate: 0% (0 granted / 0 resolved; -55.0% vs TC avg)
Interview Lift: +0.0% (minimal), based on resolved cases with interview
Avg Prosecution (typical timeline): 3y 3m
Total Applications: 10 (10 currently pending, across all art units)

Statute-Specific Performance

§101: 29.0% (-11.0% vs TC avg)
§103: 52.6% (+12.6% vs TC avg)
§112: 13.2% (-26.8% vs TC avg)
Tech Center average shown for comparison (estimate). Based on career data from 0 resolved cases.
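As a sanity check on the statute-level figures above, the three "vs TC avg" deltas can be inverted to recover the implied Tech Center average. This sketch assumes the deltas are simple percentage-point differences (the dashboard does not state how they are computed):

```python
# Statute-specific allow rates and their reported deltas vs the Tech Center
# average, as listed above.
examiner = {"§101": 29.0, "§103": 52.6, "§112": 13.2}
delta_vs_tc = {"§101": -11.0, "§103": +12.6, "§112": -26.8}

# Implied TC average per statute: examiner rate minus its reported delta.
implied_tc_avg = {s: round(examiner[s] - delta_vs_tc[s], 1) for s in examiner}
print(implied_tc_avg)  # {'§101': 40.0, '§103': 40.0, '§112': 40.0}
```

All three statutes back out to the same 40.0% figure, which is consistent with the deltas being percentage-point offsets from a single Tech Center average estimate.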

Office Action

Grounds: §103, Double Patenting
DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This Office Action is in response to claims filed 06/26/2023. Claims 1-20 are pending.

Double Patenting

The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the "right to exclude" granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).

A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional, the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to a final Office action, see 37 CFR 1.113(c). A request for reconsideration, while not provided for in 37 CFR 1.113(c), may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.

The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.

Claims 1, 4, 6, 8, 11, 14, 18, and 20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 7, 4, 8, 11, 14, 16, and 20, respectively, of copending application 18/214,447 (hereafter '447) as exemplified in the table below. Additionally, claims 3, 7, 9, 10, 12, 13, 15, 17, and 19 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 1, 10, 13, 8, 10, 19, 14, and 16, respectively, of copending application '447 in view of Modukuri et al., Pub. No. US 2021/0286752 A1 (hereafter Modukuri), as exemplified in the table below.

Instant Application | 18/214,447

1.
A processor, comprising: one or more circuits to perform an application programming interface (API) to cause information to be read from one or more non-uniform memory access (NUMA) storages or one or more graphics processor unit (GPU) physical storages based, at least in part, on one or more indicators to be indicated by one or more users of the API.

1. (Previously Presented) One or more processors, comprising: circuitry to, in response to an application programming interface (API) call, cause one or more non-uniform memory access (NUMA) nodes or one or more physical addresses allocated to one or more graphics processing units (GPUs) to be accessed based, at least in part, on one or more parameters of the API call, wherein the one or more parameters comprise one or more user-specified indications specifying a NUMA node to be set as a storage location for a range of virtual memory addresses.

3. The processor of claim 1, wherein the information is stored using one or more virtual memory addresses accessible by one or more central processing units (CPUs) and one or more GPUs, and the API is to cause the information to be read based, at least in part, on causing the information to be stored within a NUMA storage.

1. (Previously Presented) One or more processors, comprising: circuitry to, in response to an application programming interface (API) call, cause one or more non-uniform memory access (NUMA) nodes or one or more physical addresses allocated to one or more graphics processing units (GPUs) to be accessed based, at least in part, on one or more parameters of the API call, wherein the one or more parameters comprise one or more user-specified indications specifying a NUMA node to be set as a storage location for a range of virtual memory addresses.

4.
(Previously Presented) The one or more processors of claim 1, wherein the circuitry, in response to the API call, is to cause the NUMA node to be used as a preferred storage location of data to be stored using the range of virtual memory addresses accessible by a central processing unit (CPU) and the one or more GPUs.

Modukuri ¶ [0075] states: "In at least one embodiment, cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset), where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into, size is a size in bytes to read, and offset is an offset in a file to read from. In at least one embodiment, based on cuFileRead call, API reads specified bytes from file descriptor into device memory using dynamic data transfer routing."

4. The processor of claim 1, wherein the one or more indicators include one or more indicators that indicate virtual memory accessible by one or more central processing units (CPUs) and one or more GPUs.

7. (Currently Amended) The one or more processors of claim 1, wherein the one or more NUMA nodes include one or more central processing units (CPUs) and the one or more user-specified indications within the API include information provided as input that indicates the range of virtual memory addresses accessible to the one or more CPUs and the one or more GPUs.

6. The processor of claim 1, wherein the one or more indicators include: one or more indicators that indicate virtual memory accessible by one or more central processing units (CPUs) and one or more GPUs; and one or more indicators that indicate a particular NUMA node of a plurality of NUMA nodes.

1.
(Previously Presented) One or more processors, comprising: circuitry to, in response to an application programming interface (API) call, cause one or more non-uniform memory access (NUMA) nodes or one or more physical addresses allocated to one or more graphics processing units (GPUs) to be accessed based, at least in part, on one or more parameters of the API call, wherein the one or more parameters comprise one or more user-specified indications specifying a NUMA node to be set as a storage location for a range of virtual memory addresses.

4. (Previously Presented) The one or more processors of claim 1, wherein the circuitry, in response to the API call, is to cause the NUMA node to be used as a preferred storage location of data to be stored using the range of virtual memory addresses accessible by a central processing unit (CPU) and the one or more GPUs.

7. The processor of claim 1, wherein the API is to cause the information to be read from a NUMA storage based, at least in part, on causing the information to be stored at a NUMA node indicated by one or more of the one or more indicators.

1. (Previously Presented) One or more processors, comprising: circuitry to, in response to an application programming interface (API) call, cause one or more non-uniform memory access (NUMA) nodes or one or more physical addresses allocated to one or more graphics processing units (GPUs) to be accessed based, at least in part, on one or more parameters of the API call, wherein the one or more parameters comprise one or more user-specified indications specifying a NUMA node to be set as a storage location for a range of virtual memory addresses.
Modukuri ¶ [0075] states: "In at least one embodiment, cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset), where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into, size is a size in bytes to read, and offset is an offset in a file to read from. In at least one embodiment, based on cuFileRead call, API reads specified bytes from file descriptor into device memory using dynamic data transfer routing."

8. A system, comprising: one or more processors to perform an application programming interface (API) to cause information to be read from one or more non-uniform memory access (NUMA) storages or one or more graphics processor unit (GPU) physical storages based, at least in part, on one or more indicators to be indicated by one or more users of the API.

8. (Previously Presented) A system, comprising: one or more processors to, in response to an application programming interface (API) call, cause one or more non-uniform memory access (NUMA) nodes or one or more physical addresses allocated to one or more graphics processing units (GPUs) to be accessed based, at least in part, on one or more parameters of the API call, wherein the one or more parameters compromise one or more user-specified indications specifying a NUMA node to be set as a storage location for a range of virtual memory addresses.

9. The system of claim 8, wherein the information is stored using one or more virtual memory addresses accessible by one or more central processing units (CPUs) and one or more GPUs, and the API is to cause the information to be read from a NUMA storage based, at least in part, on causing information to be stored at the NUMA storage.

8.
(Previously Presented) A system, comprising: one or more processors to, in response to an application programming interface (API) call, cause one or more non-uniform memory access (NUMA) nodes or one or more physical addresses allocated to one or more graphics processing units (GPUs) to be accessed based, at least in part, on one or more parameters of the API call, wherein the one or more parameters compromise one or more user-specified indications specifying a NUMA node to be set as a storage location for a range of virtual memory addresses.

10. (Previously Presented) The system of claim 8, wherein the one or more processors, in response to the API call, is to cause a memory of the NUMA node that includes a central processing unit (CPU) to be used as a preferred physical storage location of data to be stored using the range of virtual memory addresses accessible by the CPU and the one or more GPUs.

Modukuri ¶ [0075] states: "In at least one embodiment, cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset), where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into, size is a size in bytes to read, and offset is an offset in a file to read from. In at least one embodiment, based on cuFileRead call, API reads specified bytes from file descriptor into device memory using dynamic data transfer routing."

10. The system of claim 8, wherein the information is stored using one or more virtual memory addresses, and one or more of the one or more indicators indicate the one or more virtual memory addresses.

8.
(Previously Presented) A system, comprising: one or more processors to, in response to an application programming interface (API) call, cause one or more non-uniform memory access (NUMA) nodes or one or more physical addresses allocated to one or more graphics processing units (GPUs) to be accessed based, at least in part, on one or more parameters of the API call, wherein the one or more parameters compromise one or more user-specified indications specifying a NUMA node to be set as a storage location for a range of virtual memory addresses.

13. (Previously Presented) The system of claim 8, wherein the one or more processors, in response to the API call, is to cause an indication of a preferred physical storage location of the range of virtual memory addresses to be stored.

Modukuri ¶ [0075] states: "In at least one embodiment, cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset), where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into, size is a size in bytes to read, and offset is an offset in a file to read from. In at least one embodiment, based on cuFileRead call, API reads specified bytes from file descriptor into device memory using dynamic data transfer routing."

11. The system of claim 8, wherein the one or more indicators include one or more indicators that indicate a particular NUMA node of a plurality of NUMA nodes.

11. (Previously Presented) The system of claim 8, wherein the one or more user-specified indications include information that indicates the NUMA node to be used as a storage location of data.

12. The system of claim 8, wherein the API is to cause the information to be read from a NUMA storage within a particular NUMA node of a plurality of NUMA nodes.

8.
(Previously Presented) A system, comprising: one or more processors to, in response to an application programming interface (API) call, cause one or more non-uniform memory access (NUMA) nodes or one or more physical addresses allocated to one or more graphics processing units (GPUs) to be accessed based, at least in part, on one or more parameters of the API call, wherein the one or more parameters compromise one or more user-specified indications specifying a NUMA node to be set as a storage location for a range of virtual memory addresses.

13. The system of claim 8, wherein the information is stored using one or more virtual memory addresses accessible by a plurality of NUMA nodes and one or more GPUs.

10. (Previously Presented) The system of claim 8, wherein the one or more processors, in response to the API call, is to cause a memory of the NUMA node that includes a central processing unit (CPU) to be used as a preferred physical storage location of data to be stored using the range of virtual memory addresses accessible by the CPU and the one or more GPUs.

Modukuri ¶ [0075] states: "In at least one embodiment, cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset), where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into, size is a size in bytes to read, and offset is an offset in a file to read from. In at least one embodiment, based on cuFileRead call, API reads specified bytes from file descriptor into device memory using dynamic data transfer routing."

14. A method, comprising: performing an application programming interface (API) to cause information to be read from one or more non-uniform memory access (NUMA) storages or one or more graphics processor unit (GPU) physical storages based, at least in part, on one or more indicators to be indicated by one or more users of the API.

14.
(Previously Presented) A method, comprising: in response to an application programming interface (API) call, causing one or more non-uniform memory access (NUMA) nodes or one or more physical addresses allocated to one or more graphics processing units (GPUs) to be accessed based, at least in part, on one or more parameters of the API call, wherein the one or more parameters compromise one or more user-specified indications specifying a NUMA node to be set as a storage location for a range of virtual memory addresses.

15. The method of claim 14, wherein the information is stored using one or more virtual memory addresses accessible by one or more NUMA nodes and one or more GPUs, and the API is to cause the information to be stored within a particular NUMA node of the one or more NUMA nodes.

14. (Previously Presented) A method, comprising: in response to an application programming interface (API) call, causing one or more non-uniform memory access (NUMA) nodes or one or more physical addresses allocated to one or more graphics processing units (GPUs) to be accessed based, at least in part, on one or more parameters of the API call, wherein the one or more parameters compromise one or more user-specified indications specifying a NUMA node to be set as a storage location for a range of virtual memory addresses.

19. (Previously Presented) The method of claim 14, further comprising: in response to the API call, causing one or more indications to be stored that include information that indicates the NUMA node to be used as a preferred storage location of information stored using the range of virtual memory addresses accessible by a central processing unit (CPU) of the NUMA node and the one or more GPUs.
Modukuri ¶ [0075] states: "In at least one embodiment, cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset), where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into, size is a size in bytes to read, and offset is an offset in a file to read from. In at least one embodiment, based on cuFileRead call, API reads specified bytes from file descriptor into device memory using dynamic data transfer routing."

17. The method of claim 14, wherein the information is stored using one or more virtual memory addresses, and the API is to store the information within a NUMA storage of a NUMA node indicated by one or more of the one or more indicators.

14. (Previously Presented) A method, comprising: in response to an application programming interface (API) call, causing one or more non-uniform memory access (NUMA) nodes or one or more physical addresses allocated to one or more graphics processing units (GPUs) to be accessed based, at least in part, on one or more parameters of the API call, wherein the one or more parameters compromise one or more user-specified indications specifying a NUMA node to be set as a storage location for a range of virtual memory addresses.

Modukuri ¶ [0075] states: "In at least one embodiment, cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset), where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into, size is a size in bytes to read, and offset is an offset in a file to read from. In at least one embodiment, based on cuFileRead call, API reads specified bytes from file descriptor into device memory using dynamic data transfer routing."

18. The method of claim 14, wherein the one or more indicators indicate a range of virtual memory and a NUMA node.

14.
(Previously Presented) A method, comprising: in response to an application programming interface (API) call, causing one or more non-uniform memory access (NUMA) nodes or one or more physical addresses allocated to one or more graphics processing units (GPUs) to be accessed based, at least in part, on one or more parameters of the API call, wherein the one or more parameters compromise one or more user-specified indications specifying a NUMA node to be set as a storage location for a range of virtual memory addresses.

16. (Previously Presented) The method of claim 14, wherein the one or more user-specified indications include information that indicates the range of virtual memory accessible by a central processing unit (CPU) and the one or more GPUs.

19. The method of claim 14, wherein the information is stored using virtual memory accessible by a plurality of NUMA nodes and one or more GPUs, and the API is to cause the information to be read from a storage of a NUMA node of the plurality of NUMA nodes.

14. (Previously Presented) A method, comprising: in response to an application programming interface (API) call, causing one or more non-uniform memory access (NUMA) nodes or one or more physical addresses allocated to one or more graphics processing units (GPUs) to be accessed based, at least in part, on one or more parameters of the API call, wherein the one or more parameters compromise one or more user-specified indications specifying a NUMA node to be set as a storage location for a range of virtual memory addresses.

16. (Previously Presented) The method of claim 14, wherein the one or more user-specified indications include information that indicates the range of virtual memory accessible by a central processing unit (CPU) and the one or more GPUs.
Modukuri ¶ [0075] states: "In at least one embodiment, cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset), where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into, size is a size in bytes to read, and offset is an offset in a file to read from. In at least one embodiment, based on cuFileRead call, API reads specified bytes from file descriptor into device memory using dynamic data transfer routing."

20. A non-transitory computer-readable medium having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to at least perform the method of claim 14.

20. (Previously Presented) A non-transitory computer-readable medium having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to at least perform the method of claim 14.

Although the claims at issue are not identical, they are not patentably distinct from each other because the instant application and '447 overlap in scope. For example, in claims 1, 8, and 14 of the instant application, information is read. In claims 1, 8, and 14 of '447, the NUMA nodes and/or GPUs containing information are accessed. Although the terms "reading" and "access" are not identical, they have a shared meaning. Another example is the pair of terms "indicator" in claims 1, 6, 7, 8, 10, 14, and 17 of the instant application and "parameter" in claims 1, 1, 1, 8, 8, 14, and 14 of '447. Other matching phrases are exemplified in the table above. Therefore, the claims of the instant application and '447 are not patentably distinct despite minor differences in language.

Regarding claims 3, 7, 9, 10, 12, 13, 15, 17, and 19, these claims recite the "storing" of information.
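For readers unfamiliar with the cuFileRead signature quoted from Modukuri throughout the chart above, its semantics amount to a positional read from a file descriptor into a device buffer. The sketch below is purely illustrative, not the cuFile library: an ordinary bytearray stands in for the device pointer, and `cufileread_like` is a hypothetical name; only the parameter list (fh, devPtr, size, offset) is taken from the quoted signature.

```python
import os
import tempfile

def cufileread_like(fd, buf, size, offset):
    """Illustrative stand-in for the quoted cuFileRead semantics: read
    `size` bytes from file descriptor `fd` at file `offset` into `buf`
    (a bytearray standing in for a device pointer). Returns the number
    of bytes read, mirroring the ssize_t return value."""
    data = os.pread(fd, size, offset)
    buf[:len(data)] = data
    return len(data)

# Demo: write a small file, then read 4 bytes at offset 7 into the buffer.
with tempfile.TemporaryFile() as f:
    f.write(b"header:payload")
    f.flush()
    dev_buf = bytearray(4)
    n = cufileread_like(f.fileno(), dev_buf, 4, 7)
    print(n, bytes(dev_buf))  # 4 b'payl'
```

Like the quoted cuFileRead, `os.pread` takes an explicit offset and does not rely on the file's current position, which is why the offset parameter appears in the signature.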
It would have been obvious to a person having ordinary skill in the art prior to the effective filing date of the claimed invention to combine the reading of information into memory of Modukuri with claims 1, 1, 10, 13, 8, 10, 19, 14, and 16 of '447, respectively, resulting in a system that explicitly stores information when interacting with the API. '447 teaches setting preferences for storage locations. Setting a preference suggests storing the preference in a memory or a storage. Additionally, the preference corresponds to the location of a storage, which suggests that the storage of general information is possible within the system. The context of using the term "storage" includes performing an operation using virtual memory addresses, and Modukuri's cuFileRead also uses virtual memory addresses. A person having ordinary skill in the art would have recognized that these elements could have been combined according to known methods and would have yielded predictable results: setting a preference for a storage location and storing general information are two functions that would behave the same in a combined system or in two separate systems. A person having ordinary skill in the art would have been motivated to combine setting a preference for a storage location with storing general information at a location, and would have recognized that the combination would predictably result in information being stored at the location specified in the preferences.

Claims 1-20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 2, 6, 10, 5, 12, 1, 8, 9, 12, 12, 8, 8, 14, 15, 16, 14, 12, 14, and 20, respectively, of copending application 18/214,449 (hereafter '449) in view of Modukuri et al., Pub. No. US 2021/0286752 A1 (hereafter Modukuri), as exemplified in the table below.

Instant Application | 18/214,449

1.
A processor, comprising: one or more circuits to perform an application programming interface (API) to cause information to be read from one or more non-uniform memory access (NUMA) storages or one or more graphics processor unit (GPU) physical storages based, at least in part, on one or more indicators to be indicated by one or more users of the API.

1. (Currently Amended) One or more processors comprising: circuitry to, in response to an application programming interface (API) call, cause information stored using one or more virtual memory addresses accessible to one or more non-uniform memory access (NUMA) nodes and one or more graphics processor units (GPUs) to be stored in storage[[s]] of a NUMA node of the one or more NUMA nodes indicated by one or more parameters of the API. Modukuri ¶ [0075] teaches: "In at least one embodiment, based on cuFileRead call, API reads specified bytes from file descriptor into device memory using dynamic data transfer routing."

2. The processor of claim 1, wherein the API is to cause the information to be read from a NUMA storage based, at least in part, on causing the information to be prefetched to the NUMA storage.

2. (Currently Amended) The one or more processors of claim 1, wherein the API call is to cause the information to be prefetched to a memory of [[a]] the NUMA node. Modukuri ¶ [0075] teaches: "In at least one embodiment, based on cuFileRead call, API reads specified bytes from file descriptor into device memory using dynamic data transfer routing."

3. The processor of claim 1, wherein the information is stored using one or more virtual memory addresses accessible by one or more central processing units (CPUs) and one or more GPUs, and the API is to cause the information to be read based, at least in part, on causing the information to be stored within a NUMA storage.

6.
(Currently Amended) The one or more processors of claim 1, wherein the information is stored using the one or more virtual memory addresses accessible by the one or more GPUs and one or more central processing units (CPUs), and the API call is to cause the information to be stored within a physical memory of [[a]] the NUMA node. Modukuri ¶ [0075] teaches: "In at least one embodiment, based on cuFileRead call, API reads specified bytes from file descriptor into device memory using dynamic data transfer routing."

4. The processor of claim 1, wherein the one or more indicators include one or more indicators that indicate virtual memory accessible by one or more central processing units (CPUs) and one or more GPUs.

10. (Currently Amended) The system of claim 8, wherein the one or more parameters include information that indicates a range of memory accessible to one or more central processing units (CPUs) and the one or more GPUs.

5. The processor of claim 1, wherein the one or more indicators include one or more indicators that indicate a NUMA node to which the information is to be prefetched.

5. (Currently Amended) The one or more processors of claim 1, wherein the API call is to cause the information to be prefetched to a memory of a host NUMA node specified by one or more of the one or more parameters.

6. The processor of claim 1, wherein the one or more indicators include: one or more indicators that indicate virtual memory accessible by one or more central processing units (CPUs) and one or more GPUs; and one or more indicators that indicate a particular NUMA node of a plurality of NUMA nodes.

8.
(Currently Amended) A system, comprising: one or more processors to receive an application programming interface (API) call including one or more parameters indicating at least a non-uniform memory access (NUMA) node and in response to the API call, cause information stored using one or more virtual memory addresses accessible one or more (NUMA) nodes and one or more graphics processor units (GPUs) to be stored in the indicated NUMA node of the one or more NUMA nodes.

12. (Currently Amended) The system of claim 8, wherein the one or more parameters indicate a range of virtual memory and [[a]] the NUMA node.

7. The processor of claim 1, wherein the API is to cause the information to be read from a NUMA storage based, at least in part, on causing the information to be stored at a NUMA node indicated by one or more of the one or more indicators.

1. (Currently Amended) One or more processors comprising: circuitry to, in response to an application programming interface (API) call, cause information stored using one or more virtual memory addresses accessible to one or more non-uniform memory access (NUMA) nodes and one or more graphics processor units (GPUs) to be stored in storage[[s]] of a NUMA node of the one or more NUMA nodes indicated by one or more parameters of the API. Modukuri ¶ [0075] teaches: "In at least one embodiment, based on cuFileRead call, API reads specified bytes from file descriptor into device memory using dynamic data transfer routing."

8. A system, comprising: one or more processors to perform an application programming interface (API) to cause information to be read from one or more non-uniform memory access (NUMA) storages or one or more graphics processor unit (GPU) physical storages based, at least in part, on one or more indicators to be indicated by one or more users of the API.

8.
(Currently Amended) A system, comprising: one or more processors to receive an application programming interface (API) call including one or more parameters indicating at least a non-uniform memory access (NUMA) node and in response to the API call, cause information stored using one or more virtual memory addresses accessible one or more (NUMA) nodes and one or more graphics processor units (GPUs) to be stored in the indicated NUMA node of the one or more NUMA nodes. Modukuri ¶ [0075] teaches: "In at least one embodiment, based on cuFileRead call, API reads specified bytes from file descriptor into device memory using dynamic data transfer routing."

9. The system of claim 8, wherein the information is stored using one or more virtual memory addresses accessible by one or more central processing units (CPUs) and one or more GPUs, and the API is to cause the information to be read from a NUMA storage based, at least in part, on causing information to be stored at the NUMA storage.

9. (Currently Amended) The system of claim 8, wherein the information is stored using virtual addresses accessible to one or more central processing units (CPUs) and the one or more GPUs, and the API call is to cause the information to be stored within a physical memory of [[a]] the NUMA node. Modukuri ¶ [0075] teaches: "In at least one embodiment, based on cuFileRead call, API reads specified bytes from file descriptor into device memory using dynamic data transfer routing."

10. The system of claim 8, wherein the information is stored using one or more virtual memory addresses, and one or more of the one or more indicators indicate the one or more virtual memory addresses.

8.
(Currently Amended) A system, comprising: one or more processors to receive an application programming interface (API) call including one or more parameters indicating at least a non-uniform memory access (NUMA) node and in response to the API call, cause information stored using one or more virtual memory addresses accessible one or more (NUMA) nodes and one or more graphics processor units (GPUs) to be stored in the indicated NUMA node of the one or more NUMA nodes. 12. (Currently Amended) The system of claim 8, wherein the one or more parameters indicate a range of virtual memory and [[a ]] the NUMA node. 11. The system of claim 8, wherein the one or more indicators include one or more indicators that indicate a particular NUMA node of a plurality of NUMA nodes. 12. (Currently Amended) The system of claim 8, wherein the one or more parameters indicate a range of virtual memory and [[ a ]] the NUMA node. 12. The system of claim 8, wherein the API is to cause the information to be read from a NUMA storage within a particular NUMA node of a plurality of NUMA nodes. 8. (Currently Amended) A system, comprising: one or more processors to receive an application programming interface (API) call including one or more parameters indicating at least a non-uniform memory access (NUMA) node and in response to the API call, cause information stored using one or more virtual memory addresses accessible one or more (NUMA) nodes and one or more graphics processor units (GPUs) to be stored in the indicated NUMA node of the one or more NUMA nodes. Modukuri ¶ [0075] teaches “In at least one embodiment, based on cuFileRead call, API reads specified bytes from file descriptor into device memory using dynamic data transfer routing” 13. The system of claim 8, wherein the information is stored using one or more virtual memory addresses accessible by a plurality of NUMA nodes and one or more GPUs. 8. 
(Currently Amended) A system, comprising: one or more processors to receive an application programming interface (API) call including one or more parameters indicating at least a non-uniform memory access (NUMA) node and in response to the API call, cause information stored using one or more virtual memory addresses accessible one or more (NUMA) nodes and one or more graphics processor units (GPUs) to be stored in the indicated NUMA node of the one or more NUMA nodes. 14. A method, comprising: performing an application programming interface (API) to cause information to be read from one or more non-uniform memory access (NUMA) storages or one or more graphics processor unit (GPU) physical storages based, at least in part, on one or more indicators to be indicated by one or more users of the API. 14. (Currently Amended) A method, comprising: in response to an application programming interface (API) call, causing information stored using one or more virtual memory addresses accessible one or more non-uniform memory access (NUMA) nodes and one or more graphics processor units (GPUs) to be stored in storage[[s]] of a NUMA node of the one or more NUMA nodes Modukuri ¶ [0075] teaches “In at least one embodiment, based on cuFileRead call, API reads specified bytes from file descriptor into device memory using dynamic data transfer routing” indicated by one or more parameters of the API. 15. The method of claim 14, wherein the information is stored using one or more virtual memory addresses accessible by one or more NUMA nodes and one or more GPUs, and the API is to cause the information to be stored within a particular NUMA node of the one or more NUMA nodes. 14. 
(Currently Amended) A method, comprising: in response to an application programming interface (API) call, causing information stored using one or more virtual memory addresses accessible one or more non-uniform memory access (NUMA) nodes and one or more graphics processor units (GPUs) to be stored in storage[[s]] of a NUMA node of the one or more NUMA nodes indicated by one or more parameters of the API. 15. (Currently Amended) The method of claim 14, wherein the API call is to cause the information accessible to one or more central processing units (CPUs) and the one or more GPUs to be stored within a NUMA storage based, at least in part, on the one or more parameters. 16. The method of claim 14, wherein the information is stored using one or more virtual memory addresses accessible by one or more NUMA nodes and one or more GPUs, and the one or more indicators indicate the one or more virtual memory addresses and a location to which the information is to be prefetched. 14. (Currently Amended) A method, comprising: in response to an application programming interface (API) call, causing information stored using one or more virtual memory addresses accessible one or more non-uniform memory access (NUMA) nodes and one or more graphics processor units (GPUs) to be stored in storage[[s]] of a NUMA node of the one or more NUMA nodes indicated by one or more parameters of the API. 16. (Currently Amended) The method of claim 14, wherein the API call is to prefetch the information to a location specified by one or more of the one or more parameters. Modukuri ¶ [0075] states “cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset), where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into,” 17. 
The method of claim 14, wherein the information is stored using one or more virtual memory addresses, and the API is to store the information within a NUMA storage of a NUMA node indicated by one or more of the one or more indicators. 14. (Currently Amended) A method, comprising: in response to an application programming interface (API) call, causing information stored using one or more virtual memory addresses accessible one or more non-uniform memory access (NUMA) nodes and one or more graphics processor units (GPUs) to be stored in storage[[s]] of a NUMA node of the one or more NUMA nodes indicated by one or more parameters of the API. 18. The method of claim 14, wherein the one or more indicators indicate a range of virtual memory and a NUMA node. 12. (Currently Amended) The system of claim 8, wherein the one or more parameters indicate a range of virtual memory and [[a ]] the NUMA node. 19. The method of claim 14, wherein the information is stored using virtual memory accessible by a plurality of NUMA nodes and one or more GPUs, and the API is to cause the information to be read from a storage of a NUMA node of the plurality of NUMA nodes. 14. (Currently Amended) A method, comprising: in response to an application programming interface (API) call, causing information stored using one or more virtual memory addresses accessible one or more non-uniform memory access (NUMA) nodes and one or more graphics processor units (GPUs) to be stored in storage[[s]] of a NUMA node of the one or more NUMA nodes indicated by one or more parameters of the API. Modukuri ¶ [0075] teaches “In at least one embodiment, based on cuFileRead call, API reads specified bytes from file descriptor into device memory using dynamic data transfer routing” 20. A non-transitory computer-readable medium having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to at least perform the method of claim 14. 20. 
(Original) A non-transitory computer-readable medium having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to at least perform the method of claim 14. Although the claims at issue are not identical, they are not patentably distinct from each other because ‘449 covers the majority the scope of the instant application using identical or synonymous terms. For example, indicators and parameters are interpreted as the same. Other identical features include having virtual memory be accessible by NUMA nodes and GPUs, storing information, and parameters indicating a NUMA node. Regarding claims 4, 6, and 18, the claims are rejected because of identical or similar language to claims 10, 12, and 12 of ‘449, respectively. They are not patentably distinct as they only differ in statutory category of invention. Regarding claims 1-20, ‘449 does not cover the scope related to reading information. However, Modukuri teaches reading information. It would have been obvious to a person having ordinary skill in the art prior to the effective filing date of the claimed invention to substitute the storing of information of ‘449 with the reading of information of Modukuri. Modukuri states that APIs can read or write data (¶ [0071] states “In at least one embodiment API includes APIs for operations such as read or write of data”). A person having ordinary skill in the art would have been motivated to make this simple substitution, with a reasonable expectation of success, as this would merely substitute storing, or writing, information with reading information. The results of reading information instead of writing information would have been predictable and obvious that information could either be read or written. Regarding claim 16, claim 16 of ‘449 does not entirely teach the limitation. Claim 16 of ‘449 lacks indicators indicating virtual memory. However, Modukuri teaches indicating virtual memory. 
It would have been obvious to a person having ordinary skill in the art prior to the effective filing date of the claimed invention to combine the indicator indicating virtual memory of Modukuri and the prefetching of claim 16 of ‘449. A person having ordinary skill in the art would have been motivated to make this combination to know the virtual memory location to which the information is to be prefetched. Additionally, a person having ordinary skill in the art would have been motivated to make this combination to improve data transfer performance (Modukuri ¶ [0075] states “In at least one embodiment, API reads data into GPU memory using dynamic data transfer routing”. Modukuri ¶ [0070] also states “In at least one embodiment, dynamic data transfer routing provides benefits by increasing data transfer performance (e.g., by decreasing data transfer time)”. Virtual addresses are needed to facilitate this). Claim Rejections - 35 USC § 103 The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. Claim(s) 1-4, 8-10, 12-16, and 19-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Modukuri et al. Pub. No. US 20210286752 A1 (hereafter Modukuri) in view of Dugast et al. Pub. No. US 20220050722 A1 (hereafter Dugast).
With regard to claim 1, Modukuri teaches a processor, comprising (¶ [0053] states “an application (e.g., running on first CPU 102, second CPU 104, first GPU 128, and/or second GPU 130) sends an IO function call (e.g., specifying a read or write operation) via an API (e.g., running on first CPU 102, second CPU 104, first GPU 128, and/or second GPU 130)”): one or more circuits to perform an application programming interface (API) to cause information to be read from one or more non-uniform memory access (NUMA) storages or one or more graphics processor unit (GPU) physical storages based, at least in part, on one or more indicators to be indicated by one or more users of the API (¶ [0075] states “API receives a read function call (e.g., cuFileRead) and performs dynamic data transfer routing based, at least in part on received read function call. In at least one embodiment, cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset); where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into, size is a size in bytes to read, and offset is an offset in a file to read from. In at least one embodiment, based on cuFileRead call, API reads specified bytes from file descriptor into device memory using dynamic data transfer routing” and “API reads data into GPU memory using dynamic data transfer routing”. Examiner’s Note: the parameters fh, devPtr, size, and offset are interpreted to be the indicators). Although Modukuri teaches a system with multiple processors with local memory (FIG. 1, ¶ [0049] - [0051]), Modukuri does not explicitly teach NUMA storages. 
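The cuFileRead signature quoted above lends itself to a brief illustration. The C sketch below is a hypothetical stand-in, not NVIDIA's implementation: a plain memory buffer plays the role of the file behind the descriptor, and an ordinary host buffer plays the role of device memory. It shows how the four caller-supplied parameters fh, devPtr, size, and offset fully determine what is read and where it lands, which is the sense in which they are treated as user-indicated "indicators".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Toy stand-in for a cuFileRead-style call.  The four parameters mirror
 * the quoted signature: fh (which "file"), devPtr (where to read into),
 * size (how many bytes), offset (where in the file to start). */
typedef struct {
    const char *data;   /* backing bytes standing in for the file */
    size_t      len;
} ToyFileHandle;

ssize_t toy_cuFileRead(ToyFileHandle *fh, void *devPtr,
                       size_t size, off_t offset)
{
    if (fh == NULL || devPtr == NULL || offset < 0 ||
        (size_t)offset >= fh->len)
        return -1;                          /* invalid indicators */
    size_t avail = fh->len - (size_t)offset;
    size_t n = size < avail ? size : avail; /* clamp to end of "file" */
    memcpy(devPtr, fh->data + offset, n);   /* perform the "transfer" */
    return (ssize_t)n;
}
```

A caller reading 4 bytes at offset 5 of a 17-byte backing buffer gets exactly those bytes; an offset at or past the end is rejected, so the indicators alone decide success and placement.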
However, in an analogous reference, Dugast teaches one or more circuits to perform an application programming interface (API) to cause information to be read from one or more non-uniform memory access (NUMA) storages or one or more graphics processor unit (GPU) physical storages based, at least in part, on one or more indicators to be indicated by one or more users of the API (¶ [0039] states “For example, memory pool 302 can provide a local memory pool (to the processors that execute process A and B) with a highest data access rate (e.g., read and/or write rate), but have a smallest size (e.g., amount of data that can read or written) and largest cost relative to memory pools 304 and 306. Memory pool 302 can represent a first non-uniform memory access (NUMA) node”. ¶ [0040] states “A second NUMA node can represent a different NUMA domain than that of the first NUMA node”. Examiner’s Notes: the environment can have a plurality of NUMA nodes. Processes can access NUMA nodes to read or write information). It would have been obvious to a person having ordinary skill in the art prior to the effective filing date to combine NUMA nodes of Dugast with the API to read information based on indicators of Modukuri. A person having ordinary skill in the art would have been motivated to make this combination for the purpose of optimizing data placement for performance benefits (Dugast ¶ [0002] states “Frequently accessed memory pages can be stored in local memory of a compute node whereas less frequently accessed memory pages can be stored in a more distant memory pool. Memory pools have different latency characteristics relative to a compute node whereby the compute node can write or read from some memory pools faster than other memory pools”). Improving performance and resource utilization also contributes to meeting service level agreements (Dugast ¶ [0017]). With regard to claim 2, Modukuri and Dugast teach the processor of claim 1. 
Modukuri additionally teaches wherein the API is to cause the information to be read from a NUMA storage based, at least in part, on causing the information to be prefetched to the NUMA storage (¶ [0075] states “In at least one embodiment, API receives a read function call”. ¶ [0056] states “a file system, block system, object system, or key-value store system, either in an operating system or a standalone driver, performs a read-ahead operation to accomplish prefetching using an implementation similar to that of transfers initiated with cuFile-based API calls. In at least one embodiment, at least one processor moves data based, at least in part, on an explicit prefetch (e.g., a CUDA cudaMemPrefetch)”). Dugast additionally teaches wherein the API is to cause the information to be read from a NUMA storage based, at least in part, on causing the information to be prefetched to the NUMA storage (¶ [0039] states “memory pool 302 can provide a local memory pool (to the processors that execute process A and B) with a highest data access rate (e.g., read and/or write rate), but have a smallest size (e.g., amount of data that can read or written) and largest cost relative to memory pools 304 and 306. Memory pool 302 can represent a first non-uniform memory access (NUMA) node”. ¶ [0064] states “Some examples utilize artificial intelligence (AI)-based technologies to detect memory access patterns per clustered pages for a workload and use this information to determine whether to pre-fetch data to nearer memory to the processor that executes the workload or migrate data to further memory away from the processor that executes the workload". Examiner’s Note: information can be read from or written to a NUMA node. Information can also be prefetched to a NUMA node). With regard to claim 3, Modukuri and Dugast teach the processor of claim 1. 
Modukuri additionally teaches wherein the information is stored using one or more virtual memory addresses accessible by one or more central processing units (CPUs) and one or more GPUs, and the API is to cause the information to be read based, at least in part, on causing the information to be stored within a NUMA storage (¶ [0053] states “an application (e.g., running on first CPU 102, second CPU 104, first GPU 128, and/or second GPU 130) sends an IO function call (e.g., specifying a read or write operation) via an API (e.g., running on first CPU 102, second CPU 104, first GPU 128, and/or second GPU 130), and API performs actions to dynamically route data transfer operation requested by IO function call”. ¶ [0075] states “API receives a read function call (e.g., cuFileRead) and performs dynamic data transfer routing based, at least in part on received read function call. In at least one embodiment, cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset); where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into, size is a size in bytes to read, and offset is an offset in a file to read from” and “API reads data into GPU memory using dynamic data transfer routing”. ¶ [0172] states “When performing graphics operations, an effective address 1693 generated by a graphics processing engine is translated to a real address by MMU 1639” and ¶ [0179] states “one or more MMU(s) 1720A-1720B provide for virtual to physical address mapping for graphics processor 1710”. Examiner’s Note: the CPUs and GPUs of Modukuri can call the API and perform the API. The API function cuFileRead includes a file descriptor and a pointer to address in memory. Therefore, it is interpreted that the CPUs and GPUs have access to the memory. The “effective address” is interpreted to be similar to the “virtual address”. 
Modukuri teaches a MMU mapping between virtual and physical memory of a graphics processor, so memory addresses can either be virtual or physical). Dugast additionally teaches wherein the information is stored using one or more virtual memory addresses accessible by one or more central processing units (CPUs) and one or more GPUs, and the API is to cause the information to be read based, at least in part, on causing the information to be stored within a NUMA storage (¶ [0039] states “memory pool 302 can provide a local memory pool (to the processors that execute process A and B) with a highest data access rate (e.g., read and/or write rate), but have a smallest size (e.g., amount of data that can read or written) and largest cost relative to memory pools 304 and 306. Memory pool 302 can represent a first non-uniform memory access (NUMA) node”. Examiner’s Note: information can be read from or written to a NUMA node). With regard to claim 4, Modukuri and Dugast teach the processor of claim 1. Modukuri additionally teaches wherein the one or more indicators include one or more indicators that indicate virtual memory accessible by one or more central processing units (CPUs) and one or more GPUs (¶ [0053] states “In at least one embodiment, an application (e.g., running on first CPU 102, second CPU 104, first GPU 128, and/or second GPU 130) sends an IO function call (e.g., specifying a read or write operation) via an API (e.g., running on first CPU 102, second CPU 104, first GPU 128, and/or second GPU 130), and API performs actions to dynamically route data transfer operation requested by IO function call”. ¶ [0075] states “In at least one embodiment, cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset); where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into, size is a size in bytes to read, and offset is an offset in a file to read from”. 
¶ [0172] states “When performing graphics operations, an effective address 1693 generated by a graphics processing engine is translated to a real address by MMU 1639” and ¶ [0179] states “one or more MMU(s) 1720A-1720B provide for virtual to physical address mapping for graphics processor 1710”. Examiner’s Note: the devPtr and size parameters together indicate a range of memory that will store information. Any of the CPUs or GPUs in the system can be the caller or the device that performs the API to access the memory address. The “effective address” is interpreted to be similar to the “virtual address”. Modukuri teaches a MMU mapping between virtual and physical memory of a graphics processor, so memory addresses can either be virtual or physical). With regard to claim 8, Modukuri teaches a system, comprising (¶ [0048] states “FIG. 1A is a block diagram illustrating a computer system 100, including data transfer path determination capability according to at least one embodiment”): one or more processors to perform an application programming interface (API) to cause information to be read from one or more non-uniform memory access (NUMA) storages or one or more graphics processor unit (GPU) physical storages based, at least in part, on one or more indicators to be indicated by one or more users of the API (¶ [0075] states “API receives a read function call (e.g., cuFileRead) and performs dynamic data transfer routing based, at least in part on received read function call. In at least one embodiment, cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset); where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into, size is a size in bytes to read, and offset is an offset in a file to read from. 
In at least one embodiment, based on cuFileRead call, API reads specified bytes from file descriptor into device memory using dynamic data transfer routing” and “API reads data into GPU memory using dynamic data transfer routing”. Examiner’s Note: the parameters fh, devPtr, size, and offset are interpreted to be the indicators). Although Modukuri teaches a system with multiple processors with local memory (FIG. 1, ¶ [0049] - [0051]), Modukuri does not explicitly teach NUMA storages. However, in an analogous reference, Dugast teaches one or more processors to perform an application programming interface (API) to cause information to be read from one or more non-uniform memory access (NUMA) storages or one or more graphics processor unit (GPU) physical storages based, at least in part, on one or more indicators to be indicated by one or more users of the API (¶ [0039] states “For example, memory pool 302 can provide a local memory pool (to the processors that execute process A and B) with a highest data access rate (e.g., read and/or write rate), but have a smallest size (e.g., amount of data that can read or written) and largest cost relative to memory pools 304 and 306. Memory pool 302 can represent a first non-uniform memory access (NUMA) node”. ¶ [0040] states “A second NUMA node can represent a different NUMA domain than that of the first NUMA node”. Examiner’s Notes: the environment can have a plurality of NUMA nodes. Processes can access NUMA nodes to read or write information). It would have been obvious to a person having ordinary skill in the art prior to the effective filing date to combine NUMA nodes of Dugast with the API to read information based on indicators of Modukuri. 
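Dugast's tiered memory pools can be illustrated with a small placement routine. The pool ids below echo Dugast's 302/304/306, but the selection policy is an invented sketch, not Dugast's disclosure: place data in the lowest-latency pool that still has room, which is the performance rationale the combination relies on.

```c
#include <assert.h>
#include <stddef.h>

/* Toy sketch of tiered-pool placement: each memory pool has a latency
 * relative to the calling processor, and the routine picks the nearest
 * pool that can hold the requested allocation. */
typedef struct {
    int    id;            /* e.g. 302, 304, 306 */
    int    latency_ns;    /* lower = nearer the processor */
    size_t free_bytes;    /* remaining capacity */
} Pool;

/* Return the id of the lowest-latency pool that can hold `need` bytes,
 * or -1 if no pool fits. */
int pick_pool(const Pool *pools, size_t npools, size_t need)
{
    int best = -1, best_lat = 0;
    for (size_t i = 0; i < npools; i++) {
        if (pools[i].free_bytes < need)
            continue;                     /* pool too full, skip it */
        if (best == -1 || pools[i].latency_ns < best_lat) {
            best = pools[i].id;
            best_lat = pools[i].latency_ns;
        }
    }
    return best;
}
```

With a small near pool and larger, slower far pools, small allocations land near the processor and large ones spill outward, matching the "frequently accessed pages in local memory" motivation quoted from Dugast ¶ [0002].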
A person having ordinary skill in the art would have been motivated to make this combination for the purpose of optimizing data placement for performance benefits (Dugast ¶ [0002] states “Frequently accessed memory pages can be stored in local memory of a compute node whereas less frequently accessed memory pages can be stored in a more distant memory pool. Memory pools have different latency characteristics relative to a compute node whereby the compute node can write or read from some memory pools faster than other memory pools”). Improving performance and resource utilization also contributes to meeting service level agreements (Dugast ¶ [0017]). With regard to claim 9, Modukuri and Dugast teach the system of claim 8. Modukuri additionally teaches wherein the information is stored using one or more virtual memory addresses accessible by one or more central processing units (CPUs) and one or more GPUs, and the API is to cause the information to be read from a NUMA storage based, at least in part, on causing information to be stored at the NUMA storage (¶ [0053] states “an application (e.g., running on first CPU 102, second CPU 104, first GPU 128, and/or second GPU 130) sends an IO function call (e.g., specifying a read or write operation) via an API (e.g., running on first CPU 102, second CPU 104, first GPU 128, and/or second GPU 130), and API performs actions to dynamically route data transfer operation requested by IO function call”. ¶ [0075] states “API receives a read function call (e.g., cuFileRead) and performs dynamic data transfer routing based, at least in part on received read function call.
In at least one embodiment, cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset); where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into, size is a size in bytes to read, and offset is an offset in a file to read from” and “API reads data into GPU memory using dynamic data transfer routing”. ¶ [0172] states “When performing graphics operations, an effective address 1693 generated by a graphics processing engine is translated to a real address by MMU 1639” and ¶ [0179] states “one or more MMU(s) 1720A-1720B provide for virtual to physical address mapping for graphics processor 1710”. Examiner’s Note: the CPUs and GPUs of Modukuri can call the API and perform the API. The API function cuFileRead includes a file descriptor and a pointer to address in memory. Therefore, it is interpreted that the CPUs and GPUs have access to the memory. The parameters devPtr and size establish the range of memory addresses. The “effective address” is interpreted to be similar to the “virtual address”. Modukuri teaches a MMU mapping between virtual and physical memory of a graphics processor, so memory addresses can either be virtual or physical). Dugast additionally teaches wherein the information is stored using one or more virtual memory addresses accessible by one or more central processing units (CPUs) and one or more GPUs, and the API is to cause the information to be read from a NUMA storage based, at least in part, on causing information to be stored at the NUMA storage (¶ [0039] states “memory pool 302 can provide a local memory pool (to the processors that execute process A and B) with a highest data access rate (e.g., read and/or write rate), but have a smallest size (e.g., amount of data that can read or written) and largest cost relative to memory pools 304 and 306. Memory pool 302 can represent a first non-uniform memory access (NUMA) node”. 
Examiner’s Note: information can be read from or written to a NUMA node). With regard to claim 10, Modukuri and Dugast teach the system of claim 8. Modukuri additionally teaches wherein the information is stored using one or more virtual memory addresses, and one or more of the one or more indicators indicate the one or more virtual memory addresses (¶ [0075] states “API receives a read function call (e.g., cuFileRead) and performs dynamic data transfer routing based, at least in part on received read function call. In at least one embodiment, cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset); where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into, size is a size in bytes to read, and offset is an offset in a file to read from”. ¶ [0172] states “When performing graphics operations, an effective address 1693 generated by a graphics processing engine is translated to a real address by MMU 1639” and ¶ [0179] states “one or more MMU(s) 1720A-1720B provide for virtual to physical address mapping for graphics processor 1710”. Examiner’s Notes: the parameter devPtr is a memory address. The parameters are interpreted as indicators. The “effective address” is interpreted to be similar to the “virtual address”. Modukuri teaches an MMU mapping between virtual and physical memory of a graphics processor, so memory addresses can either be virtual or physical). With regard to claim 12, Modukuri and Dugast teach the system of claim 8. Modukuri additionally teaches wherein the API is to cause the information to be read from a NUMA storage within a particular NUMA node of a plurality of NUMA nodes (¶ [0075] states “API receives a read function call (e.g., cuFileRead) and performs dynamic data transfer routing based, at least in part on received read function call”).
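The Examiner's Note above equates Modukuri's "effective address" with a virtual address that an MMU translates to a real, physical address. A minimal single-level page-table sketch makes that mapping concrete; the page size, table shape, and function names are invented for illustration and do not come from Modukuri.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal single-level page-table model of the MMU translation the
 * Office Action cites (effective/virtual address -> real/physical
 * address).  UINT64_MAX stands in for a translation fault. */
#define PAGE_BITS 12                       /* 4 KiB pages */
#define PAGE_SIZE (1u << PAGE_BITS)
#define NPAGES    16

/* page_table[v] holds the physical frame number for virtual page v,
 * or UINT32_MAX if the page is unmapped. */
static uint32_t page_table[NPAGES];

void mmu_reset(void)
{
    for (int i = 0; i < NPAGES; i++)
        page_table[i] = UINT32_MAX;        /* start with nothing mapped */
}

void map_page(uint32_t vpage, uint32_t frame)
{
    if (vpage < NPAGES)
        page_table[vpage] = frame;
}

uint64_t translate(uint64_t vaddr)
{
    uint64_t vpage  = vaddr >> PAGE_BITS;  /* which virtual page */
    uint64_t offset = vaddr & (PAGE_SIZE - 1);
    if (vpage >= NPAGES || page_table[vpage] == UINT32_MAX)
        return UINT64_MAX;                 /* fault: no mapping */
    return ((uint64_t)page_table[vpage] << PAGE_BITS) | offset;
}
```

Mapping virtual page 2 to physical frame 7 makes virtual address 0x2034 translate to physical 0x7034, while an unmapped address faults; this is the sense in which the same storage can be named by either a virtual or a physical address.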
Dugast additionally teaches wherein the API is to cause the information to be read from a NUMA storage within a particular NUMA node of a plurality of NUMA nodes (¶ [0039] states “For example, memory pool 302 can provide a local memory pool (to the processors that execute process A and B) with a highest data access rate (e.g., read and/or write rate), but have a smallest size (e.g., amount of data that can read or written) and largest cost relative to memory pools 304 and 306. Memory pool 302 can represent a first non-uniform memory access (NUMA) node”. ¶ [0040] states “A second NUMA node can represent a different NUMA domain than that of the first NUMA node”. Examiner’s Notes: the environment can have a plurality of NUMA nodes. Processes can access NUMA nodes to read or write information). With regard to claim 13, Modukuri and Dugast teach the system of claim 8. Modukuri additionally teaches wherein the information is stored using one or more virtual memory addresses accessible by a plurality of NUMA nodes and one or more GPUs (¶ [0053] states “an application (e.g., running on first CPU 102, second CPU 104, first GPU 128, and/or second GPU 130) sends an IO function call (e.g., specifying a read or write operation) via an API (e.g., running on first CPU 102, second CPU 104, first GPU 128, and/or second GPU 130), and API performs actions to dynamically route data transfer operation requested by IO function call”. ¶ [0075] states “API receives a read function call (e.g., cuFileRead) and performs dynamic data transfer routing based, at least in part on received read function call. In at least one embodiment, cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset); where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into, size is a size in bytes to read, and offset is an offset in a file to read from”. 
¶ [0172] states “When performing graphics operations, an effective address 1693 generated by a graphics processing engine is translated to a real address by MMU 1639” and ¶ [0179] states “one or more MMU(s) 1720A-1720B provide for virtual to physical address mapping for graphics processor 1710”. Examiner’s Note: the CPUs and GPUs of Modukuri can call the API and perform the API. The API function cuFileRead includes a file descriptor and a pointer to address in memory. Therefore, it is interpreted that the CPUs and GPUs have access to the memory. The “effective address” is interpreted to be similar to the “virtual address”. Modukuri teaches a MMU mapping between virtual and physical memory of a graphics processor, so memory addresses can either be virtual or physical). Dugast additionally teaches wherein the information is stored using one or more virtual memory addresses accessible by a plurality of NUMA nodes and one or more GPUs (¶ [0039] states “For example, multiple memory pools 302, 304, and 306 can be available for usage by process A and process B. For example, memory pool 302 can provide a local memory pool (to the processors that execute process A and B) with a highest data access rate (e.g., read and/or write rate), but have a smallest size (e.g., amount of data that can read or written) and largest cost relative to memory pools 304 and 306. Memory pool 302 can represent a first non-uniform memory access (NUMA) node”. ¶ [0040] states “A second NUMA node can represent a different NUMA domain than that of the first NUMA node”. Examiner’s Notes: the environment can have a plurality of NUMA nodes. Processes can access NUMA nodes to read or write information). With regard to claim 14, Modukuri teaches a method, comprising (¶ [0399] states “Terms “system” and “method” are used herein interchangeably insofar as system may embody one or more methods and methods may be considered a system”. ¶ [0048] states “FIG. 
1A is a block diagram illustrating a computer system 100, including data transfer path determination capability according to at least one embodiment”): performing an application programming interface (API) to cause information to be read from one or more non-uniform memory access (NUMA) storages or one or more graphics processor unit (GPU) physical storages based, at least in part, on one or more indicators to be indicated by one or more users of the API (¶ [0075] states “API receives a read function call (e.g., cuFileRead) and performs dynamic data transfer routing based, at least in part on received read function call. In at least one embodiment, cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset); where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into, size is a size in bytes to read, and offset is an offset in a file to read from. In at least one embodiment, based on cuFileRead call, API reads specified bytes from file descriptor into device memory using dynamic data transfer routing” and “API reads data into GPU memory using dynamic data transfer routing”. Examiner’s Note: the parameters fh, devPtr, size, and offset are interpreted to be the indicators). Although Modukuri teaches a system with multiple processors with local memory (FIG. 1, ¶ [0049] - [0051]), Modukuri does not explicitly teach NUMA storages. 
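For orientation, the cuFileRead semantics quoted above (read `size` bytes from a file descriptor at `offset` into a destination buffer, returning a byte count) can be sketched with a host-memory mock. This is a hypothetical illustration using `os.pread`, not NVIDIA's actual cuFile library; the buffer merely stands in for device memory:

```python
import os
import tempfile

def mock_cufile_read(fd, dev_buf, size, offset):
    """Read `size` bytes from file descriptor `fd` at byte `offset` into
    `dev_buf` (a bytearray standing in for GPU device memory), and return
    the count of bytes read, mirroring cuFileRead's ssize_t result."""
    data = os.pread(fd, size, offset)
    dev_buf[:len(data)] = data
    return len(data)

# Stage a file, then read 7 bytes starting at offset 7 ("PAYLOAD").
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"header-PAYLOAD-trailer")
    path = f.name
fd = os.open(path, os.O_RDONLY)
buf = bytearray(7)
n = mock_cufile_read(fd, buf, 7, 7)
os.close(fd)
os.unlink(path)
```

The four arguments correspond to the quoted parameters the examiner treats as indicators: fh (the descriptor), devPtr (the destination buffer), size, and offset.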
However, in an analogous reference, Dugast teaches performing an application programming interface (API) to cause information to be read from one or more non-uniform memory access (NUMA) storages or one or more graphics processor unit (GPU) physical storages based, at least in part, on one or more indicators to be indicated by one or more users of the API (¶ [0039] states “For example, memory pool 302 can provide a local memory pool (to the processors that execute process A and B) with a highest data access rate (e.g., read and/or write rate), but have a smallest size (e.g., amount of data that can read or written) and largest cost relative to memory pools 304 and 306. Memory pool 302 can represent a first non-uniform memory access (NUMA) node”. ¶ [0040] states “A second NUMA node can represent a different NUMA domain than that of the first NUMA node”. Examiner’s Notes: the environment can have a plurality of NUMA nodes. Processes can access NUMA nodes to read or write information). It would have been obvious to a person having ordinary skill in the art prior to the effective filing date to combine NUMA nodes of Dugast with the API to read information based on indicators of Modukuri. A person having ordinary skill in the art would have been motivated to make this combination for the purpose of optimizing data placement for performance benefits (Dugast ¶ [0002] states “Frequently accessed memory pages can be stored in local memory of a compute node whereas less frequently accessed memory pages can be stored in a more distant memory pool. Memory pools have different latency characteristics relative to a compute node whereby the compute node can write or read from some memory pools faster than other memory pools”). Improving performance and resource utilization also contributes to meeting service level agreements (Dugast ¶ [0017]). With regard to claim 15, Modukuri and Dugast teach the method of claim 14. 
Modukuri additionally teaches wherein the information is stored using one or more virtual memory addresses accessible by one or more NUMA nodes and one or more GPUs, and the API is to cause the information to be stored within a particular NUMA node of the one or more NUMA nodes (¶ [0053] states “an application (e.g., running on first CPU 102, second CPU 104, first GPU 128, and/or second GPU 130) sends an IO function call (e.g., specifying a read or write operation) via an API (e.g., running on first CPU 102, second CPU 104, first GPU 128, and/or second GPU 130), and API performs actions to dynamically route data transfer operation requested by IO function call”. ¶ [0075] states “API receives a read function call (e.g., cuFileRead) and performs dynamic data transfer routing based, at least in part on received read function call. In at least one embodiment, cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset); where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into, size is a size in bytes to read, and offset is an offset in a file to read from” and “based on cuFileRead call, API reads data from specified file handle at specified offset and size bytes into GPU memory”. ¶ [0172] states “When performing graphics operations, an effective address 1693 generated by a graphics processing engine is translated to a real address by MMU 1639” and ¶ [0179] states “one or more MMU(s) 1720A-1720B provide for virtual to physical address mapping for graphics processor 1710”. Examiner’s Note: the CPUs and GPUs of Modukuri can call the API and perform the API. The API function cuFileRead includes a file descriptor and a pointer or address in memory. Therefore, it is interpreted that the CPUs and GPUs have access to the memory. The “effective address” is interpreted to be similar to the “virtual address”. 
Modukuri teaches a MMU mapping between virtual and physical memory of a graphics processor, so memory addresses can either be virtual or physical). Dugast additionally teaches wherein the information is stored using one or more virtual memory addresses accessible by one or more NUMA nodes and one or more GPUs, and the API is to cause the information to be stored within a particular NUMA node of the one or more NUMA nodes (¶ [0039] states “For example, memory pool 302 can provide a local memory pool (to the processors that execute process A and B) with a highest data access rate (e.g., read and/or write rate), but have a smallest size (e.g., amount of data that can read or written) and largest cost relative to memory pools 304 and 306. Memory pool 302 can represent a first non-uniform memory access (NUMA) node”. ¶ [0040] states “A second NUMA node can represent a different NUMA domain than that of the first NUMA node”. Examiner’s Note: the environment can have a plurality of NUMA nodes. Processes can access NUMA nodes to read or write information). With regard to claim 16, Modukuri and Dugast teach the method of claim 14. Modukuri additionally teaches wherein the information is stored using one or more virtual memory addresses accessible by one or more NUMA nodes and one or more GPUs, and the one or more indicators indicate the one or more virtual memory addresses and a location to which the information is to be prefetched (¶ [0053] states “an application (e.g., running on first CPU 102, second CPU 104, first GPU 128, and/or second GPU 130) sends an IO function call (e.g., specifying a read or write operation) via an API (e.g., running on first CPU 102, second CPU 104, first GPU 128, and/or second GPU 130), and API performs actions to dynamically route data transfer operation requested by IO function call”. 
¶ [0075] states “API receives a read function call (e.g., cuFileRead) and performs dynamic data transfer routing based, at least in part on received read function call. In at least one embodiment, cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset); where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into, size is a size in bytes to read, and offset is an offset in a file to read from”. ¶ [0055] states “a file system, block system, object system, or key-value store system, either in an operating system or a standalone driver, performs a read-ahead operation to accomplish prefetching using an implementation similar to that of transfers initiated with cuFile-based API calls. In at least one embodiment, at least one processor moves data based, at least in part, on an explicit prefetch (e.g., a CUDA cudaMemPrefetch)”. ¶ [0172] states “When performing graphics operations, an effective address 1693 generated by a graphics processing engine is translated to a real address by MMU 1639” and ¶ [0179] states “one or more MMU(s) 1720A-1720B provide for virtual to physical address mapping for graphics processor 1710”. Examiner’s Note: the CPUs and GPUs of Modukuri can call the API and perform the API. The API function cuFileRead includes a file descriptor and a pointer or address in memory. Therefore, it is interpreted that the CPUs and GPUs have access to the memory. The “effective address” is interpreted to be similar to the “virtual address”. Modukuri teaches a MMU mapping between virtual and physical memory of a graphics processor, so memory addresses can either be virtual or physical). 
Dugast additionally teaches wherein the information is stored using one or more virtual memory addresses accessible by one or more NUMA nodes and one or more GPUs, and the one or more indicators indicate the one or more virtual memory addresses and a location to which the information is to be prefetched (¶ [0039] states “For example, memory pool 302 can provide a local memory pool (to the processors that execute process A and B) with a highest data access rate (e.g., read and/or write rate), but have a smallest size (e.g., amount of data that can read or written) and largest cost relative to memory pools 304 and 306. Memory pool 302 can represent a first non-uniform memory access (NUMA) node”. ¶ [0040] states “A second NUMA node can represent a different NUMA domain than that of the first NUMA node”. ¶ [0064] states “Some examples utilize artificial intelligence (AI)-based technologies to detect memory access patterns per clustered pages for a workload and use this information to determine whether to pre-fetch data to nearer memory to the processor that executes the workload or migrate data to further memory away from the processor that executes the workload". Examiner’s Note: the environment can have a plurality of NUMA nodes. Processes can access NUMA nodes to read or write information. Information can also be prefetched to a NUMA node). 
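The prefetch indicators discussed for claim 16, a virtual address range plus a destination to which the data should be staged, can be illustrated with a toy model. The names below are invented for illustration; this is not cudaMemPrefetch or Dugast's system:

```python
def mock_prefetch(memory, node_stores, addr, length, target_node):
    """Copy `length` bytes starting at virtual address `addr` out of a flat
    `memory` bytearray into the store for `target_node`. The (addr, length,
    target_node) triple stands in for the claim's indicators: a virtual
    address range plus the location to which data is to be prefetched."""
    node_stores.setdefault(target_node, {})[addr] = bytes(memory[addr:addr + length])
    return node_stores[target_node][addr]

memory = bytearray(b"abcdefgh")   # toy virtual address space
stores = {}                       # per-node staging areas
chunk = mock_prefetch(memory, stores, addr=2, length=3, target_node=1)
```

The point of the sketch is only that the caller, not the runtime, names both the range and the destination node, which is how the examiner reads the claimed indicators.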
With regard to claim 19, Modukuri and Dugast teach the method of claim 14. Modukuri additionally teaches wherein the information is stored using virtual memory accessible by a plurality of NUMA nodes and one or more GPUs, and the API is to cause the information to be read from a storage of a NUMA node of the plurality of NUMA nodes (¶ [0053] states “an application (e.g., running on first CPU 102, second CPU 104, first GPU 128, and/or second GPU 130) sends an IO function call (e.g., specifying a read or write operation) via an API (e.g., running on first CPU 102, second CPU 104, first GPU 128, and/or second GPU 130), and API performs actions to dynamically route data transfer operation requested by IO function call”. ¶ [0075] states “API receives a read function call (e.g., cuFileRead) and performs dynamic data transfer routing based, at least in part on received read function call. In at least one embodiment, cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset); where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into, size is a size in bytes to read, and offset is an offset in a file to read from”. ¶ [0172] states “When performing graphics operations, an effective address 1693 generated by a graphics processing engine is translated to a real address by MMU 1639” and ¶ [0179] states “one or more MMU(s) 1720A-1720B provide for virtual to physical address mapping for graphics processor 1710”. Examiner’s Note: the CPUs and GPUs of Modukuri can call the API and perform the API. The API function cuFileRead includes a file descriptor and a pointer or address in memory. Therefore, it is interpreted that the CPUs and GPUs have access to the memory. The “effective address” is interpreted to be similar to the “virtual address”. Modukuri teaches a MMU mapping between virtual and physical memory of a graphics processor, so memory addresses can either be virtual or physical). 
Dugast additionally teaches wherein the information is stored using virtual memory accessible by a plurality of NUMA nodes and one or more GPUs, and the API is to cause the information to be read from a storage of a NUMA node of the plurality of NUMA nodes (¶ [0039] states “For example, memory pool 302 can provide a local memory pool (to the processors that execute process A and B) with a highest data access rate (e.g., read and/or write rate), but have a smallest size (e.g., amount of data that can read or written) and largest cost relative to memory pools 304 and 306. Memory pool 302 can represent a first non-uniform memory access (NUMA) node”. ¶ [0040] states “A second NUMA node can represent a different NUMA domain than that of the first NUMA node”. Examiner’s Note: the environment can have a plurality of NUMA nodes. Processes can access NUMA nodes to read or write information). With regard to claim 20, Modukuri and Dugast teach the method of claim 14. Modukuri additionally teaches a non-transitory computer-readable medium having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to at least perform the method of claim 14 (¶ [0393] states “code (e.g., executable code or source code) is stored on a set of one or more non-transitory computer-readable storage media having stored thereon executable instructions (or other memory to store executable instructions) that, when executed (i.e., as a result of being executed) by one or more processors of a computer system, cause computer system to perform operations described herein”). Claim(s) 5-7, 11, and 17-18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Modukuri in view of Dugast and further in view of Wagle et al., Pub. No. US 20160371194 A1 (hereafter Wagle). With regard to claim 5, Modukuri and Dugast teach the processor of claim 1. 
Modukuri additionally teaches wherein the one or more indicators include one or more indicators that indicate a NUMA node to which the information is to be prefetched (¶ [0056] teaches “In at least one embodiment, a file system, block system, object system, or key-value store system, either in an operating system or a standalone driver, performs a read-ahead operation to accomplish prefetching using an implementation similar to that of transfers initiated with cuFile-based API calls. In at least one embodiment, at least one processor moves data based, at least in part, on an explicit prefetch (e.g., a CUDA cudaMemPrefetch)”). Dugast also teaches wherein the one or more indicators include one or more indicators that indicate a NUMA node to which the information is to be prefetched (¶ [0064] states “Some examples utilize artificial intelligence (AI)-based technologies to detect memory access patterns per clustered pages for a workload and use this information to determine whether to pre-fetch data to nearer memory to the processor that executes the workload or migrate data to further memory away from the processor that executes the workload". Examiner’s Notes: Information can also be prefetched to a NUMA node). Modukuri and Dugast do not explicitly teach an indicator to indicate a particular NUMA node. However, in an analogous art, Wagle teaches wherein the one or more indicators include one or more indicators that indicate a NUMA node to which the information is to be prefetched (¶ [0033] states “As shown in FIG. 4, worker thread 405 calls “ALLOC_ON_NUMA_NODE (<0>)” to allocate memory addresses of memory 1124 (e.g., DRAM DIMMs) for use by cores 0, 4, 8 and 12. With reference to FIG. 1, memory 1124 and cores 0, 4, 8 and 12 are all located on the same node (i.e., node 0)”. Examiner’s Note: the ALLOC_ON_NUMA_NODE includes a parameter to specify which NUMA node to allocate memory on). 
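Wagle's ALLOC_ON_NUMA_NODE(<n>)-style node parameter can be mimicked with a toy per-node bump allocator. This is a hedged sketch; the class name, the (node, offset) return convention, and the arena sizes are invented for illustration and are not Wagle's implementation:

```python
class MockNumaAllocator:
    """Toy per-node arena allocator. The `node` argument plays the role of
    the ALLOC_ON_NUMA_NODE parameter: offsets are bump-allocated within
    each node's own arena, so an allocation never crosses nodes."""
    def __init__(self, num_nodes, arena_size):
        self.arenas = {n: 0 for n in range(num_nodes)}  # next free offset per node
        self.arena_size = arena_size

    def alloc_on_node(self, node, size):
        off = self.arenas[node]
        if off + size > self.arena_size:
            raise MemoryError(f"node {node} arena exhausted")
        self.arenas[node] = off + size
        return (node, off)   # "address" = (node id, offset within that node)

a = MockNumaAllocator(num_nodes=2, arena_size=64)
p0 = a.alloc_on_node(0, 16)   # first allocation on node 0
p1 = a.alloc_on_node(0, 16)   # stays on node 0, at the next offset
p2 = a.alloc_on_node(1, 8)    # separate arena on node 1
```

Keeping each node's allocations in its own arena is the property the rationale below invokes: memory requested for a node is satisfied from that node, avoiding remote-access penalties.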
It would have been obvious to a person having ordinary skill in the art prior to the effective filing date to combine the NUMA node indicator of Wagle with the memory range accessible by CPUs or GPUs of Modukuri and Dugast. A person having ordinary skill in the art would have been motivated to make this combination for the purpose of specifying the NUMA node in which actions should take place. This has the advantage of overcoming memory fragmentation and the associated remote access penalties of accessing memory within another NUMA node (Wagle ¶ [0025] states that “with multiple per-CPU allocators working on the memory (DRAM) of a single NUMA node, memory fragmentation or remote access penalties are possible” and ¶ [0027] states “preferred: Try to allocate on a node first”. By setting the policy to preferred, memory will be allocated on the same NUMA node that the thread is executing in). With regard to claim 6, Modukuri and Dugast teach the processor of claim 1. Modukuri additionally teaches wherein the one or more indicators include: one or more indicators that indicate virtual memory accessible by one or more central processing units (CPUs) and one or more GPUs (¶ [0053] states “In at least one embodiment, an application (e.g., running on first CPU 102, second CPU 104, first GPU 128, and/or second GPU 130) sends an IO function call (e.g., specifying a read or write operation) via an API (e.g., running on first CPU 102, second CPU 104, first GPU 128, and/or second GPU 130), and API performs actions to dynamically route data transfer operation requested by IO function call”. ¶ [0075] states “In at least one embodiment, cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset); where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into, size is a size in bytes to read, and offset is an offset in a file to read from”. 
¶ [0172] states “When performing graphics operations, an effective address 1693 generated by a graphics processing engine is translated to a real address by MMU 1639” and ¶ [0179] states “one or more MMU(s) 1720A-1720B provide for virtual to physical address mapping for graphics processor 1710”. Examiner’s Note: the devPtr and size parameters together indicate a range of memory that will store information. Any of the CPUs or GPUs in the system can be the caller or the device that performs the API to access the memory address. The “effective address” is interpreted to be similar to the “virtual address”. Modukuri teaches a MMU mapping between virtual and physical memory of a graphics processor, so memory addresses can either be virtual or physical); Modukuri and Dugast do not explicitly teach an indicator to indicate a particular NUMA node. However, in an analogous art, Wagle teaches and one or more indicators that indicate a particular NUMA node of a plurality of NUMA nodes (¶ [0033] states “As shown in FIG. 4, worker thread 405 calls “ALLOC_ON_NUMA_NODE (<0>)” to allocate memory addresses of memory 1124 (e.g., DRAM DIMMs) for use by cores 0, 4, 8 and 12. With reference to FIG. 1, memory 1124 and cores 0, 4, 8 and 12 are all located on the same node (i.e., node 0)”. Examiner’s Note: the ALLOC_ON_NUMA_NODE includes a parameter to specify which NUMA node to allocate memory on). It would have been obvious to a person having ordinary skill in the art prior to the effective filing date to combine the NUMA node indicator of Wagle with the memory range accessible by CPUs or GPUs of Modukuri and Dugast. A person having ordinary skill in the art would have been motivated to make this combination for the purpose of specifying the NUMA node in which actions should take place. 
This has the advantage of overcoming memory fragmentation and the associated remote access penalties of accessing memory within another NUMA node (Wagle ¶ [0025] states that “with multiple per-CPU allocators working on the memory (DRAM) of a single NUMA node, memory fragmentation or remote access penalties are possible” and ¶ [0027] states “preferred: Try to allocate on a node first”. By setting the policy to preferred, memory will be allocated on the same NUMA node that the thread is executing in). With regard to claim 7, Modukuri and Dugast teach the processor of claim 1. Modukuri additionally teaches wherein the API is to cause the information to be read from a NUMA storage based, at least in part, on causing the information to be stored at a NUMA node indicated by one or more of the one or more indicators (¶ [0075] states “API receives a read function call (e.g., cuFileRead) and performs dynamic data transfer routing based, at least in part on received read function call” and “API reads data into GPU memory using dynamic data transfer routing”). Dugast additionally teaches wherein the API is to cause the information to be read from a NUMA storage based, at least in part, on causing the information to be stored at a NUMA node indicated by one or more of the one or more indicators (¶ [0039] states “memory pool 302 can provide a local memory pool (to the processors that execute process A and B) with a highest data access rate (e.g., read and/or write rate), but have a smallest size (e.g., amount of data that can read or written) and largest cost relative to memory pools 304 and 306. Memory pool 302 can represent a first non-uniform memory access (NUMA) node”. Examiner’s Note: information can be read from or written to a NUMA node). Modukuri and Dugast do not explicitly teach an indicator to indicate a particular NUMA node. 
However, in an analogous art, Wagle teaches wherein the API is to cause the information to be read from a NUMA storage based, at least in part, on causing the information to be stored at a NUMA node indicated by one or more of the one or more indicators (¶ [0033] states “As shown in FIG. 4, worker thread 405 calls “ALLOC_ON_NUMA_NODE (<0>)” to allocate memory addresses of memory 1124 (e.g., DRAM DIMMs) for use by cores 0, 4, 8 and 12. With reference to FIG. 1, memory 1124 and cores 0, 4, 8 and 12 are all located on the same node (i.e., node 0)”. Examiner’s Note: the ALLOC_ON_NUMA_NODE includes a parameter to specify which NUMA node to allocate memory on. The indicator could describe which NUMA node to perform an action on, such as which NUMA node to store information on). It would have been obvious to a person having ordinary skill in the art prior to the effective filing date to combine the NUMA node indicator of Wagle with the memory range accessible by CPUs or GPUs of Modukuri and Dugast. A person having ordinary skill in the art would have been motivated to make this combination for the purpose of specifying the NUMA node in which actions should take place. This has the advantage of overcoming memory fragmentation and the associated remote access penalties of accessing memory within another NUMA node (Wagle ¶ [0025] states that “with multiple per-CPU allocators working on the memory (DRAM) of a single NUMA node, memory fragmentation or remote access penalties are possible” and ¶ [0027] states “preferred: Try to allocate on a node first”. By setting the policy to preferred, memory will be allocated on the same NUMA node that the thread is executing in). With regard to claim 11, Modukuri and Dugast teach the system of claim 8. Modukuri and Dugast do not explicitly teach an indicator to indicate a particular NUMA node. 
However, in an analogous art, Wagle teaches wherein the one or more indicators include one or more indicators that indicate a particular NUMA node of a plurality of NUMA nodes (¶ [0033] states “As shown in FIG. 4, worker thread 405 calls “ALLOC_ON_NUMA_NODE (<0>)” to allocate memory addresses of memory 1124 (e.g., DRAM DIMMs) for use by cores 0, 4, 8 and 12. With reference to FIG. 1, memory 1124 and cores 0, 4, 8 and 12 are all located on the same node (i.e., node 0)”. Examiner’s Note: the ALLOC_ON_NUMA_NODE includes a parameter to specify which NUMA node to allocate memory on). It would have been obvious to a person having ordinary skill in the art prior to the effective filing date to combine the NUMA node indicator of Wagle with the memory range accessible by CPUs or GPUs of Modukuri and Dugast. A person having ordinary skill in the art would have been motivated to make this combination for the purpose of specifying the NUMA node in which actions should take place. This has the advantage of overcoming memory fragmentation and the associated remote access penalties of accessing memory within another NUMA node (Wagle ¶ [0025] states that “with multiple per-CPU allocators working on the memory (DRAM) of a single NUMA node, memory fragmentation or remote access penalties are possible” and ¶ [0027] states “preferred: Try to allocate on a node first”. By setting the policy to preferred, memory will be allocated on the same NUMA node that the thread is executing in). With regard to claim 17, Modukuri and Dugast teach the method of claim 14. 
Modukuri additionally teaches wherein the information is stored using one or more virtual memory addresses, and the API is to store the information within a NUMA storage of a NUMA node indicated by one or more of the one or more indicators (¶ [0075] states “In at least one embodiment, cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset); where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into, size is a size in bytes to read, and offset is an offset in a file to read from” and “API reads data into GPU memory using dynamic data transfer routing”. ¶ [0172] states “When performing graphics operations, an effective address 1693 generated by a graphics processing engine is translated to a real address by MMU 1639” and ¶ [0179] states “one or more MMU(s) 1720A-1720B provide for virtual to physical address mapping for graphics processor 1710”. Examiner’s Note: the devPtr and size parameters together indicate a range of memory that information will be stored in. The “effective address” is interpreted to be similar to the “virtual address”. Modukuri teaches a MMU mapping between virtual and physical memory of a graphics processor, so memory addresses can either be virtual or physical). Modukuri and Dugast do not explicitly teach a NUMA node indicated by one or more indicators. However, in an analogous art, Wagle teaches wherein the information is stored using one or more virtual memory addresses, and the API is to store the information within a NUMA storage of a NUMA node indicated by one or more of the one or more indicators (¶ [0033] states “As shown in FIG. 4, worker thread 405 calls “ALLOC_ON_NUMA_NODE (<0>)” to allocate memory addresses of memory 1124 (e.g., DRAM DIMMs) for use by cores 0, 4, 8 and 12. With reference to FIG. 1, memory 1124 and cores 0, 4, 8 and 12 are all located on the same node (i.e., node 0)”. 
Examiner’s Note: the ALLOC_ON_NUMA_NODE includes a parameter to specify which NUMA node to allocate memory on). It would have been obvious to a person having ordinary skill in the art prior to the effective filing date to combine the NUMA node indicator of Wagle with the memory range accessible by CPUs or GPUs of Modukuri and Dugast. A person having ordinary skill in the art would have been motivated to make this combination for the purpose of specifying the NUMA node in which actions should take place. This has the advantage of overcoming memory fragmentation and the associated remote access penalties of accessing memory within another NUMA node (Wagle ¶ [0025] states that “with multiple per-CPU allocators working on the memory (DRAM) of a single NUMA node, memory fragmentation or remote access penalties are possible” and ¶ [0027] states “preferred: Try to allocate on a node first”. By setting the policy to preferred, memory will be allocated on the same NUMA node that the thread is executing in). With regard to claim 18, Modukuri and Dugast teach the method of claim 14. Modukuri additionally teaches wherein the one or more indicators indicate a range of virtual memory and a NUMA node (¶ [0075] states “In at least one embodiment, cuFileRead is specified as: ssize_t cuFileRead (CUFileHandle fh, void *devPtr, size_t size, off_t offset); where fh is a file descriptor for a file, devPtr is a start address of a device pointer to read into, size is a size in bytes to read, and offset is an offset in a file to read from” and “API reads data into GPU memory using dynamic data transfer routing”. ¶ [0172] states “When performing graphics operations, an effective address 1693 generated by a graphics processing engine is translated to a real address by MMU 1639” and ¶ [0179] states “one or more MMU(s) 1720A-1720B provide for virtual to physical address mapping for graphics processor 1710”. 
Examiner’s Note: the devPtr and size parameters together indicate a range of memory that information will be stored in. The “effective address” is interpreted to be similar to the “virtual address”. Modukuri teaches a MMU mapping between virtual and physical memory of a graphics processor, so memory addresses can either be virtual or physical). Modukuri and Dugast do not explicitly teach a NUMA node indicated by one or more indicators. However, in an analogous art, Wagle teaches wherein the one or more indicators indicate a range of virtual memory and a NUMA node (¶ [0033] states “As shown in FIG. 4, worker thread 405 calls “ALLOC_ON_NUMA_NODE (<0>)” to allocate memory addresses of memory 1124 (e.g., DRAM DIMMs) for use by cores 0, 4, 8 and 12. With reference to FIG. 1, memory 1124 and cores 0, 4, 8 and 12 are all located on the same node (i.e., node 0)”. Examiner’s Note: the ALLOC_ON_NUMA_NODE includes a parameter to specify which NUMA node to allocate memory on). It would have been obvious to a person having ordinary skill in the art prior to the effective filing date to combine the NUMA node indicator of Wagle with the memory range accessible by CPUs or GPUs of Modukuri and Dugast. A person having ordinary skill in the art would have been motivated to make this combination for the purpose of specifying the NUMA node in which actions should take place. This has the advantage of overcoming memory fragmentation and the associated remote access penalties of accessing memory within another NUMA node (Wagle ¶ [0025] states that “with multiple per-CPU allocators working on the memory (DRAM) of a single NUMA node, memory fragmentation or remote access penalties are possible” and ¶ [0027] states “preferred: Try to allocate on a node first”. By setting the policy to preferred, memory will be allocated on the same NUMA node that the thread is executing in).

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
US 20200004684 A1 teaches Apparatus, Method, and System for Enhanced Data Prefetching Based on Non-Uniform Memory Access (NUMA) Characteristics. Any inquiry concerning this communication or earlier communications from the examiner should be directed to PETER L YUAN whose telephone number is (571)272-5737. The examiner can normally be reached Mon-Fri 7:30am-5pm. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bradley Teets, can be reached at 571-272-3338. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /PETER LI YUAN/Examiner, Art Unit 2197 /BRADLEY A TEETS/Supervisory Patent Examiner, Art Unit 2197
Prosecution Timeline

Jun 26, 2023
Application Filed
Feb 05, 2026
Non-Final Rejection — §103, §DP (current)


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: Favorable
Median Time to Grant: 3y 3m
PTA Risk: Low
Based on 0 resolved cases by this examiner. Grant probability derived from career allow rate.
