Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 05/29/2025 is being considered by the examiner.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim(s) 1-2, 6-9, 12-16 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Korzh et al. (US 20210382821 A1) in view of HOLM et al. (US 20150134933 A1), hereinafter Korzh and HOLM, respectively.
Regarding claim 1, Korzh teaches A system comprising: a memory device; a cache memory; and (See Fig 1, paragraph [0060], illustrates a memory system 14 having memory device 18, a cache memory 24)
a processing device, operatively coupled with the memory device and the cache memory, to perform operations comprising: (See Fig 1, paragraph [0063], illustrates processing system 12 is coupled with memory system 14 via communication bus 20 to perform operations)
receiving, from a host system, a plurality of memory access requests associated with a plurality of processing threads executed by a plurality of processing cores on the host system; (See Fig 1, paragraph [0064], illustrates memory system 14 may receive write and read memory access requests from processing system 12)
identifying the plurality of processing threads with which the plurality of memory access requests are associated; (See Fig 4, paragraph [0087], illustrates metadata 58 may include a process identifier and a thread identifier to indicate a processor thread running on computing system 10 (Fig. 1))
tracking respective numbers of the plurality of memory access requests that are associated with each of the plurality of processing threads in a given period of time; (See Fig 6, paragraph [0103], illustrates data access information 34A may have access count column 66 along with last access column 64 indicating the number of times data has been previously accessed over a time period)
Korzh teaches prefetching or preloading read data based on memory access pattern. However, Korzh does not explicitly teach selecting, based on the tracking, a subset of the plurality of processing threads; and
prefetching data associated with the subset of the plurality of processing threads from the memory device and storing the data in the cache memory
On the other hand, HOLM which also relates to prefetching or preloading read data based on memory access pattern teaches selecting, based on the tracking, a subset of the plurality of processing threads; and (See Fig 1, paragraph [0041], illustrates prefetch unit 19 monitors access requests received and generate prefetch transactions based on access patterns)
prefetching data associated with the subset of the plurality of processing threads from the memory device and storing the data in the cache memory. (See Fig 1, paragraph [0041], illustrates data is prefetched or prepopulated in cache line 20 based on access pattern monitored by prefetch unit 19)
Both Korzh and HOLM relate to prefetching or preloading read data based on memory access pattern (see Korzh, abstract, and see HOLM, abstract, regarding prefetching or preloading read data).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Korzh with HOLM by incorporating prefetching or preloading read data based on memory access pattern, as taught by HOLM, to enable prefetch unit 19 to monitor the access requests received and generate prefetch transactions based on access patterns, and to enable data to be prefetched or prepopulated in cache line 20 based on the access pattern monitored by prefetch unit 19. The combined system of Korzh - HOLM allows the prefetching of a given data value, which is predicted to be required by the instruction execution unit, to be initiated further in advance of its actually being required by the instruction execution unit, as mentioned in HOLM, paragraph [0010]. Therefore, the combination of Korzh - HOLM improves data processing performance. See HOLM, paragraph [0011].
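For illustration only, the claimed tracking-and-selection steps mapped above (tracking per-thread request counts in a period, then selecting a subset of threads for prefetching) can be sketched as follows. This sketch is not drawn from Korzh or HOLM; all identifiers are hypothetical.

```python
from collections import Counter

class ThreadAwarePrefetcher:
    """Tracks per-thread memory access request counts over a period and
    selects the most active threads as prefetch candidates."""

    def __init__(self, top_n=2):
        self.top_n = top_n
        self.access_counts = Counter()  # thread_id -> requests in period

    def record_request(self, thread_id):
        # Tracking respective numbers of requests per thread
        self.access_counts[thread_id] += 1

    def select_threads(self):
        # Selecting, based on the tracking, a subset of the threads
        return [tid for tid, _ in self.access_counts.most_common(self.top_n)]

    def new_period(self):
        # Reset the counts at the start of each tracking period
        self.access_counts.clear()

prefetcher = ThreadAwarePrefetcher(top_n=1)
for tid in [7, 7, 7, 3, 3, 5]:
    prefetcher.record_request(tid)
print(prefetcher.select_threads())  # [7]
```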
Regarding claim 2, Korzh in view of HOLM teaches the system of claim 1, as set forth above. Korzh further teaches The system of claim 1, wherein the plurality of memory access requests comprise requests to read training data from the memory device for at least one of a machine learning (ML) model or an artificial intelligence (AI) framework. (See Fig 1, paragraph [0070], illustrates machine learning block 32 may implement machine learning techniques to facilitate predicting a data access pattern for access requests based at least in part on data access information 34 indicative of a previous data access pattern)
The same motivation that was utilized for combining Korzh and HOLM as set forth in claim 1 is equally applicable to claim 2.
Regarding claim 6, Korzh in view of HOLM teaches prefetching or preloading read data based on memory access pattern in claim 1. However, Korzh does not explicitly teach The system of claim 1, wherein prefetching the data associated with the subset of the plurality of processing threads from the memory device and storing the data in the cache memory comprises:
subsequent to receiving a request for data at a first memory address in the memory device, retrieving data at a second memory address in the memory device prior to receiving a request for the data at the second memory address
and storing the data at the second memory address in the cache memory, wherein the second memory address is sequential to the first memory address
On the other hand, HOLM which also relates to prefetching or preloading read data based on memory access pattern teaches The system of claim 1, wherein prefetching the data associated with the subset of the plurality of processing threads from the memory device and storing the data in the cache memory comprises:
subsequent to receiving a request for data at a first memory address in the memory device, retrieving data at a second memory address in the memory device prior to receiving a request for the data at the second memory address (See Fig 3, paragraph [0047], illustrates prefetch unit 19 prefetches addresses for a given memory access request for prefetching or retrieving data from memory to cache unit)
and storing the data at the second memory address in the cache memory, wherein the second memory address is sequential to the first memory address. (See Fig 3, paragraph [0047], illustrates that prefetch addresses for a given entry in the prefetch table 21 will typically be well within the size of a memory page, meaning that the prefetch unit 19 can sequentially issue prefetch transactions for sequential physical addresses)
Both Korzh and HOLM relate to prefetching or preloading read data based on memory access pattern (see Korzh, abstract, and see HOLM, abstract, regarding prefetching or preloading read data).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Korzh with HOLM by incorporating prefetching or preloading read data based on memory access pattern, as taught by HOLM, to enable prefetch unit 19 to prefetch addresses for a given memory access request for prefetching or retrieving data from memory to the cache unit, where the prefetch addresses for a given entry in the prefetch table 21 will typically be well within the size of a memory page, meaning that the prefetch unit 19 can sequentially issue prefetch transactions for sequential physical addresses. The combined system of Korzh - HOLM allows the prefetching of a given data value, which is predicted to be required by the instruction execution unit, to be initiated further in advance of its actually being required by the instruction execution unit, as mentioned in HOLM, paragraph [0010]. Therefore, the combination of Korzh - HOLM improves data processing performance. See HOLM, paragraph [0011].
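For illustration only, the claim 6 limitation mapped above (retrieving data at a sequentially next address before it is requested, and storing it in the cache) can be sketched as follows. This sketch is not drawn from Korzh or HOLM; the cache line size and all other names are hypothetical.

```python
CACHE_LINE = 64  # hypothetical cache line size in bytes

def demand_read_with_prefetch(cache, memory, first_address):
    """On a demand request for first_address, also retrieve the data at
    the sequentially next address before any request for it arrives."""
    second_address = first_address + CACHE_LINE  # sequential to the first
    cache[first_address] = memory[first_address]        # demand fill
    if second_address in memory:
        cache[second_address] = memory[second_address]  # prefetch fill
    return cache[first_address]

memory = {0: "data at 0x00", 64: "data at 0x40"}
cache = {}
demand_read_with_prefetch(cache, memory, 0)
print(64 in cache)  # True: address 64 was cached before being requested
```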
Regarding claim 7, Korzh in view of HOLM teaches prefetching or preloading read data based on memory access pattern in claim 6. However, Korzh does not explicitly teach The system of claim 6 wherein the processing device is to perform operations further comprising: receiving, from the host system, a memory access request for the data at the second memory address; and
On the other hand, HOLM which also relates to prefetching or preloading read data based on memory access pattern teaches The system of claim 6 wherein the processing device is to perform operations further comprising: receiving, from the host system, a memory access request for the data at the second memory address; and (See Fig 1, paragraph [0041] and [0042], illustrates prefetch unit 19 receives memory access requests issued by the processor cores 11, 12 and populates cache line 20 before data is transferred)
providing the data at the second memory address to the host system from the cache memory. (See Fig 1, paragraph [0041], illustrates data is transferred from cache 20 once the data is prepopulated and retrieved from memory 18)
The same motivation that was utilized for combining Korzh and HOLM as set forth in claim 6 is equally applicable to claim 7.
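For illustration only, the claim 7 limitation mapped above (providing previously prefetched data to the host from the cache memory rather than the memory device) can be sketched as follows. This sketch is not drawn from Korzh or HOLM; all names are hypothetical.

```python
def read(cache, memory, address):
    """Returns the data and its source: the cache if the address was
    previously prefetched, otherwise the memory device (with a fill)."""
    if address in cache:
        return cache[address], "cache"   # prefetch hit: no memory access
    cache[address] = memory[address]     # miss: fetch from memory and fill
    return cache[address], "memory"

memory = {0: "A", 64: "B"}
cache = {64: "B"}                 # address 64 was prefetched earlier
print(read(cache, memory, 64))    # ('B', 'cache')
print(read(cache, memory, 0))     # ('A', 'memory')
```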
Regarding claim 8, Korzh teaches A method comprising: (See claim 10)
receiving, from a host system, a plurality of memory access requests associated with a plurality of processing threads executed by a plurality of processing cores on the host system; (See Fig 1, paragraph [0064], illustrates memory system 14 may receive write and read memory access requests from processing system 12)
identifying the plurality of processing threads with which the plurality of memory access requests are associated; (See Fig 4, paragraph [0087], illustrates metadata 58 may include a process identifier and a thread identifier to indicate a processor thread running on computing system 10 (Fig. 1))
tracking respective numbers of the plurality of memory access requests that are associated with each of the plurality of processing threads in a given period of time; (See Fig 6, paragraph [0103], illustrates data access information 34A may have access count column 66 along with last access column 64 indicating the number of times data has been previously accessed over a time period)
Korzh teaches prefetching or preloading read data based on memory access pattern. However, Korzh does not explicitly teach selecting, based on the tracking, a subset of the plurality of processing threads; and
prefetching data associated with the subset of the plurality of processing threads from a memory device and storing the data in a cache memory
On the other hand, HOLM which also relates to prefetching or preloading read data based on memory access pattern teaches selecting, based on the tracking, a subset of the plurality of processing threads; and (See Fig 1, paragraph [0041], illustrates prefetch unit 19 monitors access requests received and generate prefetch transactions based on access patterns)
prefetching data associated with the subset of the plurality of processing threads from a memory device and storing the data in a cache memory. (See Fig 1, paragraph [0041], illustrates data is prefetched or prepopulated in cache line 20 based on access pattern monitored by prefetch unit 19)
Both Korzh and HOLM relate to prefetching or preloading read data based on memory access pattern (see Korzh, abstract, and see HOLM, abstract, regarding prefetching or preloading read data).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Korzh with HOLM by incorporating prefetching or preloading read data based on memory access pattern, as taught by HOLM, to enable prefetch unit 19 to monitor the access requests received and generate prefetch transactions based on access patterns, and to enable data to be prefetched or prepopulated in cache line 20 based on the access pattern monitored by prefetch unit 19. The combined system of Korzh - HOLM allows the prefetching of a given data value, which is predicted to be required by the instruction execution unit, to be initiated further in advance of its actually being required by the instruction execution unit, as mentioned in HOLM, paragraph [0010]. Therefore, the combination of Korzh - HOLM improves data processing performance. See HOLM, paragraph [0011].
Regarding claim 9, Korzh in view of HOLM teaches the method of claim 8, as set forth above. Korzh further teaches The method of claim 8, wherein the plurality of memory access requests comprise requests to read training data from the memory device for at least one of a machine learning (ML) model or an artificial intelligence (AI) framework. (See Fig 1, paragraph [0070], illustrates machine learning block 32 may implement machine learning techniques to facilitate predicting a data access pattern for access requests based at least in part on data access information 34 indicative of a previous data access pattern)
The same motivation that was utilized for combining Korzh and HOLM as set forth in claim 8 is equally applicable to claim 9.
Regarding claim 13, Korzh in view of HOLM teaches prefetching or preloading read data based on memory access pattern in claim 8. However, Korzh does not explicitly teach The method of claim 8, wherein prefetching the data associated with the subset of the plurality of processing threads from the memory device and storing the data in the cache memory comprises:
subsequent to receiving a request for data at a first memory address in the memory device, retrieving data at a second memory address in the memory device prior to receiving a request for the data at the second memory address
and storing the data at the second memory address in the cache memory, wherein the second memory address is sequential to the first memory address
On the other hand, HOLM which also relates to prefetching or preloading read data based on memory access pattern teaches The method of claim 8, wherein prefetching the data associated with the subset of the plurality of processing threads from the memory device and storing the data in the cache memory comprises:
subsequent to receiving a request for data at a first memory address in the memory device, retrieving data at a second memory address in the memory device prior to receiving a request for the data at the second memory address (See Fig 3, paragraph [0047], illustrates prefetch unit 19 prefetches addresses for a given memory access request for prefetching or retrieving data from memory to cache unit)
and storing the data at the second memory address in the cache memory, wherein the second memory address is sequential to the first memory address. (See Fig 3, paragraph [0047], illustrates that prefetch addresses for a given entry in the prefetch table 21 will typically be well within the size of a memory page, meaning that the prefetch unit 19 can sequentially issue prefetch transactions for sequential physical addresses)
Both Korzh and HOLM relate to prefetching or preloading read data based on memory access pattern (see Korzh, abstract, and see HOLM, abstract, regarding prefetching or preloading read data).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Korzh with HOLM by incorporating prefetching or preloading read data based on memory access pattern, as taught by HOLM, to enable prefetch unit 19 to prefetch addresses for a given memory access request for prefetching or retrieving data from memory to the cache unit, where the prefetch addresses for a given entry in the prefetch table 21 will typically be well within the size of a memory page, meaning that the prefetch unit 19 can sequentially issue prefetch transactions for sequential physical addresses. The combined system of Korzh - HOLM allows the prefetching of a given data value, which is predicted to be required by the instruction execution unit, to be initiated further in advance of its actually being required by the instruction execution unit, as mentioned in HOLM, paragraph [0010]. Therefore, the combination of Korzh - HOLM improves data processing performance. See HOLM, paragraph [0011].
Regarding claim 14, Korzh in view of HOLM teaches prefetching or preloading read data based on memory access pattern in claim 13. However, Korzh does not explicitly teach The method of claim 13, further comprising: receiving, from the host system, a memory access request for the data at the second memory address; and
providing the data at the second memory address to the host system from the cache memory
On the other hand, HOLM which also relates to prefetching or preloading read data based on memory access pattern teaches The method of claim 13, further comprising: receiving, from the host system, a memory access request for the data at the second memory address; and (See Fig 1, paragraph [0041] and [0042], illustrates prefetch unit 19 receives memory access requests issued by the processor cores 11, 12 and populates cache line 20 before data is transferred)
providing the data at the second memory address to the host system from the cache memory. (See Fig 1, paragraph [0041], illustrates data is transferred from cache 20 once the data is prepopulated and retrieved from memory 18)
The same motivation that was utilized for combining Korzh and HOLM as set forth in claim 13 is equally applicable to claim 14.
Regarding claim 15, Korzh teaches A non-transitory computer-readable storage medium comprising instructions that, when executed by a processing device, cause the processing device to perform operations comprising: (See Fig 1, paragraph [0062], illustrates processing sub-system performs various operations like executing instructions to perform a corresponding data processing operation on input data)
receiving, from a host system, a plurality of memory access requests associated with a plurality of processing threads executed by a plurality of processing cores on the host system; (See Fig 1, paragraph [0064], illustrates memory system 14 may receive write and read memory access requests from processing system 12)
identifying the plurality of processing threads with which the plurality of memory access requests are associated; (See Fig 4, paragraph [0087], illustrates metadata 58 may include a process identifier and a thread identifier to indicate a processor thread running on computing system 10 (Fig. 1))
tracking respective numbers of the plurality of memory access requests that are associated with each of the plurality of processing threads in a given period of time; (See Fig 6, paragraph [0103], illustrates data access information 34A may have access count column 66 along with last access column 64 indicating the number of times data has been previously accessed over a time period)
Korzh teaches prefetching or preloading read data based on memory access pattern. However, Korzh does not explicitly teach selecting, based on the tracking, a subset of the plurality of processing threads; and
prefetching data associated with the subset of the plurality of processing threads from a memory device and storing the data in a cache memory
On the other hand, HOLM which also relates to prefetching or preloading read data based on memory access pattern teaches selecting, based on the tracking, a subset of the plurality of processing threads; and (See Fig 1, paragraph [0041], illustrates prefetch unit 19 monitors access requests received and generate prefetch transactions based on access patterns)
prefetching data associated with the subset of the plurality of processing threads from a memory device and storing the data in a cache memory. (See Fig 1, paragraph [0041], illustrates data is prefetched or prepopulated in cache line 20 based on access pattern monitored by prefetch unit 19)
Both Korzh and HOLM relate to prefetching or preloading read data based on memory access pattern (see Korzh, abstract, and see HOLM, abstract, regarding prefetching or preloading read data).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Korzh with HOLM by incorporating prefetching or preloading read data based on memory access pattern, as taught by HOLM, to enable prefetch unit 19 to monitor the access requests received and generate prefetch transactions based on access patterns, and to enable data to be prefetched or prepopulated in cache line 20 based on the access pattern monitored by prefetch unit 19. The combined system of Korzh - HOLM allows the prefetching of a given data value, which is predicted to be required by the instruction execution unit, to be initiated further in advance of its actually being required by the instruction execution unit, as mentioned in HOLM, paragraph [0010]. Therefore, the combination of Korzh - HOLM improves data processing performance. See HOLM, paragraph [0011].
Regarding claim 16, Korzh in view of HOLM teaches the non-transitory computer-readable storage medium of claim 15, as set forth above. Korzh further teaches The non-transitory computer-readable storage medium of claim 15, wherein the plurality of memory access requests comprise requests to read training data from the memory device for at least one of a machine learning (ML) model or an artificial intelligence (AI) framework. (See Fig 1, paragraph [0070], illustrates machine learning block 32 may implement machine learning techniques to facilitate predicting a data access pattern for access requests based at least in part on data access information 34 indicative of a previous data access pattern)
The same motivation that was utilized for combining Korzh and HOLM as set forth in claim 15 is equally applicable to claim 16.
Regarding claim 20, Korzh in view of HOLM teaches prefetching or preloading read data based on memory access pattern in claim 15. However, Korzh does not explicitly teach The non-transitory computer-readable storage medium of claim 15, wherein prefetching the data associated with the subset of the plurality of processing threads from the memory device and storing the data in the cache memory comprises:
subsequent to receiving a request for data at a first memory address in the memory device, retrieving data at a second memory address in the memory device prior to receiving a request for the data at the second memory address
and storing the data at the second memory address in the cache memory, wherein the second memory address is sequential to the first memory address
On the other hand, HOLM which also relates to prefetching or preloading read data based on memory access pattern teaches The non-transitory computer-readable storage medium of claim 15, wherein prefetching the data associated with the subset of the plurality of processing threads from the memory device and storing the data in the cache memory comprises:
subsequent to receiving a request for data at a first memory address in the memory device, retrieving data at a second memory address in the memory device prior to receiving a request for the data at the second memory address (See Fig 3, paragraph [0047], illustrates prefetch unit 19 prefetches addresses for a given memory access request for prefetching or retrieving data from memory to cache unit)
and storing the data at the second memory address in the cache memory, wherein the second memory address is sequential to the first memory address. (See Fig 3, paragraph [0047], illustrates that prefetch addresses for a given entry in the prefetch table 21 will typically be well within the size of a memory page, meaning that the prefetch unit 19 can sequentially issue prefetch transactions for sequential physical addresses)
Both Korzh and HOLM relate to prefetching or preloading read data based on memory access pattern (see Korzh, abstract, and see HOLM, abstract, regarding prefetching or preloading read data).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Korzh with HOLM by incorporating prefetching or preloading read data based on memory access pattern, as taught by HOLM, to enable prefetch unit 19 to prefetch addresses for a given memory access request for prefetching or retrieving data from memory to the cache unit, where the prefetch addresses for a given entry in the prefetch table 21 will typically be well within the size of a memory page, meaning that the prefetch unit 19 can sequentially issue prefetch transactions for sequential physical addresses. The combined system of Korzh - HOLM allows the prefetching of a given data value, which is predicted to be required by the instruction execution unit, to be initiated further in advance of its actually being required by the instruction execution unit, as mentioned in HOLM, paragraph [0010]. Therefore, the combination of Korzh - HOLM improves data processing performance. See HOLM, paragraph [0011].
Claim(s) 3, 10 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Korzh in view of HOLM and further in view of Vemulapalli et al. (US 20210255957 A1) hereinafter Vemulapalli.
Regarding claim 3, Korzh in view of HOLM teaches prefetching or preloading read data based on memory access pattern in claim 1. However, Korzh - HOLM combination does not explicitly teach The system of claim 1, wherein each of the plurality of processing threads is executed by a respective one of the plurality of processing cores and comprises a plurality of sequential memory access requests
On the other hand, Vemulapalli which also relates to prefetching or preloading read data based on memory access pattern teaches The system of claim 1, wherein each of the plurality of processing threads is executed by a respective one of the plurality of processing cores and comprises a plurality of sequential memory access requests. (See Fig 2A-C, paragraph [0077] and [0079], illustrates pipeline manager 232 receives instructions from the scheduler 210 and manages execution of the instructions or group of threads across the set of parallel processing engines in consecutive clock cycles or sequentially)
Korzh, HOLM, and Vemulapalli all relate to prefetching or preloading read data based on memory access pattern (see Korzh, abstract; HOLM, abstract; and Vemulapalli, abstract, regarding prefetching or preloading read data).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Korzh - HOLM combination with Vemulapalli by incorporating prefetching or preloading read data based on memory access pattern, as taught by Vemulapalli, to enable pipeline manager 232 to receive instructions from the scheduler 210 and manage execution of the instructions or groups of threads across the set of parallel processing engines in consecutive clock cycles or sequentially. The combined system of Korzh - HOLM - Vemulapalli allows a system or process to provide improvements in data prefetching for graphics data processing, as mentioned in Vemulapalli, paragraph [0045]. Therefore, the combination of Korzh - HOLM - Vemulapalli improves training speed, particularly for deep neural networks. See Vemulapalli, paragraph [0183].
Regarding claim 10, Korzh in view of HOLM teaches prefetching or preloading read data based on memory access pattern in claim 8. However, Korzh - HOLM combination does not explicitly teach The method of claim 8, wherein each of the plurality of processing threads is executed by a respective one of the plurality of processing cores and comprises a plurality of sequential memory access requests
On the other hand, Vemulapalli which also relates to prefetching or preloading read data based on memory access pattern teaches The method of claim 8, wherein each of the plurality of processing threads is executed by a respective one of the plurality of processing cores and comprises a plurality of sequential memory access requests. (See Fig 2A-C, paragraph [0077] and [0079], illustrates pipeline manager 232 receives instructions from the scheduler 210 and manages execution of the instructions or group of threads across the set of parallel processing engines in consecutive clock cycles or sequentially)
Korzh, HOLM, and Vemulapalli all relate to prefetching or preloading read data based on memory access pattern (see Korzh, abstract; HOLM, abstract; and Vemulapalli, abstract, regarding prefetching or preloading read data).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Korzh - HOLM combination with Vemulapalli by incorporating prefetching or preloading read data based on memory access pattern, as taught by Vemulapalli, to enable pipeline manager 232 to receive instructions from the scheduler 210 and manage execution of the instructions or groups of threads across the set of parallel processing engines in consecutive clock cycles or sequentially. The combined system of Korzh - HOLM - Vemulapalli allows a system or process to provide improvements in data prefetching for graphics data processing, as mentioned in Vemulapalli, paragraph [0045]. Therefore, the combination of Korzh - HOLM - Vemulapalli improves training speed, particularly for deep neural networks. See Vemulapalli, paragraph [0183].
Regarding claim 17, Korzh in view of HOLM teaches prefetching or preloading read data based on a memory access pattern as applied to claim 15. However, the Korzh-HOLM combination does not explicitly teach the non-transitory computer-readable storage medium of claim 15, wherein each of the plurality of processing threads is executed by a respective one of the plurality of processing cores and comprises a plurality of sequential memory access requests.
On the other hand, Vemulapalli, which also relates to prefetching or preloading read data based on a memory access pattern, teaches the non-transitory computer-readable storage medium of claim 15, wherein each of the plurality of processing threads is executed by a respective one of the plurality of processing cores and comprises a plurality of sequential memory access requests. (See Figs. 2A-2C, paragraphs [0077] and [0079], illustrating that pipeline manager 232 receives instructions from the scheduler 210 and manages execution of the instructions or groups of threads across the set of parallel processing engines in consecutive clock cycles or sequentially.)
Korzh, HOLM, and Vemulapalli all relate to prefetching or preloading read data based on a memory access pattern (see Korzh, abstract; HOLM, abstract; and Vemulapalli, abstract, regarding prefetching or preloading read data).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Korzh-HOLM combination with Vemulapalli by incorporating prefetching or preloading read data based on a memory access pattern, as taught by Vemulapalli, to enable pipeline manager 232 to receive instructions from the scheduler 210 and manage execution of the instructions or groups of threads across the set of parallel processing engines in consecutive clock cycles or sequentially. The combined system of Korzh-HOLM-Vemulapalli allows a system or process to provide improvements in data prefetching for graphics data processing, as mentioned in Vemulapalli, paragraph [0045]. Therefore, the combination of Korzh-HOLM-Vemulapalli improves training speed, particularly for deep neural networks. See Vemulapalli, paragraph [0183].
Claim(s) 4, 11 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Korzh in view of HOLM and further in view of Asaad et al. (US 20110219208 A1) hereinafter Asaad.
Regarding claim 4, Korzh in view of HOLM teaches prefetching or preloading read data based on a memory access pattern as applied to claim 1. However, the Korzh-HOLM combination does not explicitly teach the system of claim 1, wherein tracking the respective numbers of the plurality of memory requests that are associated with each of the plurality of processing threads in a given period of time comprises: determining respective submission queue identifiers (SQIDs) for the plurality of memory requests; and incrementing respective counters associated with the respective SQIDs, wherein the counters are periodically decremented based on the given period of time.
On the other hand, Asaad, which also relates to prefetching or preloading read data based on a memory access pattern, teaches the system of claim 1, wherein tracking the respective numbers of the plurality of memory requests that are associated with each of the plurality of processing threads in a given period of time comprises: determining respective submission queue identifiers (SQIDs) for the plurality of memory requests (see Fig. 2, paragraph [0128], illustrating a load queue storing load requests, which is similar to a submission queue from the processing cores); and incrementing respective counters associated with the respective SQIDs, wherein the counters are periodically decremented based on the given period of time (see Fig. 2, paragraph [0237], illustrating per-master, per-slave counters of pending requests: when master "m" sends a request to slave "s", counter[m][s] is incremented by that slave, and when a request for that master gets scheduled, the counter is decremented).
Korzh, HOLM, and Asaad all relate to prefetching or preloading read data based on a memory access pattern (see Korzh, abstract; HOLM, abstract; and Asaad, abstract, regarding prefetching or preloading read data).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Korzh-HOLM combination with Asaad by incorporating prefetching or preloading read data based on a memory access pattern, as taught by Asaad, to enable a load queue storing load requests, similar to a submission queue from the processing cores, together with per-master, per-slave counters of pending requests: when master "m" sends a request to slave "s", counter[m][s] is incremented by that slave, and when a request for that master gets scheduled, the counter is decremented. The combined system of Korzh-HOLM-Asaad allows a system, method, and computer program product to improve the performance of a parallel computing system, e.g., by prefetching data or instructions according to a list including a sequence of prior cache miss addresses, as mentioned in Asaad, paragraph [0055]. Therefore, the combination of Korzh-HOLM-Asaad improves the performance of a parallel computing system. See Asaad, paragraph [0156].
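For illustration only (a hypothetical sketch, not taken from the claims or any cited reference), the counter scheme at issue in this group of claims — a per-SQID counter incremented on each memory request and periodically decremented so that stale activity ages out — could look like this:

```python
from collections import defaultdict


class SqidRequestTracker:
    """Tracks, per submission queue identifier (SQID), how many memory
    requests were issued within a decaying observation window."""

    def __init__(self, decay: int = 1):
        self.counters = defaultdict(int)
        self.decay = decay  # amount subtracted from each counter per period

    def on_request(self, sqid: int) -> None:
        # Increment the counter associated with the request's SQID.
        self.counters[sqid] += 1

    def on_period_tick(self) -> None:
        # Periodically decrement every counter (floored at zero), so
        # counts reflect only recent activity in the given period.
        for sqid in list(self.counters):
            self.counters[sqid] = max(0, self.counters[sqid] - self.decay)

    def count(self, sqid: int) -> int:
        return self.counters[sqid]
```

Under this sketch, a thread's "hotness" is simply its SQID counter value; the periodic decrement implements the "given period of time" bound without storing per-request timestamps.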
Regarding claim 11, Korzh in view of HOLM teaches prefetching or preloading read data based on a memory access pattern as applied to claim 8. However, the Korzh-HOLM combination does not explicitly teach the method of claim 8, wherein tracking the respective numbers of the plurality of memory requests that are associated with each of the plurality of processing threads in a given period of time comprises: determining respective submission queue identifiers (SQIDs) for the plurality of memory requests; and incrementing respective counters associated with the respective SQIDs, wherein the counters are periodically decremented based on the given period of time.
On the other hand, Asaad, which also relates to prefetching or preloading read data based on a memory access pattern, teaches the method of claim 8, wherein tracking the respective numbers of the plurality of memory requests that are associated with each of the plurality of processing threads in a given period of time comprises: determining respective submission queue identifiers (SQIDs) for the plurality of memory requests (see Fig. 2, paragraph [0128], illustrating a load queue storing load requests, which is similar to a submission queue from the processing cores); and incrementing respective counters associated with the respective SQIDs, wherein the counters are periodically decremented based on the given period of time (see Fig. 2, paragraph [0237], illustrating per-master, per-slave counters of pending requests: when master "m" sends a request to slave "s", counter[m][s] is incremented by that slave, and when a request for that master gets scheduled, the counter is decremented).
Korzh, HOLM, and Asaad all relate to prefetching or preloading read data based on a memory access pattern (see Korzh, abstract; HOLM, abstract; and Asaad, abstract, regarding prefetching or preloading read data).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Korzh-HOLM combination with Asaad by incorporating prefetching or preloading read data based on a memory access pattern, as taught by Asaad, to enable a load queue storing load requests, similar to a submission queue from the processing cores, together with per-master, per-slave counters of pending requests: when master "m" sends a request to slave "s", counter[m][s] is incremented by that slave, and when a request for that master gets scheduled, the counter is decremented. The combined system of Korzh-HOLM-Asaad allows a system, method, and computer program product to improve the performance of a parallel computing system, e.g., by prefetching data or instructions according to a list including a sequence of prior cache miss addresses, as mentioned in Asaad, paragraph [0055]. Therefore, the combination of Korzh-HOLM-Asaad improves the performance of a parallel computing system. See Asaad, paragraph [0156].
Regarding claim 18, Korzh in view of HOLM teaches prefetching or preloading read data based on a memory access pattern as applied to claim 15. However, the Korzh-HOLM combination does not explicitly teach the non-transitory computer-readable storage medium of claim 15, wherein tracking the respective numbers of the plurality of memory requests that are associated with each of the plurality of processing threads in a given period of time comprises: determining respective submission queue identifiers (SQIDs) for the plurality of memory requests; and incrementing respective counters associated with the respective SQIDs, wherein the counters are periodically decremented based on the given period of time.
On the other hand, Asaad, which also relates to prefetching or preloading read data based on a memory access pattern, teaches the non-transitory computer-readable storage medium of claim 15, wherein tracking the respective numbers of the plurality of memory requests that are associated with each of the plurality of processing threads in a given period of time comprises: determining respective submission queue identifiers (SQIDs) for the plurality of memory requests (see Fig. 2, paragraph [0128], illustrating a load queue storing load requests, which is similar to a submission queue from the processing cores); and incrementing respective counters associated with the respective SQIDs, wherein the counters are periodically decremented based on the given period of time (see Fig. 2, paragraph [0237], illustrating per-master, per-slave counters of pending requests: when master "m" sends a request to slave "s", counter[m][s] is incremented by that slave, and when a request for that master gets scheduled, the counter is decremented).
Korzh, HOLM, and Asaad all relate to prefetching or preloading read data based on a memory access pattern (see Korzh, abstract; HOLM, abstract; and Asaad, abstract, regarding prefetching or preloading read data).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Korzh-HOLM combination with Asaad by incorporating prefetching or preloading read data based on a memory access pattern, as taught by Asaad, to enable a load queue storing load requests, similar to a submission queue from the processing cores, together with per-master, per-slave counters of pending requests: when master "m" sends a request to slave "s", counter[m][s] is incremented by that slave, and when a request for that master gets scheduled, the counter is decremented. The combined system of Korzh-HOLM-Asaad allows a system, method, and computer program product to improve the performance of a parallel computing system, e.g., by prefetching data or instructions according to a list including a sequence of prior cache miss addresses, as mentioned in Asaad, paragraph [0055]. Therefore, the combination of Korzh-HOLM-Asaad improves the performance of a parallel computing system. See Asaad, paragraph [0156].
Claim(s) 5, 12 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Korzh in view of HOLM and further in view of Nubile et al. (US 20220083241 A1) hereinafter Nubile.
Regarding claim 5, Korzh in view of HOLM teaches prefetching or preloading read data based on a memory access pattern as applied to claim 1. However, the Korzh-HOLM combination does not explicitly teach the system of claim 1, wherein selecting the subset of the plurality of processing threads comprises selecting a number of processing threads that have issued the highest number of memory access requests in the given period of time.
On the other hand, Nubile, which also relates to prefetching or preloading read data based on a memory access pattern, teaches the system of claim 1, wherein selecting the subset of the plurality of processing threads comprises selecting a number of processing threads that have issued the highest number of memory access requests in the given period of time. (See Figs. 4A-4B, paragraph [0053], illustrating that data structure 450 can identify a leading thread, where the leading thread can be a single processing thread or a combination of processing threads having the highest priority, and ring counter 400 corresponds to the number of entries.)
Korzh, HOLM, and Nubile all relate to prefetching or preloading read data based on a memory access pattern (see Korzh, abstract; HOLM, abstract; and Nubile, abstract, regarding prefetching or preloading read data).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Korzh-HOLM combination with Nubile by incorporating prefetching or preloading read data based on a memory access pattern, as taught by Nubile, to enable data structure 450 to identify a leading thread, where the leading thread can be a single processing thread or a combination of processing threads having the highest priority, and ring counter 400 corresponds to the number of entries. The combined system of Korzh-HOLM-Nubile allows support for independent parallel plane access in a memory device with significantly reduced hardware resources in the memory sub-system, as mentioned in Nubile, paragraph [0020]. Therefore, the combination of Korzh-HOLM-Nubile improves quality of service. See Nubile, paragraph [0020].
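For illustration only (a hypothetical sketch with illustrative names, not from the claims or any cited reference), the selection step discussed in this group of claims — picking the subset of threads that issued the most memory access requests in the tracked period — reduces to a top-N selection over per-thread request counts:

```python
import heapq


def select_hot_threads(request_counts: dict[str, int], n: int) -> list[str]:
    """Return the n thread IDs that issued the highest number of memory
    access requests in the tracked period (the 'leading' threads),
    ordered from most to least active."""
    return heapq.nlargest(n, request_counts, key=request_counts.get)
```

For example, given counts {"t0": 3, "t1": 9, "t2": 5}, selecting the top two threads yields ["t1", "t2"]; a prefetcher could then restrict pattern tracking to that subset.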
Regarding claim 12, Korzh in view of HOLM teaches prefetching or preloading read data based on a memory access pattern as applied to claim 8. However, the Korzh-HOLM combination does not explicitly teach the method of claim 8, wherein selecting the subset of the plurality of processing threads comprises selecting a number of processing threads that have issued the highest number of memory access requests in the given period of time.
On the other hand, Nubile, which also relates to prefetching or preloading read data based on a memory access pattern, teaches the method of claim 8, wherein selecting the subset of the plurality of processing threads comprises selecting a number of processing threads that have issued the highest number of memory access requests in the given period of time. (See Figs. 4A-4B, paragraph [0053], illustrating that data structure 450 can identify a leading thread, where the leading thread can be a single processing thread or a combination of processing threads having the highest priority, and ring counter 400 corresponds to the number of entries.)
Korzh, HOLM, and Nubile all relate to prefetching or preloading read data based on a memory access pattern (see Korzh, abstract; HOLM, abstract; and Nubile, abstract, regarding prefetching or preloading read data).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Korzh-HOLM combination with Nubile by incorporating prefetching or preloading read data based on a memory access pattern, as taught by Nubile, to enable data structure 450 to identify a leading thread, where the leading thread can be a single processing thread or a combination of processing threads having the highest priority, and ring counter 400 corresponds to the number of entries. The combined system of Korzh-HOLM-Nubile allows support for independent parallel plane access in a memory device with significantly reduced hardware resources in the memory sub-system, as mentioned in Nubile, paragraph [0020]. Therefore, the combination of Korzh-HOLM-Nubile improves quality of service. See Nubile, paragraph [0020].
Regarding claim 19, Korzh in view of HOLM teaches prefetching or preloading read data based on a memory access pattern as applied to claim 15. However, the Korzh-HOLM combination does not explicitly teach the non-transitory computer-readable storage medium of claim 15, wherein selecting the subset of the plurality of processing threads comprises selecting a number of processing threads that have issued the highest number of memory access requests in the given period of time.
On the other hand, Nubile, which also relates to prefetching or preloading read data based on a memory access pattern, teaches the non-transitory computer-readable storage medium of claim 15, wherein selecting the subset of the plurality of processing threads comprises selecting a number of processing threads that have issued the highest number of memory access requests in the given period of time. (See Figs. 4A-4B, paragraph [0053], illustrating that data structure 450 can identify a leading thread, where the leading thread can be a single processing thread or a combination of processing threads having the highest priority, and ring counter 400 corresponds to the number of entries.)
Korzh, HOLM, and Nubile all relate to prefetching or preloading read data based on a memory access pattern (see Korzh, abstract; HOLM, abstract; and Nubile, abstract, regarding prefetching or preloading read data).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Korzh-HOLM combination with Nubile by incorporating prefetching or preloading read data based on a memory access pattern, as taught by Nubile, to enable data structure 450 to identify a leading thread, where the leading thread can be a single processing thread or a combination of processing threads having the highest priority, and ring counter 400 corresponds to the number of entries. The combined system of Korzh-HOLM-Nubile allows support for independent parallel plane access in a memory device with significantly reduced hardware resources in the memory sub-system, as mentioned in Nubile, paragraph [0020]. Therefore, the combination of Korzh-HOLM-Nubile improves quality of service. See Nubile, paragraph [0020].
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
a. Cai et al. (US 20190034239 A1) teaches a central processing unit (CPU) with dynamic thread mapping that includes a set of multiple cores, each with a set of multiple threads. A set of registers for each of the multiple threads monitors, for in-flight memory requests, the number of loads from and stores to at least a first memory interface and a second memory interface by each respective thread, the second memory interface having a greater latency than the first memory interface. The CPU further has logic to map and migrate each thread to respective CPU cores such that the number of cores accessing only one of the at least first and second memory interfaces is maximized.
b. MARONCELLI et al. (US 20240028516 A1) teaches a data processing apparatus in which prefetch circuitry generates a prefetch request for a cache line prior to the cache line being explicitly requested, the cache line being predicted to be required for a store operation in the future. Issuing circuitry issues the prefetch request to a memory hierarchy, and filter circuitry filters the prefetch request, based on at least one other prefetch request made to the cache line, to control whether the prefetch request is issued by the issuing circuitry.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SUBIR K CHOWDHURY whose telephone number is (703)756-1207. The examiner can normally be reached Monday-Friday 8:30 - 5:00 CST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Hosain Alam, can be reached at (571) 272-3978. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/S.K.C./Examiner, Art Unit 2132
/HOSAIN T ALAM/Supervisory Patent Examiner, Art Unit 2132