Prosecution Insights
Last updated: April 19, 2026
Application No. 18/354,150

Offloading Data Storage Device Processing Tasks to a Graphics Processing Unit

Non-Final OA §103
Filed: Jul 18, 2023
Examiner: WU, BENJAMIN C
Art Unit: 2195
Tech Center: 2100 — Computer Architecture & Software
Assignee: Western Digital Technologies Inc.
OA Round: 1 (Non-Final)
Grant Probability: 87% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 3y 0m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 87% (above average; 456 granted / 522 resolved; +32.4% vs TC avg)
Interview Lift: +16.4% in resolved cases with interview (strong)
Typical Timeline: 3y 0m avg prosecution; 29 currently pending
Career History: 551 total applications across all art units
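The headline figures above are simple ratios over the examiner's resolved cases. As an illustration only (not this tool's actual implementation, whose schema and field names are unknown), the allow rate and the delta against the Tech Center average can be recomputed from the raw counts:

```python
# Illustrative recomputation of the dashboard's examiner metrics from raw
# counts. All names here are hypothetical, not the tool's actual schema.

def allow_rate(granted: int, resolved: int) -> float:
    """Career allow rate as a percentage of resolved cases."""
    return 100.0 * granted / resolved

examiner_rate = allow_rate(456, 522)   # ~87.4%, displayed as 87%
tc_average = examiner_rate - 32.4      # implied Tech Center average, ~55%

print(f"Career allow rate: {examiner_rate:.0f}%")        # Career allow rate: 87%
print(f"vs TC avg: {examiner_rate - tc_average:+.1f}%")  # vs TC avg: +32.4%
```

The same ratio-over-resolved-cases pattern applies to the statute-specific percentages in the next section.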

Statute-Specific Performance

§101: 19.8% (-20.2% vs TC avg)
§103: 48.4% (+8.4% vs TC avg)
§102: 0.8% (-39.2% vs TC avg)
§112: 16.1% (-23.9% vs TC avg)
Tech Center averages are estimates • Based on career data from 522 resolved cases

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

1. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

2. Claims 1–20 are presented for examination in a non-provisional application filed on Jul. 18, 2023.

Claim Interpretation Under 35 U.S.C. § 112

The following is a quotation of 35 U.S.C. 112(f): (f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

3. The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) is invoked. As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f): (A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; (B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always, linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and (C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f). The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f), except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f), except as otherwise indicated in an Office action. Accordingly, claim 20 is being interpreted under 35 U.S.C. 112(f).

Abbreviations

4. Where appropriate, the following abbreviations will be used when referencing Applicant’s submissions and specific teachings of the reference(s): i. figure / figures: Fig. / Figs. ii. column / columns: Col. / Cols. iii. page / pages: p. / pp.

References Cited

5. (A) Applicant’s Specification, construed as Applicant Admitted Prior Art (“AAPA”). (B) Pandurangan et al., US 2023/0114636 A1 (“Pandurangan”). (C) Gangani et al., US 2021/0240524 A1 (“Gangani”).

Notice re prior art available under both pre-AIA and AIA

6. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Claim Rejections - 35 U.S.C. § 103

The following is a quotation of 35 U.S.C. 
103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

A.

7. Claims 1–3, 5–6, 12, 14, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over (A) AAPA in view of (B) Pandurangan and (C) Gangani. See “References Cited” section, above, for full citations of references.

8. Regarding claim 1, (A) AAPA teaches/suggests the invention substantially as claimed, including: “A system, comprising: a data storage device comprising: a peripheral interface configured to connect to a host system; a storage medium configured to store host data” (¶ 4: Each storage device in a multi-device storage system may be connected to a host system through at least one high-bandwidth interface, such as PCIe, using an appropriate storage protocol for the storage device, such as non-volatile memory express (NVMe) for accessing solid state drives (SSDs) or the storage blades of all flash arrays); “a direct memory access service configured to: store, to a host … data; and access, from the host … data” (¶ 4: fabric-based distributed storage systems may include storage devices configured with direct memory access to enable more efficient transfer of data to and from hosts and other systems). 
AAPA does not teach “store, to a host memory buffer of the host system and through the peripheral interface, a first set of task input data; and access, from the host memory buffer and through the peripheral interface, a first set of task output data”. (B) Pandurangan, teaching a computational storage device, however, teaches or suggests: “store, to a host memory buffer of the host system and through the peripheral interface, a first set of task input data; and access, from the host memory buffer and through the peripheral interface, a first set of task output data” (¶ 22: A computational storage device in accordance with example embodiments of the disclosure may enable a user (e.g., an application running on a host) to access a device program using a storage protocol such as Nonvolatile Memory Express (NVMe). For example, in some embodiments, a user may start a device program by sending a command to the storage device using a storage protocol. The command may include, for example, the name of the device program to execute, one or more parameters to pass to the device program, a pointer to a memory area that may be used to exchange input and/or output data between the device program and an application running on the host; ¶ 24: a host may collect a stream of output data such as logs, debug messages, and/or the like, from a device program by directly reading the output data from a buffer using the storage protocol. In such an embodiment, a host and a storage device may exchange output data from a device program through a ring buffer using a producer-consumer model in which the device may produce data and the host may consume data; ¶ 71: the memory space in the data buffer pointed to by DPTR in the start command illustrated in FIG. 7 may include, for example, the name of the device program to execute and/or one or more input arguments for the device program. 
The data buffer may be located, for example, at a storage device, at a host, at any other location; ¶ 72: the data buffer may be populated (at least partially) automatically, for example, by an application running on a host, a driver on a host, a computational storage device, and/or the like, in response to a command to execute a device program; ¶ 73: As another example, if the data buffer is located at least partially at a host, the device program may send output data to the data buffer at the host using the storage protocol; ¶ 54: The buffer logic 438 may maintain any type of buffer that may enable a user (e.g., an application running on a host) to receive output data from one or more device programs). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of (B) Pandurangan with those of (A) AAPA to provide a computational storage device and device programs therein to execute user/device programs and exchange data with the host. The motivation or advantage to do so is to provide the host with shared access to the compute resources of a computational storage device, including specialized hardware/software components of the storage device (see AAPA, ¶ 5: storage devices may incorporate custom hardware accelerators integrated in their controller application specific integrated circuits (ASICs)). 
AAPA and Pandurangan do not teach “a processing offload service configured to notify, through the peripheral interface, a processor device to initiate a first processing task on the first set of task data, wherein the processor device comprises a graphics processing unit.” (C) Gangani, in the context of AAPA and Pandurangan’s teachings, however teaches or suggests implementing: “a processing offload service configured to notify, through the peripheral interface, a processor device to initiate a first processing task on the first set of task data, wherein the processor device comprises a graphics processing unit” (¶ 51: CPU 210 may be configured to execute the application 212. The application 212 may be an application that offloads the performing of ML tasks to the GPU 220. For example, the CPU 210 may use the GPU 220 to execute one or more ML primitives. For example, the application 212 may include operations that cause the GPU 220 to execute one or more computational jobs associated with an ML primitive; ¶ 52: graphics driver 214 may receive the operations from the application 212 and may control operation of the GPU 220 to facilitate executing the operations. 
For example, the graphics driver 214 may generate one or more command streams, store the generated command streams in the command buffer 230 of the memory 124, and instruct the GPU 220 to execute the command streams; ¶ 54: the CPU 210 may provide graphics data to the GPU 220 for rendering and issue one or more graphics commands to the GPU 220 … the CPU 210 may provide the graphics commands and the graphics data to the memory 124, which may be accessed by the GPU 220; ¶ 61: an ML command may cause the GPU 220 to generate a quantity of outputs (sometimes referred to as “primitive outputs”) associated with an ML primitive; ¶ 63: GPU 220 may access ML data associated with each of the respective computational jobs from the ML data buffer 232 at the memory 124, execute the respective computational jobs using the accessed ML data, and then write the output generated by executing each of the 100 computational jobs to the ML data buffer 232 of the memory 124). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to further combine the teachings of (C) Gangani with those of AAPA and Pandurangan to offload specific computation tasks to the GPU from a device program (i.e., a computational storage device). The motivation or advantage to do so is to provide specialized or additional computational processing or acceleration devices (functions) to meet the performance or processing requirements of host applications and/or user programs.

9. Regarding claim 2, AAPA, Pandurangan, and Gangani, in combination, teach or suggest: “a peripheral bus configured for communication among the data storage device, the host system, and the processor device” (AAPA, ¶ 2: Some computing systems, such as storage arrays, may include multiple data storage devices supporting one or more host systems through a peripheral or storage interface bus, such as peripheral component interconnect express (PCIe); Pandurangan, Fig. 
10 and ¶ 90: The host apparatus 1000 illustrated in FIG. 10 may include a processor 1002, which may include a memory controller 1004, a system memory 1006, host control logic 1008, and/or an interconnect interface 1010, which may be implemented, for example using CXL. Any or all of the components illustrated in FIG. 10 may communicate through one or more system buses 1012; Gangani, Fig. 2 and ¶ 48: FIG. 2, the example CPU 210, the example GPU 220, and the example memory 124 are in communication via an example bus 202); and “the host system comprising: a host processor; and a host memory device comprising a set of host memory locations configured to be: allocated to the host memory buffer; accessible to the data storage device using direct memory access; and accessible to the processor device using direct memory access” (AAPA, ¶ 2: host systems … are being tasked with mathematically intensive tasks related to graphics, video processing, machine learning, and other computational tasks; ¶ 4: fabric-based distributed storage systems may include storage devices configured with direct memory access to enable more efficient transfer of data to and from hosts and other systems; Pandurangan, Fig. 10 and ¶ 90: The host apparatus 1000 illustrated in FIG. 10 may include a processor 1002, which may include a memory controller 1004, a system memory 1006, host control logic 1008, and/or an interconnect interface 1010, which may be implemented, for example using CXL. Any or all of the components illustrated in FIG. 10 may communicate through one or more system buses 1012; ¶ 35: remote direct memory access (RDMA); Gangani, Fig. 2 and ¶ 48: FIG. 2, the example CPU 210, the example GPU 220, and the example memory 124 are in communication via an example bus 202); and 10. 
Regarding claim 3, AAPA, Pandurangan, and Gangani, in combination, teach or suggest: “wherein the host memory buffer is further configured with: a first subset of the set of host memory locations allocated to task input data and including the first set of task input data; a second subset of the set of host memory locations allocated to task output data and including the first set of task output data” (AAPA, ¶ 4: fabric-based distributed storage systems may include storage devices configured with direct memory access to enable more efficient transfer of data to and from hosts and other systems; Pandurangan, (as applied in rejecting claim 1) ¶ 22: A computational storage device in accordance with example embodiments of the disclosure may enable a user (e.g., an application running on a host) to access a device program using a storage protocol such as Nonvolatile Memory Express (NVMe). For example, in some embodiments, a user may start a device program by sending a command to the storage device using a storage protocol. The command may include, for example, the name of the device program to execute, one or more parameters to pass to the device program, a pointer to a memory area that may be used to exchange input and/or output data between the device program and an application running on the host; ¶ 24: a host may collect a stream of output data such as logs, debug messages, and/or the like, from a device program by directly reading the output data from a buffer using the storage protocol. In such an embodiment, a host and a storage device may exchange output data from a device program through a ring buffer using a producer-consumer model in which the device may produce data and the host may consume data; ¶ 71: the memory space in the data buffer pointed to by DPTR in the start command illustrated in FIG. 7 may include, for example, the name of the device program to execute and/or one or more input arguments for the device program. 
The data buffer may be located, for example, at a storage device, at a host, at any other location; ¶ 72: the data buffer may be populated (at least partially) automatically, for example, by an application running on a host, a driver on a host, a computational storage device, and/or the like, in response to a command to execute a device program; ¶ 73: As another example, if the data buffer is located at least partially at a host, the device program may send output data to the data buffer at the host using the storage protocol; ¶ 54: The buffer logic 438 may maintain any type of buffer that may enable a user (e.g., an application running on a host) to receive output data from one or more device programs). Gangani, (as applied in rejecting claim 1) ¶ 52: graphics driver 214 may receive the operations from the application 212 and may control operation of the GPU 220 to facilitate executing the operations. For example, the graphics driver 214 may generate one or more command streams, store the generated command streams in the command buffer 230 of the memory 124, and instruct the GPU 220 to execute the command streams; ¶ 54: the CPU 210 may provide graphics data to the GPU 220 for rendering and issue one or more graphics commands to the GPU 220 … the CPU 210 may provide the graphics commands and the graphics data to the memory 124, which may be accessed by the GPU 220; ¶ 61: an ML command may cause the GPU 220 to generate a quantity of outputs (sometimes referred to as “primitive outputs”) associated with an ML primitive; ¶ 63: GPU 220 may access ML data associated with each of the respective computational jobs from the ML data buffer 232 at the memory 124, execute the respective computational jobs using the accessed ML data, and then write the output generated by executing each of the 100 computational jobs to the ML data buffer 232 of the memory 124); and “a third subset of the set of host memory locations allocated to a status register configured to include at 
least one status indicator for the first processing task” (Pandurangan, ¶ 54: buffer logic 438 may notify the user that output data is available, for example, through an interrupt, a status bit in a status register, and/or the like; ¶ 73: data buffer is located at least partially at a host, the device program may send output data to the data buffer at the host using the storage protocol). 11. Regarding claim 5, AAPA, Pandurangan, and Gangani, in combination, teach or suggest: “further comprising the processor device, the processor device configured to: receive the notification to initiate the first processing task” (Pandurangan, ¶ 22: start a device program by sending a command to the storage device using a storage protocol. The command may include, for example, the name of the device program to execute, one or more parameters to pass to the device program, a pointer to a memory area that may be used to exchange input and/or output data between the device program and an application running on the host; Gangani, ¶ 52: graphics driver 214 may receive the operations from the application 212 and may control operation of the GPU 220 to facilitate executing the operations. 
For example, the graphics driver 214 may generate one or more command streams, store the generated command streams in the command buffer 230 of the memory 124, and instruct the GPU 220 to execute the command streams; ¶ 61: an ML command may cause the GPU 220 to generate a quantity of outputs … once the GPU 220 receives the ML command (e.g., from the command buffer 230), control may be passed to the GPU 220 for launching one or more computational jobs for generating the requested quantity of outputs); “access, using direct memory access to the host memory buffer, the first set of task input data; process, using a first set of task code for the first processing task, the first set of task input data to determine the first set of task output data; store, using direct memory access, the first set of task output data to the host memory buffer” (AAPA, ¶ 4: fabric-based distributed storage systems may include storage devices configured with direct memory access to enable more efficient transfer of data to and from hosts and other systems; Pandurangan, ¶ 22, 24, 71–73, and ¶ 54, as applied in rejecting claim 1 above. See also rejection of claim 3 incorporating similar features; Gangani, ¶ 52, 54, 61, and ¶ 63, as applied in rejecting claim 1 above. 
See also rejection of claim 3 incorporating similar features); and “notify, responsive to storing the first set of task output data to the host memory buffer, the data storage device that the first processing task is complete” (Pandurangan, ¶ 54: buffer logic 438 may notify the user that output data is available, for example, through an interrupt, a status bit in a status register, and/or the like; ¶ 73: data buffer is located at least partially at a host, the device program may send output data to the data buffer at the host using the storage protocol; Gangani, ¶ 72: after completion of the first batch of computational jobs, the GPU 220 may repeat, for each subsequent batch, the loading of the respective input data, the executing of the respective computational jobs to generate job output data, and the writing (or storing) of the respective job output data to the ML data buffer 232; ¶ 73: after completion of the first batch of computational jobs, the GPU 220 may write the batch output data (e.g., the job output data generated by the execution of each computational job of the batch) from the GPU memory 226 to the memory 124 (e.g., to the ML data buffer 232)).

12. Regarding claim 6, Pandurangan and Gangani teach or suggest: “the processing offload service is further configured to determine the first set of task code for the first processing task” (Pandurangan, ¶ 45: commands may include commands to download a user program from a host, execute a user program, transfer input and/or output data for a user program to and/or from the storage media 312 and/or a host or other device through the storage protocol interface 304 and/or the like; Gangani, ¶ 51: CPU 210 may be configured to execute the application 212. The application 212 may be an application that offloads the performing of ML tasks to the GPU 220. For example, the CPU 210 may use the GPU 220 to execute one or more ML primitives. 
For example, the application 212 may include operations that cause the GPU 220 to execute one or more computational jobs associated with an ML primitive; ¶ 52: graphics driver 214 may receive the operations from the application 212 and may control operation of the GPU 220 to facilitate executing the operations. For example, the graphics driver 214 may generate one or more command streams, store the generated command streams in the command buffer 230 of the memory 124, and instruct the GPU 220 to execute the command streams); and “the notification to initiate the first processing task includes the first set of task code for the first processing task” (Pandurangan, ¶ 22: start a device program by sending a command to the storage device using a storage protocol. The command may include, for example, the name of the device program to execute, one or more parameters to pass to the device program, a pointer to a memory area that may be used to exchange input and/or output data between the device program and an application running on the host; Gangani, ¶ 52: graphics driver 214 may receive the operations from the application 212 and may control operation of the GPU 220 to facilitate executing the operations. For example, the graphics driver 214 may generate one or more command streams, store the generated command streams in the command buffer 230 of the memory 124, and instruct the GPU 220 to execute the command streams; ¶ 61: an ML command may cause the GPU 220 to generate a quantity of outputs … once the GPU 220 receives the ML command (e.g., from the command buffer 230), control may be passed to the GPU 220 for launching one or more computational jobs for generating the requested quantity of outputs).

13. Regarding claim 12, it is the corresponding method claim reciting similar limitations of commensurate scope as the system of claim 1. Therefore, it is rejected on the same basis as claim 1 above. 14. 
Regarding claim 14, it is the corresponding method claim reciting similar limitations of commensurate scope as the system of claim 5. Therefore, it is rejected on the same basis as claim 5 above. 15. Regarding claim 20, it is a corresponding system claim reciting similar limitations of commensurate scope as the system of claim 1. Therefore, it is rejected on the same basis as claim 1 above.

Allowable Subject Matter

16. Claims 4, 7–11, 13, and 15–19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to BENJAMIN C WU whose telephone number is (571)270-5906. The examiner can normally be reached Monday through Friday, 8:30 A.M. to 5:00 P.M. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aimee J. Li, can be reached on (571)272-4169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. 
For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /BENJAMIN C WU/Primary Examiner, Art Unit 2195 January 9, 2026
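The offload flow that the rejection maps onto claims 1 and 5 (the storage device stages task input in a host memory buffer and notifies the GPU; the GPU reads the input by direct memory access, writes results back, and flags completion via a status region) can be sketched roughly as follows. This is a minimal illustrative model of the claimed handshake, not code from the application or from the cited references; every class, field, and function name here is hypothetical, and shared Python objects stand in for DMA over a peripheral bus:

```python
# Hypothetical sketch of the claimed offload handshake: a host memory buffer
# with input, output, and status regions shared between a storage device and
# a GPU. Plain attribute access stands in for DMA over PCIe.

IDLE, PENDING, DONE = 0, 1, 2

class HostMemoryBuffer:
    """Host-allocated memory regions accessible to both devices."""
    def __init__(self):
        self.task_input = None    # first subset: task input data
        self.task_output = None   # second subset: task output data
        self.status = IDLE        # third subset: status register

class StorageDevice:
    def __init__(self, hmb):
        self.hmb = hmb
    def offload(self, gpu, data):
        self.hmb.task_input = data      # DMA-store input to the host buffer
        self.hmb.status = PENDING
        gpu.notify()                    # notify the processor device to start
    def collect(self):
        assert self.hmb.status == DONE  # completion flagged in status region
        return self.hmb.task_output     # DMA-access output from the host buffer

class GPU:
    def __init__(self, hmb, task_code):
        self.hmb, self.task_code = hmb, task_code
    def notify(self):
        result = self.task_code(self.hmb.task_input)  # run task code on input
        self.hmb.task_output = result   # DMA-store output to the host buffer
        self.hmb.status = DONE          # signal that the task is complete

hmb = HostMemoryBuffer()
gpu = GPU(hmb, task_code=lambda xs: [x * 2 for x in xs])
ssd = StorageDevice(hmb)
ssd.offload(gpu, [1, 2, 3])
print(ssd.collect())    # [2, 4, 6]
```

In the references as cited, the same roles appear as Pandurangan's producer-consumer buffer exchanged over NVMe and Gangani's command/data buffers in memory 124 consumed by GPU 220; the sketch only abstracts that shared-buffer pattern.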

Prosecution Timeline

Jul 18, 2023
Application Filed
Jan 09, 2026
Non-Final Rejection — §103
Mar 23, 2026
Interview Requested
Apr 01, 2026
Examiner Interview Summary
Apr 01, 2026
Applicant Interview (Telephonic)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602258
INSTANTIATING SOFTWARE DEFINED STORAGE NODES ON EDGE INFORMATION HANDLING SYSTEMS
2y 5m to grant • Granted Apr 14, 2026
Patent 12585508
RECONSTRUCTING AND VERIFYING PROPRIETARY CLOUD BASED ON STATE TRANSITION
2y 5m to grant • Granted Mar 24, 2026
Patent 12579006
SYSTEMS AND METHODS FOR UNIVERSAL AUTO-SCALING
2y 5m to grant • Granted Mar 17, 2026
Patent 12572388
COMPUTING RESOURCE SCHEDULING BASED ON EXPECTED CYCLES
2y 5m to grant • Granted Mar 10, 2026
Patent 12566646
Accessing Critical Resource in a Non-Uniform Memory Access (NUMA) System
2y 5m to grant • Granted Mar 03, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 87%
With Interview: 99% (+16.4%)
Median Time to Grant: 3y 0m
PTA Risk: Low
Based on 522 resolved cases by this examiner. Grant probability derived from career allow rate.
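The tool does not disclose how the with-interview figure is computed. One reading consistent with the numbers shown (87% base, +16.4% interview lift, 99% displayed) is base rate plus lift, capped below 100%; the sketch below is that guess only, not the tool's actual model:

```python
# Hypothetical reconstruction of the projection math shown above. The tool's
# real model is undisclosed; this just reproduces the displayed numbers under
# an assumed "base + lift, capped" rule.

def with_interview(base_pct: float, lift_pct: float, cap: float = 99.0) -> float:
    """Grant probability after an examiner interview, capped below 100%."""
    return min(base_pct + lift_pct, cap)

print(with_interview(87.0, 16.4))   # 99.0
```

If the underlying model is instead the examiner's empirical allow rate among with-interview cases, the cap would be unnecessary; the dashboard does not say which it uses.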
