Prosecution Insights
Last updated: April 19, 2026
Application No. 18/193,790

METHOD AND SYSTEM FOR GENERATING HIGH PERFORMANCE MACHINE LEARNING TRAINING DATA STREAMS

Status: Non-Final Office Action (§103)
Filed: Mar 31, 2023
Examiner: MILLS, PAUL V
Art Unit: 2196
Tech Center: 2100 — Computer Architecture & Software
Assignee: DELL PRODUCTS, L.P.
OA Round: 1 (Non-Final)
Grant Probability: 53% (Moderate)
Expected OA Rounds: 1-2
Median Time to Grant: 4y 2m
Grant Probability With Interview: 92%

Examiner Intelligence

Career Allow Rate: 53% (185 granted / 351 resolved; -2.3% vs TC avg)
Interview Lift: +39.6% higher grant rate among resolved cases with an interview
Typical Timeline: 4y 2m average prosecution; 22 applications currently pending
Career History: 373 total applications across all art units
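The headline figures above follow directly from the raw counts. A quick check, assuming the dashboard's interview lift is an additive percentage-point adjustment to the unrounded allow rate (an assumption about its methodology, not stated on the page):

```python
# Reproduce the dashboard figures from the raw counts shown above.
granted, resolved = 185, 351               # career grants / resolved cases
allow_rate = granted / resolved * 100
print(f"Career allow rate: {allow_rate:.1f}%")   # 52.7%, displayed as 53%

interview_lift = 39.6                      # percentage-point lift with interview
with_interview = allow_rate + interview_lift
print(f"With interview: {with_interview:.1f}%")  # 92.3%, displayed as 92%
```

Note that the additive model reproduces the displayed 92% only when the lift is applied before rounding (52.7 + 39.6 = 92.3), which suggests the page rounds once, at display time.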

Statute-Specific Performance

§101: 11.4% (-28.6% vs TC avg)
§103: 47.8% (+7.8% vs TC avg)
§102: 12.7% (-27.3% vs TC avg)
§112: 24.7% (-15.3% vs TC avg)
Tech Center averages are estimates • Based on career data from 351 resolved cases
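The per-statute deltas are internally consistent: assuming each delta is simply the examiner's rate minus the Tech Center average (an assumption about how "vs TC avg" is computed), every statute backs out to the same ~40% baseline:

```python
# Back out the Tech Center average from each statute's examiner rate and
# delta, assuming delta = examiner rate - TC average.
stats = {
    "101": (11.4, -28.6),
    "103": (47.8, +7.8),
    "102": (12.7, -27.3),
    "112": (24.7, -15.3),
}
tc_avgs = {s: round(rate - delta, 1) for s, (rate, delta) in stats.items()}
print(tc_avgs)  # every statute backs out to a 40.0% TC average estimate
```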

Office Action

Detailed Action

Status of Claims

This action is in reply to the application filed on 03/31/2023. Claims 1-20 are currently pending and have been examined.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Zhao (US 20200174840 A1) in view of Zhu et al. ("Entropy-Aware I/O Pipelining for Large-Scale Deep Learning on HPC Systems", 2018).

Claims 1, 11, and 16: Zhao discloses the limitations as shown in the following rejections: obtaining, by a training data stream manager (TDSM) (service controller and nodes of data flow pipeline), a first stream request, wherein the first stream request comprises a stream creation request and a stream specification (service request and dataflow pipeline configuration parameters) (¶0059-0062, 0081; FIG. 2, 7); in response to obtaining the stream creation request: generating a new stream entry in a stream database (service request record and associated pipeline composition at service controller); loading training data specified by the stream specification into a cache (staging area/memory) (¶0054-0057, 0060, 0063-0064; 0048, 0081, 0084); generating a mini-batch sequence using the training data and the stream specification … generating mini-batch sequence access information (batch index/labels) associated with the mini-batch sequence; setting up a data transfer application programming interface (API) (RDMA interface) associated with the cache (¶0047, 0062, 0065-0066); streaming the mini-batch access information to a client in a machine learning training environment (GPU server nodes for deep learning model training) … wherein the client uses the mini-batch sequence access information and the data transfer API to obtain mini-batches of the mini-batch sequence (¶0069-0072, 0049). A system for managing training data, comprising: a client; and a training data stream manager (TDSM), comprising a processor and memory, programmed to / non-transitory computer readable medium comprising computer readable program code, which when executed by a computer processor enables the computer processor to perform (FIG. 2, 7, 8; ¶0094-0096).

In Zhao's system, the GPU server nodes (ML training environment) receive the mini-batch index and identifier labels (mini-batch sequence access information) from a client node coordinating the model training process (¶0069-0070), and Zhao does not disclose creating a mini-batch sequence queue and a stream endpoint in the data processing pipeline. Zhu, however, discloses (pg. 149-150) an analogous distributed system for ML model training including DeepIO servers (training data stream manager (TDSM)) which generate mini-batches by forming a list of data element IDs (mini-batch sequence queue) of the training dataset and provide an API endpoint shuffle_rand (stream endpoint) by which the worker nodes obtain an array of indexes of data elements used to retrieve a mini-batch via the deepIO_batch API (data transfer API). It would have been obvious to one of ordinary skill in the art prior to the filing date of the invention to modify Zhao to employ the mini-batch generation and pipelining methods of Zhu to optimize the delivery of training dataset elements to the GPU server/worker nodes (Zhu pg. 145, Abstract; pg. 148, sect. III).

Claims 2, 12, and 17: The combination of Zhao/Zhu discloses the limitations as shown in the rejections above. Zhao further discloses wherein the mini-batch sequence access information comprises pointers (batch indexes/IDs) associated with the mini-batches in the cache (¶0066, 0070-0071).

Claims 3, 13, and 18: The combination of Zhao/Zhu discloses the limitations as shown in the rejections above. Zhao further discloses wherein the data transfer API enables the client to perform remote direct memory access (RDMA) reads to obtain the mini-batches from the cache of the TDSM (¶0047, 0053, 0071). See also Zhu pg. 148, sect. III.

Claims 4-6, 14, 15, 19, and 20: The combination of Zhao/Zhu discloses the limitations as shown in the rejections above. Zhao further discloses the mini-batch sequence access information is streamed using a first network channel (RPC); and the client uses a second network channel (RDMA) to obtain the mini-batches (¶0026, 0042, 0047, 0053, 0070-0071), including alternatives wherein the second network channel comprises an InfiniBand network channel and/or an NVMe-oF network channel: "memory devices 262 can be accessed using the NVMe interface specification for access to direct-attached solid state devices using high-speed PCIe connections and/or the NVMe over Fabrics (NVMeOF) network protocol which allows host systems to access remote memory over a network (e.g., fabric) using remote direct memory access (RDMA) technologies such as InfiniBand, RDMA over Converged Ethernet (RoCE), Internet Wide-Area RDMA Protocol (iWARP), etc." (¶0047)

Claim 7: The combination of Zhao/Zhu discloses the limitations as shown in the rejections above. Zhao further discloses wherein the mini-batch sequence comprises: a plurality of mini-batches; end of epoch messages; and an end of stream message (last epoch complete) (¶0035, 0060, 0071-0072).

Claim 8: The combination of Zhao/Zhu discloses the limitations as shown in the rejections above. Zhu further discloses wherein a mini-batch of the mini-batch sequence comprises a randomly sampled portion of the augmented training data (pg. 148, sect. III; pg. 150).

Claims 9 and 10: The combination of Zhao/Zhu discloses the limitations as shown in the rejections above. Zhao further discloses a stream identifier (job and/or pipeline); the stream specification; and a stream status … wherein the stream specification comprises: stream metadata associated with the stream; training data access information associated with the training data (e.g. location, directory); mini-batch parameters; and augmentation parameters (¶0046, 0060-0063, 0065, 0081, 0092).

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. The following references are directed to data pipelines for distributed training: US-20190042887-A1, US-20200042362-A1, US-20200242487-A1, "Cachew: Machine Learning Input Data Processing as a Service", "A Holistic Approach to Data Access for Cloud-Native Analytics and Machine Learning", and "Large Scale Caching and Streaming of Training Data for Online Deep Learning". "Efficient User-Level Storage Disaggregation for Deep Learning" is directed to distributed training employing RDMA batch distribution. US-20200302334-A1 discloses a method for generating mini-batch sequences.

Any inquiry of a general nature or relating to the status of this application or concerning this communication or earlier communications from the Examiner should be directed to Paul Mills whose telephone number is 571-270-5482. The Examiner can normally be reached on Monday-Friday 11:00am-8:00pm. If attempts to reach the examiner by telephone are unsuccessful, the Examiner's supervisor, April Blair, can be reached at 571-270-1014. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/P. M./
Paul Mills
03/23/2026
/APRIL Y BLAIR/
Supervisory Patent Examiner, Art Unit 2196
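The rejected independent claims describe a concrete data flow: a stream manager caches training data, builds a queue of mini-batch pointers, and lets a client pull mini-batches through a transfer API. A minimal Python sketch of that flow follows; every class, method, and field name here is a hypothetical illustration, not drawn from the application, Zhao, or Zhu, and the `fetch` method merely stands in for an RDMA-style read:

```python
# Illustrative sketch of the claimed flow: create a stream, cache data,
# queue mini-batch pointers, let the client read by pointer. All names
# are hypothetical.
from collections import deque
import random

class TrainingDataStreamManager:
    def __init__(self):
        self.stream_db = {}   # stream database: id -> spec and status
        self.cache = {}       # cached training data, keyed by stream id

    def create_stream(self, stream_id, spec, training_data):
        # New stream entry in the stream database
        self.stream_db[stream_id] = {"spec": spec, "status": "active"}
        # Load the specified training data into the cache
        self.cache[stream_id] = list(training_data)
        # Mini-batch sequence queue: randomly sampled pointer groups
        pointers = list(range(len(self.cache[stream_id])))
        random.shuffle(pointers)
        size = spec["batch_size"]
        queue = deque(pointers[i:i + size] for i in range(0, len(pointers), size))
        # Access information the client uses with the transfer API
        return {"stream_id": stream_id, "batches": list(queue)}

    def fetch(self, stream_id, batch_pointers):
        """Stand-in for the RDMA-style transfer API: read cache by pointer."""
        data = self.cache[stream_id]
        return [data[p] for p in batch_pointers]

tdsm = TrainingDataStreamManager()
info = tdsm.create_stream("s1", {"batch_size": 4}, range(10))
first = tdsm.fetch("s1", info["batches"][0])
print(len(first))  # 4
```

In the real systems cited, the client-side read bypasses this request/response shape entirely (RDMA reads remote memory directly), which is exactly the efficiency argument the examiner draws from Zhao ¶0047.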

Prosecution Timeline

Mar 31, 2023
Application Filed
Mar 26, 2026
Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12572385
DYNAMIC SYSTEM POWER LOAD MANAGEMENT
Granted Mar 10, 2026 (2y 5m to grant)

Patent 12547456
FLEXIBLE LIMITERS FOR MANAGING RESOURCE DISTRIBUTION
Granted Feb 10, 2026 (2y 5m to grant)

Patent 12530215
PROCESSING SYSTEM, RELATED INTEGRATED CIRCUIT, DEVICE AND METHOD FOR CONTROLLING COMMUNICATION OVER A COMMUNICATION SYSTEM HAVING A PHYSICAL ADDRESS RANGE
Granted Jan 20, 2026 (2y 5m to grant)

Patent 12519865
MULTIPLE MODEL INJECTION FOR A DEPLOYMENT CLUSTER
Granted Jan 06, 2026 (2y 5m to grant)

Patent 12481522
USER-LEVEL THREADING FOR SIMULATING MULTI-CORE PROCESSOR
Granted Nov 25, 2025 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 53%
With Interview: 92% (+39.6%)
Median Time to Grant: 4y 2m
PTA Risk: Low

Based on 351 resolved cases by this examiner. Grant probability derived from career allow rate.
