Prosecution Insights
Last updated: April 19, 2026
Application No. 18/193,762

METHOD AND SYSTEM FOR GENERATING AND MANAGING MACHINE LEARNING MODEL TRAINING DATA STREAMS

Non-Final OA §103
Filed: Mar 31, 2023
Examiner: SEYE, ABDOU K
Art Unit: 2198
Tech Center: 2100 (Computer Architecture & Software)
Assignee: DELL PRODUCTS, L.P.
OA Round: 1 (Non-Final)
Grant Probability: 82% (Favorable)
OA Rounds: 1-2
To Grant: 3y 5m
With Interview: 99%

Examiner Intelligence

Grants 82% (above average)
Career Allow Rate: 82% (480 granted / 583 resolved; +27.3% vs TC avg)
Strong +28% interview lift
Interview Lift: +27.5% (resolved cases with interview)
Typical timeline
Avg Prosecution: 3y 5m (38 currently pending)
Career history
Total Applications: 621 (across all art units)
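
The headline examiner figures can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are invented, and the idea that the pending count equals total applications minus resolved cases is an assumption, not something the page states.

```python
# Figures copied from the Examiner Intelligence panel.
granted = 480
resolved = 583
total_applications = 621

# Career allow rate = granted / resolved.
allow_rate = granted / resolved

# Assumption (not stated on the page): pending = total - resolved.
derived_pending = total_applications - resolved

print(f"Career allow rate: {allow_rate:.1%}")
print(f"Derived pending count: {derived_pending}")
```

Under that assumption the derived pending count matches the 38 pending applications reported, and the ratio rounds to the 82% shown.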

Statute-Specific Performance

§101: 21.6% (-18.4% vs TC avg)
§103: 54.6% (+14.6% vs TC avg)
§102: 2.8% (-37.2% vs TC avg)
§112: 13.0% (-27.0% vs TC avg)
Tech Center average is an estimate. Based on career data from 583 resolved cases.
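
The page labels the Tech Center average as an estimate. If each "vs TC avg" figure is a plain percentage-point difference (an assumption, since the page does not define it), the baseline implied by each statute row can be recovered by subtraction. A minimal Python sketch with illustrative names:

```python
# (examiner rate %, stated difference vs TC avg in percentage points)
statute_stats = {
    "101": (21.6, -18.4),
    "103": (54.6, +14.6),
    "102": (2.8, -37.2),
    "112": (13.0, -27.0),
}

# Assumption: difference = examiner rate - Tech Center average.
implied_tc_avg = {
    statute: round(rate - delta, 1)
    for statute, (rate, delta) in statute_stats.items()
}

for statute, baseline in implied_tc_avg.items():
    print(f"§{statute}: implied TC average {baseline}%")
```

Under this reading all four rows imply the same 40.0% baseline, which suggests the comparison uses one estimated figure, not a separate average per statute.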

Office Action

§103
DETAILED ACTION

Statement of claims

The present application includes: Claims 1-20 remain pending in the application. Claims 1-20 are being considered on the merits.

Information Disclosure Statement

The information disclosure statement (IDS) was submitted on 07/30/2024. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Krishna et al. (US 2022/0245505, Krishna hereinafter) in view of Nikoli Dryden et al. (“Clairvoyant Prefetching for Distributed Machine Learning I/O”, 2021-11-14, Nikoli hereinafter).

As to claim 1, Krishna teaches a method for managing training data (e.g., see FIG. 4B, para 68, “data manager 338 in communication with a training container”, “training data”), comprising: obtaining a first stream request, wherein the first stream request comprises a stream creation request and a stream specification (e.g., see FIG.
3, para 47, “the ingress request 302 is generated in response to input received from a user via user interface 205”, “an ingress request 302, which identifies a job payload (e.g., a hyper-parameter optimization processing task)”, “an HTTP POST request that includes the job payload” and “request is received that identifies a payload for execution”, “a plurality of tasks are generated based on the payload” in para 72, see FIG. 6. Thus, the “request” represents the first stream request, the “payload” represents the stream specification, and “ingress request 302 is generated” coupled with “tasks are generated” includes the stream creation request);

in response to obtaining the stream creation request (e.g., para 43, “ingress request 302 is generated”):

generating a new stream entry in a stream database (e.g., “306”, FIG. 3, see FIG. 3, para 47, wherein “Entry point service 304 receives ingress request 302”, “a checksum based on the received job payload”, and “stores checkpoint data within object store 324”, “Entry point service 304 places the job payload in processing queue 306.”, “checkpoint data may include a checksum (e.g., an ID), a timestamp” in para 52. Thus, the “processing queue 306” coupled with “object store 324” includes the stream database, and the “checksum”, “ID”, and “timestamp” include the new stream entry);

loading training data specified by the stream specification into a cache (e.g., “data Queue”, FIG. 4B, and “training data Stream”, FIG. 4B and para 68, wherein “receives the training data stream from object store 324”, FIG. 4B, “456”. Thus, the “data queue” includes the cache);

generating augmented training data using the training data and the stream specification (e.g., FIG. 4B, “460”, para 68, “the shuffled training data to augmentation 460.
Augmentation 460 augments the shuffled training data”);

generating a mini-batch using the augmented training data and the stream specification (e.g., para 69, “Batching module 458 batches the training data into batch sizes”. Thus, one of the “batch sizes” includes a mini-batch);

creating a mini-batch queue and a stream endpoint (e.g., “470 training batch queue”, FIG. 4B, para 69, “batch sizes”, “batches) within data queue 456”, “the batches within training batch queue 470”, “processing unit, such as GPU 472” and “checkpoint data may include a checksum (e.g., an ID)” in para 52. Thus, “470 training batch queue” represents a mini-batch queue, and the “checksum (e.g., an ID)” coupled with “processing unit, such as GPU 472” includes a stream endpoint); and

training the mini-batch.

However, Krishna does not teach a mini-batch sequence or a mini-batch sequence queue.

Nikoli teaches generating a mini-batch sequence using the augmented training data (e.g., “Mini-batch 1”, “Mini-batch 2”, “Mini-batch 3”, “Mini-batch 4”, Figure 2, and “2.2 Machine Learning I/O Frameworks”, “data augmentation, and finally collating them into a mini-batch for training (see Fig. 2)”. “Mini-batch 1”, “Mini-batch 2”, “Mini-batch 3”, “Mini-batch 4” include the mini-batch sequence), creating a mini-batch sequence queue (e.g., “Staging buffer”, Figure 5 on page 5; also see FIG. 6, page 6, “the staging buffer, which is filled in a circular manner”, “a producer/consumer queue” for “External data augmentation”, “data augmentation, and finally collating them into a mini-batch for training”. Thus, the “Staging buffer” coupled with the “producer/consumer queue” includes the mini-batch sequence queue), wherein the mini-batch sequence is used by a training environment to train a machine learning model (e.g., see page 2, “2.2 Machine Learning I/O Frameworks I/O for training deep neural networks”, “reading samples from storage”, see Figure 6).

Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify the method of Krishna by adopting the teachings of Nikoli to include generating a mini-batch sequence using the augmented training data and the stream specification; creating a mini-batch sequence queue and a stream endpoint; and streaming the mini-batch sequence using the mini-batch sequence queue and the stream endpoint, wherein the mini-batch sequence is used by a training environment to train a machine learning model, since doing so “reduces I/O times and improves end-to-end training” (see Nikoli, abstract) or provides a “powerful interface that can be used in existing training pipelines to improve their I/O performance and reduce overall runtime” (see Nikoli, conclusion).

As to claim 2, Krishna does not explicitly teach wherein the augmented training data comprises training data examples of the training data and additional augmented training data examples. However, Nikoli teaches wherein the augmented training data comprises training data examples of the training data and additional augmented training data examples (e.g., see page 2, “reading samples from storage”, “data augmentation, and finally collating them into a mini-batch for training (see Fig. 2)”. Thus, the “samples” represent the examples).

As to claim 3, Krishna does not teach wherein the mini-batch sequence comprises: a plurality of mini-batches; end of epoch messages; and an end of stream message. However, Nikoli teaches wherein the mini-batch sequence comprises: a plurality of mini-batches; end of epoch messages; and an end of stream (e.g., page 4, wherein “The mini-batch size is 𝐵 and there are 𝐸 epochs.”, “a batch 𝐵ℎ ⊆ {1, . . . , 𝐹}”, “local batch 𝐵ℎ,𝑖 ⊆ 𝐵ℎ. We write 𝑏𝑖 = |𝐵ℎ,𝑖|” and “Access stream 𝑅 = (⋯, 7, 4, 5, 8, ⋯)”, Figure 5.
Thus, a mini-batch sequence comprising a plurality of mini-batches, end of epoch messages, and an end of stream would have been inherent). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify the method of Krishna by adopting the teachings of Nikoli to include wherein the mini-batch sequence comprises: a plurality of mini-batches; end of epoch messages; and an end of stream message, since doing so “reduces I/O times and improves end-to-end training” (see Nikoli, abstract) or provides a “powerful interface that can be used in existing training pipelines to improve their I/O performance and reduce overall runtime” (see Nikoli, conclusion).

As to claim 4, Krishna does not teach wherein a mini-batch of the mini-batch sequence comprises a randomly sampled portion of at least one of the augmented training data and the training data. However, Nikoli teaches wherein a mini-batch of the mini-batch sequence comprises a randomly sampled portion of at least one of the augmented training data and the training data (e.g., see page 4, “Random aggregate read throughput of the PFS, as a function of the number of readers 𝛾”, Figure 6, “RandomSampler”). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify the method of Krishna by adopting the teachings of Nikoli to include wherein a mini-batch of the mini-batch sequence comprises a randomly sampled portion of at least one of the augmented training data and the training data, since doing so “reduces I/O times and improves end-to-end training” (see Nikoli, abstract) or provides a “powerful interface that can be used in existing training pipelines to improve their I/O performance and reduce overall runtime” (see Nikoli, conclusion).

As to claim 5, Krishna teaches wherein the stream entry comprises: a stream identifier; the stream specification; and a stream status (e.g., see FIG. 3, para 66 and 68, “data”, “data stream” and “a job (e.g., payload”, “status information” in para 55 and 56).

As to claim 6, Krishna teaches wherein the stream specification comprises: stream metadata associated with the stream; training data access information associated with the training data; mini-batch parameters; and augmentation parameters (e.g., para 68, “training data stream”, “training data”, “Augmentation 460 augments the shuffled training data”, “the shuffled training data to batching model 458”, “the training data into batch sizes”, “batches”).

As to claim 7, Krishna teaches wherein the method further comprises: obtaining a second stream request, wherein the second stream request comprises a stream status request and a stream identifier (e.g., para 56, “obtain the status information” for “requests after the failure” in para 65); in response to obtaining the second request: obtaining a stream status from a stream entry in the stream database; and providing the stream status to a client associated with the second stream request (e.g., para 56, “A user may view the job status”, “displayed via a user interface 205.”).

As to claim 8, Krishna further teaches wherein the method further comprises: obtaining a second stream request, wherein the second stream request comprises a duplicate stream request and a parent stream identifier associated with a parent stream; in response to obtaining the second request: creating a new stream entry associated with the parent stream in the stream database; creating a new stream endpoint (e.g., para 47, “Entry point service 304 receives ingress request 302, and validates ingress request 302, such as by verifying a checksum.”, “checksum to a received checksum”, “checksums”, “mirrored (e.g., duplicated)” in para 49.
Thus, obtaining a second stream request, wherein the second stream request comprises a duplicate stream request and a parent stream identifier associated with a parent stream; in response to obtaining the second request: creating a new stream entry associated with the parent stream in the stream database, creating a new stream endpoint would have been inherent). However, Krishna does not teach regenerating a mini-batch sequence associated with the parent stream; and streaming the mini-batch sequence using the mini-batch sequence queue and the stream endpoint. Nikoli teaches regenerating a mini-batch sequence associated with the parent stream; and streaming the mini-batch sequence using the mini-batch sequence queue and the stream endpoint (see rejection of claim 1 above). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify the method of Krishna by adopting the teachings of Nikoli to include obtaining a second stream request, wherein the second stream request comprises a duplicate stream request and a parent stream identifier associated with a parent stream; in response to obtaining the second request: creating a new stream entry associated with the parent stream in the stream database; regenerating a mini-batch sequence associated with the parent stream; creating a new stream endpoint; and streaming the mini-batch sequence using the mini-batch sequence queue and the stream endpoint, since doing so “reduces I/O times and improves end-to-end training” (see Nikoli, abstract) or provides a “powerful interface that can be used in existing training pipelines to improve their I/O performance and reduce overall runtime” (see Nikoli, conclusion).

As to claim 9, Krishna teaches wherein the method further comprises: obtaining a second stream request, wherein the second stream request comprises a stream save request and a stream identifier associated with the stream (e.g., see FIG.
4B, para 68 and 69, wherein “receives the training data stream from object store 324”, “stores the batched training data (i.e., batches) within data queue 456.”. Thus, obtaining a second stream request, wherein the second stream request comprises a stream save request and a stream identifier associated with the stream); in response to obtaining the second request: saving entries associated with the stream in the stream database, a training data database (e.g., “456”, FIG. 4B), and a mini-batch database in a log file (e.g., “470”, FIG. 4B); and storing the log file in a storage (e.g., para 69, “The API client 468 stores the batches within training batch queue 470. A processing unit, such as GPU 472, obtains the batches from the training batch queue, and trains a machine learning model with the batches. Although only one GPU 472 is illustrated, a worker pod 332, 334 may execute on multiple GPUs”).

As to claim 10, Krishna further teaches wherein the method further comprises: obtaining a third stream request, wherein the third stream request comprises a restore stream request and the stream identifier associated with the stream; in response to obtaining the third request: creating a new stream entry associated with the stream in the stream database, obtaining the log file from the storage (e.g., para 53, “Assuming the processing task is interrupted (e.g., fails), a worker pod 332, 334 that is reassigned the same processing task may obtain the checkpoint data from the object store 324, and determine where in a given training batch to begin applying the processing task”. The “reassigned the same processing task” includes a restore stream request), creating a stream endpoint (see rejection of claim 1 above). However, Krishna does not teach regenerating the mini-batch sequence associated with the stream using the log file; creating a mini-batch sequence queue; and streaming the mini-batch sequence using the mini-batch sequence queue and the stream endpoint.
Nikoli teaches regenerating the mini-batch sequence associated with the stream using the log file; creating a mini-batch sequence queue and a stream endpoint; and streaming the mini-batch sequence using the mini-batch sequence queue (see rejection of claim 1 above). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify the method of Krishna by adopting the teachings of Nikoli to include obtaining a third stream request, wherein the third stream request comprises a restore stream request and the stream identifier associated with the stream; in response to obtaining the third request: creating a new stream entry associated with the stream in the stream database; obtaining the log file from the storage; regenerating the mini-batch sequence associated with the stream using the log file; creating a mini-batch sequence queue and a stream endpoint; and streaming the mini-batch sequence using the mini-batch sequence queue and the stream endpoint, since doing so “reduces I/O times and improves end-to-end training” (see Nikoli, abstract) or provides a “powerful interface that can be used in existing training pipelines to improve their I/O performance and reduce overall runtime” (see Nikoli, conclusion).

As to claim 11, Krishna further teaches obtaining a second stream request, wherein the second stream request comprises a stream termination request and a stream identifier associated with the stream; in response to obtaining the second request: deleting the stream endpoint and the mini-batch queue associated with the stream; deleting cached data associated with the stream; and updating a stream status to indicate that the stream is terminated (e.g., para 5, “when the task is interrupted, from a last “checkpoint,” rather than starting the task from the beginning.
The master, worker pods, and work and results queues may be deleted upon completion of the job” and “the completion of a job (e.g., payload has been completely processed), completion manager 312 deletes master 316 and the associated worker pods 332, 334, as well as the work queue 320 and the results queue 322.” in para 55. Thus, being “deleted upon completion of the job” includes deleting the stream endpoint and the mini-batch queue associated with the stream). However, Krishna does not teach a mini-batch sequence queue. Nikoli teaches a mini-batch sequence queue (see rejection of claim 1 above). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify the method of Krishna by adopting the teachings of Nikoli to include obtaining a second stream request, wherein the second stream request comprises a stream termination request and a stream identifier associated with the stream; in response to obtaining the second request: deleting the stream endpoint and the mini-batch sequence queue associated with the stream; deleting cached data associated with the stream; and updating a stream status to indicate that the stream is terminated, since doing so “reduces I/O times and improves end-to-end training” (see Nikoli, abstract) or provides a “powerful interface that can be used in existing training pipelines to improve their I/O performance and reduce overall runtime” (see Nikoli, conclusion).

As to claim 12, see rejection of claim 1 above. Krishna further teaches a system for managing training data, comprising: a client; and a training data stream manager (TDSM), comprising a processor and memory, programmed (see FIG. 4B).

As to claims 13-16, see rejection of claims 2-5 above.

As to claim 17, see rejection of claim 1 above.
Krishna further teaches a non-transitory computer readable medium comprising computer readable program code, which when executed by a computer processor enables the computer processor to (e.g., see FIG. 2).

As to claims 18-20, see rejection of claims 2-4 above.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.

Cmielowski et al. discloses a continuous machine learning system that includes a data generator module, a pipeline search module, a pipeline refinement module, and a pipeline training module. The data generator module obtains raw training data defining a total data size and generates a plurality of data batches from the raw training data. The pipeline search module obtains an initial data batch from among the plurality of data batches and determines a best machine learning model pipeline among a plurality of machine learning model pipelines based on the initial data batch. The pipeline refinement module receives the best machine learning model pipeline and refines the best machine learning model pipeline to generate a refined pipeline that consumes the plurality of data batches. The pipeline training module incrementally trains the refined pipeline using remaining data batches among the plurality of data batches generated after the initial data batch.

Chen et al. (US 11,113,244) discloses an integrated data pipeline that can take advantage of a streaming service, which can handle tasks such as automated redelivery, as well as a processing service, which can allocate workers on a task- or event-specific basis. Event data is aggregated and compressed for delivery by the streaming service. The streaming service can deliver the data asynchronously to the processing service, which can disaggregate and decompress the data to obtain the original data records. The type of event for each record can be determined to determine whether the data should be processed using online and/or offline processing.
For online processing the appropriate fields are determined and data extracted to be passed to the online processing services. For offline processing the record data is concatenated sequentially into mini-batches, then compacted into larger batch files that are stored for subsequent offline processing.. Any inquiry concerning this communication or earlier communications from the examiner should be directed to ABDOU K SEYE whose telephone number is (571)270-1062. The examiner can normally be reached M-F 9-5:30. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre Vital can be reached at 5712724215. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /ABDOU K SEYE/Examiner, Art Unit 2198 /PIERRE VITAL/Supervisory Patent Examiner, Art Unit 2198

Prosecution Timeline

Mar 31, 2023: Application Filed
Jan 12, 2026: Non-Final Rejection (§103)
Apr 06, 2026: Interview Requested
Apr 10, 2026: Examiner Interview Summary
Apr 10, 2026: Applicant Interview (Telephonic)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12598527
Real-Time Any-G SON
2y 5m to grant Granted Apr 07, 2026
Patent 12587456
MACHINE LEARNING BASED EVENT MONITORING
2y 5m to grant Granted Mar 24, 2026
Patent 12585512
CUSTOMIZED SOCKET APPLICATION PROGRAMMING INTERFACE FUNCTIONS
2y 5m to grant Granted Mar 24, 2026
Patent 12541410
THREAD SPECIALIZATION FOR COLLABORATIVE DATA TRANSFER AND COMPUTATION
2y 5m to grant Granted Feb 03, 2026
Patent 12530245
CONTAINER IMAGE TOOLING STORAGE MIGRATION
2y 5m to grant Granted Jan 20, 2026
Based on the 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 82%
With Interview: 99% (+27.5%)
Median Time to Grant: 3y 5m
PTA Risk: Low
Based on 583 resolved cases by this examiner. Grant probability derived from career allow rate.