Prosecution Insights
Last updated: April 19, 2026
Application No. 18/060,749

SYSTEMS AND METHODS FOR BAGGING ENSEMBLE CLASSIFIERS FOR IMBALANCED BIG DATA

Final Rejection §103
Filed: Dec 01, 2022
Examiner: PEREZ-ARROYO, RAQUEL
Art Unit: 2169
Tech Center: 2100 — Computer Architecture & Software
Assignee: Capital One Services LLC
OA Round: 2 (Final)
Grant Probability: 58% (Moderate)
OA Rounds: 3-4
To Grant: 3y 5m
With Interview: 90%

Examiner Intelligence

Career Allow Rate: 58% (171 granted / 296 resolved; +2.8% vs TC avg)
Interview Lift: +32.3% among resolved cases with interview (strong)
Avg Prosecution: 3y 5m (28 currently pending)
Total Applications: 324 (across all art units)
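The headline figures above can be reproduced from the raw counts shown. A minimal check, assuming (as the page's "+32.3% Interview Lift" figure implies) that the lift is an additive percentage-point adjustment to the career allow rate:

```python
# Reproduce the examiner's headline statistics from the raw counts above.
granted, resolved = 171, 296

career_allow_rate = granted / resolved  # 0.5777... -> rounds to 58%
print(f"Career allow rate: {career_allow_rate:.0%}")

# Assumption: the interview lift is additive in percentage points.
interview_lift_pts = 32.3
with_interview_pct = career_allow_rate * 100 + interview_lift_pts
print(f"Allow rate with interview: {with_interview_pct:.0f}%")  # -> 90%
```

Both rounded results match the dashboard's 58% and 90% cards.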

Statute-Specific Performance

§101: 21.9% (-18.1% vs TC avg)
§103: 47.6% (+7.6% vs TC avg)
§102: 8.7% (-31.3% vs TC avg)
§112: 15.0% (-25.0% vs TC avg)
Tech Center averages are estimates. Based on career data from 296 resolved cases.
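One detail worth noting: subtracting each statute's "vs TC avg" delta from its allow rate yields the same 40.0% baseline in all four cases, which suggests the deltas are computed against a single Tech-Center-wide average rather than per-statute baselines. This is an observation from the displayed numbers, not a statement about the tool's methodology:

```python
# Verify that every (rate, delta) pair above implies the same TC average.
rates = {"§101": (21.9, -18.1), "§103": (47.6, 7.6),
         "§102": (8.7, -31.3), "§112": (15.0, -25.0)}

for statute, (rate, delta) in rates.items():
    implied_tc_avg = rate - delta  # 21.9 - (-18.1) = 40.0, and so on
    print(f"{statute}: {rate}% allow rate, implied TC avg {implied_tc_avg:.1f}%")
```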

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

This Office Action has been issued in response to Applicant's Communication of amended application S/N 18/060,749 filed on November 7, 2025. Claims 1 to 18, 20, and 21 are currently pending with the application.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1 to 3, 5, 7 to 10, 12, 14 to 17, 20, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over NATSUI (U.S. Publication No. 2023/0186092), in view of Zhang (U.S. Patent No. 11,182,691), and further in view of HAO et al. (U.S. Publication No. 2023/0141749), hereinafter Hao.
As to claim 1: Natsui discloses: A system, comprising: one or more processors; and a memory in communication with the one or more processors and storing instructions that, when executed by the one or more processors, are configured to cause the system to: receive user input comprising a number of machine learning base models to generate [Paragraph 0027 teaches receive input made by a user; Paragraph 0111 teaches receive user input of a desired number of parameters of machine learning models to generate]; generate a plurality of machine learning base models corresponding to the number based on the user input [Paragraph 0039 teaches generates plural untrained models; Paragraph 0111 teaches model generation unit may generate untrained models of the number of parameters for which the user input has been received]; determine a chunk for a machine learning base model of the plurality of machine learning base models [Paragraph 0104 teaches storing pieces of training data formed of input data and correct classification results as a training dataset, and training the untrained model using the training dataset, in other words, determining a chunk, which could comprise all the training dataset], and train the machine learning base model with the chunk [Paragraph 0103 teaches training the learning model; Paragraph 0104 teaches training the untrained model using the training dataset]; and validate the plurality of machine learning base models using the testing data [Paragraph 0105 teaches evaluating the accuracy of the model trained by the training unit; Paragraph 0139 teaches the evaluation data may be data with correct answers that have not been used for training, for example, validation data]. 
Natsui does not appear to expressly disclose receive a first dataset; store a minority portion of the first dataset as testing data and a remaining portion of the first dataset as training data; separate the training data into majority cases and minority cases; iteratively for each machine learning base model of the plurality of machine learning base models until all machine learning base models of the plurality of machine learning base models are trained: determine a chunk for a machine learning base model, wherein the chunk comprises all minority cases from the training data and a subset of majority cases from the training data that is unique from other chunks associated with other machine learning base models of the plurality of machine learning base models. Zhang discloses: receive a first dataset [Column 39, lines 46 to 49 teaches a chunked data set that will be processed with filtering and splitting operations, therefore, the dataset must be received]; store a minority portion of the first dataset as testing data and a remaining portion of the first dataset as training data [Column 39, lines 58 to 62 teach a split operation applied to the dataset, where 70% of the chunks are placed in a training set, and 30% of the chunks are placed in a test set]; separate the training data into majority cases and minority cases [Column 54, lines 39 to 42 teach records of a raw training data set are classified into a majority category and two minority categories, hence, separating the training data into majority and minority cases]. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Natsui, by receiving a first dataset; storing a minority portion of the first dataset as testing data and a remaining portion of the first dataset as training data; and separating the training data into majority cases and minority cases, as taught by Zhang [Column 39, 54], because both applications are directed to generation and training of machine learning models; filtering and splitting the datasets into majority and minority cases provides improvements in prediction accuracy, training time and run-time performance (See Zhang [Col 56, line 59 - Col 57 line 5]).

Neither Natsui nor Zhang appears to expressly disclose iteratively for each machine learning base model of the plurality of machine learning base models until all machine learning base models of the plurality of machine learning base models are trained: determine a chunk for a machine learning base model, wherein the chunk comprises all minority cases from the training data and a subset of majority cases from the training data that is unique from other chunks associated with other machine learning base models of the plurality of machine learning base models.

Hao discloses: iteratively for each machine learning base model of the plurality of machine learning base models until all machine learning base models of the plurality of machine learning base models are trained [Paragraph 0112 teaches selecting subsets of samples in the training set for each base classifier when training the initial base classification models; Paragraph 0115 teaches each base classification model uses different training data, therefore, iteratively until all base models are trained; Fig. 10, training all the models by selecting subsets of healthy data (majority cases) and all erroneous data (minority cases), until model n is trained, hence, until all models are trained]: determine a chunk for a machine learning base model, wherein the chunk comprises all minority cases from the training data and a subset of majority cases from the training data that is unique from other chunks associated with other machine learning base models of the plurality of machine learning base models [Paragraph 0103 teaches each base classification model is trained by using different subsets of healthy data (majority cases); Paragraph 0117 teaches constructing a training set for each LSTM model with a method of majority-class under-sampling, and performing initial training, by selecting a part of the majority-class (i.e. healthy data) samples and all the minority-class (i.e. erroneous data) samples as a training set; Paragraph 0134 teaches each base classification model is an initial base classification model that is obtained by training using all of erroneous data in the historical SMART data of the plurality of storage devices (minority class) and a first subset of healthy data in the historical SMART data (majority class), wherein the healthy data in the historical SMART data is divided into a plurality of first subsets, wherein the plurality of first subsets do not cross or overlap each other, therefore, where the chunks comprise all minority cases and a subset of majority cases that is unique from other chunks].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Natsui, by iteratively for each machine learning base model of the plurality of machine learning base models until all machine learning base models of the plurality of machine learning base models are trained: determine a chunk for a machine learning base model, wherein the chunk comprises all minority cases from the training data and a subset of majority cases from the training data that is unique from other chunks associated with other machine learning base models of the plurality of machine learning base models, as taught by Hao [Paragraphs 0103, 0112, 0115, 0117, 0134], because the applications are directed to generation and training of machine learning models; selecting a subset of healthy data (majority cases) and all the erroneous data (minority cases) in a sampling method ensures a difference of training data while alleviating the problem of unbalanced sample proportions for majority and minority categories (See Hao Para [0117]).

As to claim 2: Natsui as modified by Zhang discloses: wherein each chunk comprises no more than 50% minority cases [Column 56, lines 26 to 33 teach an example of sample ratios includes a 33% sample of minority category 4014A, and a 50% sample of minority category 4014B, therefore, no more than 50% minority cases].

As to claim 3: Natsui as modified by Zhang discloses: wherein the minority portion comprises 10 to 30% of the first dataset [Column 39, lines 58 to 62 teach a split operation applied to the dataset, where 70% of the chunks are placed in a training set, and 30% of the chunks are placed in a test set].
As to claim 5: Natsui as modified by Zhang discloses: wherein each machine learning base model comprises a logistic regression model, a gradient boosted tree method model, a k-nearest neighbor model, or combinations thereof [Column 59, lines 24 to 26 teach model types include regression models, etc.].

As to claim 7: Natsui as modified by Zhang discloses: wherein determining the chunk for a machine learning base model of the plurality of machine learning base models is conducted dynamically at runtime [Column 27, lines 31 to 49 teach run-time recipe manager may retrieve the executable version of R1, perform a set of run-time validations, and schedule the execution of the transformation operations of R1 at respective resource sets 1175A and 1175B, where respective outputs 1185A and 1185B may be produced by the application of the recipe R1 on input datasets 1 and 2, and where the outputs may represent data that is to be used as input for a model, in other words, the determining of the chunk is conducted dynamically at runtime].

As to claim 21: Natsui as modified by Hao discloses: wherein each majority case of the majority cases of training data is incorporated into at least one subset of majority cases associated with a chunk such that none of the majority cases of training data are excluded from training at least one machine learning base model of the plurality of machine learning base models [Hao - Paragraph 0112 teaches subset of the majority class samples is selected for each base classifier through an integrated strategy in order to use all sample information in the training set].

Same rationale applies to claims 8 to 10, 12, 14 to 17, and 20, since they recite similar limitations.

Claims 4, 6, 11, 16, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over NATSUI (U.S. Publication No. 2023/0186092), in view of Zhang (U.S. Patent No. 11,182,691), in view of HAO et al. (U.S. Publication No. 2023/0141749), hereinafter Hao, and further in view of Merrill et al. (U.S. Publication No. 2019/0043070), hereinafter Merrill.

As to claim 4: Natsui discloses all the limitations as set forth in claim 1 above, but does not appear to expressly disclose wherein each machine learning base model comprises a gradient boosted tree method model. Merrill discloses: wherein each machine learning base model comprises a gradient boosted tree method model [Paragraph 0158 teaches using tree-based methods such as gradient boosted trees]. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Natsui, by incorporating a gradient boosted tree method model, as taught by Merrill [Paragraph 0158], because the applications are directed to generation and training of machine learning models; using a gradient boosted tree method model improves accuracy of the models.

As to claim 6: Natsui as modified by Merrill discloses: the user input further comprises a selection of a logistic regression model, a gradient boosted tree method model, or a k-nearest neighbor model [Paragraph 0252 teaches receives user-selection of a model type and generates the at least one instruction that defines a model type of the protected class model based on the received user-selection; Paragraph 0298 teaches the tree model is a gradient boosted tree].

Same rationale applies to claims 11, 16, and 18, since they recite similar limitations.

Response to Arguments

This is in response to arguments filed on November 7, 2025. Applicant's arguments have been fully and respectfully considered, but are moot in view of new grounds of rejections, as necessitated by the amendments.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to RAQUEL PEREZ-ARROYO whose telephone number is (571)272-8969. The examiner can normally be reached Monday - Friday, 8:00am - 5:30pm, Alt Friday, EST. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Sherief Badawi, can be reached at 571-272-9782. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format.
For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/RAQUEL PEREZ-ARROYO/
Primary Examiner, Art Unit 2169
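The chunking scheme at the center of the §103 dispute (hold out a small test portion, split the remaining training data into majority and minority cases, then give every base model all minority cases plus a disjoint slice of majority cases) can be sketched in a few lines of Python. This is an illustrative sketch of the claimed technique, not the applicant's or any cited reference's implementation; `build_chunks`, `is_minority`, and the stride-based slicing are hypothetical choices, and the 20% test fraction simply falls in the 10 to 30% range recited in claim 3.

```python
# Illustrative sketch of the claimed chunking scheme (not the applicant's code).
import random

def build_chunks(records, is_minority, n_models, test_frac=0.2, seed=0):
    """Split records into a held-out test set and one training chunk per model."""
    rng = random.Random(seed)
    rng.shuffle(records)
    n_test = int(len(records) * test_frac)   # "minority portion", 10-30% (claim 3)
    test, train = records[:n_test], records[n_test:]
    minority = [r for r in train if is_minority(r)]
    majority = [r for r in train if not is_minority(r)]
    chunks = []
    for i in range(n_models):
        subset = majority[i::n_models]       # disjoint slice, unique per model
        chunks.append(minority + subset)     # all minority cases + unique subset
    return chunks, test

# Toy data: 90 majority-class records, 10 minority-class records, 3 base models.
data = [("maj", k) for k in range(90)] + [("min", k) for k in range(10)]
chunks, test = build_chunks(data, lambda r: r[0] == "min", n_models=3)
```

Because `majority[i::n_models]` strides through the majority list, the majority subsets partition the training majority cases: each appears in exactly one chunk (matching claim 21's requirement that no majority case be excluded from training), while every chunk carries the full set of minority training cases.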

Prosecution Timeline

Dec 01, 2022
Application Filed
Aug 18, 2025
Non-Final Rejection — §103
Nov 05, 2025
Applicant Interview (Telephonic)
Nov 05, 2025
Examiner Interview Summary
Nov 07, 2025
Response Filed
Feb 24, 2026
Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12566786: NATURAL LANGUAGE PROCESSING WORKFLOW FOR RESPONDING TO CLIENT QUERIES (granted Mar 03, 2026; 2y 5m to grant)
Patent 12566726: ENABLING EXCLUSION OF ASSETS IN IMAGE BACKUPS (granted Mar 03, 2026; 2y 5m to grant)
Patent 12555109: DETERMINISTIC CONCURRENCY CONTROL FOR PRIVATE BLOCKCHAINS (granted Feb 17, 2026; 2y 5m to grant)
Patent 12547602: LOG ENTRY REPRESENTATION OF DATABASE CATALOG (granted Feb 10, 2026; 2y 5m to grant)
Patent 12517948: INFORMATION PROCESSING METHOD AND DEVICE FOR SORTING MUSIC IN A PLAYLIST (granted Jan 06, 2026; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 58%
With Interview: 90% (+32.3%)
Median Time to Grant: 3y 5m
PTA Risk: Moderate

Based on 296 resolved cases by this examiner. Grant probability derived from career allow rate.
