DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
This Office Action is issued in response to Applicant’s Communication amending application S/N 18/060,749, filed on November 7, 2025. Claims 1 to 18, 20, and 21 are currently pending in the application.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1 to 3, 5, 7 to 10, 12, 14 to 17, 20, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over NATSUI (U.S. Publication No. 2023/0186092), in view of Zhang (U.S. Patent No. 11,182,691), and further in view of HAO et al. (U.S. Publication No. 2023/0141749), hereinafter Hao.
As to claim 1:
Natsui discloses:
A system, comprising: one or more processors; and a memory in communication with the one or more processors and storing instructions that, when executed by the one or more processors, are configured to cause the system to:
receive user input comprising a number of machine learning base models to generate [Paragraph 0027 teaches receive input made by a user; Paragraph 0111 teaches receive user input of a desired number of parameters of machine learning models to generate];
generate a plurality of machine learning base models corresponding to the number based on the user input [Paragraph 0039 teaches generates plural untrained models; Paragraph 0111 teaches model generation unit may generate untrained models of the number of parameters for which the user input has been received];
determine a chunk for a machine learning base model of the plurality of machine learning base models [Paragraph 0104 teaches storing pieces of training data formed of input data and correct classification results as a training dataset, and training the untrained model using the training dataset, in other words, determining a chunk, which could comprise all the training dataset], and
train the machine learning base model with the chunk [Paragraph 0103 teaches training the learning model; Paragraph 0104 teaches training the untrained model using the training dataset]; and
validate the plurality of machine learning base models using the testing data [Paragraph 0105 teaches evaluating the accuracy of the model trained by the training unit; Paragraph 0139 teaches the evaluation data may be data with correct answers that have not been used for training, for example, validation data].
Natsui does not appear to expressly disclose receive a first dataset; store a minority portion of the first dataset as testing data and a remaining portion of the first dataset as training data; separate the training data into majority cases and minority cases; iteratively for each machine learning base model of the plurality of machine learning base models until all machine learning base models of the plurality of machine learning base models are trained: determine a chunk for a machine learning base model, wherein the chunk comprises all minority cases from the training data and a subset of majority cases from the training data that is unique from other chunks associated with other machine learning base models of the plurality of machine learning base models.
Zhang discloses:
receive a first dataset [Column 39, lines 46 to 49 teaches a chunked data set that will be processed with filtering and splitting operations, therefore, the dataset must be received];
store a minority portion of the first dataset as testing data and a remaining portion of the first dataset as training data [Column 39, lines 58 to 62 teach a split operation applied to the dataset, where 70% of the chunks are placed in a training set, and 30% of the chunks are placed in a test set];
separate the training data into majority cases and minority cases [Column 54, lines 39 to 42 teach records of a raw training data set are classified into a majority category and two minority categories, hence, separating the training data into majority and minority cases].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references and modify the invention as taught by Natsui to receive a first dataset; store a minority portion of the first dataset as testing data and a remaining portion of the first dataset as training data; and separate the training data into majority cases and minority cases, as taught by Zhang [Columns 39, 54], because both applications are directed to the generation and training of machine learning models; filtering and splitting the datasets into majority and minority cases provides improvements in prediction accuracy, training time, and run-time performance (see Zhang [Col. 56, line 59 - Col. 57, line 5]).
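For illustration of the combined Natsui/Zhang teaching discussed above, the dataset handling (holding out a minority portion, e.g. 30%, of the data for testing and separating the remaining training data into majority and minority cases) can be sketched as follows. This sketch is provided for clarity only and is not part of the cited disclosures; the function and field names (`split_and_separate`, `"label"`) are hypothetical.

```python
import random

def split_and_separate(dataset, minority_label, test_fraction=0.3, seed=0):
    """Hold out a minority portion of the dataset as testing data, then
    separate the remaining training data into majority and minority cases.

    Hypothetical sketch: each record is assumed to be a dict with a
    "label" field; records matching `minority_label` are minority cases.
    """
    rng = random.Random(seed)
    records = list(dataset)
    rng.shuffle(records)
    n_test = int(len(records) * test_fraction)
    testing_data = records[:n_test]      # e.g., 30% held out as testing data
    training_data = records[n_test:]     # remaining 70% used as training data
    minority_cases = [r for r in training_data if r["label"] == minority_label]
    majority_cases = [r for r in training_data if r["label"] != minority_label]
    return testing_data, majority_cases, minority_cases
```

With a 10-record dataset and `test_fraction=0.3`, three records are held out for testing and the remaining seven are partitioned by class, mirroring the 70/30 split cited from Zhang at Column 39.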
Neither Natsui nor Zhang appear to expressly disclose iteratively for each machine learning base model of the plurality of machine learning base models until all machine learning base models of the plurality of machine learning base models are trained: determine a chunk for a machine learning base model, wherein the chunk comprises all minority cases from the training data and a subset of majority cases from the training data that is unique from other chunks associated with other machine learning base models of the plurality of machine learning base models.
Hao discloses:
iteratively for each machine learning base model of the plurality of machine learning base models until all machine learning base models of the plurality of machine learning base models are trained [Paragraph 0112 teaches selecting subsets of samples in the training set for each base classifier when training the initial base classification models; Paragraph 0115 teaches each base classification model uses different training data, therefore, iteratively until all base models are trained; Fig. 10, training all the models by selecting subsets of healthy data (majority cases) and all erroneous data (minority cases), until model n is trained, hence, until all models are trained]:
determine a chunk for a machine learning base model, wherein the chunk comprises all minority cases from the training data and a subset of majority cases from the training data that is unique from other chunks associated with other machine learning base models of the plurality of machine learning base models [Paragraph 0103 teaches each base classification model is trained by using different subsets of healthy data (majority cases); Paragraph 0117 teaches constructing a training set for each LSTM model with a method of majority-class under-sampling, and performing initial training, by selecting a part of the majority-class (i.e. healthy data) samples and all the minority-class (i.e. erroneous data) samples as a training set; Paragraph 0134 teaches each base classification model is an initial base classification model that is obtained by training using all of erroneous data in the historical SMART data of the plurality of storage devices (minority class) and a first subset of healthy data in the historical SMART data (majority class), wherein the healthy data in the historical SMART data is divided into a plurality of first subsets, wherein the plurality of first subsets do not cross or overlap each other, therefore, where the chunks comprise all minority cases and a subset of majority cases that is unique from other chunks].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references and modify the invention as taught by Natsui to, iteratively for each machine learning base model of the plurality of machine learning base models until all machine learning base models of the plurality of machine learning base models are trained, determine a chunk for a machine learning base model, wherein the chunk comprises all minority cases from the training data and a subset of majority cases from the training data that is unique from other chunks associated with other machine learning base models of the plurality of machine learning base models, as taught by Hao [Paragraphs 0103, 0112, 0115, 0117, 0134], because both applications are directed to the generation and training of machine learning models; selecting a subset of healthy data (majority cases) and all of the erroneous data (minority cases) in a sampling method ensures a difference of training data while alleviating the problem of unbalanced sample proportions between the majority and minority categories (see Hao, Paragraph [0117]).
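The chunk-construction scheme attributed to Hao above (dividing the majority cases into non-overlapping subsets and pairing each subset with all minority cases to form one unique training chunk per base model) can be illustrated with the following hypothetical sketch. It is provided for clarity only and is not part of the cited disclosures; the names (`make_chunks`, `train_ensemble`, `fit`) are illustrative.

```python
def make_chunks(majority_cases, minority_cases, num_models):
    """Divide the majority cases into non-overlapping subsets and pair each
    subset with all minority cases, yielding one unique chunk per model."""
    chunks = []
    for i in range(num_models):
        # Striding by num_models assigns every i-th majority case to model i,
        # so the majority subsets do not cross or overlap each other.
        majority_subset = majority_cases[i::num_models]
        chunks.append(list(minority_cases) + majority_subset)
    return chunks

def train_ensemble(models, majority_cases, minority_cases):
    """Iteratively train each base model with its own chunk until all are trained."""
    chunks = make_chunks(majority_cases, minority_cases, len(models))
    for model, chunk in zip(models, chunks):
        model.fit(chunk)  # hypothetical training interface for a base model
```

Because the striding partition covers every majority case exactly once, each majority case is incorporated into exactly one chunk, consistent with the property recited in claim 21 that no majority case is excluded from training at least one base model.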
As to claim 2:
Natsui as modified by Zhang discloses:
wherein each chunk comprises no more than 50% minority cases [Column 56, lines 26 to 33 teach an example of sample ratios includes a 33% sample of minority category 4014A, and a 50% sample of minority category 4014B, therefore, no more than 50% minority cases].
As to claim 3:
Natsui as modified by Zhang discloses:
wherein the minority portion comprises 10 to 30% of the first dataset [Column 39, lines 58 to 62 teach a split operation applied to the dataset, where 70% of the chunks are placed in a training set, and 30% of the chunks are placed in a test set].
As to claim 5:
Natsui as modified by Zhang discloses:
wherein each machine learning base model comprises a logistic regression model, a gradient boosted tree method model, a k-nearest neighbor model, or combinations thereof [Column 59, lines 24 to 26 teach model types include regression models, etc.].
As to claim 7:
Natsui as modified by Zhang discloses:
wherein determining the chunk for a machine learning base model of the plurality of machine learning base models is conducted dynamically at runtime [Column 27, lines 31 to 49 teach run-time recipe manager may retrieve the executable version of R1, perform a set of run-time validations, and schedule the execution of the transformation operations of R1 at respective resource sets 1175A and 1175B, where respective outputs 1185A and 1185B may be produced by the application of the recipe R1 on input datasets 1 and 2, and where the outputs may represent data that is to be used as input for a model, in other words, the determining of the chunk is conducted dynamically at runtime].
As to claim 21:
Natsui as modified by Hao discloses:
wherein each majority case of the majority cases of training data is incorporated into at least one subset of majority cases associated with a chunk such that none of the majority cases of training data are excluded from training at least one machine learning base model of the plurality of machine learning base models [Hao - Paragraph 0112 teaches subset of the majority class samples is selected for each base classifier through an integrated strategy in order to use all sample information in the training set].
Same rationale applies to claims 8 to 10, 12, 14 to 17, and 20, since they recite similar limitations.
Claims 4, 6, 11, 16, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over NATSUI (U.S. Publication No. 2023/0186092), in view of Zhang (U.S. Patent No. 11,182,691), in view of HAO et al. (U.S. Publication No. 2023/0141749), hereinafter Hao, and further in view of Merrill et al. (U.S. Publication No. 2019/0043070), hereinafter Merrill.
As to claim 4:
Natsui discloses all the limitations as set forth in claim 1 above, but does not appear to expressly disclose wherein each machine learning base model comprises a gradient boosted tree method model.
Merrill discloses:
wherein each machine learning base model comprises a gradient boosted tree method model [Paragraph 0158 teaches using tree-based methods such as gradient boosted trees].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Natsui, by incorporating gradient boosted tree method model, as taught by Merrill [Paragraph 0158], because the applications are directed to generation and training of machine learning models; using a gradient boosted tree method model improves accuracy of the models.
As to claim 6:
Natsui as modified by Merrill discloses:
the user input further comprises a selection of a logistic regression model, a gradient boosted tree method model, or a k-nearest neighbor model [Paragraph 0252 teaches receives user-selection of a model type and generates the at least one instruction that defines a model type of the protected class model based on the received user-selection; Paragraph 0298 teaches the tree model is a gradient boosted tree].
Same rationale applies to claims 11, 16, and 18, since they recite similar limitations.
Response to Arguments
This is in response to arguments filed on November 7, 2025. Applicant’s arguments have been fully and respectfully considered, but are moot in view of the new grounds of rejection necessitated by the amendments.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RAQUEL PEREZ-ARROYO whose telephone number is (571)272-8969. The examiner can normally be reached Monday - Friday, 8:00am - 5:30pm, Alt Friday, EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sherief Badawi can be reached at 571-272-9782. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/RAQUEL PEREZ-ARROYO/Primary Examiner, Art Unit 2169