Prosecution Insights
Last updated: April 19, 2026
Application No. 17/977,941

MACHINE LEARNING ALGORITHM SELECTION

Non-Final OA: §101, §103
Filed: Oct 31, 2022
Examiner: HWANG, MEGAN ELIZABETH
Art Unit: 2143
Tech Center: 2100 — Computer Architecture & Software
Assignee: Fujitsu Limited
OA Round: 1 (Non-Final)
Grant Probability: 47% (Moderate)
OA Rounds: 1-2
To Grant: 3y 0m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 47% (grants 47% of resolved cases; 9 granted / 19 resolved; -7.6% vs TC avg)
Interview Lift: +60.2% (strong; resolved cases with an interview vs. without)
Avg Prosecution: 3y 0m (typical timeline)
Currently Pending: 25
Total Applications: 44 (career history, across all art units)

Statute-Specific Performance

§101: 34.9% (-5.1% vs TC avg)
§103: 41.0% (+1.0% vs TC avg)
§102: 7.4% (-32.6% vs TC avg)
§112: 15.3% (-24.7% vs TC avg)
Tech Center averages are estimates. Based on career data from 19 resolved cases.

Office Action

§101, §103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. Claims 1-20 are presented for examination.

Drawings

The drawings are objected to as failing to comply with 37 CFR 1.84(p)(4) because reference character “126” has been used to designate both “Evaluating Data Set” and “ML Algorithms”. Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

The drawings are objected to as failing to comply with 37 CFR 1.84(p)(4) because reference characters "714" and "716" have both been used to designate the “communication unit” in the drawings and specification respectively. Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action.
The objection to the drawings will not be held in abeyance.

Specification

The disclosure is objected to because of the following informalities: In Paragraph [0023], “The operation 110 may generated a processed data set 112” should read “The operation 110 may generate a processed data set 112”. In Paragraphs [0030] and [0068], “an equal or approximately percentage” should read “an equal or approximate percentage”. Appropriate correction is required.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Independent Claims

Step 1 – Claim 1 is drawn to a method, claim 11 is drawn to one or more non-transitory computer-readable storage media, and claim 12 is drawn to a system. Therefore, each of these claims falls under one of the four categories of statutory subject matter (process/method, machine/product/apparatus, manufacture or composition of matter).

Step 2A Prong 1 – Claims 1, 11 and 12 are directed to a judicially recognized exception of an abstract idea without significantly more. Claims 1, 11 and 12 recite:

binning the data entries into a plurality of data bins based on values of the target variable – This limitation is directed towards the abstract idea of a mathematical relationship, specifically organizing information and manipulating information through mathematical correlations (see MPEP § 2106.04(a)(2), section I, A). In Paragraph [0025] of the specification, it states “selecting the subset of the data entries may include binning the data entries into multiple data bins based on values in the target variable of the data entries.
In these and other embodiments, the operation may bin the data entries based on an obtained number of bins 114.” Additionally, Figure 3 provides a visual representation of the claimed binned data entries, showing the distribution of data entries organized in a histogram. BRI in light of the specification would support that “binning the data entries based on values of the target variable” would encompass defining a series of intervals to organize data into a quantized representation of the distribution of values and fall under the mathematical concepts grouping.

selecting a subset of the binned data entries from each of the plurality of data bins as the subset of the data entries – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C). In Paragraph [0026] of the specification, it states “a subset of the binned data entries from each of the multiple data bins may be selected as the data entries in the sampled data set 122. In these and other embodiments, a number of the data entries selected from each of the data bins may be based on a sampling ratio 116. For example, for a sampling ratio 116 of 0.3, three of every ten data entries from a data bin may be selected for inclusion in the sampled data set 122.” BRI in light of the specification would support that “selecting a subset of binned data entries” would encompass utilizing a sampling ratio to isolate a representative subset for each bin and fall under the mathematical concepts grouping.

selecting one of the plurality of machine learning models based on an evaluation of the plurality of machine learning models – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C). In Paragraph [0040] of the specification, it states “the scoring algorithm may use two inputs to generate a score for each of the machine learning models 132.
The scores of the machine learning models 132 may allow the machine learning models 132 to be evaluated. In these and other embodiments, the machine learning models 132 may be ranked according to the scores with the higher-ranking machine learning models 132 indicating that these machine learning models 132 more accurately predicted values for the target features than other of the machine learning models 132.” BRI in light of the specification would support that “selecting one of the plurality of machine learning models based on an evaluation” would encompass applying a scoring algorithm to determine a highest performing model and fall under the mathematical concepts grouping.

Step 2A Prong 2 – The following additional limitations, as recited, do not integrate the abstract idea into a practical application:

obtaining a dataset that includes a plurality of data entries, each of the data entries including a plurality of features and at least one of the plurality of features is designated as a target variable – This limitation recites an insignificant extra-solution activity of mere data gathering (see MPEP § 2106.05(g)) and thus, fails to integrate the exception into a practical application.

constructing a plurality of machine learning models using the subset of the data entries – This limitation merely recites the idea of constructing a plurality of models and fails to recite details of how the constructing is accomplished. Reciting the idea of a solution or outcome without detailing how the result is accomplished is equivalent to saying "apply it" (see MPEP § 2106.05(f)) and thus, fails to integrate the exception into a practical application.
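For reference, the binning-and-sampling flow the examiner maps to mathematical concepts above can be sketched in code. The bin count, equal-width binning, and 0.3 sampling ratio mirror the examples quoted from Paragraphs [0025]-[0026]; the function names and the NumPy implementation are illustrative assumptions, not the applicant's actual method:

```python
import numpy as np

def bin_entries(targets, n_bins):
    """Assign each entry a bin index 0..n_bins-1 using equal-width bins
    spanning the range of the target variable (cf. Paragraph [0025])."""
    edges = np.linspace(targets.min(), targets.max(), n_bins + 1)
    # Interior edges only, so indices run 0..n_bins-1.
    return np.digitize(targets, edges[1:-1])

def sample_subset(targets, n_bins, sampling_ratio, seed=0):
    """Select roughly sampling_ratio of the entries from every bin, so each
    bin of the target distribution stays represented (cf. Paragraph [0026])."""
    rng = np.random.default_rng(seed)
    bins = bin_entries(targets, n_bins)
    chosen = []
    for b in range(n_bins):
        idx = np.flatnonzero(bins == b)
        k = int(round(sampling_ratio * len(idx)))
        if k:
            chosen.extend(rng.choice(idx, size=k, replace=False))
    return np.sort(np.asarray(chosen, dtype=int))
```

With 100 entries spread over 5 bins and a ratio of 0.3, this keeps 6 of the 20 entries in each bin, i.e. "three of every ten" as in the specification's example.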
Claims 11 and 12: one or more non-transitory computer-readable storage media configured to store instructions; and one or more processors configured to execute the instructions to cause the system to perform operations – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It recites a generic computer or generic computer components that merely act as a tool on which the method operates.

Step 2B – The additional elements in Step 2A Prong 2, viewed individually or holistically, do not provide an inventive concept or otherwise amount to significantly more than the abstract idea itself.

obtaining a dataset that includes a plurality of data entries, each of the data entries including a plurality of features and at least one of the plurality of features is designated as a target variable – This limitation recites an insignificant extra-solution activity of mere data gathering (see MPEP § 2106.05(g)), which is well-understood, routine and conventional activity similar to cases reviewed by the courts involving receiving or transmitting data over a network (see MPEP § 2106.05(d)(II)) and thus, fails to provide significantly more to the judicial exception.

constructing a plurality of machine learning models using the subset of the data entries – This limitation merely recites the idea of constructing a plurality of models and fails to recite details of how the constructing is accomplished. Reciting the idea of a solution or outcome without detailing how the result is accomplished is equivalent to saying "apply it" (see MPEP § 2106.05(f)) and thus, fails to provide significantly more to the judicial exception.
Claims 11 and 12: one or more non-transitory computer-readable storage media configured to store instructions; and one or more processors configured to execute the instructions to cause the system to perform operations – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. As such, claims 1, 11 and 12 are not patent eligible.

Dependent Claims

Claims 2-10 and 13-20 merely narrow the previously cited abstract idea limitations. For the reasons described above with respect to independent claims 1 and 12, these judicial exceptions are not meaningfully integrated into a practical application, nor amount to significantly more than the abstract idea itself. The claims disclose similar limitations described for the independent claims above and do not provide anything more than the mathematical concepts that are achievable through mathematical computation. Therefore claims 2-10 and 13-20 also recite abstract ideas that do not integrate into a practical application or amount to significantly more than the judicial exception, and are rejected under 35 U.S.C. § 101.

Step 1 – Claims 2-10 are drawn to a method and claims 13-20 are drawn to a system. Therefore, each of these claims falls under one of the four categories of statutory subject matter (process/method, machine/product/apparatus, manufacture or composition of matter).

Step 2A Prong 1 – These claims are directed to a judicially recognized exception of an abstract idea without significantly more.

Claims 2 and 13: predict values of the target variable – This limitation is directed towards the abstract idea of a mental process, or a concept that can be performed in the human mind, including observation, evaluation, judgement or opinion (see MPEP § 2106.04(a)(2), subsection III, C).
In Paragraph [0022] of the specification, it states “the target variables 104 may be any feature for which a machine learning model may predict a value based on other values of features. As an example, the target variable 104 may be a current valuation of real estate. In these and other embodiments, a machine learning model may be trained to predict the current valuation of a piece of real estate based on values from other features about the piece of real estate provided to the machine learning model.” BRI in light of the specification would support that “predict[ing] values of the target variable” would encompass a mental process with or without the assistance of pen and paper of estimating based on known information.

Claims 3 and 14: binning the data entries into a plurality of second data bins based on values of the second target variable, wherein a first bin of the plurality of data bins corresponds to a second bin of the plurality of second data bins – This limitation is directed towards the abstract idea of a mathematical relationship, specifically organizing information and manipulating information through mathematical correlations (see MPEP § 2106.04(a)(2), section I, A). In Paragraph [0064] of the specification, it states “to determine data bin intersection, the data entries that are in corresponding data bins of each data bin set may be identified as union data entries. Thus, for a data entry to be identified as a union data entry, the data entry may be binned in corresponding data bins and not binned in non-corresponding data bins.
Thus, if a data entry is binned in data bins in different data bin sets and the data bins do not correspond, the data entry may not be identified as a union data entry.” BRI in light of the specification would support that “binning the data entries into two different sets of data bins based on different target variables wherein a bin in the first set of bins corresponds to a bin in the second set of bins” would encompass organizing multi-variable data entries into two separate but corresponding quantized distributions and fall under the mathematical concepts grouping.

designating data entries as union data entries in response to the data entries including a first value in the first bin and a second value in the second bin – This limitation is directed towards the abstract idea of a mental process, or a concept that can be performed in the human mind, including observation, evaluation, judgement or opinion (see MPEP § 2106.04(a)(2), subsection III, C). In Paragraph [0065] of the specification, it states “For example, a data entry in a first data bin of a first data bin set and in a first data bin of a second data bin set where the first data bins correspond, the data entry may be identified as a union data entry. In contrast, a data entry in the first data bin of the first data bin set and in second data bin of a second data bin where the first data bin does not correspond to the second data bin, the data entry may not be identified as a union data entry.” BRI in light of the specification would support that “designating data entries as union data entries” would encompass a mental process with or without the assistance of pen and paper of evaluating if a data entry includes values in corresponding bins. For example, if one target variable is age with a bin of 18-24 and another target variable is net worth with a corresponding bin of 0-50k, the human mind is capable of identifying the people for which ranges apply.
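The union-data-entry designation of claims 3 and 14 can be sketched as follows; equal-width binning and the "corresponding bin" test (same bin index in both bin sets) follow the examples quoted from Paragraphs [0064]-[0065], while the helper names are illustrative assumptions:

```python
import numpy as np

def bin_values(values, n_bins):
    """Equal-width binning over the range of one target variable;
    returns a bin index 0..n_bins-1 per entry."""
    edges = np.linspace(min(values), max(values), n_bins + 1)
    return np.digitize(values, edges[1:-1])

def union_entries(target_a, target_b, n_bins):
    """Designate an entry a union data entry when its value for each
    target variable lands in corresponding bins of the two bin sets
    (here: the same bin index under each binning)."""
    bins_a = bin_values(target_a, n_bins)
    bins_b = bin_values(target_b, n_bins)
    return [i for i, (a, b) in enumerate(zip(bins_a, bins_b)) if a == b]
```

Echoing the examiner's age/net-worth example, an entry counts only if its age bin and its net-worth bin correspond.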
wherein the selecting a subset of the binned data entries from each of the plurality of data bins as the subset of the data entries includes selecting a subset of the binned union data entries from each of the plurality of data bins as the subset of the data entries – This limitation is directed towards the abstract idea of a mental process, or a concept that can be performed in the human mind, including observation, evaluation, judgement or opinion (see MPEP § 2106.04(a)(2), subsection III, C). In Paragraph [0067] of the specification, it states “after identifying the union data entries in each of the data bins of the data bin sets, union data bin entries may be selected from each group of corresponding data bins from the data bin sets as described with respect to block 212. For example, the number of union data entries selected from each group of corresponding data bins may be based on the sampling ratio. For example, if there are ten union data entries in the corresponding group of first data bins, then the number of union data entries selected from the corresponding group of first data bins based on the sampling ratio of 3/10 may be three. The selected union data entries may be the selected data set resulting from block 212.” BRI in light of the specification would support that “selecting a subset of the binned union data entries from each plurality of data bins as the subset of data entries” would encompass a mental process with or without the assistance of pen and paper of only evaluating data that fulfills certain criteria.

Claims 4 and 15: evaluating the plurality of machine learning models using the subset of data entries and a scoring algorithm – This limitation is directed towards the abstract idea of a mathematical formula (see MPEP § 2106.04(a)(2), section I, B). In Paragraph [0045] of the specification, it states “For example, the score for a machine learning model may be represented by the following equations: [Equations].
In these equations, yi may be a data entry in the evaluating data set 126, y may be the calculated value of the target variable for the data entry, Bi may be the bin number associated with the data entry when calculating the bin error distance and values in the bins when comparing to the calculated value y, D(yi) may be a density of the bin Bi of the data entry yi, where the density may be the second score discussed above, B may be the bin number of the calculated value y, ||Bi - B|| may be the bin error distance, Ex(yi) may be an individual score assigned to each of the data entries yi, Γ may be the number of data entries in the evaluating data set 126, and M may be a score for the machine learning model.” BRI in light of the specification would support that “evaluating the plurality of machine learning models using the subset of data entries and a scoring algorithm” would encompass evaluating performance based on a calculated score and fall under the mathematical concepts grouping.

wherein an input to the scoring algorithm is a bin error distance representing a number of bins between bins of actual values of the target variable of the subset of the data entries and bins of generated values of the target variable generated by the plurality of machine learning models using values in the other plurality of features from the subset of the data entries – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C). In Paragraph [0041] of the specification, it states “In some embodiments, a first input to the scoring algorithm may be a bin error distance. A bin error distance may be calculated for each of the data entries in the evaluating data set 126.
A bin error distance may represent a number of data bins between a data bin of the actual value of the target variables 104 as assigned during operation 120 and a data bin of a calculated value of the target variables 104.” BRI in light of the specification would support that “input[ting] a bin error distance” would encompass a mathematical computation and fall under the mathematical concepts grouping.

Claims 5 and 16: wherein a second input to the scoring algorithm is based on a value assigned to the bins based on a probability density function applied to the subset of the data entries – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C). In Paragraph [0042] of the specification, it states “In some embodiments, a second input to the scoring algorithm may be a value assigned to the data bins of the actual values of the target variables 104. In these and other embodiments, the value assigned may be based on a probability density function applied to values of the target variables 104 of the evaluating data set 126 based on the binning of the data entries during operation 120.” BRI in light of the specification would support that “input[ting] a value assigned to the bins based on a probability density function” would encompass a mathematical function and fall under the mathematical concepts grouping.

Claims 6 and 17: wherein a second input to the scoring algorithm is based on a value assigned to the bins based on a number of the subset of the data entries that include values in each bin – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C). In Paragraph [0043] of the specification, it states “Alternately or additionally, the value assigned may be based on a number of the data entries in each data bin as assigned during operation 120.
In these and other embodiments, the second score for a data entry may be the number of data entries in the data bin that includes the actual value of the target variables 104 of the data entry.” BRI in light of the specification would support that “input[ting] a value assigned to the bins based on a number of the subset of data entries that include values in each bin” would encompass utilizing the size of the bin in a scoring algorithm and fall under the mathematical concepts grouping.

Claims 8 and 19: wherein one of the plurality of machine learning models constructed using the subset of the data entries includes outputs that are a mathematical combination of outputs from a plurality of different machine learning models that are each generated using a different machine learning algorithm and the subset of the data entries – This limitation is directed towards the abstract idea of a mathematical calculation (see MPEP § 2106.04(a)(2), section I, C). In Paragraph [0074] of the specification, it states “In some embodiments, the estimation module 420 may be configured to mathematically combine the values from each of the models 410 to generate an output 430. The estimation module 420 may mathematically combine the values based on mean, median, weighted mean, or some other mathematical combination of the values. Alternately or additionally, the estimation module 420 may select from among the values from the models 410 or select from among the values and mathematically combine the selected values to generate the output 430.” BRI in light of the specification would support that “constructing a machine learning model that includes outputs that are a mathematical combination of outputs of multiple different machine learning models” would encompass a mathematical computation and fall under the mathematical concepts grouping.

Step 2A Prong 2 – These limitations do not recite any additional elements which integrate the abstract idea into a practical application.
Claims 2 and 13: training, using the dataset, a particular machine learning model following a type of construction used to construct the selected one of the plurality of machine learning models – This limitation merely recites the idea of training the selected type of machine learning model and fails to recite details of how the training is accomplished. Reciting the idea of a solution or outcome without detailing how the result is accomplished is equivalent to saying "apply it" (see MPEP § 2106.05(f)) and thus, fails to integrate the exception into a practical application.

applying data to the particular machine learning model to predict values of the target variable – This limitation merely recites the idea of “applying” data to implement the abstract idea of predicting values of the target variable on a computer (see MPEP § 2106.05(f)) and thus, fails to integrate the exception into a practical application.

Claims 3 and 14: wherein the target variable is a first target variable and another of the plurality of variables is designated as a second target variable – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to predicting multiple variables and thus, fails to integrate the exception into a practical application.
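For reference, the two-input scoring algorithm characterized above under claims 4-6 (a bin error distance weighted by a bin value such as density or bin count) and the model selection of claim 1 can be sketched in code. The Office Action omits the specification's actual equations ("[Equations]"), so the exact weighting and aggregation below, density of the actual bin times bin error distance, averaged over entries with lower being better, are assumptions, not the claimed formula:

```python
import numpy as np

def bin_of(value, inner_edges):
    """Bin index of a single value given the interior bin edges."""
    return int(np.digitize(value, inner_edges))

def score_model(actual, predicted, inner_edges, density):
    """Assumed score M: average, over evaluation entries, of the density
    of the actual value's bin times the bin error distance ||Bi - B||.
    Lower means the model's predictions land in or near the right bins."""
    per_entry = [
        density[bin_of(y, inner_edges)]
        * abs(bin_of(y, inner_edges) - bin_of(p, inner_edges))
        for y, p in zip(actual, predicted)
    ]
    return sum(per_entry) / len(per_entry)

def select_model(actual, predictions, inner_edges, density):
    """Pick the model whose predictions earn the best (lowest) score."""
    return min(predictions, key=lambda name: score_model(
        actual, predictions[name], inner_edges, density))
```

A model that predicts values in the same bins as the actual values scores zero; misses are penalized more when they fall in, or miss, dense bins.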
Claims 7 and 18: wherein first data entries in the subset of the data entries are used in the construction of the plurality of machine learning models and second data entries in the subset of the data entries are used in the evaluation of the plurality of machine learning models, wherein the first data entries and the second data entries are selected from among the subset of the data entries based on the data bins into which the first data entries and the second data entries are binned – This limitation recites the insignificant extra-solution activity of selecting a particular data source or type of data to be manipulated (see MPEP § 2106.05(g)) and thus, fails to integrate the exception into a practical application.

Claims 9 and 20: wherein at least a subset of the plurality of machine learning models are each constructed using a different one of a plurality of machine learning algorithms – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to multiple different machine learning algorithms and thus, fails to integrate the exception into a practical application.

Claim 10: wherein the plurality of machine learning algorithms are selected based on the dataset and the target variable – This limitation merely recites the idea of selecting the plurality of machine learning algorithms and fails to recite details of how the dataset and the target variable are used to select the algorithms. Reciting the idea of a solution or outcome without detailing how the result is accomplished is equivalent to saying "apply it" (see MPEP § 2106.05(f)) and thus, fails to integrate the exception into a practical application.

Step 2B – These limitations, as a whole, do not amount to significantly more than the judicial exception.
Claims 2 and 13: training, using the dataset, a particular machine learning model following a type of construction used to construct the selected one of the plurality of machine learning models – This limitation merely recites the idea of training the selected type of machine learning model and fails to recite details of how the training is accomplished. Reciting the idea of a solution or outcome without detailing how the result is accomplished is equivalent to saying "apply it" (see MPEP § 2106.05(f)) and thus, fails to provide significantly more to the judicial exception.

applying data to the particular machine learning model to predict values of the target variable – This limitation merely recites the idea of “applying” data to implement the abstract idea of predicting values of the target variable on a computer (see MPEP § 2106.05(f)) and thus, fails to provide significantly more to the judicial exception.

Claims 3 and 14: wherein the target variable is a first target variable and another of the plurality of variables is designated as a second target variable – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to predicting multiple variables and thus, fails to provide significantly more to the judicial exception.
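The bin-aware construction/evaluation split that the examiner addresses under claims 7 and 18 (in Step 2A Prong 2 above and in Step 2B next) can be sketched as follows. The 70/30 ratio and the per-bin shuffling are illustrative assumptions; the claim only requires that both sets be selected based on the data bins:

```python
import numpy as np

def split_by_bins(bins, train_ratio=0.7, seed=0):
    """Split entry indices into a construction (training) set and an
    evaluation set bin by bin, so every data bin is represented in both
    sets. `bins` holds one bin index per entry."""
    rng = np.random.default_rng(seed)
    train, test = [], []
    for b in np.unique(bins):
        idx = rng.permutation(np.flatnonzero(bins == b))
        cut = max(1, int(train_ratio * len(idx)))  # keep at least one per bin
        train.extend(idx[:cut].tolist())
        test.extend(idx[cut:].tolist())
    return sorted(train), sorted(test)
```

Splitting inside each bin, rather than over the pooled data, keeps both sets representative of the full target distribution.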
Claims 7 and 18: wherein first data entries in the subset of the data entries are used in the construction of the plurality of machine learning models and second data entries in the subset of the data entries are used in the evaluation of the plurality of machine learning models, wherein the first data entries and the second data entries are selected from among the subset of the data entries based on the data bins into which the first data entries and the second data entries are binned – This limitation recites the well-understood, routine, conventional activity of utilizing separate training and testing sets (see MPEP § 2106.05(d)) and thus, fails to provide significantly more to the judicial exception.

Claims 9 and 20: wherein at least a subset of the plurality of machine learning models are each constructed using a different one of a plurality of machine learning algorithms – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to multiple different machine learning algorithms and thus, fails to provide significantly more to the judicial exception.

Claim 10: wherein the plurality of machine learning algorithms are selected based on the dataset and the target variable – This limitation merely recites the idea of selecting the plurality of machine learning algorithms and fails to recite details of how the dataset and the target variable are used to select the algorithms. Reciting the idea of a solution or outcome without detailing how the result is accomplished is equivalent to saying "apply it" (see MPEP § 2106.05(f)) and thus, fails to provide significantly more to the judicial exception.

As such, claims 2-10 and 13-20 are not patent eligible.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C.
102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 8-13 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Wujek et al. (“Best Practices for Machine Learning Applications”, published 2016), hereinafter Wujek; in view of Rendleman et al. (“Representative random sampling: an empirical evaluation of a novel bin stratification method for model performance estimation”, published 10/27/2022), hereinafter Rendleman.

Regarding Claim 1, Wujek teaches A method of machine learning algorithm selection (Wujek: “Machine learning typically involves training a succession of candidate configurations toward selecting a final model” [Page 12, Para. 3]), the method comprising: obtaining a dataset that includes a plurality of data entries, each of the data entries including a plurality of features and at least one of the plurality of features is designated as a target variable (Wujek: “The original data set presented to you might be in a form that is ill-suited for applying machine learning algorithms to build models or identify patterns.
If the data are unstructured or semi-structured (such as text from logs or information in XML format), you need to transform the data set in some way to produce a structured data set that is more suitable for modeling. But even if you have structured data, you should ensure that the rows represent what you consider to be observations and that the columns represent feature or target variables.” [Page 5, Para. 3]); selecting data entries, the selecting including: binning the data entries into a plurality of data bins based on values of the target variable (Wujek: “Binning is the process of discretizing numerical variables into fewer categorical counterparts. For example, “age” variables are often binned into categories such as 20–39, 40–59, and 60–79. Building a model against each individual age probably does not provide any more information for a model than building it against age groups; binning reduces the complexity of mapping the feature values to the response.” [Page 7, Para. 5]); constructing a plurality of machine learning models using the data entries (Wujek: “Machine learning typically involves training a succession of candidate configurations toward selecting a final model, and the error of each candidate must be assessed at every iteration of the training process.” [Page 12, Para. 3]); and selecting one of the plurality of machine learning models based on an evaluation of the plurality of machine learning models (Wujek: “Machine learning typically involves training a succession of candidate configurations toward selecting a final model, and the error of each candidate must be assessed at every iteration of the training process.” [Page 12, Para. 3]). However, Wujek fails to expressly disclose selecting a subset of the data entries, the selecting including: selecting a subset of the binned data entries from each of the plurality of data bins as the subset of the data entries. 
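As context for the binning and subset-selection limitations mapped above, the Wujek binning step and the Rendleman-style bin-stratified selection can be sketched together in a few lines of Python. This is an illustrative sketch only, not code from the application or the cited references; the `stratified_subset` helper, its equal-width bins, and the sample data are all assumptions.

```python
import random

# Illustrative sketch (not from the record): bin data entries on the target
# variable, then select a subset by drawing entries from each bin, in the
# spirit of the bin-stratified selection the rejection maps to Rendleman.
def stratified_subset(entries, target_key, n_bins=3, per_bin=1, seed=0):
    values = [e[target_key] for e in entries]
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0
    bins = {}
    for e in entries:
        # Clamp so the maximum target value falls in the last bin.
        idx = min(int((e[target_key] - lo) / width), n_bins - 1)
        bins.setdefault(idx, []).append(e)
    rng = random.Random(seed)
    subset = []
    for idx in sorted(bins):
        subset.extend(rng.sample(bins[idx], min(per_bin, len(bins[idx]))))
    return subset

# Toy data mirroring Wujek's "age" binning example (20-39, 40-59, 60-79).
entries = [{"age": a} for a in (21, 35, 44, 58, 63, 79)]
print(stratified_subset(entries, "age"))  # one entry drawn from each of 3 bins
```

Because one entry is drawn from every bin, the selected subset spans the full range of the target variable rather than clustering where the data are dense.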
In the same field of endeavor, Rendleman teaches selecting a subset of the data entries, the selecting including: selecting a subset of the binned data entries from each of the plurality of data bins as the subset of the data entries (Rendleman: “equipopulous bin-stratified sampling restricts the set of possible random samplings to a subset that avoids statistically significant relationships between group assignment and time-to-event outcome.” [Section 4.1 RRS samplings]; “Data points are partitioned into bins based on the continuous outcome variable, and stratified sampling is used to assign groupings for HO or CV.” [Section 2. Representative random sampling (RRS) procedure]).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated selecting a subset of the data entries, the selecting including: selecting a subset of the binned data entries from each of the plurality of data bins as the subset of the data entries, as taught by Rendleman to the method of Wujek because both of these methods are directed towards selective sampling and binning of training data for increasing model performance. In making this combination and sampling from each data bin, it would allow the method of Wujek to reduce the dimensionality of the data while counteracting high variability in generalization performance estimates by “ensur[ing] that the training and test sets are representative of the full outcome distribution” (Rendleman: [Section 1: Introduction]).

Regarding Claim 2, Wujek and Rendleman teach the method of Claim 1, further comprising: training, using the dataset, a particular machine learning model following a type of construction used to construct the selected one of the plurality of machine learning models (Wujek: “After evaluating a number of hyperparameter settings, the hyperparameter tuner provides the setting that yields the best performing model.
The last step is to train a new model on the entire data set (which includes both training and validation data) under the best hyperparameter setting.” [Page 15, Para. 5]); and applying data to the particular machine learning model to predict values of the target variable (Wujek: “Model deployment is the process of moving the model from an individual’s development environment to a large, powerful, and secure database or server where it can be used simultaneously by many mission-critical processes.” [Page 19, Para. 2]). Regarding Claim 8, Wujek and Rendleman teach the method of Claim 1, wherein one of the plurality of machine learning models constructed using the subset of the data entries includes outputs that are a mathematical combination of outputs from a plurality of different machine learning models that are each generated using a different machine learning algorithm and the subset of the data entries (Wujek: “Bagging (an abbreviation for bootstrap aggregating) is an approach in which multiple base learners (often decision trees) are trained on different sample sets of the data, which are randomly drawn with replacement, and their predictions are aggregated through a function such as majority voting or averaging. Bagging particularly focuses on combining the predictions of base learners to reduce variance and avoid overfitting; it is an efficient and ideal way to handle the bias-variance tradeoff.” [Page 17, Para. 4]; “Stacking involves training multiple models by using a diverse set of strong learners and then applying a higher-level “combiner” algorithm to generate a model that includes the predictions of the member models as inputs. The output of this combiner algorithm is the final prediction as a consensus among the member models.” [Page 17, Para. 6]). 
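The bagging and stacking passages quoted for Claim 8 reduce to one idea: the ensemble's output is an aggregation function (averaging, voting, or a combiner model) over the outputs of several base models. A minimal sketch of averaging-style aggregation, using toy stand-in models that are assumptions and not anything from Wujek's text:

```python
# Illustrative sketch (assumed models, not from the record): aggregate the
# predictions of several base models by averaging, the kind of "mathematical
# combination of outputs" the bagging passage describes.
def ensemble_predict(models, x):
    """Average the outputs of independently trained base models."""
    preds = [m(x) for m in models]
    return sum(preds) / len(preds)

# Three toy "models", standing in for learners trained on different
# bootstrap samples of the data.
models = [lambda x: 2.0 * x, lambda x: 2.5 * x, lambda x: 1.5 * x]
print(ensemble_predict(models, 10.0))  # (20.0 + 25.0 + 15.0) / 3 = 20.0
```

A stacking variant would replace the fixed average with a trained combiner model that takes the base predictions as inputs.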
Regarding Claim 9, Wujek and Rendleman teach the method of Claim 1, wherein at least a subset of the plurality of machine learning models are each constructed using a different one of a plurality of machine learning algorithms (Wujek: See [Appendix: Machine Learning Algorithm Quick Reference]). Regarding Claims 11-13 and 19-20, they are non-transitory computer-readable storage media and system claims that correspond with Claims 1-2 and 8-9. Therefore, they are rejected for the same reasons as Claims 1-2 and 8-9. Regarding Claim 10, Wujek and Rendleman teach the method of Claim 9, wherein the plurality of machine learning algorithms are selected based on the dataset and the target variable (Wujek: “What are you trying to achieve with your model? Are you creating a model to classify observations, predict a value for an interval target, detect patterns or anomalies, or provide recommendations? Answering this question will direct you to a subset of machine learning algorithms that specialize in the particular type of problem.” [Page 13, Para. 6]). Claims 3 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Wujek in view of Rendleman, as applied to Claims 1 and 12, in further view of Cappiello et al. (“Measuring Comovements by Regression Quantiles”, published July 2005), hereinafter Cappiello. Regarding Claim 3, Wujek and Rendleman teach the method of Claim 1. 
However, they fail to expressly disclose wherein the target variable is a first target variable and another of the plurality of variables is designated as a second target variable, the method further comprising: binning the data entries into a plurality of second data bins based on values of the second target variable, wherein a first bin of the plurality of data bins corresponds to a second bin of the plurality of second data bins; and designating data entries as union data entries in response to the data entries including a first value in the first bin and a second value in the second bin, wherein the selecting a subset of the binned data entries from each of the plurality of data bins as the subset of the data entries includes selecting a subset of the binned union data entries from each of the plurality of data bins as the subset of the data entries.

In the same field of endeavor, Cappiello teaches wherein the target variable is a first target variable and another of the plurality of variables is designated as a second target variable (Cappiello: “Let yt and xt denote two different random variables.” [Section 2. The comovement box]), the method further comprising: binning the data entries into a plurality of second data bins based on values of the second target variable, wherein a first bin of the plurality of data bins corresponds to a second bin of the plurality of second data bins (Cappiello: “Our approach is based on the computation - over both a test and a benchmark period - of the conditional probability that a random variable yt is lower than a given quantile, when the other random variable xt is also lower than its corresponding quantile, for any set of prespecified quantiles.” [Abstract]; “Thresholds are identified using time-varying conditional univariate quantiles” [Section: Non-technical summary]; “Let qY θt be the time t θ-quantile of the conditional distribution of yt. Analogously, for xt, we define qX θt.” [Section 2.
The comovement box]); and designating data entries as union data entries in response to the data entries including a first value in the first bin and a second value in the second bin (Cappiello: “The approach is based on the estimation of the conditional probability that a random variable yt falls below a given conditional quantile, when the other random variable xt is also falling below its corresponding quantile.” [Section 1. Introduction]; “we construct, for each series and for each quantile, indicator variables which are equal to one if the observed return is lower than the conditional quantile and zero otherwise.” [Section 3. The econometrics of the comovement box]), wherein the selecting a subset of the binned data entries from each of the plurality of data bins as the subset of the data entries includes selecting a subset of the binned union data entries from each of the plurality of data bins as the subset of the data entries (Rendleman: “equipopulous bin-stratified sampling restricts the set of possible random samplings to a subset that avoids statistically significant relationships between group assignment and time-to-event outcome.” [Section 4.1 RRS samplings]; “Data points are partitioned into bins based on the continuous outcome variable, and stratified sampling is used to assign groupings for HO or CV.” [Section 2. Representative random sampling (RRS) procedure]).
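To make the Claim 3 "union data entries" limitation concrete: entries are binned on two target variables, and an entry qualifies when it falls into corresponding bins under both binnings. The following is a hypothetical sketch under assumed equal-width bins; `bin_index`, `union_entries`, and the sample data are illustrative inventions, not the claimed method or Cappiello's quantile machinery.

```python
# Hypothetical sketch: an entry is a "union entry" when its bin index under
# the first target variable matches its bin index under the second.
def bin_index(value, lo, hi, n_bins):
    width = (hi - lo) / n_bins or 1.0
    return min(int((value - lo) / width), n_bins - 1)  # clamp the maximum

def union_entries(entries, key1, key2, n_bins=2):
    v1 = [e[key1] for e in entries]
    v2 = [e[key2] for e in entries]
    return [
        e for e in entries
        if bin_index(e[key1], min(v1), max(v1), n_bins)
        == bin_index(e[key2], min(v2), max(v2), n_bins)
    ]

# Entries low on both targets or high on both targets match corresponding
# bins; the mixed entry (low y1, high y2) does not.
entries = [{"y1": 1, "y2": 2}, {"y1": 9, "y2": 8}, {"y1": 1, "y2": 9}]
print(union_entries(entries, "y1", "y2"))
```

The per-bin subset selection from Claim 1 would then run over these union entries only.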
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated wherein the target variable is a first target variable and another of the plurality of variables is designated as a second target variable, the method further comprising: binning the data entries into a plurality of second data bins based on values of the second target variable, wherein a first bin of the plurality of data bins corresponds to a second bin of the plurality of second data bins; and designating data entries as union data entries in response to the data entries including a first value in the first bin and a second value in the second bin, wherein the selecting a subset of the binned data entries from each of the plurality of data bins as the subset of the data entries includes selecting a subset of the binned union data entries from each of the plurality of data bins as the subset of the data entries, as taught by Cappiello to the method of Wujek and Rendleman because both of these methods are directed towards partitioning data distributions into bins or quartiles. In making this combination and identifying corresponding data points through location in the distribution, it would allow the method of Wujek and Rendleman to identify potential codependence between variables, which can help to predict the likelihood of changes in one variable given changes in another (Cappiello: [Section: Non-technical summary]). Regarding Claim 14, it is a system claim that corresponds with Claim 3. Therefore, it is rejected for the same reason as Claim 3. Claims 4 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Wujek in view of Rendleman, as applied to Claims 1 and 12, in further view of Gutierrez et al. (“Ordinal Regression Methods: Survey and Experimental Study”, published 07/17/2015), hereinafter Gutierrez. 
Regarding Claim 4, Wujek and Rendleman teach the method of Claim 1, further comprising evaluating the plurality of machine learning models using the subset of data entries and a scoring algorithm (Wujek: “Honest assessment, which is highly related to the bias-variance tradeoff, involves calculating error metrics from scoring the model on data that were not used in any way during the training process.” [Page 12, Para. 4]).

However, they fail to expressly disclose wherein an input to the scoring algorithm is a bin error distance representing a number of bins between bins of actual values of the target variable of the subset of the data entries and bins of generated values of the target variable generated by the plurality of machine learning models using values in the other plurality of features from the subset of the data entries.

In the same field of endeavor, Gutierrez teaches wherein an input to the scoring algorithm is a bin error distance representing a number of bins between bins of actual values of the target variable of the subset of the data entries and bins of generated values of the target variable generated by the plurality of machine learning models using values in the other plurality of features from the subset of the data entries (Gutierrez: “Since there are not specific repositories of ordinal regression datasets, proposals are usually evaluated using discretised regression ones, where the target variable is simply divided into different bins or classes.” [Section 1.
Introduction]; “The cost of misclassifications can be forced to be different depending on the distance between real and predicted classes, in the ordinal scale.” [Section 3.1.3 Cost-Sensitive Classification]; “In general, to allow the computation of the optimal one-dimensional mapping for the data, this algorithm analyses two main objectives: the maximisation of the between-class distance, and the minimisation of the within-class distance, by using variance-covariance matrices and the Rayleigh coefficient.” [Section 3.3.3 Discriminant Learning]; “this framework is used but different weights are imposed over the patterns of each binary system, in such a way that errors on training objects are penalised proportionally to the absolute difference between their rank and q (the category examined).” [Section 3.2.1 Multiple Model Approaches]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated wherein an input to the scoring algorithm is a bin error distance representing a number of bins between bins of actual values of the target variable of the subset of the data entries and bins of generated values of the target variable generated by the plurality of machine learning models using values in the other plurality of features from the subset of the data entries, as taught by Gutierrez to the method of Wujek and Rendleman because both of these methods are directed towards evaluating the performance of trained models for the purposes of model selection. In making this combination and evaluating the models based on bin error distance, it would allow the method of Wujek and Rendleman to assign greater penalties to predictions the farther they get from the actual value (Gutierrez: [Section 3.2.1 Multiple Model Approaches]). 
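The "bin error distance" of Claim 4 is, in ordinal-regression terms, the number of bins between the bin holding the actual target value and the bin holding the predicted value. A hedged sketch of what such a scoring input could look like; the `edges` scheme and the `bin_error_distance` helper are assumptions for illustration, not the claimed algorithm or Gutierrez's formulation.

```python
# Hedged sketch: penalize each prediction by how many bins separate the
# predicted value's bin from the actual value's bin, so errors grow with
# ordinal distance rather than being flat misclassification costs.
def bin_error_distance(actual, predicted, edges):
    """Mean absolute distance, in bins, between actual and predicted values."""
    def to_bin(v):
        # The bin index is the count of edges the value is at or above.
        return sum(v >= e for e in edges)
    dists = [abs(to_bin(a) - to_bin(p)) for a, p in zip(actual, predicted)]
    return sum(dists) / len(dists)

edges = [10, 20, 30]  # four bins: <10, 10-19, 20-29, >=30
score = bin_error_distance([5, 15, 35], [5, 25, 12], edges)
print(score)  # (0 + 1 + 2) / 3 = 1.0
```

A prediction two bins away is thus penalized twice as heavily as one an adjacent bin away, matching the cost-sensitive intuition in the Gutierrez quotes.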
Regarding Claim 7, Wujek, Rendleman and Gutierrez teach the method of Claim 4, wherein first data entries in the subset of the data entries are used in the construction of the plurality of machine learning models and second data entries in the subset of the data entries are used in the evaluation of the plurality of machine learning models (Wujek: “Whether through a validation set or through cross validation, ensure that the training process assesses the error on data that are not used to train the model in order to avoid overfitting to the training data.” [Page 13, Para. 1]). Regarding Claims 15 and 18, they are system claims that correspond with Claims 4 and 7. Therefore, they are rejected for the same reasons as Claims 4 and 7. Claims 5-6 and 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Wujek in view of Rendleman and Gutierrez, as applied to Claims 4 and 13, in further view of Steininger et al. (“Density-based weighting for imbalanced regression”, published 07/07/2021), hereinafter Steininger. Regarding Claim 5, Wujek, Rendleman and Gutierrez teach the method of Claim 4. However, they fail to expressly disclose wherein a second input to the scoring algorithm is based on a value assigned to the bins based on a probability density function applied to the subset of the data entries. In the same field of endeavor, Steininger teaches wherein a second input to the scoring algorithm is based on a value assigned to the bins based on a probability density function applied to the subset of the data entries (Steininger: “From the resulting 200,000 data points 10,000 were sampled in such a way that there are uniformly distributed target values. This uniform dataset’s target values range from − 32.13 to 76.42. Then, for each dataset a probability density function is defined corresponding to the desired target distribution. 
1000 data points are sampled from the uniform dataset weighted by the samples’ desired densities, creating the datasets pareto, rpareto, normal, and dnormal.” [Section 4.1.1 Dataset creation]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated wherein a second input to the scoring algorithm is based on a value assigned to the bins based on a probability density function applied to the subset of the data entries, as taught by Steininger to the method of Wujek, Rendleman and Gutierrez because both of these methods are directed towards evaluating model performance based on binned data points. In making this combination and utilizing a probability density function to evaluate performance, it would allow the method of Wujek, Rendleman and Gutierrez to weight the loss for each data point based on the density of the distribution to account for non-uniform distributions (Steininger: [Figure 1]). Regarding Claim 6, Wujek, Rendleman and Gutierrez teach the method of Claim 4. However, they fail to expressly disclose wherein a second input to the scoring algorithm is based on a value assigned to the bins based on a number of the subset of the data entries that include values in each bin. In the same field of endeavor, Steininger teaches wherein a second input to the scoring algorithm is based on a value assigned to the bins based on a number of the subset of the data entries that include values in each bin (Steininger: “To evaluate model performance for separate parts of the target domain, we bin the test data points based on their target value. Each bin spans 20% of the target variable’s range in the test set. We rank these bins per dataset by the number of data points. The bin with the fewest (most) samples has bin rank 1 (5) and is called the least (most) common bin.” [Section 4.1.3 Results]). 
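Claims 5 and 6 both feed the scoring algorithm a per-bin value, derived either from a probability density function or from the count of entries per bin. One assumed form this could take is a macro-average over bins, so that sparsely populated (rare) bins contribute as much to the score as dense ones; the `bin_weighted_score` helper and its toy error data are illustrative assumptions, not the claimed method or Steininger's weighting.

```python
# Illustrative sketch (assumed form): compute a mean error per bin, then
# average across bins, so a rare bin's errors are not drowned out by the
# counts in common bins.
def bin_weighted_score(errors_by_bin):
    """Average of per-bin mean errors; rare bins weigh as much as common ones."""
    per_bin_means = [sum(errs) / len(errs)
                     for errs in errors_by_bin.values() if errs]
    return sum(per_bin_means) / len(per_bin_means)

# Bin 0 is common (four entries), bin 1 is rare (one entry).
errors_by_bin = {0: [1.0, 1.0, 1.0, 1.0], 1: [4.0]}
print(bin_weighted_score(errors_by_bin))  # (1.0 + 4.0) / 2 = 2.5
```

A plain pooled mean over the same data would be 1.6; the per-bin average surfaces the poor performance on the rare bin, which is the imbalance concern the Steininger quotes address.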
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated wherein a second input to the scoring algorithm is based on a value assigned to the bins based on a number of the subset of the data entries that include values in each bin, as taught by Steininger to the method of Wujek, Rendleman and Gutierrez because both of these methods are directed towards evaluating model performance based on binned data points. In making this combination and accounting for the number of data points in each bin in evaluating performance, it would allow the method of Wujek, Rendleman and Gutierrez to “allow performance comparisons between similarly rare bins over all datasets” (Steininger: [Section 4.1.3 Results]). Regarding Claims 16-17, they are system claims that correspond with Claims 5-6. Therefore, they are rejected for the same reasons as Claims 5-6. Conclusion The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Naimi et al. (“Stacked generalization: an introduction to super learning”) discusses constructing an ensemble super-learner from a selected weighted combination of predictions from candidate algorithms based on the performance of the algorithms in modeling the data. Any inquiry concerning this communication or earlier communications from the examiner should be directed to MEGAN E HWANG whose telephone number is (703)756-1377. The examiner can normally be reached Monday-Thursday 10:00-7:30 ET. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jennifer Welch can be reached at (571) 272-7212. 
The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/M.E.H./
Examiner, Art Unit 2143

/JENNIFER N WELCH/
Supervisory Patent Examiner, Art Unit 2143

Prosecution Timeline

Oct 31, 2022
Application Filed
Jan 07, 2026
Non-Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12456093: Corporate Hierarchy Tagging (2y 5m to grant; granted Oct 28, 2025)
Patent 12437514: VIDEO DOMAIN ADAPTATION VIA CONTRASTIVE LEARNING FOR DECISION MAKING (2y 5m to grant; granted Oct 07, 2025)
Patent 12437517: VIDEO DOMAIN ADAPTATION VIA CONTRASTIVE LEARNING FOR DECISION MAKING (2y 5m to grant; granted Oct 07, 2025)
Patent 12437518: VIDEO DOMAIN ADAPTATION VIA CONTRASTIVE LEARNING FOR DECISION MAKING (2y 5m to grant; granted Oct 07, 2025)
Patent 12437519: VIDEO DOMAIN ADAPTATION VIA CONTRASTIVE LEARNING FOR DECISION MAKING (2y 5m to grant; granted Oct 07, 2025)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
47%
Grant Probability
99%
With Interview (+60.2%)
3y 0m
Median Time to Grant
Low
PTA Risk
Based on 19 resolved cases by this examiner. Grant probability derived from career allow rate.
