Last updated: May 29, 2026
Application No. 17/210,803
DETECT FIELD INTERACTIONS BASED ON RANDOM TREE STUMPS

Non-Final OA §103
Filed: Mar 24, 2021
Examiner: RIFKIN, BEN M
Art Unit: 2123
Tech Center: 2100 (Computer Architecture & Software)
Assignee: International Business Machines Corporation
OA Round: 4 (Non-Final)
Interview Optional

Interview already conducted in this application's prosecution history. This examiner has a 44% grant rate with a +15.7% interview lift. Because an interview has already been tried, a written response with narrowed claims, based on precedent claim evolution patterns, is recommended.
Based on 317 resolved cases, 2023–2026
Examiner Intelligence

RIFKIN, BEN M
Career Allowance Rate: 44% of resolved cases granted (139 granted / 317 resolved), -11.2% vs TC avg
Interview Lift: +15.7% (resolved cases with interview vs without)
Avg Prosecution: 5y 0m (typical timeline)
Currently pending: 21
Total Applications: 355 across all art units (career history)
Statute-Specific Performance

§101: 12.2% (-27.8% vs TC avg)
§103: 76.4% (+36.4% vs TC avg)
§102: 3.3% (-36.7% vs TC avg)
§112: 6.5% (-33.5% vs TC avg)
Based on career data from 317 resolved cases
Office Action

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
The instant application having Application No. 17210803 has a total of 25 claims pending in the application. 


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 7-8, 9-10, 15-16, 17-18, and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Reese (US 11093864 B1) in view of Bampis et al (US 20190295242 A1), Suthaharan (“Chapter 10: Decision Tree Learning”) and Zhang et al (US 20070271223 A1).
As per claim 1, Reese discloses, “A computer implemented method comprising” (Fig.1 and associated paragraphs; EN: this denotes computer hardware and the various components that run the system). 
“generating a set of bootstrap samples based on random selection from a set of data records”  (C11, particularly L40-55; EN: this denotes taking bootstraps from the training data, with the bootstrap sample being random). “each comprising a plurality of fields” (C6, particularly L8-33; EN: this denotes the input data being made up of rows and columns, the rows and columns representing the different fields of the data). 
“creating… a set of decision tree stumps from the set of bootstrap samples”  (C11, particularly L40-55; EN: this denotes taking bootstraps from the training data and using them to create trees. As the tree grows/is created, the smaller pieces represent a “stump”). “wherein each one of the set of decision tree stumps comprise a plurality of leaf nodes corresponding to one or more of the plurality of fields” (Figure 3 A and associated paragraphs; EN: this denotes the tree created from the training data, including leaf nodes and the like representing that data). 
“Generating… a set of new features from the set of decision tree stumps, wherein each one of the set of new features indicates at least one field interaction between two or more of the plurality of fields, further comprising” (C3, particularly L29-43; EN: this denotes using the trees to determine feature importance/contribution). 
“assigning a first field from a first set of fields as a target field” (C8, particularly L61-68; C9, particularly L1-18; EN: This denotes setting the target variable for the tree/splitting process, each tree will have its own target and set of fields).
“Building a first one of the set of decision tree stumps from a first bootstrap sample of the set of bootstrap samples using the target field as a root node, wherein a first decision tree stump of the set of decision tree stumps comprises a set of leaf nodes from the plurality of leaf nodes” (C8, particularly L61-68; C9, particularly L1-18; EN: This denotes setting the target variable for the tree/splitting process and leaf nodes associated with that target variable, this will occur for each tree ). “further comprising a tree based … strategy of splitting each node from the plurality of leaf nodes…” (C8, particularly L61-68; C9, particularly L1-18; EN: This denotes splitting each node from the root downward in order to improve predictions of the tree). 
“Encoding the first decision tree stump based on the set of fields corresponding to the set of leaf nodes” (Fig. 2B and 2C and associated paragraphs; EN: This denotes the creation process of the trees. The Examiner is interpreting the encoding of the stump to be the creation of the tree from the splitting of these various leaf nodes over time, with each incarnation of the tree before completion being a stump). 
“Generating a first feature from the set  of  new features based on the encoded first decision tree stump” (Fig. 2B and 2C and associated paragraphs; EN: this denotes using the importance values determined in previous trees to train future trees).
“Storing the first feature from the set of new features in a new feature store” (C4, particularly L23-68; EN: this denotes the various hardware for storing and processing the data). 
“Training a predictive model based on the first feature from the set of new features” (Fig. 2B and 2C and associated paragraphs; EN: this denotes using the importance values determined in previous trees to train future trees). 
“Selecting the first bootstrap sample of the set of bootstrap samples” (C11, particularly L40-55; EN: this denotes taking bootstraps from the training data)  “wherein the first bootstrap sample comprises a set of fields from the plurality of fields” (C6, particularly L8-33; EN: this denotes the input data being made up of rows and columns, the rows and columns representing the different fields of the data).
“filtering the set of new features” (C20, particularly L58-68; C21, particularly L1-3; EN: this denotes rejecting variables (i.e. filtering them) based on scores).
“Utilizing a trained predictive model to generate one or more predictions based on one or more new data records” (C18, particularly L21-29; EN: this denotes running the trained models).
However, Reese fails to explicitly disclose, “concurrently to the generating the set of bootstrap samples”, “concurrently to the creating the set of decision tree stumps”, “further comprising a tree-based searching strategy of splitting each node from the plurality of leaf nodes by using a supervised learning mechanism”, “computing a quality measure of each of the set of new features”, “ranking the set of new features based on their corresponding quality measure and their uniqueness”, and “selecting a portion of the set of new features to train the predictive model based on their corresponding ranking.”
Bampis discloses, “concurrently to the generating the set of bootstrap samples”, “concurrently to the creating the set of decision tree stumps” (pg.5, particularly paragraph 0052; EN: this denotes the system running the training processes in parallel, so each tree will have its bootstrapping, training, and growing operating in parallel with all the other bootstrapping, training, and growing operations). 
	Suthaharan discloses, “further comprising a tree-based searching strategy of splitting each node from the plurality of leaf nodes” (Pg.257, particularly number 1; EN: This denotes the search for splits for the tree). “by using a supervised learning mechanism” (Abstract; EN: this denotes the learning of the tree being supervised).
	Zhang discloses, “computing a quality measure of each of the set of new features” (pg.3, particularly paragraph 0026; EN: this denotes various methods of performing feature selection to find the best features to use). 
“ranking the set of new features based on their corresponding quality measure” (Pg.3, particularly paragraph 0030-0038; EN: this denotes the mathematics performed to rank the features).  “and their uniqueness” (Pg.3, particularly paragraph 0031; EN: this denotes filtering out duplicates for uniqueness). 
“selecting a portion of the set of new features to train the predictive model based on their corresponding ranking” (pg.4, particularly paragraph 0046; EN: This denotes selecting the top ranked features). 
Reese and Bampis are analogous art because both involve machine learning.  
Before the effective filing date it would have been obvious to one skilled in the art of machine learning to combine the work of Reese and Bampis in order to allow parallel training of models. 
	The motivation for doing so would be to allow “any number of the perceptual model trainer [to] execute in parallel … to generate the bootstrap models” (Bampis, Pg.5, paragraph 0052) or in the case of Reese, allow the system to perform the steps concurrently with other steps in order to speed up processing of the various trees by executing the training in parallel. 
Therefore before the effective filing date it would have been obvious to one skilled in the art of machine learning to combine the work of Reese and Bampis in order to allow parallel training of models.
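For readers unfamiliar with the technique at issue, the following is a minimal sketch of drawing bootstrap samples and building one model per sample concurrently. It is illustrative only and is not taken from Reese, Bampis, or the application. The names `bootstrap_sample`, `majority_label`, and `run_parallel`, and the `label` field, are hypothetical, and the "model" is a trivial stand-in (the majority label of the sample) rather than a real decision tree stump.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def bootstrap_sample(records, seed):
    """Draw len(records) rows with replacement, i.e. one bootstrap sample."""
    rng = random.Random(seed)  # a per-sample seed keeps the draw reproducible
    return [rng.choice(records) for _ in records]


def majority_label(sample):
    """Hypothetical stand-in for stump training: the most common label."""
    return Counter(row["label"] for row in sample).most_common(1)[0][0]


def sample_and_build(records, seed):
    """One unit of work: draw a bootstrap sample, then build a model from it."""
    return majority_label(bootstrap_sample(records, seed))


def run_parallel(records, n_samples, max_workers=4):
    """Submit every (sample, build) unit to a pool so the units run concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(sample_and_build, records, seed)
                   for seed in range(n_samples)]
        return [f.result() for f in futures]
```

A thread pool is used here only to keep the sketch short. In standard CPython, CPU-bound training would more likely use `concurrent.futures.ProcessPoolExecutor`, which exposes the same `submit` interface.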
Reese and Suthaharan are analogous art because both involve decision trees. 
Before the effective filing date it would have been obvious to one skilled in the art of decision trees to combine the work of Reese and Suthaharan in order to use supervised learning to search for splits in decision trees. 
	The motivation for doing so would be to “require tree split algorithms to build the decision trees and require quantitative measures to build an efficient tree via training” (Suthaharan, Abstract) or in the case of Reese, allow the use of supervised learning to search for the appropriate splits efficiently via training. 
Therefore before the effective filing date it would have been obvious to one skilled in the art of decision trees to combine the work of Reese and Suthaharan in order to use supervised learning to search for splits in decision trees.
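As background on what a supervised search for splits can look like, the sketch below scans every field and candidate threshold and keeps the split with the lowest weighted Gini impurity of the labels. Gini impurity is a common quantitative measure for this purpose and is chosen here as an assumption. The sketch does not reproduce the specific algorithm of Suthaharan, Reese, or the application, and the function names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Search all fields and thresholds for the lowest weighted impurity.

    rows: list of equal-length numeric tuples; labels: the known target values.
    Returns (field_index, threshold, weighted_gini), or None if no split exists.
    """
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        for t in sorted({row[j] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[j] <= t]
            right = [y for row, y in zip(rows, labels) if row[j] > t]
            if not left or not right:
                continue  # a split must send records to both children
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best
```

The search is "supervised" in the sense that the known labels drive the choice of split; a tree of depth one built from the returned split is a decision stump.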
Reese and Zhang are analogous art because both involve machine learning.  
Before the effective filing date it would have been obvious to one skilled in the art of machine learning to combine the work of Reese and Zhang in order to use feature selection to determine the best features to use for machine learning. 
	The motivation for doing so would be that “Small sample size coupled with high dimensional feature space poses a significant obstacle in machine learning. In particular, as the dimensionality increases, inference drawn by machine learning algorithm require extrapolation, as the points in the training set are too sparse to be able to apply interpolation. Such extrapolation in turn introduces uncertainty and reduces accuracy. Dimensionality reduction techniques, such as feature selection are typically applied in such cases to avoid the ‘curse of dimensionality’” (Zhang, Pg.1, paragraph 0008) or, in the case of Reese, to allow the system to use any desired feature selection method in order to improve the creation and effectiveness of its decision trees.
Therefore before the effective filing date it would have been obvious to one skilled in the art of machine learning to combine the work of Reese and Zhang in order to use feature selection to determine the best features to use for machine learning. 
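The ranking and selection limitations discussed above can be pictured with a short sketch. It assumes each candidate feature is an (encoding, quality) pair, treats "uniqueness" as removal of duplicate encodings (matching the Examiner's note on filtering out duplicates), and keeps the top k by quality. The function name, the encodings, and the scores are hypothetical and are not taken from Zhang, Reese, or the application.

```python
def select_features(candidates, k):
    """Deduplicate candidate features, rank them by quality, keep the top k.

    candidates: list of (encoding, quality) pairs, where the encoding is a
    string describing a field interaction and quality is a numeric score.
    """
    best_by_encoding = {}
    for encoding, quality in candidates:
        # Uniqueness: keep a single entry per encoding, at its best score.
        if encoding not in best_by_encoding or quality > best_by_encoding[encoding]:
            best_by_encoding[encoding] = quality
    # Ranking: highest quality first.
    ranked = sorted(best_by_encoding.items(), key=lambda item: item[1], reverse=True)
    # Selection: only the top-ranked portion is passed on to model training.
    return ranked[:k]


candidates = [
    ("age<=30 & income>50", 0.8),
    ("age<=30 & income>50", 0.7),  # duplicate encoding, lower score
    ("region=EU", 0.9),
    ("tenure>2", 0.4),
]
top_two = select_features(candidates, 2)
```

Here `top_two` holds the `region=EU` feature followed by the single surviving copy of the age and income interaction.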
As per claims 2, 10, and 18, Reese discloses, “training the predictive model based on the set of new features in parallel with training data” (C12, particularly L35-68; C13, particularly L1-5; EN: This denotes using features from the training data to train the system. Since the features are part of the training data, the examiner is interpreting this to be done in parallel as the training data/features are used simultaneously). 
As per claims 4, 12, and 20, Zhang discloses, “wherein selecting a portion of the set of new features further comprises of assigning a selected portion of the set of new features as new field interaction features using corresponding encoded fields” (pg.4, particularly paragraph 0046; EN: This denotes selecting the top ranked features and when combined with the Reese reference, denotes encoding and using them as fields as needed by the Reese reference).
As per claims 7, 15, and 23, Reese discusses:
generating a set of features from the set of data records based on the plurality of fields; and (Col. 3, lines 29-43 of Reese discusses how “a tree model … computes a feature importance or a variable relevance for each variable included in training the tree model” and that the “Tree model training application 122 inherently captures the contribution of each node due to n-way feature interactions”.)
training the predictive model utilizing the set of features and the set of new features. (Col. 3, lines 29-35 of Reese discusses “Tree model training application 122 trains a tree model and computes a feature importance or a variable relevance for each variable included in training the tree model.”)
As per claims 8 and 16, Reese discusses: 
the set of bootstrap samples comprise at least one million bootstrap samples; and (Col. 27, lines 16-17 of Reese discusses the relevance to “big data” which would typically include millions of samples, from which the bootstrap samples would be constructed to create at least one million bootstrap samples.)
each of the set of decision tree stumps comprise a tree depth less than four.  (Col. 11, lines 44-47 of Reese discusses how “The forest model type hyperparameters further may include a maximum depth (maxlevel) of a decision tree to be grown.”. The depth can be set to any value, including a value less than four (4).)
As per claims 9 and 17, Reese discloses, “An information handling system comprising:” (Fig.1 and associated paragraphs; EN: this denotes computer hardware and the various components that run the system).
“one or more processors” (Fig.1 and associated paragraphs; EN: this denotes computer hardware and the various components that run the system).
“a memory coupled to at least one of the processors” (Fig.1 and associated paragraphs; EN: this denotes computer hardware and the various components that run the system).
“a set of computer program instructions stored in the memory and executed by at least one of the processors in order to perform actions of” (Fig.1 and associated paragraphs; EN: this denotes computer hardware and the various components that run the system).
“generating a set of bootstrap samples based on random selection from a set of data records”  (C11, particularly L40-55; EN: this denotes taking bootstraps from the training data, with the bootstrap sample being random). “each comprising a plurality of fields…” (C6, particularly L8-33; EN: this denotes the input data being made up of rows and columns, the rows and columns representing the different fields of the data). 
“creating… a set of decision tree stumps from the set of bootstrap samples”  (C11, particularly L40-55; EN: this denotes taking bootstraps from the training data and using them to create trees. As the tree grows/is created, the smaller pieces represent a “stump”). “wherein each one of the set of decision tree stumps comprise a plurality of leaf nodes corresponding to one or more of the plurality of fields” (Figure 3 A and associated paragraphs; EN: this denotes the tree created from the training data, including leaf nodes and the like representing that data). 
“Generating… a set of new features from the set of decision tree stumps, wherein each one of the set of new features indicates at least one field interaction between two or more of the plurality of fields, further comprising” (C3, particularly L29-43; EN: this denotes using the trees to determine feature importance/contribution). 
“assigning a first field from a first set of fields as a target field” (C8, particularly L61-68; C9, particularly L1-18; EN: This denotes setting the target variable for the tree/splitting process, each tree will have its own target and set of fields).
“Building a first one of the set of decision tree stumps from a first bootstrap sample of the set of bootstrap samples using the target field as a root node, wherein a first decision tree stump of the set of decision tree stumps comprises a set of leaf nodes from the plurality of leaf nodes” (C8, particularly L61-68; C9, particularly L1-18; EN: This denotes setting the target variable for the tree/splitting process and leaf nodes associated with that target variable, this will occur for each tree ). “further comprising a tree based … strategy of splitting each node from the plurality of leaf nodes…” (C8, particularly L61-68; C9, particularly L1-18; EN: This denotes splitting each node from the root downward in order to improve predictions of the tree). 
“Encoding the first decision tree stump based on the set of fields corresponding to the set of leaf nodes” (Fig. 2B and 2C and associated paragraphs; EN: This denotes the creation process of the trees. The Examiner is interpreting the encoding of the stump to be the creation of the tree from the splitting of these various leaf nodes over time, with each incarnation of the tree before completion being a stump). 
“Generating a first feature from the set  of  new features based on the encoded first decision tree stump” (Fig. 2B and 2C and associated paragraphs; EN: this denotes using the importance values determined in previous trees to train future trees).
“Storing the first feature from the set of new features in a new feature store” (C4, particularly L23-68; EN: this denotes the various hardware for storing and processing the data). 
“Training a predictive model based on the first feature from the set of new features” (Fig. 2B and 2C and associated paragraphs; EN: this denotes using the importance values determined in previous trees to train future trees). 
“Selecting the first bootstrap sample of the set of bootstrap samples” (C11, particularly L40-55; EN: this denotes taking bootstraps from the training data)  “wherein the first bootstrap sample comprises a set of fields from the plurality of fields” (C6, particularly L8-33; EN: this denotes the input data being made up of rows and columns, the rows and columns representing the different fields of the data).
“filtering the set of new features” (C20, particularly L58-68; C21, particularly L1-3; EN: this denotes rejecting variables (i.e. filtering them) based on scores).
“Utilizing a trained predictive model to generate one or more predictions based on one or more new data records” (C18, particularly L21-29; EN: this denotes running the trained models).
However, Reese fails to explicitly disclose, “in parallel to creating…”, “in parallel to generating…”, “concurrently to the generating the set of bootstrap samples”, “concurrently to the creating the set of decision tree stumps”, “further comprising a tree-based searching strategy of splitting each node from the plurality of leaf nodes by using a supervised learning mechanism”, “computing a quality measure of each of the set of new features”, “ranking the set of new features based on their corresponding quality measure and their uniqueness”, and “selecting a portion of the set of new features to train the predictive model based on their corresponding ranking.”
Bampis discloses, “in parallel to creating…”, “in parallel to generating…”, “concurrently to the generating the set of bootstrap samples”, “concurrently to the creating the set of decision tree stumps” (pg.5, particularly paragraph 0052; EN: this denotes the system running the training processes in parallel, so each tree will have its bootstrapping, training, and growing operating in parallel with all the other bootstrapping, training, and growing operations). 
	Suthaharan discloses, “further comprising a tree-based searching strategy of splitting each node from the plurality of leaf nodes” (Pg.257, particularly number 1; EN: This denotes the search for splits for the tree). “by using a supervised learning mechanism” (Abstract; EN: this denotes the learning of the tree being supervised).
	Zhang discloses, “computing a quality measure of each of the set of new features” (pg.3, particularly paragraph 0026; EN: this denotes various methods of performing feature selection to find the best features to use). 
“ranking the set of new features based on their corresponding quality measure” (Pg.3, particularly paragraph 0030-0038; EN: this denotes the mathematics performed to rank the features).  “and their uniqueness” (Pg.3, particularly paragraph 0031; EN: this denotes filtering out duplicates for uniqueness). 
“selecting a portion of the set of new features to train the predictive model based on their corresponding ranking” (pg.4, particularly paragraph 0046; EN: This denotes selecting the top ranked features). 
Reese and Bampis are analogous art because both involve machine learning.  
Before the effective filing date it would have been obvious to one skilled in the art of machine learning to combine the work of Reese and Bampis in order to allow parallel training of models. 
	The motivation for doing so would be to allow “any number of the perceptual model trainer [to] execute in parallel … to generate the bootstrap models” (Bampis, Pg.5, paragraph 0052) or in the case of Reese, allow the system to perform the steps concurrently with other steps in order to speed up processing of the various trees by executing the training in parallel. 
Therefore before the effective filing date it would have been obvious to one skilled in the art of machine learning to combine the work of Reese and Bampis in order to allow parallel training of models.
Reese and Suthaharan are analogous art because both involve decision trees. 
Before the effective filing date it would have been obvious to one skilled in the art of decision trees to combine the work of Reese and Suthaharan in order to use supervised learning to search for splits in decision trees. 
	The motivation for doing so would be to “require tree split algorithms to build the decision trees and require quantitative measures to build an efficient tree via training” (Suthaharan, Abstract) or in the case of Reese, allow the use of supervised learning to search for the appropriate splits efficiently via training. 
Therefore before the effective filing date it would have been obvious to one skilled in the art of decision trees to combine the work of Reese and Suthaharan in order to use supervised learning to search for splits in decision trees.
Reese and Zhang are analogous art because both involve machine learning.  
Before the effective filing date it would have been obvious to one skilled in the art of machine learning to combine the work of Reese and Zhang in order to use feature selection to determine the best features to use for machine learning. 
	The motivation for doing so would be that “Small sample size coupled with high dimensional feature space poses a significant obstacle in machine learning. In particular, as the dimensionality increases, inference drawn by machine learning algorithm require extrapolation, as the points in the training set are too sparse to be able to apply interpolation. Such extrapolation in turn introduces uncertainty and reduces accuracy. Dimensionality reduction techniques, such as feature selection are typically applied in such cases to avoid the ‘curse of dimensionality’” (Zhang, Pg.1, paragraph 0008) or, in the case of Reese, to allow the system to use any desired feature selection method in order to improve the creation and effectiveness of its decision trees.
Therefore before the effective filing date it would have been obvious to one skilled in the art of machine learning to combine the work of Reese and Zhang in order to use feature selection to determine the best features to use for machine learning. 

Claim Rejections - 35 USC § 103
Claims 3, 11, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Reese (US 11093864 B1) in view of Bampis et al (US 20190295242 A1), Suthaharan (“Chapter 10: Decision Tree Learning”) and Zhang et al (US 20070271223 A1) as applied to claims 1, 9, and 17 above, and further in view of Chickering et al (US 6519599 B1).
As per claims 3, 11, and 19, Reese discusses:
selecting a first leaf node from the set of leaf nodes, wherein the first leaf node is based on a threshold value of a second one of the plurality of fields; (Col. 8, lines 54-60 of Reese discusses how “Tree models where the target variable can take a discrete set of values are called classification trees where leaves of the classification tree indicate a class label and branches represent combinations of features and threshold values that result in a respective class label.”)
including the threshold value of the second field and the target value of the target field in the encoding of the first decision tree stump. (Col. 8, lines 54-60 of Reese discusses how “Tree models where the target variable can take a discrete set of values are called classification trees where leaves of the classification tree indicate a class label and branches represent combinations of features and threshold values that result in a respective class label.” and Col. 9, lines 19-27 of Reese discusses how “each node is a split test based on a comparison of a variable value of a variable of the plurality of variables using a logical operation, such as <, >, ≤, ≥, ≠, =, and a threshold value”.)
Reese fails to explicitly discuss “determining a target value of the target field at the first leaf node based on a probability value of the target field at the first leaf node”.
However, in analogous art, Chickering discloses:
determining a target value of the target field at the first leaf node based on a probability value of the target field at the first leaf node; and (Col. 6, line 54 – col. 7, line 3 of Chickering discusses how “A decision tree T is a structure used to encode a conditional probability distribution of a target variable Y, given a set of predictor variables X = {X_1, ..., X_n}, denoted p(Y|X). The structure is a tree, where each internal node I stores a mapping from the values of a predictor variable X_j to the children of I in the tree. Each leaf node L in the tree stores a probability distribution for the target variable Y. The probability of the target variable Y, given a set of values {X_1 = x_1, ..., X_n = x_n} for the predictor variables, is obtained by starting at the root of T and using the internal-node mappings to traverse down the tree to a leaf node.”)
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to use the target variable probability distribution of Chickering to modify the tree model training system of Reese in order to ensure that the most relevant variables are included within the decision tree, to provide the most relevant results to a user.
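The structure described in the quoted Chickering passage, in which each leaf stores a probability distribution for the target variable, can be sketched as follows. This is an illustration under stated assumptions, not Chickering's implementation: the dictionary layout, the use of a numeric threshold at each internal node, the field name `income`, and the probabilities are all hypothetical.

```python
def predict_distribution(node, record):
    """Walk from the root to a leaf and return that leaf's distribution.

    Internal nodes hold a field name and a threshold (an assumed mapping from
    predictor values to children); leaves hold a distribution over target values.
    """
    while "distribution" not in node:
        branch = "left" if record[node["field"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["distribution"]


def target_value(distribution):
    """Pick the target value with the highest probability at the leaf."""
    return max(distribution, key=distribution.get)


# A depth-one tree (a stump) with hypothetical probabilities at its two leaves.
tree = {
    "field": "income",
    "threshold": 50,
    "left": {"distribution": {"no": 0.8, "yes": 0.2}},
    "right": {"distribution": {"no": 0.3, "yes": 0.7}},
}

low = target_value(predict_distribution(tree, {"income": 40}))   # "no"
high = target_value(predict_distribution(tree, {"income": 60}))  # "yes"
```

Choosing the most probable value is only one way to determine a target value from a leaf's probability; the claim language does not specify which rule is used.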
Claim Rejections - 35 USC § 103
Claims 5, 13, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Reese (US 11093864 B1) in view of Bampis et al (US 20190295242 A1), Suthaharan (“Chapter 10: Decision Tree Learning”) and Zhang et al (US 20070271223 A1) as applied to claims 1, 9, and 17 above, and further in view of Gunes et al (US 20190370684 A1).

	As per claims 5, 13, and 21, Reese fails to explicitly discuss:
selecting a first new feature from the set of new features, wherein the first new feature comprises a target value of a target field; 
testing the first new feature against the set of data records, wherein the testing compares the target value against a field value in the set of data records, and wherein the testing generates a set of test results; and 
computing the quality measure of the new feature based on the set of test results.  
However, in analogous art, Gunes discloses:
selecting a first new feature from the set of new features, wherein the first new feature comprises a target value of a target field; (Para. [0098] of Gunes discusses how “the trained model is validated by executing the trained model with each observation vector read from validation dataset 126 with the features (variables) defined by the feature set and using the hyperparameter values defined by the hyperparameter configuration to predict a target variable value for each observation vector”, with the new feature being the target variable value predicted by the model.)
testing the first new feature against the set of data records, wherein the testing compares the target value against a field value in the set of data records, and wherein the testing generates a set of test results; and (Para. [0099] of Gunes discusses determining “an accuracy value” that is “computed by comparing the target variable value associated with the observation vector in validation dataset 126 to the target variable value predicted by the trained model”.)
computing the quality measure of the new feature based on the set of test results. (Para. [0099] of Gunes discusses determining “an accuracy value”.)
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to use the accuracy determination methodology of Gunes to modify the tree model training system of Reese in order to provide a mechanism in which the trained model can be evaluated for accuracy based on known data, therefore providing a way to evaluate and improve the effectiveness of the decision tree.
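The accuracy comparison relied on in this rejection reduces to a simple computation: compare each predicted target value with the known value in the data and report the fraction that match. The sketch below assumes predictions and known values are given as parallel lists; the function name and the example values are hypothetical and are not taken from Gunes.

```python
def accuracy(predicted, actual):
    """Fraction of records whose predicted target value equals the known value."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        return 0.0  # assumed convention for an empty validation set
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Each comparison is one test result; the aggregate serves as the quality measure.
score = accuracy(["yes", "no", "yes", "no"], ["yes", "no", "no", "no"])  # 0.75
```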


Claim Rejections - 35 USC § 103
Claims 6, 14, and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Reese (US 11093864 B1) in view of Bampis et al (US 20190295242 A1), Suthaharan (“Chapter 10: Decision Tree Learning”) and Zhang et al (US 20070271223 A1) as applied to claims 1, 9, and 17 above, and further in view of Wang et al (US 20170221075 A1).
As per claims 6, 14, and 22, Reese fails to explicitly discuss:
identifying at least one of the plurality of leaf nodes in one of the set of decision tree stumps that comprises a probability value of the target field exceeding a probability threshold; and 
generating a report that indicates the identified at least one of the plurality of leaf nodes.  
However, in analogous art, Wang discloses:
identifying at least one of the plurality of leaf nodes in one of the set of decision tree stumps that comprises a probability value of the target field exceeding a probability threshold; and (Para. [0026] of Wang discusses filtering “out leaf nodes with very low probabilities, a predetermined threshold may be used. For example, setting the threshold to 0.6 enables the search technique to consider only leaf nodes with probability accuracies higher than 0.6.”.)
generating a report that indicates the identified at least one of the plurality of leaf nodes. (The report of Wang would be the results provided from the decision tree, including the leaf nodes remaining after the probability filtering.)
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to use the threshold evaluation method of Wang to modify the tree modeling training system of Chickering in order to provide a mechanism for tuning the decision tree based on preferences of a user, for example, changing how precise/narrow the decision tree results are when it is constructed. 
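
For illustration only, the threshold-based leaf filtering that the rejection maps to Para. [0026] of Wang can be sketched as follows. This is a minimal sketch under stated assumptions, not Wang's implementation and not the claimed method: the `Leaf` class, the `leaves_above_threshold` and `build_report` helpers, the leaf identifiers, and the example probabilities are all hypothetical. Only the 0.6 threshold and the "higher than" comparison come from the quoted passage.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    """Hypothetical leaf node: an identifier plus the probability of the target value."""
    leaf_id: str
    target_probability: float


def leaves_above_threshold(leaves, threshold):
    """Keep only leaves whose probability is strictly higher than the threshold."""
    return [leaf for leaf in leaves if leaf.target_probability > threshold]


def build_report(leaves, threshold):
    """Return one text line per identified leaf (an assumed report format)."""
    selected = leaves_above_threshold(leaves, threshold)
    return "\n".join(
        f"{leaf.leaf_id}: p={leaf.target_probability:.2f}" for leaf in selected
    )


# Invented example values; 0.6 is the threshold quoted from Wang.
example_leaves = [
    Leaf("stump0/left", 0.35),
    Leaf("stump0/right", 0.82),
    Leaf("stump1/left", 0.60),
]
report = build_report(example_leaves, 0.6)
print(report)
```

Under the strict comparison assumed here, a leaf with probability exactly equal to the threshold is not reported.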

Claim Rejections - 35 USC § 103

Claims 24-25 are rejected under 35 U.S.C. 103 as being unpatentable over Reese (US 11093864 B1)  in view of Suthaharan (“Chapter 10: Decision Tree Learning”). 
As per claims 24 and 25, Reese discloses, “A computer implemented method comprising:” (Fig.1 and associated paragraphs; EN: this denotes computer hardware and the various components that run the system).
“generating a set of bootstrap samples from a set of data records”  (C11, particularly L40-55; EN: this denotes taking bootstraps from the training data). “each comprising a plurality of fields” (C6, particularly L8-33; EN: this denotes the input data being made up of rows and columns, the rows and columns representing the different fields of the data). 
“Creating a set of decision tree stumps from the set of bootstrap samples” (C11, particularly L40-55; EN: this denotes taking bootstraps from the training data and using them to create trees. As the tree grows/is created, the smaller pieces represent a “stump”). “wherein each one of the set of decision tree stumps comprise a plurality of leaf nodes corresponding to one or more of the plurality of fields, and wherein the creating further comprises” (Figure 3 A and associated paragraphs; EN: this denotes the tree created from the training data, including leaf nodes and the like representing that data).
“Selecting a first bootstrap sample one of the set of bootstrap samples” (C11, particularly L40-55; EN: this denotes taking bootstraps from the training data)  “wherein the first bootstrap sample comprises a set of fields from the plurality of fields” (C6, particularly L8-33; EN: this denotes the input data being made up of rows and columns, the rows and columns representing the different fields of the data).
“assigning a first field from the set of fields as a target field” (C8, particularly L61-68; C9, particularly L1-18; EN: This denotes setting the target variable for the tree/splitting process). 
“Building a first one of the set of decision tree stumps from the first bootstrap sample using the target field as a root node, wherein the first decision tree stump comprises a set of leaf nodes from the plurality of leaf nodes” (C8, particularly L61-68; C9, particularly L1-18; EN: This denotes setting the target variable for the tree/splitting process and leaf nodes associated with that target variable). “further comprising a tree based … strategy of splitting each node from the plurality of leaf nodes…” (C8, particularly L61-68; C9, particularly L1-18; EN: This denotes splitting each node from the root downward in order to improve predictions of the tree). 
“Encoding the first decision tree stump based on the set of fields corresponding to the set of leaf nodes” (Fig. 2B and 2C and associated paragraphs; EN: This denotes the creation process of the trees. The Examiner is interpreting the encoding of the stump to be the creation of the tree from the splitting of these various leaf nodes over time, with each incarnation of the tree before completion being a stump).
“generating a set of new features from the encoded set of decision tree stumps, wherein each one of the set of new features  indicates at least one field interaction between two or more of the plurality of fields” (C3, particularly L29-43; EN: this denotes using the trees to determine feature importance/contribution).
“Storing the set of new features in a new feature store” (C4, particularly L23-68; EN: this denotes the various hardware for storing and processing the data). 
“Training a predictive model based on the set of new features” (Fig. 2B and 2C and associated paragraphs; EN: this denotes using the importance values determined in previous trees to train future trees).
“Utilizing a trained predictive model to generate one or more predictions based on one or more new data records” (C5, particularly L20-28; EN: this denotes using the created models to make predictions).
However, Reese fails to explicitly disclose, “further comprising a tree-based searching strategy of splitting each node from the plurality of leaf nodes by using a supervised learning mechanism.”
	Suthaharan discloses, “further comprising a tree-based searching strategy of splitting each node from the plurality of leaf nodes” (Pg.257, particularly number 1; EN: This denotes the search for splits for the tree). “by using a supervised learning mechanism” (Abstract; EN: this denotes the learning of the tree being supervised).
Reese and Suthaharan are analogous art because both involve decision trees. 
Before the effective filing date it would have been obvious to one skilled in the art of decision trees to combine the work of Reese and Suthaharan in order to use supervised learning to search for splits in decision trees. 
	The motivation for doing so would be to “require tree split algorithms to build the decision trees and require quantitative measures to build an efficient tree via training” (Suthaharan, Abstract) or in the case of Reese, allow the use of supervised learning to search for the appropriate splits efficiently via training. 
Therefore before the effective filing date it would have been obvious to one skilled in the art of decision trees to combine the work of Reese and Suthaharan in order to use supervised learning to search for splits in decision trees.
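
For readers unfamiliar with the terms used in the mapping above, the general idea of drawing bootstrap samples and building one single-split tree (a "stump") per sample can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce Reese, Suthaharan, or the claimed method: the helper names `bootstrap_sample` and `build_stump`, the field names, the binary example records, and the random choice of target and split fields are all hypothetical. In particular, the sketch splits on a field other than the target and stores the target's observed frequency in each leaf, which is only one possible reading of a stump.

```python
import random


def bootstrap_sample(records, rng):
    """Draw len(records) records uniformly at random, with replacement."""
    return [rng.choice(records) for _ in records]


def build_stump(sample, target_field, split_field):
    """Build a depth-1 tree: one split on split_field, one leaf per observed value.

    Each leaf stores the fraction of its records whose target_field equals 1.
    """
    groups = {}
    for record in sample:
        groups.setdefault(record[split_field], []).append(record[target_field])
    leaves = {value: sum(targets) / len(targets) for value, targets in groups.items()}
    return {"target": target_field, "split": split_field, "leaves": leaves}


# Hypothetical binary-valued data records; the field names are invented.
records = [
    {"field_a": 1, "field_b": 0, "field_c": 1},
    {"field_a": 0, "field_b": 0, "field_c": 0},
    {"field_a": 1, "field_b": 1, "field_c": 1},
    {"field_a": 0, "field_b": 1, "field_c": 0},
    {"field_a": 1, "field_b": 1, "field_c": 0},
    {"field_a": 0, "field_b": 0, "field_c": 1},
]
fields = list(records[0])
rng = random.Random(0)  # fixed seed so the sketch is repeatable

stumps = []
for _ in range(5):
    sample = bootstrap_sample(records, rng)
    target = rng.choice(fields)  # assumed: target field picked at random
    split = rng.choice([f for f in fields if f != target])
    stumps.append(build_stump(sample, target, split))

for stump in stumps:
    print(stump)
```

Each printed stump pairs a target field with a split field, so a leaf whose stored frequency differs sharply from the overall frequency hints at an interaction between those two fields.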

Response to Arguments

Applicant's arguments with respect to claims 1-25 have been considered but are moot in view of the new ground(s) of rejection.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to BEN M RIFKIN whose telephone number is (571)272-9768. The examiner can normally be reached Monday-Friday 9 am - 5 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov can be reached at (571) 270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/BEN M RIFKIN/Primary Examiner, Art Unit 2123
Prosecution Timeline

Apr 16, 2025
Response after Non-Final Action
Aug 25, 2025
Non-Final Rejection mailed — §103
Nov 03, 2025
Interview Requested
Nov 13, 2025
Examiner Interview Summary
Nov 13, 2025
Applicant Interview (Telephonic)
Nov 14, 2025
Response Filed
Jan 12, 2026
Final Rejection mailed — §103
Mar 11, 2026
Response after Non-Final Action
Precedent Cases

Applications granted by this same examiner with similar technology

17/001,746
Patent 12619865
DECOUPLING MEMORY AND COMPUTATION TO ENABLE PRIVACY ACROSS MULTIPLE KNOWLEDGE BASES OF USER DATA
5y 8m to grant Granted May 05, 2026
17/289,356
Patent 12608641
INFORMATION PROCESSING APPARATUS AND INFORMATION PROCESSING METHOD
4y 11m to grant Granted Apr 21, 2026
17/121,149
Patent 12541685
SEMI-SUPERVISED LEARNING OF TRAINING GRADIENTS VIA TASK GENERATION
5y 1m to grant Granted Feb 03, 2026
16/151,431
Patent 12455778
SYSTEMS AND METHODS FOR DATA STREAM SIMULATION
7y 0m to grant Granted Oct 28, 2025
16/746,866
Patent 12236335
SYSTEM AND METHOD FOR TIME-DEPENDENT MACHINE LEARNING ARCHITECTURE
5y 1m to grant Granted Feb 25, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 4-5
Grant Probability: 44%
With Interview (+15.7%): 60%
Median Time to Grant: 5y 0m (~0m remaining)
PTA Risk: High
Based on 317 resolved cases by this examiner. Grant probability derived from career allowance rate.