DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is in response to the application filed on 5/15/2023. Claims 1-20 are pending and have been examined.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Katz et al., “ExploreKit: Automatic Feature Generation and Selection” (hereinafter “Katz”).
Regarding Claim 1, Katz teaches:
A system for training machine learning models based on feature validity, the system comprising:
one or more processors and one or more memories having computer-executable instructions stored thereon, the computer-executable instructions, when executed by the one or more processors, causing the system to perform operations (Katz performs their method on a computer in which processor, memory, and storage devices are inherent, p. 983, col. 1, paragraph 4, “We conducted our experiments on a 20-core machine with 64GB of RAM”) comprising:
retrieving a feature generation log comprising a plurality of feature generation statuses for generating a plurality of features, wherein each feature generation status indicates whether a corresponding feature generation attempt was successful or unsuccessful (the “good” and “bad” labels indicate whether a feature should be retained after the classifier is trained, p. 982, col. 2, paragraph 1, “we label f cand,i T as “good” and “bad” otherwise” and “use the labeled set to train C… we created a ranking classifier”);
determining, for each of the plurality of features based on the plurality of feature generation statuses, a corresponding subset of feature generation statuses for assessing a validity of each of the plurality of features, wherein each corresponding subset is based on a frequency of corresponding feature generation attempts for each of the plurality of features (the subset of features over time is based on the ranking made from feature statuses, p. 982, col. 2, paragraph 1, ““good” and “bad”… labeled set to train C”; after the ranking classifier is trained, subsets are chosen based on ranking, Fig. 1, Candidate Feature Ranking -> Candidate Feature Evaluation & selection -> Add selected Candidate to Generated Features set; the process is repeated with a new subset, which was based on corresponding feature generation attempts);
determining, for each of the plurality of features, a corresponding ratio of successful feature generation statuses to the corresponding subset of feature generation statuses (ratio is determined when good and bad labels are assigned, p. 982, col. 2, paragraph 1, “assigning labels (“good” or “bad”) to each candidate feature generated for DT” and “We assign these labels to the meta-features and use the labeled set to train C”);
based on each corresponding ratio for each of the plurality of features, generating a corresponding validity parameter for each of the plurality of features, wherein each corresponding validity parameter indicates a degree of success of the corresponding feature generation attempts for each of the plurality of features (the score is a validity parameter generated for features based on the ratio, p. 980, col. 1, paragraph 4, “we assign a score to each candidate feature f cand i,j ∈ Fcand i based on its estimated contribution”; the score indicates a degree of success because it is used to rank and decide whether a feature is worth keeping, p. 982, paragraph 1, “We use the ranking classifier C to estimate the likelihood of each f cand i,j ∈ Fcand i to reduce the error, based on its meta-features”);
assigning a corresponding weight to each of the plurality of features within a dataset, wherein the corresponding weight reflects the corresponding validity parameter for each of the plurality of features, and wherein higher weights are assigned to features having higher ratios of successful generation attempts (ranking is weight corresponding to validity parameter, p. 980, col. 1, paragraph 4, “In the candidate features ranking phase we assign a score to each candidate feature f cand i,j ∈ Fcand i based on its estimated contribution and produce an ordered list of features RankedFcand”, higher ranking represents higher ratio of successful generation attempts because the ranking classifier assigns scores based on meta-features capturing past feature success, p. 982, col. 1, paragraph 2, “We use the ranking classifier C to estimate the likelihood of each f cand i,j ∈ Fcand i to reduce the error, based on its meta-features”);
inputting, into a training routine of a machine learning model, the dataset to train the machine learning model based on weights of the plurality of features, wherein the training routine uses the weights to indicate feature importance within the machine learning model (training is based on weights of the ranking because the ranking helps select the joint feature set for training, p. 982, col. 2, paragraph 4, “we evaluate the performance of the classifier on the joint feature set Fi ∪ {f cand i,j }. The evaluation is conducted using k-fold cross validation”; the weights indicate feature importance because the ranking determines whether the feature is filtered out, p. 982, col. 1, paragraph 2, “Candidate features whose probability is below a predefined parameter thresholdf are filtered”); and
initializing the machine learning model for use (p. 982, col. 2, paragraph 4, “we evaluate the performance of the classifier on the joint feature set Fi ∪ {f cand i,j }. The evaluation is conducted using k-fold cross validation”).
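For context, the iterative ExploreKit flow cited throughout the mapping above (assign each candidate feature a score with a ranking classifier, filter candidates whose score falls below a predefined threshold, then evaluate the survivors) can be sketched as follows. This is an illustrative sketch only; the function and variable names are hypothetical and appear in neither Katz nor the claims.

```python
# Illustrative sketch of the cited ExploreKit flow: score candidates,
# drop those below a threshold, and return the top survivor for
# evaluation. Names here are hypothetical, not from Katz or the claims.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    score: float  # likelihood of reducing error, per the ranking classifier

def select_feature(candidates, threshold_f):
    """Rank candidates by score, filter below threshold_f, return best survivor."""
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    survivors = [c for c in ranked if c.score >= threshold_f]
    return survivors[0] if survivors else None

# Usage: three hypothetical candidates with a threshold of 0.5.
cands = [Candidate("f1", 0.9), Candidate("f2", 0.4), Candidate("f3", 0.7)]
best = select_feature(cands, threshold_f=0.5)
```

In the cited process, the selected candidate would then be added to the feature set for the next iteration.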
Regarding Claim 2, Katz teaches:
A method comprising: retrieving one or more feature generation logs comprising a plurality of feature generation statuses for generating a plurality of features, wherein each feature generation status indicates whether a corresponding feature generation attempt was successful (the “good” and “bad” labels indicate whether a feature should be retained after the classifier is trained, p. 982, col. 2, paragraph 1, “we label f cand,i T as “good” and “bad” otherwise” and “use the labeled set to train C… we created a ranking classifier”);
determining, for each of the plurality of features based on the plurality of feature generation statuses, a corresponding subset of feature generation statuses for assessing a validity of each of the plurality of features (the subset of features over time is based on the ranking made from feature statuses, p. 982, col. 2, paragraph 1, ““good” and “bad”… labeled set to train C”; after the ranking classifier is trained, subsets are chosen based on ranking, Fig. 1, Candidate Feature Ranking -> Candidate Feature Evaluation & selection -> Add selected Candidate to Generated Features set; the process is repeated with a new subset);
generating, for each of the plurality of features, a corresponding validity parameter based on successful feature generation statuses of the corresponding subset of feature generation statuses, wherein each corresponding validity parameter indicates a degree of validity of each corresponding feature (the score is a validity parameter generated for features based on the subset of feature generation statuses, including successful (“good”) statuses, p. 982, col. 2, paragraph 1, ““good” and “bad”… labeled set to train C”; p. 980, col. 1, paragraph 4, “we assign a score to each candidate feature f cand i,j ∈ Fcand i based on its estimated contribution”; the score indicates a degree of validity because it is used to rank and decide whether a feature is worth keeping, p. 982, col. 1, paragraph 1, “Candidate features whose probability is below a predefined parameter thresholdf are filtered”);
generating, based on each corresponding validity parameter, a corresponding weight for each of the plurality of features and assigning, to each of the plurality of features within a dataset, the corresponding weight (ranking is weight corresponding to validity parameter, p. 980, col. 1, paragraph 4, “In the candidate features ranking phase we assign a score to each candidate feature f cand i,j ∈ Fcand i based on its estimated contribution and produce an ordered list of features RankedFcand”, Fig. 1, Candidate Feature Ranking);
inputting, into a training routine of a machine learning model, the dataset to train the machine learning model to make predictions based on degrees of validity of the plurality of features (training is based on the degree of validity of the ranking because the ranking helps select the joint feature set for training, p. 982, col. 2, paragraph 4, “we evaluate the performance of the classifier on the joint feature set Fi ∪ {f cand i,j }. The evaluation is conducted using k-fold cross validation”); and
initializing the machine learning model for use (p. 982, col. 2, paragraph 4, “we evaluate the performance of the classifier on the joint feature set Fi ∪ {f cand i,j }. The evaluation is conducted using k-fold cross validation”).
Regarding Claim 3, Katz teaches the method of Claim 2. Katz further teaches:
determining whether a first validity parameter associated with a first feature of the plurality of features meets a validity threshold; and based on determining that the first validity parameter associated with the first feature of the plurality of features does not meet the validity threshold, removing the first feature from the plurality of features within the dataset (validity parameter is compared to threshold and is filtered out of dataset if threshold not met, p. 982, col. 1, paragraph 1, “Candidate features whose probability is below a predefined parameter thresholdf are filtered”, Algorithm 1, line 7).
Regarding Claim 4, Katz teaches the method of Claim 2. Katz further teaches:
determining, based on the one or more feature generation logs, that a subset of feature generation statuses for a first feature of the plurality of features has changed (each iteration it is determined that the subset of feature generation statuses has changed, p. 982, col. 2, paragraph 2, “performing this process over a diverse group of datasets”, p. 980, col. 1, paragraph 3, “an iterative process where each iteration comprises of three phases: candidate features generation, candidate feature ranking, and candidate features evaluation & selection”);
retrieving a first validity parameter associated with the first feature and a state map for updating the first validity parameter (the random forest is a state map for mapping features to scores, p. 983, col. 2, paragraph 1, “We used random forest to train a ranking model on the meta-features”, p. 980, col. 1, paragraph 4, “we assign a score to each candidate feature f cand i,j ∈ Fcand i based on its estimated contribution”; in following iterations newly generated features will be scored and ranked); and
updating, based on a change within the subset of feature generation statuses, the first validity parameter and the state map (process is repeated with new features on next iteration which creates updated first validity parameter and state map, p. 983, col. 1, paragraph 1, “Once a candidate feature that satisfies the criteria is found, it is denoted as f select i . We then define Fi+1 ← Fi ∪ {f select i } as the current features set for the next iteration and repeat the process described in this section”, random forest will have different meta-features per iteration causing different decisions and output states, p. 983, col. 2, paragraph 1, “We used random forest to train a ranking model on the meta-features”, p. 980, col. 1, paragraph 4, “we assign a score to each candidate feature f cand i,j ∈ Fcand i based on its estimated contribution”).
Regarding Claim 5, Katz teaches the method of Claim 2. Katz further teaches:
wherein assigning the corresponding weight to each of the plurality of features further comprises:
determining a corresponding validity threshold for each of the plurality of features based on a corresponding frequency of feature generation attempts for each of the plurality of features (the corresponding validity threshold is determined by applying the threshold to the validity parameter; candidate features are based on previously generated features, p. 982, col. 1, paragraph 2, “Candidate features whose probability is below a predefined parameter thresholdf are filtered”);
in response to determining that a first validity parameter for a first feature of the plurality of features meets a first validity threshold, assigning a first weight to the first feature (rankings are assigned to features, which are updated after meeting the threshold, p. 982, col. 1, paragraph 2, “We sort the candidates by this likelihood and create a ranked list RankedFcand i . Candidate features whose probability is below a predefined parameter thresholdf are filtered”); and
in response to determining that a second validity parameter for a second feature of the plurality of features does not meet a second validity threshold, assigning a second weight to the second feature, wherein the first weight is higher than the second weight (if the score does not reach the threshold, the feature is not included in the set and thus has no ranking (weight), p. 982, col. 1, paragraph 2, “Candidate features whose probability is below a predefined parameter thresholdf are filtered”).
Regarding Claim 6, Katz teaches the method of Claim 2. Katz further teaches:
wherein generating the corresponding weight for each of the plurality of features comprises:
retrieving a first validity parameter for a first feature of the plurality of features and a state map (p. 983, col. 2, paragraph 1, “We used random forest to train a ranking model on the meta-features”, p. 980, col. 1, paragraph 4, “we assign a score to each candidate feature f cand i,j ∈ Fcand i based on its estimated contribution”);
determining a location of the first validity parameter within the state map (the validity (likelihood score) is derived from the states of the random forest, p. 981, col. 2, paragraph 2, “We use the ranking classifier C to estimate the likelihood of each f cand i,j ∈ Fcand i to reduce the error, based on its meta-features. We sort the candidates by this likelihood”, p. 983, col. 2, paragraph 1, “We used random forest to train a ranking model”); and
generating a first weight for the first feature based on the location of the first validity parameter within the state map (the ranking is based on the validity (score) from the state map (random forest), p. 982, col. 1, paragraph 2, “We use the ranking classifier C to estimate the likelihood of each f cand i,j ∈ Fcand i to reduce the error, based on its meta-features. We sort the candidates by this likelihood and create a ranked list RankedFcand”, p. 981, col. 2, paragraph 2, “we use the ranking classifier C to assign a score to each f cand”, p. 983, col. 2, paragraph 1, “We used random forest to train a ranking model on the meta-features”).
Regarding Claim 7, Katz teaches the method of Claim 2. Katz further teaches:
retrieving a first state map associated with a first feature of the plurality of features, wherein each of the plurality of features is associated with a state map of a plurality of state maps (p. 982, col. 1, paragraph 2, “We use the ranking classifier C to estimate the likelihood of each f cand”);
in response to determining that a first validity parameter for a first feature matches a first state of the first state map, assigning a first weight to the first feature (the feature is put through the states of the random forest classifier and is then assigned a ranking, p. 980, col. 1, paragraph 4, “we assign a score to each candidate feature f cand i,j ∈ Fcand i based on its estimated contribution and produce an ordered list of features RankedFcand i”, p. 983, col. 2, paragraph 1, “We used random forest to train a ranking model”); and
in response to determining that a second validity parameter for the first feature matches a second state of the state map, assigning a second weight to the first feature, wherein the second weight is higher than the first weight and the second state indicates higher validity than the first state (in subsequent iterations a new validity parameter can be found from the first feature being input into the state map, causing a ranking that is higher than in the last iteration, p. 980, col. 1, paragraph 4, “we assign a score to each candidate feature f cand i,j ∈ Fcand i based on its estimated contribution and produce an ordered list of features RankedFcand i”, p. 983, col. 2, paragraph 1, “We used random forest to train a ranking model”).
Regarding Claim 8, Katz teaches the method of Claim 2. Katz further teaches:
detecting a new plurality of feature generation statuses within the one or more feature generation logs (a new plurality of feature generation statuses is detected on subsequent iterations, p. 980, col. 1, paragraph 2, “each iteration comprises of three phases: candidate features generation, candidate feature ranking, and candidate features evaluation & selection”, p. 982, col. 2, paragraph 1, “we label f cand,i T as “good” and “bad” otherwise”);
determining, based on the new plurality of feature generation statuses, that a first feature generation status for a first feature has changed (the feature can be run through the ranking model in multiple iterations, where the status can change, p. 980, col. 1, paragraph 2, “each iteration comprises of three phases: candidate features generation, candidate feature ranking, and candidate features evaluation & selection”);
modifying a first validity parameter for the first feature based on the first feature generation status (the first validity parameter for the feature can be changed in subsequent iterations based on the ranking model, p. 980, col. 1, paragraph 2, “each iteration comprises of three phases: candidate features generation, candidate feature ranking, and candidate features evaluation & selection”);
adjusting, within the dataset, based on the first validity parameter, a first weight for the first feature (ranking weight is adjusted based on validity parameter, p. 980, col. 1, paragraph 4, “we assign a score to each candidate feature f cand i,j ∈ Fcand i based on its estimated contribution and produce an ordered list of features RankedFcand i”); and
updating the machine learning model based on the first weight of the first feature within the dataset (p. 982, col. 1, paragraph 2, “create a ranked list RankedFcand i . Candidate features whose probability is below a predefined parameter thresholdf are filtered”, after filtering out features based on weight classifier is updated, p. 982, col. 2, paragraph 4, “evaluate the performance of the classifier on the joint feature set Fi ∪ {f cand i,j }”).
Regarding Claim 9, Katz teaches the method of Claim 2. Katz further teaches:
wherein generating, for each of the plurality of features, the corresponding validity parameter further comprises:
determining, based on the one or more feature generation logs, a first error type associated with a first feature generation attempt for a first feature (the first error type is the meta-feature of entropy and statistical tests for the candidate feature, p. 981, col. 2, paragraph 5, “we partition Fi into subgroups based on their type and use the chi-sqaured and paired t-tests to derive statistics on f cand i,j ’s correlation to each groups. In addition, we derive entropy-based measures for f cand”, p. 982, col. 2, paragraph 1, “we label f cand,i T as “good” and “bad” otherwise. We assign these labels to the meta-features and use the labeled set to train C”, associated with the attempt to generate features after ranking with C);
generating, based on the first error type, a first validity parameter for the first feature (generates a validity parameter for a feature based on meta-features including first error type, Fig. 1, p. 981, col. 2, paragraph 2, “Once the meta-features are generated, we use the ranking classifier C to assign a score”);
determining, based on the one or more feature generation logs, a second error type associated with a second feature generation attempt for a second feature (the second error type is the meta-feature of statistical tests on parent features, p. 982, col. 1, paragraph 1, “Statistical tests on parent features: we derive statistics on the inter-correlation of the parent features of f cand i,j , as well as their correlation with the remaining features of Fi. In addition, we generate information on the used operator(s)”, associated with the attempt to generate features after ranking with C; this is repeated iteratively, including a second attempt); and
generating, based on the second error type, a second validity parameter for the second feature, wherein the first validity parameter is different from the second validity parameter (iteratively generates a validity parameter for a feature including a second validity parameter for a second feature, Fig. 1, p. 981, col. 2, paragraph 2, “Once the meta-features are generated, we use the ranking classifier C to assign a score”).
Regarding Claim 10, Katz teaches the method of Claim 9. Katz further teaches:
determining that the first error type indicates a lack of new data for the first feature (the first type of error measures entropy and correlation, where low entropy and high correlation with other features represent data that is not new, p. 982, col. 1, paragraph 1, “derive statistics on f cand i,j ’s correlation to each groups. In addition, we derive entropy-based measures for f cand i,j”) and the second error type indicates an update failure for the second feature (meta-features, including the second error type of statistical tests on parent features, indicate an update failure because they are used to evaluate whether a candidate feature meaningfully improves the dataset, which contributes to determining whether the candidate is not added to the dataset, i.e., an update failure, p. 981, col. 2, paragraph 2, “generates a set of meta-features F meta i,j for each candidate feature f cand i,j . Once the meta-features are generated, we use the ranking classifier C to assign a score to each f cand”, p. 982, col. 1, paragraph 2, “estimate the likelihood of each f cand i,j ∈ Fcand i to reduce the error, based on its meta-features… Candidate features whose probability is below a predefined parameter thresholdf are filtered”); and
assigning the first validity parameter to the first feature and the second validity parameter to the second feature, wherein the first validity parameter indicates a higher degree of validity than the second validity parameter (features are assigned scores where higher scores indicate a higher degree of validity, p. 980, col. 1, paragraph 4, “In the candidate features ranking phase we assign a score to each candidate feature f cand i,j ∈ Fcand i based on its estimated contribution”).
Regarding Claim 11, Katz teaches the method of Claim 2. Katz further teaches:
wherein a first feature of the plurality of features comprises a transformation of a combination of a second feature and a third feature of the plurality of features (p. 979, col.1, paragraph 1, “transform each feature individually or combine several of them together”), and further comprising assigning, to the first feature, a first validity parameter based on a second validity parameter associated with the second feature and a third validity parameter associated with the third feature (validity parameter for candidate feature is based on parent features through meta features, p. 982, col. 1, paragraph 1, “Statistical tests on parent features: we derive statistics on the inter-correlation of the parent features of f cand i,j , as well as their correlation with the remaining features of Fi”, parent features are retained in feature set from validity parameters of previous iterations, p. 983, col. 1, paragraph 2, “We then define Fi+1 ← Fi ∪ {f select i } as the current features set for the next iteration and repeat the process described in this section”).
Regarding Claim 12, Katz teaches the method of Claim 2. Katz further teaches:
wherein generating the corresponding validity parameter for each of the plurality of features comprises:
determining, for each of the plurality of features, a corresponding ratio of the successful feature generation statuses to the corresponding subset of feature generation statuses (the ratio is determined when good and bad labels are assigned, p. 982, col. 2, paragraph 1, “assigning labels (“good” or “bad”) to each candidate feature generated for DT” and “We assign these labels to the meta-features and use the labeled set to train C”); and
based on the corresponding ratio for each of the plurality of features, generating the corresponding validity parameter for each of the plurality of features (validity parameter is based on estimated contribution to good and bad labels, p. 980, col. 1, paragraph 3, “we assign a score to each candidate feature f cand i,j ∈ Fcand i based on its estimated contribution”).
Regarding Claim 13, Katz teaches the method of Claim 2. Katz further teaches:
further comprising, for one or more features having one or more corresponding validity parameters not meeting a validity threshold, updating the dataset with one or more corresponding sets of historic values for the one or more features, wherein the one or more corresponding sets of historic values are associated with one or more corresponding historic validity parameters meeting the validity threshold (historical features that met the validity threshold in earlier iterations are retained and can be in the final dataset, while features that did not meet the threshold are excluded, Fig. 1, p. 980, col. 1, paragraph 3, “It is an iterative process where each iteration comprises of three phases: candidate features generation, candidate feature ranking, and candidate features evaluation & selection”, p. 982, col. 1, paragraph 2, “Candidate features whose probability is below a predefined parameter thresholdf are filtered”).
Regarding Claim 14, Katz teaches the method of Claim 2. Katz further teaches:
determining a residual value for the machine learning model (reduction in classification error is residual value, p. 980, col. 2, paragraph 2, “compute the reduction in classification error compared with F”);
determining whether the residual value meets a residual threshold (p. 980, col. 2, paragraph 2, “When the performance improvement exceeds a predefined threshold w”, Algorithm 1, line 14); and
in response to determining that the residual value meets the residual threshold, adjusting corresponding weights of the plurality of features (if feature meets threshold feature is selected, p. 980, col. 2, paragraph 2, “When the performance improvement exceeds a predefined threshold w, the evaluation process terminates and we select the current candidate feature, denoted as f select i . We define the joint set Fi ∪ {f select i } as the current feature set of the following iteration Fi+1”).
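The termination test cited in the mapping of Claim 14 (selecting the current candidate once the performance improvement exceeds a predefined threshold w) reduces to a simple comparison. The sketch below is illustrative only; the function and variable names are hypothetical and not taken from Katz or the claims.

```python
# Illustrative sketch (hypothetical names) of the cited termination
# test: a candidate is selected once the reduction in classification
# error exceeds the predefined threshold w.
def meets_improvement_threshold(error_before, error_after, w):
    """Return True when the reduction in classification error exceeds w."""
    return (error_before - error_after) > w

# Usage: error drops from 0.20 to 0.15 with w = 0.03, so the
# improvement (0.05) exceeds the threshold.
selected = meets_improvement_threshold(0.20, 0.15, 0.03)
```

In the cited process, a selected candidate would then be added to the feature set for the following iteration.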
Regarding Claim 15, Katz teaches:
One or more non-transitory, computer-readable media storing instructions that when executed by one or more processors (Katz performs their method on a computer in which processor, memory, and storage devices are inherent, p. 983, col. 1, paragraph 4, “We conducted our experiments on a 20-core machine with 64GB of RAM”) cause the one or more processors to perform operations comprising:
retrieving a first feature generation log comprising a plurality of first feature generation statuses for a first feature of a plurality of features, wherein the plurality of first feature generation statuses associated with the first feature indicates whether each first feature generation attempt was successful (the “good” and “bad” labels indicate whether first features should be retained after the classifier is trained, p. 982, col. 2, paragraph 1, “we label f cand,i T as “good” and “bad” otherwise” and “use the labeled set to train C… we created a ranking classifier”);
determining, for the first feature based on the plurality of first feature generation statuses, a first subset of the plurality of first feature generation statuses for assessing a first validity of the first feature (the first subset of features over time is based on the ranking made from feature statuses, p. 982, col. 2, paragraph 1, ““good” and “bad”… labeled set to train C”; after the ranking classifier is trained, subsets are chosen based on ranking, Fig. 1, Candidate Feature Ranking -> Candidate Feature Evaluation & selection -> Add selected Candidate to Generated Features set; the process is repeated with a new subset, and the first validity is assessed based on the first subset of the plurality of first feature generation statuses);
generating, for the first feature, a first validity parameter based on successful feature generation statuses of the first subset of the plurality of first feature generation statuses, wherein the first validity parameter indicates the first validity of the first feature (the score is a first validity parameter generated for features based on the subset of feature generation statuses, including successful (“good”) statuses, p. 982, col. 2, paragraph 1, ““good” and “bad”… labeled set to train C”; p. 980, col. 1, paragraph 4, “we assign a score to each candidate feature f cand i,j ∈ Fcand i based on its estimated contribution”; the score indicates a degree of the first validity because it is used to rank and decide whether a feature is worth keeping, p. 982, col. 1, paragraph 1, “Candidate features whose probability is below a predefined parameter thresholdf are filtered”);
removing, from the plurality of features within a dataset, the first feature based on the first validity parameter not meeting a validity threshold (p. 982, col. 1, paragraph 2, “Candidate features whose probability is below a predefined parameter thresholdf are filtered”);
inputting, into a training routine of a machine learning model, the dataset to train the machine learning model (p. 982, col. 2, paragraph 4, “we evaluate the performance of the classifier on the joint feature set Fi ∪ {f cand i,j }. The evaluation is conducted using k-fold cross validation”); and
initializing the machine learning model for use (p. 982, col. 2, paragraph 4, “we evaluate the performance of the classifier on the joint feature set Fi ∪ {f cand i,j }. The evaluation is conducted using k-fold cross validation”).
Regarding Claim 16, Katz teaches the one or more non-transitory, computer-readable media of Claim 15. Katz further teaches:
wherein the instructions further cause the one or more processors to perform operations comprising:
retrieving a plurality of feature generation logs comprising a plurality of feature generation statuses for generating one or more remaining features of the plurality of features, wherein each feature generation status indicates whether a corresponding feature generation attempt was successful (the “good” and “bad” labels indicate whether features should be retained after the classifier is trained, p. 982, col. 2, paragraph 1, “we label f cand,i T as “good” and “bad” otherwise” and “use the labeled set to train C… we created a ranking classifier”);
determining, for the one or more remaining features of the plurality of features based on a plurality of feature generation statuses, a corresponding subset of feature generation statuses for assessing validity of each of the plurality of features of the one or more remaining features (the subset of features over time is based on the ranking made from feature statuses, p. 982, col. 2, paragraph 1, ““good” and “bad”… labeled set to train C”; after the ranking classifier is trained, subsets are chosen based on ranking, Fig. 1, Candidate Feature Ranking -> Candidate Feature Evaluation & selection -> Add selected Candidate to Generated Features set; the process is repeated with a new subset);
generating, for each of the plurality of features of the one or more remaining features, a corresponding validity parameter based on successful feature generation statuses of the corresponding subset of feature generation statuses (the score is a validity parameter generated for features based on feature generation statuses, including successful (“good”) statuses, p. 982, col. 2, paragraph 1, ““good” and “bad”… labeled set to train C”; p. 980, col. 1, paragraph 4, “we assign a score to each candidate feature f cand i,j ∈ Fcand i based on its estimated contribution”; the score indicates a degree of validity because it is used to rank and decide whether a feature is worth keeping, p. 982, col. 1, paragraph 1, “Candidate features whose probability is below a predefined parameter thresholdf are filtered”);
generating, based on each corresponding validity parameter, a corresponding weight for each of the plurality of features of the one or more remaining features and assigning, to each of the plurality of features of the one or more remaining features within the dataset, the corresponding weight (ranking is weight corresponding to validity parameter, p. 980, col. 1, paragraph 4, “In the candidate features ranking phase we assign a score to each candidate feature f cand i,j ∈ Fcand i based on its estimated contribution and produce an ordered list of features RankedFcand”, Fig. 1, Candidate Feature Ranking);
updating the dataset based on corresponding weights of the one or more remaining features (p. 982, col. 1, paragraph 1, “create a ranked list RankedFcand i . Candidate features whose probability is below a predefined parameter thresholdf are filtered”).
Regarding Claim 17, the rejection of Claim 16 is incorporated and, further, the claim is rejected for the same reasons as set forth in Claim 9.
Regarding Claim 18, the rejection of Claim 17 is incorporated and, further, the claim is rejected for the same reasons as set forth in Claim 10.
Regarding Claim 19, the rejection of Claim 15 is incorporated and, further, the claim is rejected for the same reasons as set forth in Claim 11.
Regarding Claim 20, the rejection of Claim 15 is incorporated and, further, the claim is rejected for the same reasons as set forth in Claim 12.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JESSE CHEN COULSON whose telephone number is (571)272-4716. The examiner can normally be reached Monday-Friday 8:30-5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki can be reached at (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JESSE C COULSON/
Examiner, Art Unit 2122
/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122