DETAILED ACTION
This Office action is responsive to the request for continued examination filed 1/7/2026. Claims 1-20 are pending; all have been examined and stand rejected.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 12/2/2025 has been entered.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 4, 11, and 18 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention. Claims 4, 11, and 18 recite “the machine learning model”; it is unclear whether this limitation refers to the first machine learning model or the second machine learning model.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-4, 8-11, and 15-18 are rejected under 35 U.S.C. 103 as being unpatentable over “Permutation importance: a corrected feature importance measure” Published 2010 [hereinafter D1] in view of “Learning to Reweight Examples for Robust Deep Learning” Published 2018 [hereinafter D2].
With regard to Claim 1,
D1 teaches a method of training a machine learning model, executable by a processor, comprising:
identifying a feature associated with training data derived from a dataset (P. 1341, 2, ¶2, “The VI of a feature is computed as the average decrease in model accuracy on the OOB samples when the values of the respective feature are randomly permuted”, features identified from dataset by considering individual features as units whose importance is computed; VI computed per feature by permuting the feature and measuring accuracy, P. 1341, 2, ¶1, “random subset of a fixed size is selected from the features”);
generating a first machine learning model based on the training data (P. 1341, 2, ¶1, “T decision trees using the CART methodology (Breiman et al., 1984) are trained on T bootstrap samples of the data”), wherein training the machine learning model comprises:
training the machine learning model with the training data to emphasize a feature (P. 1341, 2, ¶1, “the one yielding the maximum decrease in Gini index is chosen for the split”, P. 1346, Sec. 5, ¶1, “We also introduced an improved RF model that is computed based on the most significant features determined with the PIMP algorithm”, P. 1342, 2.4, “by applying for example the classical 0.05 significance threshold. We will call the improved model PIMP-RF. The idea of using the most predictive features for retraining RF model in order to reduce variance and improve accuracy …”, RF model weights features via split criteria and later permutation importance identifies those emphasized features);
utilizing the first machine learning model to compute permutations to identify the training data so that the computing of the permutations results in identifying the feature being emphasized, wherein identifying the feature comprises:
evaluating feature importance within permutations of the training data (P. 1341, 2, ¶2, “The VI of a feature is computed as the average decrease in model accuracy on the OOB samples when the values of the respective feature are randomly permuted”);
selecting at least a portion of the training data associated with maximizing an importance value associated with the identified feature (P. 1345, 4.3, ¶1, “The RF trained on the top-ranking 1%, 5% and 10% of the features …”, P. 1346, 5, “improved RF model that is computed based on the most significant features determined with the PIMP algorithm”), wherein the importance value corresponds to a need associated with the machine learning model (P. 1340, Abstract, “improved RF model that uses the significant variables with respect to the PIMP measure and show that its prediction accuracy is superior to that of other existing models”, P. 1341, Col. 1, ¶3, “an improved RF model termed PIMP-RF whose computation is based on the significant features and which incurs clear improvement in prediction accuracy”, P. 1347, Col. 1, ¶1, “corrected RF model based on the PIMP scores of the features and we demonstrated that in most of the cases it is superior in accuracy to the cforest model”);
wherein updating the machine learning model comprises utilizing the permutations to build a new second machine learning model with the training data which emphasizes the identified feature (Abstract, “improved RF model that uses the significant variables with respect to the PIMP measure and show that its prediction accuracy is superior to that of other existing models”, P. 1341, Col. 1, ¶3, “an improved RF model termed PIMP-RF whose computation is based on the significant features and which incurs clear improvement in prediction accuracy”, P. 1347, “major drawback of the PIMP method is the requirement of time-consuming permutations of the response vector and subsequent computation of feature importance. However, our simulations showed that already a small number of permutations (e.g. 10) provided improvements over a biased base method. For stability of the results any number from 50 to 100 permutations is recommended”).
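For convenience of the record, the permutation-importance (VI) computation quoted from D1, namely the average decrease in model accuracy when a single feature's values are randomly permuted, may be illustrated by the following sketch. The code is the examiner's illustration only; the function name and identifiers are not taken from D1:

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Average decrease in accuracy when one feature's values are
    randomly permuted (cf. D1's VI measure)."""
    rng = np.random.default_rng(seed)
    baseline = np.mean(predict(X) == y)           # accuracy before permuting
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            rng.shuffle(Xp[:, j])                 # permute feature j only
            drops.append(baseline - np.mean(predict(Xp) == y))
        importances[j] = np.mean(drops)           # VI of feature j
    return importances
```

Retraining on only the highest-scoring features, as in D1's PIMP-RF, would then amount to keeping the columns with the largest returned importances.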
D1 does not explicitly teach assigning one or more weight values to the selected portion of the training data, wherein the weight values are used to emphasize the importance of the feature within the data; and updating the machine learning model based on the assigned weight values.
D2 teaches a method of training a machine learning model, executable by a processor, comprising:
identifying a feature associated with training data derived from a dataset (P. 2, 3.1, “Let (x; y) be an input-target pair, and {(xi; yi); 1 ≤ i ≤ N} be the training set”, P. 3, ¶2, “Let ϕ(x; ϴ) be our neural network model”);
generating a first machine learning model based on the training data (P. 3, Col. 1, ¶2, “In standard training, we aim to minimize the expected loss for the training set”), wherein training the machine learning model comprises:
training the machine learning model with the training data to emphasize a feature (P. 3, Col. 1, ¶3, “we aim to learn a reweighting of the inputs, where we minimize a weighted loss … since minimizing the negative training loss can usually result in unstable behavior”);
selecting at least a portion of the training data associated with maximizing an importance value associated with the identified feature (P. 2, Col. 1, ¶2, “the best example weighting should minimize the loss of a set of unbiased clean validation examples”, P. 3, Col. 1, ¶6, “reweight them according to their similarity to the descent direction of the validation loss surface”, Eq. 8), wherein the importance value corresponds to a need associated with the machine learning model (P. 3, ¶4, “optimal selection of w is based on its validation performance”);
assigning one or more weight values to the selected portion of the training data, wherein the weight values are used to emphasize the importance of the feature within the data (P. 1, Col. 2, ¶3, “assigning a weight to each example and minimizing a weighted training loss”, P. 4, Algorithm 1, step 11, P. 3, Col. 1, ¶3, “minimize a weighted loss … eq(1), P. 3, Col. 2, “rectify the output to get a non-negative weighting … eq(7) … eq(8)”, P. 3, “normalizing the weights of all examples in a training batch so that they sum up to one”); and
updating the machine learning model based on the assigned weight values (P. 3, update equations [reproduced as images media_image1.png and media_image2.png in the original action], P. 3, “normalizing the weights of all examples in a training batch so that they sum up to one”, P. 4, Algorithm 1, Steps 12-14, training update form; the weights change which examples contribute through the weighted objective), wherein updating the machine learning model comprises building a new second machine learning model with the training data which emphasizes the identified feature (P. 4, Algorithm 1, Step 14, optimizer step).
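The example-reweighting step mapped above from D2 may be illustrated by the following simplified single-step sketch for a linear least-squares model. It follows the intuition of D2 (rectify the alignment of each training example's gradient with the validation gradient, then normalize the batch weights to sum to one) but omits the second-order meta-gradient of D2's Algorithm 1; all identifiers are the examiner's illustration:

```python
import numpy as np

def reweighted_step(w, X_tr, y_tr, X_val, y_val, lr=0.1):
    """One weighted update in the spirit of D2: examples whose gradients
    align with the validation gradient get positive weight; the rest get
    zero. Weights are normalized to sum to one before the step."""
    # Per-example gradients of squared error for a linear model y_hat = X @ w.
    grads_tr = (X_tr @ w - y_tr)[:, None] * X_tr            # shape (n, d)
    grad_val = (X_val @ w - y_val) @ X_val / len(y_val)     # shape (d,)
    # Rectified alignment with the validation gradient, then normalization.
    eps = np.maximum(0.0, grads_tr @ grad_val)
    wts = eps / eps.sum() if eps.sum() > 0 else np.full(len(y_tr), 1.0 / len(y_tr))
    w_new = w - lr * (wts[:, None] * grads_tr).sum(axis=0)  # weighted update
    return w_new, wts
```

In this sketch, examples whose gradients oppose the validation descent direction receive zero weight, so the update emphasizes the clean portion of the training data.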
D1 and D2 are analogous art to the claimed invention because they are from a similar field of endeavor, namely training machine learning models. Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify D1 to incorporate the example reweighting disclosed by D2, with a reasonable expectation of success.
One of ordinary skill in the art would have been motivated to modify D1 as described above to include the ability to reweight examples to improve validation performance, using a novel meta-learning algorithm that learns to assign weights to training examples based on their gradient directions so as to minimize the loss on a clean, unbiased validation set, thereby achieving impressive performance on class-imbalance and corrupted-label problems where only a small amount of clean validation data is available (D2, Abstract).
With regard to Claim 2,
D1-D2 teach the method of claim 1, further comprising partitioning the dataset into the training data and testing data (D1, P. 1341, 2.1, “after each tree has been grown, the inputs that did not participate in the training bootstrap sample are used as test set”, “The VI of a feature is computed as the average decrease in model accuracy on the OOB samples when the values of the respective feature are randomly permuted”; D2, Algorithm 1, Steps 2-3, P. 2, Col. 1, Col. 2, ¶4, “in order to learn general forms of training set biases, it is necessary to have a small unbiased validation to guide training”, P. 6, “Clean validation set”, Col. 1-2, “Hyper-validation set For monitoring training progress and tuning baseline hyperparameters, we split out another 5,000 hyper-validation set from the 50,000 training images”, P. 2, Col. 2, ¶2, “This is reasonable since we are optimizing on the validation set, which is strictly a subset of the full training set, and therefore suffers from its own subsample bias”). The same motivation to combine set forth for claim 1 applies equally to the current claim.
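The bootstrap/out-of-bag partition quoted from D1, Sec. 2.1 (the inputs not drawn into a tree's bootstrap sample serve as that tree's test set) may be illustrated as follows; identifiers are the examiner's illustration, not D1's:

```python
import numpy as np

def bootstrap_oob_split(n, seed=0):
    """One bootstrap partition in the style of D1, Sec. 2.1: indices drawn
    with replacement train a tree; the never-drawn (out-of-bag, OOB)
    indices serve as that tree's test set."""
    rng = np.random.default_rng(seed)
    train = rng.integers(0, n, size=n)            # bootstrap sample, with replacement
    oob = np.setdiff1d(np.arange(n), train)       # indices never sampled
    return train, oob
```

On average roughly (1 - 1/n)^n ≈ 36.8% of the samples are never drawn and remain out-of-bag.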
With regard to Claim 3,
D1-D2 teach the method of claim 2, further comprising testing the updated machine learning model based on the testing data (D1, P. 1341, 2.1, “after each tree has been grown, the inputs that did not participate in the training bootstrap sample are used as test set, then averaging over all trees gives the test error estimate.”, “The VI of a feature is computed as the average decrease in model accuracy on the OOB samples when the values of the respective feature are randomly permuted”; D2, P. 3, “We can then look for the optimal that minimizes the validation loss fv locally at step t:”, Col. 1, Online approximation, “each training iteration, we inspect the descent direction of some training examples locally on the training loss surface and reweight them according to their similarity to the descent direction of the validation loss surface”). The same motivation to combine set forth for claim 1 applies equally to the current claim.
With regard to Claim 4,
D1-D2 teach the method of claim 1, further comprising determining an accuracy value associated with the machine learning model (D1, P. 1341, 2.1, “The VI of a feature is computed as the average decrease in model accuracy on the OOB samples when the values of the respective feature are randomly permuted”; D2, P. 6, Col. 1-2, “Hyper-validation set For monitoring training progress and tuning baseline hyperparameters”). The same motivation to combine set forth for claim 1 applies equally to the current claim.
With regard to Claim 8,
Claim 8 is similar in scope to claim 1; therefore, it is rejected under a similar rationale. D1-D2 further teach one or more computer-readable non-transitory storage media configured to store computer program code; and one or more computer processors configured to access said computer program code and operate as instructed by said computer program code (D1, P. 1342, 2.4, “(i) training a classical RF model on the training data; (ii) computing the PIMP scores of the covariates; and (iii) training a new model with the classical RF but now using only the significant variables”, P. 1343, 4, “4.1 Simulations”; D2, P. 3, Col. 1, ¶, “For most training of deep neural networks”, P. 4, Col. 1, “implemented using popular deep learning frameworks such as TensorFlow”, Algorithm 1, the algorithm must be stored in a memory and executed by a processor, “Training time Our automatic reweighting method will introduce a constant factor of overhead”).
With regard to Claim 15,
Claim 15 is similar in scope to claim 1; therefore, it is rejected under a similar rationale. D1-D2 further teach a non-transitory computer readable medium having stored thereon a computer program for training a machine learning model, the computer program configured to cause one or more computer processors (D1, P. 1342, 2.4, “(i) training a classical RF model on the training data; (ii) computing the PIMP scores of the covariates; and (iii) training a new model with the classical RF but now using only the significant variables”, P. 1343, 4, “4.1 Simulations”; D2, P. 3, Col. 1, ¶, “For most training of deep neural networks”, P. 4, Col. 1, “implemented using popular deep learning frameworks such as TensorFlow”, Algorithm 1, the algorithm must be stored in a memory and executed by a processor, “Training time Our automatic reweighting method will introduce a constant factor of overhead”).
With regard to Claim 9,
Claim 9 is similar in scope to claim 2; therefore it is rejected under similar rationale.
With regard to Claim 16,
Claim 16 is similar in scope to claim 2; therefore it is rejected under similar rationale.
With regard to Claim 10,
Claim 10 is similar in scope to claim 3; therefore it is rejected under similar rationale.
With regard to Claim 17,
Claim 17 is similar in scope to claim 3; therefore it is rejected under similar rationale.
With regard to Claim 11,
Claim 11 is similar in scope to claim 4; therefore it is rejected under similar rationale.
With regard to Claim 18,
Claim 18 is similar in scope to claim 4; therefore it is rejected under similar rationale.
Claims 5-6, 12-13, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over “Permutation importance: a corrected feature importance measure” Published 2010 [hereinafter D1] in view of “Learning to Reweight Examples for Robust Deep Learning” Published 2018 [hereinafter D2] in view of “Wrappers for feature subset selection” Published 1997 [hereinafter D3].
With regard to Claim 5,
D1-D2 teach the method of claim 1. The same motivation to combine set forth for claim 1 applies equally to the current claim.
D1-D2 do not explicitly teach that a portion of the training data is selected based on the accuracy value remaining above a threshold value.
D3 teaches that a portion of the training data is selected (D3, P. 44, “Aha and Bankert [2] used the wrapper for identifying feature subsets”) based on the accuracy value (P. 37, “accuracy is a natural performance metric, but one can trivially use a cost function instead of accuracy as the evaluation function for the wrapper”) remaining above a threshold value (P. 21, 3.3, “An improved node is defined as a node with an accuracy estimation at least E higher than the best one found so far”, P. 37, “cross-validation estimated its accuracy to be lower than the node with 97.22% test-set accuracy”).
D1-D2 and D3 are analogous art to the claimed invention because they are from a similar field of endeavor, namely selecting the best features for training machine learning models. Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify D1-D2 to incorporate the wrapper-based feature subset selection disclosed by D3, with a reasonable expectation of success.
One of ordinary skill in the art would have been motivated to modify D1-D2 as described above to maximize classification accuracy on an unseen test set by guiding the feature subset selection: instead of trying to maximize accuracy directly, identify which features are relevant and use only those features during learning, which increases accuracy and saves resources (D3, P. 2, ¶3).
With regard to Claim 6,
D1-D2-D3 teach the method of claim 5, further comprising stopping the machine learning model from updating based on the accuracy value falling below the threshold value (D3, P. 21, “if we have not found an improved node in the last k expansions, we terminate the search. An improved node is defined as a node with an accuracy estimation at least E higher than the best one found so far”; the wrapper method iteratively updates the model by retraining on modified training portions and ceases updating when no candidate portion yields accuracy exceeding the current threshold). The same motivation to combine set forth for claim 5 applies equally to the current claim.
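The wrapper search with an accuracy-improvement threshold quoted from D3 (terminate when no candidate node improves estimated accuracy by at least E over the best found so far) may be illustrated by the following greedy forward-selection sketch; the evaluation function and identifiers are the examiner's illustration, not D3's:

```python
def wrapper_forward_select(features, evaluate, eps=0.01):
    """Greedy forward search in the style of D3's wrapper: repeatedly add
    the feature whose subset scores highest, and stop when no candidate
    improves the estimated accuracy by at least eps (D3's "E")."""
    selected = []
    best_acc = evaluate([])              # accuracy with no features
    remaining = list(features)
    while remaining:
        scored = [(evaluate(selected + [f]), f) for f in remaining]
        acc, f = max(scored)             # best candidate this round
        if acc < best_acc + eps:         # no "improved node": terminate
            break
        selected.append(f)
        remaining.remove(f)
        best_acc = acc
    return selected, best_acc
```

The `evaluate` callable stands in for the induction algorithm's estimated accuracy on a candidate feature subset (e.g., via cross-validation in D3).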
With regard to Claim 12,
Claim 12 is similar in scope to claim 5; therefore it is rejected under similar rationale.
With regard to Claim 19,
Claim 19 is similar in scope to claim 5; therefore it is rejected under similar rationale.
With regard to Claim 13,
Claim 13 is similar in scope to claim 6; therefore it is rejected under similar rationale.
With regard to Claim 20,
Claim 20 is similar in scope to claim 6; therefore it is rejected under similar rationale.
Claims 7 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over “Permutation importance: a corrected feature importance measure” Published 2010 [hereinafter D1] in view of “Learning to Reweight Examples for Robust Deep Learning” Published 2018 [hereinafter D2] in view of “Four Principles of Explainable Artificial Intelligence” Published 2021 [hereinafter D4].
With regard to Claim 7,
D1-D2 teach the method of claim 1.
D1-D2 do not explicitly teach that the identified feature comprises one or more from among fidelity, completeness, stability, certainty, compactness, comprehensibility, actionability, interactivity, translucence, coherence, novelty, and personalization associated with the machine learning model.
D4 teaches that the identified feature comprises one or more from among fidelity (P. 11, ¶7, “the paper introduces faithfulness of an explanation as ... broadly beneficial for society provided that explanations given are faithful, in the sense that they accurately convey a true understanding without hiding important details”), completeness, stability, certainty, compactness, comprehensibility, actionability, interactivity, translucence, coherence, novelty, and personalization associated with the machine learning model; because the claim recites these alternatives in the form “one or more from among,” the teaching of fidelity satisfies the limitation.
D1-D2 and D4 are analogous art to the claimed invention because they are from a similar field of endeavor, namely improving the training of machine learning models via evaluation and selection mechanisms. Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify D1-D2 to incorporate the explainability quality metrics disclosed by D4, with a reasonable expectation of success.
One of ordinary skill in the art would have been motivated to modify D1-D2 as described above to evaluate the identified features according to known explainability quality metrics such as fidelity, since the characteristics that support system trustworthiness include accuracy, privacy, reliability, robustness, safety, security (resilience), mitigation of harmful bias, transparency, fairness, and accountability (D4, ii, Executive Summary, ¶1).
With regard to Claim 14,
Claim 14 is similar in scope to claim 7; therefore it is rejected under similar rationale.
Response to Amendment
Applicant’s arguments, see Remarks, P. 13-15, filed 12/2/2025, with respect to how the current invention represents several improvements to the technology have been fully considered and are persuasive. The rejection of claims 1-20 under 35 U.S.C. 101 has been withdrawn.
Applicant’s arguments with respect to claims 1-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Conclusion
The prior art made of record and not relied upon is considered pertinent to the applicant’s disclosure.
US Patent Application Publication No. 2021/0390458 A1 to Blumstein et al., which teaches determining feature importance using permutation importance analysis. See at least ¶96: “Determining the ‘feature importance’ of various features may involve permutation importance analysis.”
Examiner has pointed out particular passages in the prior art of record in the body of this action for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claims, other passages and figures may apply as well. Applicant is respectfully requested, in preparing the response, to consider fully the entire references as potentially teaching all or part of the claimed invention, as well as the context of each passage as taught by the prior art or discussed by the examiner. It is noted that any citation to specific pages, columns, figures, or lines in the prior art references, and any interpretation of the references, should not be considered to be limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. In re Heck, 699 F.2d 1331, 1332-33, 216 USPQ 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 USPQ 275, 277 (CCPA 1968)).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMED ABOU EL SEOUD whose telephone number is (303)297-4285. The examiner can normally be reached Monday-Thursday 9:00am-6:00pm MT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michelle Bechtold can be reached at (571) 431-0762. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MOHAMED ABOU EL SEOUD/Primary Examiner, Art Unit 2148