DETAILED ACTION
Continued Examination Under 37 CFR 1.114
The following is a NON-FINAL Office action upon examination of application number 18/470,482, filed on 09/20/2023, in response to Applicant’s Request for Continued Examination (RCE) filed on January 23, 2026. Claims 1-4, 6-14, 16-18, and 20 are pending in this application and have been examined on the merits as discussed below.
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
In the response filed January 23, 2026, Applicant amended claims 1, 4, 6, 8, 12, 14, and 17-18, and canceled claims 5, 15, and 19. No new claims were presented for examination.
4. Applicant's amendments to the claims are hereby acknowledged. The amendments are not sufficient to overcome the previously issued claim rejection under 35 U.S.C. 101; accordingly, this rejection has been maintained.
Response to Arguments
5. Applicant's arguments filed December 17, 2025, have been fully considered.
6. Applicant submits “To advance prosecution, Applicant has amended independent claims 1, 12, and 17 to include additional limitations that Applicant asserts incorporate the alleged abstract ideas into a practical application, specifically an improvement to how datasets are managed for an enterprise environment. The Specification discusses why there is a need for improvements in maintaining model accuracy for implementing policies when features from large datasets are deleted in an enterprise environment, at least, at 013.” [Applicant’s Remarks, 12/17/2025, page 11]
In response to Applicant’s argument that “applicant has amended independent claims 1, 12, and 17 to include additional limitations that Applicant asserts incorporate the alleged abstract ideas into a practical application,” the Examiner respectfully disagrees. Under Step 2A Prong Two of the eligibility inquiry, any additional elements are evaluated individually and in combination to determine whether they integrate the judicial exception into a practical application. Exemplary considerations that may be indicative of a practical application include: an additional element that reflects an improvement to the functioning of a computer or to any other technology or technical field; applying the exception with, or by use of, a particular machine; applying the judicial exception to effect a particular treatment or prophylaxis for a disease or medical condition; effecting a transformation of a particular article to a different state or thing; and applying or using the judicial exception in some other meaningful way beyond generally linking its use to a particular technological environment.
In this instance, the additional elements recited in exemplary claim 1 include: an enterprise environment comprising a processor communicatively coupled to a memory, a policy knowledge base, and a machine unlearning model. These elements have been considered individually and in combination; however, these computing elements amount to using a generic computer programmed with computer-executable instructions/software to perform the abstract idea, similar to adding the words “apply it” (or an equivalent), which merely serves to link the use of the judicial exception to a particular technological environment and is not sufficient to amount to a practical application, as noted in MPEP 2106. See also MPEP 2106.05(f) and 2106.05(h). Furthermore, these additional elements fail to provide an improvement to the functioning of a computer or to any other technology or technical field, fail to apply the exception with a particular machine, fail to apply the judicial exception to effect a particular treatment or prophylaxis for a disease or medical condition, fail to effect a transformation of a particular article to a different state or thing, and fail to apply or use the abstract idea in a meaningful way beyond generally linking the use of the judicial exception to a particular technological environment. Instead, these elements amount to generic computing devices used as tools to implement the abstract idea, which does not constitute a technological improvement or otherwise indicate a practical application. See MPEP 2106.05(f).
It is not clear how the claimed limitations provide an actual improvement to another technology or technical field, an improvement to the functioning of the computer itself, or meaningful limitations beyond generally linking the use of the abstract idea to a particular technological environment. The claims do not adequately explain how the additional elements impose any meaningful limits on the abstract idea. At most, the claimed invention appears to provide an improvement beneficial to end users; the focus of the claims is not on an improvement in computers as tools, but on certain independently abstract ideas that use computers as tools. Even reviewing Applicant’s Specification (which describes the hardware and software), it is not made clear how the hardware and software result in an improvement to the technology or hardware itself. Rather, Applicant’s invention is directed towards providing business solutions to business problems, not technical solutions to technical problems. Accordingly, the claimed invention does not provide an improvement to another technology/technical field or to the functioning of the computer itself, and no meaningful limitations beyond generally linking the use of the abstract idea to a particular technological environment are evident in the claims.
It is noted that Applicant’s claims are devoid of any discernible change, transformation, or improvement to a computer (software or hardware) or any existing technology. Applicant has not shown that any specific technological improvement is achieved within the scope of the claims. It bears emphasis that no processor, memory, knowledge base, or technological elements are modified or improved upon in any discernible manner. Instead, the result produced by the claims is simply information including a recommendation, which is not a technical result or improvement thereof.
Lastly, while Applicant asserts that the amendments improve how datasets are managed and dynamically maintain model accuracy in enterprise environments, it is noted that the alleged improvements are directed to improving decision making regarding feature deletion and informing users of impact. These are improvements in information analysis and business operations, not improvements to computer functionality itself. For the reasons above, this argument is found unpersuasive.
7. Applicant submits “the amended independent claims claim improved techniques for automatically removing features from enterprise datasets with efficient and selective use of human-in-the-loop training. It would be impossible to provide these improvements without the use of a computer, particularly in the context of the vast datasets used by enterprise environments, as claimed in the amended independent claims. The amended independent claims include steps providing the improvements described above.” [Applicant’s Remarks, 12/17/2025, page 12]
The Examiner respectfully disagrees. In response to Applicant’s argument that “it would be impossible to provide these improvements without the use of a computer, particularly in the context of the vast datasets used by enterprise environments,” it is noted that the proper inquiry under §101 is not whether a computer is necessary to efficiently execute the steps, but whether the claims are directed to an improvement in computer functionality itself. Performing data analysis, impact prediction, reward calculation, and conditional decision making on datasets using generic processors and machine learning models constitutes implementation of an abstract idea on a computer, which does not render the claims eligible under 35 U.S.C. 101.
Furthermore, Applicant’s assertion that the amendments provide “improved techniques” for selective human-in-the-loop training is also unpersuasive because the claims recite the results of analysis and decision making without specifying a particular technological mechanism that improves processor operation, memory structure, or machine unlearning technology. The claimed steps use conventional computing components to carry out abstract data processing and policy evaluation. Therefore, the claims remain directed to an abstract idea and do not integrate the judicial exception into a practical application or amount to significantly more under §101.
For the reasons above, in addition to the reasons provided in the updated §101 rejection below, Applicant’s amendments and supporting arguments are not sufficient to overcome the §101 rejection.
8. Applicant submits “that the combination of Yuan, Fani, and Jain does not teach or render obvious the process of managing datasets for enterprise environments claimed in the amended independent claims. For example, the references do not teach operations including carrying out association rule-based mapping to map models trained on the datasets to policies of the enterprise based on how the models are used to implement the policies, receiving a request to remove at least one feature from the datasets automatically generated according to a data retention policy, identifying a policy associated with the feature(s) using an ARM model, identifying "a model from the models that is used to implement the associated policy; predicting using the ARM model, a level of impact removing the at least one feature will have on the identified policy;" generating performance scores for the model with and without the feature(s), calculating a reward function for the feature(s) "based on the level of impact and the performance scores;" in response to determining that the reward function is above a threshold, generating instructions to carry out human-in-the-loop training on the identified model, and "automatically applying, by the processor and in response to the request, a machine unlearning model to the identified model to remove the at least one feature," as claimed in the amended independent claims.” [Applicant’s Remarks, 12/17/2025, page 14]
In response to Applicant’s argument that the combination of Yuan, Fani, and Jain does not teach or render obvious the process of managing datasets for enterprise environments claimed in the amended independent claims (reproduced in full above), it is noted that this argument is a mere allegation of patentability by the Applicant with no supporting rationale or explanation. Merely stating that the references do not teach a feature does not offer any insight as to why the specific sections of the prior art relied upon by the Examiner fail to disclose the claimed features. Applicant's arguments amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references.
Moreover, the Examiner notes that the limitations argued by Applicant were newly added to the claims in the response filed 12/17/2025 and have been addressed in the updated rejection below. Applicant’s argument has been considered, but it pertains to amendments to independent claim 1 that are addressed via the new ground of rejection under §103 set forth in the instant Office action, which incorporates a new reference and new citations to address the amended limitations in claim 1 and supports a conclusion of obviousness of the amended claims.
9. Applicant’s remaining arguments either logically depend from the arguments addressed above, in which case they too are unpersuasive for the reasons set forth above, or they are directed to features which have been newly added via amendment. This Office action is therefore the Examiner’s first opportunity to consider those newly added limitations, and a full rejection addressing them is presented later in this Office action.
Claim Rejections - 35 USC § 112
10. The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
11. Claims 1-4 and 6-11 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for pre-AIA applications, the applicant) regards as the invention.
12. Claim 1 was amended to recite “applying, by the processor and in response to the instruction…” The phrase “the instruction” lacks antecedent basis and therefore renders the claim indefinite. While claim 1 recites “generating instructions” (i.e., plural), claim 1 does not introduce “an instruction.” However, in view of corresponding language in independent claims 12 and 17, which recite “applying the machine unlearning model” “in response to the request,” the term “the instruction” in claim 1 will be interpreted as referring to “the request.” Appropriate correction is required.
13. Claim 8 was amended to recite “The method of claim 1, further comprising: receiving a user-input request to remove a next at least one feature; calculating a reward function for the next at least one feature; in response to determining that the reward function for the next at least one feature is greater than the threshold value, requiring approval before removing the next at least one feature; removing the at least one feature in response to the user inputting the approval; and retraining the model.” Specifically, the claim recites “removing the at least one feature in response to the user inputting the approval,” after introducing “a next at least one feature.” It is unclear whether “the at least one feature” refers to the feature recited in independent claim 1 or the “next at least one feature” introduced in claim 8. This inconsistency creates ambiguity as to which feature is being removed, therefore rendering the claim indefinite. Appropriate correction is required.
14. All claims dependent from above rejected claims are also rejected due to dependency.
Claim Rejections - 35 USC § 101
15. 35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
16. Claims 1-4, 6-14, 16-18, and 20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The eligibility analysis in support of these findings is provided below, in accordance with MPEP 2106.
With respect to Step 1 of the eligibility inquiry (as explained in MPEP 2106), it is first noted that the claims are directed to a method (claims 1-4, 6-11), a system (claims 12-14, 16), and a computer program product (claims 17-18, 20), each of which falls within at least one potentially eligible category of subject matter (i.e., process, machine, and article of manufacture, respectively). Paragraph 0062 of the Specification indicates: “A computer readable storage medium, as that term is used in the present disclosure, is not to be construed as storage in the form of transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide, light pulses passing through a fiber optic cable, electrical signals communicated through a wire, and/or other transmission media.” Thus, Step 1 of the Subject Matter Eligibility test for claims 1-4, 6-14, 16-18, and 20 is satisfied.
With respect to Step 2A Prong One, it is next noted that the claims recite an abstract idea that falls into the “Mental Processes” grouping set forth in MPEP 2106, because the claims recite steps that can be performed in the human mind (including observation, evaluation, judgment, and opinion), and also falls into the “Mathematical Concepts” grouping, which covers mathematical relationships, formulas, and calculations. With respect to independent claim 1, the limitations reciting the abstract idea are as follows: obtaining, by the processor, models pretrained on sets of features from the datasets; mapping, by the processor, policies of the enterprise environment to the models, wherein the mapping comprises: accessing a policy knowledge base storing information about how the models are used to implement the policies; and carrying out association rule-based mapping to map each of the models to at least one corresponding policy from the set of policies based on the accessed information; receiving, by the processor, a request to remove at least one feature from the sets of features, wherein the request is automatically generated according to a data retention policy; in response to the receiving, identifying, by the processor using an association rule mining (ARM) model, a policy from the set of policies that is associated with the at least one feature; identifying, by the processor and based on the mapping, a model from the models that is used to implement the associated policy; predicting, by the processor using the ARM model, a level of impact removing the at least one feature will have on the identified policy; generating, by the processor using a performance evaluation model, performance scores for the model with and without the at least one feature; calculating, by the processor, a reward function for the at least one feature based on the level of impact and the performance scores; determining,
by the processor, that the reward function is greater than a threshold value; generating, by the processor and in response to the determining, a recommendation for a user, wherein the generating the recommendation comprises generating instructions to carry out human-in-the-loop training on the model; and automatically applying, by the processor and in response to the instruction, a machine unlearning model to the model to remove the at least one feature. These steps encompass mental processes since they may be accomplished by human judgment or evaluation, such as with the aid of pen and paper. The generating, calculating, and determining steps recite mathematical concepts, relationships, formulas or equations, or calculations.
Therefore, because the limitations above set forth activities falling within the “Mental Processes” and “Mathematical Concepts” abstract idea groupings described in MPEP 2106, the additional elements recited in the claims are further evaluated, individually and in combination, under Step 2A Prong Two and Step 2B below. Independent claims 12 and 17 recite similar limitations as those discussed above and are therefore found to recite the same or substantially the same abstract idea as claim 1.
With respect to Step 2A Prong Two, the judicial exception is not integrated into a practical application. With respect to independent claims 1/12/17, the additional elements are: the enterprise environment comprising a processor communicatively coupled to a memory, a policy knowledge base, and a machine unlearning model (claim 1); a memory, a processor communicatively coupled to the memory, the enterprise environment, a policy knowledge base, and a machine unlearning model (claim 12); and a computer readable storage medium having program instructions embodied therewith, a processor, a device, the enterprise environment, a policy knowledge base, and a machine unlearning model (claim 17). These additional elements have been evaluated, but fail to integrate the abstract idea into a practical application because they amount to using generic computing elements or computer-executable instructions (software) to perform the abstract idea, similar to adding the words “apply it” (or an equivalent), and merely serve to link the use of the judicial exception to a particular technological environment. See MPEP 2106.05(f) and 2106.05(h). Even if the obtaining and receiving steps are not deemed part of the abstract idea, these steps are at most directed to insignificant extra-solution activity, which is not sufficient to amount to a practical application. See MPEP 2106.05(g). In addition, these limitations fail to provide an improvement to the functioning of a computer or to any other technology or technical field, fail to apply the exception with a particular machine, fail to apply the judicial exception to effect a particular treatment or prophylaxis for a disease or medical condition, fail to effect a transformation of a particular article to a different state or thing, and fail to apply or use the abstract idea in a meaningful way beyond generally linking the use of the judicial exception to a particular technological environment.
Accordingly, because the Step 2A Prong One and Prong Two analysis resulted in the conclusion that the claims are directed to an abstract idea, additional analysis under Step 2B of the eligibility inquiry must be conducted in order to determine whether any claim element or combination of elements amount to significantly more than the judicial exception.
With respect to Step 2B of the eligibility inquiry, it has been determined that the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. With respect to independent claims 1/12/17, the additional elements are: the enterprise environment comprising a processor communicatively coupled to a memory, a policy knowledge base, and a machine unlearning model (claim 1); a memory, a processor communicatively coupled to the memory, the enterprise environment, a policy knowledge base, and a machine unlearning model (claim 12); and a computer readable storage medium having program instructions embodied therewith, a processor, a device, the enterprise environment, a policy knowledge base, and a machine unlearning model (claim 17). These elements have been considered individually and in combination, but fail to add significantly more to the claims because they amount to using generic computing elements or instructions (software) to perform the abstract idea, similar to adding the words “apply it” (or an equivalent), and merely serve to link the use of the judicial exception to a particular technological environment, which does not amount to significantly more than the abstract idea itself. Notably, Applicant’s Specification suggests that virtually any type of computing device can be used to implement the claimed invention (Specification at paragraph [0022]). Accordingly, the generic computer involvement in performing the claim steps merely serves to generally link the use of the judicial exception to a particular technological environment, which does not add significantly more to the claim. See, e.g., Alice Corp., 134 S. Ct. 2347, 110 USPQ2d 1976. Next, the obtaining and receiving steps are considered insignificant extra-solution activity, which has been recognized as well-understood, routine, and conventional, and is thus insufficient to add significantly more to the abstract idea. See MPEP 2106.05(d).
In addition, when taken as an ordered combination, the claim elements add nothing that is not already present when the elements are taken individually. There is no indication that the combination of elements integrates the abstract idea into a practical application; their collective functions merely provide generic computer implementation. Therefore, when viewed as a whole, these additional claim elements do not provide meaningful limitations sufficient to transform the abstract idea into a practical application or, as an ordered combination, to amount to significantly more than the abstract idea itself.
Dependent claims 2-4, 6-11, 13-14, 16, 18, and 20 recite the same abstract idea as recited in the independent claims, and when evaluated under Step 2A Prong One are found to merely recite details that serve to narrow the same abstract idea recited in the independent claims, accompanied by the same generic computing elements or software as those addressed above in the discussion of the independent claims, which is not sufficient to amount to a practical application or add significantly more, or by other additional elements that fail to amount to a practical application or add significantly more, as noted above. In particular, dependent claims 2-4 and 6-11 recite “wherein the calculating the reward function comprises determining support for an association between the at least one feature and the policy using rule learning to find relationships among variables in the sets of features,” “wherein the calculating the reward function further comprises calculating a confidence for the association,” “wherein the generating the recommendation further comprises notifying the user that a difference between the performance scores for the model with and without the at least one feature is greater than a threshold difference,” “wherein the generating the recommendation further comprises notifying the user of the policy,” “further comprising mapping the sets of features to the models,” “further comprising: receiving a user-input request to remove a next at least one feature; calculating a reward function for the next at least one feature; in response to determining that the reward function for the next at least one feature is greater than the threshold value, requiring approval before removing the next at least one feature; removing the at least one feature in response to the user inputting the approval; and retraining the model,” “further comprising: receiving multiple requests to delete selected features from the sets of features; and in response to the receiving the multiple requests, generating a next recommendation for the user,” “wherein the next recommendation comprises a suggestion that the selected features be deleted as a batch,” and “wherein the set of policies comprises business rules”; however, these limitations also recite steps that may be accomplished mentally, such as via human observation, perhaps with the aid of pen and paper, and further narrow the abstract ideas recited in independent claim 1 by reciting additional details or steps that set forth mathematical relationships, formulas, and calculations, which therefore fall under the “Mathematical Concepts” grouping. The other dependent claims have been evaluated as well, but, similar to claims 2-4 and 6-11, these claims also recite details of the abstract ideas themselves accompanied by, at most, generic computer implementation, which is not enough to transform the claims into a practical application of the abstract idea or amount to significantly more than the abstract idea itself. See MPEP 2106.05(f),(h). See also Alice Corp., 134 S. Ct. 2347, 110 USPQ2d 1976. Dependent claim 8 recites the additional element of retraining the model. However, when evaluated under Step 2A Prong Two and Step 2B, this additional element does not amount to a practical application or significantly more, since it merely requires generic computing devices (or computer-implemented instructions/code), which, as noted in the discussion of the independent claims above, is not enough to render the claims eligible.
Even if the retraining of the model were evaluated as an element beyond software/code for a generic computer to execute, it is noted that the claimed retraining is recited at a high level of generality and amounts to well-understood, routine, and conventional activity in the art, which fails to add significantly more to the claims. See, e.g., Hughes et al., US 2024/0020698 A1 (paragraph 0006: “Conventional techniques for machine learning (ML) model re-training on datasets that may be more reflective of recent trends and patterns”).
The ordered combination of elements in the dependent claims (including the limitations inherited from the parent claim(s)) adds nothing that is not already present when the elements are taken individually. There is no indication that the combination of elements improves the functioning of a computer or improves any other technology; their collective functions merely provide generic computer implementation. Accordingly, the subject matter encompassed by the dependent claims fails to amount to a practical application or significantly more than the abstract idea itself.
For more information, see MPEP 2106.
Claim Rejections - 35 USC § 103
17. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
18. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
19. The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
20. This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
21. Claims 1, 6-9, 11-12, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Yuan et al., Pub. No.: US 2020/0334492 A1, [hereinafter Yuan], in view of Fani Sani et al., Pub. No.: US 2024/0028996 A1, [hereinafter Fani], in further view of Jain et al., Patent No.: US 7,831,613 B2, [hereinafter Jain], and in further view of Bannihattu Kumar et al., Pub. No.: US 2024/0202587 A1, [hereinafter Kumar].
As per claim 1, Yuan teaches a method for managing datasets in an enterprise environment (paragraph 0014: “a computer-implemented method for determining an influence of a component of an input on a prediction generated according to a machine learning model”),
the enterprise environment comprising a processor communicatively coupled to a memory (paragraph 0036, discussing a computing system comprising one or more processors; paragraph 0115, discussing that the computing system comprises a processor coupled to a mass storage unit and accessing a working memory; paragraph 0116), and the method (paragraph 0014) comprising:
obtaining, by the processor, models pretrained on sets of features from the datasets (paragraph 0013, discussing that existing approaches rely on removing individual latent features and measuring the influence on the model predictions. The embodiments described rely on the removal or adaptation of meaningful components from the input/observed data leading to the removal or modification of combinations of multiple latent features (after the adapted input has been embedded onto the latent space); paragraph 0053, discussing that machine learning models are based on observed data, as they are models having parameters that have been fit to the observed data based on a number of training steps; paragraph 0060, discussing a method of determining the importance of observed features on predictions made by a machine learning model; paragraph 0076, discussing that the methods relate generally to identifying the influential components of observed data that contributed towards a prediction; paragraph 0045);
receiving, by the processor, a request to remove at least one feature from the sets of features (paragraph 0013, discussing that the embodiments described rely on the removal of meaningful components from the input/observed data leading to the removal or modification of combinations of multiple latent features; paragraph 0056, discussing that the calculation of importance based on the processed input X′ requires access to the latent variables x′. For instance, if determining the importance of a latent variable, the latent variable may be deleted and the system may be retrained without using this latent variable to determine its influence on the prediction; paragraph 0095, discussing that for each identified component, an adjusted, or perturbed, input is produced. The adjusted input is adjusted through the adjustment of the identified component within the input. In this case, the adjusted input is produced through the ablation (deletion or removal) of the component from the input text);
generating, by the processor using a performance evaluation model, performance scores for the model with and without the at least one feature (paragraph 0018, discussing that the difference in the measure of confidence in the first prediction and the measure of confidence in the second prediction is a difference relative to the measure of confidence in the first prediction. That is, calculating the difference between the measure of confidence in the first prediction and the measure of confidence in the second prediction might comprise subtracting the measure of confidence in the second prediction from the measure of confidence in the first prediction to determine a change in the measure of confidence, and dividing the change in the measure of confidence by the measure of confidence in the first prediction to obtain the difference in the measure of confidence. Taking the relative difference allows the influence score to be comparable across different models and observations; paragraph 0056, discussing that the calculation of importance based on the processed input X′ requires access to the latent variables x′. For instance, if determining the importance of a latent variable, the latent variable may be deleted and the system may be retrained without using this latent variable to determine its influence on the prediction; paragraph 0057, discussing that calculating the importance of the latent variables on the model determines the importance of a given attribute on a prediction; paragraph 0070, discussing that the difference in accuracy score might also be utilised, taking the relative difference allows the influence score to be comparable across different models and observations; paragraphs 0019, 0063);
calculating, by the processor, a reward function for the at least one feature based on the level of impact and the performance scores (paragraph 0002, discussing that machine learning methods generally aim to make predictions based on models that have been trained based on observed (training) data. Generally, machine learning training methods adjust the parameters of a given model in an attempt to minimize some form of loss function or maximize some form of reward function based on predictions made by the model. One example of this is the adjustment of parameters to minimize the prediction error of the model; paragraph 0021, discussing that the first prediction is a first action and the second prediction is a second action and the measure of confidence in the first prediction is a reward for a first action and the measure of confidence in the second prediction is a reward for the second action. Accordingly, the method may be applied to determine influence on a machine learning agent configured to take actions in response to an input. The agent may have been trained via reinforcement learning. The rewards may be determined by a reward function. Equally, the rewards may be losses calculated through a loss function; paragraph 0052, discussing that each prediction has an associated measure of confidence (for instance, a classification confidence score, a prediction accuracy, or a reward for predicting a particular action). This represents the confidence that the prediction is accurate or correct. It can therefore be considered an accuracy score; paragraph 0111, discussing that each instance might have a different impact on the prediction depending on its context within the input);
determining, by the processor, that the reward function is greater than a threshold value (paragraph 0002, discussing that machine learning methods generally aim to make predictions based on models that have been trained based on observed (training) data. Generally, machine learning training methods adjust the parameters of a given model in an attempt to minimize some form of loss function or maximize some form of reward function based on predictions made by the model. One example of this is the adjustment of parameters to minimize the prediction error of the model; paragraph 0021, discussing that the first prediction is a first action and the second prediction is a second action and the measure of confidence in the first prediction is a reward for a first action and the measure of confidence in the second prediction is a reward for the second action. Accordingly, the method may be applied to determine influence on a machine learning agent configured to take actions in response to an input. The agent may have been trained via reinforcement learning. The rewards may be determined by a reward function. Equally, the rewards may be losses (through the provision of negative rewards) calculated through a loss function; paragraph 0052, discussing that the prediction Y might be a set of one or more confidence scores for classification, might be an action for application to an environment, or might be a synthetically generated data point. The prediction Y is a single instance of predicted data. The prediction includes one or more predicted values. That is, the prediction Y comprises the set of m predicted values. Each prediction has an associated measure of confidence (for instance, a classification confidence score, a prediction accuracy, or a reward for predicting a particular action). This represents the confidence that the prediction is accurate or correct. It can therefore be considered an accuracy score.); and
generating, by the processor and in response to the determining, a recommendation for a user (paragraph 0073, discussing that identifying influential components can help users to debug or further improve the machine learning model. For instance, if a classifier is producing classifications that appear to be erroneous (or at least anomalous), identifying the influential components that caused these erroneous classifications can help a user to assess whether the data is indeed erroneous (e.g. through comparison to the influential components within the data); paragraph 0074, discussing that identifying the influential component within the input data that caused the classification can help the user determine whether the classification is correct. For instance, in a classifier that attempts to identify malicious emails an email may appear on the face of it to be benign but might have a difficult to identify issue (such as an incorrect URL that directs the user to a malicious site). The methods described are able to direct the user's attention to the most important component within the observed data to help assess the accuracy of the classification; paragraph 0114, discussing that the influence scores can be used to advise users as to how to improve the machine learning model or how to achieve improved results).
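For illustration of the relative confidence difference Yuan describes at paragraph 0018 (subtracting the second measure of confidence from the first and dividing by the first), together with the threshold comparison recited in the claim, a minimal sketch follows. The function name, confidence values, and threshold value below are hypothetical and are not drawn from Yuan or from the claims:

```python
def relative_influence(conf_with, conf_without):
    """Relative difference in prediction confidence, per Yuan's
    description: (c1 - c2) / c1, where c1 is the confidence with the
    feature present and c2 the confidence with the feature removed."""
    return (conf_with - conf_without) / conf_with

# Hypothetical example: confidence drops from 0.90 to 0.72 when the
# feature is removed, giving a relative influence of 0.18 / 0.90 = 0.2.
score = relative_influence(0.90, 0.72)

THRESHOLD = 0.15  # hypothetical threshold value
if score > THRESHOLD:
    recommendation = "feature is influential; review before removal"
```

Taking the relative (rather than absolute) difference is what, per Yuan, allows the resulting influence score to be compared across different models and observations.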
Yuan does not explicitly teach mapping, by the processor, policies of the enterprise environment to the models, wherein the mapping comprises: accessing a policy knowledge base storing information about how the models are used to implement the policies; and carrying out association rule-based mapping to map each of the models to at least one corresponding policy from the set of policies based on the accessed information; wherein the request is automatically generated according to a data retention policy; in response to the receiving, identifying, by the processor using an association rule mining (ARM) model, a policy from the set of policies that is associated with the at least one feature; identifying, by the processor and based on the mapping, a model from the models that is used to implement the associated policy; predicting, by the processor using the ARM model, a level of impact removing the at least one feature will have on the identified policy; wherein the generating the recommendation comprises generating instructions to carry out human-in-the-loop training on the model; and automatically applying, by the processor and in response to the instruction, a machine unlearning model to the model to remove the at least one feature. Fani in the analogous art of feature removal systems teaches:
mapping, by the processor, policies of the enterprise environment to the models (paragraph 0019, discussing that the performance of root cause analysis (RCA) may be improved, for example, by selecting and/or excluding features based on at least one of correlation, granularity, and/or relevance thresholds. An RCA engine may perform RCA on a set of features to generate a set of RCA rules (e.g., actions triggered by conditions). A visualizer may generate a visualization of the RCA rules. Visualizations may be interactive. For example, users may select a condition to view actions triggered by the condition. Rules may be developed by modifying feature values. Users may provide target goals to improve a process, for example, before and/or after visualization. The RCA engine may perform an RCA based on modified feature values to achieve the target goal and present (e.g., visualize) target modifications; paragraph 0060, discussing that insightful conditions may be determined, for example, using subgroup discovery techniques. Actions may be determined for a (e.g., each) condition. For example, actions with the highest positive/negative impact may be determined. In an example, the rule generator may receive a preselected, labeled feature table. Rule generator may discover conditions based on the preselected, labeled feature table, for example, using subgroup discovery. The rule generator may discover actions (e.g., perform action discovery), for example, using one or more techniques, such as tree-based decision methods. The rule generator may generate one or more RCA rules (e.g., based on the label value(s), the condition(s) and the action(s); paragraph 0087, discussing that one or more RCA rules may be generated based on the selected features and label(s). RCA rules may be generated by an RCA engine, which may be or may include a machine learning (ML) model. 
Rule generation may be based on visualization of rules, e.g., to show the impact of features on the target, which form the basis of rules. RCA generation may be based on, for example, a Shapley technique, subgroup discovery, uplifting trees, etc., alone or in combination. A Shapley value may be computed by making changes in input features to determine how the changes to the input features correspond to a model prediction. A Shapley value of a feature may be calculated as an average marginal contribution to the overall model score. An RCA rule may indicate, for example, that there is a delay in 85% of cases where middle-aged customers visit on weekends. In some examples, there may be multiple rules, such as a sequence of rules; paragraph 0102);
wherein the request is automatically generated according to a data retention policy (paragraph 0086, discussing that one or more features may be selected (e.g., automatically by featurizer 130) from the feature table. Feature selection may select and/or remove features from a feature table. Features may be adjusted to improve RCA performance. In some examples, a feature selection may (e.g., automatically) select or not select features, for example, based on knowledge of a sequence of activities/steps in a process, types of users, time between activities, resources and/or employees involved in a process, data types or categories. For example, feature selection may eliminate, discard data in input data that bear little to no relation to the RCA model. Feature selection may be automated...For example, RCA model training or an analysis may indicate which features in a data set are useful to predict root cause in an analysis of a particular topic. An algorithm may be applied to select features. Features may be removed, for example, based on granularity, high correlation, and/or irrelevance. The granularity of features may indicate their utility in predicting root cause. For example, a feature may be too high-level or too low-level to be of significant value in determining root cause. A feature with a high correlation level may be removed (e.g., not selected) for RCA model features, for example, so that the RCA model provides useful insight. In an example, if age is used as a feature and there is only one case among 1,000 cases for a 25-year old, age may be too granular. Features may be customized automatically, for example, to provide customizable insights based on business knowledge; paragraph 0099, 0123);
in response to the receiving, identifying, by the processor using an association rule mining (ARM) model, a policy from the set of policies that is associated with the at least one feature (paragraph 0021, discussing that root cause analysis (RCA) may be used in process mining to detect the cause(s) of issues/problems in processes with stored records of events for one or more sequences of activity in one or more processes. A web log may be a sequence of activity/events, such as clicks and/or data entry, by a user engaging with a process on a website. In an example, an insurance company process to provide insurance policies may take too much time. RCA may be used to determine why the process is so slow. Event records for empirical insurance policy cases may be categorized and labeled regarding delays to form data sets for an RCA model. An RCA model (e.g., supervised or unsupervised model) may be used to determine the causes of the delays; paragraph 0074, discussing that the visualizer may utilize feature contribution algorithms to help users understand various factors associated with process issues/labeled classes, as indicated by rules based on conditions and actions. The visualizer may perform a descriptive analysis, for example, to help users understand the process characteristics that lead to issues/problems identified by labeled classes (e.g., excessive delay). The classification model (e.g., Rule generator) and/or other machine learning explanation tools may be used to understand the sources of issues/problems. The visualizer may, for example, quantitatively attribute labeled classes to each of multiple features; paragraph 0087, discussing that one or more RCA rules may be generated based on the selected features and label(s). RCA rules may be generated by an RCA engine, which may be or may include a machine learning (ML) model. 
Rule generation may be based on visualization of rules, e.g., to show the impact of features on the target, which form the basis of rules. RCA generation may be based on, for example, a Shapley technique, subgroup discovery, uplifting trees, etc., alone or in combination. A Shapley value may be computed by making changes in input features to determine how the changes to the input features correspond to a model prediction. A Shapley value of a feature may be calculated as an average marginal contribution to the overall model score. An RCA rule may indicate, for example, that there is a delay in 85% of cases where middle-aged customers visit on weekends. In some examples, there may be multiple rules, such as a sequence of rules; paragraph 0099, discussing that the set of features may be reduced by removing features that would negatively impact a root cause analysis (RCA) to generate a reduced set of features. For example, as shown in FIG. 1, the feature selector may reduce the set of features generated by the featurizer by selecting or excluding features based on at least one of a correlation threshold, a granularity threshold, or a relevance threshold (e.g., to remove features that would negatively impact the utility of RCA rules generated by the RCA));
predicting, by the processor using the ARM model, a level of impact removing the at least one feature will have on the identified policy (paragraph 0060, discussing that insightful conditions may be determined, for example, using subgroup discovery techniques. Actions may be determined for a (e.g., each) condition. For example, actions with the highest positive/negative impact may be determined. In an example, rule generator 134 may receive a preselected, labeled feature table. Rule generator may discover conditions based on the preselected, labeled feature table, for example, using subgroup discovery. Rule generator 134 may discover actions, for example, using one or more techniques, such as tree-based decision methods. Rule generator 134 may generate one or more (e.g., a set of) RCA rules 120 (e.g., based on the label value(s), the condition(s) and the action(s); paragraph 0071, discussing that rule selector 136 may select rules for visualization based on one or more thresholds indicating a threshold impact in order to be visualized; paragraph 0099, discussing that the set of features may be reduced by removing features that would negatively impact a root cause analysis (RCA) to generate a reduced set of features. For example, feature selector 132 may reduce the set of features generated by featurizer 130 by selecting or excluding features based on at least one of a correlation threshold, a granularity threshold, or a relevance threshold (e.g., to remove features that would negatively impact the utility of RCA rules generated by the RCA); paragraph 0114, discussing that the program code may comprise a featurizer that performs featurization on an event log for a process with a plurality of process instances to generate a set of features with feature values for the plurality of process instances. 
The set of features may exclude features that would negatively impact a root cause analysis (RCA); paragraph 0115, discussing that the featurizer may comprise a feature selector that reduces the set of generated features by removing the features that would negatively impact the RCA; paragraph 0123); and
wherein the generating the recommendation comprises generating instructions to carry out human-in-the-loop training on the model (paragraph 0021, discussing that event records for empirical insurance policy cases may be categorized and labeled regarding delays to form data sets for an RCA model. An RCA model (e.g., supervised or unsupervised model) may be used to determine the causes of the delays; paragraph 0053, discussing that in some examples, users may select/deselect features; paragraph 0085, discussing that one or more labels (e.g., classifications) may be added manually by user(s) to the feature table. Labeling may create target values or classes in an ML model).
Yuan is directed to ablation on observable data for determining influence on machine learning systems, and Fani is directed to machine learning systems; they are therefore deemed analogous art, as both are directed to solutions for modeling systems, which falls within Applicant's field of endeavor (machine learning models). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Yuan to include Fani's features for mapping, by the processor, policies of the enterprise environment to the models; in response to the receiving, identifying, by the processor using an association rule mining (ARM) model, a policy from the set of policies that is associated with the at least one feature; predicting, by the processor using the ARM model, a level of impact removing the at least one feature will have on the identified policy; and wherein the generating the recommendation comprises generating instructions to carry out human-in-the-loop training on the model, in the manner claimed, because doing so would serve the motivation of identifying problems and improving processes (Fani at paragraph 0001). The combination is further obvious because the claimed invention is merely a combination of old elements in which each element would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
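As context for the association rule-based mapping discussed in the limitations above, a minimal sketch of how rule confidence from association rule mining could map models to policies follows. The event records, model and policy names, and threshold below are invented for illustration and do not come from Fani or from the claims:

```python
from collections import Counter

# Hypothetical co-occurrence records: each entry notes that a model was
# used to implement a policy in some logged enterprise event.
events = [
    ("model_A", "data_retention_policy"),
    ("model_A", "data_retention_policy"),
    ("model_A", "access_policy"),
    ("model_B", "access_policy"),
]

pair_counts = Counter(events)
model_counts = Counter(model for model, _ in events)

def rule_confidence(model, policy):
    """Confidence of the association rule model -> policy,
    i.e., P(policy | model) estimated from the event records."""
    return pair_counts[(model, policy)] / model_counts[model]

# Map each model to every policy whose rule confidence clears a threshold.
MIN_CONFIDENCE = 0.5  # hypothetical confidence threshold
mapping = {
    m: [p for (mm, p) in pair_counts
        if mm == m and rule_confidence(m, p) >= MIN_CONFIDENCE]
    for m in model_counts
}
# model_A maps to "data_retention_policy" (confidence 2/3);
# model_B maps to "access_policy" (confidence 1/1).
```

Once such a mapping exists, looking up the policy associated with a feature's model is a dictionary lookup, which is consistent with the claim's recitation of identifying an associated policy in response to a removal request.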
The Yuan-Fani combination does not explicitly teach wherein the mapping comprises: accessing a policy knowledge base storing information about how the models are used to implement the policies; and carrying out association rule-based mapping to map each of the models to at least one corresponding policy from the set of policies based on the accessed information; identifying, by the processor and based on the mapping, a model from the models that is used to implement the associated policy; and automatically applying, by the processor and in response to the instruction, a machine unlearning model to the model to remove the at least one feature. Jain in the analogous art of modeling systems teaches:
wherein the mapping comprises: accessing a policy knowledge base storing information about how the models are used to implement the policies (col. 1, lines 15-20, discussing that the invention generally relates to the management of models; col. 16, lines 30-39, discussing that for example, if a variable relates to the interest rate for a financial product, a user may increase this variable then determine how such an increase will affect the various models that are impacted by the variable. More specifically, if the interest rate is raised from 5% to 10%, then the user may see that a model which models financial product purchases in the southeastern United States shows a decrease in the number of expected financial product purchases to decrease due to consumers historically not desiring a financial product with such a high rate; col. 16, lines 51-67, discussing that the report lists all (or any subset of) models which are directly impacted by the selected variable identified at 610. User 105 may further view the number of models directly affected by the selected model. A second table displays models that are indirectly impacted by the selected variable as identified. In other words, table lists models that are directly impacted by the selected variable…; col. 17, lines 12-34, discussing that FIG. 7 is a screenshot of an exemplary interface for displaying detailed model attributes in accordance with an embodiment of the present invention. For a more holistic view of model details, the user may select to view a Detailed Report web page that presents models with sufficient detail to enable user to quickly discern model attributes. In one embodiment, only models that are directly impacted by the selected variable are displayed within the Detailed Report web page…); and
carrying out association rule-based mapping to map each of the models to at least one corresponding policy from the set of policies based on the accessed information (col. 13, lines 23-31, discussing that the models in the tree structure are color coded. Varying colors are used to indicate that a model has been marked for deletion, has not been used for a defined period of time, is a new model, and/or the like. Furthermore, models may be color coded according to model label, model identifier, modeler, model owner, number of records associated with the model, frequency of use, business group, population segment, model program environment, population, and most recent deployment date; col. 16, lines 30-39, discussing that for example, if a variable relates to the interest rate for a financial product, a user may increase this variable then determine how such an increase will affect the various models that are impacted by the variable. More specifically, if the interest rate is raised from 5% to 10%, then the user may see that a model which models financial product purchases in the southeastern United States shows a decrease in the number of expected financial product purchases to decrease due to consumers historically not desiring a financial product with such a high rate; col. 16, lines 51-67, discussing that the report lists all (or any subset of) models which are directly impacted by the selected variable identified. User 105 may further view the number of models directly affected by the selected model. A second table displays models that are indirectly impacted by the selected variable as identified. In other words, table lists models that are directly impacted by the selected variable…; col. 17, lines 12-34, discussing that FIG. 7 is a screenshot of an exemplary interface for displaying detailed model attributes... 
For a more holistic view of model details, the user may select to view a Detailed Report web page that presents models with sufficient detail to enable user to quickly discern model attributes. In one embodiment, only models that are directly impacted by the selected variable are displayed within the Detailed Report web page…; col. 6, lines 3-16);
identifying, by the processor and based on the mapping, a model from the models that is used to implement the associated policy (col. 1, lines 15-20, discussing that the invention generally relates to the management of models; col. 16, lines 30-39, discussing that for example, if a variable relates to the interest rate for a financial product, a user may increase this variable then determine how such an increase will affect the various models that are impacted by the variable. More specifically, if the interest rate is raised from 5% to 10%, then the user may see that a model which models financial product purchases in the southeastern United States shows a decrease in the number of expected financial product purchases to decrease due to consumers historically not desiring a financial product with such a high rate; col. 16, lines 51-67, discussing that the report lists all (or any subset of) models which are directly impacted by the selected variable identified at 610. User 105 may further view the number of models directly affected by the selected model. A second table displays models that are indirectly impacted by the selected variable as identified. In other words, table lists models that are directly impacted by the selected variable…; col. 17, lines 12-34, discussing that FIG. 7 is a screenshot of an exemplary interface for displaying detailed model attributes in accordance with an embodiment of the present invention. For a more holistic view of model details, the user may select to view a Detailed Report web page that presents models with sufficient detail to enable user to quickly discern model attributes. In one embodiment, only models that are directly impacted by the selected variable are displayed within the Detailed Report web page…; col. 4, lines 36-49).
Examiner notes that Jain, in addition to Fani as cited above, also teaches wherein the generating the recommendation comprises generating instructions to carry out human-in-the-loop training on the model (col. 14, lines 56-67 & col. 15, lines 1-6, discussing that a user may interact with any of the interfaces to enter, modify, and/or delete data relating to metadata, variables, and models. Various levels of editing may be permitted according to user privileges that have been defined and stored at user database 130. For example, only an administrator may be permitted to delete a variable, but a developer may be permitted to modify metadata. Such modifications may further be subject to authorization by any one or more defined users. For example, an administrator may select a variable in a model metadata report to delete, however, the deletion will not occur until the model owner and developer have been notified and authorize the deletion. According to this embodiment, the invention further contemplates a workflow manager to ensure adherence to organizational policies and to safeguard a modeling environment against the erroneous modifications of any single user; col. 15, lines 7-23, discussing that similar to the editing abilities described, the system may also provide decommissioning tools. When it is determined that a model or variable have become obsolete, are no longer used, or provide inaccurate output, it may be desirable to remove it from the modeling environment. However,… removing a model may have far reaching consequences due to interdependencies among models and model variables. Thus, when analysis proves that a model should be decommissioned, the system may control the processes, such that the appropriate personnel are notified and that appropriate authorizations are obtained. 
MVS 175 further incorporates intelligence tools that prevent the removal of models and variables when it is determined that such removal will compromise the integrity of the modeling environment. MVS 175 may only permit the removal when issues of dependencies are resolved or on authorization from a super user).
The Yuan-Fani combination describes features related to machine learning systems. Jain is directed to management of models. Therefore, they are deemed to be analogous as they both are directed towards solutions for modeling systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Yuan-Fani combination with Jain because the references are analogous art, both being directed to solutions for modeling systems, which falls within applicant’s field of endeavor (machine learning models), and because modifying the Yuan-Fani combination to include Jain’s feature of generating instructions to carry out human-in-the-loop training on the model, in the manner claimed, would serve the motivation of maintaining an efficient model and model development environment (Jain at col. 20, lines 36-39); and further obvious because the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
The Yuan-Fani-Jain combination does not explicitly teach automatically applying, by the processor and in response to the instruction, a machine unlearning model to the model to remove the at least one feature. However, Kumar in the analogous art of machine learning teaches this concept. Kumar teaches:
automatically applying, by the processor and in response to the instruction, a machine unlearning model to the model to remove the at least one feature (paragraph 0013, discussing methods and systems for ML training, and retraining, wherein the influence of one or more specific data points can be “forgotten” or “un-learned” by a machine learning (ML) model in an efficient way…An ML training system can train multiple instances of a machine learning model using a set of training data, where each instance is trained on a different subset of that training data. Individual subsets of training data will be referred to herein as shards. Each shard can also be broken down into a number of further subsets, referred to herein as slices. During training, checkpoints can be registered after use of the data in each slice of a shard. It might be the case that a request is received to remove a specific data point from the training data set, as well as any influence or related “learnings” of the ML instances...The data point(s) can be removed from the slice(s) of the shard. The ML instance corresponding to that shard can then be retrained; paragraph 0015, discussing that an ML training system can ensure that the requested data to be removed is completely expunged, a significant improvement over existing methods that can only assure (potentially with a certain probability) that a user's data has been forgotten or cannot be inferred. 
The ML training system can remove data points without causing substantial degradation to the model's performance; paragraph 0020, discussing that the ML training system can receive a request to remove data related to one or more individuals from a system…Information associated with the request can be directed to the ML training system for performing data removal and machine unlearning tasks…The ML training system can receive such a request and perform one or more ML training tasks based on the request; paragraph 0024, discussing a retraining module that manages retraining upon receiving requests to remove a data point; paragraph 0031, discussing that the retraining module can perform actions to retrain an instance of the model upon receiving requests to remove a data point. For example, the retraining module can handle the request to remove a specific data point and identify the data point in its respective shard and slice. The retraining module can remove the data point from the shard and slice and perform retraining of the instance).
The Yuan-Fani-Jain combination describes features related to machine learning systems. Kumar is directed to methods and systems for a machine learning (ML) model training system. Therefore, they are deemed to be analogous as they both are directed towards solutions for modeling systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Yuan-Fani-Jain combination with Kumar because the references are analogous art, both being directed to solutions for modeling systems, which falls within applicant’s field of endeavor (machine learning models), and because modifying the Yuan-Fani-Jain combination to include Kumar’s feature of automatically applying, by the processor and in response to the instruction, a machine unlearning model to the model to remove the at least one feature, in the manner claimed, would serve the motivation of allowing for more precise and efficient handling of data during tasks such as retraining of machine learning models (Kumar at paragraph 0022); and further obvious because the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
As per claim 6, the Yuan-Fani-Jain-Kumar combination teaches the method of claim 1. Although not explicitly taught by Yuan, Fani in the analogous art of machine learning systems teaches wherein the generating the recommendation further comprises notifying the user of the policy (paragraph 0060, discussing that insightful conditions may be determined, for example, using subgroup discovery techniques. Actions may be determined for a (e.g., each) condition. For example, actions with the highest positive/negative impact may be determined. In an example, the rule generator may receive a preselected, labeled feature table. Rule generator may discover conditions based on the preselected, labeled feature table, for example, using subgroup discovery. The rule generator may discover actions (e.g., perform action discovery), for example, using one or more techniques, such as tree-based decision methods. The rule generator may generate one or more RCA rules (e.g., based on the label value(s), the condition(s), and the action(s)); paragraph 0074, discussing that the visualizer may utilize feature contribution algorithms to help users understand various factors associated with process issues/labeled classes, as indicated by rules based on conditions and actions. The visualizer may perform a descriptive analysis, for example, to help users understand the process characteristics that lead to issues/problems identified by labeled classes (e.g., excessive delay). The classification model (e.g., Rule generator) and/or other machine learning explanation tools may be used to understand the sources of issues/problems. The visualizer may, for example, quantitatively attribute labeled classes to each of multiple features; paragraph 0087, discussing that one or more RCA rules may be generated based on the selected features and label(s). RCA rules may be generated by an RCA engine, which may be or may include a machine learning (ML) model.
Rule generation may be based on visualization of rules, e.g., to show the impact of features on the target, which form the basis of rules. RCA generation may be based on, for example, a Shapley technique, subgroup discovery, uplifting trees, etc., alone or in combination. A Shapley value may be computed by making changes in input features to determine how the changes to the input features correspond to a model prediction. A Shapley value of a feature may be calculated as an average marginal contribution to the overall model score. An RCA rule may indicate, for example, that there is a delay in 85% of cases where middle-aged customers visit on weekends. In some examples, there may be multiple rules, such as a sequence of rules; paragraph 0099, discussing that the set of features may be reduced by removing features that would negatively impact a root cause analysis (RCA) to generate a reduced set of features. For example, as shown in FIG. 1, the feature selector may reduce the set of features generated by the featurizer by selecting or excluding features based on at least one of a correlation threshold, a granularity threshold, or a relevance threshold (e.g., to remove features that would negatively impact the utility of RCA rules generated by the RCA)).
Yuan is directed to ablation on observable data for determining influence on machine learning systems. Fani is directed to machine learning systems. Therefore, they are deemed to be analogous as they both are directed towards solutions for modeling systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Yuan with Fani because the references are analogous art, both being directed to solutions for modeling systems, which falls within applicant’s field of endeavor (machine learning models), and because modifying Yuan to include Fani’s feature wherein the generating the recommendation further comprises notifying the user of the policy, in the manner claimed, would serve the motivation of identifying problems and improving processes (Fani at paragraph 0001); and further obvious because the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
As per claim 7, the Yuan-Fani-Jain-Kumar combination teaches the method of claim 1. Yuan further teaches further comprising mapping the set of features to the models (paragraph 0013, discussing that existing approaches rely on removing individual latent features and measuring the influence on the model predictions. The embodiments described rely on the removal or adaptation of meaningful components from the input/observed data leading to the removal or modification of combinations of multiple latent features (after the adapted input has been embedded onto the latent space); paragraph 0056, discussing that the calculation of importance based on the processed input X′ requires access to the latent variables x′. For instance, if determining the importance of a latent variable, the latent variable may be deleted and the system may be retrained without using this latent variable to determine its influence on the prediction; paragraph 0057, discussing that calculating the importance of the latent variables on the model determines the importance of a given attribute on a prediction; paragraph 0060, discussing a method of determining the importance of observed features on predictions made by a machine learning model; paragraph 0070, discussing that the difference in accuracy score might also be utilised, taking the relative difference allows the influence score to be comparable across different models and observations).
As per claim 8, the Yuan-Fani-Jain-Kumar combination teaches the method of claim 1. Yuan further teaches further comprising: receiving a user-input request to remove a next at least one feature (paragraph 0013, discussing removal or modification of combinations of multiple latent features);
calculating a reward function for the next at least one feature (paragraph 0002, discussing that generally, machine learning training methods adjust the parameters of a given model in an attempt to minimize some form of loss function or maximize some form of reward function based on predictions made by the model. One example of this is the adjustment of parameters to minimize the prediction error of the model; paragraph 0021, discussing that the first prediction is a first action and the second prediction is a second action and the measure of confidence in the first prediction is a reward for a first action and the measure of confidence in the second prediction is a reward for the second action. Accordingly, the method may be applied to determine influence on a machine learning agent configured to take actions in response to an input. The agent may have been trained via reinforcement learning. The rewards may be determined by a reward function. Equally, the rewards may be losses calculated through a loss function; paragraph 0052, discussing that each prediction has an associated measure of confidence (for instance, a classification confidence score, a prediction accuracy, or a reward for predicting a particular action). This represents the confidence that the prediction is accurate or correct. It can therefore be considered an accuracy score); and
removing the at least one feature (paragraph 0013, discussing that the embodiments described rely on the removal of meaningful components from the input/observed data leading to the removal or modification of combinations of multiple latent features; paragraph 0056, discussing that the calculation of importance based on the processed input X′ requires access to the latent variables x′. For instance, if determining the importance of a latent variable, the latent variable may be deleted and the system may be retrained without using this latent variable to determine its influence on the prediction; paragraph 0095, discussing that for each identified component, an adjusted, or perturbed, input is produced. The adjusted input is adjusted through the adjustment of the identified component within the input. In this case, the adjusted input is produced through the ablation (deletion or removal) of the component from the input text).
The Yuan-Fani combination does not explicitly teach in response to determining that the reward function for the next at least one feature is greater than the threshold value, requiring approval before removing the next at least one feature; and retraining the model. Jain in the analogous art of modeling systems teaches:
in response to determining that the reward function for the next at least one feature is greater than the threshold value, requiring approval before removing the next at least one feature (col. 14, lines 56-67 & col. 15, lines 1-6, discussing that a user may interact with any of the interfaces to enter, modify, and/or delete data relating to metadata, variables, and models. Various levels of editing may be permitted according to user privileges that have been defined and stored at user database 130. For example, only an administrator may be permitted to delete a variable, but a developer may be permitted to modify metadata. Such modifications may further be subject to authorization by any one or more defined users. For example, an administrator may select a variable in a model metadata report to delete, however, the deletion will not occur until the model owner and developer have been notified and authorize the deletion. According to this embodiment, the invention further contemplates a workflow manager to ensure adherence to organizational policies and to safeguard a modeling environment against the erroneous modifications of any single user; col. 15, lines 7-23, discussing that similar to the editing abilities described above, the system may also provide decommissioning tools. When it is determined that a model or variable have become obsolete, are no longer used, or provide inaccurate output, it may be desirable to remove it from the modeling environment. However, as will be discussed, removing a model may have far reaching consequences due to interdependencies among models and model variables. Thus, when analysis proves that a model should be decommissioned, the system may control the processes, such that the appropriate personnel are notified and that appropriate authorizations are obtained.
MVS 175 further incorporates intelligence tools that prevent the removal of models and variables when it is determined that such removal will compromise the integrity of the modeling environment. MVS 175 may only permit the removal when issues of dependencies are resolved or on authorization from a super user).
The Yuan-Fani combination describes features related to machine learning systems. Jain is directed to management of models. Therefore, they are deemed to be analogous as they both are directed towards solutions for modeling systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Yuan-Fani combination with Jain because the references are analogous art, both being directed to solutions for modeling systems, which falls within applicant’s field of endeavor (machine learning models), and because modifying the Yuan-Fani combination to include Jain’s feature of, in response to determining that the reward function for the next at least one feature is greater than the threshold value, requiring approval before removing the next at least one feature, in the manner claimed, would serve the motivation of maintaining an efficient model and model development environment (Jain at col. 20, lines 36-39); and further obvious because the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
While Yuan describes retraining a system (paragraph 0056, discussing that the calculation of importance based on the processed input X′ requires access to the latent variables x′. For instance, if determining the importance of a latent variable, the latent variable may be deleted and the system may be retrained without using this latent variable to determine its influence on the prediction), the Yuan-Fani-Jain combination does not explicitly teach retraining the model. However, Kumar in the analogous art of machine learning teaches this concept. Kumar teaches:
retraining the model (paragraph 0013, discussing methods and systems for ML training, and retraining, wherein the influence of one or more specific data points can be “forgotten” or “un-learned” by a machine learning (ML) model in an efficient way…; paragraph 0020, discussing that the ML training system can receive a request to remove data related to one or more individuals from a system…Information associated with the request can be directed to the ML training system for performing data removal and machine unlearning tasks…The ML training system can receive such a request and perform one or more ML training tasks based on the request; paragraph 0024, discussing a retraining module that manages retraining upon receiving requests to remove a data point; paragraph 0031, discussing that the retraining module can perform actions to retrain an instance of the model upon receiving requests to remove a data point. For example, the retraining module can handle the request to remove a specific data point and identify the data point in its respective shard and slice. The retraining module can remove the data point from the shard and slice and perform retraining of the instance).
The Yuan-Fani-Jain combination describes features related to machine learning systems. Kumar is directed to methods and systems for a machine learning (ML) model training system. Therefore, they are deemed to be analogous as they both are directed towards solutions for modeling systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Yuan-Fani-Jain combination with Kumar because the references are analogous art, both being directed to solutions for modeling systems, which falls within applicant’s field of endeavor (machine learning models), and because modifying the Yuan-Fani-Jain combination to include Kumar’s feature of retraining the model, in the manner claimed, would serve the motivation of allowing for more precise and efficient handling of data during tasks such as retraining of machine learning models (Kumar at paragraph 0022); and further obvious because the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
As per claim 9, the Yuan-Fani-Jain-Kumar combination teaches the method of claim 1. Yuan further teaches further comprising: receiving multiple requests to delete selected features from the sets of features (paragraph 0013, discussing that the embodiments described rely on the removal of meaningful components from the input/observed data leading to the removal or modification of combinations of multiple latent features; paragraph 0056, discussing that the calculation of importance based on the processed input X′ requires access to the latent variables x′. For instance, if determining the importance of a latent variable, the latent variable may be deleted and the system may be retrained without using this latent variable to determine its influence on the prediction; paragraph 0095, discussing that for each identified component, an adjusted, or perturbed, input is produced. The adjusted input is adjusted through the adjustment of the identified component within the input. In this case, the adjusted input is produced through the ablation (deletion or removal) of the component from the input text); and
in response to the receiving the multiple requests, generating a next recommendation for the user (paragraph 0073, discussing that identifying influential components can help users to debug or further improve the machine learning model. For instance, if a classifier is producing classifications that appear to be erroneous (or at least anomalous), identifying the influential components that caused these erroneous classifications can help a user to assess whether the data is indeed erroneous (e.g. through comparison to the influential components within the data); paragraph 0074, discussing that identifying the influential component within the input data that caused the classification can help the user determine whether the classification is correct. For instance, in a classifier that attempts to identify malicious emails, an email may appear on the face of it to be benign but might have a difficult to identify issue (such as an incorrect URL that directs the user to a malicious site). The methods described are able to direct the user's attention to the most important component within the observed data to help assess the accuracy of the classification; paragraph 0114, discussing that the influence scores can be used to advise users as to how to improve the machine learning model or how to achieve improved results).
As per claim 11, the Yuan-Fani-Jain-Kumar combination teaches the method of claim 1. Although not explicitly taught by Yuan, Fani in the analogous art of machine learning systems teaches wherein the set of policies comprises business rules (paragraph 0021, discussing that root cause analysis (RCA) may be used in process mining to detect the cause(s) of issues/problems in processes with stored records of events for one or more sequences of activity in one or more processes. A web log may be a sequence of activity/events, such as clicks and/or data entry, by a user engaging with a process on a website. In an example, an insurance company process to provide insurance policies may take too much time. RCA may be used to determine why the process is so slow. Event records for empirical insurance policy cases may be categorized and labeled regarding delays to form data sets for an RCA model. An RCA model (e.g., supervised or unsupervised model) may be used to determine the causes of the delays; paragraph 0060, discussing that insightful conditions may be determined, for example, using subgroup discovery techniques. Actions may be determined for a (e.g., each) condition. For example, actions with the highest positive/negative impact may be determined. In an example, the rule generator may receive a preselected, labeled feature table. Rule generator may discover conditions based on the preselected, labeled feature table, for example, using subgroup discovery. The rule generator may discover actions (e.g., perform action discovery), for example, using one or more techniques, such as tree-based decision methods. The rule generator may generate one or more RCA rules (e.g., based on the label value(s), the condition(s) and the action(s); paragraph 0074, discussing that the visualizer may utilize feature contribution algorithms to help users understand various factors associated with process issues/labeled classes, as indicated by rules based on conditions and actions. 
The visualizer may perform a descriptive analysis, for example, to help users understand the process characteristics that lead to issues/problems identified by labeled classes (e.g., excessive delay). The classification model (e.g., Rule generator) and/or other machine learning explanation tools may be used to understand the sources of issues/problems. The visualizer may, for example, quantitatively attribute labeled classes to each of multiple features; paragraph 0087, discussing that one or more RCA rules may be generated based on the selected features and label(s). RCA rules may be generated by an RCA engine, which may be or may include a machine learning (ML) model. Rule generation may be based on visualization of rules, e.g., to show the impact of features on the target, which form the basis of rules. RCA generation may be based on, for example, a Shapley technique, subgroup discovery, uplifting trees, etc., alone or in combination. A Shapley value may be computed by making changes in input features to determine how the changes to the input features correspond to a model prediction. A Shapley value of a feature may be calculated as an average marginal contribution to the overall model score. An RCA rule may indicate, for example, that there is a delay in 85% of cases where middle-aged customers visit on weekends. In some examples, there may be multiple rules, such as a sequence of rules; paragraph 0099).
Yuan is directed to ablation on observable data for determining influence on machine learning systems. Fani is directed to machine learning systems. Therefore, they are deemed to be analogous as they both are directed towards solutions for modeling systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Yuan with Fani because the references are analogous art, both being directed to solutions for modeling systems, which falls within applicant’s field of endeavor (machine learning models), and because modifying Yuan to include Fani’s feature wherein the set of policies comprises business rules, in the manner claimed, would serve the motivation of identifying problems and improving processes (Fani at paragraph 0001); and further obvious because the claimed invention is merely a combination of old elements, and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
Claims 12 and 17 recite substantially similar limitations that stand rejected via the art citations and rationale applied to claim 1, as discussed above. Further, as per claim 12, the Yuan-Fani-Jain-Kumar combination teaches a system for managing datasets in an enterprise environment, comprising: a memory; and a processor communicatively coupled to the memory, wherein the processor is configured to perform a method (paragraphs 0036, 0037, 0115, 0122). As per claim 17, the Yuan-Fani-Jain-Kumar combination teaches a computer program product for managing datasets in an enterprise environment, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause a device to perform a method (paragraphs 0037, 0122).
22. Claims 2-3 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Yuan in view of Fani, in view of Jain, in view of Kumar, in view of Ye, Pub. No.: US 2025/0077960 A1, [hereinafter Ye], in further view of Shivaraman et al., Pub. No.: US 2024/0320596 A1, [hereinafter Shivaraman].
As per claim 2, the Yuan-Fani-Jain-Kumar combination teaches the method of claim 1. Although not explicitly taught by the Yuan-Fani-Jain-Kumar combination, Ye in the analogous art of machine learning systems teaches wherein the calculating the reward function comprises determining support for an association between the at least one feature and the policy (paragraph 0002, discussing a system for feature selection for a reinforcement learning model...The one or more processors may be configured to determine, via one or more logistic regression feature selection models, a set of features for the reinforcement learning model. The one or more processors may be configured to determine, using off-policy evaluation, a performance level of the reinforcement learning model when using the set of features. The one or more processors may be configured to determine, based on the performance level and using a classification model, a ranking of the set of features, wherein the ranking indicates an order of importance of the set of features on the performance level. The one or more processors may be configured to generate one or more subsets of features from the set of features based on iteratively removing one or more features, from the set of features, in accordance with the ranking; paragraph 0010, discussing that reinforcement learning is a subfield of machine learning that is associated with developing algorithms capable of making optimal decisions in dynamic environments. Reinforcement learning may be associated with training a model to interact with an environment, learn from experiences, and optimize behavior over time. The model may receive feedback in the form of rewards or penalties based on actions, and the objective may be to maximize the cumulative reward the model obtains.
A reinforcement learning model may “learn” through trial and error, iteratively exploring the environment, taking actions, and observing the consequences; paragraph 0031, discussing that the machine learning model may be trained using a first set of features. The feature selection device may evaluate a performance of a second set of features for the machine learning model using off-policy evaluation (e.g., where the target policy includes using the second set of features). The second set of features may be the set of features that are determined based on the one or more logistic regression feature selection models or techniques. This enables the feature selection device to estimate a performance of the machine learning model for the set of features that are determined based on the one or more logistic regression feature selection models or techniques using off-policy evaluation. The performance level may be indicated by one or more performance metrics, such as an average reward, one or more value functions, one or more policy quality measures, and/or one or more error rates, among other examples; paragraph 0032, discussing that the feature selection device may determine, based on the off-policy evaluation, a ranking of the set of features (e.g., the set of features used to estimate the off-policy evaluation). The ranking may indicate an order of importance of the set of features on the performance level. For example, the feature selection device may order the set of features from a greater impact on the performance level to a lower impact on the performance level. In other words, the ranking may order the set of features from least importance to greatest importance to the performance level of the machine learning model. The feature selection device may determine the ranking using a classification model. 
The classification model may include a decision tree, a random forest model, and/or another type of classification model; paragraph 0039, discussing that the feature selection device may determine, for each subset of features, a performance level of the machine learning model (e.g., using off-policy evaluation). For example, the target policy of the off-policy evaluation may be the subset(s) of features. For example, the feature selection device may generate a first subset of features by removing the one or more least important features from the set of features. The feature selection device may perform off-policy evaluation of the machine learning model using the first subset of features (e.g., as the target policy) to obtain a first performance level of the machine learning model (e.g., where the first performance level is associated with the first subset of features). The feature selection device may generate a second subset of features by removing one or more least important features from the first subset of features. The feature selection device may perform off-policy evaluation of the machine learning model using the second subset of features (e.g., as the target policy) to obtain a second performance level of the machine learning model (e.g., where the second performance level is associated with the second subset of features). The feature selection device may perform off-policy evaluation to obtain performance levels for respective subsets of features.).
The Yuan-Fani-Jain-Kumar combination describes features related to machine learning systems. Ye is directed to training machine learning models. Therefore, they are deemed to be analogous as they both are directed towards solutions for modeling systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Yuan-Fani-Jain-Kumar combination with Ye because the references are analogous art, both being directed to solutions for modeling systems, which falls within applicant’s field of endeavor (machine learning models), and because modifying the Yuan-Fani-Jain-Kumar combination to include Ye’s feature wherein the calculating the reward function comprises determining support for an association between the at least one feature and the policy, in the manner claimed, would serve the motivation of improving model performance (Ye at paragraph 0012). The combination is further obvious because the claimed invention is merely a combination of old elements; in the combination, each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
The Yuan-Fani-Jain-Kumar-Ye combination does not explicitly teach using rule learning to find relationships among variables in the sets of features. However, Shivaraman in the analogous art of machine learning systems teaches this concept. Shivaraman teaches:
using rule learning to find relationships among variables in the sets of features (paragraph 0046, discussing that the machine learning module performs model training using training data, e.g., data from other modules, that contains input and correct output, to allow the model to learn over time. The training is performed based on the deviation of a processed result from a documented result when the inputs are fed into the machine learning model, e.g., an algorithm measures its accuracy through the loss function, adjusting until the error has been sufficiently minimized. In one embodiment, the machine learning module randomizes the ordering of the training data, visualizes the training data to identify relevant relationships between different variables, identifies any data imbalances, and splits the training data into two parts where one part is for training a model and the other part is for validating the trained model, de-duplicating, normalizing, correcting errors in the training data, and so on. The machine learning module implements various machine learning techniques, e.g., K-nearest neighbors, cox proportional hazards model, decision tree learning, association rule learning, neural network (e.g., recurrent neural networks, graph convolutional neural networks, deep neural networks), inductive programming logic, support vector machines, Bayesian models, Gradient boosted machines (GBM), LightGBM (LGBM), Xtra tree classifier, etc.; paragraph 0062, discussing that in one embodiment, the correlation detection module processes the data received from the data preparation module for a statistical measure of the relationship between two or more variables. In one embodiment, the correlation detection module determines a degree of relationship between two or more variables based upon changes in one variable in relation to the other variables. 
If two variables are closely correlated, then one variable can be predicted from the other, e.g., the correlation detection module determines seven attributes are correlated and provides the same information, hence instead of using all seven attributes, any one of these attributes may be utilized. In one example embodiment, two variables are positively correlated when the value of one variable increases with an increase in the value of the other variable(s). In another example embodiment, two variables are negatively correlated when the value of one variable increases with a decrease in the value of the other variable(s)).
The Yuan-Fani-Jain-Kumar-Ye combination describes features related to machine learning systems. Shivaraman is directed to machine learning systems. Therefore, they are deemed to be analogous as they both are directed towards solutions for modeling systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Yuan-Fani-Jain-Kumar-Ye combination with Shivaraman because the references are analogous art, both being directed to solutions for modeling systems, which falls within applicant’s field of endeavor (machine learning models), and because modifying the Yuan-Fani-Jain-Kumar-Ye combination to include Shivaraman’s feature of using rule learning to find relationships among variables in the sets of features, in the manner claimed, would serve the motivation of making it easier for the model to learn and provide better information (Shivaraman at paragraph 0064). The combination is further obvious because the claimed invention is merely a combination of old elements; in the combination, each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
As per claim 3, the Yuan-Fani-Jain-Kumar-Ye-Shivaraman combination teaches the method of claim 2. While Yuan describes a measure of confidence (paragraphs 0022-0023, 0063), the Yuan-Fani-Jain-Kumar combination does not explicitly teach wherein the calculating the reward function further comprises calculating a confidence for the association. However, Ye in the analogous art of machine learning systems teaches this concept. Ye teaches wherein the calculating the reward function further comprises calculating a confidence for the association (paragraph 0010, discussing that reinforcement learning is a subfield of machine learning that is associated with developing algorithms capable of making optimal decisions in dynamic environments. Reinforcement learning may be associated with training a model to interact with an environment, learn from experiences, and optimize behavior over time. The model may receive feedback in the form of rewards or penalties based on actions, and the objective may be to maximize the cumulative reward the model obtains. A reinforcement learning model may “learn” through trial and error, iteratively exploring the environment, taking actions, and observing the consequences; paragraph 0027, discussing that as another example, the feature selection device may determine confidence intervals for respective features. For example, a confidence interval may be associated with a range of plausible values for an odds ratio or coefficient of a given feature. A wider (or larger) confidence interval may indicate greater uncertainty about the feature, while a narrower interval may indicate higher precision. The feature selection device may remove one or more features associated with confidence intervals that do not satisfy a confidence threshold (e.g., the feature selection device may remove feature(s) associated with wide or large confidence intervals). 
Additionally, or alternatively, the feature selection device may perform RFE using the set of features determined from the one or more logistic regression assumptions for feature selection. In some implementations, the feature selection device may perform RFE with cross validation (RFECV). For example, the feature selection device may perform RFECV to maximize an area under the receiver operating characteristic curve for the machine learning model. This enables the feature selection device to determine a set of features that are most relevant and/or informative for the one or more logistic regression models included in the machine learning model. As a result, this may improve a performance of the one or more logistic regression models).
The Yuan-Fani-Jain-Kumar combination describes features related to machine learning systems. Ye is directed to training machine learning models. Therefore, they are deemed to be analogous as they both are directed towards solutions for modeling systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Yuan-Fani-Jain-Kumar combination with Ye because the references are analogous art, both being directed to solutions for modeling systems, which falls within applicant’s field of endeavor (machine learning models), and because modifying the Yuan-Fani-Jain-Kumar combination to include Ye’s feature wherein the calculating the reward function further comprises calculating a confidence for the association, in the manner claimed, would serve the motivation of improving model performance (Ye at paragraph 0012). The combination is further obvious because the claimed invention is merely a combination of old elements; in the combination, each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
Claim 13 recites substantially similar limitations that stand rejected via the art citations and rationale applied to claims 2 and 3, as discussed above.
23. Claims 4, 14, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Yuan in view of Fani, in view of Jain, in view of Kumar, in further view of Dela Rosa et al., Pub. No.: US 2024/0406477 A1, [hereinafter Dela Rosa].
As per claim 4, the Yuan-Fani-Jain-Kumar combination teaches the method of claim 1. Yuan further teaches wherein the generating the recommendation further comprises notifying the user of a difference between the performance scores for the model with and without the at least one feature (paragraph 0013, discussing that the embodiments described provide improvements in computational efficiency and interpretability to the determination of the influence of inputs on predictions. Existing approaches rely on removing individual latent features and measuring the influence on the model predictions…The embodiments described rely on the removal of meaningful components from the input/observed data leading to the removal or modification of combinations of multiple latent features; paragraph 0018, discussing that the difference in the measure of confidence in the first prediction and the measure of confidence in the second prediction is a difference relative to the measure of confidence in the first prediction. That is, calculating the difference between the measure of confidence in the first prediction and the measure of confidence in the second prediction might comprise subtracting the measure of confidence in the second prediction from the measure of confidence in the first prediction to determine a change in the measure of confidence, and dividing the change in the measure of confidence by the measure of confidence in the first prediction to obtain the difference in the measure of confidence. Taking the relative difference allows the influence score to be comparable across different models and observations; paragraphs 0045-0047, discussing that a two-step approach is proposed that can be applied to any machine learning model to study the behavior of the model and provide insight into predictions by the model: Step 1: adjust (e.g. ablate) the raw input data by adapting (e.g. deleting/occluding) components that are sensible to humans (e.g. 
words, phrases, or groups of words, objects in images), which are identified using machine learning technologies, such as natural language processing techniques; Step 2: determining the importance of various components within the input data and how they influence the prediction that the model makes on that data; paragraph 0052, discussing that the prediction Y might be a set of one or more confidence scores for classification, might be an action for application to an environment, or might be a synthetically generated data point. The prediction Y is a single instance of predicted data. The prediction includes one or more predicted values. That is, the prediction Y comprises the set of m predicted values. Each prediction has an associated measure of confidence (for instance, a classification confidence score, a prediction accuracy, or a reward for predicting a particular action). This represents the confidence that the prediction is accurate or correct. It can therefore be considered an accuracy score; paragraph 0070, discussing that the difference in accuracy score might also be utilised, taking the relative difference allows the influence score to be comparable across different models and observations).
The Yuan-Fani-Jain-Kumar combination does not explicitly teach that a difference between the performance scores for the model with and without the at least one feature is greater than a threshold difference. However, Dela Rosa in the analogous art of machine learning systems teaches this concept (paragraph 0176, discussing that the interaction system initiates retraining based on performance metrics of the machine learning models. If there is a noticeable decline in metrics such as a threshold difference in accuracy, precision, recall, or user satisfaction measures, the interaction system initiates retraining of the models; paragraph 0167).
The Yuan-Fani-Jain-Kumar combination describes features related to machine learning systems. Dela Rosa is directed to a machine learning model continuous training system. Therefore, they are deemed to be analogous as they both are directed towards solutions for modeling systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Yuan-Fani-Jain-Kumar combination with Dela Rosa because the references are analogous art, both being directed to solutions for modeling systems, which falls within applicant’s field of endeavor (machine learning models), and because modifying the Yuan-Fani-Jain-Kumar combination to include Dela Rosa’s feature that a difference between the performance scores for the model with and without the at least one feature is greater than a threshold difference, in the manner claimed, would serve the motivation of ensuring that the machine learning model remains effective (Dela Rosa at paragraph 0118). The combination is further obvious because the claimed invention is merely a combination of old elements; in the combination, each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
Claims 14 and 18 recite substantially similar limitations that stand rejected via the art citations and rationale applied to claim 4, as discussed above.
24. Claims 10, 16, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Yuan in view of Fani, in view of Jain, in view of Kumar, in further view of Berger et al., Patent No.: US 6,304,841 B1, [hereinafter Berger].
As per claim 10, the Yuan-Fani-Jain-Kumar combination teaches the method of claim 9, but it does not explicitly teach wherein the next recommendation comprises a suggestion that the selected features be deleted as a batch. However, Berger in the analogous art of modeling systems teaches this concept (col. 28, lines 19-23, discussing that with a means of quickly gauging the redundancy between two features f and f′, we can adapt a batch feature-selection algorithm to be more selective about its batches, discarding those features which exhibit a high level of correlation with other features in the same batch).
The Yuan-Fani-Jain-Kumar combination describes features related to machine learning systems. Berger is directed to constructing models. Therefore, they are deemed to be analogous as they both are directed towards solutions for modeling systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the Yuan-Fani-Jain-Kumar combination with Berger because the references are analogous art, both being directed to solutions for modeling systems, which falls within applicant’s field of endeavor (machine learning models), and because modifying the Yuan-Fani-Jain-Kumar combination to include Berger’s feature wherein the next recommendation comprises a suggestion that the selected features be deleted as a batch, in the manner claimed, would serve the motivation of producing an efficient feature selection algorithm (Berger at col. 26, lines 52-53). The combination is further obvious because the claimed invention is merely a combination of old elements; in the combination, each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
Claims 16 and 20 recite substantially similar limitations that stand rejected via the art citations and rationale applied to claims 9 and 10, as discussed above.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Wang et al., Pub. No.: US 2024/0086730 A1 – describes that association rule learning is a rule-based machine learning for discovering interesting relations between variables or data in large databases.
Ezrielev et al., Pub. No.: US 2025/0080587 A1 – describes methods and systems for managing an artificial intelligence (AI) model.
Fraboni et al., Pub. No.: US 2024/0086760 A1 – describes a system implementing techniques that can perform machine unlearning with reduced retraining times.
Sun et al., Pub. No.: US 2024/0070525 A1 – describes techniques of performing machine unlearning in a recommendation model.
Dam, Tobias, Maximilian Henzl, and Lukas Daniel Klausner. "Delete my account: Impact of data deletion on machine learning classifiers." 2021 International Conference on Software Security and Assurance (ICSSA). IEEE, 2021 – provides insight into the effects of different plausible scenarios for right to erasure usage on data quality of machine learning.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DARLENE GARCIA-GUERRA whose telephone number is (571) 270-3339. The examiner can normally be reached M-F 7:30a.m.-5:00p.m. EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Brian M. Epstein can be reached on (571) 270-5389. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Darlene Garcia-Guerra/
Primary Examiner, Art Unit 3625