DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1
According to the first part of the analysis, in the instant case, claims 1-9 are directed to a system comprising at least a processor, claims 10-18 are directed to a method, and claims 19-20 are directed to a non-transitory computer-readable medium. Thus, each of the claims falls within one of the four statutory categories (i.e. process, machine, manufacture, or composition of matter).
Claim 1 recites:
Step 2A, Prong 1
“iteratively generating the plurality of decision rules based on the rule training data and a plurality of ML model training techniques associated with the ML engine” (This step is a recitation of a mental process that is practical to perform in the human mind. A human can repeatedly generate rules based on training data and ML model training techniques. For example, an ML model can be trained to output a classification. A human can generate a rule based on the classification (i.e. observation, evaluation, judgement, opinion). See MPEP § 2106.04(a)(2), subsection III.).
“testing each of the plurality of decision rules for a corresponding performance using rule testing data associated with the ML engine” (This step is a recitation of a mental process that is practical to perform in the human mind. A human can test a rule using the test data to see if it triggers a condition in the rule (i.e. observation, evaluation, judgement, opinion). See MPEP § 2106.04(a)(2), subsection III.).
“filtering the plurality of decision rules by the corresponding performances to remove rules underperforming a rule performance threshold, wherein the filtering includes a rule correlation analysis that correlates each rule to one or more other rules to remove two or more rules when meeting or exceeding a threshold correlation” (This step is a recitation of a mental process that is practical to perform in the human mind. A human can exclude two or more rules that are similar to each other more than a threshold in their mind (i.e. observation, evaluation, judgement, opinion). See MPEP § 2106.04(a)(2), subsection III.).
“selecting a set of the plurality of decision rules that maximizes an ML task alert metric within a maximum alert rate for the ML task” (This step is a recitation of a mental process that is practical to perform in the human mind. A human can select rules that maximize a task alert metric within a maximum alert rate threshold (i.e. observation, evaluation, judgement, opinion). See MPEP § 2106.04(a)(2), subsection III.).
“evaluating the set of the plurality of decision rules based on an individual rule performance and a relative rule contribution of each rule in the set to outputs by the set for the ML task” (This step is a recitation of a mental process that is practical to perform in the human mind. A human can evaluate a set of rules based on their performance and their given impact on the set of outputs in their mind (i.e. observation, evaluation, judgement, opinion). See MPEP § 2106.04(a)(2), subsection III.).
“generating, based on the evaluating, a decision ruleset for the ML task, wherein the decision ruleset comprises at least a portion of the set of the plurality of decision rules selected and evaluated” (This step is a recitation of a mental process that is practical to perform in the human mind. A human can generate a rule set using the selected and evaluated rules (i.e. observation, evaluation, judgement, opinion). See MPEP § 2106.04(a)(2), subsection III.).
Step 2A, Prong 2
“a processor and a computer readable medium operably coupled thereto, the computer readable medium comprising a plurality of instructions stored in association therewith that are accessible to, and executable by, the processor, to perform rule training operations which comprise” (Mere instructions to apply the exception using a generic computer component. See MPEP 2106.05(f).)
“accessing rule training data for determining a plurality of decision rules for an ML task of the ML engine” (Insignificant extra-solution activity)
The additional elements do not integrate the judicial exception into a practical application.
Step 2B
“a processor and a computer readable medium operably coupled thereto, the computer readable medium comprising a plurality of instructions stored in association therewith that are accessible to, and executable by, the processor, to perform rule training operations which comprise” (Mere instructions to apply the exception using a generic computer component. See MPEP 2106.05(f).)
“accessing rule training data for determining a plurality of decision rules for an ML task of the ML engine” (This step is directed to retrieving data from memory. Retrieving data from a computer memory is well-understood, routine, and conventional as evidenced by the court cases cited at MPEP 2106.05(d), section II, example iv. Storing and retrieving information in memory. Accordingly, a conclusion that the claimed accessing step is well-understood, routine, conventional activity is supported under Berkheimer.)
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Claim 2 recites:
Step 2A, Prong 1
Claim 2 recites at least the abstract idea identified above in claim 1.
Step 2A, Prong 2
“wherein the rule training data and the ML task are associated with fraud detection and the plurality of decision rules comprise a plurality of fraud detection rules, and wherein the decision ruleset comprises a fraud detection ruleset configured to detect a fraudulent transaction in a transaction processing system” (Linking the judicial exception to a field of use. See MPEP 2106.05(h).)
The additional elements do not integrate the judicial exception into a practical application.
Step 2B
“wherein the rule training data and the ML task are associated with fraud detection and the plurality of decision rules comprise a plurality of fraud detection rules, and wherein the decision ruleset comprises a fraud detection ruleset configured to detect a fraudulent transaction in a transaction processing system” (Linking the judicial exception to a field of use. See MPEP 2106.05(h).)
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Claim 3 recites:
Step 2A, Prong 1
“determining, based on the applying the at least one unsupervised ML model training technique, at least one unsupervised ML model rule of the plurality of decision rules” (This step is a recitation of a mental process that is practical to perform in the human mind. A human can determine an unsupervised ML model rule based on an output of the trained unsupervised ML model (i.e. observation, evaluation, judgement, opinion). See MPEP § 2106.04(a)(2), subsection III.).
Step 2A, Prong 2
“wherein the iteratively generating comprises: applying at least one unsupervised ML model training technique of the plurality of ML model training techniques to unlabeled data from the rule training data” (Mere instructions to apply the exception using a generic computer component. See MPEP 2106.05(f).)
The additional elements do not integrate the judicial exception into a practical application.
Step 2B
“wherein the iteratively generating comprises: applying at least one unsupervised ML model training technique of the plurality of ML model training techniques to unlabeled data from the rule training data” (Mere instructions to apply the exception using a generic computer component. See MPEP 2106.05(f).)
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Claim 4 recites:
Step 2A, Prong 1
“determining, based on the applying the at least one supervised ML model training technique, at least one supervised ML model rule of the plurality of decision rules” (This step is a recitation of a mental process that is practical to perform in the human mind. A human can determine a supervised rule based on an output of a trained supervised ML model (i.e. observation, evaluation, judgement, opinion). See MPEP § 2106.04(a)(2), subsection III.).
“testing the at least one unsupervised ML model rule and the at least one supervised ML model rule separately” (This step is a recitation of a mental process that is practical to perform in the human mind. A human can test two rules separately (i.e. observation, evaluation, judgement, opinion). See MPEP § 2106.04(a)(2), subsection III.).
“combining the at least one unsupervised ML model rule and the at least one supervised ML model rule after the testing” (This step is a recitation of a mental process that is practical to perform in the human mind. A human can combine two rules. For example, a programmer can combine two if statements (i.e. observation, evaluation, judgement, opinion). See MPEP § 2106.04(a)(2), subsection III.).
Step 2A, Prong 2
“wherein the iteratively generating further comprises: applying at least one supervised ML model training technique of the plurality of ML model training techniques to labeled data from the rule training data” (Mere instructions to apply the exception using a generic computer component. See MPEP 2106.05(f).)
The additional elements do not integrate the judicial exception into a practical application.
Step 2B
“wherein the iteratively generating further comprises: applying at least one supervised ML model training technique of the plurality of ML model training techniques to labeled data from the rule training data” (Mere instructions to apply the exception using a generic computer component. See MPEP 2106.05(f).)
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Claim 5 recites:
Step 2A, Prong 1
Claim 5 recites at least the abstract idea identified above in claim 4.
Step 2A, Prong 2
“wherein the at least one supervised ML model training technique utilizes at least one XGBoost model with an iterative data-refinement procedure for iterative training using the labeled data, and wherein the at least one unsupervised ML model training technique utilizes at least one isolation forest model for the iterative training using the unlabeled data” (Linking the judicial exception to a field of use. See MPEP 2106.05(h).)
The additional elements do not integrate the judicial exception into a practical application.
Step 2B
“wherein the at least one supervised ML model training technique utilizes at least one XGBoost model with an iterative data-refinement procedure for iterative training using the labeled data, and wherein the at least one unsupervised ML model training technique utilizes at least one isolation forest model for the iterative training using the unlabeled data” (Linking the judicial exception to a field of use. See MPEP 2106.05(h).)
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Claim 6 recites:
Step 2A, Prong 1
“wherein the filtering utilizes at least one of a supervised feature importance test or an unsupervised feature importance test for features of the plurality of decision rules, and wherein the selecting the set of the plurality of decision rules is further based on: limiting alerts generated by the set from meeting or exceeding the maximum alert rate, and performing a stability test of each rule in the set of the plurality of decision rules” (This step is a recitation of a mental process that is practical to perform in the human mind. A human can exclude rules based on an important feature associated with the rule. A human can further select rules based on a maximum alert rate and a stability test (i.e. observation, evaluation, judgement, opinion). See MPEP § 2106.04(a)(2), subsection III.).
Step 2A, Prong 2 & 2B
The claim does not recite any additional elements.
Claim 7 recites:
Step 2A, Prong 1
Claim 7 recites at least the abstract idea identified above in claim 6.
Step 2A, Prong 2
“wherein the alerts are limited from meeting or exceeding the maximum alert rate using an executable operation that maximizes a detection rate of the set of the plurality of decision rules at or below the maximum alert rate for ones of the plurality of decision rules selected for the set” (Linking the judicial exception to a field of use. See MPEP 2106.05(h).)
The additional elements do not integrate the judicial exception into a practical application.
Step 2B
“wherein the alerts are limited from meeting or exceeding the maximum alert rate using an executable operation that maximizes a detection rate of the set of the plurality of decision rules at or below the maximum alert rate for ones of the plurality of decision rules selected for the set” (Linking the judicial exception to a field of use. See MPEP 2106.05(h).)
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Claim 8 recites:
Step 2A, Prong 1
“wherein, before generating the decision ruleset, the rule training operations further comprise: applying at least one corrective rule that filters or masks at least one decision rule of the set from affecting the outputs by the set for the ML task based on false positives by the at least one decision rule” (This step is a recitation of a mental process that is practical to perform in the human mind. A human can apply a rule that will effectively cancel out another rule. For example, a rule may incorrectly classify a transaction as fraudulent but another rule may be applied to override the rule and classify the transaction as non-fraudulent (i.e. observation, evaluation, judgement, opinion). See MPEP § 2106.04(a)(2), subsection III.).
Step 2A, Prong 2 & 2B
The claim does not recite any additional elements.
Claim 9 recites:
Step 2A, Prong 1
“wherein the selecting requires a minimum number of decision rules for the set, and wherein the selecting includes: ranking the plurality of decision rules based on an alert rate and a rule performance analysis for each rule of the plurality of decision rules” (This step is a recitation of a mental process that is practical to perform in the human mind. A human can select a minimum number of rules by ranking them based on their performance and alert rate (i.e. observation, evaluation, judgement, opinion). See MPEP § 2106.04(a)(2), subsection III.).
“removing correlated rules from the set” (This step is a recitation of a mental process that is practical to perform in the human mind. A human can exclude rules from a set (i.e. observation, evaluation, judgement, opinion). See MPEP § 2106.04(a)(2), subsection III.).
Step 2A, Prong 2 & 2B
The claim does not recite any additional elements.
Claim 10 recites:
See rejection of claim 1. Same rationale applies.
Claim 11 recites:
See rejection of claim 2. Same rationale applies.
Claim 12 recites:
See rejection of claim 3. Same rationale applies.
Claim 13 recites:
See rejection of claim 4. Same rationale applies.
Claim 14 recites:
See rejection of claim 5. Same rationale applies.
Claim 15 recites:
See rejection of claim 6. Same rationale applies.
Claim 16 recites:
See rejection of claim 7. Same rationale applies.
Claim 17 recites:
See rejection of claim 8. Same rationale applies.
Claim 18 recites:
See rejection of claim 9. Same rationale applies.
Claim 19 recites:
Step 2A, Prong 1
See rejection of claim 1. Same rationale applies.
Step 2A, Prong 2 & 2B
The claim recites additional elements (“A non-transitory computer-readable medium having stored thereon computer-readable instructions executable to automatically generate machine learning (ML) rules using a rule training system for intelligent decision-making by an ML engine, the computer-readable instructions executable to perform rule training operations”). (Mere instructions to apply the exception using a generic computer component. See MPEP 2106.05(f).)
This judicial exception is not integrated into a practical application.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Claim 20 recites:
See rejection of claim 2. Same rationale applies.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-2, 9-11, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Azizsoltani et al. (US-20220121967-A1) in view of Duarte et al. (US-20220083915-A1).
Regarding Claim 1,
Azizsoltani (US 20220121967 A1) teaches a rule training system configured to automatically generate machine learning (ML) rules for intelligent decision-making by an ML engine, the rule training system comprising:
a processor and a computer readable medium operably coupled thereto, the computer readable medium comprising a plurality of instructions stored in association therewith that are accessible to, and executable by, the processor (para [0132]), to perform rule training operations which comprise:
accessing rule training data for determining a plurality of decision rules for an ML task of the ML engine (para [0034] “The rule-building engine can automatically generate the logical rules by training a group of decision trees using training data, extracting a set of logical rules from the trained decision trees, and then applying various techniques to the extracted set of logical rules to identify a subset of logical rules that can detect a target event with a high degree of accuracy.”);
iteratively generating the plurality of decision rules based on the rule training data (para [0034] “The rule-building engine can automatically generate the logical rules by training a group of decision trees using training data, extracting a set of logical rules from the trained decision trees, and then applying various techniques to the extracted set of logical rules to identify a subset of logical rules that can detect a target event with a high degree of accuracy…The automated process can also be easily repeated at periodic intervals to add, remove, and update logical rules, thereby helping to ensure that the event detection system can detect new types of events or can detect existing events with improved accuracy.”) and a plurality of ML model training techniques associated with the ML engine (para [0138] “In unsupervised training, the training data includes inputs, but not desired outputs, so that the machine-learning model has to find structure in the inputs on its own. In semi-supervised training, only some of the inputs in the training data are correlated to desired outputs.” unsupervised and semi-supervised training (i.e., multiple training techniques).);
testing each of the plurality of decision rules for a corresponding performance using rule testing data associated with the ML engine (para [0035] “The union of the logical rules generated by the forest of decision trees may have a significant overlap, so in some examples the rule-building engine can implement a process to stochastically decouple these logical rules and measure the performance of each individual logical rule using test data.”);
filtering the plurality of decision rules by the corresponding performances to remove rules underperforming a rule performance threshold (para [0160] “The performance metrics for the logical rules can then be compared to a predefined threshold so that poor-performing logical rules can be eliminated.”), wherein the filtering includes a rule correlation analysis that correlates each rule to one or more other rules to remove two or more rules when meeting or exceeding a threshold correlation (para [0171]-[0172] “In operation 1706, the processing device compares the similarity score to a predefined similarity threshold, which may be selected by a user. If the similarity score is less than the predefined similarity threshold, then two logical rules may not be substantially similar (e.g., overlapping) and the process can return to operation 1704. Otherwise, the process can proceed to operation 1708. In operation 1708, the processing device applies a cleanup process to the two logical rules. In some examples, the cleanup process can involve deleting one of the two logical rules, thereby reducing duplicative rules in the subset of logical rules. In other examples, the cleanup process can involve merging the two logical rules together, for example by combining one or more of the criteria in a first logical rule with one or more criteria in a second logical rule, to arrive at a single logical rule. After applying the cleanup process, the process can return to operation 1704 where another similarity score can be selected. This process can repeat until some or all of the similarity scores have been analyzed.” Process can be repeated removing at least two or more rules.);
evaluating the set of the plurality of decision rules based on an individual rule performance (para [0158] “In operation 1308, the processing device selects a collection of logical rules from the group of logical rules based on the first group of performance metric values. For example, the processing device can compare the performance metric value for each respective logical rule to a predefined threshold. If the performance metric value for a particular logical rule is greater than the predefined threshold, the processing device can incorporate the logical rule into the collection of logical rules.”) and a relative rule contribution of each rule in the set to outputs by the set for the ML task (para [0164] “For example, since there are two logical rules in the multiple logical rules 1608, each of the logical rules may have its own credit counter that is incremented by 0.50. On the other hand, if the selected rule 1604 for the sample of test data does not produce the correct target value 1506, each of the logical rules may have its own penalty counter that is incremented by 0.50.” para [0165] “In operation 1312, the processing device determines a second group of performance metric values corresponding to the collection of logical rules based on the count values. As one example, if the credit counter value for a logical rule is 57 and the penalty counter value for the logical rule is 33, the net performance of the logical rule can be 57−33=24.”); and
generating, based on the evaluating, a decision ruleset for the ML task, wherein the decision ruleset comprises at least a portion of the set of the plurality of decision rules selected and evaluated (para [0158] “In operation 1308, the processing device selects a collection of logical rules from the group of logical rules based on the first group of performance metric values.” para [0166] “In operation 1314, the processing device selects a subset of logical rules from the collection of logical rules based on the second group of performance metrics. For example, the processing device can compare the second group of performance metrics to a predefined threshold, which may be different from the predefined threshold described above with respect to operation 1308.”).
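For illustration only, the credit and penalty counting quoted from Azizsoltani paras [0164]-[0165] can be sketched as follows. The function name net_performance, the input layout, and the equal split of credit among all rules that fired on a sample are assumptions made for this sketch; the reference is not cited as disclosing this code.

```python
from collections import defaultdict


def net_performance(samples):
    """Return net performance (credit minus penalty) per rule.

    samples: iterable of (fired_rule_ids, is_correct) pairs, where
    fired_rule_ids lists the rules that fired on one test sample and
    is_correct says whether the produced target value was correct.
    (Hypothetical layout, assumed for illustration.)
    """
    credit = defaultdict(float)
    penalty = defaultdict(float)
    for fired, is_correct in samples:
        if not fired:
            continue  # no rule fired, nothing to credit or penalize
        share = 1.0 / len(fired)  # e.g. 0.50 each when two rules fired
        counter = credit if is_correct else penalty
        for rule_id in fired:
            counter[rule_id] += share
    rule_ids = set(credit) | set(penalty)
    return {r: credit[r] - penalty[r] for r in rule_ids}
```

Under these assumptions, a rule with a credit counter of 57 and a penalty counter of 33 has a net performance of 57 - 33 = 24, matching the quoted example.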
Azizsoltani does not explicitly disclose
selecting a set of the plurality of decision rules that maximizes an ML task alert metric within a maximum alert rate for the ML task;
However, Duarte (US 20220083915 A1) teaches
selecting a set of the plurality of decision rules (para [0016] “In various embodiments, an approach to producing a decision rule is to train a machine learning model that either outputs a decision directly, or that outputs a prediction which can be used to produce a decision based on a selected threshold. Stated alternatively, a machine learning system can be configured by: (1) training a model to output an optimized decision directly or (2) using a decision module after the model.” A rule is produced based on a threshold to optimize a decision. Para [0021] “Thus, a main goal of a decision stage may be to derive a decision rule that achieves a best generalization performance as measured by some metric of interest, in other words, to achieve a smallest expected loss (or a highest expected utility) with respect to a true distribution of inputs.”) that maximizes an ML task alert metric within a maximum alert rate for the ML task (para [0053] “For example, with respect to fraud detection, a true positive rate may be maximized subject to a false positive rate constraint (e.g., keeping the false positive rate below a specified threshold). In various embodiments, the decision result is based on a score corresponding to optimizing the one or more decision metrics. In some embodiments, the score is score 218 of FIG. 2 and/or score 318 of FIG. 3. In various embodiments, a scoring function generates the score. In various embodiments, a decision rule determines the decision result based on the score.” specified threshold (i.e., maximum alert rate).);
Azizsoltani and Duarte are analogous because they are directed to machine learning models utilizing rules for fraud detection.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the rules of Azizsoltani with the decision metrics of Duarte.
Doing so would allow for reducing the number of false positives in the fraud detection model by keeping the false positive rate under a specified threshold (Duarte para [0053]).
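As an illustrative sketch of the combination, selecting rules that maximize a detection metric while keeping the alert rate at or below a cap can be approached greedily. The greedy heuristic, the function name select_rules, and the set-based data layout are assumptions made here for illustration; neither Azizsoltani nor Duarte is cited as disclosing this particular procedure.

```python
def select_rules(rule_hits, fraud_ids, n_total, max_alert_rate):
    """Greedily pick rules that add the most detected fraud cases
    without pushing the combined alert rate above max_alert_rate.

    rule_hits: dict mapping rule name -> set of alerted transaction ids
    fraud_ids: set of transaction ids known to be fraudulent
    n_total:   total number of transactions evaluated
    (All names are hypothetical.)
    """
    selected, alerted = [], set()
    remaining = dict(rule_hits)
    while remaining:
        best, best_gain = None, 0
        for name, hits in remaining.items():
            union = alerted | hits
            if len(union) / n_total > max_alert_rate:
                continue  # adding this rule would exceed the alert cap
            gain = len(union & fraud_ids) - len(alerted & fraud_ids)
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:
            break  # no remaining rule improves detection within the cap
        selected.append(best)
        alerted |= remaining.pop(best)
    return selected, alerted
```

A greedy pass of this kind does not guarantee the optimal set; it only shows how a detection metric and an alert-rate constraint interact during selection.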
Regarding Claim 2,
Azizsoltani and Duarte teach the rule training system of claim 1. Azizsoltani teaches wherein the rule training data and the ML task are associated with fraud detection and the plurality of decision rules comprise a plurality of fraud detection rules, and wherein the decision ruleset comprises a fraud detection ruleset configured to detect a fraudulent transaction in a transaction processing system (para [0002] “The present disclosure relates generally to computerized event-detection systems. More specifically, but not by way of limitation, this disclosure relates to automatically generating rules for event detection systems.” Para [0003] “For example, event detection systems can be used to detect fraudulent transactions or money laundering.”).
Regarding Claim 9,
Azizsoltani and Duarte teach the rule training system of claim 1. Azizsoltani further teaches wherein the selecting requires a minimum number of decision rules for the set, and wherein the selecting includes:
ranking the plurality of decision rules based on an alert rate and a rule performance analysis for each rule of the plurality of decision rules (para [0166] “For example, the processing device may select the top 50 logical rules having the least false positive rates or the highest hit rates. As another example, the processing device may select logical rules with false positive rates of less than 10% or hit rates of more than 10%. Any number and combination of criteria can be used to select the logical rules for the subset.”); and
removing correlated rules from the set (para [0167] “In operation 1316, the processing device removes duplicative logical rules from the subset of logical rules. In some examples, duplicative rules can include logical rules that are identical to one another.” Duplicative rules (i.e., correlated rules).).
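For illustration only, ranking rules by hit rate and false positive rate, keeping the top rules, and then dropping duplicative rules can be sketched as below. The Jaccard overlap of alerted samples is an assumed similarity measure, and the names rank_and_dedupe, hit_rate, fp_rate, and hits are hypothetical; the reference does not specify this code.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def rank_and_dedupe(rules, top_k, similarity_threshold):
    """Rank rules, keep the top_k, then drop near-duplicates.

    rules: list of dicts with keys 'name', 'hit_rate', 'fp_rate',
    and 'hits' (set of sample ids the rule alerts on).
    """
    # Highest hit rate first; ties broken by lowest false positive rate.
    ranked = sorted(rules, key=lambda r: (-r["hit_rate"], r["fp_rate"]))[:top_k]
    kept = []
    for rule in ranked:
        # Keep a rule only if it is not too similar to any rule already kept.
        if all(jaccard(rule["hits"], k["hits"]) < similarity_threshold for k in kept):
            kept.append(rule)
    return [r["name"] for r in kept]
```

Because the ranking runs first, the better-performing rule of any duplicative pair is the one retained in this sketch.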
Regarding Claim 10,
Claim 10 is the method corresponding to the system of claim 1. Claim 10 is substantially similar to claim 1 and is rejected on the same grounds.
Regarding Claim 11,
Claim 11 is the method corresponding to the system of claim 2. Claim 11 is substantially similar to claim 2 and is rejected on the same grounds.
Regarding Claim 18,
Claim 18 is the method corresponding to the system of claim 9. Claim 18 is substantially similar to claim 9 and is rejected on the same grounds.
Regarding Claim 19,
Claim 19 is the computer-readable medium corresponding to the system of claim 1. Claim 19 is substantially similar to claim 1 and is rejected on the same grounds.
Regarding Claim 20,
Claim 20 is the computer-readable medium corresponding to the system of claim 2. Claim 20 is substantially similar to claim 2 and is rejected on the same grounds.
Claims 3-4 and 12-13 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Azizsoltani/Duarte, as applied above, and further in view of Malleron et al. (US-20170293595-A1).
Regarding Claim 3,
Azizsoltani and Duarte teach the rule training system of claim 1. Azizsoltani further teaches wherein the iteratively generating comprises:
applying at least one unsupervised ML model training technique of the plurality of ML model training techniques to unlabeled data from the rule training data (para [0138] “This may enable the machine-learning model to learn a mapping between the inputs and desired outputs. In unsupervised training, the training data includes inputs, but not desired outputs, so that the machine-learning model has to find structure in the inputs on its own.”); and
Azizsoltani and Duarte do not explicitly disclose
determining, based on the applying the at least one unsupervised ML model training technique, at least one unsupervised ML model rule of the plurality of decision rules.
However, Malleron (US 20170293595 A1) teaches
determining, based on the applying the at least one unsupervised ML model training technique, at least one unsupervised ML model rule of the plurality of decision rules (para [0076] “Following the retrieval of the training data, the unsupervised learner learns a rule that relates to the information elements in the training data. Typically, the unsupervised learner learns the rule by ascertaining that a subset of the information elements in the training data are sufficiently close in value to each other.”).
Azizsoltani, Duarte, and Malleron are analogous because they are directed towards extracting rules from ML models.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the rules model of Azizsoltani and Duarte with the supervised and unsupervised learning of Malleron.
Doing so would allow the system to automatically learn the rules that are used to tag the information elements and to continually learn and update the tagging rules, in order to further improve the accuracy of the tagging (Malleron para [0047]).
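For illustration only, learning a rule from unlabeled data by finding a subset of values that are sufficiently close to each other, as described in the quoted Malleron para [0076], can be sketched as a sliding window over sorted values. The function name learn_range_rule, the tolerance and min_fraction parameters, and the range-style rule output are assumptions for this sketch and are not taken from the reference.

```python
def learn_range_rule(values, tolerance, min_fraction):
    """Return (low, high) bounds of the largest group of values lying
    within `tolerance` of each other, or None if that group covers
    less than `min_fraction` of the data (no rule learned).
    """
    if not values:
        return None
    ordered = sorted(values)
    best_lo, best_hi, lo = 0, 0, 0
    for hi in range(len(ordered)):
        # Shrink the window until its spread is within the tolerance.
        while ordered[hi] - ordered[lo] > tolerance:
            lo += 1
        if hi - lo > best_hi - best_lo:
            best_lo, best_hi = lo, hi
    size = best_hi - best_lo + 1
    if size / len(ordered) < min_fraction:
        return None  # subset too small to count as a learned rule
    return ordered[best_lo], ordered[best_hi]
```

Returning None when the close-valued subset is too small mirrors the idea of evaluating whether a rule was successfully learned before it is used.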
Regarding Claim 4,
Azizsoltani, Duarte, and Malleron teach the rule training system of claim 3. Azizsoltani further teaches wherein the iteratively generating further comprises:
applying at least one supervised ML model training technique of the plurality of ML model training techniques to labeled data from the rule training data (para [0138] “In supervised training, each input in the training data is correlated to a desired output. This desired output may be a scalar, a vector, or a different type of data structure such as text or an image. This may enable the machine-learning model to learn a mapping between the inputs and desired outputs.”);
Malleron further teaches
determining, based on the applying the at least one supervised ML model training technique, at least one supervised ML model rule of the plurality of decision rules (para [0064] “Following the retrieval of the training data, the supervised learner learns a rule that relates to the information elements in the training data, based on instances of both positive correspondence and negative correspondence in the training data. To learn the rule, the supervised learner first extracts potentially relevant features associated with the uncertain location elements, and then learns which features, or combinations of features, indicate the respective semantic roles of the location elements.”);
testing the at least one unsupervised ML model rule (para [0084] “Following attempted-rule-learning step 56, the unsupervised learner, at a rule-learning-evaluation step 58, evaluates whether a rule was successfully learned, i.e., whether a sufficiently large subset of the training data are sufficiently close in value to each other.”) and the at least one supervised ML model rule separately (para [0072] “In performing second evaluation step 49, the supervised learner may learn a rule from a first subset of the training data, apply the rule to a second subset of the training data, and evaluate the sufficiency of the training data based on how well the rule performs on the second subset.”); and
combining the at least one unsupervised ML model rule and the at least one supervised ML model rule after the testing (para [0087] “At a rule-seeking step 62, the tagger attempts to find a rule that is suitable for tagging the information element. If a suitable rule exists, the tagger uses the rule to tag the information element with a deduced semantic role (and, typically, a level of certainty), at a tagging step 64.” The rules are combined at the tagger 34, shown in figure 1, which searches the learned rules for one to apply when tagging an element.).
Azizsoltani, Duarte, and Malleron are analogous because they are directed towards extracting rules from ML models.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the rules model of Azizsoltani and Duarte with the supervised and unsupervised learning of Malleron.
Doing so would allow for automatically learning the rules that are used to tag the information elements and for continually learning and updating the tagging rules, in order to further improve the accuracy of the tagging (Malleron para [0047]).
Regarding Claim 12,
Claim 12 is the method corresponding to the system of claim 3. Claim 12 is substantially similar to claim 3 and is rejected on the same grounds.
Regarding Claim 13,
Claim 13 is the method corresponding to the system of claim 4. Claim 13 is substantially similar to claim 4 and is rejected on the same grounds.
Claims 5 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Azizsoltani/Duarte/Malleron, as applied above, and further in view of Travalini et al. (US-20230259821-A1).
Regarding Claim 5,
Azizsoltani, Duarte, and Malleron teach the rule training system of claim 4.
Azizsoltani, Duarte, and Malleron do not explicitly disclose
wherein the at least one supervised ML model training technique utilizes at least one XGBoost model with an iterative data-refinement procedure for iterative training using the labeled data, and wherein the at least one unsupervised ML model training technique utilizes at least one isolation forest model for the iterative training using the unlabeled data.
However, Travalini (US 20230259821 A1) teaches
wherein the at least one supervised ML model training technique utilizes at least one XGBoost model (para [0058] “The one or more NLP algorithms may be supervised machine learning models configured to perform text classification, like k-nearest neighbor (KNN), naive-bayes, XGBoost, catboost, lightGBM, or any other suitable gradient boosting machine learning model.”) with an iterative data-refinement procedure for iterative training using the labeled data (para [0032] “As elaborated herein, in practice, machine learning systems and their underlying components are tuned by data scientists to perform numerous steps to perfect machine learning systems. The process is sometimes iterative and may entail looping through a series of steps: (1) understanding the domain, prior knowledge, and goals; (2) data integration, selection, cleaning, and pre-processing; (3) learning models; (4) interpreting results; and/or (5) consolidating and deploying discovered knowledge.”), and wherein the at least one unsupervised ML model training technique utilizes at least one isolation forest model for the iterative training using the unlabeled data (para [0069] “The unsupervised diagnosis engine may use any machine learning model to detect anomalies within the graph including support vector machines, isolation forest model, K-nearest neighbors (KNN), Naive Bayes (NB), and/or other techniques.”).
Azizsoltani, Duarte, Malleron, and Travalini are analogous because they are directed towards supervised and unsupervised learning.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the ML model of Azizsoltani, Duarte, and Malleron with the gradient boosting algorithm of Travalini.
Doing so would allow for applying a clustering technique to cluster the unsupervised feature vectors to determine whether any of the nodes (e.g., items, components, locations, etc.) are exhibiting unusual behavior/symptoms (Travalini para [0069]).
Regarding Claim 14,
Claim 14 is the method corresponding to the system of claim 5. Claim 14 is substantially similar to claim 5 and is rejected on the same grounds.
Claims 6-7 and 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Azizsoltani/Duarte, as applied above, and further in view of Gupta et al. (US-20210280276-A1).
Regarding Claim 6,
Azizsoltani and Duarte teach the rule training system of claim 1. Azizsoltani further teaches wherein the selecting the set of the plurality of decision rules is further based on:
limiting alerts generated by the set from meeting or exceeding the maximum alert rate (para [0160] “The performance metrics for the logical rules can then be compared to a predefined threshold so that poor-performing logical rules can be eliminated. For example, the false positive rate for R.sub.1T.sub.1 can be compared to a predefined threshold and, if the false positive rate exceeds the predefined threshold, that logical rule can be eliminated.”), and
performing a stability test of each rule in the set of the plurality of decision rules (para [0165] “For example, the performance metric for a logical rule can be a false positive rate, a false negative rate, or a hit rate that is determined based on the credit counter value or the penalty counter value corresponding to the logical rule.” The hit rate of a logical rule corresponds to its stability.).
Azizsoltani and Duarte do not explicitly disclose
wherein the filtering utilizes at least one of a supervised feature importance test or an unsupervised feature importance test for features of the plurality of decision rules.
However, Gupta (US 20210280276 A1) teaches
wherein the filtering utilizes at least one of a supervised feature importance test or an unsupervised feature importance test for features of the plurality of decision rules (para [0026] “In an example, the ML workbench 104 can include the ML model 112 which is based on a supervised ML algorithm and trained to produce feature importance scores that are used to construct the data structures 114 and hence aid in rule extraction.”).
Azizsoltani, Duarte, and Gupta are analogous because they are directed towards extracting rules from ML models.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the ML model of Azizsoltani and Duarte with the feature extraction of Gupta.
Doing so would allow for selecting features to build the ML model from which the rules are extracted (Gupta para [0026]).
Regarding Claim 7,
Azizsoltani, Duarte, and Gupta teach the rule training system of claim 6.
Duarte further teaches
wherein the alerts are limited from meeting or exceeding the maximum alert rate using an executable operation that maximizes a detection rate of the set of the plurality of decision rules at or below the maximum alert rate for ones of the plurality of decision rules selected for the set (para [0053] “For example, with respect to fraud detection, a true positive rate may be maximized subject to a false positive rate constraint (e.g., keeping the false positive rate below a specified threshold).”).
Azizsoltani and Duarte are analogous because they are directed to machine learning models utilizing rules for fraud detection.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the rules of Azizsoltani with the decision metrics of Duarte.
Doing so would allow for reducing the number of false positives in the fraud detection model by keeping the false positive rate under a specified threshold (Duarte para [0053]).
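For illustration only, the constrained selection described in Duarte para [0053] (maximizing a true positive rate subject to a false positive rate constraint) can be sketched as follows. This sketch is not taken from any cited reference or from the claims. The function name `select_rules`, the representation of each rule as a list of per-record booleans, and the exhaustive search over rule subsets are all assumptions made for this example.

```python
from itertools import combinations


def select_rules(rule_hits, labels, max_fpr):
    """Return (subset, tpr, fpr) maximizing TPR subject to FPR < max_fpr.

    rule_hits: dict of hypothetical rule name -> list of bools (rule fired per record)
    labels:    list of bools (True = actual positive, e.g. confirmed fraud)
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = ((), 0.0, 0.0)
    names = sorted(rule_hits)
    # Exhaustive search: only practical for a small number of candidate rules.
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            # A record raises an alert if any rule in the subset fires on it.
            alerts = [any(rule_hits[r][i] for r in subset)
                      for i in range(len(labels))]
            tp = sum(a and y for a, y in zip(alerts, labels))
            fp = sum(a and not y for a, y in zip(alerts, labels))
            tpr = tp / positives if positives else 0.0
            fpr = fp / negatives if negatives else 0.0
            # Keep the subset only if it satisfies the constraint and improves TPR.
            if fpr < max_fpr and tpr > best[1]:
                best = (subset, tpr, fpr)
    return best
```

In this sketch the false positive rate acts as the constraint and the true positive rate as the objective, so a rule with high recall is still rejected if it pushes the false positive rate to or above the threshold.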
Regarding Claim 15,
Claim 15 is the method corresponding to the system of claim 6. Claim 15 is substantially similar to claim 6 and is rejected on the same grounds.
Regarding Claim 16,
Claim 16 is the method corresponding to the system of claim 7. Claim 16 is substantially similar to claim 7 and is rejected on the same grounds.
Claims 8 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Azizsoltani/Duarte, as applied above, and further in view of Perri et al. (US-20200294528-A1).
Regarding Claim 8,
Azizsoltani and Duarte teach the rule training system of claim 1.
Azizsoltani and Duarte do not explicitly disclose
wherein, before generating the decision ruleset, the rule training operations further comprise: applying at least one corrective rule that filters or masks at least one decision rule of the set from affecting the outputs by the set for the ML task based on false positives by the at least one decision rule.
However, Perri (US 20200294528 A1) teaches
wherein, before generating the decision ruleset, the rule training operations further comprise:
applying at least one corrective rule that filters or masks at least one decision rule of the set from affecting the outputs by the set for the ML task (para [0043], in which detecting a frustration cue corresponds to the ML task) based on false positives by the at least one decision rule (para [0092] “The following provides an example of a “NEGATION_DIST_2” rule type. The rule logic includes a single token “not” and the rule behavior is “DEFAULT”. Thus, this rule may find the negation intensifier “not” in interactions, at a distance of up to two tokens from a frustration cue found by another linguistic rule. In case of a match, this rule may neutralize or cancel frustration cues detected by the other frustration rule.” Where a frustration rule incorrectly detects a frustration cue (i.e., a false positive), the negation rule may cancel out the output of that frustration rule.).
Azizsoltani, Duarte, and Perri are analogous because they are directed towards implementing rules for machine learning models.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the rules of Azizsoltani and Duarte with the negation rule of Perri.
Doing so would allow for implementing a rule to override a logic of the ML model which may provide a significant reduction in computing power required for calculating the output of the ML model (Perri para [0161]).
Regarding Claim 17,
Claim 17 is the method corresponding to the system of claim 8. Claim 17 is substantially similar to claim 8 and is rejected on the same grounds.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Gupta (US-20230289887-A1) discloses maximizing the detection rate (para [0005] “This is due to the fact that data mining and machine learning techniques have the potential to detect suspicious cases in a timely manner, and therefore potentially significantly reduce economic losses, both to the insurers and policy holders. Indeed there is great demand for effective predictive methods which maximize the true positive detection rate, minimize the false positive rate, and are able to quickly identify new and emerging fraud scheme.”).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HENRY K NGUYEN whose telephone number is (571)272-0217. The examiner can normally be reached Mon - Fri 7:00am-4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B Zhen, can be reached at (571)272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/HENRY NGUYEN/Examiner, Art Unit 2121