DETAILED ACTION
This action is responsive to communications filed on February 16, 2023. This action is made Non-Final.
Claims 1-20 are pending in the case.
Claims 1, 8, and 15 are independent claims.
Claims 1-20 are rejected.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 02/16/2023 is in compliance with the provisions of 37 C.F.R. 1.97. Accordingly, the IDS is being considered by the examiner.
Claim Objections
Claims 6, 13, and 20 recite the limitation "...the set of drifted production records...". There is insufficient antecedent basis for this limitation in the claim.
Appropriate correction is required.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more.
Step 1: Independent claims 1, 8, and 15 are directed to a method, a system, and a program product (see Spec. para. 0057 on CRMs), respectively. Therefore, these claims, as well as their dependent claims, are directed to one of the four statutory categories (process, machine (i.e., apparatus), manufacture, or composition of matter).
With respect to claim 1:
2A Prong 1:
Claim 1 recites the following judicial exceptions:
determining a set of drifted records during payload time (mental process – can be performed in the human mind, or by a human using a pen and paper (e.g., a person may identify records that have conceptually drifted during a model runtime)).
pruning the determined set of drifted records to use (mental process – can be performed in the human mind, or by a human using a pen and paper (e.g., a person may identify records that have conceptually drifted during a model runtime and group the identified records for separate or isolated action(s))).
2A Prong 2: The additional elements recited in the claim do not integrate the judicial exception into a practical application.
Additional elements:
a computer-implemented method, the computer-implemented method comprising: (mere instructions to apply the exception or implement the exception on a computer (e.g., implementing a method or process using a computer); see MPEP §2106.05(f)). The additional elements do not effectively integrate the abstract idea into a practical application. 2B: revisiting the additional elements, the additional elements do not amount to significantly more than the judicial exception – they are recited at a high level of generality and correspond to storing and retrieving information in memory.
training a machine learning model ... of the machine learning model ... in retraining the machine learning model (generally linking the use of a judicial exception to a particular technological environment or field of use (e.g., setting up and maintaining a machine learning model); see MPEP §2106.05(h)). The additional elements do not effectively integrate the abstract idea into a practical application. 2B: revisiting the additional elements, the additional elements do not amount to significantly more than the judicial exception – they are recited at a high level of generality and correspond to storing and retrieving information and performing calculations.
Claim 2:
2A Prong 1:
Claim 2 recites the following judicial exceptions:
obtaining a model confidence distribution for the first record (mental process – can be performed in the human mind, or by a human using a pen and paper (e.g., a person may calculate or obtain model confidence data)).
determining whether the first record is an outlier with respect to the model confidence distribution by using a model confidence of the first record (mental process – can be performed in the human mind, or by a human using a pen and paper (e.g., a person may calculate or obtain model confidence data and identify outliers)).
outputting the first record as a relabeling candidate based on the determination (mental process – can be performed in the human mind, or by a human using a pen and paper (e.g., a person may calculate or obtain model confidence data, identify outliers, and put forward specific data for relabeling or reannotating)).
2A Prong 2: The additional elements recited in the claim do not integrate the judicial exception into a practical application.
Additional elements:
receiving a first record flagged as drifted, the first record belonging to an existing category or interval which has a number of records smaller than a predetermined threshold (mere instructions to apply the exception or implement the exception on a computer (e.g., implementing a method or process using a computer to send or receive particular data); see MPEP §2106.05(f)). The additional elements do not effectively integrate the abstract idea into a practical application. 2B: revisiting the additional elements, the additional elements do not amount to significantly more than the judicial exception – they are recited at a high level of generality and correspond to storing and retrieving information in memory and performing calculations.
in a training data of the machine learning model ... at training time of the machine learning model ... at the payload time of the machine learning model (generally linking the use of a judicial exception to a particular technological environment or field of use (e.g., setting up and maintaining a machine learning model); see MPEP §2106.05(h)). The additional elements do not effectively integrate the abstract idea into a practical application. 2B: revisiting the additional elements, the additional elements do not amount to significantly more than the judicial exception – they are recited at a high level of generality and correspond to storing and retrieving information and performing calculations.
Claim 3:
2A Prong 1:
Claim 3 recites the following judicial exceptions:
obtaining a feature importance vector of the input features (mental process – can be performed in the human mind, or by a human using a pen and paper (e.g., a person may receive particular records flagged as drifted and produce characteristic or feature importance structures)).
selecting a proportionate number of the second records from each feature, based on the feature importance vector (mental process – can be performed in the human mind, or by a human using a pen and paper (e.g., a person may receive particular records flagged as drifted, produce characteristic or feature importance structures, and select records based thereon)).
2A Prong 2: The additional elements recited in the claim do not integrate the judicial exception into a practical application.
Additional elements:
receiving a plurality of second records flagged as drifted, the second records belonging to a new category or interval which is not seen in the training data (mere instructions to apply the exception or implement the exception on a computer (e.g., implementing a method or process using a computer to send or receive particular data); see MPEP §2106.05(f)). The additional elements do not effectively integrate the abstract idea into a practical application. 2B: revisiting the additional elements, the additional elements do not amount to significantly more than the judicial exception – they are recited at a high level of generality and correspond to storing and retrieving information in memory and performing calculations.
of the machine learning model ... of the machine learning model (generally linking the use of a judicial exception to a particular technological environment or field of use (e.g., setting up and maintaining a machine learning model); see MPEP §2106.05(h)). The additional elements do not effectively integrate the abstract idea into a practical application. 2B: revisiting the additional elements, the additional elements do not amount to significantly more than the judicial exception – they are recited at a high level of generality and correspond to storing and retrieving information and performing calculations.
Claim 4:
2A Prong 1:
Claim 4 recites the following judicial exceptions:
obtaining a feature importance vector of the input features (mental process – can be performed in the human mind, or by a human using a pen and paper (e.g., a person may receive particular records flagged as drifted and produce characteristic or feature importance structures)).
selecting or ignoring the third records based on the feature importance vector (mental process – can be performed in the human mind, or by a human using a pen and paper (e.g., a person may receive particular records flagged as drifted, produce characteristic or feature importance structures, and select records based thereon)).
2A Prong 2: The additional elements recited in the claim do not integrate the judicial exception into a practical application.
Additional elements:
receiving a plurality of third records flagged as drifted, the third records belonging to an existing category or interval which has a number of records equal to or greater than the predetermined threshold in the training data (mere instructions to apply the exception or implement the exception on a computer (e.g., implementing a method or process using a computer to send or receive particular data); see MPEP §2106.05(f)). The additional elements do not effectively integrate the abstract idea into a practical application. 2B: revisiting the additional elements, the additional elements do not amount to significantly more than the judicial exception – they are recited at a high level of generality and correspond to storing and retrieving information in memory and performing calculations.
of the machine learning model ... of the machine learning model (generally linking the use of a judicial exception to a particular technological environment or field of use (e.g., setting up and maintaining a machine learning model); see MPEP §2106.05(h)). The additional elements do not effectively integrate the abstract idea into a practical application. 2B: revisiting the additional elements, the additional elements do not amount to significantly more than the judicial exception – they are recited at a high level of generality and correspond to storing and retrieving information and performing calculations.
Claim 5:
2A Prong 1:
Claim 5 recites the following judicial exceptions:
wherein the set of drifted records result at least from production datasets at payload time having different characteristics from training datasets during the training (mental process – can be performed in the human mind, or by a human using a pen and paper (e.g., a person may analyze particular records flagged as drifted)).
Claim 6:
2A Prong 1:
Claim 6 recites the following judicial exceptions:
wherein the set of drifted production records at payload time are pruned for relabeling with categories or intervals having an occurrence less than a predetermined threshold in training data by using a model confidence distribution (mental process – can be performed in the human mind, or by a human using a pen and paper (e.g., a person may analyze particular records flagged as drifted and remove or isolate the particular records for specific actions)).
Claim 7:
2A Prong 1:
Claim 7 recites the following judicial exceptions:
wherein drifted production records at payload time are pruned for relabeling with unseen categories or ranges in training data by using feature importance (mental process – can be performed in the human mind, or by a human using a pen and paper (e.g., a person may analyze particular records flagged as drifted and remove or isolate the particular records for specific actions based on characteristic or feature importance)).
Claim 8:
Claim 8 substantially corresponds to claim 1 and is rejected under the same rationale.
Claim 9:
Claim 9 substantially corresponds to claim 2 and is rejected under the same rationale.
Claim 10:
Claim 10 substantially corresponds to claim 3 and is rejected under the same rationale.
Claim 11:
Claim 11 substantially corresponds to claim 4 and is rejected under the same rationale.
Claim 12:
Claim 12 substantially corresponds to claim 5 and is rejected under the same rationale.
Claim 13:
Claim 13 substantially corresponds to claim 6 and is rejected under the same rationale.
Claim 14:
Claim 14 substantially corresponds to claim 7 and is rejected under the same rationale.
Claim 15:
Claim 15 substantially corresponds to claim 1 and is rejected under the same rationale.
Claim 16:
Claim 16 substantially corresponds to claim 2 and is rejected under the same rationale.
Claim 17:
Claim 17 substantially corresponds to claim 3 and is rejected under the same rationale.
Claim 18:
Claim 18 substantially corresponds to claim 4 and is rejected under the same rationale.
Claim 19:
Claim 19 substantially corresponds to claim 5 and is rejected under the same rationale.
Claim 20:
Claim 20 substantially corresponds to claim 6 and is rejected under the same rationale.
2B continued: After considering all claim elements individually and as an ordered combination, it is determined that the claims do not include any additional elements that are sufficient to amount to significantly more than the judicial exception.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1, 8, and 15 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Cmielowski et al., US Publication 2021/0295231 (“Cmielowski”).
Claim 1:
Cmielowski discloses a computer-implemented method, the computer-implemented method comprising:
training a machine learning model (see para. 0015 - machine learning model (hereinafter, "model") can be a computer software and/or hardware architecture that makes specific classifications based on a training process wherein the model learns to make these classifications; para. 0017 - Machine learning models learn to make their specific classifications by analyzing labeled training data.);
determining a set of drifted records during payload time of the machine learning model (see para. 0016 - metrics for representing the accuracy can include drift, confidence, and uncertainty. Drift can refer to a scenario where the model may develop a growing bias in favor or disfavor of certain classifications; para. 0020 - classifier 110 can select the classification for each of multiple input transactions and update the transactions 112 to include the selected classification; para. 0022 - if the machine learning model 104 begins to drift; para. 0023 - metrics that are based on the accuracy of predictions and/or the probability of accurate predictions. Accordingly, the ML model manager 108 can generate scored transactions 114, identify training outliers 116 of the scored transactions 114; para. 0024 - Scoring the transactions 112 can involve determining a metric, such as the confidence and/or uncertainty for each of the transactions 112. These and/or other metrics can describe how well the classifier 110 of the machine learning model 104 is performing its classifications. The training outliers 116 can represent a subset of the scored transactions 114 that impact the specific metric goal; para. 0025 – metric goals may include uncertainty, drift, fairness, and the like.);
pruning the determined set of drifted records to use in retraining of the machine learning model (see para. 0023 – identify training outliers 116 of the scored transactions 114, and generate re-training data 118; para. 0024 – training outliers 116 can represent a subset of the scored transactions 114 that impact the specific metric goal. In other words, by removing the training outliers 116 from the scored transactions 114, the metric value can achieve the requested goal. In an example set of scored transactions, the ML model manager 108 can determine that removing a specific set of scored transactions with comparatively lower confidence values can improve the mean confidence of the remaining scored transactions by a predetermined amount; para. 0025 – the client may generate re-training data 118 by re-labeling the training outliers 116. Alternatively, the ML model manager 108 can provide an interface (not shown) to re-label the training outliers 116. Re-labeling the training outliers 116 can involve selecting a new classification for the training outlier 116. Accordingly, the machine learning model 104 can retrain the classifier 110 with the re-labeled transactions in the re-training data 118. In this way, the classifier 110 can learn to make different classifications on the training outliers 116, thus correcting the confidence, for example, in the machine learning model 104. According to some embodiments of the present disclosure, confidence is merely one potential metric goal. Other metric goals may include uncertainty, drift, fairness, and the like.).
Claims 8 and 15:
Claims 8 and 15 correspond to claim 1, and thus, Cmielowski discloses the limitations of claims 8 and 15 as well.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 2, 5, 6, 9, 12, 13, 16, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Cmielowski in view of Brandes et al., US Publication 2020/0364520 (“Brandes”).
Claim 2:
Cmielowski further teaches or suggests receiving a first record flagged as drifted, the first record belonging to an existing category or interval ... in a training data of the machine learning model, obtaining a model confidence distribution for the first record at a training time of the machine learning model; determining whether the first record is an outlier with respect to the model confidence distribution by using a model confidence of the first record at the payload time of the machine learning model; and outputting the first record as a relabeling candidate based on the determination (see para. 0020 - classifier 110 can select the classification for each of multiple input transactions and update the transactions 112 to include the selected classification; para. 0016 - metrics for representing the accuracy can include drift, confidence, and uncertainty. Drift can refer to a scenario where the model may develop a growing bias in favor or disfavor of certain classifications; para. 0022 - if the machine learning model 104 begins to drift; para. 0023 - metrics that are based on the accuracy of predictions and/or the probability of accurate predictions. Accordingly, the ML model manager 108 can generate scored transactions 114, identify training outliers 116 of the scored transactions 114; para. 0024 - Scoring the transactions 112 can involve determining a metric, such as the confidence and/or uncertainty for each of the transactions 112. These and/or other metrics can describe how well the classifier 110 of the machine learning model 104 is performing its classifications. The training outliers 116 can represent a subset of the scored transactions 114 that impact the specific metric goal; para. 0025 – may generate re-training data 118 by re-labeling the training outliers 116. Alternatively, the ML model manager 108 can provide an interface (not shown) to re-label the training outliers 116. Re-labeling the training outliers 116 can involve selecting a new classification for the training outlier 116. Other metric goals may include uncertainty, drift, fairness, and the like.).
Cmielowski does not explicitly disclose which has a number of records smaller than a predetermined threshold.
Brandes teaches or suggests which has a number of records smaller than a predetermined threshold (see Figs. 3, 4; para. 0005 - in the training data sets only a small number of examples occur, or the examples are very much underrepresented compared to other cases; para. 0023 - e.g., in 98+% of all predictions 9 of 10 classes are always predicted, this may have two reasons: ... or (ii) the training model underlying the classifier does not "see" the last class because it has not been reflected accordingly in the training data set; para. 0040 - 'underrepresented class' may denote a class of a classifier which may more or less never – or close to never – be predicted because the underlying machine-learning model has not been trained enough for this class because the training data set has too few examples of the underrepresented class; para. 0054 – q classes (q<n) are underrepresented in the training data set, i.e., the number of samples in the class < the number of overall samples/(n*(|mean-median|)), also called rare cases. Different thresholds may be used to define a rare case and the |mean-median| factor is one example for an implementation; para. 0058 - if it is determined by the evaluator engine 306 that the case is a rare case, the input data are forwarded to the rare case extractor 310. This module is used to potentially enlarge the corpus of training data; para. 0066 - modify the underlying machine-learning model for the classifier 302 so that in future a recognition, also of rare cases, is enhanced.).
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method, taught in Cmielowski, to include which has a number of records smaller than a predetermined threshold for the purpose of efficiently causing a machine learning model to detect rare or underrepresented cases and to take actions to improve training on the rare or underrepresented cases, improving model performance, as taught by Brandes (paras. 0023, 0040, 0054, 0066).
Claims 9 and 16:
Claims 9 and 16 correspond to claim 2, and thus, Cmielowski and Brandes teach or suggest the limitations of claims 9 and 16 as well.
Claim 5:
Cmielowski further teaches or suggests wherein the set of drifted records result at least from production data sets at payload time (see para. 0016 - metrics for representing the accuracy can include drift, confidence, and uncertainty. Drift can refer to a scenario where the model may develop a growing bias in favor or disfavor of certain classifications; para. 0020 - classifier 110 can select the classification for each of multiple input transactions and update the transactions 112 to include the selected classification; para. 0022 - if the machine learning model 104 begins to drift; para. 0023 - metrics that are based on the accuracy of predictions and/or the probability of accurate predictions. Accordingly, the ML model manager 108 can generate scored transactions 114, identify training outliers 116 of the scored transactions 114; para. 0024 - Scoring the transactions 112 can involve determining a metric, such as the confidence and/or uncertainty for each of the transactions 112. These and/or other metrics can describe how well the classifier 110 of the machine learning model 104 is performing its classifications. The training outliers 116 can represent a subset of the scored transactions 114 that impact the specific metric goal; para. 0025 – metric goals may include uncertainty, drift, fairness, and the like.).
Brandes further teaches or suggests having different characteristics from training datasets during the training (see Figs. 3, 4; para. 0005 - in the training data sets only a small number of examples occur, or the examples are very much underrepresented compared to other cases; para. 0023 - e.g., in 98+% of all predictions 9 of 10 classes are always predicted, this may have two reasons: ... or (ii) the training model underlying the classifier does not "see" the last class because it has not been reflected accordingly in the training data set; para. 0040 - 'underrepresented class' may denote a class of a classifier which may more or less never – or close to never – be predicted because the underlying machine-learning model has not been trained enough for this class because the training data set has too few examples of the underrepresented class; para. 0054 – q classes (q<n) are underrepresented in the training data set, i.e., the number of samples in the class < the number of overall samples/(n*(|mean-median|)), also called rare cases. Different thresholds may be used to define a rare case and the |mean-median| factor is one example for an implementation; para. 0058 - if it is determined by the evaluator engine 306 that the case is a rare case, the input data are forwarded to the rare case extractor 310. This module is used to potentially enlarge the corpus of training data; para. 0066 - modify the underlying machine-learning model for the classifier 302 so that in future a recognition, also of rare cases, is enhanced.).
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method, taught in Cmielowski, to include having different characteristics from training datasets during the training for the purpose of efficiently causing a machine learning model to detect rare or underrepresented cases and to take actions to improve training on the rare or underrepresented cases, improving model performance, as taught by Brandes (paras. 0023, 0040, 0054, 0066).
Claims 12 and 19:
Claims 12 and 19 correspond to claim 5, and thus, Cmielowski and Brandes teach or suggest the limitations of claims 12 and 19 as well.
Claim 6:
Cmielowski further teaches or suggests wherein the set of drifted production records at payload time are pruned for relabeling ... by using a model confidence distribution (see para. 0016 - metrics for representing the accuracy can include drift, confidence, and uncertainty. Drift can refer to a scenario where the model may develop a growing bias in favor or disfavor of certain classifications; para. 0020 - classifier 110 can select the classification for each of multiple input transactions and update the transactions 112 to include the selected classification; para. 0022 - if the machine learning model 104 begins to drift; para. 0023 - metrics that are based on the accuracy of predictions and/or the probability of accurate predictions. Accordingly, the ML model manager 108 can generate scored transactions 114, identify training outliers 116 of the scored transactions 114; para. 0024 - Scoring the transactions 112 can involve determining a metric, such as the confidence and/or uncertainty for each of the transactions 112. These and/or other metrics can describe how well the classifier 110 of the machine learning model 104 is performing its classifications. The training outliers 116 can represent a subset of the scored transactions 114 that impact the specific metric goal; para. 0025 – may generate re-training data 118 by re-labeling the training outliers 116. Alternatively, the ML model manager 108 can provide an interface (not shown) to re-label the training outliers 116. Re-labeling the training outliers 116 can involve selecting a new classification for the training outlier 116. Other metric goals may include uncertainty, drift, fairness, and the like.).
Brandes further teaches or suggests with categories or intervals having an occurrence less than a predetermined threshold in training data (see Figs. 3, 4; para. 0005 - in the training data sets only a small number of examples occur, or the examples are very much underrepresented compared to other cases; para. 0023 - e.g., in 98+% of all predictions 9 of 10 classes are always predicted, this may have two reasons: ... or (ii) the training model underlying the classifier does not "see" the last class because it has not been reflected accordingly in the training data set; para. 0040 - 'underrepresented class' may denote a class of a classifier which may more or less never – or close to never – be predicted because the underlying machine-learning model has not been trained enough for this class because the training data set has too few examples of the underrepresented class; para. 0054 – q classes (q<n) are underrepresented in the training data set, i.e., the number of samples in the class < the number of overall samples/(n*(|mean-median|)), also called rare cases. Different thresholds may be used to define a rare case and the |mean-median| factor is one example for an implementation; para. 0058 - if it is determined by the evaluator engine 306 that the case is a rare case, the input data are forwarded to the rare case extractor 310. This module is used to potentially enlarge the corpus of training data; para. 0066 - modify the underlying machine-learning model for the classifier 302 so that in future a recognition, also of rare cases, is enhanced.).
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method, taught in Cmielowski, to include with categories or intervals having an occurrence less than a predetermined threshold in training data for the purpose of efficiently causing a machine learning model to detect rare or underrepresented cases and to take actions to improve training on the rare or underrepresented cases, improving model performance, as taught by Brandes (paras. 0023, 0040, 0054, 0066).
Claims 13 and 20:
Claims 13 and 20 correspond to claim 6, and thus, Cmielowski and Brandes teach or suggest the limitations of claims 13 and 20 as well.
Claims 3, 10, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Cmielowski, in view of Brandes, in view of Karimibiuki et al., US Publication 2023/0319099 (“Karimibiuki”), and further in view of Jain et al., US Publication 2023/0351252 (“Jain”).
Claim 3:
Karimibiuki teaches or suggests receiving a plurality of second records flagged as drifted, the second records belonging to a new category or interval which is not seen in the training data of the machine learning model (see para. 0102 - drifting samples are detected that deviate from existing classes, and are used as a new class of input for fuzzing the target model; para. 0103 - detecting is based on the plurality of features obtained from each input example, and, in response to the detecting, retraining the model using one or more of the input examples as input; para. 0104 – detecting that a generated label associated with a subset of the input examples is different from a corresponding groundtruth label, and, in response to the detecting, retraining the model using subset of the input examples as input.).
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method, taught in Cmielowski, to include receiving a plurality of second records flagged as drifted, the second records belonging to a new category or interval which is not seen in the training data of the machine learning model for the purpose of efficiently detecting concept drift in a machine learning model based on samples deviating from existing classes, enabling appropriate model modification, as taught by Karimibiuki (paras. 0102 and 0104).
Jain further teaches or suggests obtaining a feature importance vector of input features of the machine learning model; and selecting a proportionate number of the second records from each feature, based on the feature importance vector (see para. 0092 - predictor vector X obtained by considering each feature, x; para. 0134 – feature weights obtained by server 2111, e.g., as determined locally and/or aggregated from or with client feature weights, may be used to select which features are important for the model outputs. This information may be used to improve the local model, e.g., by changing the data that is collected, and thus by adding one or more features to the client training data. This does not necessarily have to imply that existing client training data needs to be extended with the new feature; instead one could use the new feature for new training samples that are added to the client training set; para. 0144 - applying, e.g., an explainability algorithm to selected training samples, and obtaining sample feature weights, indicating the importance of the features for the selected training samples.).
Accordingly, it would have been obvious to one having ordinary skill before the effective filing date of the claimed invention to modify the system and method, taught in Cmielowski, to include obtaining a feature importance vector of input features of the machine learning model; and selecting a proportionate number of the second records from each feature, based on the feature importance vector, for the purpose of efficiently selecting training samples based on feature importance, improving model training, as taught by Jain (0134 and 0144).
Claim(s) 10 and 17:
Claim(s) 10 and 17 correspond to Claim 3, and thus, Cmielowski, Brandes, Karimibiuki, and Jain teach or suggest the limitations of claim(s) 10 and 17 as well.
Claim(s) 4, 11, and 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Cmielowski, in view of Brandes, in view of Karimibiuki, in view of Jain, and further in view of Vandikas et al., US Publication 2024/0357380 (“Vandikas”).
Claim 4:
Cmielowski further teaches or suggests receiving a plurality of third records flagged as drifted (see para. 0016 - metrics for representing the accuracy can include drift, confidence, and uncertainty. Drift can refer to a scenario where the model may develop a growing bias in favor or disfavor of certain classifications; para. 0020 - classifier 110 can select the classification for each of multiple input transactions and update the transactions 112 to include the selected classification; para. 0022 - if the machine learning model 104 begins to drift; para. 0023 - metrics that are based on the accuracy of predictions and/or the probability of accurate predictions. Accordingly, the ML model manager 108 can generate scored transactions 114, identify training outliers 116 of the scored transactions 114; para. 0024 - Scoring the transactions 112 can involve determining a metric, such as the confidence and/or uncertainty for each of the transactions 112. These and/or other metrics can describe how well the classifier 110 of the machine learning model 104 is performing its classifications. The training outliers 116 can represent a subset of the scored transactions 114 that impact the specific metric goal; para. 0025 – may generate re-training data 118 by re-labeling the training outliers 116. Alternatively, the ML model manager 108 can provide an interface (not shown) to re-label the training outliers 116. Re-labeling the training outliers 116 can involve selecting a new classification for the training outlier 116. Metric goals may include uncertainty, drift, fairness, and the like.).
Vandikas further teaches or suggests the third records belonging to an existing category or interval which has a number of records equal to or greater than the predetermined threshold in the training data of the machine learning model (see para. 0004 - to select and balance imbalanced datasets from different communication devices for the decentralized autoencoder to learn from. As used herein an "imbalanced dataset" refers to a dataset that includes more than one class of data, e.g. two classes, and distribution of samples of data across the classes, or within a class, is not uniform. The classes include a "majority class" having a greater number of samples and a "minority class" having a fewer number of samples than the majority class. The distribution of samples can range from a slight imbalance to a more severe imbalance (e.g., where there is one sample in the minority class and hundreds, thousands, millions, etc. of samples in the majority class); para. 0073 - shares information with communication device 101 about the number of samples it has and the label distribution or the number of samples that are marked as a minority dataset class (e.g., sleeping) and the number of samples that are marked as a majority dataset class (e.g., non-sleeping). Once that is received, communication device 101 (e.g., a parameter server (PS)) computes 305 the expected size and label distribution for a training dataset.).
Accordingly, it would have been obvious to one having ordinary skill before the effective filing date of the claimed invention to modify the system and method, taught in Cmielowski, to include the third records belonging to an existing category or interval which has a number of records equal to or greater than the predetermined threshold in the training data of the machine learning model for the purpose of efficiently ascertaining data imbalance and undertaking specific training data set actions based on the imbalance information, improving model training and performance, as taught by Vandikas (0004 and 0073).
Jain further teaches or suggests obtaining a feature importance vector of input features of the machine learning model; and selecting or ignoring the third records based on the feature importance vector (see para. 0092 - predictor vector X obtained by considering each feature, x; para. 0134 – feature weights obtained by server 2111, e.g., as determined locally and/or aggregated from or with client feature weights, may be used to select which features are important for the model outputs. This information may be used to improve the local model, e.g., by changing the data that is collected, and thus by adding one or more features to the client training data. This does not necessarily have to imply that existing client training data needs to be extended with the new feature; instead one could use the new feature for new training samples that are added to the client training set; para. 0144 - applying, e.g., an explainability algorithm to selected training samples, and obtaining sample feature weights, indicating the importance of the features for the selected training samples.).
Accordingly, it would have been obvious to one having ordinary skill before the effective filing date of the claimed invention to modify the system and method, taught in Cmielowski, to include obtaining a feature importance vector of input features of the machine learning model; and selecting or ignoring the third records based on the feature importance vector for the purpose of efficiently selecting training samples based on feature importance, improving model training, as taught by Jain (0134 and 0144).
Claim(s) 11 and 18:
Claim(s) 11 and 18 correspond to Claim 4, and thus, Cmielowski, Brandes, Karimibiuki, Jain, and Vandikas teach or suggest the limitations of claim(s) 11 and 18 as well.
Claim(s) 7 and 14 is/are rejected under 35 U.S.C. 103 as being unpatentable over Cmielowski, in view of Brandes, and further in view of Jain.
Claim 7:
Cmielowski further teaches or suggests wherein the drifted production records at payload time are pruned for relabeling (see para. 0016 - metrics for representing the accuracy can include drift, confidence, and uncertainty. Drift can refer to a scenario where the model may develop a growing bias in favor or disfavor of certain classifications; para. 0020 - classifier 110 can select the classification for each of multiple input transactions and update the transactions 112 to include the selected classification; para. 0022 - if the machine learning model 104 begins to drift; para. 0023 - metrics that are based on the accuracy of predictions and/or the probability of accurate predictions. Accordingly, the ML model manager 108 can generate scored transactions 114, identify training outliers 116 of the scored transactions 114; para. 0024 - Scoring the transactions 112 can involve determining a metric, such as the confidence and/or uncertainty for each of the transactions 112. These and/or other metrics can describe how well the classifier 110 of the machine learning model 104 is performing its classifications. The training outliers 116 can represent a subset of the scored transactions 114 that impact the specific metric goal; para. 0025 – may generate re-training data 118 by re-labeling the training outliers 116. Alternatively, the ML model manager 108 can provide an interface (not shown) to re-label the training outliers 116. Re-labeling the training outliers 116 can involve selecting a new classification for the training outlier 116. Metric goals may include uncertainty, drift, fairness, and the like.).
Brandes further teaches or suggests with unseen categories or ranges in the training data (see Fig. 3, 4; para. 0005 - in the training data sets only a small number of examples occur, or the examples are very much underrepresented compared to other cases; para. 0023 - e.g., in 98+% of all predictions 9 of 10 classes are always predicted, this may have two reasons: ... or (ii) the training model underlying the classifier does not "see" the last class because it has not been reflected accordingly in the training data set; para. 0040 - an 'underrepresented class' may denote a class of a classifier which may more or less never, or close to never, be predicted because the underlying machine-learning model has not been trained enough for this class because the training data set has too few examples of the underrepresented class; para. 0054 – q classes (q<n) are underrepresented in the training data set, i.e., the number of samples in the class < the number of overall samples/(n*(|mean-median|)), also called rare cases. Different thresholds may be used to define a rare case and the |mean-median| factor is one example for an implementation; para. 0058 - if it is determined by the evaluator engine 306 that the case is a rare case, the input data are forwarded to the rare case extractor 310. This module is used to potentially enlarge the corpus of training data; para. 0066 - modify the underlying machine-learning model for the classifier 302 so that in future a recognition, also of rare cases, is enhanced.).
Accordingly, it would have been obvious to one having ordinary skill before the effective filing date of the claimed invention to modify the system and method, taught in Cmielowski, to include with unseen categories or ranges in the training data for the purpose of efficiently causing a machine learning model to detect rare or underrepresented cases and to take actions to improve training on the rare or underrepresented cases, improving model performance, as taught by Brandes (0023, 0040, 0054, 0066).
Jain further teaches or suggests by using a feature importance of the machine learning model (see para. 0092 - predictor vector X obtained by considering each feature, x; para. 0134 – feature weights obtained by server 2111, e.g., as determined locally and/or aggregated from or with client feature weights, may be used to select which features are important for the model outputs. This information may be used to improve the local model, e.g., by changing the data that is collected, and thus by adding one or more features to the client training data. This does not necessarily have to imply that existing client training data needs to be extended with the new feature; instead one could use the new feature for new training samples that are added to the client training set; para. 0144 - applying, e.g., an explainability algorithm to selected training samples, and obtaining sample feature weights, indicating the importance of the features for the selected training samples.).
Accordingly, it would have been obvious to one having ordinary skill before the effective filing date of the claimed invention to modify the system and method, taught in Cmielowski, to include by using a feature importance of the machine learning model for the purpose of efficiently selecting training samples based on feature importance, improving model training, as taught by Jain (0134 and 0144).
Claim 14:
Claim 14 corresponds to Claim 7, and thus, Cmielowski, Brandes, and Jain teach or suggest the limitations of claim 14 as well.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Andrew T McIntosh whose telephone number is (571)270-7790. The examiner can normally be reached M-Th 8:00am-5:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tamara Kyle can be reached at 571-272-4241. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ANDREW T MCINTOSH/Primary Examiner, Art Unit 2144