Prosecution Insights
Last updated: April 19, 2026
Application No. 18/147,759

SYSTEM AND METHOD FOR IDENTIFYING POISONED TRAINING DATA USED TO TRAIN ARTIFICIAL INTELLIGENCE MODELS

Final Rejection §103
Filed: Dec 29, 2022
Examiner: GOLAN, MATTHEW BRYCE
Art Unit: 2123
Tech Center: 2100 — Computer Architecture & Software
Assignee: DELL PRODUCTS, L.P.
OA Round: 2 (Final)

Grant Probability: 0% (At Risk)
OA Rounds: 3-4
To Grant: 3y 3m
With Interview: 0%

Examiner Intelligence

Career Allow Rate: 0% (0 granted / 3 resolved; -55.0% vs TC avg); grants only 0% of cases
Interview Lift: +0.0% across resolved cases with interview (minimal lift)
Avg Prosecution: 3y 3m typical timeline; 36 applications currently pending
Total Applications: 39 across all art units (career history)

Statute-Specific Performance

§101: 27.5% (-12.5% vs TC avg)
§103: 37.5% (-2.5% vs TC avg)
§102: 8.3% (-31.7% vs TC avg)
§112: 23.7% (-16.3% vs TC avg)
Deltas shown vs Tech Center average estimate • Based on career data from 3 resolved cases

Office Action

§103
DETAILED ACTION

This Office Action is in response to communications filed on December 29, 2025 for Application No. 18/147,759, in which claims 1-20 are presented for examination. The amendments filed on December 29, 2025 have been entered, where claims 1-2, 4-7, 9, 13-14, 16-18, and 20 are amended.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement

The information disclosure statements submitted on 09/19/2025, 10/01/2025, 10/24/2025, and 12/26/2025 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements were considered by the examiner.

Claim Objections

Claims 1-20 are objected to because of the following informalities:
“generating, by the first instance of the AI model to, a first plurality of inferences” (Claim 1, ln. 9; Claim 13, ln. 11; Claim 17, ln. 13) should be “generating, by the first instance of the AI model, a first plurality of inferences” (objection applies equally to dependent claims 2-12, 14-16, and 18-20).
“with confidence scores with confidence scores” (Claim 1, ln. 21-22; Claim 13, ln. 22-23; Claim 17, ln. 26) should be “with confidence scores” (objection applies equally to dependent claims 2-12, 14-16, and 18-20).
“one of the first plurality of inferences the first instance of the AI model” (Claim 4, ln. 3-4; Claim 16, ln. 3-4; Claim 20, ln. 3-4) should be “one of the first plurality of inferences of the first instance of the AI model” (objection applies equally to dependent Claim 5).
Appropriate correction is required.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 6-13, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Jurzak et al. (hereinafter Jurzak) (Pat. Pub. No. US 2023/0004654 A1) in view of Tran et al. (hereinafter Tran) (“Spectral Signatures in Backdoor Attacks”) and Hendrycks et al. (hereinafter Hendrycks) (“Natural Adversarial Examples”).

Regarding Claim 1, Jurzak teaches a method for identifying poisoned training . . . [of] an artificial intelligence (AI) model, comprising (Para. [0010], “Disclosed herein are methods and apparatus for detecting malicious re-training of an anomaly detection system”, where, in this instance, “malicious re-training” is done through training data poisoning because the “malicious offender” provides poisoned data, see Para.
[0021], “the offender may, through multiple engagements with the system, incrementally push the threshold higher until the anomaly detection system learns to ignore the particular scenario, rather than identifying it as an anomaly and triggering an alert”, which is used by the system for continual training, see Para. [0071], “snapshots of the anomaly detection models used by the analysis engine over time may be taken and stored in a training repository at predetermined or configurable intervals or when the number or significance of changes made to the anomaly detection model through machine learning, whether malicious or not, trigger a snapshot) (emphasis added): obtaining a first instance of the AI model, the first instance of the AI model having been trained using a first training dataset (Fig. 5; Para. [0071], “block 504 with storing a plurality of anomaly detection models used in an anomaly detection system at respective points in time. For example, in various embodiments, snapshots of the anomaly detection models used by the analysis engine over time may be taken and stored in a training repository at predetermined or configurable intervals or when the number or significance of changes made to the anomaly detection model through machine learning, whether malicious or not, trigger a snapshot”, where the “plurality of anomaly detection models” are “snapshots” that correspond to “significance of changes made to the anomaly detection model through machine learning”, and therefore, any of the plurality qualify as the first instance of the AI model and its associated training data is the first training dataset; see also Para. [0012], “a repository storing a plurality of anomaly detection models, and a machine-learning-based analysis engine”); training the first instance of the AI model using a second training dataset to obtain a second instance of the AI model (Para. [0073], “determining, by a current anomaly detection model”, where the “current anomaly detection model”, which is the second instance of the AI model, must be obtained to be used for determining; Para. [0071], “snapshots of the anomaly detection models used by the analysis engine over time may be taken and stored in a training repository at predetermined or configurable intervals or when the number or significance of changes made to the anomaly detection model through machine learning, whether malicious or not, trigger a snapshot”, where, as discussed above, “the anomaly detection model” is continuously “change[d]” “through machine learning” and a “plurality of anomaly detection models” are stored at “significan[t]” points of change, so the “current anomaly detection model” is obtained by training the first instance of the AI model with additional “machine learning” data, which is the second training dataset); generating, by the first instance of the AI model to, a first . . . [inference] (Fig. 5; Para. [0072], “506, the method includes storing, for each of the anomaly detection models, a respective classification result determined by the anomaly detection model for a given input indicating whether the given input is considered to represent an anomaly, where the classification result is associated with a respective classification score”); generating, by the second instance of the AI model, a second . . . [inference] (Fig. 5; Para. 
[0073], “At 508, method 500 includes determining, by a current anomaly detection model, a classification result for the given input indicating whether the given input is considered to represent an anomaly”); making a comparison between the first . . . [inference] and the second . . . [inference] to identify whether the second training dataset comprises poisoned training data (Fig. 5, “510 NEGATIVE CLASSIFICATION RESULT FROM CURRENT MODEL DIFFERS FROM POSITIVE RESULT OF AN EARLIER MODEL?”, where the comparison is used to determine whether the current model was “deliberately re-trained to falsely classify”, which is determining whether the second dataset comprises poisoned training data, see generally Para. [0059], “determining, by the anomaly detection system dependent on the respective classification results, that it is likely that the anomaly detection system has been deliberately re-trained to falsely classify the input as representing an object or event that should not be classified as an anomaly” and Para. [0021], “a malicious offender may be aware of machine learning or other artificial intelligence modules in a security system and, as a part of a planned attack, may influence these modules in a way that an anomaly will not be correctly identified, and no alert will be triggered in response to the attack. More specifically, a malicious offender may re-train an anomaly detection system by deliberately causing a change to a threshold value at which an alert is triggered for a particular scenario”), the comparison being based on a variation . . . in the first . . . [inference] and the second . . . [inference] . . . [the inferences] being ones of the first . . . [inference] and the second . . . [inference] (Fig. 5, “510 NEGATIVE CLASSIFICATION RESULT FROM CURRENT MODEL DIFFERS FROM POSITIVE RESULT OF AN EARLIER MODEL?”, where the comparison, occurring at “510” is based on a variation, “DIFFERS”, in the outputs of the first and second inferences, “CLASSIFICATION RESULT FROM CURRENT MODEL DIFFERS FROM POSITIVE RESULT OF AN EARLIER MODEL”) while the first . . . [inference] and the second . . . [inference] can also comprise . . . [instances of the first inference and the second inference] that are ones of the first . . . [inference] and the second . . . [not used in the comparison] . . . (Para. [0064], “at 406, the input is not classified as an anomaly, the method includes auditing the anomaly detection system to determine if there has been a malicious re-training of the machine-learning-based analysis engine, as in 408. An example method for auditing the anomaly detection system is illustrated in FIG. 5 and described below. On the other hand, if the input is classified as an anomaly, the method proceeds to 410”, where the use of an inference for “auditing the anomaly detection system to determine if there has been a malicious re-training” is based on a characteristic of the inference, “not classified as an anomaly”, such that a subset of the inferences trigger auditing, “408”, while other inferences like the second inference indicating an anomaly, “if the input is classified as an anomaly, the method proceeds to 410”, and the associated first inference are not used in the comparison, see Fig. 4); in a first instance of the comparison where the second training dataset is poisoned (Fig. 5; Para. 
[0074], “If, at 510, it is determined that a negative classification result produced by the analysis engine using the current anomaly detection model differs from positive result produced by the analysis engine using an earlier anomaly detection model, the method continues at 512”, where the method may lead to the conclusion that the second training dataset is poisoned, see Para. [0078], “[if further conditions are met,] outputting an indication that malicious re-training of the anomaly detection models has likely occurred”): identifying a poisoned . . . [model that was poisoned by] the second training dataset, and remediating the poisoned . . . [retraining] (Para. [0076], “If, at 514, it is determined that the change in classification scores over time is inconsistent with natural learning patterns, the method continues at 516. For example, determining that it is likely that the anomaly detection system has been deliberately re-trained to falsely classify the input”; Para. [0047], “Corrective action may then be taken to reverse the re-training”; see also Para. [0060], “At 314, the method includes, in response to determining that it is likely that the anomaly detection system has been deliberately re-trained, initiating, by the anomaly detection system, an action to correctly classify the input as representing an object or event that should be classified as an anomaly. For example, in various embodiments, taking corrective action may include . . . modifying a parameter of the first anomaly detection model to generate a third anomaly detection model to be used by the analysis engine at a future point in time”); and in a second instance of the comparison where the second training dataset is unpoisoned (Fig. 5; Para. [0074], “If, at 510, it is determined that a negative classification result produced by the analysis engine using the current anomaly detection model differs from positive result produced by the analysis engine using an earlier anomaly detection model, the method continues at 512. Otherwise, the method proceeds to 518”, where the absence of a condition indicating poisoning, “a negative classification . . . [by] the current anomaly detection model differs from positive result produced by . . . an earlier anomaly detection model”, results in a branching in the model for unpoisoned current model instances, e.g. “Otherwise”): further training the second instance of the AI model using a third training dataset to obtain a third instance of the AI model (Fig. 5; Para. [0079], “At 518, the method includes providing the given input and the final classification result, which may represent the initial classification result produced by the analysis engine using the current anomaly detection model at 508 . . . to the training repository”; where the training data at the “training repository” is updated, for use as third training data, and the model continues to be iteratively updated, see generally Para. [0071], “storing a plurality of anomaly detection models used in an anomaly detection system at respective points in time. For example, in various embodiments, snapshots of the anomaly detection models used by the analysis engine over time may be taken and stored in a training repository at predetermined or configurable intervals or when the number or significance of changes made to the anomaly detection model through machine learning, whether malicious or not, trigger a snapshot”, where the “current” model may become a snapshot if one of the above-mentioned conditions are met). 
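For orientation, the workflow the rejection reads onto Claim 1 can be summarized in a short sketch. The following Python is a minimal illustration only, not code from the application or from Jurzak; the helper callables (train, predict, compare_inferences, remediate) and their signatures are assumptions introduced here.

# Minimal sketch of the Claim 1 workflow as the rejection characterizes it: snapshot a first
# model instance, retrain it into a second instance, generate inferences from both, compare
# them to decide whether the second training dataset is poisoned, then remediate or keep training.
from copy import deepcopy

def audit_retraining(first_instance, second_dataset, third_dataset, audit_inputs,
                     train, predict, compare_inferences, remediate):
    second_instance = train(deepcopy(first_instance), second_dataset)

    first_inferences = [predict(first_instance, x) for x in audit_inputs]
    second_inferences = [predict(second_instance, x) for x in audit_inputs]

    # Comparison between the two pluralities of inferences (Jurzak Fig. 5, step 510).
    if compare_inferences(first_inferences, second_inferences):
        # First branch: the second training dataset is poisoned; identify and remediate.
        return remediate(first_instance, second_instance, second_dataset)
    # Second branch: the second training dataset looks clean; continue training.
    return train(second_instance, third_dataset)

The two branches mirror the claimed first and second instances of the comparison: remediation when the second training dataset is flagged as poisoned, and further training on a third training dataset otherwise.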
Jurzak does not explicitly disclose: . . . data used for training (where the poisoned data is not specifically identified) . . . plurality of inferences (where a plurality of first and second inferences are not specifically described as being part of the comparison; subsequent recitations omitted) . . . in confidence scores of only hard examples included . . . the hard examples . . . with confidence scores that are lower than a score threshold . . . easy examples . . . with confidence scores with confidence scores that are higher than the score threshold (where the comparison is not specifically described as involving confidence scores or a subset of confidence scores indicating difficulty) . . . portion of training data of . . . and . . . portion of training data (where the poisoned data in the second training dataset is not specifically identified). However, Tran teaches . . . [identifying poisoned training] data used for training (Pg. 1, Abstract, “A recent line of work has uncovered a new form of data poisoning: so-called backdoor attacks . . . we identify a new property of all known backdoor attacks, which we call spectral signatures. This property allows us to utilize tools from robust statistics to thwart the attacks. We demonstrate the efficacy of these signatures in detecting and removing poisoned examples on real image sets”, where the “poisoned examples” are poisoned training data, which must be identified to be detected) . . . [using a] plurality of inferences (Pg. 5, Fig. 3, “Figure 3: Illustration of the pipeline. We first train a neural network on the data. Then, for each class, we extract a learned representation for each input from that class. We next take the singular value decomposition of the covariance [matrix] of these representations and use this to compute an outlier score for each example. Finally, we remove inputs with the top scores and re-train”, where an output is generated for “each input from the class”, and where this plurality of outputs are classification inferences, see generally Pg. 1-2, Para. 4-1, “Rather than causing the model’s test accuracy to degrade, the adversary’s goal is for the network to misclassify the test inputs when the data point has been altered by the adversary’s choice of perturbation”) . . . [identifying a poisoned] portion of training data of [a training dataset] . . . [and remediating the poisoned] portion of training data . . . (Pg. 1, Abstract, “We demonstrate the efficacy of these signatures in detecting and removing poisoned examples on real image sets”; Pg. 5, Fig. 3, “Figure 3: Illustration of the pipeline. We first train a neural network on the data. Then, for each class, we extract a learned representation for each input from that class. We next take the singular value decomposition of the covariance [matrix] of these representations and use this to compute an outlier score for each example. Finally, we remove inputs with the top scores and re-train”, where “remov[al]” and “re-train[ing]” is remediation). 
Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to combine the method for identifying poisoned training data used to poison an AI model by comparing an inference of the model with an inference of a previous version of the model, as well as the remediation of model after the identification of poisoning of Jurzak with the identification and remediation of poisoned training data using a plurality of model inferences of Tran in order to compare a plurality of inferences instead of a single inference, which increases the likelihood that all poisoned data will be detected (Tran, Pg. 2, Para. 3, “In our experiments, we are able to use spectral signatures to reliably remove many—in fact, often all—of the corrupted training examples, reducing the misclassification rate on backdoored test points to within 1% of the rate achieved by a standard network trained on a clean training set”), which allows for effective removal of poisoned data, which in turn allows unpoisoned data associated with for previously poisoned datasets to be reused for further training (Tran, Pg. 5, Fig. 3, “Finally, we remove inputs with the top scores and re-train”), thereby preventing waste of resources. Additionally, Hendrycks teaches . . . [the use of inference confidence scores to determine whether additional model evaluation computations should be performed, where]in confidence scores of only hard examples included [in the conditional logic pathway triggering additional computations] . . . the hard examples . . . with confidence scores that are lower than a score threshold (Pg. 4, Col. 2, Para. 3, “If either ResNet-50 assigns greater than 15% confidence to the correct class, the image is also removed; this is done so that adversarially filtered examples yield misclassifications with low confidence in the correct class, like in untargeted adversarial attacks”, where the model, “ResNet-50”, generates an inference, see Pg. 1, Col. 1, Fig. 1, “the red text is a ResNet-50 prediction and its confidence”; and where the inference’s “confidence” score, when indicating a hard example lower than a score threshold - scores not “greater than 15% confidence to the correct class”, triggers additional model evaluation computations in a conditional logic pathway, see Pg. 2, Col. 1, Para. 2, “By using adversarial filtration, we can test how well models perform when simple-to-classify examples are removed, which includes examples that are solved with simple spurious cues”, where removing the “simple-to-classify examples” means that only hard examples will be included for use in model evaluation computations, “test how well models perform”) [from a set of inferences that also comprise] . . . easy examples . . . with confidence scores with confidence scores that are higher than the score threshold (Pg. 4, Col. 2, Para. 3, “If either ResNet-50 assigns greater than 15% confidence to the correct class, the image is also removed; this is done so that adversarially filtered examples yield misclassifications with low confidence in the correct class, like in untargeted adversarial attacks”, where the easy examples are the “ResNet-50” inferences that are “assign[ed] greater than 15% confidence to the correct class”). 
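The Hendrycks-style filtration relied on above reduces to a confidence-threshold test. A hedged sketch follows, assuming a predict_proba callable that returns class probabilities; the 0.15 default reflects the quoted 15% figure, and the function name and return layout are illustrative only.

# Split examples into hard (low confidence in the correct class) and easy (above the threshold),
# so that only hard examples feed later evaluation or auditing steps.
from typing import Callable, List, Sequence, Tuple

def split_hard_easy(inputs: Sequence, labels: Sequence[int],
                    predict_proba: Callable[[object], Sequence[float]],
                    threshold: float = 0.15) -> Tuple[List[int], List[int]]:
    hard, easy = [], []
    for i, (x, y) in enumerate(zip(inputs, labels)):
        confidence_in_correct = predict_proba(x)[y]
        if confidence_in_correct > threshold:
            easy.append(i)   # simple-to-classify; excluded from the audit comparison
        else:
            hard.append(i)   # low confidence in the correct class; retained as a hard example
    return hard, easy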
Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to combine the method, where variances between a first plurality of inferences and a second plurality of inferences are used to identify whether poisoned training data was used to generate the model associated with the second plurality of references, and wherein characteristics of the plurality of inferences are used to determine whether the inference should be used in the comparison of Jurzak in view of Tran with the use of inference confidence scores to determine whether additional model evaluation computations should be performed, wherein confidence scores of only hard examples are included in the conditional logic pathway triggering additional computations, and wherein the hard examples have confidence scores that are lower than a score threshold and are included from a set of inferences that also comprises easy examples with confidence scores higher than the score threshold of Hendrycks in order to trigger auditing of machine learning models only when outputs are associated with an insufficient level of confidence, which may indicate the presence of both targeted and untargeted adversarial attacks (compare Hendrycks, Pg. 4, Col. 2, Para. 3, “If either ResNet-50 assigns greater than 15% confidence to the correct class, the image is also removed; this is done so that adversarially filtered examples yield misclassifications with low confidence in the correct class, like in untargeted adversarial attacks”, where outputs associated with “low confidence” outputs indicate “adversarial attacks” and trigger additional computations, with Jurzak, Abstract, “[If the method] determines, based on the respective classification results, that it is likely that the anomaly detection system has been deliberately re-trained to falsely classify the input . . . [it] initiates an action to correctly classify the input as representing an object or event that should be classified as an anomaly”, where outputs indicating an adversarial attack trigger additional computations, “an action to correctly classify”, and where a person of ordinary skill in the art would reasonably expect “deliberately re-train[ing]” to, at least temporarily, decrease model confidence), and which will reduce storage constraints (see Jurzak, Para. [0067], “In either case, at 416, the method includes storing the input and the final classification result in a training repository the analysis engine”, where use of a relative metric for detection of adversarial training requires storage of every input, regardless of level of suspicious activity, which would not be necessary when employing an absolute metric, like the threshold of Hendrycks, see generally Hendrycks, Pg. 4, Col. 2, Para. 3, “If either ResNet-50 assigns greater than 15% confidence to the correct class, the image is also removed”, where removal of data associated with greater confidence allows for the implementation of the practice of sparse storage, which a person of ordinary skill in the art would understand to reduce storage constraints), reduce computational load, and increase accuracy (Hendrycks, Pg. 6, Col. 2, Para. 2, “Our metric for assessing robustness to adversarially filtered examples for classifiers is the top-1 accuracy on IMAGENET-A”; see also Hendrycks, Pg. 2, Col. 1, Para. 
2, “Our examples demonstrate that it is possible to reliably fool many models with clean natural images, while previous attempts at exposing and measuring model fragility rely on synthetic distribution corruptions [20, 29], artistic renditions [27], and adversarial distortions”). Regarding Claim 6, Jurzak in view of Tran and Hendrycks teach the method of claim 1, wherein remediating the poisoned portion of training data comprises: updating the second training dataset by removing the poisoned portion of training data from the second training dataset to obtain an updated second training dataset; and replacing the second instance of the Al model by training the first instance of the Al model using the updated second training dataset (Jurzak, Para. [0060], “in response to determining that it is likely that the anomaly detection system has been deliberately re-trained . . . taking corrective action may include . . . modifying a parameter of the first anomaly detection model to generate a third anomaly detection model to be used by the analysis engine at a future point in time”, where the “first anomaly detection model” is analogous to the current anomaly detection model of Fig. 4, see Jurzak, Para. [0010], “a first anomaly detection model used by the analysis engine at a current point in time”, and where in view of Tran, the “modifying . . . to generate a third anomaly detection model” includes “remov[al]” of poisoned data from the second training dataset, which obtains an updated second training dataset, and “re-train[ing]” of the model, see Tran, Pg. 5, Fig. 3, “Figure 3: Illustration of the pipeline. We first train a neural network on the data. Then, for each class, we extract a learned representation for each input from that class. We next take the singular value decomposition of the covariance [matrix] of these representations and use this to compute an outlier score for each example. Finally, we remove inputs with the top scores and re-train”). The reasons of obviousness are discussed above in regard to the rejection of Claim 1 and remain applicable here. Regarding Claim 7, Jurzak in view of Tran and Hendrycks teach the method of claim 1, wherein remediating the poisoned portion of training data comprises: identifying a poisoned inference and an unpoisoned inference generated by from the second instance of the Al model (Jurzak, Para. [0073], “At 508, method 500 includes determining, by a current anomaly detection model, a classification result for the given input indicating whether the given input is considered to represent an anomaly”; Jurzak, Para. [0060], “in response to determining that it is likely that the anomaly detection system has been deliberately re-trained . . . taking corrective action may include . . .”, and where in view of Tran, the “corrective action” includes remediation of the poisoned portion by “remov[al]” of “inputs with the top scores”, which are poisoned inferences of the model, and retention of all other data, which are associated with unpoisoned inferences that do not generate “top scores”, see Tran, Pg. 5, Fig. 3, “Figure 3: Illustration of the pipeline. We first train a neural network on the data. Then, for each class, we extract a learned representation for each input from that class. We next take the singular value decomposition of the covariance [matrix] of these representations and use this to compute an outlier score for each example. 
Finally, we remove inputs with the top scores and re-train”), the second instance of the Al model being trained by the second training dataset comprising the poisoned portion of training data, and the poisoned inference comprises a hard example of the hard examples (Jurzak, Para. [0073], “determining, by a current anomaly detection model”, where as discussed above, the “current anomaly detection model” is the second instance trained with second data, which includes the poisoned portion leading to a poisoned inference when “malicious re-training” has occurred see Jurzak, Para. [0071], “the number or significance of changes made to the anomaly detection model through machine learning, whether malicious or not” and Jurzak, Para. [0073], “At 508, method 500 includes determining, by a current anomaly detection model, a classification result for the given input indicating whether the given input is considered to represent an anomaly”, and which in view of Hendrycks comprises a hard example of the hard examples, see Hendrycks, Pg. 4, Col. 2, Para. 3, “If either ResNet-50 assigns greater than 15% confidence to the correct class, the image is also removed; this is done so that adversarially filtered examples yield misclassifications with low confidence in the correct class, like in untargeted adversarial attacks”); remediating the poisoned inference using a replacement second instance of the Al model (Jurzak, Para. [0060], “in various embodiments, taking corrective action may include . . . modifying a parameter of the first anomaly detection model to generate a third anomaly detection model to be used by the analysis engine at a future point in time”; Jurzak, Para. [0078], “At 516, method 500 includes outputting a classification result indicating that the given input likely represents an anomaly, which may trigger an alert”), the replacement second instance of the Al model being trained by an updated second training dataset, the updated second training dataset not comprising the poisoned portion of training data (Jurzak, Para. [0060], “taking corrective action may include . . . modifying . . . the first anomaly detection model to generate a third anomaly detection model”, where in view of Tran the modifying includes “re-train[ing]” after removing the poisoned portion of data, see Tran, Pg. 5, Fig. 3, “Figure 3: Illustration of the pipeline. We first train a neural network on the data. Then, for each class, we extract a learned representation for each input from that class. We next take the singular value decomposition of the covariance [matrix] of these representations and use this to compute an outlier score for each example. Finally, we remove inputs with the top scores and re-train”); and retaining the unpoisoned inference (Jurzak, Para. [0074] - [0079], “If, at 510, it is determined that a negative classification result produced by the analysis engine using the current anomaly detection model differs from positive result produced by the analysis engine using an earlier anomaly detection model, the method continues at 512. Otherwise, the method proceeds to 518 . . . 
At 518, the method includes providing the given input and the final classification result, which may represent the initial classification result produced by the analysis engine using the current anomaly detection model at 508 or may represent the classification result output at 516 in response to a determination that malicious re-training has taken place, to the training repository”, where, as discussed above and in view of Tran, the plurality of inferences includes both unpoisoned and poisoned inferences, see Tran, Pg. 5, Fig. 3). The reasons of obviousness are discussed above in regard to the rejection of Claim 1 and remain applicable here. Regarding Claim 8, Jurzak in view of Tran and Hendrycks teach the method of claim 7, wherein remediating the poisoned inference comprises notifying an inference consumer that consumed the poisoned inference, of the poisoned inference (Jurzak, Para. [0015], “In any of the disclosed embodiments, taking corrective action may include at least one of outputting a notification of a potentially malicious re-training of the anomaly detection system, triggering an alert indicating that the input represents an object or event that is classified as an anomaly”; Jurzak, Para. [0078], “At 516, method 500 includes outputting a classification result indicating that the given input likely represents an anomaly, which may trigger an alert, and outputting an indication that malicious re-training of the anomaly detection models has likely occurred . . . For example, a notification may be provided to an owner of a facility, space, server, or network being protected using the anomaly detection system or to an operator or administrator of the anomaly detection system or any components thereof, such as any or all of the analysis engine, training repository, or monitoring devices”, where the “owner” or “operator” is within the broadest reasonable interpretation of a consumer because they consumed the poisoned inference by relying on an incorrect “classification” to “protect” the “facility, space, server, or network”). Regarding Claim 9, Jurzak in view of Tran and Hendrycks teach the method of claim 7, wherein remediating the poisoned inference comprises: generating a replacement inference using the replacement second instance of the AI model and an ingest dataset that was used to generate the poisoned inference (Jurzak, Fig. 5; Jurzak, Para. [0078], “At 516, method 500 includes outputting a classification result indicating that the given input likely represents an anomaly, which may trigger an alert”, where in view of Tran, the “corrective action” includes “remov[al]” of poisoned data from the ingest dataset used to generate the poisoned inference and “re-train[ing]” of the model in order to generate a replacement instance that then outputs the replacement inference, e.g. the “classification result indicating that the given input likely represents an anomaly”, see Tran, Pg. 5, Fig. 3, “Figure 3: Illustration of the pipeline. We first train a neural network on the data. Then, for each class, we extract a learned representation for each input from that class. We next take the singular value decomposition of the covariance [matrix] of these representations and use this to compute an outlier score for each example. Finally, we remove inputs with the top scores and re-train”; see also Tran, Pg. 2, Para.
3, “In our experiments, we are able to use spectral signatures to reliably remove many—in fact, often all—of the corrupted training examples, reducing the misclassification rate on backdoored test points to within 1% of the rate achieved by a standard network trained on a clean training set”); and providing the replacement inference to an inference consumer that consumed the poisoned inference (Jurzak, Fig. 5; Jurzak, Para. [0078], “At 516, method 500 includes outputting a classification result indicating that the given input likely represents an anomaly, which may trigger an alert . . . For example, a notification may be provided to an owner of a facility, space, server, or network being protected using the anomaly detection system or to an operator or administrator of the anomaly detection system or any components thereof, such as any or all of the analysis engine, training repository, or monitoring devices”, where an “alert” “that the given input likely represents an anomaly” is provided to an “owner” or “administrator”, which is within the broadest reasonable interpretation of a consumer because they consumed the poisoned inference by relying on an incorrect “classification” to “protect” the “facility, space, server, or network”). The reasons of obviousness are discussed above in regard to the rejection of Claim 1 and remain applicable here. Regarding Claim 10, Jurzak in view of Tran and Hendrycks teach the method of claim 1, wherein the first training dataset is a subset of the second training dataset (Jurzak, Fig. 5; Jurzak, Para. [0071], “block 504 with storing a plurality of anomaly detection models used in an anomaly detection system at respective points in time. For example, in various embodiments, snapshots of the anomaly detection models used by the analysis engine over time may be taken and stored in a training repository at predetermined or configurable intervals or when the number or significance of changes made to the anomaly detection model through machine learning, whether malicious or not, trigger a snapshot”; Jurzak, Para. [0073], “determining, by a current anomaly detection model”, where, as discussed above, the first training data is the training data used to train the model at previous “respective points in time” whereas the second training data was used to create the “current anomaly detection model” “through machine learning”; Jurzak, Para. [0079], “At 518, the method includes providing the given input and the final classification result, which may represent the initial classification result produced by the analysis engine using the current anomaly detection model at 508 or may represent the classification result output at 516 in response to a determination that malicious re-training has taken place, to the training repository” and Tran, Pg. 1-2, Para. 4-1, “Rather than causing the model’s test accuracy to degrade, the adversary’s goal is for the network to misclassify the test inputs when the data point has been altered by the adversary’s choice of perturbation”, where the first training dataset is a subset of the second training dataset because the latter is the former plus additional data from previous iterations of “the method” of Fig. 5 and poisoned data from an “adversary”), and the first training dataset does not comprise the poisoned portion of training data (Jurzak, Fig. 5; Jurzak, Para.
[0076], “If, at 514, it is determined that the change in classification scores over time is inconsistent with natural learning patterns, the method continues at 516”, where the point of “inconsistenc[y]” delineates the models not trained with poisoned data, which includes the first model, from the poisoned models; see also Tran, Pg. 7, Para. 1, “Here, we record the norms of the mean of the representation vectors for both the clean inputs as well as the clean plus corrupted inputs”, where the first dataset is only the “clean inputs”). The reasons of obviousness are discussed above in regard to the rejection of Claim 1 and remain applicable here. Regarding Claim 11, Jurzak in view of Tran and Hendrycks teach the method of claim 6, wherein the first training dataset (Jurzak, Fig. 5; Jurzak, Para. [0071], “block 504 with storing a plurality of anomaly detection models used in an anomaly detection system at respective points in time. For example, in various embodiments, snapshots of the anomaly detection models used by the analysis engine over time may be taken and stored in a training repository at predetermined or configurable intervals or when the number or significance of changes made to the anomaly detection model through machine learning, whether malicious or not, trigger a snapshot”; Jurzak, Para. [0073], “determining, by a current anomaly detection model”, where, as discussed above, the first training data is the training data used to train the model at previous “respective points in time”, prior to being poisoned, see Jurzak, Para. [0076], “If, at 514, it is determined that the change in classification scores over time is inconsistent with natural learning patterns, the method continues at 516”, where the point of “inconsistenc[y]” delineates the models not trained with poisoned data, which includes the first model, from the poisoned models; see also Tran, Pg. 7, Para. 1, “Here, we record the norms of the mean of the representation vectors for both the clean inputs as well as the clean plus corrupted inputs”) and the updated second training dataset (Jurzak, Fig. 5; Jurzak, Para. [0079], “At 518, the method includes providing the given input and the final classification result, which may represent the initial classification result produced by the analysis engine using the current anomaly detection model at 508 or may represent the classification result output at 516 in response to a determination that malicious re-training has taken place, to the training repository”, where the updated training dataset is the first training dataset, already in the “training repository”, plus the additional unpoisoned data “provid[ed]” to the “training repository”; see also Tran, Pg. 7, Para. 1, “Here, we record the norms of the mean of the representation vectors for both the clean inputs as well as the clean plus corrupted inputs”; Tran, Pg. 4, Fig. 3, “We next take the singular value decomposition of the covariance [matrix] of these representations and use this to compute an outlier score for each example. Finally, we remove inputs with the top scores and re-train”) are subsets of the third training dataset (Jurzak, Para. [0079], “the training repository”, therefore the first training dataset and the updated training dataset are subsets of “the training repository”, which in Tran is the third training dataset, see Tran, Pg. 4, Fig. 3, “we remove inputs with the top scores and re-train”; see also Jurzak, Fig. 5; Jurzak, Para.
[0079], “At 518, the method includes providing the given input and the final classification result . . . to the training repository”, where, while a subset can include the entire set, both the first training set and updated training set will become a subset of some, but not all of the third training set after subsequent iterations of the method), and the updated second training dataset does not comprise the poisoned portion of training data (Jurzak, Para. [0060], “in response to determining that it is likely that the anomaly detection system has been deliberately re-trained . . . taking corrective action may include . . .”, where in view of Tran, the “corrective action” includes “remov[al]” of poisoned data and “re-train[ing]” of the model, see Tran, Pg. 5, Fig. 3, “Figure 3: Illustration of the pipeline. We first train a neural network on the data. Then, for each class, we extract a learned representation for each input from that class. We next take the singular value decomposition of the covariance [matrix] of these representations and use this to compute an outlier score for each example. Finally, we remove inputs with the top scores and re-train”, and where the retraining does not include the poisoned portion of the training data because it is “often” entirely “remove[d]”, see Tran, Pg. 2, Para. 3, “In our experiments, we are able to use spectral signatures to reliably remove many—in fact, often all—of the corrupted training examples, reducing the misclassification rate on backdoored test points to within 1% of the rate achieved by a standard network trained on a clean training set”). The reasons of obviousness are discussed above in regard to the rejection of Claim 1 and remain applicable here. Regarding Claim 12, Jurzak in view of Tran and Hendrycks teach the method of claim 6, wherein the second training dataset includes the updated second training dataset, the updated second training dataset not comprising the poisoned portion of training data (Jurzak, Para. [0060], “in response to determining that it is likely that the anomaly detection system has been deliberately re-trained . . . taking corrective action may include . . .”, where in view of Tran, the “corrective action” includes “remov[al]” of poisoned data and “re-train[ing]” of the model, see Tran, Pg. 5, Fig. 3, “Figure 3: Illustration of the pipeline. We first train a neural network on the data. Then, for each class, we extract a learned representation for each input from that class. We next take the singular value decomposition of the covariance [matrix] of these representations and use this to compute an outlier score for each example. Finally, we remove inputs with the top scores and re-train”; and therefore, the second training dataset, e.g. the dataset used for malicious “retrain[ing]”, must include the updated training dataset, e.g. the dataset used for remediating “re-train[ing]”, because the latter is created by only removing data from the former; and the latter does not include the poisoned portion of the training data because it is “often” entirely “remove[d]”, see Tran, Pg. 2, Para. 3, “In our experiments, we are able to use spectral signatures to reliably remove many—in fact, often all—of the corrupted training examples, reducing the misclassification rate on backdoored test points to within 1% of the rate achieved by a standard network trained on a clean training set”). The reasons of obviousness are discussed above in regard to the rejection of Claim 1 and remain applicable here.
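The remediation path mapped onto Claims 6 through 9 in the preceding paragraphs (drop the poisoned portion, retrain the first instance on the updated second training dataset, and regenerate replacement inferences) can be sketched as below. This is an illustrative assumption, not the applicant's implementation; train, predict, and notify_consumer are hypothetical callables.

# Sketch of remediation: remove the poisoned portion, retrain a replacement second instance,
# and deliver replacement inferences for any ingest data that produced a poisoned inference.
from copy import deepcopy

def remediate_poisoned_training(first_instance, second_dataset, poisoned_indices,
                                poisoned_ingest_inputs, train, predict, notify_consumer):
    poisoned = set(poisoned_indices)
    # Updated second training dataset = second dataset minus the identified poisoned portion.
    updated_second_dataset = [example for i, example in enumerate(second_dataset)
                              if i not in poisoned]
    # Replace the second instance by retraining the first instance on the cleaned dataset.
    replacement_second_instance = train(deepcopy(first_instance), updated_second_dataset)

    # Regenerate replacement inferences and notify the consumers that received poisoned ones.
    for ingest in poisoned_ingest_inputs:
        replacement_inference = predict(replacement_second_instance, ingest)
        notify_consumer(ingest, replacement_inference)
    return replacement_second_instance, updated_second_dataset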
Regarding Claim 13, Jurzak teaches a non-transitory machine-readable medium having instructions stored therein, which when executed by a processor, cause the processor to perform . . . (Jurzak, Para. [0087], “an embodiment can be implemented as a computer-readable storage medium having computer-readable code stored thereon for programming a computer (e.g., comprising a processor) to perform a method as described and claimed herein. Examples of such computer-readable storage mediums include, but are not limited to, a hard disk . . . ”; Jurzak, Para. [0011], “In one embodiment, a disclosed machine-learning-based analysis engine of an anomaly detection system includes a processor, and a memory storing program instructions. When executed by the processor, the program instructions cause the processor to perform”). The remaining limitations are substantially the same as limitations of Claim 1, therefore it is rejected under the same rationale. Regarding Claim 17, Jurzak teaches a data processing system, comprising: a processor; and a memory coupled to the processor to store instructions, which when executed by the processor, cause the processor to perform . . . (Jurzak, Para. [0011], “In one embodiment, a disclosed machine-learning-based analysis engine of an anomaly detection system includes a processor, and a memory storing program instructions. When executed by the processor, the program instructions cause the processor to perform”, where the “memory” must be coupled to the processor for its “program instructions” to be “executed by the processor” and the “anomaly detection system” is a data processing system, see Jurzak, Abstract, “An analysis engine of an anomaly detection system receives an input captured by a monitoring device, determines, based on a currently used anomaly detection model, that the input represents an object or event that should not be classified as an anomaly, and determines, based on a previously used model, that the input was previously classified as an anomaly”). The remaining limitations are substantially the same as limitations of Claim 1, therefore it is rejected under the same rationale. Claims 2, 4, 14, 16, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Jurzak in view of Tran, Hendrycks, and Zeng et al. (hereinafter Zeng) (“CNNComparator: Comparative Analytics of Convolutional Neural Networks”). Regarding Claim 2, Jurzak in view of Tran and Hendrycks teach the method of claim 1, wherein comparing the first plurality of inferences to the second plurality of inferences . . . (Jurzak, Fig. 5, “510 NEGATIVE CLASSIFICATION RESULT FROM CURRENT MODEL DIFFERS FROM POSITIVE RESULT OF AN EARLIER MODEL?”, where, in view of Tran, the comparison is for a plurality of inferences for each model, see Tran, Pg. 5, Fig. 3, “Figure 3: Illustration of the pipeline. We first train a neural network on the data. Then, for each class, we extract a learned representation for each input from that class. We next take the singular value decomposition of the covariance [matrix] of these representations and use this to compute an outlier score for each example. Finally, we remove inputs with the top scores and re-train”): obtaining a first snapshot of the first instance of the AI model (Jurzak, Para. [0071], “in various embodiments, snapshots of the anomaly detection models used by the analysis engine over time may be taken and stored in a training repository”), the first snapshot comprising a first inference (Jurzak, Para. 
[0054], “these scores or anomaly classification results and the corresponding inputs representing scenario S may be stored in a repository in association with the machine learning model snapshots that produced them”) for a first hard example of the hard examples, the first hard example having a first confidence score among the confidence scores, the first confidence score being lower than the (Hendrycks, Pg. 2, Col. 1, Para. 2, “By using adversarial filtration, we can test how well models perform when simple-to-classify examples are removed, which includes examples that are solved with simple spurious cues”, where removing the “simple-to-classify examples” means that all examples will be hard examples, including a first hard example; see also Hendrycks, Pg. 4, Col. 2, Para. 3, “If either ResNet-50 assigns greater than 15% confidence to the correct class, the image is also removed; this is done so that adversarially filtered examples yield misclassifications with low confidence in the correct class, like in untargeted adversarial attacks”, where “greater than 15%” is the score threshold, which the first hard example will have a first confidence score lower than the score threshold); and obtaining a second snapshot of the second instance of the AI model (Jurzak, Fig. 5; Jurzak, Para. [0082], “Snapshots of the anomaly detection models used by a machine-learning-based analysis engine of an anomaly detection system over time . . . may be stored in a training repository for the analysis engine”, where a “snapshot” of the second instance is taken at “518” for use by the “analysis engine” for comparison), the second snapshot comprising a second inference for a corresponding example (Jurzak, Para. [0054], “, these scores or anomaly classification results and the corresponding inputs representing scenario S may be stored in a repository in association with the machine learning model snapshots that produced them”; Jurzak, Para. [0082], “Snapshots of the anomaly detection models used by a machine-learning-based analysis engine of an anomaly detection system over time, along with initial and final classification results produced by those models and results of anomaly detection process audits such as those described herein, may be stored in a training repository”) that corresponds to the first hard example (Jurzak, Fig. 4; Jurzak, Para. [0064], “at 406, the input is not classified as an anomaly, the method includes auditing the anomaly detection system to determine if there has been a malicious re-training of the machine-learning-based analysis engine, as in 408. An example method for auditing the anomaly detection system is illustrated in FIG. 5 and described below. On the other hand, if the input is classified as an anomaly, the method proceeds to 410 . . . In either case, at 416, the method includes storing the input and the final classification result in a training repository the analysis engine”, where the corresponding inferences are “stor[ed]” for later “auditing the anomaly detection system”, which, in view of Hendrycks, consists only of inferences corresponding to hard examples, including a first hard example, see Hendrycks, Pg. 2, Col. 1, Para. 2, “By using adversarial filtration, we can test how well models perform when simple-to-classify examples are removed, which includes examples that are solved with simple spurious cues” and Hendrycks, Pg. 4, Col. 2, Para. 
3, “If either ResNet-50 assigns greater than 15% confidence to the correct class, the image is also removed; this is done so that adversarially filtered examples yield misclassifications with low confidence in the correct class, like in untargeted adversarial attacks”); and obtaining a relationship between the first inference and the second inference (Jurzak, Fig. 5, “510 NEGATIVE CLASSIFICATION RESULT FROM CURRENT MODEL DIFFERS FROM POSITIVE RESULT OF AN EARLIER MODEL?”, where the logical determination is used to obtain a relationship corresponding to one of four possible inference configurations). The reasons of obviousness are discussed above in regard to the rejection of Claim 1 and remain applicable here.
Jurzak in view of Tran and Hendrycks do not explicitly disclose . . . comprises . . . for a hard example, the hard example having a confidence score inferior to a score threshold . . . similar to the hard example . . . (where only the obtaining of the first snapshot and the obtaining of the relationship, but not the obtaining of the second snapshot, are explicitly disclosed as comprising components of the comparison). However, Zeng teaches [a comparison] . . . comprising . . . [obtaining both a first and second snapshot] . . . (Pg. 1, Abstract, “In this paper, we present a visual analytics approach to compare two different snapshots of a trained CNN model taken after different numbers of epochs, so as to provide some insight”).
Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to combine the comparing of a first and second plurality of inferences comprising obtaining a snapshot of a first instance of a model, including an inference, and obtaining a relationship between the inference and an inference from a second instance of the model, where a snapshot of the second instance is later obtained, of Jurzak in view of Tran and Hendrycks with the comparison comprising obtaining both a first and second model snapshot of Zeng in order to generate comparisons with more relationship information than can be readily obtained by a comparison directly with a model itself (Zeng, Pg. 2, Col. 1, Para. 1, “However, existing CNN visualization techniques usually lack the ability to systematically explore and compare the differences in parameters/weights of two model snapshots. The performance of a model usually improves over time during the training process, but users normally can only obtain the training status from accuracy and loss information, so it is hard for them to know what happens to the parameters of the network and how they affect the performance of the CNN model. Thus, it would be helpful to get some insight into how model parameters evolve from a state with low accuracy to a state with high accuracy”, see Zeng, Pg. 1, Fig. 1 for more information).
In the alternative, it could be argued that Jurzak in view of Tran and Hendrycks teach all three obtaining components of the comprising, including the obtaining of a second snapshot of the second instance of the AI model (Jurzak, Fig. 5; Jurzak, Para. [0082], “Snapshots of the anomaly detection models used by a machine-learning-based analysis engine of an anomaly detection system over time . . . may be stored in a training repository for the analysis engine”, where a “snapshot” of the second instance is taken at “518” for use by the “analysis engine” for comparison, which is considered as a comprising component of the comparison because it is an essential step, in the same way that resetting from a cyclically repeating activity can reasonably be considered as part of the activity; additionally, obtaining of the second snapshot is taught to occur at any point in relation to the comparison, see Jurzak, Para. [0070], “While a particular order of operations is indicated in FIG. 5 for illustrative purposes, the timing and ordering of such operations may vary where appropriate without negating the purpose and advantages of the examples set forth in detail throughout the remainder of this disclosure” and see generally Jurzak, Para. [0085], where occurring prior to the comparison would allow for ease of comparison between two like instances of the same data object).
Regarding Claim 4, Jurzak in view of Tran, Hendrycks, and Zeng teach the method of claim 2, wherein obtaining the first snapshot of the first instance of the AI model comprises (Jurzak, Para. [0071], “in various embodiments, snapshots of the anomaly detection models used by the analysis engine over time may be taken and stored in a training repository”): storing first metadata for one of the first plurality of inferences the first instance of the AI model, the first metadata comprising a first identifier for an input used to generate the one of the first plurality of inferences, and a second identifier for the one of the first plurality of inferences (Jurzak, Para. [0054], “, these scores or anomaly classification results and the corresponding inputs representing scenario S may be stored in a repository in association with the machine learning model snapshots that produced them”, which in view of Zeng includes metadata identifiers for the input and at least one of the plurality of inferences, see Zeng, Pg. 1, Fig. 1; Zeng, Pg. 4, Col. 1, Para. 2, “we mainly adopt side-by-side comparison on the performance of two model snapshots (T1). Users can select an input image. Then, the classification result is shown side-by-side in bar charts, thereby making the observation of the distribution of probability for each class easier for users to read”, where a person of ordinary skill in the art would understand this functionality to require identifiers corresponding to the “select[ed] input image” and the displayed “classification result” inference in order to be executed by a program; see also Zeng, Pg. 4, Col. 1, Para. 2, “We use selective search [28] to crop some image patches and then rank them by activation value on the selected channel (T4). When users hover on the image patches, the corresponding positions will be highlighted in the original image”; for example, see generally Hendrycks, Sections 8-9, where “ImageNet classes” correspond with “WordNet IDs” to execute the functionality of the program; and where the inference is one of a first plurality of inferences, see also Tran, Pg. 5, Fig. 3, “Figure 3: Illustration of the pipeline. We first train a neural network on the data. Then, for each class, we extract a learned representation for each input from that class. We next take the singular value decomposition of the covariance [matrix] of these representations and use this to compute an outlier score for each example. Finally, we remove inputs with the top scores and re-train”, where an output is generated for “each input from the class”, and where this plurality of outputs are classification inferences, see generally Tran, Pg. 1-2, Para. 4-1, “Rather than causing the model’s test accuracy to degrade, the adversary’s goal is for the network to misclassify the test inputs when the data point has been altered by the adversary’s choice of perturbation”); storing a copy of a structure of the first instance of the AI model (Jurzak, Para. [0071], “snapshots of the anomaly detection models used by the analysis engine over time”, which in view of Zeng include structure, see Zeng, Pg. 2, Col. 1, Para. 2, “we compare model snapshots by adopting a top-down analytical visualization method with different levels of detail, i.e., the model, layer, channel and neuron levels”), the copy of the structure comprising a weight of a first element in a hidden layer of the first instance of the AI model and a connection connecting the first element to a second element of the hidden layer of the first instance of the AI model (Zeng, Pg. 3, Col. 1, Para. 1, “Once users select a layer to explore, users should be able to easily grasp what is different about this layer. A quick look should enable them to see the distribution and the parameters that change most . . . To gain further insight into differences, users should be able to easily locate differences, such as the position(s) where most weights change and the position(s) where most channels are activated”; see also Zeng, Pg. 3, Col. 2, Fig. 3, “in the convolutional operation layer (b), each column (blue) in the matrix represents a kernel, and each rectangle in the column represents one channel kernel map; in the output layer (c), each rectangle (green) represents one channel map”, where the “channel kernel map” includes connections between elements in the hidden layers, and Zeng, Pg. 3, Col. 1, Para. 3-4, “We simply use Euclidean distance to show the differences in parameters of each operation layer . . . As actual parameter weights are quite small and near zero, and users usually focus more on relative change rather than absolute change”, where “each operation layer” includes hidden layers, see Zeng, Pg. 3, Col. 2, Para. 3, “The operation layer corresponds to the selected convolutional layer, and the input and output layers correspond to the previous layer and next layer of the selected convolutional layer, respectively”); and storing second metadata for the first training dataset used to train the first instance of the AI model (Zeng, Pg. 4, Fig. 4, where “class[es]” corresponding with flower types of the data, such as “Pansy”, are metadata; see also Zeng, Pg. 1, Fig. 1, “A visual analytics system for comparing two different snapshots of the AlexNet model after the 10th and 100th epochs”, where the number of “epochs” is metadata on the training dataset because it is data about how the training dataset is used; see generally Zeng, Pg. 4, Col. 1, Para. 3, “After running the model for 100 epochs, we obtained 97.2% accuracy on the training set”). The reasons of obviousness are discussed above in regard to the rejection of Claim 1, for the combination with Tran, and in regard to the rejection of Claim 2, for the combination with Zeng, and remain applicable here.
Regarding Claim 14, the additional elements of the dependent claim are substantially the same as limitations of Claim 2, therefore it is rejected under the same rationale.
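As a purely illustrative aside on the snapshot-with-metadata storage mapped above, a minimal Python sketch of that kind of record might look like the following; the class names, field names, and values are assumptions made for this example, not terms taken from the claims or from Jurzak, Zeng, Tran, or Hendrycks.

```python
# Illustrative sketch only: ModelSnapshot, InferenceRecord, and their fields are
# assumptions for this example, not terms from the claims or the cited art.
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class InferenceRecord:
    input_id: str        # first identifier: the input used to generate the inference
    inference_id: str    # second identifier: the inference itself
    predicted_class: str
    confidence: float    # confidence score associated with the inference


@dataclass
class ModelSnapshot:
    instance_name: str
    # copy of structure: weights of hidden-layer elements and their connections
    hidden_weights: Dict[str, float] = field(default_factory=dict)
    connections: List[Tuple[str, str]] = field(default_factory=list)
    # metadata about the training dataset used to train this instance
    training_metadata: Dict[str, str] = field(default_factory=dict)
    inferences: List[InferenceRecord] = field(default_factory=list)


# Example: a snapshot of a "first instance" holding one stored inference.
first_snapshot = ModelSnapshot(
    instance_name="model_v1",
    hidden_weights={"h1.n3": 0.42},
    connections=[("h1.n3", "h1.n7")],
    training_metadata={"dataset": "train_set_A", "epochs": "10"},
    inferences=[
        InferenceRecord(input_id="img_0001", inference_id="inf_0001",
                        predicted_class="pansy", confidence=0.91)
    ],
)
```

The only point of the sketch is that identifiers for the input and the inference, a confidence score, hidden-layer weights and connections, and training-dataset metadata can all be kept together in a single stored snapshot object.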
Regarding Claim 16, the additional elements of the dependent claim are substantially the same as limitations of Claim 4, therefore it is rejected under the same rationale. Regarding Claim 18, the additional elements of the dependent claim are substantially the same as limitations of Claim 2, therefore it is rejected under the same rationale. Regarding Claim 20, the additional elements of the dependent claim are substantially the same as limitations of Claim 4, therefore it is rejected under the same rationale.
Claims 3, 15, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Jurzak in view of Tran, Hendrycks, Zeng, and Xu et al. (hereinafter Xu) (“Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks”).
Regarding Claim 3, Jurzak in view of Tran, Hendrycks, and Zeng teach the method of claim 2, wherein comparing the first plurality of inferences to the second plurality of inferences further comprises (Jurzak, Fig. 5, “510 NEGATIVE CLASSIFICATION RESULT FROM CURRENT MODEL DIFFERS FROM POSITIVE RESULT OF AN EARLIER MODEL?”, where, in view of Tran, the comparison is for a plurality of inferences for each model, see Tran, Pg. 5, Fig. 3, “Figure 3: Illustration of the pipeline. We first train a neural network on the data. Then, for each class, we extract a learned representation for each input from that class. We next take the singular value decomposition of the covariance [matrix] of these representations and use this to compute an outlier score for each example. Finally, we remove inputs with the top scores and re-train”; Jurzak, Fig. 5, and where, as elaborated on below, the steps corresponding to reference numbers “512” and “514” can further be considered as components of the comparing): . . . indicating that the second training dataset is poisoned (Jurzak, Fig. 5, “510 NEGATIVE CLASSIFICATION RESULT FROM CURRENT MODEL DIFFERS FROM POSITIVE RESULT OF AN EARLIER MODEL?”, where the comparison is used to determine whether the current model was “deliberately re-trained to falsely classify”, which is determining whether the second dataset comprises poisoned training data, see generally Jurzak, Para. [0059], “determining, by the anomaly detection system dependent on the respective classification results, that it is likely that the anomaly detection system has been deliberately re-trained to falsely classify the input as representing an object or event that should not be classified as an anomaly” and Jurzak, Para. [0021], “a malicious offender may be aware of machine learning or other artificial intelligence modules in a security system and, as a part of a planned attack, may influence these modules in a way that an anomaly will not be correctly identified, and no alert will be triggered in response to the attack. More specifically, a malicious offender may re-train an anomaly detection system by deliberately causing a change to a threshold value at which an alert is triggered for a particular scenario”; see also Tran, Pg. 1, Abstract, “A recent line of work has uncovered a new form of data poisoning: so-called backdoor attacks . . . we identify a new property of all known backdoor attacks, which we call spectral signatures. This property allows us to utilize tools from robust statistics to thwart the attacks. We demonstrate the efficacy of these signatures in detecting and removing poisoned examples on real image sets”). The reasons of obviousness are discussed above in regard to the rejection of Claim 1 and remain applicable here.
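For readers less familiar with the Tran pipeline quoted repeatedly in this mapping, the following is a rough numpy sketch of a spectral-signature-style outlier score: center the learned representations for a class, project them onto the top singular direction of their covariance, and rank examples by squared projection. It is a simplified reading of the cited paper, offered for illustration only; the function name, array shape, and cutoff are assumptions of this example, not code from the record.

```python
# Simplified sketch of a spectral-signature-style outlier score (after Tran et al.).
# `representations` is assumed to be an (n_examples, n_features) array of learned
# representations for a single class (this shape is an assumption of the example).
import numpy as np


def spectral_outlier_scores(representations: np.ndarray) -> np.ndarray:
    centered = representations - representations.mean(axis=0)
    # Top right-singular vector of the centered matrix, i.e. the top eigenvector
    # of the empirical covariance of the representations.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    top_direction = vt[0]
    # Outlier score: squared projection onto the top direction.
    return (centered @ top_direction) ** 2


# Example: flag the highest-scoring examples for removal before re-training.
reps = np.random.default_rng(0).normal(size=(100, 32))
scores = spectral_outlier_scores(reps)
suspect_indices = np.argsort(scores)[-10:]   # indices of the 10 largest scores
```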
Jurzak in view of Tran, Hendrycks, and Zeng do not explicitly disclose . . . obtaining a delta based on a difference between the first inference and the second inference; and comparing the delta to a threshold, the delta exceeding the threshold . . . . However, Xu teaches . . . obtaining a delta based on a difference between the first inference and the second inference (Pg. 1, Col. 2, Para. 3, “The key idea is to compare the model’s prediction on the original sample with its prediction on the sample after squeezing”, where the “compar[ing]” includes obtaining a delta based on the difference between the two inferences, e.g., the inferences corresponding to the “original” and “squeezing” samples, see Pg. 10, Col. 1, Para. 2, “comparing the model’s original prediction with the prediction on the squeezed sample involves comparing two probability distribution vectors. There are many possible ways to compare the probability distributions, such as the L1 norm, the L2 norm and K-L divergence [3]. For this work, we select the L1 norm as a natural measure of the difference between the original prediction vector and the squeezed prediction: [Equation 6]”); and comparing the delta to a threshold, the delta exceeding the threshold [indicating adversarial activity] . . . (Pg. 1, Fig. 1, “If the difference between the model’s prediction on a squeezed input and its prediction on the original input exceeds a threshold level, the input is identified to be adversarial”).
Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to combine the comparing of a first plurality of inferences to a second plurality of inferences to determine whether the training data associated with the second plurality of inferences was poisoned of Jurzak in view of Tran, Hendrycks, and Zeng with the obtaining of a delta between two model inferences and comparing the delta to a threshold to determine whether adversarial activity had occurred of Xu in order to determine adversarial activity (Xu, Pg. 1, Col. 2, Para. 3, “If the original and squeezed inputs produce substantially different outputs from the model, the input is likely to be adversarial”) with high accuracy and few false positives (Xu, Pg. 1, Col. 1, Abstract, “Previous studies to defend against adversarial examples mostly focused on refining the DNN models, but have either shown limited success or required expensive computation. We propose a new strategy, feature squeezing, that can be used to harden DNN models by detecting adversarial examples. Feature squeezing reduces the search space available to an adversary by coalescing samples that correspond to many different feature vectors in the original space into a single sample. By comparing a DNN model’s prediction on the original input with that on squeezed inputs, feature squeezing detects adversarial examples with high accuracy and few false positives”).
Regarding Claim 15, the additional elements of the dependent claim are substantially the same as limitations of Claim 3, therefore it is rejected under the same rationale. Regarding Claim 19, the additional elements of the dependent claim are substantially the same as limitations of Claim 3, therefore it is rejected under the same rationale.
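As a concrete but purely illustrative rendering of the Xu-style comparison relied on above, the sketch below takes the L1 difference between two prediction vectors and flags the input when that delta exceeds a threshold; the function name and the 0.5 threshold value are assumptions made for this example, not values from the record.

```python
# Illustrative only: L1 difference between two prediction (probability) vectors,
# compared against a threshold, in the spirit of Xu's feature-squeezing detector.
# The default threshold below is a placeholder assumption, not one from the record.
import numpy as np


def exceeds_threshold(first_inference: np.ndarray,
                      second_inference: np.ndarray,
                      threshold: float = 0.5) -> bool:
    delta = np.abs(first_inference - second_inference).sum()  # L1 norm of the difference
    return delta > threshold


# Example: two softmax-style outputs for the same input from two model instances.
p_first = np.array([0.10, 0.85, 0.05])
p_second = np.array([0.70, 0.20, 0.10])
flagged = exceeds_threshold(p_first, p_second)   # True here; the delta is 1.3
```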
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Jurzak in view of Tran, Hendrycks, Zeng, and Lao (“Reorienting Machine Learning Education Towards Tinkerers and ML-Engaged Citizens”).
Regarding Claim 5, Jurzak in view of Tran, Hendrycks, and Zeng teach the method of claim 4, wherein obtaining the first snapshot of the first instance of the AI model further comprises (Jurzak, Para. [0071], “in various embodiments, snapshots of the anomaly detection models used by the analysis engine over time may be taken and stored in a training repository”): storing third metadata for the one of the first plurality of inferences, the third metadata comprising a second confidence score for the one of the first plurality of inferences (Jurzak, Para. [0072], “storing, for each of the anomaly detection models, a respective classification result determined by the anomaly detection model for a given input indicating whether the given input is considered to represent an anomaly, where the classification result is associated with a respective classification score, as described herein”, where the “score[s]” are confidence scores, including a second confidence score, see Jurzak, Para. [0017], “the method may further include assigning, by the analysis engine, a point value to the input indicative of the likelihood that the input represents an object or event that should be classified as an anomaly, the point value being dependent on which anomaly detection model is used by the analysis engine. Determining that the input represents an object or event that should not be classified as an anomaly may include determining that an assigned point value is less than a threshold point value for classifying the input as representing an object or event that should be classified as an anomaly” in view of Hendrycks, Pg. 4, Col. 2, Para. 3, “If either ResNet-50 assigns greater than 15% confidence to the correct class, the image is also removed”, where the “score” is a “confidence” score, and Tran, Pg. 5, Fig. 3, “Figure 3: Illustration of the pipeline. We first train a neural network on the data. Then, for each class, we extract a learned representation for each input from that class. We next take the singular value decomposition of the covariance [matrix] of these representations and use this to compute an outlier score for each example. Finally, we remove inputs with the top scores and re-train”, where an output is generated for “each input from the class”, and where this plurality of outputs each have a “score”; and where the inference is one of a first plurality of inferences, see also Tran, Pg. 5, Fig. 3, “Figure 3: Illustration of the pipeline. We first train a neural network on the data. Then, for each class, we extract a learned representation for each input from that class. We next take the singular value decomposition of the covariance [matrix] of these representations and use this to compute an outlier score for each example. Finally, we remove inputs with the top scores and re-train”, where an output is generated for “each input from the class”, and where this plurality of outputs are classification inferences, see generally Tran, Pg. 1-2, Para. 4-1, “Rather than causing the model’s test accuracy to degrade, the adversary’s goal is for the network to misclassify the test inputs when the data point has been altered by the adversary’s choice of perturbation”); making a determination that the second confidence score is inferior to the score threshold (Hendrycks, Pg. 4, Col. 2, Para. 3, “If either ResNet-50 assigns greater than 15% confidence to the correct class, the image is also removed”); and based on the determination, storing fourth metadata . . . the one of the first plurality of inferences is the first hard example (Hendrycks, Pg. 2, Col. 1, Para. 2, “By using adversarial filtration, we can test how well models perform when simple-to-classify examples are removed, which includes examples that are solved with simple spurious cues”, where removing the “simple-to-classify examples” means that all examples will be hard examples, including a first hard example, and, as a result, the additional stored metadata on the input data, which includes fourth metadata, is based on this determination, see Zeng, Pg. 1, Fig. 1, “A visual analytics system for comparing two different snapshots of the AlexNet model after the 10th and 100th epochs. The network architecture view (a) shows the architecture of the Alexnet. The difference distribution view (b) shows the distribution of the parameter differences in a selected layer. The convolutional operation view (c) presents a selected convolutional operation as a 2D matrix to facilitate comparison. The performance comparison view (d) provides a side-by-side comparison of the model performance and image patches of top activation values”; and where the inference is one of a first plurality of inferences, see also Tran, Pg. 5, Fig. 3, “Figure 3: Illustration of the pipeline. We first train a neural network on the data. Then, for each class, we extract a learned representation for each input from that class. We next take the singular value decomposition of the covariance [matrix] of these representations and use this to compute an outlier score for each example. Finally, we remove inputs with the top scores and re-train”, where an output is generated for “each input from the class”, and where this plurality of outputs are classification inferences, see generally Tran, Pg. 1-2, Para. 4-1, “Rather than causing the model’s test accuracy to degrade, the adversary’s goal is for the network to misclassify the test inputs when the data point has been altered by the adversary’s choice of perturbation”). The reasons of obviousness are discussed above in regard to the rejection of Claim 1, for the combination with Tran and Hendrycks, and in regard to the rejection of Claim 2, for the combination with Zeng, and remain applicable here.
Jurzak in view of Tran, Hendrycks, and Zeng do not explicitly disclose . . . indicating . . . (where the fourth metadata does not explicitly indicate the inference is a hard example). However, Lao teaches . . . [storing metadata] indicating [the difficulty of the inference, including whether it is a hard example] . . . (Pg. 165-166, Para. 1-4, “Step 4: Results and Analysis Step 4 helps users gain insight into model behavior and inform retraining . . . This instance stores all of the data needed to display results to the user . . . The other analysis tool available is the Confidence Graph (see Figure 6-9). For a given label, this graph buckets all testing images based on prediction confidence. The confidence buckets include "medium," "high," and "very high" confidences . . . Images with lower than 0.4 confidence predictions are not displayed”, where the metadata “labels” for “low” must be stored to filter out associated outputs from the display; see also Pg. 167, Fig. 6-9).
Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to combine the storing of metadata containing a second confidence score for the first instance of the AI model, determining that the confidence score is below a threshold, and, based on this determination that the inference is a hard example, storing metadata for the inference of Jurzak in view of Tran, Hendrycks, and Zeng with the storing of metadata indicating the difficulty of the inference, including whether it is a hard example, of Lao in order to use metadata to efficiently organize and operationalize model inferences based on difficulty category (Lao, Pg. 166, Para. 4-5, “Images with lower than 0.4 confidence predictions are not displayed. The Confidence Graph allows users to infer the characteristics of images that a model learns for specific labels, and to plan a retraining strategy”, where both users and program functions execute category-specific operations; compare Hendrycks, Pg. 4, Col. 2, Para. 2-3, “IMAGENET-A Data Aggregation . . . If either ResNet-50 assigns greater than 15% confidence to the correct class, the image is also removed” with Hendrycks, Pg. 5, Col. 1-2, Para. 3-1, “IMAGENET-O Data Aggregation . . . but we curate OOD examples that have high confidence predictions”, where it is useful to have data for both low and high confidence predictions and it is therefore useful to store metadata for each category instead of merely filtering out all inferences that are not hard examples).
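To make the confidence-threshold reading of Hendrycks and Lao above concrete, here is a small, purely illustrative sketch that tags an inference as a hard example when its confidence score falls below a score threshold and records a Lao-style difficulty bucket; the field names, the bucket boundaries, and the 0.15 default threshold (echoing the 15% figure discussed above) are assumptions of this example, not anything from the claims or the cited references.

```python
# Illustrative sketch: store metadata marking an inference as a "hard example"
# when its confidence score is below a threshold. The 0.15 default echoes the
# 15% figure discussed above; all field names are assumptions for this example.
from typing import Dict


def difficulty_bucket(confidence: float) -> str:
    # Bucket boundaries are placeholders, loosely modeled on Lao's confidence buckets.
    if confidence < 0.4:
        return "low"
    if confidence < 0.7:
        return "medium"
    if confidence < 0.9:
        return "high"
    return "very high"


def tag_inference(inference_id: str, confidence: float,
                  score_threshold: float = 0.15) -> Dict[str, object]:
    metadata = {
        "inference_id": inference_id,
        "confidence": confidence,                 # third-metadata-style confidence score
        "bucket": difficulty_bucket(confidence),
    }
    if confidence < score_threshold:
        metadata["hard_example"] = True           # fourth-metadata-style difficulty flag
    return metadata


# Example: a low-confidence inference gets tagged as a hard example.
record = tag_inference("inf_0042", confidence=0.12)
```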
Response to Arguments
Applicant's arguments filed on December 29, 2025 have been fully considered. Each argument is addressed in detail below.
I. Applicant argues the rejections to the claims, under 35 USC § 112(b), should be withdrawn (Applicant’s Remarks, 12/29/2025, Pg. 10, Section “Rejection Under 35 U.S.C. § 112(b)”). Applicant’s amendments have overcome each and every rejection to the claims, under 35 USC § 112(b), previously set forth in the September 24th, 2025 Office Action. As a result, these rejections have been withdrawn.
II. Applicant argues the rejections to the claims, under 35 USC § 101, should be withdrawn (Applicant’s Remarks, 12/29/2025, Pg. 11-14, Section “Rejection Under 35 U.S.C. § 101”). With reference to Applicant’s arguments in favor of subject matter eligibility, Applicant’s amendments have overcome each and every rejection to the claims, under 35 USC § 101, previously set forth in the September 24th, 2025 Office Action. As a result, these rejections have been withdrawn.
III. Applicant argues the rejections to the claims, under 35 USC § 103, should be withdrawn (Applicant’s Remarks, 12/29/2025, Pg. 14-15, Section “Rejection Under 35 U.S.C. § 103”). In response to Applicant’s amendments, the previously communicated rejections under 35 U.S.C. § 103 have been withdrawn. However, Applicant’s arguments are not persuasive in light of the new grounds for rejection, under 35 U.S.C. § 103, discussed in detail above. The new grounds of rejection rely on new combinations of the existing prior art of record to teach the new combination of elements in the amended independent claims, which were not presented in this arrangement in any of the previously presented claims. As a result, Applicant’s arguments against the previously communicated rejections under 35 U.S.C. § 103 are rendered moot. However, for clarity of the record and to expedite prosecution, arguments that remain relevant to the new grounds of rejection are discussed below.
Specifically, Applicant argues that, while Hendrycks teaches the alleged claimed hard examples, these examples are not generated by the neural networks of Hendrycks and are natural adversarial examples of images. As a result, Applicant asserts these examples “are completely different from the claimed hard and easy examples that are inferences” (Pg. 15, Para. 3).
According to MPEP 2111, “During patent examination, the pending claims must be given their broadest reasonable interpretation consistent with the specification” (internal quotation marks omitted) (see also Phillips v. AWH Corp., 415 F.3d 1303, 1316, 75 USPQ2d 1321, 1329 (Fed. Cir. 2005)). Additionally, according to MPEP 2145, “Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims” (see also In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993)). Here, the claims do not positively recite any limitations that would require the models to be neural networks or any limitations that would prohibit the examples from being natural adversarial examples of images. Additionally, limitations from the specification that would require the models to be neural networks or that may prohibit the examples from being natural adversarial examples of images are not read into the claims (but see Hendrycks, Pg. 4, Col. 2, Para. 3, “If either ResNet-50 assigns greater than 15% confidence to the correct class, the image is also removed”, where “ResNet-50” is a convolutional neural network). Instead, as discussed in detail above, the adversarial examples are within the broadest reasonable interpretation of hard and easy examples that are inferences because they are outputs generated by an AI model that vary in difficulty from easy to hard, as indicated by their associated confidence scores (see Hendrycks, Pg. 4, Col. 2, Para. 3, “If either ResNet-50 assigns greater than 15% confidence to the correct class, the image is also removed; this is done so that adversarially filtered examples yield misclassifications with low confidence in the correct class, like in untargeted adversarial attacks”, where “ResNet-50” is an AI model, which generates “classification[s]” that are within the broadest reasonable interpretation of inferences, and where “greater than 15% confidence to the correct class” is easy and the remaining inferences are hard, “low confidence”). As a result, the argument is not persuasive.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MATTHEW BRYCE GOLAN whose telephone number is (571) 272-5159. The examiner can normally be reached Monday through Friday, 8:00 AM to 5:00 PM ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov, can be reached at (571) 270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MATTHEW BRYCE GOLAN/
Examiner, Art Unit 2123
/ALEXEY SHMATOV/
Supervisory Patent Examiner, Art Unit 2123

Prosecution Timeline

Dec 29, 2022
Application Filed
Sep 17, 2025
Non-Final Rejection — §103
Dec 29, 2025
Response Filed
Mar 13, 2026
Final Rejection — §103 (current)


Prosecution Projections

3-4
Expected OA Rounds
0%
Grant Probability
0%
With Interview (+0.0%)
3y 3m
Median Time to Grant
Moderate
PTA Risk
Based on 3 resolved cases by this examiner. Grant probability derived from career allow rate.
