Prosecution Insights
Last updated: April 19, 2026
Application No. 18/412,261

METHODS AND APPARATUS TO SELF-GENERATE A MULTIPLE-OUTPUT ENSEMBLE MODEL DEFENSE AGAINST ADVERSARIAL ATTACKS

Non-Final Office Action (§101, §103)

Filed: Jan 12, 2024
Examiner: MANG, VAN C
Art Unit: 2126
Tech Center: 2100 — Computer Architecture & Software
Assignee: Intel Corporation
OA Round: 3 (Non-Final)

Grant Probability: 75% (Favorable)
OA Rounds: 3-4
To Grant: 3y 10m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 75% (above average; 181 granted / 241 resolved; +20.1% vs TC avg)
Interview Lift: +26.9% (strong; with vs. without interview, among resolved cases with interview)
Avg Prosecution: 3y 10m (typical timeline; 31 currently pending)
Total Applications: 272 (career history, across all art units)

Statute-Specific Performance

§101: 31.2% (-8.8% vs TC avg)
§103: 42.5% (+2.5% vs TC avg)
§102: 8.0% (-32.0% vs TC avg)
§112: 13.5% (-26.5% vs TC avg)

Tech Center average figures are estimates. Based on career data from 241 resolved cases.
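The dashboard figures above can be recomputed from the raw counts. A minimal sketch, assuming the "vs TC avg" figures are simple percentage-point deltas and that interview lift is the point difference in allowance rate between interviewed and non-interviewed resolved cases (function and variable names are illustrative, not the analytics vendor's actual formulas):

```python
# Raw counts from the summary above.
granted, resolved = 181, 241
career_allow_rate = granted / resolved          # fraction of resolved cases allowed

# "+20.1% vs TC avg" read as a point delta -> implied Tech Center average.
tc_avg_allow_rate = career_allow_rate - 0.201

def interview_lift(rate_with: float, rate_without: float) -> float:
    """Point difference in allowance rate, interviewed vs. non-interviewed cases."""
    return rate_with - rate_without

print(f"career allow rate: {career_allow_rate:.1%}")  # ~75.1%, shown as 75%
```

The displayed 75% is simply 181/241 rounded; the interview-lift figure (+26.9%) would come from splitting the 241 resolved cases by interview status, data not broken out in the summary.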

Office Action (§101, §103)
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

Applicant’s submission filed 11/25/2025 has been entered. Applicant’s amendments to the claims have overcome the previously issued claim objections. The status of the claims is as follows: Claims 1, 3-8, 10-15, 17-19 and 21 remain pending in the application. Claims 1, 8 and 15 are amended. Claims 2, 9, 16 and 20 are cancelled.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 11/25/2025 has been entered.

Response to Arguments

Applicant's arguments filed in response to rejections under 35 USC 101 have been fully considered but they are not persuasive. Applicant asserts that “Applicant further notes that in Ex parte Desjardins et al., Director Squires noted that "claims directed to an improvement in the functioning of a computer, or an improvement to other technology or technical field are patent eligible." He further noted that "the Federal Circuit held that the eligibility determination should turn on whether "the claims are directed to an improvement to computer functionality versus being directed to an abstract idea." Ex parte Desjardins also noted that the Office tends to "evaluate claims at ... a high level of generality" which this Office Action does by saying that everything is simply a mental process or mathematical concept without doing the evaluation required by the MPEP (such as the evaluation described in MPEP 2106.04(d)(1)). 
For at least this rationale, Applicant respectfully submits that the claims are directed toward statutory subject matter.” (Remarks pg. )

Examiner’s response: Regarding applicant’s reliance on the decision of the Appeals Review Panel in Ex parte Desjardins, No. 2024-000567 (P.T.A.B. Sept. 26, 2025), Examiner notes that, in Desjardins, unlike in the claims at issue here, the appellants specifically argued that the claimed invention “address[es] challenges in continual learning and model efficiency by reducing storage requirements and preserving task performance across sequential training”. Desjardins, op. at 7. That is, the appellant in Desjardins specifically alleged that the claimed subject matter improves machine learning itself. By contrast, Applicant in the instant case does not point to any specific claim language that characterizes an improvement, and does not point to any claim language that is analogous to the claims at issue in Desjardins.

Applicant argues on Remarks Pages 6-7 that “’the ensemble model being a result of transforming a single trained model into the ensemble model by insertion of a first output layer at a first exit location of the single trained model and a second output layer at a second exit location of the single trained model’ … describes a non-conventional transformation of a trained machine learning model into an internally instrumented ensemble configuration. The inserted output layers allow the model to produce inference results at multiple stages of execution. These operations cannot be performed in the human mind and do not resemble abstract mental comparisons. They require configuration and execution of a neural network model in a specialized and structured manner, involving internal confidence scoring and exit-layer decision points.”

Examiner respectfully disagrees, pointing out that Examiner never considered this limitation a mental process (rather, “identify an adversarial attack …” was identified as the mental process). 
However, this must be evaluated as an additional element to the mental process to determine if it integrates the mental process into a practical application or amounts to significantly more than the mental process.

Applicant continues on Remarks Page 7 to say “Applicants respectfully disagree with Examiner’s assertion that the ensemble model is generically recited and amounts to a generic computer implementation” as it “expressly recites … being the result of transforming a single trained model” into one with a “first exit location” and a “second exit location” as explained above. Applicant argues that this is “not found in generic ensemble approaches” and “enables a novel form of detection using inter-layer confidence deviation.”

Examiner respectfully disagrees, and first points out that the “transforming” is not positively recited – the claim merely states that the utilized ensemble model is the result of the insertion of first and second exit locations. No details are provided on how this is achieved, and as claimed, the model is pre-configured in this way before the steps of Applicant’s claimed invention. Therefore, Examiner maintains that Applicant is simply utilizing a machine learning model to practice an abstract idea of “identify an adversarial attack” based on a confidence score deviation. With no details on how the machine learning model is constructed, this amounts to mere instructions to implement the abstract idea on a generic computer. Examiner further notes that while claims 4, 11, and 18 do appear to positively recite the generation of the early exit model, again there is no detail on how this is performed, which leads to the next point. Examiner further disagrees that this is “not found in generic ensemble approaches”, as early-exit architectures are known in the art. Applicant itself admits this in Specification [0013]: “Recent years have witnessed advances in machine learning methods regarding early exits from a machine learning model. 
Early exit approaches determine whether a prediction confidence at an exist location exceeds a threshold confidence and, if so, immediately cease further calculations and returns the generated prediction.” Applicant also has included the pioneering paper for early exit models by Teerapittayanon in the IDS filed 2024-01-12. Applicant has also cited another application, Laskaridis et al. (US 2021/0012194 A1), which states in [0008]: “The idea of early exits has been explored by several researchers.” Therefore, this amounts to merely utilizing a generic known type of machine learning model to perform an abstract idea. Examiner notes that executing such a model, as stated in the independent claims, amounts to mere instructions to implement the abstract idea on a computer. Generating the model as stated in Claims 4, 11, and 18, amounts to insignificant extra solution activity, as any machine learning model must be generated or constructed in some way in order to be subsequently executed, and thus this limitation is only nominally or tangentially related to the claimed invention, which is directed to executing said model to perform an abstract idea of identifying an adversarial attack.

Applicant argues on Remarks Page 7 that the inclusion of the limitation “terminate execution of the ensemble model based on the identification of the adversarial attack” results in “a control system that actively monitors internal model behavior and triggers runtime intervention”. Examiner respectfully disagrees. Terminating an identification process when an identification is reached is an evaluation that can be performed by a human in the mind or with pen and paper, and is thus a mental process. The generic recitation of a machine learning model with early exits again merely amounts to practicing this mental process on a generic computer. 
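The claimed mechanism the parties are debating — two output heads inserted at exit locations of a single trained model, their confidence scores compared, and execution terminated on identification of an attack — can be sketched in outline. This is a hypothetical illustration only (the layer and head callables, the deviation rule, and the threshold are all assumptions of this sketch), not Applicant's claimed implementation or any cited reference's code:

```python
# Minimal sketch: run a "single trained model" (a list of layer callables)
# and consult extra output heads attached at two exit locations; terminate
# execution early when the two confidence scores deviate beyond a threshold.

def run_two_exit_ensemble(layers, heads, x, threshold=0.3):
    """layers: list of callables; heads: {layer_index: callable -> confidence}."""
    scores = {}
    for i, layer in enumerate(layers):
        x = layer(x)
        if i in heads:
            scores[i] = heads[i](x)          # confidence score at this exit
        if len(scores) == 2:
            first, second = scores.values()
            if abs(first - second) >= threshold:
                # identification of an adversarial attack -> terminate execution
                return {"attack": True, "scores": scores}
    return {"attack": False, "scores": scores, "output": x}

# Toy usage: identity "layers" with fixed head confidences.
layers = [lambda v: v, lambda v: v, lambda v: v]
heads = {0: lambda v: 0.95, 2: lambda v: 0.40}   # large deviation -> flagged
result = run_two_exit_ensemble(layers, heads, x=[1.0, 2.0])
print(result["attack"])  # True
```

Note that, consistent with the Examiner's observation, nothing in this sketch constrains how the backbone or heads are constructed; the detection logic itself reduces to comparing two numbers against a threshold.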
Applicant argues on Remarks Page 7 that “this pattern of detection and conditional system behavior mirrors the structure of USPTO Example 40 … in which monitored network conditions are compared to a threshold, and abnormal conditions trigger an adaptive system response, such as collecting more detailed protocol data.” Examiner respectfully disagrees. The “adaptive system response” of “collecting more detailed protocol data” is a change to the functioning of a computer component, and is not merely terminating a process when a termination criterion is reached, which is a mental process.

Applicant argues on Remarks Page 8 that “the specific model transformation, confidence-based inter-layer analysis, runtime control, and system messaging collectively amount to significantly more than any alleged abstract idea. The invention improves the functioning of a computer-implemented inference system in a non-conventional and non-trivial manner.” Examiner respectfully disagrees. Identifying a condition such as an adversarial attack based on two confidence scores is a mental process, and terminating an identification process when an identification is reached is a mental process, and transmitting a message amounts to insignificant extra solution activity, mere data outputting. The claimed limitations do not improve the functioning of a computer, but rather they improve the process of identification of an adversarial attack, and thus are merely an improvement to an abstract idea itself.

Applicant's arguments filed in response to rejections under 35 USC 103 have been fully considered, but are rendered moot regarding independent claims 1 and 8, and are unpersuasive regarding claim 15. Regarding Claim 15, Applicant argues that “the alleged Bagnall/Abbaszadeh combination does not suggest such instructions.” However, Claim 15 was previously rejected over Bagnall, Abbaszadeh, and Kaya. Examiner notes that the newly amended matter in Claims 1, 8, and 15 is taught by Kaya. 
Thus, the same grounds of rejection are used for Claim 15 in this action, and a new ground of rejection including Kaya is issued for Claims 1 and 8, as necessitated by amendment.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1, 3-8 and 10-14 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1: Claims 1 and 3-7 are directed to a non-transitory computer-readable storage medium and Claims 8 and 10-14 are directed to an apparatus. Therefore, each of the claims is directed to one of the four statutory categories of patent eligible subject matter.

Step 2A Prong 1: Claims 1 and 8 recite:

“identify an adversarial attack based on (1) a first confidence score output by the first output layer, and (2) a second confidence score output by the second output layer”; identifying based on scores is a mental process

“terminate execution [of the ensemble model] based on the identification of the adversarial attack”; terminating an identification process (which is a mental process as shown above) when an identification is made is a judgment that can be made in the human mind, and is thus a mental process

Step 2A Prong 2: This judicial exception is not integrated into a practical application because the additional elements are as follows:

“execute an ensemble model, the ensemble model being a result of transforming a single trained model into the ensemble model by insertion of a first output layer at a first exit location of the single trained model and a second output layer at a second exit location of the single trained model”; this is a generically recited early exit model as is known in the art, and machine learning models recited at a high level of generality 
amount to mere instructions to apply the abstract idea using a generic computer as per MPEP 2106.05(f)

“cause transmission of a message to indicate the identification of the adversarial attack”; this amounts to insignificant extra solution activity (see MPEP 2106.05(g): “necessary data gathering and outputting”)

Step 2B: The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements are as follows:

“execute an ensemble model, the ensemble model being the result of transforming a single trained model into the ensemble model by insertion of a first output layer at a first exit location of the single trained model and a second output layer at a second exit location of the single trained model”; this is a generically recited early exit model as is known in the art, and machine learning models recited at a high level of generality amount to mere instructions to apply the abstract idea using a generic computer as per MPEP 2106.05(f)

“cause transmission of a message to indicate the identification of the adversarial attack”; the courts have identified this as well-understood, routine, and conventional activity (see MPEP 2106.05(d): “Receiving or transmitting data over a network”)

Dependent Claims: Claims 3-7 and 10-14 are also rejected under 35 USC 101 for the following reasons:

Claims 3 and 10 recite: “aggregate the first output of the first output layer and the second output of the second output layer using a weighted average”; aggregating with a weighted average is a mental process

Claims 4 and 11 recite: “generate the ensemble model based on the single trained model”; this amounts to insignificant extra solution activity, “(2) Whether the limitation is significant (i.e. it imposes meaningful limits on the claim such that it is not nominally or tangentially related to the invention)”, as per MPEP 2106.05(g) under Step 2A Prong 2. 
Examiner notes that all machine learning models must be generated or constructed in some way in order to be subsequently executed. As per Step 2B, Examiner notes that this is also well-understood, routine, and conventional activity as per Applicant’s own Specification [0013]: “Recent years have witnessed advances in machine learning methods regarding early exits from a machine learning model. Early exit approaches determine whether a prediction confidence at an exist location exceeds a threshold confidence and, if so, immediately cease further calculations and returns the generated prediction.” Applicant also has included the pioneering paper for early exit models by Teerapittayanon in the IDS filed 2024-01-12. Applicant has also cited another application, Laskaridis et al. (US 2021/0012194 A1), which states in [0008]: “The idea of early exits has been explored by several researchers.”

Claims 5 and 12 recite: “analyze different classifications output by the first output layer and the second output layer”; analyzing is a mental process

Claims 6 and 13 recite: “cause transmission of the message via network”; this amounts to insignificant extra solution activity (see MPEP 2106.05(g): “necessary data gathering and outputting”) under Step 2A Prong 2, and the courts have identified this as well-understood, routine, and conventional activity (see MPEP 2106.05(d): “Receiving or transmitting data over a network”) under Step 2B

Claims 7 and 14 recite: “after a failure to detect an adversarial attack, cause transmission of a result of the execution of the ensemble model”; this amounts to insignificant extra solution activity (see MPEP 2106.05(g): “necessary data gathering and outputting”) under Step 2A Prong 2, and the courts have identified this as well-understood, routine, and conventional activity (see MPEP 2106.05(d): “Receiving or transmitting data over a network”) under Step 2B

Claims 15, 17-19 and 21 are rejected under 35 U.S.C. 
101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1: Claims 15, 17-19 and 21 are directed to a non-transitory computer-readable storage medium. Therefore, each of the claims is directed to one of the four statutory categories of patent eligible subject matter.

Step 2A Prong 1: Claim 15 recites:

“analyze outputs of the execution of the ensemble model to identify an adversarial attack by comparing (1) a first deviation between a first confidence score output by the first output layer and an expected confidence score to a threshold deviation, and (2) a second deviation between a second confidence score output by the second output layer and the expected confidence score to the threshold deviation”; analyzing and identifying based on scores is a mental process

“identification of the adversarial attack based on at least one of the first deviation or the second deviation meeting or exceeding the threshold deviation”; identifying based on scores is a mental process

“after identification … terminate execution of the ensemble model”; terminating an identification process (which is a mental process as shown above) when an identification is made is a judgment that can be made in the human mind, and is thus a mental process

Step 2A Prong 2: This judicial exception is not integrated into a practical application because the additional elements are as follows:

“execute an ensemble model, the ensemble model being a result of transforming a single trained model into the ensemble model by insertion of a first output layer at a first exit location of the single trained model and a second output layer at a second exit location of the single trained model”; this is a generically recited early exit model, and machine learning models recited at a high level of generality amount to mere instructions to apply the abstract idea using a generic computer as per MPEP 2106.05(f)

“cause transmission of a message, the message to indicate that the output of 
the execution of the ensemble model is indicative of an adversarial attack”; this amounts to insignificant extra solution activity (see MPEP 2106.05(g): “necessary data gathering and outputting”)

Step 2B: The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements are as follows:

“execute an ensemble model, the ensemble model being a result of transforming a single trained model into the ensemble model by insertion of a first output layer at a first exit location of the single trained model and a second output layer at a second exit location of the single trained model”; this is a generically recited early exit model, and machine learning models recited at a high level of generality amount to mere instructions to apply the abstract idea using a generic computer as per MPEP 2106.05(f)

“cause transmission of a message, the message to indicate that the output of the execution of the ensemble model is indicative of an adversarial attack”; the courts have identified this as well-understood, routine, and conventional activity (see MPEP 2106.05(d): “Receiving or transmitting data over a network”)

Dependent Claims: Claims 17-19 and 21 are also rejected under 35 USC 101 for the following reasons:

Claim 17 recites: “aggregate a first output of the first output layer and a second output of the second output layer using a weighted average”; aggregating with a weighted average is a mental process

Claim 18 recites: “generate the ensemble model based on the single trained model”; this amounts to insignificant extra solution activity, “(2) Whether the limitation is significant (i.e. it imposes meaningful limits on the claim such that it is not nominally or tangentially related to the invention)”, as per MPEP 2106.05(g) under Step 2A Prong 2. Examiner notes that all machine learning models must be generated or constructed in some way in order to be subsequently executed. 
As per Step 2B, Examiner notes that this is also well-understood, routine, and conventional activity as per Applicant’s own Specification [0013]: “Recent years have witnessed advances in machine learning methods regarding early exits from a machine learning model. Early exit approaches determine whether a prediction confidence at an exist location exceeds a threshold confidence and, if so, immediately cease further calculations and returns the generated prediction.” Applicant also has included the pioneering paper for early exit models by Teerapittayanon in the IDS filed 2024-01-12. Applicant has also cited another application, Laskaridis et al. (US 2021/0012194 A1), which states in [0008]: “The idea of early exits has been explored by several researchers.”

Claim 19 recites: “analyze the outputs of the execution of the ensemble model further based on a count of different classifications output by the first output layer and the second output layer”; analyzing is a mental process

Claim 21 recites: “cause transmission of the message via network”; this amounts to insignificant extra solution activity (see MPEP 2106.05(g): “necessary data gathering and outputting”) under Step 2A Prong 2, and the courts have identified this as well-understood, routine, and conventional activity (see MPEP 2106.05(d): “Receiving or transmitting data over a network”) under Step 2B

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4-8, 11-15, 18-19 and 21 are rejected under 35 U.S.C. 
103 as being unpatentable over Bagnall et al. (“Training Ensembles to Detect Adversarial Examples”; hereinafter “Bagnall”) in view of Kaya et al. (“Shallow-Deep Networks: Understanding and Mitigating Network Overthinking”; hereinafter “Kaya”), further in view of Abbaszadeh et al. (US 2019/0058715 A1; hereinafter “Abbaszadeh”). As per Claim 1, Bagnall teaches execute an ensemble model, [the ensemble model being the result of transforming a single trained model into the ensemble model by insertion of] a first output layer at a first exit location [of the single trained model] and a second output layer at a second exit location [of the single trained model] (Bagnall, Page 1 Section 2, discloses: “We propose to train an ensemble of N models that label clean examples accurately while also disagreeing on randomly perturbed examples.” Here, Bagnall discloses an “ensemble of N models”, wherein each model produces a “label”, and one of ordinary skill in the art will understand that this means each model has an output layer at an exit location where the “label” is produced for that model of the ensemble.) identify an adversarial attack based on (1) a first confidence score output by the first output layer, and (2) a second confidence score output by the second output layer (Bagnall, Page 2 Para 4, discloses: “Detection. At test time, the outputs of all ensemble members are combined using a rank voting mechanism. Each member assigns the rank 0 to the label it considers the most likely, rank 1 to the second most likely, and so on. For each label, the ranks are summed across all members, and the smallest label rank is used as the ensemble disagreement.” Here, Bagnall discloses confidence scores for each output (“rank voting mechanism. Each member assigns the rank 0 to the label it considers the most likely, rank 1 to the second most likely, and so on”), wherein each “rank” is a confidence score in each label for that model’s output. 
These ranks are used to identify an adversarial attack, as Bagnall continues: “For each label, the ranks are summed across all members, and the smallest label rank is used as the ensemble disagreement. For an ensemble trained using our method, a large ensemble disagreement is indicative of an input that lies outside of the data distribution. Correspondingly, we implement a simple rank-based criterion that rejects a test example as adversarial or outlier if and only if the ensemble disagreement is above a rank threshold hyperparameter τ.”) However, Bagnall does not teach the ensemble model being the result of transforming a single trained model into the ensemble model by insertion of a first output layer at a first exit location of the single trained model and a second output layer at a second exit location of the single trained model; terminate execution of the ensemble model based on the identification of the adversarial attack; at least one non-transitory computer-readable storage medium comprising instructions to cause at least one processor circuit to at least; cause transmission of a message to indicate the identification of the adversarial attack Kaya teaches the ensemble model being the result of transforming a single trained model into the ensemble model by insertion of a first output layer at a first exit location of the single trained model and a second output layer at a second exit location of the single trained model (Kaya, Page 4 Left Column: “Placement of the ICs. We pick a subset of internal layers to attach the internal classifiers after.”) terminate execution of the ensemble model based on the identification of the adversarial attack (Kaya Page 2 Bottom Left: “Our first heuristic uses the confidence of an internal prediction to assess its correctness. 
With this heuristic, we can reliably detect when the network should stop thinking and make an early prediction—an early exit.”) Kaya is analogous art because it is in the field of endeavor of early exit ensembles and adversarial examples. It would have been obvious before the effective filing date of the claimed invention to combine the ensemble for adversarial attack of Bagnall with the confidence score of Kaya. Bagnall and Kaya both discuss using ensemble networks to defend against adversarial attacks, as Bagnall states, “a large ensemble disagreement is indicative of an input that lies outside of the data distribution”, and Kaya echoes this by saying, “disagreements among them hint that the prediction is inconsistent and confused.” While Bagnall uses a ranking system to measure disagreement among outputs by identifying how high the score of the lowest ranking label is, Kaya discusses using either a “confidence” or a “confusion metric” to indicate “likely misclassifications”. The combination of these two references would suggest to one of ordinary skill in the art to use an ensemble model to identify adversarial examples based on deviations from a confidence threshold at each output, or even differences in confidence between each output. One of ordinary skill in the art would be motivated to do so in order to better guard against adversarial attacks (Kaya, Page 7 Top Right: “When the threshold is at q = 0.8, the backdoored network makes correct predictions on 84% of the backdoor inputs—up from 12% without early exits. Further, the network classifies only 17% of the backdoor samples to the attacker’s target class—down from 98%. 
Overall, our results suggest that early exits can mitigate this attack.”) However, the combination of Bagnall and Kaya does not teach at least one non-transitory computer-readable storage medium comprising instructions to cause at least one processor circuit to at least; cause transmission of a message to indicate the identification of the adversarial attack Abbaszadeh teaches at least one non-transitory computer-readable storage medium comprising instructions to cause at least one processor circuit to at least (Abbaszadeh, Para [0026], discloses: “For example, a computer-readable storage medium may store thereon instructions that when executed by a machine result in performance according to any of the embodiments described herein.”) cause transmission of a message to indicate the identification of the adversarial attack (Abbaszadeh, Para [0028], discloses: “Abnormalities may be detected by classifying the monitored data as being “normal”, “attacked”, or “fault”. This decision boundary may be constructed in feature space using dynamic models.” Abbaszadeh, Para [0033], discloses: “The system may be configurable and may distinguish between intelligent adversarial attacks and naturally occurring faults in each monitoring node.” Thus, Abbaszadeh discloses a classifier that identifies adversarial attacks. Abbaszadeh discloses transmitting a message via an external computing device (“cloud”) to indicate that the output of the model exceeds a confidence score (“decision boundary”) in [0056]: “At S1440, the system may compare each generated current monitoring node feature vector with a corresponding decision boundary for that monitoring node (the decision boundary separating normal state, attacked state, and fault state for that monitoring node). At S1450, the system may automatically transmit a threat alert signal based on results of said comparisons. The alert signal might be transmitted, for example, via a cloud-based application. 
According to some embodiments, the alert signal may be transmitted via one or more of a cloud-based system, an edge-based system, a wireless system, a wired system, a secured network, and a communication system.”) Abbaszadeh is analogous art because it is in the field of endeavor of machine learning and adversarial attack detection. It would have been obvious before the effective filing date of the claimed invention to combine the adversarial attack detection of Bagnall with the adversarial attack alert of Abbaszadeh. One of ordinary skill in the art would be motivated to do so in order to be able to have qualified personnel respond in a timely manner to limit damage from an attack (Abbaszadeh, [0028]: “This decision boundary may be constructed in feature space using dynamic models and may help enable early detection of vulnerabilities (and potentially avert catastrophic failures) allowing an operator to restore the control system to normal operation in a timely fashion.”) As per Claim 4, the combination of Bagnall, Kaya, and Abbaszadeh teaches the at least one non-transitory computer-readable storage medium of claim 1. Kaya teaches wherein the instructions cause one or more of the at least one processor circuit to generate the ensemble model based on the single trained model (Kaya, Page 4 Left Column: “Placement of the ICs. We pick a subset of internal layers to attach the internal classifiers after.”) It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Kaya with Bagnall for at least the reasons recited in the rejection to Claim 1. As per Claim 5, the combination of Bagnall, Kaya, and Abbaszadeh teaches the at least one non-transitory computer-readable storage medium of claim 1. 
Bagnall teaches wherein the instructions cause one or more of the at least one processor circuit to analyze different classifications output by the first output layer and the second output layer (Bagnall, Page 2 Para 4, discloses: “Detection. At test time, the outputs of all ensemble members are combined using a rank voting mechanism. Each member assigns the rank 0 to the label it considers the most likely, rank 1 to the second most likely, and so on. For each label, the ranks are summed across all members, and the smallest label rank is used as the ensemble disagreement.” Here, Bagnall discloses confidence scores for each output (“rank voting mechanism. Each member assigns the rank 0 to the label it considers the most likely, rank 1 to the second most likely, and so on”), wherein each “rank” is a confidence score in each label for that model’s output. These ranks are used to identify an adversarial attack, as Bagnall continues: “For each label, the ranks are summed across all members, and the smallest label rank is used as the ensemble disagreement. For an ensemble trained using our method, a large ensemble disagreement is indicative of an input that lies outside of the data distribution. Correspondingly, we implement a simple rank-based criterion that rejects a test example as adversarial or outlier if and only if the ensemble disagreement is above a rank threshold hyperparameter τ.”) As per Claim 6, the combination of Bagnall, Kaya, and Abbaszadeh teaches the at least one non-transitory computer-readable storage medium of claim 1. Abbaszadeh teaches wherein the instructions cause one or more of the at least one processor circuit to cause transmission of the message via a network (Abbaszadeh, Para [0056], discloses: “The alert signal might be transmitted, for example, via a cloud-based application. 
According to some embodiments, the alert signal may be transmitted via one or more of a cloud-based system, an edge-based system, a wireless system, a wired system, a secured network, and a communication system.”) As per Claim 7, the combination of Bagnall, Kaya, and Abbaszadeh teaches the at least one non-transitory computer-readable storage medium of claim 1. Abbaszadeh teaches wherein the instructions cause one or more of the at least one processor circuit to, after a failure to detect an adversarial attack, cause transmission of a result of the execution of the ensemble model. (Abbaszadeh, Para [0022], discloses: “The multi-class classifier model 155 may, for example, monitor streams of data from the monitoring nodes 110 comprising data from sensor nodes, actuator nodes, and/or any other critical monitoring nodes (e.g., monitoring nodes MN.sub.1 through MN.sub.N) and automatically output a classification result (e.g., indicating that operation of the industrial asset is normal, attacked, or fault) to one or more remote monitoring devices 170 when appropriate (e.g., for display to a user).”) As per Claims 8 and 11-14, these are apparatus claims corresponding to non-transitory computer-readable storage medium claims 1 and 4-7, and are rejected for similar reasons. 
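The rank-voting detection mechanism that the rejection quotes from Bagnall can be sketched in a few lines. This is an illustrative reading of the quoted passage, not code from the reference; the function name, the use of per-member label scores, and the NumPy representation are all assumptions.

```python
import numpy as np

def ensemble_disagreement(member_scores, tau):
    """Sketch of Bagnall's rank-voting detection as quoted above: each member
    assigns rank 0 to its most likely label, rank 1 to the next, and so on;
    ranks are summed per label across members, and the smallest summed rank
    is the ensemble disagreement. An input is rejected as adversarial or
    outlier when the disagreement exceeds the rank threshold tau."""
    # member_scores: (n_members, n_labels) array of per-member label scores
    ranks = np.argsort(np.argsort(-member_scores, axis=1), axis=1)
    summed = ranks.sum(axis=0)   # per-label rank totals across all members
    disagreement = int(summed.min())
    return disagreement, disagreement > tau

# Three members that all rank label 0 first: disagreement 0, not rejected
scores = np.array([[0.9, 0.1, 0.0],
                   [0.8, 0.15, 0.05],
                   [0.7, 0.2, 0.1]])
d, rejected = ensemble_disagreement(scores, tau=2)  # d == 0, not rejected
```

When the members disagree (each ranking a different label first), every label accumulates nonzero rank from some member, so even the best label's summed rank rises above the threshold and the input is flagged.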
As per Claim 15, Bagnall teaches execute an ensemble model, [the ensemble model being the result of transforming a single trained model into the ensemble model by insertion of] a first output layer at a first exit location [of the single trained model] and a second output layer at a second exit location [of the single trained model] (Bagnall, Page 1 Section 2, discloses: “We propose to train an ensemble of N models that label clean examples accurately while also disagreeing on randomly perturbed examples.” Here, Bagnall discloses an “ensemble of N models”, wherein each model produces a “label”, and one of ordinary skill in the art will understand that this means each model has an output layer at an exit location where the “label” is produced for that model of the ensemble.) analyze outputs of the execution of the ensemble model to identify an adversarial attack by comparing (1) [a first deviation score between] a first confidence score output by the first output layer [and an expected confidence score to a threshold deviation], and (2) [a second deviation between] a second confidence score output by the second output layer [and the expected confidence score to the threshold deviation] (Bagnall, Page 2 Para 4, discloses: “Detection. At test time, the outputs of all ensemble members are combined using a rank voting mechanism. Each member assigns the rank 0 to the label it considers the most likely, rank 1 to the second most likely, and so on. For each label, the ranks are summed across all members, and the smallest label rank is used as the ensemble disagreement.” Here, Bagnall discloses confidence scores for each output (“rank voting mechanism. Each member assigns the rank 0 to the label it considers the most likely, rank 1 to the second most likely, and so on”), wherein each “rank” is a confidence score in each label for that model’s output. 
These ranks are used to identify an adversarial attack, as Bagnall continues: “For each label, the ranks are summed across all members, and the smallest label rank is used as the ensemble disagreement. For an ensemble trained using our method, a large ensemble disagreement is indicative of an input that lies outside of the data distribution. Correspondingly, we implement a simple rank-based criterion that rejects a test example as adversarial or outlier if and only if the ensemble disagreement is above a rank threshold hyperparameter τ.”) However, Bagnall does not teach the ensemble model being the result of transforming a single trained model into the ensemble model by insertion of a first output layer at a first exit location of the single trained model and a second output layer at a second exit location of the single trained model; at least one non-transitory computer-readable storage medium comprising instructions to cause at least one processor circuit to at least; analyze outputs of the execution of the ensemble model to identify an adversarial attack by comparing (1) a first deviation score between a first confidence score output by the first output layer and an expected confidence score to a threshold deviation, and (2) a second deviation between a second confidence score output by the second output layer and the expected confidence score to the threshold deviation; after identification of the adversarial attack based on at least one of the first deviation or the second deviation meeting or exceeding the threshold deviation, terminate execution of the ensemble model, cause transmission of a message, the message to indicate that the output of the execution of the ensemble model is indicative of an adversarial attack. Kaya teaches the ensemble model being the result of transforming a single trained model into the ensemble model by insertion of a first output layer at a first exit location of the single trained model and a second output layer at a second exit
location of the single trained model (Kaya, Page 4 Left Column: “Placement of the ICs. We pick a subset of internal layers to attach the internal classifiers after.”) analyze outputs of the execution of the ensemble model to identify an adversarial attack by comparing (1) a first deviation score between a first confidence score output by the first output layer and an expected confidence score to a threshold deviation, and (2) a second deviation between a second confidence score output by the second output layer and the expected confidence score to the threshold deviation; identification of the adversarial attack based on at least one of the first deviation or the second deviation meeting or exceeding the threshold deviation (Kaya, Pages 6-7 Section 5.1, discloses: “In Section 4.1, we show that an ideal, but impractical, early exit mechanism could eliminate overthinking entirely. Here, as a practical mechanism, we propose using the internal prediction confidence for determining when the network should stop thinking. Confidence allows us to simply decide between making an early exit or forwarding the input sample to subsequent layers. If none of the internal predictions—or the final prediction—are confident enough for an exit; our mechanism outputs the most confident among them. We opt for a simple confidence mechanism for highlighting that we mitigate overthinking regardless of the mechanism, which could be improved with schemes, such as (Teerapittayanon et al., 2016). To quantify confidence, we use the estimated probability of the sample x belonging to the predicted class, i.e. max_j F_i^(j)(x). We deem a prediction confident if this probability exceeds the threshold parameter q.” Here, Kaya discloses first and second outputs (“early exit or forwarding the input sample to subsequent layers”). These outputs each have a confidence score (“To quantify confidence, we use the estimated probability of the sample x belonging to the predicted class”).
Kaya also discloses an expected confidence score (“threshold parameter q”). Kaya also discloses a deviation score between each confidence score and the expected confidence score (“probability exceeds the threshold parameter q”, thus the “deviation score” is the difference between the two). Kaya also discloses the deviation score meeting or exceeding the threshold deviation (the deviation score being the difference between the “probability” and the “threshold parameter q”, and the threshold deviation being 0, and the deviation score being greater than or equal to the threshold deviation of 0 indicates “confident”). Kaya also suggests that this can be used to identify an adversarial attack. Kaya, Page 6 Top Right, discloses: “Our results suggest that backdooring attacks leverage the destructive effect of overthinking.” Kaya, Page 7 Top Right, discloses: “Early Exits Mitigate the Backdoor Attack. In Section 4.2, we identified that a backdooring attack on VGG-16 induces the destructive effect. Our early exit mechanism significantly reduces the success of this attack. When the threshold is at q = 0.8, the backdoored network makes correct predictions on 84% of the backdoor inputs—up from 12% without early exits. Further, the network classifies only 17% of the backdoor samples to the attacker’s target class—down from 98%. Overall, our results suggest that early exits can mitigate this attack. We believe that Shallow-Deep Networks shed light on potential avenues for a defensive strategy against backdooring attacks.”) Here, Kaya discloses that when one of the deviation scores exceeds the threshold, then the early exit produces a correct classification by avoiding the “overthinking” of the subsequent exits which make the system vulnerable to adversarial examples. Thus, early exits taken are indicative of potential adversarial attacks.
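Kaya's confidence-based early-exit mechanism, as quoted above, can be sketched as follows. The sketch is one plausible reading of the quoted text, with illustrative names and shapes; it is not the reference's implementation.

```python
import numpy as np

def early_exit(internal_probs, q):
    """Sketch of the early-exit mechanism Kaya describes above: take the first
    internal prediction whose top-class probability exceeds threshold q; if no
    prediction is confident enough, fall back to the most confident of them."""
    best_exit, best_conf = 0, -1.0
    for i, probs in enumerate(internal_probs):
        conf = float(np.max(probs))          # max class probability = confidence
        if conf > q:
            return i, int(np.argmax(probs))  # confident enough: exit early here
        if conf > best_conf:
            best_exit, best_conf = i, conf
    return best_exit, int(np.argmax(internal_probs[best_exit]))

# The second internal classifier clears q = 0.8, so the network exits there
probs = [np.array([0.50, 0.30, 0.20]),
         np.array([0.90, 0.05, 0.05]),
         np.array([0.60, 0.30, 0.10])]
exit_idx, label = early_exit(probs, q=0.8)  # exits at index 1 with label 0
```

Under this reading, the claimed “deviation score” corresponds to the difference conf - q with a threshold deviation of 0: an early exit is taken exactly when that difference is positive.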
Furthermore, Kaya states in Page 7 Section 5.2 that: “An SDN’s internal predictions reveal how consistently the network reaches its final prediction. Disagreements among them hint that the prediction is inconsistent and confused; whereas an agreement indicates consistency. The destructive effect of overthinking also displays a pattern of disagreement (ŷ_i = y ≠ ŷ_final), and confusion. We propose the confusion metric to capture this inconsistency. The confusion metric quantifies how much the final prediction diverged from the internal predictions.” Here, Kaya discloses that “disagreements among them hint that the prediction is inconsistent and confused”. Furthermore, Kaya on Page 8 discloses: “Further, when used as an indicator for likely misclassifications, confusion also produces fewer false negatives than confidence.” Thus, Examiner notes that Kaya acknowledges that both differences in the “confusion metric” and also in “confidence” can be used to identify “misclassifications”. Thus, this would suggest to one of ordinary skill in the art to use confidence score deviations in order to identify adversarial examples.) after identification of the adversarial attack based on at least one of the first deviation or the second deviation meeting or exceeding the threshold deviation, terminate execution of the ensemble model (Kaya Page 2 Bottom Left: “Our first heuristic uses the confidence of an internal prediction to assess its correctness. With this heuristic, we can reliably detect when the network should stop thinking and make an early prediction—an early exit.”) Kaya is analogous art because it is in the field of endeavor of early exit ensembles and adversarial examples. It would have been obvious before the effective filing date of the claimed invention to combine the ensemble for adversarial attack of Bagnall with the confidence score of Kaya.
Bagnall and Kaya both discuss using ensemble networks to defend against adversarial attacks, as Bagnall states, “a large ensemble disagreement is indicative of an input that lies outside of the data distribution”, and Kaya echoes this by saying, “disagreements among them hint that the prediction is inconsistent and confused.” While Bagnall uses a ranking system to measure disagreement among outputs by identifying how high the score of the lowest ranking label is, Kaya discusses using either a “confidence” or a “confusion metric” to indicate “likely misclassifications”. The combination of these two references would suggest to one of ordinary skill in the art to use an ensemble model to identify adversarial examples based on deviations from a confidence threshold at each output, or even differences in confidence between each output. One of ordinary skill in the art would be motivated to do so in order to better guard against adversarial attacks (Kaya, Page 7 Top Right: “When the threshold is at q = 0.8, the backdoored network makes correct predictions on 84% of the backdoor inputs—up from 12% without early exits. Further, the network classifies only 17% of the backdoor samples to the attacker’s target class—down from 98%. Overall, our results suggest that early exits can mitigate this attack.”) However, the combination of Bagnall and Kaya does not teach at least one non-transitory computer-readable storage medium comprising instructions to cause at least one processor circuit to at least; after identification of the adversarial attack, cause transmission of a message, the message to indicate that the output of the execution of the ensemble model is indicative of an adversarial attack.
Abbaszadeh teaches at least one non-transitory computer-readable storage medium comprising instructions to cause at least one processor circuit to at least (Abbaszadeh, Para [0026], discloses: “For example, a computer-readable storage medium may store thereon instructions that when executed by a machine result in performance according to any of the embodiments described herein.”) after identification of the adversarial attack, cause transmission of a message, the message to indicate that the output of the execution of the ensemble model is indicative of an adversarial attack (Abbaszadeh, Para [0028], discloses: “Abnormalities may be detected by classifying the monitored data as being “normal”, “attacked”, or “fault”. This decision boundary may be constructed in feature space using dynamic models.” Abbaszadeh, Para [0033], discloses: “The system may be configurable and may distinguish between intelligent adversarial attacks and naturally occurring faults in each monitoring node.” Thus, Abbaszadeh discloses a classifier that identifies adversarial attacks. Abbaszadeh discloses transmitting a message via an external computing device (“cloud”) to indicate that the output of the model exceeds a confidence score (“decision boundary”) in [0056]: “At S1440, the system may compare each generated current monitoring node feature vector with a corresponding decision boundary for that monitoring node (the decision boundary separating normal state, attacked state, and fault state for that monitoring node). At S1450, the system may automatically transmit a threat alert signal based on results of said comparisons. The alert signal might be transmitted, for example, via a cloud-based application. 
According to some embodiments, the alert signal may be transmitted via one or more of a cloud-based system, an edge-based system, a wireless system, a wired system, a secured network, and a communication system.”) Abbaszadeh is analogous art because it is in the field of endeavor of machine learning and adversarial attack detection. It would have been obvious before the effective filing date of the claimed invention to combine the adversarial attack detection of Bagnall with the adversarial attack alert of Abbaszadeh. One of ordinary skill in the art would be motivated to do so in order to be able to have qualified personnel respond in a timely manner to limit damage from an attack (Abbaszadeh, [0028]: “This decision boundary may be constructed in feature space using dynamic models and may help enable early detection of vulnerabilities (and potentially avert catastrophic failures) allowing an operator to restore the control system to normal operation in a timely fashion.”) As per Claim 18, the combination of Bagnall, Kaya, and Abbaszadeh teaches the at least one non-transitory computer-readable storage medium of claim 15. Kaya teaches wherein the instructions cause one or more of the at least one processor circuit to generate the ensemble model based on the single trained model (Kaya, Page 4 Left Column: “Placement of the ICs. We pick a subset of internal layers to attach the internal classifiers after.”) It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Kaya with Bagnall for at least the reasons recited in the rejection to Claim 15. As per Claim 19, the combination of Bagnall, Kaya, and Abbaszadeh teaches the at least one non-transitory computer-readable storage medium of claim 15. 
Bagnall teaches wherein the instructions cause one or more of the at least one processor circuit to analyze the outputs of the execution of the ensemble model further based on a count of different classifications output by the first output layer and the second output layer (Bagnall, Page 2 Para 4, discloses: “Detection. At test time, the outputs of all ensemble members are combined using a rank voting mechanism. Each member assigns the rank 0 to the label it considers the most likely, rank 1 to the second most likely, and so on. For each label, the ranks are summed across all members, and the smallest label rank is used as the ensemble disagreement.” Here, Bagnall discloses confidence scores for each output (“rank voting mechanism. Each member assigns the rank 0 to the label it considers the most likely, rank 1 to the second most likely, and so on”), wherein each “rank” is a confidence score in each label for that model’s output. These ranks are used to identify an adversarial attack, as Bagnall continues: “For each label, the ranks are summed across all members, and the smallest label rank is used as the ensemble disagreement. For an ensemble trained using our method, a large ensemble disagreement is indicative of an input that lies outside of the data distribution. Correspondingly, we implement a simple rank-based criterion that rejects a test example as adversarial or outlier if and only if the ensemble disagreement is above a rank threshold hyperparameter τ.” Here, Examiner notes that Bagnall does not merely produce one classification at each layer, but a ranking of multiple classifications at each layer. Based on this, each label gets a score (“For each label, the ranks are summed across all members, and the smallest label rank is used as the ensemble disagreement”). Examiner notes that the claimed limitation merely states “analyze…based on a count”, with no further detail on how this is accomplished or what is done with the count. 
These rankings, which are summed by Bagnall, are based on a count of different classifications output by each classifier, because the sum of these rankings is determined by how many ranked labels are produced at each layer. The sum will change if some layers produce X ranked labels, but some other layers produce Y ranked labels. A sum is thus based on a count of addends in the sum.) As per Claim 21, Bagnall teaches wherein the first output layer comprises a fully connected layer and a softmax layer (Bagnall, Section 2: “Let W_n = [w_n^k]_{1≤k≤K} be the matrix of parameters used by the softmax layer to compute the posterior probabilities corresponding to K classes, and let W = [W_n]_{1≤n≤N} be the 3-dimensional tensor of the softmax parameters for the entire ensemble. Let σ_n(x) = softmax(W_n, x) = [σ_n^k(x)]_{1≤k≤K} be the vector of softmax outputs computed by the ensemble member n on an input example x.”) Claims 3, 10, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Bagnall, Kaya, and Abbaszadeh, further in view of Schaal et al. (“From Isolation to Cooperation: An Alternative View of a System of Experts”; hereinafter “Schaal”). As per Claim 3, the combination of Bagnall, Kaya, and Abbaszadeh teaches the at least one non-transitory computer-readable storage medium of claim 1.
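The output layer recited in the Claim 21 rejection above, a fully connected layer followed by a softmax, can be sketched as below. The dimensions, weights, and function names are illustrative assumptions, not values taken from Bagnall.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a vector of class logits."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def output_head(features, W, b):
    """Sketch of one output layer as described in the Claim 21 rejection: a
    fully connected layer (weights W, bias b) followed by a softmax that
    yields posterior probabilities over K classes."""
    return softmax(W @ features + b)

# K = 3 classes computed from a 4-dimensional feature vector
rng = np.random.default_rng(0)
posteriors = output_head(rng.normal(size=4), rng.normal(size=(3, 4)), np.zeros(3))
# posteriors is a length-3 probability vector summing to 1
```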
However, the combination does not teach wherein the instructions cause one or more of the at least one processor circuit to aggregate the first output of the first output layer and the second output of the second output layer using a weighted average. Schaal teaches wherein the instructions cause one or more of the at least one processor circuit to aggregate the first output of the first output layer and the second output of the second output layer using a weighted average (Schaal, Page 605 Intro, discloses: “Distributing a learning task among a set of experts has become a popular method in computational learning. One approach is to employ several experts, each with a global domain of expertise (e.g., Wolpert, 1990). When an output for a given input is to be predicted, every expert gives a prediction together with a confidence measure. The individual predictions are combined into a single result, for instance, based on a confidence weighted average.”) Schaal is analogous art because it is in the field of endeavor of ensemble learning. It would have been obvious before the effective filing date of the claimed invention to combine the ensemble model of Bagnall with the confidence weighted average of Schaal. Bagnall even hints at the use of a weighted average of all outputs during training, as they state at the top of Page 2: “J_e is the standard cross-entropy error for clean example x and its true label y, averaged over all ensemble members”. Furthermore, the Kaya reference cites Teerapittayanon et al.
(“BranchyNet: Fast Inference via Early Exiting from Deep Neural Networks”), who also similarly states on Page 2: “BranchyNet jointly optimizes the weighted loss of all exit points.” One of ordinary skill in the art would be motivated to use a weighted average in order to take into account the confidence of each prediction when arriving at a final prediction (Schaal, Page 605 Intro: “When an output for a given input is to be predicted, every expert gives a prediction together with a confidence measure. The individual predictions are combined into a single result, for instance, based on a confidence weighted average.”) As per Claim 10, this is an apparatus claim corresponding to non-transitory computer-readable storage medium claim 3, and is rejected for similar reasons. As per Claim 17, the combination of Bagnall, Kaya, and Abbaszadeh teaches the at least one non-transitory computer-readable storage medium of claim 15. However, the combination does not teach wherein the instructions cause one or more of the at least one processor circuit to aggregate the first output of the first output layer and the second output of the second output layer using a weighted average. Schaal teaches wherein the instructions cause one or more of the at least one processor circuit to aggregate the first output of the first output layer and the second output of the second output layer using a weighted average (Schaal, Page 605 Intro, discloses: “Distributing a learning task among a set of experts has become a popular method in computational learning. One approach is to employ several experts, each with a global domain of expertise (e.g., Wolpert, 1990). When an output for a given input is to be predicted, every expert gives a prediction together with a confidence measure. The individual predictions are combined into a single result, for instance, based on a confidence weighted average.”) Schaal is analogous art because it is in the field of endeavor of ensemble learning.
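The confidence-weighted average that the rejection draws from Schaal can be sketched as follows; the array-based formulation and function name are assumptions for illustration, not code from the reference.

```python
import numpy as np

def confidence_weighted_average(predictions, confidences):
    """Sketch of the confidence-weighted combination Schaal describes above:
    each expert's prediction is weighted by its confidence, and the weighted
    mean over experts is the single combined result."""
    w = np.asarray(confidences, dtype=float)
    preds = np.asarray(predictions, dtype=float)
    return (w[:, None] * preds).sum(axis=0) / w.sum()

# Two output layers predicting class distributions; the first is more confident
combined = confidence_weighted_average([[0.8, 0.2], [0.4, 0.6]], [0.9, 0.3])
# combined == [0.7, 0.3]: pulled toward the more confident first prediction
```

Because the weights are normalized by their sum, the combined output remains a valid probability distribution whenever the individual predictions are.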
It would have been obvious before the effective filing date of the claimed invention to combine the ensemble model of Bagnall with the confidence weighted average of Schaal. Bagnall even hints at the use of a weighted average of all outputs during training, as they state at the top of Page 2: “Je is the standard cross-entropy error for clean example x and its true label y, averaged over all ensemble members”. Furthermore, the Kaya reference cites Teerapittayanon et al. (“BranchyNet: Fast Inference via Early Exiting from Deep Neural Networks”), who also similarly states on Page 2: “BranchyNet jointly optimizes the weighted loss of all exit points.” One of ordinary skill in the art would be motivated to use a weighted average in order to take into account the confidence of each prediction when arriving at a final prediction (Schaal, Page 605 Intro: “When an output for a given input is to be predicted, every expert gives a prediction together with a confidence measure. The individual predictions are combined into a single result, for instance, based on a confidence weighted average.”) Conclusion Any inquiry concerning this communication or earlier communications from the examiner should be directed to VAN C MANG whose telephone number is (571)270-7598. The examiner can normally be reached Mon - Fri 8:00-5:00pm. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, David Yi can be reached at 5712707519. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. 
Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /VAN C MANG/Primary Examiner, Art Unit 2126

Prosecution Timeline

Jan 12, 2024
Application Filed
Mar 07, 2025
Non-Final Rejection — §101, §103
Jun 12, 2025
Response Filed
Jun 23, 2025
Final Rejection — §101, §103
Nov 25, 2025
Request for Continued Examination
Dec 07, 2025
Response after Non-Final Action
Feb 06, 2026
Non-Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12591809
MACHINE LEARNING PLATFORM
2y 5m to grant Granted Mar 31, 2026
Patent 12591830
Machine Learning-Based Approach to Identify Software Components
2y 5m to grant Granted Mar 31, 2026
Patent 12586022
Machine Learning-Based Approach to Characterize Software Supply Chain Risk
2y 5m to grant Granted Mar 24, 2026
Patent 12579444
MACHINE LEARNING MODEL GENERATION AND UPDATING FOR MANUFACTURING EQUIPMENT
2y 5m to grant Granted Mar 17, 2026
Patent 12561555
NETWORK OF TENSOR TIME SERIES
2y 5m to grant Granted Feb 24, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

3-4
Expected OA Rounds
75%
Grant Probability
99%
With Interview (+26.9%)
3y 10m
Median Time to Grant
High
PTA Risk
Based on 241 resolved cases by this examiner. Grant probability derived from career allow rate.
