Prosecution Insights
Last updated: April 19, 2026
Application No. 18/217,051

EXPLANATION OF A CAUSE OF A MISTAKE IN A MACHINE LEARNING MODEL USING A DIAGNOSTIC ARTIFICIAL INTELLIGENCE MODEL

Non-Final OA: §101, §103, §112
Filed: Jun 30, 2023
Examiner: SPRATT, BEAU D
Art Unit: 2143
Tech Center: 2100 — Computer Architecture & Software
Assignee: QED Software Sp. z o.o.
OA Round: 1 (Non-Final)
Grant Probability: 79% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 3y 1m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 79% (342 granted / 432 resolved), +24.2% vs TC avg (above average)
Interview Lift: +26.6% higher allow rate for resolved cases with an interview (strong)
Typical Timeline: 3y 1m average prosecution; 37 applications currently pending
Career History: 469 total applications across all art units

Statute-Specific Performance

§101: 12.2% (-27.8% vs TC avg)
§103: 63.7% (+23.7% vs TC avg)
§102: 11.9% (-28.1% vs TC avg)
§112: 5.4% (-34.6% vs TC avg)
Tech Center averages are estimates • Based on career data from 432 resolved cases

Office Action

§101 §103 §112
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. Claims 1-41 are presented in the case.

Information Disclosure Statement

The information disclosure statement submitted on 08/23/2023 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b): (b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention. The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph: The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-41 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. The terms "optimal", "mistake" and "learnings" in claims 1, 31, and 36 are relative terms which render the claims indefinite. The terms "optimal" and "mistake" are not defined by the claims, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. Examples are provided, including that a mistake is not optimal (see PGPUB ¶113), but that does not resolve the concern, since different users may have different definitions of "optimal" and "mistake". Also, what counts as a cause of a mistake is unclear. Using terms like loss, accuracy, confidence, sensor error and other measures or metrics would be more concrete. Claims 1, 31, and 36 recite the phrases "trusted operational data", "a known trusted data" and "past known trusted data". These phrases have insufficient antecedent basis in the claims because it is unclear whether they refer to the same dataset, overlapping datasets or different datasets.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-41 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The analysis of the claims follows the 2019 Revised Patent Subject Matter Eligibility Guidance, 84 Fed. Reg. 50 ("2019 PEG"). Claims 1, 31 and 36 have the following abstract idea analysis. Step 1: The claims are directed to "a method and system" and accordingly fall within the statutory categories. Step 2A Prong 1: The claims recite the abstract idea limitations of "determining that a predictive data produced by an evaluated model for a period of time does not match a known trusted data for the period of time", "determining that a mistake occurred when the predictive data is not the optimal result of the evaluated model", and "explaining a cause of the mistake".
These limitations are mental concepts (acts of evaluating). Mental processes are concepts performed in the human mind, including an observation, evaluation, judgment, or opinion (see MPEP § 2106.04(a)(2)). Identifying mismatches and mistakes and explaining them are analogous to analyzing information and can be performed in the human mind. The specification also provides example operations performed by human decision makers. See USPGPUB ¶11 and ¶79. Other portions of the claims, such as "forming a diagnostic model", the recited types of operational data and "fine-tuning", are too generic or high level to be listed as a judicial exception given the available descriptions and MPEP comparisons. Step 2A Prong 2: The judicial exceptions recited in these claims are not integrated into a practical application. Merely invoking "machine learning", "memory", "processors", a "diagnostic model", "supply chain data" and other data types does not yield eligibility. The claims remain in line with mental concepts, and claims 1, 31 and 36 are not specific to a practical application. The additional elements are processors and memory, which do not include specialized hardware. See MPEP § 2106.05(f). Claims 1, 31 and 36 do not recite a particular field, and even doing so may not be sufficient to overcome the abstract idea rejection: merely applying a model to a field or data, without an advancement in that field or new hardware, is ineligible. MPEP § 2106.05(h). Step 2B: The claims do not contain significantly more than their judicial exceptions. Models, processors, memory and other hardware appear in their standard forms in the field. These additional elements are well-understood, routine, and conventional activity; see MPEP 2106.05(d)(II). The claims lack any particular "how" or algorithm that solves a problem in a field in a novel way. The claims would need more specificity, reciting processes that could not be performed by simple mathematics or mental processes, or more substantial structure than conventional devices, such as non-textbook implementations. Regarding claims 2-30, 32-35 and 37-41, they merely narrow the previously recited abstract idea limitations with more abstract concepts and/or routine fundamental processes. For the reasons described above with respect to claims 1, 31 and 36, this judicial exception is not meaningfully integrated into a practical application and does not amount to significantly more than the abstract idea. The analysis under Step 1 and Step 2A Prongs 1 and 2 remains the same as for the independent claims above. Practical application concepts may appear in the specification, but none are seen in claims 2-30, 32-35 and 37-41. With respect to Step 2B, these claims recite similar limitations to those described above and do not provide anything significantly more than mathematical or mental concepts. Claims 2-30, 32-35 and 37-41 recite the additional elements of "comparing the predictive data of the evaluated model with a simulated prediction of the diagnostic model when the mistake is observed to explain the cause of the mistake using the diagnostic model, and wherein the explaining the cause of the mistake is applied to diagnose a black-box model without requiring any knowledge about technical specifications of a specific machine learning algorithm of the black-box model and without having direct access to it, and wherein the diagnostic model is applied on a diagnosed dataset and an output of the evaluated model comprising at least one of a prediction and a classification, without using predictions on a training set.
generating a most probable explanation of the cause of the mistake as a natural language text. determining the cause of the mistake is because a labeling of a historical training data set on which the evaluated model was formed was erroneous. determining the cause of the mistake is because an external condition changed that caused the predictive data produced by the evaluated model to no longer conform to predictive trends. determining the cause of the mistake is because of an error in an input data to the evaluated model. the error in the input data is caused by any one of an inaccurate sensor reading, a human error, and an anomaly. determining the cause of the mistake is because an input data is a novel scenario from previous input scenarios, and the evaluated model is unprepared in the novel scenario. determining the cause of the mistake is because of concept drift in a relationship between an input data and the predictive data caused because a property of a target variable has changed over time. determining the cause of the mistake is because the evaluated model is underfitted because while a similar input data to an input data happened in the past, the evaluated model was not sufficiently fitted to the input data. determining the cause of the mistake is because the evaluated model is overfitted because the diagnosed machine learning model is unable to generalize away from a narrow band of deep optimizations to extrapolate to a general case. determining the cause of the mistake is because the evaluated model is based on an anomaly meaning that normally the evaluated model would be correct and that a human decision maker would most likely make the same mistake in this special case because of a unique condition of an input data now received. wherein the mistake is caused by a non-determinism of a problem. determining the cause of the mistake is because an input data to the evaluated model is based on an intentional attack caused by malignant actors attempting to undermine an integrity of the evaluated model. wherein the intentional attack is an intentional modification of the input data to the model. determining the cause of the mistake and a corresponding fix recommendation, which suggests how to improve performance of the evaluated model; and generating a visual report emphasizing a ranked set of important findings based on an order of importance, and which comprise relevant statistics related to the evaluated model, the quality of its approximator, and the distributions of a diagnostic attribute, wherein the visual report includes an interactive plot to help to explore diagnoses for individual instances and analyze their statistics for specific groups, and wherein the visual report provides insights on the importance of original attributes, approximated by significance of attributes in the diagnostic model. generating reports containing relevant statistics related to the diagnostic model, the quality of its approximator, and the distributions of diagnostic attributes, wherein the diagnostic model is a system that is responsible for making a diagnosis of causes of errors made by the evaluated model that is being diagnosed and whose prediction is already a concrete cause of error, and wherein the approximator is encapsulated within the diagnostic model comprising of an ensemble of rough-set models for determining approximations and neighborhoods. generating interactive plots to explore diagnoses for individual instances and analyze their statistics for specific groups. 
determining the importance of original attributes, approximated by the significance of attributes in the diagnostic model, and wherein the diagnostic model is a surrogate model. generating a set of historical neighborhoods comprising a set of historical instances that were processed in a similar way to the current instance on which mistakes of the diagnosed machine learning model are observable. forming a set of diagnostic attributes which describe a current instance through analysis of contents of the historical neighborhoods; and forming the diagnostic model as a decision model which obtains vectors of the set of diagnostic attributes as an input data; delivering a most probable cause of the mistake as an output data of the diagnostic model. forming the diagnostic model based on an analysis of mistakes registered in the set of neighborhoods. forming the diagnostic model based on the trusted operational data and the past predictive data for different periods of time when compared with past known trusted data for the different periods of time using rough set-based models in which intelligent systems are characterized by insufficient and incomplete information. computing accurate approximations of past predictive data with rough set-based surrogate models and a heuristic optimization method. basing a surrogate machine learning model on the trusted operational data produced by the evaluated model; automatically applying a method of discretization; applying an algorithm to determine high-quality approximations; obtaining trusted neighborhoods of each current instance by looking for trusted instances that were processed in a similar way by the surrogate machine learning model; and training the surrogate machine learning model as a model approximator. obtaining a set of neighborhoods using the model approximator comprising an ensemble of approximate reducts known from the theory of rough sets; and determining a specific neighborhood of a diagnosed instance through a decision process of the model approximator, wherein neighborhood for a diagnosed instance relative to a single reduct is a subset of instances from the historical training dataset which belong to the same indiscernibility class. The final neighborhood is the sum of neighborhoods computed for all reducts in the ensemble. The instances from neighborhoods have weights that express how representative they are for a given neighborhood. approximating how many reducts in the ensemble of approximate reducts are able to process in a same way a given pair of instances; counting how many reducts the given pair of instances of the ensemble are processed in the same way; and determining a similarity measure between instances through the counting of how many reducts in the ensemble the given pair of instances are processed in the same way. analyzing the specific neighborhood to determine characteristics comprising at least one of consistency of ground truth labels, consistency of original model predictions, consistency of approximations, neighborhood size, and uncertainty of predictions; and determining a set of characteristics through analyzing the specific neighborhood to determine consistency of labels comprising at least one of ground truth labels, original model predictions, approximations, size, and uncertainty of predictions. 
specifying diagnostic attributes that can be derived from contents of computed neighborhoods through analyzing the specific neighborhood to determine characteristics and through determination of the set of characteristics; and providing meaningful information on model operations by including the set of characteristics as diagnostic attributes that constitute an input in diagnostic rules. linking the values of the diagnostic attributes to a set of possible causes of mistakes. wherein when a neighborhood of a particular current instance is in at least one of a null and a minimal condition, then a probable cause of the mistake of the evaluated model on the particular current instance is that this is a totally new dissimilar case to historic cases and the evaluated model was unprepared for such cases." These elements are more abstract concepts, generic applications to a field of use, or well-understood, routine, conventional activity (see MPEP § 2106.05(d)) and cannot simply be appended to qualify as significantly more or as a practical application. What type of application, or what structure of components beyond generic machine learning, is intended remains unclear from these claims. Therefore claims 2-30, 32-35 and 37-41 also recite abstract ideas that do not integrate into a practical application or amount to significantly more than the judicial exception, and are rejected under 35 U.S.C. 101.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-3, 5-7, 19-20, 31-33 and 41 are rejected under 35 U.S.C. 103 as being unpatentable over Chan et al. (US 20190325333 A1, hereinafter Chan) in view of Rohrkemper et al. (US 20230061280 A1, hereinafter Rohrkemper) and DESHPANDE et al. (US 20150178638 A1, hereinafter Deshpande).
As to independent claim 1, Chan teaches a method comprising: forming a diagnostic model using machine learning by ingesting trusted operational data, [trains surrogate model (diagnostic model) using training data including actual outcome data (trusted operational data) ¶36, ¶50-52 "Training data 116 includes data that is used to train linear surrogate model 114 and/or one or more non-linear surrogate models 115. Training data 116 may include at least a portion of training data 106. Training Data 116 is comprised of a plurality of entries.
Each entry is associated with one or more features having a corresponding value and associated actual outcomes"] wherein the trusted operational data is any one of a supply chain data, a sales data, a purchase data, a fulfillment data, a sensory capture data, an observation data, an empirical data, a historical data, an industrial data, and a financial data; [includes observation of data ¶44, ¶49 "all observations of the training set"] determining that a predictive data produced by an evaluated model for a period of time does not match a known trusted data for the period of time; [determines ML predictions do not correlate (match) with actual outcome (known trusted data) ¶70 "In some embodiments, the machine learning model prediction does not correlate with the actual outcome data"] analyzing whether the predictive data produced by the evaluated model for the period of time is an optimal result of the evaluated model; [non-correlation reveals a need to investigate (non-optimal) ¶92 "In the event the global importance value for a feature and the local importance value for the feature do not correlate, the entry with which the prediction is associated may be flagged. In some embodiments, the feature importance model is investigated to determine why the model outputted such values. In the event a threshold number of entries are flagged, the non-linear model may be determined to be inaccurate and adjusted. For example, the global importance value 504a for feature "F18" does not correlate with the local importance value 504b. This indicates that the non-linear model associated with non-linear model graph 500 may need to be adjusted or the feature importance model should be investigated."] explaining a cause [provides reason code (explanation) of decisions using surrogate model to explain a more complex model ¶37, ¶21 "A surrogate model may not only provide a prediction that is similar to the prediction made by the machine learning model, but also provide one or more reasons that describe why the surrogate model made its decision."] fine-tuning [retrain (fine-tune) based on flagging entries (learnings) (flagged another time, ¶92), ¶126 "At 1212, the linear and/or nonlinear model(s) are retrained. In some embodiments, the linear and/or non-linear surrogate models are retrained in the event a threshold number of entries are flagged"] Chan does not specifically teach determining that a mistake occurred when the predictive data is not the optimal result of the evaluated model and explaining a cause of the mistake using the diagnostic model. However, Rohrkemper teaches determining that a mistake occurred when the predictive data is not the optimal result of the evaluated model; [determines a particular fault or anomaly occurred (mistake) ¶29 "The fault detection model receives the set of output values from the anomalous-signal-prediction model. The fault detection model analyzes the relationships among the output values from the anomalous-signal-prediction model to generate the plurality of operational results.
For example, the fault detection model may determine that a particular sub-set of signals having a particular variation from predicted values corresponds to a particular anomaly."] explaining a cause of the mistake using the diagnostic model; and [identifies the root cause of the results using an abductive model on a result of an ML model ¶1, ¶31 "identifies the root cause based on identifying a set of data sources that have a highest influence on the generation of the particular operational result"] Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the model interpretation disclosed by Chan by incorporating the determining that a mistake occurred when the predictive data is not the optimal result of the evaluated model and explaining a cause of the mistake using the diagnostic model disclosed by Rohrkemper, because both techniques address the same field of machine learning and incorporating Rohrkemper into Chan better identifies root causes of problems over conventional techniques in a more timely manner [Rohrkemper ¶3-4]. Chan and Rohrkemper do not specifically teach fine-tuning the diagnostic model based on learnings from a past predictive data for different periods of time when compared with past known trusted data for the different periods of time using a processor and a memory. However, Deshpande teaches fine-tuning the diagnostic model based on learnings from a past predictive data for different periods of time when compared with past known trusted data for the different periods of time using a processor and a memory. [retraining based on real-time vs original historical data ¶8-9 "prediction model is dynamically retrained using the received real time transaction data. The real time transaction data can be a new transaction data or a change in the existing transaction data associated with the entity. In other words, when the relationship between the influencing parameters and the output does not match due to a change in a pattern of the transactional data, then the prediction model is retrained"] Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the model interpretation disclosed by Chan and Rohrkemper by incorporating the fine-tuning the diagnostic model based on learnings from a past predictive data for different periods of time when compared with past known trusted data for the different periods of time using a processor and a memory disclosed by Deshpande, because all techniques address the same field of machine learning and incorporating Deshpande into Chan and Rohrkemper ensures data and models used are more timely and accurate [Deshpande ¶1].
As to dependent claim 2, the rejection of claim 1 is incorporated, and Chan, Rohrkemper and Deshpande further teach comparing the predictive data of the evaluated model with a simulated prediction of the diagnostic model when the mistake is observed to explain the cause of the mistake using the diagnostic model, and [Rohrkemper hypothetical result (simulated) ¶63 "operator may initiate a request to identify a root cause of a hypothetical fault state"] wherein the explaining the cause of the mistake is applied to diagnose a black-box model without requiring any knowledge about technical specifications of a specific machine learning algorithm of the black-box model and without having direct access to it, and [Chan black box model ¶1] wherein the diagnostic model is applied on a diagnosed dataset and an output of the evaluated model comprising at least one of a prediction and a classification, without using predictions on a training set. [Chan predictions from model ¶20, surrogate uses complex model output ¶33, classification into clusters ¶35]
As to dependent claim 3, the rejection of claim 1 is incorporated, and Chan, Rohrkemper and Deshpande further teach generating a most probable explanation of the cause of the mistake as a natural language text. [Rohrkemper human-understandable explanation ¶34]
As to dependent claim 5, the rejection of claim 1 is incorporated, and Chan, Rohrkemper and Deshpande further teach determining the cause of the mistake is because an external condition changed that caused the predictive data produced by the evaluated model to no longer conform to predictive trends. [Deshpande change in patterns of the data leads to the retraining ¶9]
As to dependent claim 6, the rejection of claim 1 is incorporated, and Chan, Rohrkemper and Deshpande further teach determining the cause of the mistake is because of an error in an input data to the evaluated model. [Rohrkemper identifies the root cause in the data sources (input data) ¶1, ¶31 "identifies the root cause based on identifying a set of data sources that have a highest influence on the generation of the particular operational result"]
As to dependent claim 7, the rejection of claim 6 is incorporated, and Chan, Rohrkemper and Deshpande further teach wherein the error in the input data is caused by any one of an inaccurate sensor reading, a human error, and an anomaly. [Rohrkemper anomaly ¶28, ¶31 "four of the ten sensors as having the anomalous values that contributed to the particular anomalous operational result"]
As to dependent claim 19, the rejection of claim 1 is incorporated, and Chan, Rohrkemper and Deshpande further teach determining the importance of original attributes, approximated by the significance of attributes in the diagnostic model, and wherein the diagnostic model is a surrogate model. [Chan ranks influential input features (importance) ¶45, surrogate model ¶33 "surrogate model is a data mining and engineering technique in which a generally simpler model is used to explain another usually more complex model or phenomenon"]
As to dependent claim 20, the rejection of claim 1 is incorporated, and Chan, Rohrkemper and Deshpande further teach generating a set of historical neighborhoods comprising a set of historical instances that were processed in a similar way to the current instance on which mistakes of the diagnosed machine learning model are observable. [Chan clusters are groups of neighborhoods with similarity ¶66 "Each cluster represents a subset of the entries that are similar to each other.
An entry may be associated with a cluster based on a distance between the entry and a cluster centroid"]
As to independent claim 31, Chan teaches a processing system comprising a bank of computation processors and associated memory; [processor and memory ¶17] a network; [network ¶25] a diagnostic module coupled with the processing system through the network, further comprising: [surrogate server ¶25] an ingestion module [surrogate server ¶33] to form a diagnostic model using machine learning by ingesting trusted operational data, [receives training data to form a surrogate model (diagnostic model) including actual outcome data (trusted operational data) ¶36, ¶50-52 "Training data 116 includes data that is used to train linear surrogate model 114 and/or one or more non-linear surrogate models 115. Training data 116 may include at least a portion of training data 106. Training Data 116 is comprised of a plurality of entries. Each entry is associated with one or more features having a corresponding value and associated actual outcomes"] wherein the trusted operational data is any one of a supply chain data, a sales data, a purchase data, a fulfillment data, a sensory capture data, an observation data, an empirical data, a historical data, an industrial data, and a financial data, [includes observation of data ¶44, ¶49 "all observations of the training set"] a matching module [server ¶25] to determine that a predictive data produced by an evaluated model for a period of time does not match a known trusted data for the period of time, [determines ML predictions do not correlate (match) with actual outcome (known trusted data) ¶70 "In some embodiments, the machine learning model prediction does not correlate with the actual outcome data"] an optimization module [server ¶25] to analyze whether the predictive data produced by the evaluated model for the period of time is an optimal result of the evaluated model, [non-correlation reveals a need to investigate (non-optimal) ¶92 "In the event the global importance value for a feature and the local importance value for the feature do not correlate, the entry with which the prediction is associated may be flagged. In some embodiments, the feature importance model is investigated to determine why the model outputted such values. In the event a threshold number of entries are flagged, the non-linear model may be determined to be inaccurate and adjusted. For example, the global importance value 504a for feature "F18" does not correlate with the local importance value 504b. This indicates that the non-linear model associated with non-linear model graph 500 may need to be adjusted or the feature importance model should be investigated."] explaining a cause [provides reason code (explanation) of decisions using surrogate model to explain a more complex model ¶37, ¶21 "A surrogate model may not only provide a prediction that is similar to the prediction made by the machine learning model, but also provide one or more reasons that describe why the surrogate model made its decision."] fine-tuning [retrain (fine-tune) based on flagging entries (learnings) (flagged another time, ¶92), ¶126 "At 1212, the linear and/or nonlinear model(s) are retrained.
In some embodiments, the linear and/or non-linear surrogate models are retrained in the event a threshold number of entries are flagged"] Chan does not specifically teach a mistake-identification module to determine that a mistake occurred when the predictive data is not the optimal result of the evaluated model, an explanation module to explain a cause of the mistake using the diagnostic model. However, Rohrkemper teaches a mistake-identification module [model 119 ¶48] to determine that a mistake occurred when the predictive data is not the optimal result of the evaluated model, [determines a particular fault or anomaly occurred (mistake) ¶29 "The fault detection model receives the set of output values from the anomalous-signal-prediction model. The fault detection model analyzes the relationships among the output values from the anomalous-signal-prediction model to generate the plurality of operational results. For example, the fault detection model may determine that a particular sub-set of signals having a particular variation from predicted values corresponds to a particular anomaly."] an explanation module [generator 122 ¶49] to explain a cause of the mistake using the diagnostic model, and [identifies the root cause of the results using an abductive model on a result of an ML model ¶1, ¶31 "identifies the root cause based on identifying a set of data sources that have a highest influence on the generation of the particular operational result"] Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the model interpretation disclosed by Chan by incorporating the mistake-identification module to determine that a mistake occurred when the predictive data is not the optimal result of the evaluated model, an explanation module to explain a cause of the mistake using the diagnostic model disclosed by Rohrkemper, because both techniques address the same field of machine learning and incorporating Rohrkemper into Chan better identifies root causes of problems over conventional techniques in a more timely manner [Rohrkemper ¶3-4]. Chan and Rohrkemper do not specifically teach a tuning module to fine-tune the diagnostic model based on learnings from a past predictive data for different periods of time when compared with past known trusted data for the different periods of time using the processing system. However, Deshpande teaches a tuning module [retraining module 115 ¶11] to fine-tune the diagnostic model based on learnings from a past predictive data for different periods of time when compared with past known trusted data for the different periods of time using the processing system. [retraining based on real-time vs original historical data (different periods) ¶8-9 "prediction model is dynamically retrained using the received real time transaction data. The real time transaction data can be a new transaction data or a change in the existing transaction data associated with the entity.
In other words, when the relationship between the influencing parameters and the output does not match due to a change in a pattern of the transactional data, then the prediction model is retrained"] Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the model interpretation disclosed by Chan and Rohrkemper by incorporating the tuning module to fine-tune the diagnostic model based on learnings from a past predictive data for different periods of time when compared with past known trusted data for the different periods of time using the processing system disclosed by Deshpande, because all techniques address the same field of machine learning and incorporating Deshpande into Chan and Rohrkemper ensures data and models used are more timely and accurate [Deshpande ¶1].
As to dependent claim 32, the rejection of claim 31 is incorporated, and Chan, Rohrkemper and Deshpande further teach an explanation module to compare the predictive data of the evaluated model with a simulated prediction of the diagnostic model when the mistake is observed to explain the cause of the mistake using the diagnostic model, and [Rohrkemper hypothetical result (simulated) ¶63 "operator may initiate a request to identify a root cause of a hypothetical fault state"] wherein the explaining the cause of the mistake is applied to diagnose a black-box model without requiring any knowledge about technical specifications of a specific machine learning algorithm of the black-box model and without having direct access to it, and [Chan black box model ¶1] wherein the diagnostic model is applied on a diagnosed dataset and an output of the evaluated model comprising at least one of a prediction and a classification, without using predictions on a training set. [Chan predictions from model ¶20, surrogate uses complex model output ¶33, classification into clusters ¶35]
As to dependent claim 33, the rejection of claim 32 is incorporated, and Chan, Rohrkemper and Deshpande further teach a natural language module to generate a most probable explanation of the cause of the mistake as a natural language text. [Rohrkemper human-understandable explanation ¶34]
As to dependent claim 41, the rejection of claim 1 is incorporated, and Chan, Rohrkemper and Deshpande further teach determining the cause of the mistake is because of an error in an input data to the evaluated model. [Rohrkemper identifies the root cause in the data sources (input data) ¶1, ¶31 "identifies the root cause based on identifying a set of data sources that have a highest influence on the generation of the particular operational result"]
Claims 4, 34 and 35 are rejected under 35 U.S.C. 103 as being unpatentable over Chan in view of Rohrkemper and Deshpande as applied in the rejection of claim 1 above, and further in view of FARRÉ GUIU et al. (US 20190034822 A1, hereinafter Farre).
As to dependent claim 4, the rejection of claim 1 is incorporated. Chan, Rohrkemper and Deshpande do not specifically teach determining the cause of the mistake is because a labeling of a historical training data set on which the evaluated model was formed was erroneous. However, Farre teaches determining the cause of the mistake is because a labeling of a historical training data set on which the evaluated model was formed was erroneous.
[labeling error ¶3 "determining a type of the first labeling error based on a second confusion matrix, and modifying the training dataset based on the determined type of the first labeling error."] Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the model interpretation disclosed by Chan, Rohrkemper and Deshpande by incorporating the determining the cause of the mistake is because a labeling of a historical training data set on which the evaluated model was formed was erroneous disclosed by Farre, because all techniques address the same field of machine learning and incorporating Farre into Chan, Rohrkemper and Deshpande enhances the dataset for more accuracy or improvement in models [Farre ¶2].
As to dependent claim 34, the rejection of claim 33 is incorporated. Chan, Rohrkemper and Deshpande do not specifically teach a label-analysis module to determine the cause of the mistake is because a labeling of a historical training data set on which the evaluated model was formed was erroneous. However, Farre teaches a label-analysis module to determine the cause of the mistake is because a labeling of a historical training data set on which the evaluated model was formed was erroneous. [labeling error ¶3 "determining a type of the first labeling error based on a second confusion matrix, and modifying the training dataset based on the determined type of the first labeling error."] Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the model interpretation disclosed by Chan, Rohrkemper and Deshpande by incorporating the label-analysis module to determine the cause of the mistake is because a labeling of a historical training data set on which the evaluated model was formed was erroneous disclosed by Farre, because all techniques address the same field of machine learning and incorporating Farre into Chan, Rohrkemper and Deshpande enhances the dataset for more accuracy or improvement in models [Farre ¶2].
As to dependent claim 35, the rejection of claim 34 is incorporated, and Chan, Rohrkemper, Deshpande and Farre further teach an external-change module to determine the cause of the mistake is because an external condition changed that caused the predictive data produced by the evaluated model to no longer conform to predictive trends. [Deshpande change in patterns of the data leads to the retraining ¶9]
Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Chan in view of Rohrkemper and Deshpande as applied in the rejection of claim 1 above, and further in view of Christiansen et al. (US 20210125104 A1, hereinafter Christiansen).
As to dependent claim 8, the rejection of claim 1 is incorporated. Chan, Rohrkemper and Deshpande do not specifically teach determining the cause of the mistake is because an input data is a novel scenario from previous input scenarios, and the evaluated model is unprepared in the novel scenario. However, Christiansen teaches determining the cause of the mistake is because an input data is a novel scenario from previous input scenarios, and the evaluated model is unprepared in the novel scenario.
[determines subpar performance and computes similarity (how novel from past) ¶11-12 "comparing, using a data representation model, the sample data to the training data to determine a similarity score; only if the similarity score is above or equal to a first predetermined similarity threshold, sending the sample data to the machine learning model for processing. In this way, the machine learning inference system is able to evaluate sample data before the sample data is provided to the machine learning model for processing. The machine learning inference system may then restrict sample data which is not deemed suitable for being processed by the machine learning model. This means that the machine learning model is no longer expected to be able to generalize to process sample data that is not sufficiently similar to the training data. As a consequence, the number of inaccurate outputs of machine learning model is dramatically reduced."] Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the model interpretation disclosed by Chan, Rohrkemper and Deshpande by incorporating the determining the cause of the mistake is because an input data is a novel scenario from previous input scenarios, and the evaluated model is unprepared in the novel scenario disclosed by Christiansen, because all techniques address the same field of machine learning and incorporating Christiansen into Chan, Rohrkemper and Deshpande helps maintain the performance of models in the wild [Christiansen ¶4].
Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Chan in view of Rohrkemper and Deshpande as applied in the rejection of claim 1 above, and further in view of Raj et al. (US 20210334695 A1, hereinafter Raj).
As to dependent claim 9, the rejection of claim 1 is incorporated. Chan, Rohrkemper and Deshpande do not specifically teach determining the cause of the mistake is because of concept drift in a relationship between an input data and the predictive data caused because a property of a target variable has changed over time. However, Raj teaches determining the cause of the mistake is because of concept drift in a relationship between an input data and the predictive data caused because a property of a target variable has changed over time. [concept drift from input data change ¶5-6 "Concept drift or domain drift refers to the drift that occurs when the statistical properties of the target variable being modeled have shifted or changed over time in unforeseen ways"] Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the model interpretation disclosed by Chan, Rohrkemper and Deshpande by incorporating the determining the cause of the mistake is because of concept drift in a relationship between an input data and the predictive data caused because a property of a target variable has changed over time disclosed by Raj, because all techniques address the same field of machine learning and incorporating Raj into Chan, Rohrkemper and Deshpande further automates modeling for more consistent results [Raj ¶4].
Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Chan in view of Rohrkemper and Deshpande as applied in the rejection of claim 1 above, and further in view of Grosse et al. (US 20210303695 A1, hereinafter Grosse).
As to dependent claim 11, the rejection of claim 1 is incorporated. Chan, Rohrkemper and Deshpande do not specifically teach determining the cause of the mistake is because the evaluated model is overfitted because the diagnosed machine learning model is unable to generalize away from a narrow band of deep optimizations to extrapolate to a general case. However, Grosse teaches determining the cause of the mistake is because the evaluated model is overfitted because the diagnosed machine learning model is unable to generalize away from a narrow band of deep optimizations to extrapolate to a general case. [overfitting with issues on unseen data (unable to generalize) ¶38 "overfitting refers to the model learning the training data too well, such that the model does not generate good results when processing new, unseen test data. That is, the model may be trained to accurately generate results for the training data, but then does not provide accurate results when presented with new input data that does not significantly match the training data. Such overfitting often occurs as a result of training a model whose capacity is too large when given insufficiently diverse or insufficiently sized training data."] Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the model interpretation disclosed by Chan, Rohrkemper and Deshpande by incorporating the determining the cause of the mistake is because the evaluated model is overfitted because the diagnosed machine learning model is unable to generalize away from a narrow band of deep optimizations to extrapolate to a general case disclosed by Grosse, because all techniques address the same field of machine learning and incorporating Grosse into Chan, Rohrkemper and Deshpande increases the availability of training data for improved training of models [Grosse ¶4].
Claims 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over Chan in view of Rohrkemper and Deshpande as applied in the rejection of claim 1 above, and further in view of Zhang et al. (US 20230019198 A1, hereinafter Zhang).
As to dependent claim 14, the rejection of claim 1 is incorporated. Chan, Rohrkemper and Deshpande do not specifically teach determining the cause of the mistake is because an input data to the evaluated model is based on an intentional attack caused by malignant actors attempting to undermine an integrity of the evaluated model. However, Zhang teaches determining the cause of the mistake is because an input data to the evaluated model is based on an intentional attack caused by malignant actors attempting to undermine an integrity of the evaluated model.
[an attack with bad examples that can undermine (cause misclassifications) ¶3 "attacker intentionally injects small perturbations (also known as adversarial examples) to a DNN's input data to cause misclassifications"] Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the model interpretation disclosed by Chan, Rohrkemper and Deshpande by incorporating the determining the cause of the mistake is because an input data to the evaluated model is based on an intentional attack caused by malignant actors attempting to undermine an integrity of the evaluated model disclosed by Zhang, because all techniques address the same field of machine learning and incorporating Zhang into Chan, Rohrkemper and Deshpande protects models from attacks for improved security [Zhang ¶44].
As to dependent claim 15, the rejection of claim 14 is incorporated, and Chan, Rohrkemper, Deshpande and Zhang further teach wherein the intentional attack is an intentional modification of the input data to the model. [Zhang adversarial examples ¶27, ¶3 "attacker intentionally injects small perturbations (also known as adversarial examples) to a DNN's input data to cause misclassifications"]
Claims 36-38 are rejected under 35 U.S.C. 103 as being unpatentable over Chan in view of Rohrkemper.
As to independent claim 36, Chan teaches a method comprising: determining that a predictive data produced by an evaluated model for a period of time does not match a known trusted data for the period of time; [determines ML predictions do not correlate (match) with actual outcome (known trusted data) ¶70 "In some embodiments, the machine learning model prediction does not correlate with the actual outcome data"] analyzing whether the predictive data produced by the evaluated model for the period of time is an optimal result of the evaluated model; [non-correlation reveals a need to investigate (non-optimal) ¶92 "In the event the global importance value for a feature and the local importance value for the feature do not correlate, the entry with which the prediction is associated may be flagged. In some embodiments, the feature importance model is investigated to determine why the model outputted such values. In the event a threshold number of entries are flagged, the non-linear model may be determined to be inaccurate and adjusted. For example, the global importance value 504a for feature "F18" does not correlate with the local importance value 504b. This indicates that the non-linear model associated with non-linear model graph 500 may need to be adjusted or the feature importance model should be investigated."] explaining a cause [provides reason code (explanation) of decisions using surrogate model to explain a more complex model ¶37, ¶21 "A surrogate model may not only provide a prediction that is similar to the prediction made by the machine learning model, but also provide one or more reasons that describe why the surrogate model made its decision."] fine-tuning [retrain (fine-tune) based on flagging entries (learnings) (flagged another time, ¶92), ¶126 "At 1212, the linear and/or nonlinear model(s) are retrained.
In some embodiments, the linear and/or non-linear surrogate models are retrained in the event a threshold number of entries are flagged"] Chan does not specifically teach determining that a mistake occurred when the predictive data is not the optimal result of the evaluated model; explaining a cause of the mistake using a diagnostic model; comparing the predictive data of the evaluated model with a simulated prediction of the diagnostic model when the mistake is observed to explain the cause of the mistake using the diagnostic model. However, Rohrkemper teaches determining that a mistake occurred when the predictive data is not the optimal result of the e
Read full office action
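The dependent claims quoted in the office action above recite a reduct-ensemble notion of similarity: two instances are "processed in the same way" by a reduct when they fall into the same indiscernibility class, the similarity measure counts how many reducts in the ensemble do so, and the neighborhood of a diagnosed instance weights historical instances by that count. Below is a minimal sketch of that idea, assuming attribute values are already discretized; the attribute names, example reducts, and data are hypothetical, and this is not the applicant's or any cited reference's implementation.

```python
# Illustrative sketch of the reduct-ensemble similarity recited in the dependent
# claims. All names and data below are hypothetical examples.
from collections import Counter

# Each reduct is modeled as the tuple of (discretized) attributes it considers.
REDUCTS = [("temp", "load"), ("temp", "hour"), ("load", "hour")]

def same_class(a: dict, b: dict, reduct: tuple) -> bool:
    """True if a and b fall in the same indiscernibility class for this reduct."""
    return all(a[attr] == b[attr] for attr in reduct)

def similarity(a: dict, b: dict, reducts=REDUCTS) -> int:
    """Count how many reducts in the ensemble process the pair (a, b) the same way."""
    return sum(same_class(a, b, r) for r in reducts)

def neighborhood(instance: dict, history: list, reducts=REDUCTS) -> Counter:
    """Weighted neighborhood: historical instances weighted by how many reducts agree."""
    weights = Counter()
    for idx, past in enumerate(history):
        w = similarity(instance, past, reducts)
        if w:
            weights[idx] = w
    return weights

history = [
    {"temp": "high", "load": "high", "hour": "peak"},
    {"temp": "low",  "load": "high", "hour": "peak"},
    {"temp": "high", "load": "low",  "hour": "off"},
]
current = {"temp": "high", "load": "high", "hour": "off"}

print(neighborhood(current, history))  # e.g. Counter({0: 1, 2: 1})
```

A rejection rationale built on such a neighborhood (for example, an empty or minimal neighborhood suggesting a novel, dissimilar input) is exactly the kind of concrete "how" the §101 analysis says the claims would need to recite with more specificity.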

Prosecution Timeline

Jun 30, 2023
Application Filed
Mar 13, 2026
Non-Final Rejection — §101, §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12595715
Cementing Lab Data Validation based On Machine Learning
2y 5m to grant • Granted Apr 07, 2026
Patent 12596955
REWARD FEEDBACK FOR LEARNING CONTROL POLICIES USING NATURAL LANGUAGE AND VISION DATA
2y 5m to grant • Granted Apr 07, 2026
Patent 12596956
INFORMATION PROCESSING DEVICE AND INFORMATION PROCESSING METHOD FOR PRESENTING REACTION-ADAPTIVE EXPLANATION OF AUTOMATIC OPERATIONS
2y 5m to grant • Granted Apr 07, 2026
Patent 12561464
CATALYST 4 CONNECTIONS
2y 5m to grant • Granted Feb 24, 2026
Patent 12561606
TECHNIQUES FOR POLL INTENTION DETECTION AND POLL CREATION
2y 5m to grant • Granted Feb 24, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.
Powered by AI — typically takes 5-10 seconds

Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 79%
With Interview (+26.6%): 99%
Median Time to Grant: 3y 1m
PTA Risk: Low
Based on 432 resolved cases by this examiner. Grant probability derived from career allow rate.
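As a rough illustration of how the headline figures relate to the career counts reported above, a minimal arithmetic sketch follows; the service's actual methodology is not published here, and reading the "+24.2% vs TC avg" delta as percentage points is an assumption.

```python
# Illustrative arithmetic only: relates the displayed figures to the examiner
# counts shown on this page. The dashboard's real model is not disclosed, and
# the percentage-point reading of "+24.2% vs TC avg" is an assumption.
granted, resolved = 342, 432

career_allow_rate = 100.0 * granted / resolved   # ~79.2%, displayed as 79%
implied_tc_average = career_allow_rate - 24.2    # ~55.0% if the delta is in points

print(f"Career allow rate: {career_allow_rate:.1f}%")
print(f"Implied Tech Center average: {implied_tc_average:.1f}%")
# The 99% "with interview" figure is reported by the page as a separate subgroup
# statistic for resolved cases that included an examiner interview; it is not
# recomputed here.
```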
