Last updated: May 29, 2026
Application No. 18/325,267
CAUSAL EXPLANATION OF ATTENTION-BASED NEURAL NETWORK OUTPUT

Non-Final OA §101 §103
Filed
May 30, 2023
Priority
Sep 15, 2022 — provisional 63/375,825
Examiner
SAX, STEVEN PAUL
Art Unit
2146
Tech Center
2100 — Computer Architecture & Software
Assignee
Intel Corporation
OA Round
1 (Non-Final)
Interview Optional

Examiner has a relatively high allowance rate (70%) and a +44.8% interview lift. A written response may suffice.
Based on 460 resolved cases, 2023–2026
Examiner Intelligence

SAX, STEVEN PAUL
Grants 70% — above average
Career Allowance Rate
320 granted / 460 resolved
+14.6% vs TC avg
Strong interview lift: +44.8% (resolved cases with interview versus without)
Typical timeline
4y 1m
Avg Prosecution
15 currently pending
Career history
482
Total Applications
across all art units
Statute-Specific Performance

Statute    Examiner    vs TC avg
§101       10.7%       -29.3%
§103       77.8%       +37.8%
§102       6.1%        -33.9%
§112       2.3%        -37.7%
Tech Center average is an estimate. Based on career data from 460 resolved cases.
Office Action

§101 §103
Detailed Action
Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

2.	Claims 1-20 are pending.

Claim Rejections - 35 USC § 101
3.	35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


4.	Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more and thus is directed to non-patentable subject matter. Specifically, the claims are directed toward the judicial exception of an abstract idea without reciting additional elements that amount to significantly more than the judicial exception. The rationale for this determination is in accordance with USPTO guidelines, applies to all statutory categories, and is explained in detail below.
When considering subject matter eligibility under 35 U.S.C. 101, (1) it must be determined whether the claim is directed to one of the four statutory categories of invention, i.e., process, machine, manufacture, or composition of matter. If the claim does fall within one of the statutory categories, (2a) it must then be determined whether the claim is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea), and if so, (2b) it must additionally be determined whether the claim is a patent-eligible application of the exception. If an abstract idea is present in the claim, any element or combination of elements in the claim must be sufficient to ensure that the claim amounts to significantly more than the abstract idea itself. Examples of abstract ideas include certain methods of organizing human activity, mental processes, and mathematical concepts.

Step 1:
Per Step 1 of the two-step analysis, the claims include a method (independent claim 1), non-transitory computer readable media (independent claim 11), and an apparatus (independent claim 16), together with their respective dependent claims. Therefore, the claims are directed to a statutory category.

Step 2A, Prong 1:
The independent claims recite:
“extracting one or more matrices from the one or more attention layers” (A person can mentally derive matrices from data in an instruction set);
“generating a causal graph based on the one or more matrices, the causal graph comprising a plurality of elements, each of which represents a respective one of the plurality of variables, wherein one or more connections between the plurality of elements in the causal graph represent one or more causal relationships between the plurality of variables” (Note that this limitation does not actually display anything at all. A person can mentally generate, with the aid of pen and paper, a graph showing causal relationships between user actions represented by data);
“identifying a first variable in the variable set based on the causal graph, the first variable representing a first user action that is determined to be a cause of a second user action represented by a second variable in the variable set” (A person can identify a piece of data representing an action that causes a second action);
“generating an explanation for the output of the pretrained neural network” (A person can think of some reason or explanation for why the output was generated.  Note this may be any reason and based on any mental connection the person may make between the input variables and the output).
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation as a mental process but for the recitation of generic computer components, then it falls under the mental process grouping of abstract ideas. Accordingly, the claim “recites” an abstract idea.
Regarding dependent claims 2, 12, 17, a person can mentally rearrange data to form a variable set from the first and second variables.
Regarding dependent claims 3, 13, 18, a person can mentally assign data labels to represent user actions, predicted actions, etc.
Regarding dependent claims 4, 13, a person can mentally assign data labels to predicted actions, recommendations etc.
Regarding dependent claim 5, a person can mentally label the output data, rank the data, and select a particular piece of data.
Regarding dependent claim 6, the data result can represent a reason or cause or form a logical state in an algorithm.
Regarding dependent claim 7, a person can mentally rearrange data to form a variable set from the first and second variables, and the data result can represent a reason or cause or form a logical state in an algorithm.
Regarding dependent claims 8, 14, 19, a person can mentally perform an algorithm to “measure” or calculate values based on matrix entries, and determine logical state values based on those values.
Regarding dependent claims 9, 15, 20, a person can mentally arrange data based on values and logical states.
Regarding dependent claim 10, a person can mentally arrange data based on values and relationships to other data, and use a sequence ranking algorithm to search for a given piece of data.
All these claim features may be accomplished by applying particular calculations, groupings, inspection, and general manipulation of data. The claims are thus directed to the mental process grouping of abstract ideas because they cover concepts performed in the human mind, including observation, evaluation, judgment, and opinion. See MPEP 2106.04(a)(2), subsection III.

Step 2A, Prong 2:
This judicial exception is not integrated into a practical application. This part of the eligibility analysis evaluates whether the claim as a whole integrates the recited judicial exception into a practical application of the exception or whether the claim is “directed to” the judicial exception. This evaluation is performed by (1) identifying whether there are any additional elements recited in the claim beyond the judicial exception, and (2) evaluating those additional elements individually and in combination to determine whether the claim as a whole integrates the exception into a practical application. See MPEP 2106.04(d). 
The additional elements “inputting a variable set comprising a plurality of variables into a pretrained neural network comprising one or more attention layers, each of the plurality of variables representing a respective user action, the pretrained neural network generating an output”, “inputting an initial variable set into the pretrained neural network, the initial variable set comprising one or more variables that include the first variable, the pretrained neural network outputting the second variable”, and “inputting the new variable set into the pretrained neural network, the pretrained neural network outputting a different recommendation” use a generic computer to gather data and thus are mere instructions to apply the judicial exception using a generic computer.
In addition, all uses of the recited judicial exception require such data gathering and output, and, as such, these limitations do not impose any meaningful limits on the claim. See MPEP 2106.05. These elements are used to perform an abstract idea, as discussed above in Step 2A, Prong One, such that they amount to no more than mere instructions to apply the exception using a generic computer. See MPEP 2106.05(f). MPEP 2106.05(f) provides the following considerations for determining whether a claim simply recites a judicial exception with the words “apply it” (or an equivalent), such as mere instructions to implement an abstract idea on a computer: (1) whether the claim recites only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished; (2) whether the claim invokes computers or other machinery merely as a tool to perform an existing process; and (3) the particularity or generality of the application of the judicial exception.
Thus, under Step 2A, the Examiner holds that the claims are directed to concepts identified as abstract ideas.   
Step 2B:
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. This part of the eligibility analysis evaluates whether the claim as a whole amounts to significantly more than the recited exception, i.e., whether any additional element, or combination of additional elements, adds an inventive concept to the claim. See MPEP 2106.05.
Regarding “inputting a variable set comprising a plurality of variables into a pretrained neural network comprising one or more attention layers, each of the plurality of variables representing a respective user action, the pretrained neural network generating an output”, “inputting an initial variable set into the pretrained neural network, the initial variable set comprising one or more variables that include the first variable, the pretrained neural network outputting the second variable”, and “inputting the new variable set into the pretrained neural network, the pretrained neural network outputting a different recommendation”, these insignificant extra-solution activities are well-understood, routine, and conventional activities. See Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (receiving or transmitting data over a network, e.g., using the Internet to gather data).
Considering the additional elements individually and in combination, and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. Therefore, the claim is not patent eligible.

Claim Rejections - 35 USC § 103
5.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


6.	Claim(s) 1-8, 11-14, and 16-19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Dalli et al “Dalli” (WO 2022129610 A) and Kollias et al “Kollias” (US 11860720 B2) and Xue et al “Xue” (CN 114610856 A).
(Please see the attached copies of Dalli and Kollias and Xue that number paragraphs in the same manner as that used in this Action).

7.	Regarding claim 1, Dalli shows a computer-implemented method, comprising inputting a variable set comprising a plurality of variables into a pretrained neural network comprising one or more attention layers (para 148-149, 263, 440 show the attention layers of the neural network, para 372, 381, 428, 441, 443b for example show inputting variables representing different features into the model, para 264-265 show the neural network may be pretrained), the pretrained neural network generating an output (para 210, 373, 381 for example all show the generated output of the [pretrained] neural network); extracting one or more matrices from the one or more attention layers (para 289-291, 313-314, 402, 443c show deriving matrices including attention data matrices from the attention layers); and generating an explanation for the output of the pretrained neural network (para 292, 313, 320, 404, 443c show generating the explanations for the [pretrained] neural network outputs).
Dalli para 379 and 445f show modeling causal relationships based on the data, which includes the matrix data, but Dalli does not explicitly show generating a causal graph based on the one or more matrices, the causal graph comprising a plurality of elements, each of which represents a respective one of the plurality of variables, wherein one or more connections between the plurality of elements in the causal graph represent one or more causal relationships between the plurality of variables; and identifying a first variable in the variable set based on the causal graph.  Kollias however does show a neural network model generating a causal graph based on the one or more matrices (para 32, 37, 48, 50 show generating the causal graph based on matrix data in a neural network model), the causal graph comprising a plurality of elements, each of which represents a respective one of the plurality of variables (para 32, 41-42, 45 show the causal graph comprising node elements each of which represents particular feature data variables), wherein one or more connections between the plurality of elements in the causal graph represent one or more causal relationships between the plurality of variables (para 32, 41-42, 45, 53 show connections between the nodes represent causal relationships between the feature data variables); and identifying a first variable in the variable set based on the causal graph (para 19, 30, 51, 53, 56 show identifying a particular feature data variable in the dataset using the causal graph). As shown in para 30-32, 42 Kollias uses the causal graph to model causal relationships. It would have been obvious to a person with ordinary skill in the art before the effective filing date of the claimed invention to have used the causal graph as is done in Kollias, in the neural network model of Dalli, because it would provide an efficient way to model the causal relationships among the data variables.
Dalli para 382 shows that the variables may relate to user attributes which could include various actions or interactions, but Dalli and Kollias do not explicitly show each of the plurality of variables represents a respective user action per se such that the first variable represents a first user action that is determined to be a cause of a second user action represented by a second variable in the variable set. Xue however does show each of the plurality of data variables represents a respective user action per se such that the first variable represents a first user action that is determined to be a cause of a second user action represented by a second variable in the variable set (para 83, 128, 131 show a causal graph in which each data variable represents a particular user action, and para 86-88, 92, 136 show the causal graph indicates how one data variable represents the user action causing the reply user action which is represented by another data variable on the causal graph). It would have been obvious to a person with ordinary skill in the art before the effective filing date of the claimed invention to have data variables represent user actions in the causal graph such that a first variable represents a first user action that is determined to be a cause of a second user action represented by a second variable in the variable set, as is done in Xue, in the neural network model of Dalli, especially as modified by the causal graph of Kollias, because it would provide an efficient and direct way to use a causal graph to model the causal relationships among the user actions.
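For orientation, the claim 1 limitation of “extracting one or more matrices from the one or more attention layers” can be pictured with the standard scaled dot-product attention formula. The sketch below is illustrative only: it is not taken from the application, Dalli, Kollias, or Xue, and the function name `attention_matrix` and the toy embeddings are assumptions made for this example.

```python
import math

def attention_matrix(queries, keys):
    """Return softmax(Q K^T / sqrt(d)) as a list of rows (each row sums to 1)."""
    d = len(queries[0])
    rows = []
    for q in queries:
        # Scaled dot product of this query with every key.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        # Subtract the maximum before exponentiating for numerical stability.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows

# Hypothetical embeddings for three user-action variables (assumed data).
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
A = attention_matrix(embeddings, embeddings)
for row in A:
    print([round(w, 3) for w in row])
```

Entry `A[i][j]` is the weight that variable i places on variable j. How the claimed method actually obtains its matrices is not described in this Office action.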

8.	Regarding claim 2, in addition to that mentioned for claim 1, Dalli shows inputting an initial variable set into the pretrained neural network, the initial variable set comprising one or more variables that include the first variable, the pretrained neural network outputting the second variable (para 259, 266, 273 show the variable dataset with the input (first) data variable and outputted (second) data variable from the already trained/pretrained neural network) and forming the variable set that includes the one or more variables and the second variable (para 274, 441, 445c show forming the variable dataset with those input and outputted data variables).

9.	Regarding claim 3, in addition to that mentioned for claim 2, the one or more variables represent one or more historical actions of a user (Xue as mentioned above shows how the variables represent user actions, and para 83, 128 show in fact these may be historical actions of a user – motivation to combine Xue with Dalli as modified by Kollias is the same as that mentioned for claim 1, namely it would have been obvious to a person with ordinary skill in the art before the effective filing date of the claimed invention to have this in the neural network model of Dalli, especially as modified by the causal graph of Kollias, because it would provide an efficient and direct way to use a causal graph to model the causal relationships among the user actions), and the second user action represented by the second variable is a predicted action of the user (Dalli para 263, 281, 328 show that the output represented by the second variable is a predicted output.  That this would be a user action per se is obvious in view of Xue as just explained). 

10.	Regarding claim 4, in addition to that mentioned for claim 1, the second user action is a predicted interaction of a user with an item (Dalli para 382, 416 do show the input and output data variables may be associated with user interaction with an element, and Dalli para 259, 266, 273 show the variable dataset with the input (first) data variable and outputted (second) data variable and Xue as explained for claim 1 shows the variables representing user actions in general; Dalli para 263, 281, 328 furthermore show that the output represented by the second variable is a predicted output); and the output of the pretrained neural network comprises a recommendation of the item to the user (Dalli para 390, 445b show the output of the [pretrained] neural network includes a recommendation – this may be for any of the output types including for the variable input/output pair relating to the user interaction with an element as described above).

11.	Regarding claim 5, in addition to that mentioned for claim 4, the output of the pretrained neural network is a group of recommendations that includes the recommendation and one or more other recommendations (Dalli para 312, 390 show the output of the [pretrained] neural network is a plurality of recommendations), and the recommendation is selected based on a ranking of the group of recommendations (Dalli para 257, 282, 312, 387, 413 show ranking them to select the generated recommendation).

12.	Regarding claim 6, in addition to that mentioned for claim 4, the explanation indicates that the first user action is a reason for the recommendation (Dalli para 413 shows that the explanation indicates an input is the reason for the recommendation, and that the input is related to some type of user action/decision.  Given the combination with Xue, the input would be a first user action). 

13.	Regarding claim 7, in addition to that mentioned for claim 6, Dalli shows forming a new variable set that comprises the one or more variables other than the first variable, inputting the new variable set into the pretrained neural network, the pretrained neural network outputting a different recommendation (Dalli para 259, 266, 273 show the variable dataset with the input (first) data variable and outputted (second) data variable from the already trained/pretrained neural network, Dalli para 274, 416, 441, 445c show forming the variable datasets with those input and outputted data variables, and this includes with different inputs which thus generate different outputs accordingly. The generated recommendations in para 312, 390 would thus be different recommendations), wherein the explanation further indicates that the different recommendation would have been made if the first user action represented by the first variable was not performed (Dalli para 410, 413 show the explanation indicates a different recommendation applying to a different first input represented by the [different] first variable.  Given the combination with Xue, this would represent a different first user action).
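The counterfactual step recited in claim 7 (drop the first variable, re-run the model, report the different recommendation) can be sketched as follows. This is a minimal illustration under stated assumptions: `recommend` is a hypothetical rule-based stand-in for a pretrained network, and the action and item names are invented for the example; none of it comes from the application or from Dalli.

```python
def recommend(actions):
    """Hypothetical stand-in for a pretrained recommender (toy rule, not a real model)."""
    if "viewed_tent" in actions:
        return "sleeping_bag"
    return "bestseller_novel"

def counterfactual_explanation(actions, cause, model):
    """Re-run the model without `cause` and describe how the output changes."""
    original = model(actions)
    # New variable set: every input variable other than the candidate cause.
    reduced = [a for a in actions if a != cause]
    alternative = model(reduced)
    if alternative == original:
        return f"Removing '{cause}' does not change the recommendation '{original}'."
    return (f"'{original}' was recommended because of '{cause}'; "
            f"without it, '{alternative}' would have been recommended.")

history = ["viewed_tent", "read_review"]
print(counterfactual_explanation(history, "viewed_tent", recommend))
```

The explanation is only as meaningful as the choice of `cause`; in the claims that choice comes from the causal graph rather than from trying every input.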

14.	Regarding claim 8, in addition to that mentioned for claim 1, Kollias shows the generating the causal graph based on the one or more matrices comprises: measuring conditional independence between the plurality of variables based on the one or more matrices (Kollias para 54 shows using conditional independence as a technique when generating the causal graph in para 32, 37, 48, 50); and determining the one or more connections in the causal graph based on the conditional independence between the plurality of variables (Kollias para 54 shows how the conditional independence is a technique used when determining connections in the causal graph in para 32, 41-42, 45, 53).
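One common way to turn conditional-independence measurements into graph connections is a partial-correlation test in the style of the PC algorithm's skeleton phase. The sketch below is an assumption-laden illustration, not Kollias's method or the applicant's: it treats a hypothetical symmetric matrix `R` as pairwise correlations (off-diagonal magnitudes below 1), conditions on at most one other variable, and uses an arbitrary threshold.

```python
import math
from itertools import combinations

def partial_corr(r, i, j, k):
    """First-order partial correlation of variables i and j given k."""
    num = r[i][j] - r[i][k] * r[j][k]
    den = math.sqrt((1 - r[i][k] ** 2) * (1 - r[j][k] ** 2))
    return num / den

def skeleton(r, threshold=0.1):
    """Keep edge (i, j) unless i and j look independent, marginally or given one other variable."""
    n = len(r)
    edges = set()
    for i, j in combinations(range(n), 2):
        if abs(r[i][j]) < threshold:
            continue  # marginally independent
        others = [k for k in range(n) if k not in (i, j)]
        if any(abs(partial_corr(r, i, j, k)) < threshold for k in others):
            continue  # conditionally independent given some k
        edges.add((i, j))
    return edges

# Hypothetical correlations consistent with a chain 0 -> 1 -> 2 (assumed data).
R = [[1.0, 0.8, 0.64],
     [0.8, 1.0, 0.8],
     [0.64, 0.8, 1.0]]
print(sorted(skeleton(R)))
```

Here variables 0 and 2 are correlated but become independent once variable 1 is conditioned on, so the direct 0-2 connection is dropped. Orienting the remaining edges would need further steps that this sketch omits.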

15.	Claims 11-12 and 14 show the same features as claims 1-2 and 8 respectively, and are rejected for the same reasons.  In addition, note that Dalli shows the non-transitory computer readable media storing instructions to perform the method operations (Dalli para 176 shows the non-transitory storage media such as disks storing the method instructions).

16.	Claim 13 shows the same features as claims 4 and 6 combined, and is rejected for the combined reasons for which claims 4 and 6 are rejected.

17.	Claims 16-17 and 19 show the same features as claims 1-2 and 8 respectively, and are rejected for the same reasons. In addition, note that Dalli para 176 shows the computer processor for executing computer program instructions, and the non-transitory computer readable media storing computer program instructions executable by the processor to perform the method operations (Dalli para 176 shows the non-transitory storage media such as disks storing the computer program instructions which are executed by the processor to perform the method operations).

18.	Claim 18 shows the same features as claims 4 and 6 combined, and is rejected for the combined reasons for which claims 4 and 6 are rejected.

19.	Claim(s) 9-10, 15, and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Dalli and Kollias and Xue and Jezewski et al “Jezewski” (US 2022/0075793 A1).

20.	Regarding claim 9, Dalli and Kollias and Xue do not explicitly show identifying the first variable in the variable set based on the causal graph comprises generating a tree structure comprising the plurality of elements based on the causal graph by arranging an element representing the second variable as a root of the tree; and arranging other elements of the plurality of elements around the root of the tree based on the one or more causal relationships between the plurality of variables.  Jezewski however does show identifying the first variable in the variable set based on the causal graph comprises generating a tree structure comprising the plurality of elements based on the causal graph by arranging an element representing the second variable as a root of the tree (para 62, 83, 210 show the causal graph and para 53, 925-928, 1494 show the tree structure identifies data variables on it based on the causal relationships and that an output may be represented as a root of the tree, para 882, 912 further show how the root relates to a changed output); and arranging other elements of the plurality of elements around the root of the tree based on the one or more causal relationships between the plurality of variables (para 53, 925-928, 1494 show the elements around the tree root are based on the causal relationships of the variables).  It would have been obvious to a person with ordinary skill in the art before the effective filing date of the claimed invention to have this tree structure as in Jezewski, in the neural network model of Dalli, especially as modified by the causal graph of Kollias and Xue, because it would provide an efficient and useful way to use a causal graph to model the causal relationships among the user actions.  Given the combination, the output would be represented by the second variable in Dalli. 

21.	Regarding claim 10, in addition to that mentioned for claim 9, Jezewski shows identifying the first variable in the variable set based on the causal graph further comprises dividing the other elements into a plurality of groups based on distances from each of the other elements to the root of the tree, each group comprising one or more of the other elements (Jezewski para 648, 899, 954, 1138-1140, 1498 show how the elements are categorized by their distances to the root of the tree, and each category may be considered a group with at least that particular element, if not more); and searching for the first variable in each of the plurality of groups based on a sequence in accordance with the distances (Jezewski para 234, 1498 show searching based on distance of the element from the root). Motivation is the same as that mentioned for claim 9, namely that it would have been obvious to a person with ordinary skill in the art before the effective filing date of the claimed invention to use this tree structure technique as in Jezewski, in the neural network model of Dalli, especially as modified by the causal graph of Kollias and Xue, because it would provide an efficient and useful way to use a causal graph to model the causal relationships among the user actions.
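The tree arrangement of claims 9 and 10 (second variable at the root, other elements grouped by distance to the root, groups searched nearest first) can be sketched with a breadth-first traversal. The code is illustrative and makes several assumptions: the dictionary `causes_of`, the action names, and the matching predicate are hypothetical and are not drawn from the application or from Jezewski.

```python
from collections import deque

def group_by_distance(causes_of, root):
    """Breadth-first walk from the root; return {distance: [elements]} without the root."""
    dist = {root: 0}
    groups = {}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for cause in causes_of.get(node, []):
            if cause not in dist:
                dist[cause] = dist[node] + 1
                groups.setdefault(dist[cause], []).append(cause)
                queue.append(cause)
    return groups

def find_first(groups, is_match):
    """Search groups in order of increasing distance; return (element, distance) or None."""
    for d in sorted(groups):
        for node in groups[d]:
            if is_match(node):
                return node, d
    return None

# Hypothetical causal graph: each key maps an action to its direct causes (assumed data).
causes_of = {
    "purchase": ["add_to_cart", "view_item"],
    "add_to_cart": ["search"],
    "view_item": ["search"],
}
groups = group_by_distance(causes_of, "purchase")
print(groups)
print(find_first(groups, lambda n: n == "search"))
```

Searching nearer groups first means a direct cause is found before an indirect one when both satisfy the predicate.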

22.	Claims 15 and 20 each show the same features as claim 9, and each is rejected for the same reasons.

23.	The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
a) Burachas (US 2019/0370587) shows attention-based explanations for artificial intelligence behavior.
b) Le (US 2023/0186072 A1) shows extracting explanations from attention-based models.

24.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to STEVEN PAUL SAX whose telephone number is (571)272-4072. The examiner can normally be reached Monday - Friday, 9:30 - 6:00 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Usmaan Saeed, can be reached at 571-272-4046. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/STEVEN P SAX/Primary Examiner, Art Unit 2146
Prosecution Timeline

May 30, 2023
Application Filed
Jul 18, 2023
Response after Non-Final Action
Mar 21, 2026
Non-Final Rejection (signed) — §101, §103
Apr 21, 2026
Non-Final Rejection mailed — §101, §103 (current)
Precedent Cases

Applications granted by this same examiner with similar technology

18/329,649
Patent 12632763
TIME-DOMAIN MULTIPLEXING OF QUANTUM BIT CONTROL SIGNALS
2y 11m to grant Granted May 19, 2026
18/224,155
Patent 12626152
ARTIFICIAL INTELLIGENCE-BASED AUDITORS OF ARTIFICIAL INTELLIGENCE
2y 9m to grant Granted May 12, 2026
18/068,726
Patent 12614202
PARAMETERS FOR GENERATING AN AUTOMATED SURVEY
3y 4m to grant Granted Apr 28, 2026
18/762,169
Patent 12613718
INFORMATION PROCESSING METHODS AND APPARATUS, ELECTRONIC DEVICES, AND STORAGE MEDIA
1y 10m to grant Granted Apr 28, 2026
19/231,250
Patent 12602537
METHODS FOR SERVING INTERACTIVE CONTENT TO A USER
10m to grant Granted Apr 14, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 70%
With Interview (+44.8%): 99%
Median Time to Grant: 4y 1m (~1y 1m remaining)
PTA Risk: Low
Based on 460 resolved cases by this examiner. Grant probability derived from career allowance rate.