Office Action Analysis: 18062673 — GENERATING CAUSAL ASSOCIATION RANKINGS USING DYNAMIC EMBEDDINGS

Office Action

§101 §103
DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is in response to amendments filed December 16th, 2025, in which claims 1, 3, 8, 10, 15, and 17 have been amended. No claims have been cancelled or added. The amendments to the claims and specification have been entered, and claims 1-20 are currently pending in the case. Claims 1, 8, and 15 are independent claims.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Regarding claim 1:
Step 1: Claim 1 is directed to a computer-based method; therefore, it falls under the statutory category of a process.
Step 2A Prong 1: The claim recites, in part: 
“…generating contrastive windows of candidate events of the first type preceding the one or more target events of interest, each of the generated contrastive windows of candidate events of the first type corresponding to a different dropped candidate event of the first type from the received window of candidate events” this encompasses the mental creation of a contrastive window based on observed events.
“…generating resulting embeddings for each of the generated contrastive windows of candidate events” this limitation is a mathematical concept. 
“…identifying matching historical windows of candidate events, the matching historical windows of candidate events including events of the first type preceding the one or more target events of interest of the second type” this encompasses the mental identification of matching observed historical windows.
“having resulting embeddings that are below a predetermined threshold distance away from the resulting embeddings generated for a given generated contrastive window of candidate events;” this limitation is a mathematical concept. 
“in response to identifying the matching historical windows of candidate events, …calculating… a corresponding first score for each of the matching historical window of candidate events” this limitation is a mathematical concept. 
“identifying matching incident windows corresponding to each identified matching historical window” this encompasses the mental identification of matching incident windows of observed historical windows. 
“the matching incident windows having features or textual data corresponding to events of the second type that are above a predetermined similarity value threshold when compared to features or textual data corresponding to the one or more target events of interest, and subsequently calculating corresponding second scores for each of the identified matching incident windows” this limitation is a mathematical concept. 
“determining…a causal impact that a given event of the first type has on bringing about a target event of interest of the second type” this encompasses the mental determination of a causal impact that an observed event of a first type has on bringing about an observed event of a second type.
“calculating combined causal association scores using the first scores and the second scores, and using the combined causal association scores to generate causal association rankings for the events of the first type in the received window of candidate events” this limitation is a mathematical concept. 
Step 2A Prong 2: The judicial exception is not integrated into a practical application; the remaining limitations of the claim are as follows: “automatically…[receiving]”, “automatically…[generating]” (lines 6 and 10), “automatically…[identifying]” (lines 12 and 20), “automatically…[calculating]” (lines 18 and 26) the limitations are an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or merely uses a computer in its ordinary capacity as a tool to perform an existing process. See MPEP § 2106.05(f)(2). “receiving a window of candidate events including events of a first type preceding one or more target events of interest of a second type, wherein the window of candidate events further comprises an event sequence including a collection of time-stamped descriptions of events occurring within a computer system, wherein the events of the first type further comprises specific alerts associated with the computer system, and wherein the target events of interest of the second type comprise outcomes or incidents associated with the computer system”, “automatically receiving a window of candidate events including events of a first type preceding one or more target events of interest of a second type, wherein the target events of interest of the second type comprise outcomes or incidents”, “and storing…[a first score]” these limitations are an additional element that amounts to adding insignificant extra-solution activity to the judicial exception.  See MPEP § 2106.05(g). “for the computer system” the limitation is an additional element that generally links the use of the judicial exception to a particular technological environment or field of use. See MPEP § 2106.05(h). 
Step 2B: The additional elements “automatically…[receiving]”, “automatically…[generating]” (lines 6 and 10), “automatically…[identifying]” (lines 12 and 20), “automatically…[calculating]” (lines 18 and 26), “for the computer system” taken individually and in combination, do not provide an inventive concept of significantly more than the abstract idea itself for the reasons set forth in step 2A prong 2 above. Further, “receiving a window of candidate events including events of a first type preceding one or more target events of interest of a second type, wherein the window of candidate events further comprises an event sequence including a collection of time-stamped descriptions of events occurring within a computer system, wherein the events of the first type further comprises specific alerts associated with the computer system, and wherein the target events of interest of the second type comprise outcomes or incidents associated with the computer system”, “and storing…[a first score]” these limitations are an additional element that amounts to adding insignificant extra-solution activity to the judicial exception.  See MPEP § 2106.05(g). Furthermore the additional element is directed to storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015). See MPEP § 2106.05(d)/(II). Therefore, the claim is ineligible.
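
The contrastive-window limitation quoted under Step 2A Prong 1 of claim 1 (one window per dropped candidate event) can be pictured with a minimal sketch. It assumes a window is simply an ordered list of event labels; the function name `contrastive_windows` and the alert strings are hypothetical and do not come from the application or the cited art.

```python
from typing import List, Sequence, Tuple


def contrastive_windows(window: Sequence[str]) -> List[Tuple[str, List[str]]]:
    """Return one (dropped_event, remaining_window) pair per candidate event.

    Each remaining window is the received window with exactly one
    candidate event removed, preserving the order of the other events.
    """
    return [
        (window[i], list(window[:i]) + list(window[i + 1:]))
        for i in range(len(window))
    ]


if __name__ == "__main__":
    # Hypothetical alert labels, used only for illustration.
    alerts = ["disk_latency_high", "cpu_saturation", "pod_restart"]
    for dropped, remaining in contrastive_windows(alerts):
        print(dropped, "->", remaining)
```

A window of n candidate events yields n contrastive windows, each of length n - 1, which is the set the claim then embeds and compares against historical windows.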

Regarding claim 2, the rejection of claim 1 is incorporated and further:
Step 2A Prong 1: A continuation of the abstract idea identified in the parent claim.
Step 2A Prong 2: The judicial exception is not integrated into a practical application; the remaining limitations of the claim are as follows: “the generated contrastive windows of candidate events comprise dynamic embeddings generated using one of, a recurrent neural network, a convolutional neural network, a transformer network, or a fully connected network” the limitation is an additional element that generally links the use of the judicial exception to a particular technological environment or field of use. See MPEP § 2106.05(h). 
Step 2B: The additional elements, taken individually and in combination, do not provide an inventive concept of significantly more than the abstract idea itself for the reasons set forth in step 2A prong 2 above. Therefore, the claim is ineligible.

Regarding claim 3, the rejection of claim 1 is incorporated and further:
Step 2A Prong 1: The claim recites, in part:
“utilizing a kernel-based Maximum Mean Discrepancy (MMD) process to determine the distance between embeddings of counterfactual queries corresponding to generated contrastive windows being considered and embeddings corresponding to a given historical candidate window” this limitation is a mathematical concept. 
Step 2A Prong 2: The judicial exception is not integrated into a practical application; the remaining limitations of the claim are as follows: “automatically…[utilizing]” the limitation is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or merely uses a computer in its ordinary capacity as a tool to perform an existing process. See MPEP § 2106.05(f)(2). 
Step 2B: The additional elements, taken individually and in combination, do not provide an inventive concept of significantly more than the abstract idea itself for the reasons set forth in step 2A prong 2 above. Therefore, the claim is ineligible.
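
For reference, the kernel-based MMD distance recited in claim 3 can be sketched as follows. This is a minimal sketch assuming a Gaussian (RBF) kernel and the biased estimator of squared MMD; the kernel choice, the `gamma` bandwidth, and the function names are assumptions made for illustration and are not taken from the claims or the specification.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def rbf_kernel(x: Vector, y: Vector, gamma: float = 1.0) -> float:
    """Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mmd_squared(xs: Sequence[Vector], ys: Sequence[Vector], gamma: float = 1.0) -> float:
    """Biased estimate of squared MMD between two samples of embeddings.

    MMD^2 = mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)
    """
    def mean_kernel(a: Sequence[Vector], b: Sequence[Vector]) -> float:
        total = sum(rbf_kernel(p, q, gamma) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


if __name__ == "__main__":
    query = [[0.0, 0.0], [1.0, 0.0]]       # embeddings of a contrastive window
    historical = [[5.0, 5.0], [6.0, 5.0]]  # embeddings of a historical window
    print(mmd_squared(query, historical))
```

Under this sketch, a historical window would count as "matching" when `mmd_squared(query, historical)` falls below a chosen threshold; the threshold value itself is not specified in the quoted claim language.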

Regarding claim 4, the rejection of claim 3 is incorporated and further:
Step 2A Prong 1: The claim recites, in part:
“the given historical windows of candidate events are indexed in a hierarchical structure based on pre-computed summary vectors, the pre-computed summary vectors being computed using an average of Random Fourier Features corresponding to the given historical candidate windows” This limitation is a mathematical concept. 
Step 2A Prong 2: The claim does not recite any additional limitations and thus does not recite any additional elements that integrate the judicial exception into a practical application or amount to significantly more.
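
The summary-vector computation recited in claim 4 (an average of Random Fourier Features over a window) can be sketched as below. The sketch assumes the standard random-feature construction for a Gaussian kernel exp(-gamma * ||x - y||^2); the names `make_rff` and `summary_vector`, the feature count, and the seed are hypothetical. The hierarchical index that the claim builds over these vectors is not shown.

```python
import math
import random
from typing import Callable, List, Sequence

Vector = Sequence[float]


def make_rff(dim: int, n_features: int, gamma: float = 1.0,
             seed: int = 0) -> Callable[[Vector], List[float]]:
    """Build a random Fourier feature map for the kernel exp(-gamma * ||x - y||^2)."""
    rng = random.Random(seed)
    scale = math.sqrt(2.0 * gamma)  # standard deviation of the random frequencies
    weights = [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(n_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    norm = math.sqrt(2.0 / n_features)

    def features(x: Vector) -> List[float]:
        return [
            norm * math.cos(sum(w_j * x_j for w_j, x_j in zip(w, x)) + b)
            for w, b in zip(weights, offsets)
        ]

    return features


def summary_vector(window: Sequence[Vector],
                   features: Callable[[Vector], List[float]]) -> List[float]:
    """Average the random Fourier features of every embedding in a window."""
    mapped = [features(x) for x in window]
    return [sum(column) / len(mapped) for column in zip(*mapped)]


if __name__ == "__main__":
    rff = make_rff(dim=2, n_features=64, gamma=0.5, seed=1)
    print(summary_vector([[0.0, 1.0], [1.0, 0.0]], rff)[:4])
```

With this construction, the squared Euclidean distance between two summary vectors approximates the biased squared MMD between the corresponding windows under the same kernel, which is why such vectors can be pre-computed once per historical window.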

Regarding claim 5, the rejection of claim 1 is incorporated and further:
Step 2A Prong 1: The claim recites, in part:
“the corresponding second scores for each of the identified matching incident windows comprise Bilingual Evaluation Understudy (BLEU) scores” this limitation is a mathematical concept. 
Step 2A Prong 2: The judicial exception is not integrated into a practical application; the remaining limitations of the claim are as follows:
Step 2B: The additional elements, taken individually and in combination, do not provide an inventive concept of significantly more than the abstract idea itself for the reasons set forth in step 2A prong 2 above. Therefore, the claim is ineligible.

Regarding claim 6, the rejection of claim 1 is incorporated and further:
Step 2A Prong 1: A continuation of the abstract idea identified in the parent claim. 
Step 2A Prong 2: The judicial exception is not integrated into a practical application; the remaining limitations of the claim are as follows: “utilizing a long short-term memory (LSTM) based encoder-decoder architecture trained as an auto-encoder that is configured to use unlabeled data and to transform each of the generated contrastive windows into a vector distilling both temporal and structural information” the limitation is an additional element that generally links the use of the judicial exception to a particular technological environment or field of use. See MPEP § 2106.05(h). 
Step 2A Prong 2: The judicial exception is not integrated into a practical application; the remaining limitations of the claim are as follows: “automatically…[utilizing]” the limitation is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or merely uses a computer in its ordinary capacity as a tool to perform an existing process. See MPEP § 2106.05(f)(2). 
Step 2B: The additional elements, taken individually and in combination, do not provide an inventive concept of significantly more than the abstract idea itself for the reasons set forth in step 2A prong 2 above. Therefore, the claim is ineligible.

Regarding claim 7, the rejection of claim 1 is incorporated and further:
Step 2A Prong 1: A continuation of the abstract idea identified in the parent claim. 
Step 2A Prong 2: The judicial exception is not integrated into a practical application; the remaining limitations of the claim are as follows: “automatically outputting to a user a list of individual events of the first type from the window of candidate events, the list of individual events being sorted by their respective generated causal association rankings” the limitation is an additional element that amounts to adding insignificant extra-solution activity to the judicial exception.  See MPEP § 2106.05(g). 
Step 2B: The claim does not contain significantly more than the judicial exception. “automatically outputting to a user a list of individual events of the first type from the window of candidate events, the list of individual events being sorted by their respective generated causal association rankings” the limitation is an additional element that amounts to adding insignificant extra-solution activity to the judicial exception.  See MPEP § 2106.05(g).  Furthermore the additional element is directed to receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d as well as storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015). See MPEP § 2106.05(d)/(II). Therefore, the claim is ineligible.

Regarding claim 8:
Step 1: Claim 8 is directed to a computer system; therefore, it falls under the statutory category of a machine.
Step 2A Prong 1: The claim recites, in part: “generating contrastive windows of candidate events of the first type preceding the one or more target events of interest, each of the generated contrastive windows of candidate events of the first type corresponding to a different dropped candidate event of the first type from the received window of candidate events” this encompasses the mental creation of a contrastive window based on observed events.
“generating resulting embeddings for each of the generated contrastive windows of candidate events” this limitation is a mathematical concept. 
“identifying matching historical windows of candidate events, the matching historical windows of candidate events including events of the first type preceding the one or more target events of interest of the second type” this encompasses the mental identification of matching observed historical windows.
“having resulting embeddings that are below a predetermined threshold distance away from the resulting embeddings generated for a given generated contrastive window of candidate events;” this limitation is a mathematical concept. 
“in response to identifying the matching historical windows of candidate events, …calculating…a corresponding first score for each of the matching historical window of candidate events” this limitation is a mathematical concept. 
“identifying matching incident windows corresponding to each identified matching historical window” this encompasses the mental identification of matching incident windows of observed historical windows. 
“the matching incident windows having features or textual data corresponding to events of the second type that are above a predetermined similarity value threshold when compared to features or textual data corresponding to the one or more target events of interest, and subsequently calculating corresponding second scores for each of the identified matching incident windows” this limitation is a mathematical concept. 
“determining…a causal impact that a given event of the first type has on bringing about a target event of interest of the second type” this encompasses the mental determination of a causal impact that an observed event of a first type has on bringing about an observed event of a second type.
“calculating combined causal association scores using the first scores and the second scores, and using the combined causal association scores to generate causal association rankings for the events of the first type in the received window of candidate events” this limitation is a mathematical concept. 
Step 2A Prong 2: The judicial exception is not integrated into a practical application; the remaining limitations of the claim are as follows: “one or more processors, one or more computer-readable memories, one or more computer-readable tangible storage medium, and program instructions stored on at least one of the one or more computer-readable tangible storage medium for execution by at least one of the one or more processors via at least one of the one or more computer-readable memories, wherein the computer system is capable of performing a method”, “automatically…[receiving]”, “automatically…[generating]” (lines 10 and 14), “automatically…[identifying]” (lines 16 and 24), “automatically…[calculating]” (lines 22 and 30) the limitation is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or merely uses a computer in its ordinary capacity as a tool to perform an existing process. See MPEP § 2106.05(f)(2). “receiving a window of candidate events including events of a first type preceding one or more target events of interest of a second type, wherein the window of candidate events further comprises an event sequence including a collection of time-stamped descriptions of events occurring within a computer system, wherein the events of the first type further comprises specific alerts associated with the computer system, and wherein the target events of interest of the second type comprise outcomes or incidents associated with the computer system”, “and storing…[a first score]” these limitations are an additional element that amounts to adding insignificant extra-solution activity to the judicial exception.  See MPEP § 2106.05(g). “for the computer system” the limitation is an additional element that generally links the use of the judicial exception to a particular technological environment or field of use. See MPEP § 2106.05(h).
Step 2B: The additional elements “one or more processors, one or more computer-readable memories, one or more computer-readable tangible storage medium, and program instructions stored on at least one of the one or more computer-readable tangible storage medium for execution by at least one of the one or more processors via at least one of the one or more computer-readable memories, wherein the computer system is capable of performing a method”, “automatically…[receiving]”, “automatically…[generating]” (lines 10 and 14), “automatically…[identifying]” (lines 16 and 24), “automatically…[calculating]” (lines 22 and 30), “for the computer system” taken individually and in combination, do not provide an inventive concept of significantly more than the abstract idea itself for the reasons set forth in step 2A prong 2 above. Further, “receiving a window of candidate events including events of a first type preceding one or more target events of interest of a second type, wherein the window of candidate events further comprises an event sequence including a collection of time-stamped descriptions of events occurring within a computer system, wherein the events of the first type further comprises specific alerts associated with the computer system, and wherein the target events of interest of the second type comprise outcomes or incidents associated with the computer system”, “and storing…[a first score]” these limitations are an additional element that amounts to adding insignificant extra-solution activity to the judicial exception.  See MPEP § 2106.05(g). Furthermore the additional element is directed to storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015). See MPEP § 2106.05(d)/(II). Therefore, the claim is ineligible.

Regarding claims 9-14:
The rejection of claim 8 is incorporated; the rejections of claims 2-7 are applicable to claims 9-14, respectively.

Regarding claim 15:
Step 1: Claim 15 is directed to a computer program product; therefore, it falls under the statutory category of a manufacture.
Step 2A Prong 1: The claim recites, in part: “generating contrastive windows of candidate events of the first type preceding the one or more target events of interest, each of the generated contrastive windows of candidate events of the first type corresponding to a different dropped candidate event of the first type from the received window of candidate events” this encompasses the mental creation of a contrastive window based on observed events.
“generating resulting embeddings for each of the generated contrastive windows of candidate events” this limitation is a mathematical concept. 
“identifying matching historical windows of candidate events, the matching historical windows of candidate events including events of the first type preceding the one or more target events of interest of the second type” this encompasses the mental identification of matching observed historical windows.
“having resulting embeddings that are below a predetermined threshold distance away from the resulting embeddings generated for a given generated contrastive window of candidate events;” this limitation is a mathematical concept. 
“in response to identifying the matching historical windows of candidate events, …calculating…a corresponding first score for each of the matching historical window of candidate events” this limitation is a mathematical concept. 
“identifying matching incident windows corresponding to each identified matching historical window” this encompasses the mental identification of matching incident windows of observed historical windows. 
“the matching incident windows having features or textual data corresponding to events of the second type that are above a predetermined similarity value threshold when compared to features or textual data corresponding to the one or more target events of interest, and subsequently calculating corresponding second scores for each of the identified matching incident windows” this limitation is a mathematical concept. 
“determining…a causal impact that a given event of the first type has on bringing about a target event of interest of the second type” this encompasses the mental determination of a causal impact that an observed event of a first type has on bringing about an observed event of a second type.
“calculating combined causal association scores using the first scores and the second scores, and using the combined causal association scores to generate causal association rankings for the events of the first type in the received window of candidate events” this limitation is a mathematical concept. 
Step 2A Prong 2: The judicial exception is not integrated into a practical application; the remaining limitations of the claim are as follows: “automatically…[receiving]”, “automatically…[generating]” (lines 8 and 12), “automatically…[identifying]” (lines 14 and 22), “automatically…[calculating]” (lines 20 and 28) the limitations are an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or merely uses a computer in its ordinary capacity as a tool to perform an existing process. See MPEP § 2106.05(f)(2). “receiving a window of candidate events including events of a first type preceding one or more target events of interest of a second type, wherein the window of candidate events further comprises an event sequence including a collection of time-stamped descriptions of events occurring within a computer system, wherein the events of the first type further comprises specific alerts associated with the computer system, and wherein the target events of interest of the second type comprise outcomes or incidents associated with the computer system”, “and storing…[a first score]” these limitations are an additional element that amounts to adding insignificant extra-solution activity to the judicial exception.  See MPEP § 2106.05(g). “for the computer system” the limitation is an additional element that generally links the use of the judicial exception to a particular technological environment or field of use. See MPEP § 2106.05(h).
Step 2B: The additional elements “automatically…[receiving]”, “automatically…[generating]” (lines 8 and 12), “automatically…[identifying]” (lines 14 and 22), “automatically…[calculating]” (lines 20 and 28), “for the computer system” taken individually and in combination, do not provide an inventive concept of significantly more than the abstract idea itself for the reasons set forth in step 2A prong 2 above. Further, “receiving a window of candidate events including events of a first type preceding one or more target events of interest of a second type, wherein the window of candidate events further comprises an event sequence including a collection of time-stamped descriptions of events occurring within a computer system, wherein the events of the first type further comprises specific alerts associated with the computer system, and wherein the target events of interest of the second type comprise outcomes or incidents associated with the computer system”, “and storing…[a first score]” these limitations are an additional element that amounts to adding insignificant extra-solution activity to the judicial exception.  See MPEP § 2106.05(g). Furthermore the additional element is directed to storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015). See MPEP § 2106.05(d)/(II). Therefore, the claim is ineligible.

Regarding claims 16-20:
The rejection of claim 15 is incorporated; the rejections of claims 2-6 are applicable to claims 16-20, respectively.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3, 7, 8, 10, 14, 15 and 17 are rejected under 35 U.S.C. § 103 as being unpatentable over Muandet et al. (“Counterfactual Mean Embeddings”, Muandet et al., June 2021) (hereinafter “Muandet”) in view of Ramachandra (“Deep Learning for Causal Inference”, Ramachandra, 1 March 2018) (as cited in the IDS) in further view of Arya et al. (“Evaluation of Causal Inference Techniques for AIOps”, Arya et al., 02 January 2021) (as cited in the IDS; hereinafter “Arya”).

Regarding claim 1:
Muandet teaches [a] computer-based method of generating causal association rankings for candidate events comprising:
automatically receiving a window of candidate events including events of a first type preceding one or more target events of interest of a second type… (Muandet, page 27, section 6, ¶1 “For instance, consider a recommendation system, where an action is a list of items to be recommended to a user, and a policy determines which action to take, given the features of the user. There will be a positive reward if the user clicks or buys one of the recommended items, and no reward otherwise.” Here the action which contains items can be considered a window of candidate events and the user buying the recommended item can be considered the outcome); 
automatically identifying matching historical windows of candidate events, the matching historical windows of candidate events including events of the first type preceding the one or more target events of interest of the second type (Muandet, page 28, section 6.1, ¶2 “Assume that tuples (u_i, a_i, r_i)_{i=1}^{n} ⊂ U × A × R of context features u_i ∈ U, action a_i ∈ A and reward r_i ∈ R are available as logged (or historical) data.” Here, the available logged data with context features and action can be considered matching historical windows), and having resulting embeddings that are below a predetermined threshold distance away from the resulting embeddings generated for a given generated contrastive window of candidate events (Muandet, page 4, ¶2 “The proposed estimator can be used for computing a distance between the counterfactual and controlled distributions, thereby providing a way of quantifying the effect of a treatment to the distribution of outcomes; we define this distance as the maximum mean discrepancy (MMD) (Borgwardt et al., 2006; Gretton et al., 2012) between the counterfactual and controlled distributions.” Here, the distance between counterfactual and controlled distributions can be considered the distance away from resulting embeddings.);
in response to identifying the matching historical windows of candidate events, automatically calculating and storing a corresponding first score for each of the matching historical window of candidate events (Muandet, pages 36-37, section 7.2, “Let η : U × A → R be the regression function that takes a pair (u, a) of user features u ∈ U and recommendation a ∈ A as an input and outputs the conditional expectation of the reward r” here, the user features can be considered the historical window, the output of the regression function, the conditional expectation of the reward can be considered a first score);
automatically identifying matching incident windows corresponding to each identified matching historical window, the matching incident windows having features or textual data corresponding to events of the second type that are above a predetermined similarity value threshold (Muandet, pages 39-40, section 7.2.1, ¶2 “The parameter α controls how similar the policies are. If α = 1, we obtain π0 = π∗, whereas π0 and π∗ differ the most when α = −1.” Here, α can be considered a predetermined similarity threshold) when compared to features or textual data corresponding to the one or more target events of interest (Muandet, page 28, ¶2 “Let π∗(a|u) be another conditional distribution of actions a ∈ A given context features u ∈ U, which represents the target policy that one wants to evaluate. By design, the target policy is known and sampling from it is possible. Let q∗(u) be a probability distribution on U, which represents the distribution of context features under the target environment (e.g., the distribution of user features when a recommendation system is deployed). In the standard OPE setting, it is typically assumed that q0(u) = q∗(u), i.e., the historical and target environments are the same;”), and subsequently calculating corresponding second scores for each of the identified matching incident windows (Muandet, page 17, ¶1 “We show here that the same strategy of inverse propensity weighting (Imbens, 2004, Section III-C) can be straightforwardly used to define unbiased estimators of the mean embeddings µ_{Y_1^*} and µ_{Y_0^*} of potential outcome distributions P_{Y_1^*} and P_{Y_0^*}, respectively, thus providing a way of estimating the KTE.” Here, the propensity weighting can be considered the second scores); and
…wherein the determining further comprises automatically calculating combined causal association scores using the first scores and the second scores, and using the combined causal association scores to generate causal association rankings for the events of the first type in the received window of candidate events (Muandet, page 38, ¶2 “The DR estimator combines the two aforementioned estimators by exploiting both the regression model η̂(u, a) and the propensity scores (Cassel et al., 1976; Dudík et al., 2011).”).
Muandet does not teach “automatically generating contrastive windows of candidate events of the first type preceding the one or more target events of interest, each of the generated contrastive windows of candidate events of the first type corresponding to a different dropped candidate event of the first type from the received window of candidate events;
automatically generating resulting embeddings for each of the generated contrastive windows of candidate events;”
However, Ramachandra teaches automatically generating contrastive windows of candidate events of the first type preceding the one or more target events of interest, each of the generated contrastive windows of candidate events of the first type corresponding to a different dropped candidate event of the first type from the received window of candidate events (Ramachandra, page 2, ¶1 “Since we cannot observe the counterfactual for any particular xi unit, one way to estimate the treatment effect for each unit will be by using values from its neighbors which received the opposite treatment, and by taking the difference between the two outcomes.” Here, because the estimate uses values from the neighbors of xi, xi can be considered the dropped candidate event, and the treatment effect from those neighbors that received the opposite treatment can be considered the contrastive window);
automatically generating resulting embeddings for each of the generated contrastive windows of candidate events (Ramachandra, abstract, ¶2 “For generalized neighbor matching to estimate individual and average treatment effects, we analyze the use of autoencoders for dimensionality reduction while maintaining the local neighborhood structure among the data points in the embedding space.” In light of the specification, ¶41 “Event management program 150 may utilize a recurrent neural network (RNN) embedder to transform each window from this step into a single vector, distilling the structural as well as temporal information.”);
Muandet and Ramachandra are analogous art because both references concern causal inference. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Muandet’s causal counterfactual embedding method to incorporate the counterfactual treatments taught by Ramachandra. The motivation for doing so would have been to better perform the treatment effect estimation as stated in Ramachandra, page 1, abstract, ¶2 “This deep learning based technique is shown to perform better than simple k nearest neighbor matching for estimating treatment effects, especially when the data points have several features/covariates but reside in a low dimensional manifold in high dimensional space.”
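For orientation only, the claim language “each of the generated contrastive windows ... corresponding to a different dropped candidate event” can be read as a leave-one-out operation over the received window. The following is a minimal sketch under that assumed reading; the function name and the alert labels are hypothetical and are not taken from the application or from either reference.

```python
def contrastive_windows(window):
    """Return one (dropped_event, remaining_window) pair per candidate event.

    Assumed reading: each contrastive window is the received window with
    exactly one candidate event removed.
    """
    return [
        (dropped, window[:i] + window[i + 1:])
        for i, dropped in enumerate(window)
    ]


# Hypothetical window of three alerts preceding a target incident.
received = ["disk_alert", "cpu_alert", "net_alert"]
pairs = contrastive_windows(received)
```

Under this reading, a window of k candidate events yields k contrastive windows, each of length k - 1.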
Muandet in view of Ramachandra does not teach “…wherein the window of candidate events further comprises an event sequence including a collection of time-stamped descriptions of events occurring within a computer system, wherein the events of the first type further comprises specific alerts associated with the computer system, and wherein the target events of interest of the second type comprise outcomes or incidents associated with the computer system;
determining, for the computer system, a causal impact that a given event of the first type has on bringing about a target event of interest of the second type” 
However, Arya teaches  …wherein the window of candidate events further comprises an event sequence including a collection of time-stamped descriptions of events occurring within a computer system (Arya, page 3, col 2, ¶2 “Modeling logs as timeseries. We consider different time bin sizes (10ms, 100ms, 1sec) and count the number of error logs in each bin to obtain a timeseries of error counts corresponding to each impacted microservice (e.g. see figure 1(b)).”), wherein the events of the first type further comprises specific alerts associated with the computer system, and wherein the target events of interest of the second type comprise outcomes or incidents associated with the computer system (Arya, page 4, col 2, section 4, ¶1 “We filter out error logs along with their timing information to construct (a) time series of error counts corresponding to each impacted microservice and (b) a temporal event sequence {(ti,li)} which records the time ti at which microservice li emits an error log.”);
determining, for the computer system, a causal impact that a given event of the first type has on bringing about a target event of interest of the second type (Arya, page 4, col 2, section 4, ¶1, “In order to introduce a fault, one of the microservices namely ts-basic is deleted from the system which results in 4 microservices emitting error logs namely: ts-ui-dashboard, ts-travel2-service, ts-travel-service, and ts-ticketinfo-service.” Here, the errors produced can be considered the causal impact a given event of the first type has on bringing about a target event of interest of the second type).
Muandet in view of Ramachandra and Arya are analogous art because both references concern methods for causal inference. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Muandet/Ramachandra’s causal inference system to incorporate the computer logs taught by Arya. The motivation for doing so would have been to have accurate causal inferences for IT operations as stated in Arya, page 1, col 1, Abstract, ¶1 “Inferring causality of events from log data is critical to IT operations teams who continuously strive to identify probable root causes of events in order to quickly resolve incident tickets so that down times and service interruptions are kept to a minimum.”
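For reference, the doubly robust (DR) estimator cited from Muandet, page 38, combines a regression model with propensity scores. The following is a minimal sketch of the standard DR form for a finite action set (Dudík et al., 2011); it is not presented as Muandet’s implementation or as the claimed score combination. The function name, the callable arguments, and the toy logs are assumptions made for illustration.

```python
def doubly_robust_value(logs, target_policy, logging_policy, reward_model, actions):
    """Estimate the expected reward of target_policy from logged data.

    logs: list of (context, action, reward) tuples collected under logging_policy.
    target_policy(a, u), logging_policy(a, u): probability of action a given context u.
    reward_model(u, a): regression estimate of the expected reward.
    """
    total = 0.0
    for u, a, r in logs:
        # Direct-method term: model prediction averaged under the target policy.
        direct = sum(target_policy(b, u) * reward_model(u, b) for b in actions)
        # Propensity-weighted correction of the model error on the logged action.
        weight = target_policy(a, u) / logging_policy(a, u)
        total += direct + weight * (r - reward_model(u, a))
    return total / len(logs)


# Hypothetical toy data: two actions, uniform logging, action 1 always pays 1.
actions = [0, 1]
logs = [("u", 1, 1.0), ("u", 0, 0.0), ("u", 1, 1.0), ("u", 0, 0.0)]
uniform = lambda a, u: 0.5
always_one = lambda a, u: 1.0 if a == 1 else 0.0

dr_bad_model = doubly_robust_value(logs, always_one, uniform, lambda u, a: 0.0, actions)
dr_good_model = doubly_robust_value(logs, always_one, uniform, lambda u, a: float(a), actions)
```

In this toy setting both calls return the same value, because the propensity-weighted term corrects the error of the deliberately poor reward model.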

Regarding claim 3:
Muandet in view of Ramachandra in further view of Arya teaches [t]he computer-based method of claim 1, wherein automatically identifying the matching historical windows of candidate events, the matching historical windows of candidate events including the events of the first type preceding the one or more target events of interest of the second type, and having the resulting embeddings that are below the predetermined threshold distance away from the resulting embeddings generated for the given generated contrastive window of candidate events further comprises:
automatically utilizing a kernel-based Maximum Mean Discrepancy (MMD) (Muandet, page 6, ¶6 “The kernel mean embedding (1) is the key ingredient of a well-known metric on probability measures called maximum mean discrepancy (MMD) (Borgwardt et al., 2006; Gretton et al., 2012).”) process to determine the distance between embeddings of counterfactual queries corresponding to generated contrastive windows being considered and embeddings corresponding to a given historical candidate window (Muandet, page 4, ¶2 “The proposed estimator can be used for computing a distance between the counterfactual and controlled distributions, thereby providing a way of quantifying the effect of a treatment to the distribution of outcomes; we define this distance as the maximum mean discrepancy (MMD) (Borgwardt et al., 2006; Gretton et al., 2012) between the counterfactual and controlled distributions.”).
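For reference, the kernel-based MMD cited above can be estimated from two samples of embeddings. The following is a minimal sketch of the biased empirical estimate of squared MMD with a Gaussian (RBF) kernel; the function names, the bandwidth `sigma`, and the synthetic samples are assumptions made for illustration and are not drawn from Muandet or from the application.

```python
import math
import random


def rbf(x, y, sigma=1.0):
    # Gaussian kernel between two equal-length vectors.
    d2 = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def mean_kernel(xs, ys, sigma=1.0):
    # Average kernel value over all pairs drawn from xs and ys.
    return sum(rbf(x, y, sigma) for x in xs for y in ys) / (len(xs) * len(ys))


def mmd_squared(xs, ys, sigma=1.0):
    # Biased (V-statistic) estimate of squared MMD between two samples.
    return (
        mean_kernel(xs, xs, sigma)
        + mean_kernel(ys, ys, sigma)
        - 2.0 * mean_kernel(xs, ys, sigma)
    )


rng = random.Random(0)
sample = lambda mean: [[rng.gauss(mean, 1.0) for _ in range(3)] for _ in range(50)]
a, b, c = sample(0.0), sample(0.0), sample(3.0)

same = mmd_squared(a, b)      # two samples from the same distribution
shifted = mmd_squared(a, c)   # samples from distributions with different means
```

A threshold test of the kind recited in the claim would then compare such a distance against a chosen cutoff; the cutoff itself is not specified here.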

Regarding claim 7:
Muandet in view of Ramachandra in further view of Arya teaches [t]he computer-based method of claim 1, further comprising:
automatically outputting to a user a list of individual events of the first type from the window of candidate events (Muandet, page 39, section 7.2.1, ¶1 “When a user visits a website, the system provides a recommendation as an ordered list of K ∈ N items out of M ∈ N available items to that user.” Here, the K items can be considered first events and the M available items the window of candidate events), the list of individual events being sorted by their respective generated causal association rankings (Muandet, page 42, section 7.2.2, ¶2 “In order to generate logged data Dinit, we first sample a query q uniformly from the dataset, and select top M candidate URLs based on the relevance scores predicted by the treebody model.”).

Regarding claim 8:
Muandet teaches [a] computer system, the computer system comprising:
one or more processors, one or more computer-readable memories, one or more computer-readable tangible storage medium, and program instructions stored on at least one of the one or more computer-readable tangible storage medium for execution by at least one of the one or more processors via at least one of the one or more computer-readable memories (Muandet, page 32, section 7, ¶1 “The codes to reproduce the experiments are available at https://github.com/sorawitj/counterfactual-mean-embedding.” The codes inherently run on a computer system and are stored on computer-readable tangible storage medium), wherein the computer system is capable of performing a method comprising:
automatically receiving a window of candidate events including events of a first type preceding one or more target events of interest of a second type… (Muandet, page 27, section 6, ¶1 “For instance, consider a recommendation system, where an action is a list of items to be recommended to a user, and a policy determines which action to take, given the features of the user. There will be a positive reward if the user clicks or buys one of the recommended items, and no reward otherwise.” Here the action which contains items can be considered a window of candidate events and the user buying the recommended item can be considered the outcome); 
automatically identifying matching historical windows of candidate events, the matching historical windows of candidate events including events of the first type preceding the one or more target events of interest of the second type (Muandet, page 28, section 6.1, ¶2 “Assume that tuples {(u_i, a_i, r_i)}_{i=1}^{n} ⊂ U × A × R of context features u_i ∈ U, action a_i ∈ A and reward r_i ∈ R are available as logged (or historical) data.” Here, the available logged data with context features and action can be considered matching historical windows), and having resulting embeddings that are below a predetermined threshold distance away from the resulting embeddings generated for a given generated contrastive window of candidate events (Muandet, page 4, ¶2 “The proposed estimator can be used for computing a distance between the counterfactual and controlled distributions, thereby providing a way of quantifying the effect of a treatment to the distribution of outcomes; we define this distance as the maximum mean discrepancy (MMD) (Borgwardt et al., 2006; Gretton et al., 2012) between the counterfactual and controlled distributions.” Here, the distance between counterfactual and controlled distributions can be considered the distance away from resulting embeddings.);
in response to identifying the matching historical windows of candidate events, automatically calculating and storing a corresponding first score for each of the matching historical window of candidate events (Muandet, pages 36-37, section 7.2, “Let η : U × A → R be the regression function that takes a pair (u, a) of user features u ∈ U and recommendation a ∈ A as an input and outputs the conditional expectation of the reward r.” Here, the user features can be considered the historical window, and the output of the regression function, the conditional expectation of the reward, can be considered a first score);
automatically identifying matching incident windows corresponding to each identified matching historical window, the matching incident windows having features or textual data corresponding to events of the second type that are above a predetermined similarity value threshold (Muandet, pages 39-40, section 7.2.1, ¶2 “The parameter α controls how similar the policies are. If α = 1, we obtain π0 = π∗, whereas π0 and π∗ differ the most when α = −1.” Here, α can be considered a predetermined similarity threshold) when compared to features or textual data corresponding to the one or more target events of interest (Muandet, page 28, ¶2 “Let π∗(a|u) be another conditional distribution of actions a ∈ A given context features u ∈ U, which represents the target policy that one wants to evaluate. By design, the target policy is known and sampling from it is possible. Let q∗(u) be a probability distribution on U, which represents the distribution of context features under the target environment (e.g., the distribution of user features when a recommendation system is deployed). In the standard OPE setting, it is typically assumed that q0(u) = q∗(u), i.e., the historical and target environments are the same;”), and subsequently calculating corresponding second scores for each of the identified matching incident windows (Muandet, page 17, ¶1 “We show here that the same strategy of inverse propensity weighting (Imbens, 2004, Section III-C) can be straightforwardly used to define unbiased estimators of the mean embeddings µ_{Y*_1} and µ_{Y*_0} of potential outcome distributions P_{Y*_1} and P_{Y*_0}, respectively, thus providing a way of estimating the KTE.” Here, the propensity weighting can be considered the second scores); and
…wherein the determining further comprises automatically calculating combined causal association scores using the first scores and the second scores, and using the combined causal association scores to generate causal association rankings for the events of the first type in the received window of candidate events (Muandet, page 38, ¶2 “The DR estimator combines the two aforementioned estimators by exploiting both the regression model η̂(u, a) and the propensity scores (Cassel et al., 1976; Dudík et al., 2011).”).
Muandet does not teach “automatically generating contrastive windows of candidate events of the first type preceding the one or more target events of interest, each of the generated contrastive windows of candidate events of the first type corresponding to a different dropped candidate event of the first type from the received window of candidate events;
automatically generating resulting embeddings for each of the generated contrastive windows of candidate events;”
However, Ramachandra teaches automatically generating contrastive windows of candidate events of the first type preceding the one or more target events of interest, each of the generated contrastive windows of candidate events of the first type corresponding to a different dropped candidate event of the first type from the received window of candidate events (Ramachandra, page 2, ¶1 “Since we cannot observe the counterfactual for any particular xi unit, one way to estimate the treatment effect for each unit will be by using values from its neighbors which received the opposite treatment, and by taking the difference between the two outcomes.” Here, because the estimate uses values from the neighbors of xi, xi can be considered the dropped candidate event, and the treatment effect from those neighbors that received the opposite treatment can be considered the contrastive window);
automatically generating resulting embeddings for each of the generated contrastive windows of candidate events (Ramachandra, abstract, ¶2 “For generalized neighbor matching to estimate individual and average treatment effects, we analyze the use of autoencoders for dimensionality reduction while maintaining the local neighborhood structure among the data points in the embedding space.” In light of the specification, ¶41 “Event management program 150 may utilize a recurrent neural network (RNN) embedder to transform each window from this step into a single vector, distilling the structural as well as temporal information.”);
Muandet and Ramachandra are analogous art because both references concern causal inference. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Muandet’s causal counterfactual embedding method to incorporate the counterfactual treatments taught by Ramachandra. The motivation for doing so would have been to better perform the treatment effect estimation as stated in Ramachandra, page 1, abstract, ¶2 “This deep learning based technique is shown to perform better than simple k nearest neighbor matching for estimating treatment effects, especially when the data points have several features/covariates but reside in a low dimensional manifold in high dimensional space.”
Muandet in view of Ramachandra does not teach “…wherein the window of candidate events further comprises an event sequence including a collection of time-stamped descriptions of events occurring within a computer system, wherein the events of the first type further comprises specific alerts associated with the computer system, and wherein the target events of interest of the second type comprise outcomes or incidents associated with the computer system;
determining, for the computer system, a causal impact that a given event of the first type has on bringing about a target event of interest of the second type” 
However, Arya teaches  …wherein the window of candidate events further comprises an event sequence including a collection of time-stamped descriptions of events occurring within a computer system (Arya, page 3, col 2, ¶2 “Modeling logs as timeseries. We consider different time bin sizes (10ms, 100ms, 1sec) and count the number of error logs in each bin to obtain a timeseries of error counts corresponding to each impacted microservice (e.g. see figure 1(b)).”), wherein the events of the first type further comprises specific alerts associated with the computer system, and wherein the target events of interest of the second type comprise outcomes or incidents associated with the computer system (Arya, page 4, col 2, section 4, ¶1 “We filter out error logs along with their timing information to construct (a) time series of error counts corresponding to each impacted microservice and (b) a temporal event sequence {(ti,li)} which records the time ti at which microservice li emits an error log.”);
determining, for the computer system, a causal impact that a given event of the first type has on bringing about a target event of interest of the second type (Arya, page 4, col 2, section 4, ¶1, “In order to introduce a fault, one of the microservices namely ts-basic is deleted from the system which results in 4 microservices emitting error logs namely: ts-ui-dashboard, ts-travel2-service, ts-travel-service, and ts-ticketinfo-service.” Here, the errors produced can be considered the causal impact a given event of the first type has on bringing about a target event of interest of the second type).
Muandet in view of Ramachandra and Arya are analogous art because both references concern methods for causal inference. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Muandet/Ramachandra’s causal inference system to incorporate the computer logs taught by Arya. The motivation for doing so would have been to have accurate causal inferences for IT operations as stated in Arya, page 1, col 1, Abstract, ¶1 “Inferring causality of events from log data is critical to IT operations teams who continuously strive to identify probable root causes of events in order to quickly resolve incident tickets so that down times and service interruptions are kept to a minimum.”
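For reference, the time-binning step quoted from Arya, page 3 (counting error logs per bin to obtain a time series for each impacted microservice) can be sketched as follows. This is an illustrative sketch only, not Arya’s code: the function name, the millisecond integer timestamps, and the sample records are assumptions, and only the service names are taken from the quoted passage.

```python
from collections import defaultdict


def error_count_series(events, bin_ms, start_ms, end_ms):
    """Count error logs per time bin for each service.

    events: iterable of (timestamp_ms, service_name) error-log records.
    Returns {service_name: [count_in_bin_0, count_in_bin_1, ...]}.
    """
    n_bins = -(-(end_ms - start_ms) // bin_ms)  # ceiling division
    series = defaultdict(lambda: [0] * n_bins)
    for t, service in events:
        # Records outside the observation interval are ignored.
        if start_ms <= t < end_ms:
            series[service][(t - start_ms) // bin_ms] += 1
    return dict(series)


# Hypothetical error-log records as (timestamp in ms, emitting microservice).
events = [
    (5, "ts-travel-service"),
    (12, "ts-travel-service"),
    (95, "ts-ui-dashboard"),
    (101, "ts-travel-service"),
]
counts = error_count_series(events, bin_ms=100, start_ms=0, end_ms=200)
```

The same records, kept as ordered (time, service) pairs, also form the temporal event sequence described in the second Arya passage.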

Regarding claims 10 and 14:
Claims 10 and 14 are rejected under the same rationale as claims 3 and 7 respectively. 
It would have been obvious to combine the teachings of Muandet and Ramachandra for the reasons set forth in connection with claim 8 above. 

Regarding claim 15:
Muandet teaches [a] computer program product, the computer program product comprising:
one or more computer-readable tangible storage medium and program instructions stored on at least one of the one or more computer-readable tangible storage medium, the program instructions executable by a processor capable of performing a method, the method comprising (Muandet, page 32, section 7, ¶1 “The codes to reproduce the experiments are available at https://github.com/sorawitj/counterfactual-mean-embedding.” The codes inherently run on a computer system and are stored on computer-readable tangible storage medium):
automatically receiving a window of candidate events including events of a first type preceding one or more target events of interest of a second type… (Muandet, page 27, section 6, ¶1 “For instance, consider a recommendation system, where an action is a list of items to be recommended to a user, and a policy determines which action to take, given the features of the user. There will be a positive reward if the user clicks or buys one of the recommended items, and no reward otherwise.” Here the action which contains items can be considered a window of candidate events and the user buying the recommended item can be considered the outcome); 
automatically identifying matching historical windows of candidate events, the matching historical windows of candidate events including events of the first type preceding the one or more target events of interest of the second type (Muandet, page 28, section 6.1, ¶2 “Assume that tuples {(u_i, a_i, r_i)}_{i=1}^{n} ⊂ U × A × R of context features u_i ∈ U, action a_i ∈ A and reward r_i ∈ R are available as logged (or historical) data.” Here, the available logged data with context features and action can be considered matching historical windows), and having resulting embeddings that are below a predetermined threshold distance away from the resulting embeddings generated for a given generated contrastive window of candidate events (Muandet, page 4, ¶2 “The proposed estimator can be used for computing a distance between the counterfactual and controlled distributions, thereby providing a way of quantifying the effect of a treatment to the distribution of outcomes; we define this distance as the maximum mean discrepancy (MMD) (Borgwardt et al., 2006; Gretton et al., 2012) between the counterfactual and controlled distributions.” Here, the distance between counterfactual and controlled distributions can be considered the distance away from resulting embeddings.);
in response to identifying the matching historical windows of candidate events, automatically calculating and storing a corresponding first score for each of the matching historical window of candidate events (Muandet, pages 36-37, section 7.2, “Let η : U × A → R be the regression function that takes a pair (u, a) of user features u ∈ U and recommendation a ∈ A as an input and outputs the conditional expectation of the reward r.” Here, the user features can be considered the historical window, and the output of the regression function, the conditional expectation of the reward, can be considered a first score);
automatically identifying matching incident windows corresponding to each identified matching historical window, the matching incident windows having features or textual data corresponding to events of the second type that are above a predetermined similarity value threshold (Muandet, pages 39-40, section 7.2.1, ¶2 “The parameter α controls how similar the policies are. If α = 1, we obtain π0 = π∗, whereas π0 and π∗ differ the most when α = −1.” Here, α can be considered a predetermined similarity threshold) when compared to features or textual data corresponding to the one or more target events of interest (Muandet, page 28, ¶2 “Let π∗(a|u) be another conditional distribution of actions a ∈ A given context features u ∈ U, which represents the target policy that one wants to evaluate. By design, the target policy is known and sampling from it is possible. Let q∗(u) be a probability distribution on U, which represents the distribution of context features under the target environment (e.g., the distribution of user features when a recommendation system is deployed). In the standard OPE setting, it is typically assumed that q0(u) = q∗(u), i.e., the historical and target environments are the same;”), and subsequently calculating corresponding second scores for each of the identified matching incident windows (Muandet, page 17, ¶1 “We show here that the same strategy of inverse propensity weighting (Imbens, 2004, Section III-C) can be straightforwardly used to define unbiased estimators of the mean embeddings µ_{Y*_1} and µ_{Y*_0} of potential outcome distributions P_{Y*_1} and P_{Y*_0}, respectively, thus providing a way of estimating the KTE.” Here, the propensity weighting can be considered the second scores); and
…wherein the determining further comprises automatically calculating combined causal association scores using the first scores and the second scores, and using the combined causal association scores to generate causal association rankings for the events of the first type in the received window of candidate events (Muandet, page 38, ¶2 “The DR estimator combines the two aforementioned estimators by exploiting both the regression model η̂(u, a) and the propensity scores (Cassel et al., 1976; Dudík et al., 2011).”).
Muandet does not teach “automatically generating contrastive windows of candidate events of the first type preceding the one or more target events of interest, each of the generated contrastive windows of candidate events of the first type corresponding to a different dropped candidate event of the first type from the received window of candidate events;
automatically generating resulting embeddings for each of the generated contrastive windows of candidate events;”
However, Ramachandra teaches automatically generating contrastive windows of candidate events of the first type preceding the one or more target events of interest, each of the generated contrastive windows of candidate events of the first type corresponding to a different dropped candidate event of the first type from the received window of candidate events (Ramachandra, page 2, ¶1 “Since we cannot observe the counterfactual for any particular xi unit, one way to estimate the treatment effect for each unit will be by using values from its neighbors which received the opposite treatment, and by taking the difference between the two outcomes.” Here, because the estimate uses values from the neighbors of xi, xi can be considered the dropped candidate event, and the treatment effect from those neighbors that received the opposite treatment can be considered the contrastive window);
automatically generating resulting embeddings for each of the generated contrastive windows of candidate events (Ramachandra, abstract, ¶2 “For generalized neighbor matching to estimate individual and average treatment effects, we analyze the use of autoencoders for dimensionality reduction while maintaining the local neighborhood structure among the data points in the embedding space.” In light of the specification, ¶41 “Event management program 150 may utilize a recurrent neural network (RNN) embedder to transform each window from this step into a single vector, distilling the structural as well as temporal information.”);
Maundet and Ramachandra are analogous art because both references concern causal inference. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Maundet’s causal counterfactual embedding method to incorporate the counterfactual treatments taught by Ramachandra. The motivation for doing so would have been to better perform the treatment effect estimation as stated in Ramachandra, page 1, abstract, ¶2 “This deep learning based technique is shown to perform better than simple k nearest neighbor matching for estimating treatment effects, especially when the data points have several features/covariates but reside in a low dimensional manifold in high dimensional space.”
Maundet in view of Ramachandra does not teach “…wherein the window of candidate events further comprises an event sequence including a collection of time-stamped descriptions of events occurring within a computer system, wherein the events of the first type further comprises specific alerts associated with the computer system, and wherein the target events of interest of the second type comprise outcomes or incidents associated with the computer system;
determining, for the computer system, a causal impact that a given event of the first type has on bringing about a target event of interest of the second type” 
However, Arya teaches  …wherein the window of candidate events further comprises an event sequence including a collection of time-stamped descriptions of events occurring within a computer system (Arya, page 3, col 2, ¶2 “Modeling logs as timeseries. We consider different time bin sizes (10ms, 100ms, 1sec) and count the number of error logs in each bin to obtain a timeseries of error counts corresponding to each impacted microservice (e.g. see figure 1(b)).”), wherein the events of the first type further comprises specific alerts associated with the computer system, and wherein the target events of interest of the second type comprise outcomes or incidents associated with the computer system (Arya, page 4, col 2, section 4, ¶1 “We filter out error logs along with their timing information to construct (a) time series of error counts corresponding to each impacted microservice and (b) a temporal event sequence {(ti,li)} which records the time ti at which microservice li emits an error log.”);
determining, for the computer system, a causal impact that a given event of the first type has on bringing about a target event of interest of the second type (Arya, page 4, col 2, section 4, ¶1, “In order to introduce a fault, one of the microservices namely ts-basic is deleted from the system which results in 4 microservices emitting error logs namely: ts-ui-dashboard, ts-travel2-service, ts-travel-service, and ts-ticketinfo-service.” Here, the errors produced can be considered the causal impact a given event of the first type has on bringing about a target event of interest of the second type).
Maundet in view of Ramachandra and Arya are analogous art because both references concern methods for causal inference. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Maundet/Ramachandra’s causal inference system to incorporate the computer logs taught by Arya. The motivation for doing so would have been to have accurate causal inferences for IT operations as stated in Arya, page 1, col 1, Abstract, ¶1 “Inferring causality of events from log data is critical to IT operations teams who continuously strive to identify probable root causes of events in order to quickly resolve incident tickets so that down times and service interruptions are kept to a minimum.”

Regarding claims 15 and 17:
Claims 15 and 17 are rejected under the same rationale as claims 3 and 7 respectively.
It would have been obvious to combine the teachings of Maundet and Ramachandra for the reasons set forth in connection with claim 15 above.

Claims 2, 9 and 16 are rejected under 35 U.S.C. § 103 as being unpatentable over Maundet in view of Ramachandra in view of Arya in further view of Li et al. (WO 2019172848 A1) (hereinafter “Li”).

Regarding claim 2:
Maundet in view of Ramachandra in further view of Arya teaches [t]he computer-based method of claim 1,
Maundet in view of Ramachandra in further view of Arya does not teach “wherein the generated resulting embeddings for each of the generated contrastive windows of candidate events comprise dynamic embeddings generated using one of, a recurrent neural network, a convolutional neural network, a transformer network, or a fully connected network.”
However, Li teaches wherein the generated resulting embeddings for each of the generated contrastive windows of candidate events comprise dynamic embeddings generated using one of, a recurrent neural network, a convolutional neural network, a transformer network, or a fully connected network (Li, page 17, lines 7-8 “In addition to this dynamic embedding learnt from the multi-relation structure RNN model, a 134-d feature fi is also extracted based on the timestamp of the event (ti).” It is noted the claim recites alternative language, and Li teaches at least one of the alternatives.).
Maundet in view of Ramachandra in further view of Arya and Li are analogous art because both references concern methods for predicting outcomes of an event. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Maundet/Ramachandra/Arya’s causal system to incorporate the RNN and dynamic embeddings taught by Li. The motivation for doing so would have been to achieve better performance as stated in Li, page 12, lines 28-29 “The proposed MRS-RMTPP model outperforms the state-of-the-art baseline by 5.3% (MRR) and 4.3% (RMSE), on a real-world ATS log dataset.”

Regarding claims 9 and 16:
Claims 9 and 16 are rejected under the same rationale as claim 2.
It would have been obvious to combine the teachings of Maundet in view of Ramachandra in further view of Arya and Li for the reasons set forth in connection with claim 2 above. 

Claims 4, 11 and 18 are rejected under 35 U.S.C. § 103 as being unpatentable over Maundet in view of Ramachandra in view of Arya in further view of Mehrkanoon et al. (“Deep hybrid neural-kernel networks using random Fourier features”, Mehrkanoon et al., 12 July 2018) (hereinafter “Mehrkanoon”).

Regarding claim 4:
Maundet in view of Ramachandra in further view of Arya teaches [t]he computer-based method of claim 3, 
Maundet in view of Ramachandra in further view of Arya does not teach “wherein the given historical windows of candidate events are indexed in a hierarchical structure based on pre-computed summary vectors, the pre-computed summary vectors being computed using an average of Random Fourier Features corresponding to the given historical candidate windows”
However, Mehrkanoon teaches wherein the given historical windows of candidate events are indexed in a hierarchical structure (Mehrkanoon, page 1, col 1, ¶2 “Deep learning models deal with complex tasks by learning from subtasks. In particular, several nonlinear modules are stacked in hierarchical architectures to learn multiple levels of representation (hierarchical features) from the raw input data. Each module transforms the representation at one level into a slightly more abstract representation at a higher level, i.e., the higher-level features are defined in terms of lower-level ones.”) based on pre-computed summary vectors, the pre-computed summary vectors (Mehrkanoon, page 3, col 2, ¶1 “In practice when n is large, one can work with a subsample (prototype vectors) of size m << n.” here the prototype vectors can be considered summary vectors) being computed using an average of Random Fourier Features corresponding to the given historical candidate windows (Mehrkanoon, page 1,abstract “In particular, here an explicit feature map, based on random Fourier features, is used to make the transition between the two architectures more straightforward as well as making the model scalable to large datasets by solving the optimization problem in the primal.”).
Maundet in view of Ramachandra in further view of Arya and Mehrkanoon are analogous art because both references concern deep neural-kernel networks. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Maundet/Ramachandra/Arya’s deep neural network to incorporate the stacking hierarchical architecture taught by Mehrkanoon. The motivation for doing so would have been to better learn complex hierarchical representations as stated in Mehrkanoon, page 4, col 1, section 3.2, ¶1 “As it has been shown in the literature, in many tasks, deep models with several staking nonlinear layers perform better than shallow models and are able to better learn the complex hierarchical representations of the given dataset.”

Regarding claims 11 and 18:
Claims 11 and 18 are rejected under the same rationale as claim 4.
It would have been obvious to combine the teachings of Maundet in view of Ramachandra in further view of Arya and Mehrkanoon for the reasons set forth in connection with claim 4 above. 

Claims 5, 12 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Maundet in view of Ramachandra in view of Arya in further view of Omeiza et al. (“From Spoken Thoughts to Automated Driving Commentary: Predicting and Explaining Intelligent Vehicles’ Actions”, Omeiza et al., 4 June 2022) (hereinafter “Omeiza”).

Regarding claim 5:
Maundet in view of Ramachandra in further view of Arya teaches [t]he computer-based method of claim 1, 
Maundet in view of Ramachandra in further view of Arya does not teach “wherein the corresponding second scores for each of the identified matching incident windows comprise Bilingual Evaluation Understudy (BLEU) scores.”
However, Omeiza teaches wherein the corresponding second scores for each of the identified matching incident windows comprise Bilingual Evaluation Understudy (BLEU) scores (Omeiza, page 6, col 1-2, Section A, ¶1 “We measured the amount of similarity between the generated factual explanations with ground truth explanations using the BiLingual Evaluation Understudy (specifically cumulative weighted BLEU-4) and The Recall-Oriented Understudy for Gisting Evaluation (specifically the weighted LCS ROUGE-W).”).
Maundet in view of Ramachandra in further view of Arya and Omeiza are analogous art because both references concern methods for causal inference. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Maundet/Ramachandra/Arya’s causal inference system to incorporate the Bilingual Evaluation Understudy (BLEU) scores taught by Omeiza. The motivation for doing so would have been to achieve better similarity scores as stated in Omeiza, page 6, col 2, ¶2 “In terms of similarity scores, there was a slight increase considering the model’s certainty (See Table I).”.

Regarding claims 12 and 19:
Claims 12 and 19 are rejected under the same rationale as claim 5.
It would have been obvious to combine the teachings of Maundet in view of Ramachandra and Omeiza for the reasons set forth in connection with claim 5 above.

Claims 6, 13 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Maundet in view of Ramachandra in further view of Huang et al. (“Bi-Directional Causal Graph Learning through Weight-sharing and Low-rank Neural Network”, Huang et al., 2019) (as cited in the IDS; hereinafter “Huang”).

Regarding claim 6:
Maundet in view of Ramachandra in further view of Arya teaches [t]he computer-based method of claim 1
Maundet in view of Ramachandra in further view of Arya does not teach “automatically utilizing a long short-term memory (LSTM) based encoder-decoder architecture trained as an auto-encoder that is configured to use unlabeled data and to transform each of the generated contrastive windows into a vector distilling both temporal and structural information.”
However, Huang teaches automatically utilizing a long short-term memory (LSTM) based encoder-decoder architecture trained as an auto-encoder that is configured to use unlabeled data and to transform each of the generated contrastive windows into a vector (Huang, page 2, col 2, section B, ¶1 “Causal graph learning with multivariate time series roots from Granger causality analysis, which is performed by fitting a vector autoregressive model (VAR) to the time series input [21].”) distilling both temporal and structural information (Huang, page 6, col 1, section IV, ¶3 “Work in [32], [31] proposed multilayer perception (MLP) and LSTM deep neural network to learn causal graph through time series prediction. These two models can be considered as an extension of VAR, where more complicated temporal relationship can be learned in hidden layers.” The causal graph and temporal relationship can be considered temporal and structural information.).
Maundet in view of Ramachandra in further view of Arya and Huang are analogous art because both references concern methods for causal inference. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Maundet/Ramachandra/Arya’s causal inference system to incorporate the LSTM taught by Huang. The motivation for doing so would have been to improve performance as stated in Huang, page 8, col 1, ¶2 “We show the performance of all models on two synthetic datasets in Figure 9. It it obvious that our Bi-CGL achieves the best performance.”.

Regarding claims 13 and 20:
Claims 13 and 20 are rejected under the same rationale as claim 6.
It would have been obvious to combine the teachings of Maundet in view of Ramachandra in further view of Arya and Huang for the reasons set forth in connection with claim 6 above.

Response to Arguments
Applicant's arguments filed December 16th, 2025 (hereinafter “Remarks”) have been fully considered but they are not persuasive.
Regarding the objections to the Specification, Applicant’s amended Specification has overcome the objections, which are withdrawn.
Regarding the 35 U.S.C. 101 rejections, applicant’s arguments have been considered, but they are not persuasive.
Applicant’s arguments with respect to claim rejections under 35 U.S.C. § 103 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Rejections under 35 U.S.C. § 101:
Argument 1: 
	“The Applicant asserts that the presently claimed invention integrates any alleged judicial exception into a practical application because the presently claimed limitations and elements are directed to improvements to the functioning of a computer.” (Remarks, page 16)

Examiner’s Response:
Examiner respectfully disagrees. The Applicant merely uses a computer to perform processes that can be performed as mental processes. An improvement to the managing of computer incidents and events may be an improvement in an abstract idea, but it is not an improvement in the functioning of a computer, as a computer.

Argument 2:
“[T]he Applicant asserts that “the presently described embodiments have the capacity to improve managing of computer incidents and events using machine learning by providing a method of generating causal association rankings for multiple candidate events within a received window of candidate events by using generated dynamic embeddings to identify events that are impactful in causing the occurrence of an incident or outcome.” Applicant's Specification, para. [0011].” (Remarks, page 20).

Examiner’s Response:
	Examiner respectfully disagrees. The Applicant merely uses a computer to perform processes that can be performed as mental processes. The MPEP states “[a]n inventive concept “cannot be furnished by the unpatentable law of nature (or natural phenomenon or abstract idea) itself.” Genetic Techs. v. Merial LLC, 818 F.3d 1369, 1376, 118 USPQ2d 1541, 1546 (Fed. Cir. 2016).” See MPEP § 2106.05(I). Furthermore, “It should be noted that while this consideration is often referred to in an abbreviated manner as the “improvements consideration,” the word “improvements” in the context of this consideration is limited to improvements to the functioning of a computer or any other technology/technical field, whether in Step 2A Prong Two or in Step 2B.” See MPEP § 2106.04(d)(1). Therefore, the improvement to managing computer events can be considered an improvement to a mental process, but not an improvement to the functioning of a computer or any other technology/technical field.

Argument 3:
	“Furthermore, such claim limitations and elements are not a generic use of a computer but are specifically integrated into a computer system in improving “managing of computer incidents and events using machine learning,” by, for example, “providing a method of generating causal association rankings for multiple candidate events within a received window of candidate events by using generated dynamic embeddings to identify events that are impactful in causing the occurrence of an incident or outcome.” As stated above in the USPTO Memo, “Examiners are cautioned not to oversimplify claim limitations and expand the application of the 'apply it' consideration.” USPTO Memo, page 4.” (Remarks, page 20).

Examiner’s Response:
	Examiner respectfully disagrees. The MPEP states “[u]se of a computer or other machinery in its ordinary capacity for economic or other tasks (e.g., to receive, store, or transmit data) or simply adding a general purpose computer or computer components after the fact to an abstract idea (e.g., a fundamental economic practice or mathematical equation) does not integrate a judicial exception into a practical application or provide significantly more. See Affinity Labs v. DirecTV, 838 F.3d 1253, 1262, 120 USPQ2d 1201, 1207 (Fed. Cir. 2016) (cellular telephone); TLI Communications LLC v. AV Auto, LLC, 823 F.3d 607, 613, 118 USPQ2d 1744, 1748 (Fed. Cir. 2016) (computer server and telephone unit). Similarly, “claiming the improved speed or efficiency inherent with applying the abstract idea on a computer” does not integrate a judicial exception into a practical application or provide an inventive concept. Intellectual Ventures I LLC v. Capital One Bank (USA), 792 F.3d 1363, 1367, 115 USPQ2d 1636, 1639 (Fed. Cir. 2015). In contrast, a claim that purports to improve computer capabilities or to improve an existing technology may integrate a judicial exception into a practical application or provide significantly more. McRO, Inc. v. Bandai Namco Games Am. Inc., 837 F.3d 1299, 1314-15, 120 USPQ2d 1091, 1101-02 (Fed. Cir. 2016); Enfish, LLC v. Microsoft Corp., 822 F.3d 1327, 1335-36, 118 USPQ2d 1684, 1688-89 (Fed. Cir. 2016). See MPEP §§ 2106.04(d)(1) and 2106.05(a) for a discussion of improvements to the functioning of a computer or to another technology or technical field.” See MPEP § 2106.05(f). Therefore, as recited, the recitations of computer hardware do not integrate a judicial exception into a practical application or provide significantly more.

Argument 4:
	“Therefore, with respect to all claims, a rejection under 35 U.S.C. § 101 would be improper in view of 1) the “2019 Revised Patent Subject Matter Eligibility Guidance”, the “October 2019 Update: Subject Matter Eligibility”, and the recently USPTO published “2024 Patent Subject Matter Eligibility Guidance Update Including on Artificial Intelligence” (collectively referred to as the Patent Eligibility Guidance, or PEG), 2) MPEP 2106.04(a)(2)(II), and 3) USPTO Memo: Reminders on evaluating subject matter eligibility of claims under 35 U.S.C. 101, because any alleged judicial exception is integrated into a practical application in that the combination of elements reflect an improvement in the functioning of a computer.” (Remarks, pages 20-21).

Examiner’s Response:
	Examiner respectfully disagrees. The improvement to the managing of computer incidents and events is an improvement in an abstract idea but, unlike the cited examples, is not an improvement in the functioning of a computer, as a computer. The MPEP states “2. Performing a mental process in a computer environment. An example of a case identifying a mental process performed in a computer environment as an abstract idea is…FairWarning IP, LLC v. Iatric Sys., Inc., 839 F.3d 1089, 120 USPQ2d 1293 (Fed. Cir. 2016). The patentee in FairWarning claimed a system and method of detecting fraud and/or misuse in a computer environment, in which information regarding accesses of a patient’s personal health information was analyzed according to one of several rules (i.e., related to accesses in excess of a specific volume, accesses during a pre-determined time interval, or accesses by a specific user) to determine if the activity indicates improper access. 839 F.3d. at 1092, 120 USPQ2d at 1294. The court determined that these claims were directed to a mental process of detecting misuse, and that the claimed rules here were “the same questions (though perhaps phrased with different words) that humans in analogous situations detecting fraud have asked for decades, if not centuries.” 839 F.3d. at 1094-95, 120 USPQ2d at 1296”. See MPEP § 2106.04(a)(2)(III)(C).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JACOB Z SUSSMAN MOSS whose telephone number is (571) 272-1579. The examiner can normally be reached Monday - Friday, 9 a.m. - 5 p.m. ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki can be reached on (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/J.S.M./Examiner, Art Unit 2122  

/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122