Prosecution Insights
Last updated: April 19, 2026
Application No. 17/891,981

SELF-SUPERVISED FRAMEWORK FOR GRAPH REPRESENTATION LEARNING

Status: Final Rejection (§103)
Filed: Aug 19, 2022
Examiner: NYE, LOUIS CHRISTOPHER
Art Unit: 2141
Tech Center: 2100 — Computer Architecture & Software
Assignee: Feedzai - Consultadoria E Inovação Tecnológica S.A.
OA Round: 2 (Final)

Grant Probability: 22% (At Risk)
Projected OA Rounds: 3-4
Projected Time to Grant: 3y 2m
Grant Probability With Interview: 58%

Examiner Intelligence

Career Allow Rate: 22% (2 granted / 9 resolved; -32.8% vs TC avg)
Interview Lift: +35.7% among resolved cases with interview
Avg Prosecution: 3y 2m typical timeline; 27 applications currently pending
Total Applications: 36 across all art units
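The headline examiner figures above are simple ratios over resolved cases. A minimal sketch, assuming the straightforward derivation (function names and rounding are illustrative, not the tool's actual schema):

```python
# Hedged sketch: reproducing the examiner metrics above from resolved-case
# counts. The formulas are assumptions about how the dashboard is computed.

def allow_rate_pct(granted: int, resolved: int) -> float:
    """Career allowance rate as a percentage of resolved cases."""
    return 100.0 * granted / resolved

def interview_lift_pts(rate_with_interview: float, rate_without: float) -> float:
    """Percentage-point lift in allowance for cases that had an interview."""
    return rate_with_interview - rate_without

# 2 granted out of 9 resolved cases -> the reported "22%" career allow rate.
print(round(allow_rate_pct(2, 9)))  # 22
```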

Statute-Specific Performance

§101: 38.3% (-1.7% vs TC avg)
§103: 50.0% (+10.0% vs TC avg)
§102: 7.8% (-32.2% vs TC avg)
§112: 3.9% (-36.1% vs TC avg)
Black line = Tech Center average estimate • Based on career data from 9 resolved cases
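As a consistency check on the chart data, subtracting each reported delta from its statute-specific rate should recover the same Tech Center average estimate (the black line). The dict literal below just restates the figures above:

```python
# Consistency check: rate - delta should give the same implied TC average
# for every statute, since each delta is reported "vs TC avg".
statute_stats = {
    "101": (38.3, -1.7),
    "103": (50.0, +10.0),
    "102": (7.8, -32.2),
    "112": (3.9, -36.1),
}

implied_tc_avg = {s: round(rate - delta, 1)
                  for s, (rate, delta) in statute_stats.items()}
print(implied_tc_avg)  # every statute implies the same 40.0% TC average
```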

Office Action

§103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Claims 1-5, 7-8, 12-14, and 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over Shekhar et al. (US Pub. No. 2021/0233080, published July 2021, hereinafter “Shekhar”) in view of Verma et al. (US Pub. No. 2023/0289610, effective filing date of March 2022, hereinafter “Verma”), and further in view of Wadhwa et al. (US Pub. No. 2022/0020026, published Jan. 2022, hereinafter “Wadhwa”).

Regarding claim 1, Shekhar teaches a method, comprising:

receiving entity data for a plurality of entities (Shekhar, [0030] – “the term “identity attribute” refers to data associated with a digital identity. In particular, an identity attribute can refer to a characteristic of a digital identity, an activity performed in association with the digital identity, and/or details related to the activity” – teaches receiving entity data (identity attributes) for a plurality of entities);

receiving transaction data for transactions between corresponding entities included in the plurality of entities (Shekhar, [0016] – “the fraudulent transaction detection system identifies a plurality of digital identities corresponding to a plurality of digital transactions.” – teaches receiving transaction data between corresponding entities included in the plurality of entities (plurality of digital identities corresponding to digital transactions));

performing a self-supervised training of a graph neural network including by sampling the heterogeneous graph representation for positive samples and negative samples to learn embedding representations for the nodes of the heterogeneous graph representation (Shekhar, [0016] – “the time-dependent graph convolutional neural network can generate, based on the edge connections of the transaction graph, node embeddings corresponding to the plurality of nodes utilizing a plurality of temporal random walks” – teaches self-supervised training of a graph neural network by learning embedding representations for the nodes of the heterogeneous graph representation, and in [0104] – “The researchers followed a similar strategy for training and testing each model. For example, the researchers utilized 75% of the edges from the datasets to learn the embeddings. Further, the researchers used 25% of the edges as positive link examples and sampled an equal number of negative links. For the edge representation, the researchers used the Hadamard distance between the connecting nodes.” – teaches using positive and negative samples of the embeddings to train the model); and

utilizing the learned embedding representations for the nodes of the heterogeneous graph representation for automatic transaction analysis (Shekhar, [0016] – “The fraudulent transaction detection system can determine that a digital identity from the pair of digital identities corresponds to a fraudulent transaction based on the similarity probability” – teaches utilizing the learned embedding representations for the nodes of the heterogeneous graph representation for automatic transaction analysis (determines fraudulent transactions using a similarity probability based on the embeddings)).
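The training objective the rejection reads onto Shekhar [0104] can be sketched minimally: reserve some edges as positive link examples, sample an equal number of absent edges as negatives, and score each pair from its node embeddings. The toy graph, the random embedding table standing in for GNN output, and the dot-product score are illustrative assumptions, not taken from the reference (Shekhar combines node embeddings via a Hadamard-based edge representation instead):

```python
import random

# Hedged sketch of positive/negative edge sampling for self-supervised
# link-prediction training. All sizes and the toy ring graph are illustrative.
random.seed(0)
num_nodes, dim = 50, 8
# Stand-in for embeddings a graph neural network would learn.
emb = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_nodes)]

edges = [(i, (i + 1) % num_nodes) for i in range(num_nodes)]  # toy ring graph
edge_set = set(edges)
positives = edges[: len(edges) // 2]  # cf. "25% of the edges as positive link examples"

negatives = []
while len(negatives) < len(positives):  # "sampled an equal number of negative links"
    u, v = random.randrange(num_nodes), random.randrange(num_nodes)
    if u != v and (u, v) not in edge_set:
        negatives.append((u, v))

def score(u: int, v: int) -> float:
    # Edge score from the two connecting-node embeddings; a dot product is a
    # simple stand-in for Shekhar's Hadamard-based edge representation.
    return sum(a * b for a, b in zip(emb[u], emb[v]))

pos_scores = [score(u, v) for u, v in positives]
neg_scores = [score(u, v) for u, v in negatives]
# A contrastive training loss would push positive scores above negative ones.
```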
Shekhar fails to explicitly teach generating a heterogeneous graph representation, wherein nodes of the heterogeneous graph representation include a first type of node representing an entity of the plurality of entities and a second type of node representing the transactions.

However, analogous to the field of the claimed invention, Verma teaches:

generating a heterogeneous graph representation, wherein nodes of the heterogeneous graph representation include a first type of node representing an entity of the plurality of entities and a second type of node representing the transactions (Verma, [0032] – “in a bipartite graph, edges may represent relationships or interactions between entities belonging to a first set of nodes (e.g., users, cardholders, etc.) and entities belonging to a second set of nodes (e.g., movies, merchants, etc.). In addition, no relationship may exist between any two entities that belong to the same set of nodes. The entities that belong to a first set of nodes may be completely different from the entities that belong to a second set of nodes in the bipartite graph” and in [0121] – “The processor 206 is configured to calculate the second loss value that enables learning representations for different node types (i.e., the first nodes and the second nodes) independently and simultaneously as:” – teaches generating a heterogeneous graph representation wherein nodes of the representation include a first type of node representing an entity (first set of nodes may be users, cardholders, etc.) and a second type of node representing the transactions (second set of nodes may be movies, merchants, etc.));

determining, for at least one embedding representation including a node of the first type and a node of the second type, a likelihood of an edge existing between the node of the first type and the node of the second type (Verma, [0174] – “Link Prediction: To evaluate the performance of the BipGNN model for the link prediction task, the same three public datasets are used that were used for the node regression task. Here, the aim should predict whether a link between two nodes exists or not. For the AC dataset, a link between a user and CD denotes that the user has given a rating to a CD Similarly, for AM and ML datasets, a link exists when a user has given a rating to a movie. A random split strategy (90:10) is followed to split the links between the nodes for each dataset. In addition, the training links are used to learn the node embeddings, and further, these node embeddings along with the self-node features are used to train FFN for the binary classification task.” – teaches determining, for at least one embedding representation including a node of the first type and a node of the second type, a likelihood of an edge existing between the node of the first type and the node of the second type (uses bipartite GNN model for link prediction, to predict whether a link between nodes of a first and second type exists; teaches first and second type nodes in [0032])).

Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the different nodes of Verma into the transaction data, self-supervised training, and utilization of learned embeddings for automatic transaction analysis of Shekhar in order to create a method for receiving transaction data, creating a heterogeneous graph based on the transaction data, learning embeddings based on the nodes of the heterogeneous graph, and utilizing the embeddings to perform automatic transaction analysis. Doing so would learn node representations for bipartite graphs that explicitly capture information of neighboring nodes and implicitly capture self-node features (Verma, [0034]).

The combination of Shekhar and Verma fails to explicitly teach determining an anomaly score for at least one transaction of the transactions based at least in part on the determined likelihood of the edge existing between the node of the first type and the node of the second type, and flagging the at least one transaction of the transactions in response to a determination that the determined anomaly score exceeds a threshold.

However, analogous to the field of the claimed invention, Wadhwa teaches:

determining, for at least one embedding representation including a node of the first type and a node of the second type, a likelihood of an edge existing between the node of the first type and the node of the second type (Wadhwa, [0030] – “In one embodiment, the server system is configured to apply machine learning algorithms over the graph embedding vector for training a data model to facilitate prediction of missing links in the temporal knowledge graph. The missing links may be related to money laundering financial transactions.” and in [0058] – “In one example scenario, a party ‘X’ transfers $1000 to a party ‘Y’ who is a nephew of the party ‘X’. In the above example scenario, the temporal knowledge graph has two nodes depicting the party ‘X’ (i.e., source node) and the party ‘Y’ (i.e., destination node)” – teaches determining, for at least one embedding representation including a node of the first type and a node of the second type (applies ML algorithms over a graph embedding vector to facilitate prediction, or likelihood estimation, of missing links, or edges, between nodes; teaches nodes of two types: source node and destination node));

determining an anomaly score for at least one transaction of the transactions based at least in part on the determined likelihood of the edge existing between the node of the first type and the node of the second type (Wadhwa, [0071] – “The prediction engine 220 is configured to determine time-based probabilities associated with the flagged cluster. The time-based probabilities may include, but not limited to, a time-based probability of next edge formation within the flagged cluster, a time-based probability of next edge formation outside the flagged cluster with a nearby cluster. In one embodiment, the time-based probability of the next edge formation within the flagged cluster is determined by constructing a Long Short Term Memory (LSTM) network for the flagged cluster using the trained data model. In one embodiment, the time-based probability of next edge formation outside the flagged cluster with the nearby cluster is determined by generating a convolution network. These time-based probabilities are used to detect nodes/groups/transactions that might lead to the money laundering financial transaction.” and in [0074] – “In one embodiment, the processor 202 is configured to update fraud score of the flagged cluster and the particular node based on the time-based probabilities.” – teaches determining an anomaly score (time-based probability, node fraud score, and cluster fraud score are all anomaly scores, wherein the fraud scores are based at least in part on the time-based probability score) for at least one transaction of the transactions based at least in part on the determined likelihood of the edge existing between the node of the first type and the node of the second type (fraud scores updated based at least in part on time-based probabilities, which are the likelihood estimations of an edge formation between the node of the first type (source node) and the node of the second type (destination node))); and

flagging the at least one transaction of the transactions in response to a determination that the determined anomaly score exceeds a threshold (Wadhwa, [0072] – “In one embodiment, if the time-based probability of the next edge formation leading to a source node is greater than a predetermined threshold value, the prediction engine 220 identifies an issuer associated with a particular node (i.e., a trailing node) related to the next edge (i.e., link) which may be linked in future money-laundering activities. The source node refers to a node from where all the financial transactions were initiated previously.” and in [0075] – “Additionally, the processor 202 is configured to generate a suspicious activity report (SAR) file and alert the identified issuer 102 for preventing fraudulent financial transactions based on the SAR file. The SAR file may include, but not limited to, a cluster fraud score, a node fraud score, and a prediction probability associated with the next financial transaction being the money laundering financial transaction.” – teaches flagging the at least one transaction of the transactions in response to a determination that the determined anomaly score exceeds a threshold (when the time-based probability exceeds the threshold value, fraud scores are updated and an alert file is generated to flag the transaction, cluster, and node)).

Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the link prediction and anomaly scores of Wadhwa into the first and second type nodes and automatic transaction analysis of Shekhar and Verma in order to generate anomaly scores based on link predictions between the first and second type nodes. Doing so would provide an automated system for predicting next financial transactions of suspicious customers in near real-time, which can be used to take pre-emptive action and help in enriching the SAR file for AML systems (Wadhwa, [0034]).

Regarding claim 19, Shekhar teaches a system, comprising: a processor configured to:

receive entity data for a plurality of entities (Shekhar, [0030] – “the term “identity attribute” refers to data associated with a digital identity. In particular, an identity attribute can refer to a characteristic of a digital identity, an activity performed in association with the digital identity, and/or details related to the activity” – teaches receiving entity data (identity attributes) for a plurality of entities);

receive transaction data for transactions between corresponding entities included in the plurality of entities (Shekhar, [0016] – “the fraudulent transaction detection system identifies a plurality of digital identities corresponding to a plurality of digital transactions.” – teaches receiving transaction data between corresponding entities included in the plurality of entities (plurality of digital identities corresponding to digital transactions));

perform a self-supervised training of a graph neural network including by sampling the heterogeneous graph representation for positive samples and negative samples to learn embedding representations for the nodes of the heterogeneous graph representation (Shekhar, [0016] – “the time-dependent graph convolutional neural network can generate, based on the edge connections of the transaction graph, node embeddings corresponding to the plurality of nodes utilizing a plurality of temporal random walks” – teaches self-supervised training of a graph neural network by learning embedding representations for the nodes of the heterogeneous graph representation, and in [0104] – “The researchers followed a similar strategy for training and testing each model. For example, the researchers utilized 75% of the edges from the datasets to learn the embeddings. Further, the researchers used 25% of the edges as positive link examples and sampled an equal number of negative links. For the edge representation, the researchers used the Hadamard distance between the connecting nodes.” – teaches using positive and negative samples of the embeddings to train the model); and

utilize the learned embedding representations for the nodes of the heterogeneous graph representation for automatic transaction analysis (Shekhar, [0016] – “The fraudulent transaction detection system can determine that a digital identity from the pair of digital identities corresponds to a fraudulent transaction based on the similarity probability” – teaches utilizing the learned embedding representations for the nodes of the heterogeneous graph representation for automatic transaction analysis (determines fraudulent transactions using a similarity probability based on the embeddings)); and

a memory coupled to the processor and configured to provide the processor with instructions (Shekhar, [0140] – “The computing device 1000 includes memory 1004, which is coupled to the processor(s) 1002. The memory 1004 may be used for storing data, metadata, and programs for execution by the processor(s).” – teaches memory coupled to the processor, configured to provide the processor with instructions to be executed).

Shekhar fails to explicitly teach generating a heterogeneous graph representation, wherein nodes of the heterogeneous graph representation include a first type of node representing an entity of the plurality of entities and a second type of node representing the transactions.

However, analogous to the field of the claimed invention, Verma teaches:

generate a heterogeneous graph representation, wherein nodes of the heterogeneous graph representation include a first type of node representing an entity of the plurality of entities and a second type of node representing the transactions (Verma, [0032] – “in a bipartite graph, edges may represent relationships or interactions between entities belonging to a first set of nodes (e.g., users, cardholders, etc.) and entities belonging to a second set of nodes (e.g., movies, merchants, etc.). In addition, no relationship may exist between any two entities that belong to the same set of nodes. The entities that belong to a first set of nodes may be completely different from the entities that belong to a second set of nodes in the bipartite graph” and in [0121] – “The processor 206 is configured to calculate the second loss value that enables learning representations for different node types (i.e., the first nodes and the second nodes) independently and simultaneously as:” – teaches generating a heterogeneous graph representation wherein nodes of the representation include a first type of node representing an entity (first set of nodes may be users, cardholders, etc.) and a second type of node representing the transactions (second set of nodes may be movies, merchants, etc.));

determine, for at least one embedding representation including a node of the first type and a node of the second type, a likelihood of an edge existing between the node of the first type and the node of the second type (Verma, [0174] – “Link Prediction: To evaluate the performance of the BipGNN model for the link prediction task, the same three public datasets are used that were used for the node regression task. Here, the aim should predict whether a link between two nodes exists or not. For the AC dataset, a link between a user and CD denotes that the user has given a rating to a CD Similarly, for AM and ML datasets, a link exists when a user has given a rating to a movie. A random split strategy (90:10) is followed to split the links between the nodes for each dataset. In addition, the training links are used to learn the node embeddings, and further, these node embeddings along with the self-node features are used to train FFN for the binary classification task.” – teaches determining, for at least one embedding representation including a node of the first type and a node of the second type, a likelihood of an edge existing between the node of the first type and the node of the second type (uses bipartite GNN model for link prediction, to predict whether a link between nodes of a first and second type exists; teaches first and second type nodes in [0032])).

Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the different nodes of Verma into the transaction data, self-supervised training, and utilization of learned embeddings for automatic transaction analysis of Shekhar in order to create a method for receiving transaction data, creating a heterogeneous graph based on the transaction data, learning embeddings based on the nodes of the heterogeneous graph, and utilizing the embeddings to perform automatic transaction analysis. Doing so would learn node representations for bipartite graphs that explicitly capture information of neighboring nodes and implicitly capture self-node features (Verma, [0034]).

The combination of Shekhar and Verma fails to explicitly teach determining an anomaly score for at least one transaction of the transactions based at least in part on the determined likelihood of the edge existing between the node of the first type and the node of the second type, and flagging the at least one transaction of the transactions in response to a determination that the determined anomaly score exceeds a threshold.
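The disputed limitation reduces to a simple decision rule: convert a predicted entity-transaction edge likelihood into an anomaly score, then flag the transaction when that score exceeds a threshold. A minimal sketch, where the scoring rule (1 − likelihood) and the 0.8 threshold are illustrative assumptions rather than mechanics drawn from any cited reference:

```python
# Hedged sketch: anomaly scoring and threshold-based flagging from
# entity-transaction edge likelihoods. Rule and threshold are illustrative.

def anomaly_score(edge_likelihood: float) -> float:
    # An entity-transaction edge the model finds unlikely is treated as anomalous.
    return 1.0 - edge_likelihood

def flag_transactions(edge_likelihoods: dict[str, float],
                      threshold: float = 0.8) -> list[str]:
    """Return ids of transactions whose anomaly score exceeds the threshold."""
    return [txn for txn, p in edge_likelihoods.items()
            if anomaly_score(p) > threshold]

print(flag_transactions({"txn-1": 0.95, "txn-2": 0.05, "txn-3": 0.40}))  # ['txn-2']
```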
However, analogous to the field of the claimed invention, Wadhwa teaches:

determine, for at least one embedding representation including a node of the first type and a node of the second type, a likelihood of an edge existing between the node of the first type and the node of the second type (Wadhwa, [0030] – “In one embodiment, the server system is configured to apply machine learning algorithms over the graph embedding vector for training a data model to facilitate prediction of missing links in the temporal knowledge graph. The missing links may be related to money laundering financial transactions.” and in [0058] – “In one example scenario, a party ‘X’ transfers $1000 to a party ‘Y’ who is a nephew of the party ‘X’. In the above example scenario, the temporal knowledge graph has two nodes depicting the party ‘X’ (i.e., source node) and the party ‘Y’ (i.e., destination node)” – teaches determining, for at least one embedding representation including a node of the first type and a node of the second type (applies ML algorithms over a graph embedding vector to facilitate prediction, or likelihood estimation, of missing links, or edges, between nodes; teaches nodes of two types: source node and destination node));

determine an anomaly score for at least one transaction of the transactions based at least in part on the determined likelihood of the edge existing between the node of the first type and the node of the second type (Wadhwa, [0071] – “The prediction engine 220 is configured to determine time-based probabilities associated with the flagged cluster. The time-based probabilities may include, but not limited to, a time-based probability of next edge formation within the flagged cluster, a time-based probability of next edge formation outside the flagged cluster with a nearby cluster. In one embodiment, the time-based probability of the next edge formation within the flagged cluster is determined by constructing a Long Short Term Memory (LSTM) network for the flagged cluster using the trained data model. In one embodiment, the time-based probability of next edge formation outside the flagged cluster with the nearby cluster is determined by generating a convolution network. These time-based probabilities are used to detect nodes/groups/transactions that might lead to the money laundering financial transaction.” and in [0074] – “In one embodiment, the processor 202 is configured to update fraud score of the flagged cluster and the particular node based on the time-based probabilities.” – teaches determining an anomaly score (time-based probability, node fraud score, and cluster fraud score are all anomaly scores, wherein the fraud scores are based at least in part on the time-based probability score) for at least one transaction of the transactions based at least in part on the determined likelihood of the edge existing between the node of the first type and the node of the second type (fraud scores updated based at least in part on time-based probabilities, which are the likelihood estimations of an edge formation between the node of the first type (source node) and the node of the second type (destination node))); and

flag the at least one transaction of the transactions in response to a determination that the determined anomaly score exceeds a threshold (Wadhwa, [0072] – “In one embodiment, if the time-based probability of the next edge formation leading to a source node is greater than a predetermined threshold value, the prediction engine 220 identifies an issuer associated with a particular node (i.e., a trailing node) related to the next edge (i.e., link) which may be linked in future money-laundering activities. The source node refers to a node from where all the financial transactions were initiated previously.” and in [0075] – “Additionally, the processor 202 is configured to generate a suspicious activity report (SAR) file and alert the identified issuer 102 for preventing fraudulent financial transactions based on the SAR file. The SAR file may include, but not limited to, a cluster fraud score, a node fraud score, and a prediction probability associated with the next financial transaction being the money laundering financial transaction.” – teaches flagging the at least one transaction of the transactions in response to a determination that the determined anomaly score exceeds a threshold (when the time-based probability exceeds the threshold value, fraud scores are updated and an alert file is generated to flag the transaction, cluster, and node)).

Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the link prediction and anomaly scores of Wadhwa into the first and second type nodes and automatic transaction analysis of Shekhar and Verma in order to generate anomaly scores based on link predictions between the first and second type nodes. Doing so would provide an automated system for predicting next financial transactions of suspicious customers in near real-time, which can be used to take pre-emptive action and help in enriching the SAR file for AML systems (Wadhwa, [0034]).

Regarding claim 20, Shekhar teaches a computer program product embodied in a non-transitory computer readable medium and comprising computer instructions for:

receiving entity data for a plurality of entities (Shekhar, [0030] – “the term “identity attribute” refers to data associated with a digital identity. In particular, an identity attribute can refer to a characteristic of a digital identity, an activity performed in association with the digital identity, and/or details related to the activity” – teaches receiving entity data (identity attributes) for a plurality of entities);

receiving transaction data for transactions between corresponding entities included in the plurality of entities (Shekhar, [0016] – “the fraudulent transaction detection system identifies a plurality of digital identities corresponding to a plurality of digital transactions.” – teaches receiving transaction data between corresponding entities included in the plurality of entities (plurality of digital identities corresponding to digital transactions));

performing a self-supervised training of a graph neural network including by sampling the heterogeneous graph representation for positive samples and negative samples to learn embedding representations for the nodes of the heterogeneous graph representation (Shekhar, [0016] – “the time-dependent graph convolutional neural network can generate, based on the edge connections of the transaction graph, node embeddings corresponding to the plurality of nodes utilizing a plurality of temporal random walks” – teaches self-supervised training of a graph neural network by learning embedding representations for the nodes of the heterogeneous graph representation, and in [0104] – “The researchers followed a similar strategy for training and testing each model. For example, the researchers utilized 75% of the edges from the datasets to learn the embeddings. Further, the researchers used 25% of the edges as positive link examples and sampled an equal number of negative links. For the edge representation, the researchers used the Hadamard distance between the connecting nodes.” – teaches using positive and negative samples of the embeddings to train the model); and

utilizing the learned embedding representations for the nodes of the heterogeneous graph representation for automatic transaction analysis (Shekhar, [0016] – “The fraudulent transaction detection system can determine that a digital identity from the pair of digital identities corresponds to a fraudulent transaction based on the similarity probability” – teaches utilizing the learned embedding representations for the nodes of the heterogeneous graph representation for automatic transaction analysis (determines fraudulent transactions using a similarity probability based on the embeddings)).

Shekhar fails to explicitly teach generating a heterogeneous graph representation, wherein nodes of the heterogeneous graph representation include a first type of node representing an entity of the plurality of entities and a second type of node representing the transactions.

However, analogous to the field of the claimed invention, Verma teaches:

generating a heterogeneous graph representation, wherein nodes of the heterogeneous graph representation include a first type of node representing an entity of the plurality of entities and a second type of node representing the transactions (Verma, [0032] – “in a bipartite graph, edges may represent relationships or interactions between entities belonging to a first set of nodes (e.g., users, cardholders, etc.) and entities belonging to a second set of nodes (e.g., movies, merchants, etc.). In addition, no relationship may exist between any two entities that belong to the same set of nodes. The entities that belong to a first set of nodes may be completely different from the entities that belong to a second set of nodes in the bipartite graph” and in [0121] – “The processor 206 is configured to calculate the second loss value that enables learning representations for different node types (i.e., the first nodes and the second nodes) independently and simultaneously as:” – teaches generating a heterogeneous graph representation wherein nodes of the representation include a first type of node representing an entity (first set of nodes may be users, cardholders, etc.) and a second type of node representing the transactions (second set of nodes may be movies, merchants, etc.));

determining, for at least one embedding representation including a node of the first type and a node of the second type, a likelihood of an edge existing between the node of the first type and the node of the second type (Verma, [0174] – “Link Prediction: To evaluate the performance of the BipGNN model for the link prediction task, the same three public datasets are used that were used for the node regression task. Here, the aim should predict whether a link between two nodes exists or not. For the AC dataset, a link between a user and CD denotes that the user has given a rating to a CD Similarly, for AM and ML datasets, a link exists when a user has given a rating to a movie. A random split strategy (90:10) is followed to split the links between the nodes for each dataset. In addition, the training links are used to learn the node embeddings, and further, these node embeddings along with the self-node features are used to train FFN for the binary classification task.” – teaches determining, for at least one embedding representation including a node of the first type and a node of the second type, a likelihood of an edge existing between the node of the first type and the node of the second type (uses bipartite GNN model for link prediction, to predict whether a link between nodes of a first and second type exists; teaches first and second type nodes in [0032])).

Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the different nodes of Verma into the transaction data, self-supervised training, and utilization of learned embeddings for automatic transaction analysis of Shekhar in order to create a method for receiving transaction data, creating a heterogeneous graph based on the transaction data, learning embeddings based on the nodes of the heterogeneous graph, and utilizing the embeddings to perform automatic transaction analysis. Doing so would learn node representations for bipartite graphs that explicitly capture information of neighboring nodes and implicitly capture self-node features (Verma, [0034]).

The combination of Shekhar and Verma fails to explicitly teach determining an anomaly score for at least one transaction of the transactions based at least in part on the determined likelihood of the edge existing between the node of the first type and the node of the second type, and flagging the at least one transaction of the transactions in response to a determination that the determined anomaly score exceeds a threshold.

However, analogous to the field of the claimed invention, Wadhwa teaches:

determining, for at least one embedding representation including a node of the first type and a node of the second type, a likelihood of an edge existing between the node of the first type and the node of the second type (Wadhwa, [0030] – “In one embodiment, the server system is configured to apply machine learning algorithms over the graph embedding vector for training a data model to facilitate prediction of missing links in the temporal knowledge graph. The missing links may be related to money laundering financial transactions.” and in [0058] – “In one example scenario, a party ‘X’ transfers $1000 to a party ‘Y’ who is a nephew of the party ‘X’. In the above example scenario, the temporal knowledge graph has two nodes depicting the party ‘X’ (i.e., source node) and the party ‘Y’ (i.e., destination node)” – teaches determining, for at least one embedding representation including a node of the first type and a node of the second type (applies ML algorithms over a graph embedding vector to facilitate prediction, or likelihood estimation, of missing links, or edges, between nodes; teaches nodes of two types: source node and destination node));

determining an anomaly score for at least one transaction of the transactions based at least in part on the determined likelihood of the edge existing between the node of the first type and the node of the second type (Wadhwa, [0071] – “The prediction engine 220 is configured to determine time-based probabilities associated with the flagged cluster. The time-based probabilities may include, but not limited to, a time-based probability of next edge formation within the flagged cluster, a time-based probability of next edge formation outside the flagged cluster with a nearby cluster. In one embodiment, the time-based probability of the next edge formation within the flagged cluster is determined by constructing a Long Short Term Memory (LSTM) network for the flagged cluster using the trained data model. In one embodiment, the time-based probability of next edge formation outside the flagged cluster with the nearby cluster is determined by generating a convolution network. These time-based probabilities are used to detect nodes/groups/transactions that might lead to the money laundering financial transaction.” and in [0074] – “In one embodiment, the processor 202 is configured to update fraud score of the flagged cluster and the particular node based on the time-based probabilities.” – teaches determining an anomaly score (time-based probability, node fraud score, and cluster fraud score are all anomaly scores, wherein the fraud scores are based at least in part on the time-based probability score) for at least one transaction of the transactions based at least in part on the determined likelihood of the edge existing between the node of the first type and the node of the second type (fraud scores updated based at least in part on time-based probabilities, which are the likelihood estimations of an edge formation between the node of the first type (source node) and the node of the second type (destination node))); and

flagging the at least one transaction of the transactions in response to a determination that the determined anomaly score exceeds a threshold (Wadhwa, [0072] – “In one embodiment, if the time-based probability of the next edge formation leading to a source node is greater than a predetermined threshold value, the prediction engine 220 identifies an issuer associated with a particular node (i.e., a trailing node) related to the next edge (i.e., link) which may be linked in future money-laundering activities.
The source node refers to a node from where all the financial transactions were initiated previously.” and in [0075] – “Additionally, the processor 202 is configured to generate a suspicious activity report (SAR) file and alert the identified issuer 102 for preventing fraudulent financial transactions based on the SAR file. The SAR file may include, but not limited to, a cluster fraud score, a node fraud score, and a prediction probability associated with the next financial transaction being the money laundering financial transaction.” – teaches utilizing the learned embedding representations for automatic transaction analysis including by flagging the at least one transaction of the transactions in response to a determination that the determined anomaly score exceeds a threshold (when time-based probability exceeds threshold value fraud scores are updated and an alert file is generated to flag the transaction, cluster, and node)). Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the link prediction and anomaly scores of Wadhwa to the first and second type nodes of Shekhar and Verma in order to generate anomaly scores based on link predictions between the first and second type nodes. Doing so would provide an automated system for predicting next financial transactions of suspicious customers in near real-time which can be used to take pre-emptive action and help in enriching the SAR file for AML systems (Wadhwa, [0034]). Regarding claim 2, the combination of Shekhar, Verma, and Wadhwa teach the method of claim 1, further comprising using at least a portion of the embedding representations to cluster at least a subset of the transactions into a plurality of different cluster groups (Shekhar, [0032] – “The server classifies the plurality of nodes in the graph into a set of clusters based on a degree of similarity among the plurality of nodes. 
In other words, those nodes that are similar to each other are classified in the same cluster” – teaches using at least a portion of the embedding representations to cluster at least a subset of the transactions into a plurality of different cluster groups). Regarding claim 3, the combination of Shekhar, Verma, and Wadhwa teach the method of claim 1, wherein performing the self-supervised training of the graph neural network includes using at least one of: an edge prediction task, a transaction similarity task, or a subgraph similarity task (Shekhar, [0092] – “the fraudulent transaction detection system 106 trains a time-dependent graph convolutional neural network to generate similarity probabilities for pairs of digital identities” – teaches wherein performing the self-supervised training of the graph neural network includes using at least one of a transaction similarity or subgraph similarity task (trains a neural network to generate similarity probabilities for pairs of digital identities)). Regarding claim 4, the combination of Shekhar, Verma, and Wadhwa teach the method of claim 1, further comprising: predicting an anomaly based at least on the embedding representations (Shekhar, [0047] – “The fraudulent transaction detection system 106 can further utilize the time-dependent graph convolutional neural network to generate a similarity probability (e.g., a similarity score) for a pair of digital identities based on the node embeddings. Via the server(s) 102, the fraudulent transaction detection system 106 can determine that a digital identity from the pair of digital identities corresponds to a fraudulent transaction based on the similarity probability” – teaches predicting an anomaly based at least on the embedding representations (teaches predicting an anomaly based on a similarity score based on the node embeddings)). 
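As context for the similarity-based anomaly mapping in claims 3-4: Shekhar is cited for generating similarity probabilities for pairs of digital identities from node embeddings. A minimal sketch of embedding-pair similarity used as an anomaly signal follows (illustrative only; the cosine form and the rescaling are editorial assumptions):

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity between two node embeddings."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def pair_anomaly_score(u, v):
    """Rescale similarity from [-1, 1] to [0, 1]; a high score for two
    nominally distinct identities flags them as likely the same actor."""
    return (cosine_similarity(u, v) + 1.0) / 2.0
```

The same pairwise scores could also drive the clustering in claim 2 (nodes with high mutual similarity land in the same cluster).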
Regarding claim 5, the combination of Shekhar, Verma, and Wadhwa teach the method of claim 4, further comprising predicting the anomaly using a multilayer perceptron (MLP) (Shekhar, [0015] – “the graph convolutional neural network can predict whether two digital identities correspond to the same user and further determine whether one of the digital identities is associated with a fraudulent transaction” – teaches predicting the anomaly using a multilayer perceptron, as supported in [0095] – “Algorithm 2, presented below, provides a characterization of how the fraudulent transaction detection system 106 executes training of the time-dependent graph convolutional neural network in accordance with one or more embodiments” – teaches an algorithm for training the time-dependent graph convolutional neural network, where the algorithm includes feed-forward layers and backpropagation indicating the time-dependent graph convolutional neural network is a differentiable model predicting the anomaly). Regarding claim 7, the combination of Shekhar, Verma, and Wadhwa teach the method of claim 1, wherein the graph neural network is applied in one or more discrete, fixed snapshots containing transactions in time intervals (Shekhar, [0015] – “One or more embodiments described herein include a fraudulent transaction detection system that accurately determines whether a digital identity is associated with a fraudulent transaction utilizing time-based node embeddings generated by a graph convolutional neural network.” – teaches the graph neural network applied in one or more discrete, fixed snapshots (time-based node embeddings generated by a graph convolutional neural network), and in [0018] – “the fraudulent transaction detection system 106 associates a transaction timestamp for a digital transaction with the corresponding edge connection” – teaches snapshots containing transactions in time intervals (timestamp for digital transaction)). 
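For the "discrete, fixed snapshots" limitation of claim 7, bucketing timestamped transactions into fixed-width time intervals can be sketched as follows (an illustrative sketch; the `(timestamp, txn_id)` tuple format and the half-open interval semantics are editorial assumptions, not drawn from Shekhar):

```python
from collections import defaultdict

def snapshot_transactions(transactions, interval):
    """Bucket (timestamp, txn_id) pairs into discrete, fixed-width time
    snapshots; snapshot k covers timestamps in [k*interval, (k+1)*interval)."""
    snapshots = defaultdict(list)
    for ts, txn in transactions:
        snapshots[ts // interval].append(txn)
    return dict(snapshots)
```

A graph neural network would then be applied per snapshot, each snapshot containing only the transactions that fall in its interval.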
Regarding claim 8, the combination of Shekhar, Verma, and Wadhwa teach the method of claim 7. Shekhar teaches: embedding representations produced by a graph neural network (GNN) on each snapshot are sent as input to a recurrent neural network (RNN) (Shekhar, [0097] – “Upon generating the node embeddings, the fraudulent transaction detection system 106 can determine whether a digital identity corresponds to a fraudulent entity using a machine learning technique (e.g., a machine learning model). In particular, the fraudulent transaction detection system 106 can provide the node embeddings generated by the time-dependent graph convolutional neural network 404 to a trained machine learning model.” – teaches sending embedding representations produced by a graph neural network on each snapshot to a machine learning model). Shekhar fails to explicitly teach wherein a sliding window is applied such that each snapshot is offset from a previous snapshot by a time interval of the sliding window and a recurrent neural network (RNN) that combines the embedding representations on each snapshot s with a per-customer hidden state, maintained across snapshots. However, analogous to the field of the claimed invention, Verma teaches: wherein a sliding window is applied such that each snapshot is offset from a previous snapshot by a time interval of the sliding window (Verma, [0128] – “the TPP model 504 may utilize the actual time (real-time) of the next event for predicting the corresponding marker. In one embodiment, the TPP model 504 demonstrates high applicability in scenarios where the marker (e.g., fraudulent, or non-fraudulent transaction) is computed or derived after some time as opposed to being simultaneously available with the event occurrence. As shown in FIG. 5A, there is a sequence of events denoted by their time of occurrence and corresponding markers. Mathematically, each sequence is represented by S={(t.sub.1, y.sub.1), (t.sub.2, y.sub.2), . . . 
, (t.sub.n, y.sub.n)}, where n refers to the total sequence length. Here, (t.sub.j, y.sub.j) refers to the j.sup.th event represented by the time of the event (t.sub.j) and the corresponding marker (y.sub.j). By default, the events are ordered in time, such that t.sub.j+1≥t.sub.j. Given the sequence of last n events, the task should predict the next event time t.sub.n+1 and the corresponding marker y.sub.n+1.” – teaches a sliding window applied such that each snapshot is offset from the previous snapshot by a time interval of the sliding window (payment transactions performed within a particular time interval)) a recurrent neural network (RNN) that combines the embedding representations on each snapshot s with a per-customer hidden state, maintained across snapshots (Verma, [0129] – “the TPP model 504 utilizes a recurrent neural network (RNN) as backbone architecture to learn the first embedding (i.e., the temporal embedding 506) based on analysis of past events. In general, RNN is a feed-forward neural network structure. In RNN, additional edges (also known as recurrent edges) are added such that the outputs from the hidden units at the current time step are fed into them again as future inputs at the next time step. In consequence, the same feed-forward neural network structure is replicated at each time step, and the recurrent edges connect the hidden units of the network replicated at adjacent time steps across time.” – teaches combining representations on each snapshot with a per-customer hidden state (recurrent edges connect hidden units replicated at adjacent time steps across time)). 
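The recurrence Verma is cited for — a hidden state carried across adjacent time steps — can be sketched for the claim 8 limitation as a per-customer state updated once per snapshot. This is an illustrative sketch under editorial assumptions (tanh cell, dense weights, one GNN embedding per snapshot), not the architecture of any cited reference.

```python
import numpy as np

def rnn_step(x, h, W_x, W_h, b):
    """One recurrent update: combine the snapshot embedding x with the
    per-customer hidden state h carried over from the previous snapshot."""
    return np.tanh(W_x @ x + W_h @ h + b)

def run_customer(snapshot_embs, W_x, W_h, b, h0):
    """Fold a customer's GNN embeddings, one per snapshot in time order,
    into a single hidden state maintained across snapshots."""
    h = h0
    for x in snapshot_embs:
        h = rnn_step(x, h, W_x, W_h, b)
    return h
```

The final hidden state summarizes the customer's behavior across the sliding-window snapshots.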
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the time intervals and recurrent neural network of Verma to further modify the method of Shekhar, Verma, and Wadhwa in order to provide embedding representations produced by a GNN on each snapshot to a RNN that combines representations on each snapshot with a per-customer hidden state, such that each snapshot is offset by a time interval. Doing so would capture the temporal embedding of the real-time sequence of the transaction and create an internal state of the network to memorize influence of past data samples (Verma, [0129]). Regarding claim 12, the combination of Shekhar, Verma, and Wadhwa teach the method of claim 1, wherein the first type of node has a respective set of learnable parameters and the second type of node has a respective set of learnable parameters different from the set of learnable parameters associated with the first type of node (Verma, [0121] – “The processor 206 is configured to calculate the second loss value that enables learning representations for different node types (i.e., the first nodes and the second nodes) independently and simultaneously as” – teaches learning representations for different node types independently and simultaneously, and in [0122] – “all the model parameters are learned via stochastic gradient descent with the Adam optimizer” – teaches wherein the different nodes have respective sets of learnable parameters (all parameters are learned via stochastic gradient descent, and the different node types are learned independently, thus their sets of learnable parameters are different from each other)). 
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the first and second types of nodes having different sets of learnable parameters of Verma to further modify the method of Shekhar, Verma, and Wadhwa in order to have different node types with different sets of learnable parameters. Doing so would preserve the structure of the bipartite graph and provide high mutual-information with self-node features (Verma, [0122]). Regarding claim 13, the combination of Shekhar, Verma, and Wadhwa teach the method of claim 1, wherein an attention coefficient between a first node and a second node defines a weight of a corresponding interaction between the first node and the second node (Verma, [0094] – “The attention mechanism facilitates the determination that how much weight or attention should be provided to the direct neighborhood embeddings and the skip neighborhood embeddings. In one embodiment, the sum of the direct neighborhood embeddings and the skip neighborhood embeddings is always 1. In this manner, the attention mechanism helps to determine or learn the importance of two different types of embeddings (i.e., the direct neighborhood embeddings and the skip neighborhood embeddings) coming from two different types of neighbors (i.e., the direct neighbor nodes and the skip neighbor nodes)” – teaches wherein an attention coefficient between a first and second node defines a weight of a corresponding interaction between the first and second node (attention mechanism determines how much weight or attention should be provided to embeddings of different types of nodes)). 
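The attention-coefficient behavior Verma is cited for — weights over two embedding types that sum to 1 — can be sketched with a softmax. This is an illustrative sketch; the softmax form and two-way combination are editorial assumptions, not Verma's disclosed mechanism.

```python
import numpy as np

def attention_coefficients(scores):
    """Normalize raw attention scores with a softmax so that the
    coefficients over the embedding types sum to 1."""
    e = np.exp(scores - np.max(scores))  # shift for numerical stability
    return e / e.sum()

def weighted_combination(direct_emb, skip_emb, scores):
    """Combine direct- and skip-neighborhood embeddings using learned
    attention coefficients; a[0] + a[1] == 1 by construction."""
    a = attention_coefficients(scores)
    return a[0] * direct_emb + a[1] * skip_emb
```

The coefficient attached to each pairwise interaction plays the role of the claimed weight between the first and second nodes.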
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the attention coefficient defining a weight of a corresponding interaction between the first and second nodes of Verma to further modify the method of Shekhar, Verma, and Wadhwa to define a weight of a corresponding interaction between the first and second nodes. Doing so would preserve mutual information between the comprehensive node embedding and the self-node features of the node and preserve the graph structure of the bipartite graph (Verma, [0096]). Regarding claim 14, the combination of Shekhar, Verma, and Wadhwa teach the method of claim 1, wherein the sampling of the heterogeneous graph representation includes uniform negative sampling (Shekhar, [0093] – “Accordingly, in one or more embodiments, the fraudulent transaction detection system 106 defines the loss function for a pair of node representations (z.sub.u, z.sub.ν) as follows where P.sub.n(u) is a negative sampling distribution for the out of context nodes for the node U:” – teaches sampling of the heterogeneous graph representation including negative sampling). Regarding claim 16, the combination of Shekhar, Verma, and Wadhwa teach the method of claim 1, wherein the embedding representations are based at least on a first layer of the graph neural network such that behavior divergence measures reflect a source entity's transactions (Shekhar, [0070] – “the node embeddings that make up the input to the first neural network layer 416 include the identity attributes of the digital identities corresponding to the nodes 408a-408c, 408e-408f. Thus, the time-dependent graph convolutional neural network 404 can generate the node embedding 414a for the node 408a based on the identity attributes of the plurality of digital identities represented in the transaction graph 402." 
– teaches wherein the embedding representations are based at least on a first layer of the graph neural network (node embedding 414a generated for node 408a at the first layer) such that behavior divergence measures reflect a source entity’s transactions (based on identity attributes of the plurality of digital identities in transaction graph, thus reflecting source entity’s transactions)). Regarding claim 17, the combination of Shekhar, Verma, and Wadhwa teach the method of claim 1, wherein the embedding representations are based at least on a second layer of the graph neural network such that behavior divergence measures reflect counterparts interacted with (Shekhar, [0020] – “the time-dependent graph convolutional neural network generates the prior neural network layer node embeddings based on the identity attributes of a set of nodes associated with the neighbor nodes” – teaches generating embedding representation based at least on a second layer (generates prior neural network layer node embeddings) of the graph neural network such that behavior divergence measures reflect counterparts interacted with (based on identity attributes of a set of nodes associated with neighbor nodes)). Regarding claim 18, the combination of Shekhar, Verma, and Wadhwa teach the method of claim 1, wherein the embedding representations are based at least on a deepest layer of the graph neural network such that a counterpart's transactions affect a representation of a source entity (Shekhar, [0096] – “As shown in Algorithm 2, the fraudulent transaction detection system 106 computes the temporal neighborhood of each node. The fraudulent transaction detection system 106 applies the K-layered convolution to generate the representation of the nodes. The fraudulent transaction detection system 106 passes the output of the final convolution layer through a fully connected layer to generate the final node embeddings z.sub.u,∀u∈V. 
Accordingly, the fraudulent transaction detection system 106 learns a set of parameters for the time-dependent graph convolutional neural network (Q.sup.(k), b.sub.q.sup.(k), W.sup.(k), b.sub.w.sup.(k)) along with the weights of the final fully connected layer.” – teaches wherein the embedding representations are based at least on a deepest layer of the graph neural network (passes output of final layer to generate final node embeddings) and in [0020] – “the time-dependent graph convolutional neural network generates the prior neural network layer node embeddings based on the identity attributes of a set of nodes associated with the neighbor nodes.” – teaches such that a counterpart’s (neighbor node) transactions affect a representation (layer node embedding based on attributes of neighboring nodes) of a source entity). Claim(s) 6 is/are rejected under 35 U.S.C. 103 as being unpatentable over Shekhar, Verma, and Wadhwa as applied to claims 1, 19, and 20 above, and further in view of Sun et al. (US Pub. No. 2023/0351215, effective filing date of September 2021, hereinafter “Sun”). Regarding claim 6, the combination of Shekhar, Verma, and Wadhwa teach the method of claim 4. The combination of Shekhar, Verma, and Wadhwa fails to explicitly teach wherein predicting the anomaly based at least on the embedding representations is based at least on a sigmoid of a weighted Hadamard product of a first embedding representation and a second embedding representation. However, analogous to the field of the claimed invention, Sun teaches: wherein predicting the anomaly based at least on the embedding representations is based at least on a sigmoid of a weighted Hadamard product of a first embedding representation and a second embedding representation (Sun, [0148] – “ In the above equation, W.sub.a.sup.l,W.sub.b.sup.l∈R.sup.(D.sup.l-1 x D.sup.l-1), a,b∈R.sup.D.sup.l-1 are the learnable parameters, σ.sub.glu is the sigmoid function, and .Math. 
is the hadamard product” – teaches a sigmoid of a weighted Hadamard product of a first embedding representation and a second embedding representation (representations a and b), and in [0192] – “For example, the analysis computer can compute a Hadamard product using two vectors: the first final vector representation of the first node and a second final vector representation of the second node (e.g., vectors corresponding to the latest snapshot). The Hadamard product can be used as a vector representing a potential link between the two node” and [0193] – “the analysis computer can send an advisory notice that a transaction is likely to take place, or that a current transaction being attempted was not likely to take place and may therefore be fraudulent” – teaches predicting the anomaly based at least on the embedding representation based at least on a sigmoid of a weighted Hadamard product of a first embedding representation and a second embedding representation). Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the predicting of the anomaly based at least in part on a sigmoid of a weighted Hadamard product of the first and second embeddings of Sun to the method of Shekhar, Verma, and Wadhwa. Doing so would capture the temporal evolution of dynamic graphs and provide for a more efficient method to learn temporal features (Sun, [0113]). Claim(s) 9-10 is/are rejected under 35 U.S.C. 103 as being unpatentable over Shekhar, Verma, and Wadhwa as applied to claims 1, 19, and 20 above, and further in view of Arora et al. (US Pub. No. 2022/0101327, published March 2022, hereinafter “Arora”). 
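Before turning to the Arora grounds, the claim 6 scoring function discussed above — a sigmoid over a weighted Hadamard (elementwise) product of two embedding representations — can be sketched directly. The weight vector form below is an editorial assumption for illustration, not Sun's exact parameterization.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hadamard_link_score(z_u, z_v, w):
    """Sigmoid of a weighted Hadamard product of two embedding
    representations: sigma(w . (z_u * z_v))."""
    return sigmoid(w @ (z_u * z_v))
```

With `z_u * z_v` as the elementwise product and `w` as learned weights, the output is a probability-like anomaly/link score in (0, 1).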
Regarding claim 9, the combination of Shekhar, Verma, and Wadhwa teach the method of claim 1, wherein performing the self-supervised training of the graph neural network includes performing message passing including by: combining the aggregated message with information associated with a source node (Verma, [0122] – “The learned embeddings are concatenated with the node's self-node features and can cater to one or more different domain agnostic downstream applications and/or tasks” – teaches combining the aggregated message with information associated with a source node (representations are concatenated with the node’s self-node features)). The combination of Shekhar, Verma, and Wadhwa fails to explicitly teach: computing representations by repeatedly sending messages along edges of a local neighborhood of a node; and aggregating the messages. However, analogous to the field of the claimed invention, Arora teaches: computing representations by repeatedly sending messages along edges of a local neighborhood of a node (Arora, [0147] – “The historical transaction data includes information (e.g., various transaction messages) pertaining to the historical transactions executed between the merchants M1-Mn and the consumers C1-Cn. 
At step 1004, the fraud detection server 110 generates the first consumer-merchant graph 200 based on the historical transaction data 818” – teaches computing representations (generated graph) by repeatedly sending messages along edges of a local neighborhood of a node (historical data comprises messages executed between nodes in a neighborhood)); aggregating the messages (Arora, [0147] – “The historical transaction data includes information (e.g., various transaction messages) pertaining to the historical transactions executed between the merchants M1-Mn and the consumers C1-Cn.” – teaches aggregating the messages (messages are stored as transaction data for use in creating embeddings)); Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the message passing and aggregation of Arora to the method of Shekhar, Verma, and Wadhwa in order to train the graph neural network by computing representations by repeatedly sending messages along edges of a local neighborhood of a node. Doing so would capture complex interdependencies in transaction data to provide for accurate detection of fraudulent transactions (Arora, [0006]). Regarding claim 10, the combination of Shekhar, Verma, Wadhwa, and Arora teach the method of claim 9, wherein the computed representations are based at least on a context of a respective node (Shekhar, [0099] – “Indeed, by determining that a digital identity corresponds to a fraudulent entity, the fraudulent transaction detection system 106 can determine that digital transactions originating from the fraudulent entity are also fraudulent. 
Thus, the fraudulent transaction detection system 106 can utilize the node embeddings that correspond to digital identities to identify fraudulent transactions.” – teaches wherein computed representations are based on a context of a respective node (the respective node being the digital entity determined as the fraudulent entity, where computed representations, such as the digital transactions, are based on the context of the respective node)). Claim(s) 11 and 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Shekhar, Verma, and Wadhwa as applied to claims 1, 19, and 20 above, and further in view of Liu et al. (NPL: Anomaly Detection in Dynamic Graphs via Transformer, published Nov. 2021, hereinafter “Liu”). Regarding claim 11, the combination of Shekhar, Verma, and Wadhwa teach the method of claim 1. The combination of Shekhar, Verma, and Wadhwa fails to explicitly teach wherein at least one node has an associated receptive field defined by the number of layers of the graph neural network such that the number of layers controls a neighborhood considered for message passing. However, analogous to the field of the claimed invention, Liu teaches: wherein at least one node has an associated receptive field defined by the number of layers of the graph neural network such that the number of layers controls a neighborhood considered for message passing (Liu, Section 4.1 Paragraph 5 – “we consider a sequence of graphs G_t^τ = {G_{t−τ+1}, · · · , G_t} with length τ, where the time window size τ is a hyperparameter and determines the receptive fields on time axis.” – teaches wherein at least one node has an associated receptive field defined by the number of layers of the graph neural network such that the number of layers controls a neighborhood considered for message passing (receptive field defined by τ, which is also the length of the sequence of graphs)). 
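For the claim 11 limitation — the number of GNN layers controlling the neighborhood considered for message passing — the receptive field of a k-layer message-passing network is the set of nodes within k hops. A minimal sketch (the adjacency-dict representation is an editorial assumption):

```python
def k_hop_neighborhood(adj, node, k):
    """Nodes reachable within k hops of `node`: the receptive field of a
    k-layer message-passing GNN for that node."""
    frontier, seen = {node}, {node}
    for _ in range(k):
        frontier = {n for u in frontier for n in adj.get(u, ())} - seen
        seen |= frontier
    return seen

# Example: path graph 0-1-2-3; each extra layer widens the receptive field.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
```

Each additional layer adds one hop, so layer count directly bounds how far messages can propagate.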
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the receptive fields defined by the number of layers of the graph neural network of Liu to the method of Shekhar, Verma, and Wadhwa in order to define the receptive fields of the nodes as the number of layers in the graph neural network. Doing so would capture the dynamic evolution of the graph and provide sufficient receptive fields for the learning model (Liu, Introduction). Regarding claim 15, the combination of Shekhar, Verma, and Wadhwa teach the method of claim 1. The combination of Shekhar, Verma, and Wadhwa fails to explicitly teach wherein performing the self-supervised training of the graph neural network includes jointly training an encoder and a decoder through binary cross-entropy. However, analogous to the field of the claimed invention, Liu teaches: wherein performing the self-supervised training of the graph neural network includes jointly training an encoder and a decoder through binary cross-entropy (Liu, Section 4.3 – “With the multiple timestamps of node encoding as input, the dynamic graph transformer can simultaneously capture both spatial and temporal features with a single encoder.” – teaches performing self-supervised training of the graph neural network including jointly training an encoder and decoder, and in Fig. 2 – “The whole framework is trained with a binary cross-entropy loss in an end-to-end manner.” – teaches training the encoder and decoder through binary cross-entropy). 
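The binary cross-entropy objective cited for claim 15 is the standard loss over predicted edge probabilities and 0/1 labels; training the encoder and decoder jointly means this single scalar drives gradient updates through both. A minimal sketch of the loss itself (the clipping epsilon is an editorial assumption for numerical safety):

```python
import numpy as np

def binary_cross_entropy(p, y, eps=1e-12):
    """Binary cross-entropy between predicted probabilities p and 0/1
    labels y; the scalar loss backpropagated through encoder and decoder."""
    p = np.clip(p, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))
```

For uninformative predictions of 0.5 the loss is ln 2 ≈ 0.693; it approaches 0 as predictions match the labels.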
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the self-supervised training of the graph neural network including jointly training an encoder and a decoder through binary cross-entropy to the method of Shekhar, Verma, and Wadhwa in order to have a graph neural network trained by jointly training an encoder and a decoder through binary cross-entropy loss. Doing so would provide a comprehensive node encoding composed of three functional terms to distill global spatial, local spatial, and temporal information (Liu, Introduction).

Response to Arguments

Applicant’s arguments, see pp. 3-4, filed 24 October 2025, with respect to the rejection(s) of claim(s) 1 and 19-20 under Shekhar and Verma have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground(s) of rejection is made in view of Wadhwa et al. (US Pub. No. 2022/0020026, published Jan. 2022). Wadhwa teaches the amended limitations of claim 1 regarding “determining, for at least one embedding…”, “determining an anomaly score for at least one transaction…”, and “including by flagging the at least one transaction…”. Applicant argues on p. 4 of the Remarks that Verma only generally describes that the learned representations may be further used to perform tasks such as link prediction without detailing as to how link prediction is performed. Examiner respectfully disagrees and points to Verma at [0174] – “Link Prediction: To evaluate the performance of the BipGNN model for the link prediction task, the same three public datasets are used that were used for the node regression task. Here, the aim should predict whether a link between two nodes exists or not. For the AC dataset, a link between a user and CD denotes that the user has given a rating to a CD Similarly, for AM and ML datasets, a link exists when a user has given a rating to a movie. 
A random split strategy (90:10) is followed to split the links between the nodes for each dataset. In addition, the training links are used to learn the node embeddings, and further, these node embeddings along with the self-node features are used to train FFN for the binary classification task.” – teaches performing link prediction.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure. Shumovskaia et al. (NPL dated March 2021: Linking bank clients using graph neural network powered by rich transactional data) teaches a graph network model that uses rich time-series transaction data for graph nodes and edges, with a focus on the task of predicting new interactions within a network of bank clients and treating it as a link prediction problem. Wang et al. (NPL dated April 2021: Self-Supervised Learning of Contextual Embeddings for Link Prediction in Heterogeneous Networks) teaches using global information from an entire graph with localized attention mechanisms to learn contextual node representations for downstream tasks such as link prediction. Lo et al. (NPL dated March 2022: Inspection-L: Practical GNN-Based Money Laundering Detection System for Bitcoin) teaches a GNN framework based on self-supervised Deep Graph Infomax to detect fraudulent transactions for Anti-Money Laundering.

Applicant’s amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.
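The link-prediction evaluation quoted from Verma [0174] above (90:10 random link split; endpoint embeddings plus self-node features fed to an FFN binary classifier) can be sketched as follows; the bipartite graph, embedding dimensions, and feature construction are illustrative assumptions, not Verma's actual data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical bipartite graph: users 0-19, items 20-29 (all data illustrative).
links = [(u, i) for u in range(20) for i in range(20, 30) if rng.random() < 0.3]
emb = rng.normal(size=(30, 16))    # node embeddings (as if learned on training links)
feats = rng.normal(size=(30, 4))   # self-node features

# 90:10 random split of the links, as described in the quoted passage.
idx = rng.permutation(len(links))
cut = int(0.9 * len(links))
train_links = [links[i] for i in idx[:cut]]
test_links = [links[i] for i in idx[cut:]]

def edge_features(u, v):
    # Concatenate endpoint embeddings with the self-node features; this is the
    # vector an FFN binary classifier (link vs. no link) would consume.
    return np.concatenate([emb[u], emb[v], feats[u], feats[v]])

X_train = np.stack([edge_features(u, v) for u, v in train_links])
```

The sketch stops at feature construction; in the quoted evaluation, an FFN would then be trained on these vectors (with sampled non-links as negatives) for the binary link/no-link decision.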
In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to LOUIS C NYE whose telephone number is 571-272-0636. The examiner can normally be reached Monday - Friday 9:00AM - 5:00PM.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, MATTHEW ELL, can be reached at 571-270-3264. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/LOUIS CHRISTOPHER NYE/
Examiner, Art Unit 2141

/MATTHEW ELL/
Supervisory Patent Examiner, Art Unit 2141

Prosecution Timeline

Aug 19, 2022
Application Filed
Jul 22, 2025
Non-Final Rejection — §103
Oct 22, 2025
Applicant Interview (Telephonic)
Oct 22, 2025
Examiner Interview Summary
Oct 24, 2025
Response Filed
Feb 03, 2026
Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12524683
METHOD FOR PREDICTING REMAINING USEFUL LIFE (RUL) OF AERO-ENGINE BASED ON AUTOMATIC DIFFERENTIAL LEARNING DEEP NEURAL NETWORK (ADLDNN)
2y 5m to grant · Granted Jan 13, 2026
Study what changed to get past this examiner, based on the most recent grant.

Prosecution Projections

3-4
Expected OA Rounds
22%
Grant Probability
58%
With Interview (+35.7%)
3y 2m
Median Time to Grant
Moderate
PTA Risk
Based on 9 resolved cases by this examiner. Grant probability derived from career allow rate.
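The 58% "With Interview" figure above appears to be the career allow rate plus the interview lift in percentage points; a minimal sketch of that arithmetic, assuming this is how the page derives it:

```python
# Sketch of the projection arithmetic shown above, assuming the "with
# interview" probability is the career allow rate plus the interview lift
# in percentage points (an assumption about how the figures are derived).
career_allow_rate = 22.0   # examiner's career allow rate, percent (2/9 resolved)
interview_lift = 35.7      # lift observed in resolved cases with an interview

with_interview = career_allow_rate + interview_lift
print(f"{with_interview:.1f}% -> rounds to {round(with_interview)}%")
```

A simple additive model like this ignores interaction effects (cases that get interviews may differ systematically from those that do not), so the 58% should be read as a rough projection, not a causal estimate.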
