Prosecution Insights
Last updated: May 29, 2026
Application No. 17/302,941

TRANSACTION DATA PROCESSING

Non-Final OA §103
Filed: May 17, 2021
Examiner: MANDEL, MONICA A
Art Unit: 3622
Tech Center: 3600 - Transportation & Electronic Commerce
Assignee: International Business Machines Corporation
OA Round: 4 (Non-Final)
Grant Probability: 18% (At Risk)
OA Rounds: 4-5
Est. Remaining: 8m
With Interview: 27%

Examiner Intelligence

Grants only 18% of cases
Career Allowance Rate: 18% (59 granted / 324 resolved; -33.8% vs TC avg)
Interview Lift: +8.9% (moderate, about +9%; resolved cases with interview)
Typical timeline
Avg Prosecution: 5y 8m (9 currently pending)
Career history
Total Applications: 342 (across all art units)

Statute-Specific Performance

§101: 5.5% (-34.5% vs TC avg)
§103: 76.4% (+36.4% vs TC avg)
§102: 4.5% (-35.5% vs TC avg)
§112: 8.1% (-31.9% vs TC avg)
The Tech Center average is an estimate. Based on career data from 324 resolved cases.

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Acknowledgements

This Office Action is in response to Applicant’s response filed on August 21, 2025. Claims 1-2, 5-6, 8-9, 12-13, 15-16, and 19 are currently pending and have been examined.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.
Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention. Claims 1-2, 5-6, 8-9, 12-13, 15-16, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Shekhar et al. (US 2021/0233080 A1)(“Shekhar”) in view of Sankar et al. (US 12,205,044 B2)(“Sankar”) and further in view of the NPL reference-“Graph Transformer Networks” by Yun et al. (“Yun”). Claims 1, 8, and 15 As to Claims 1, 8, and 15, Shekhar discloses a computer-implemented method (“methods,” [0116]) [computer system comprising: a processing unit; and a memory coupled to the processing unit and storing instructions thereon, the instructions, when executed by the processing unit (“systems,” [0116]), performing actions], [computer program product comprising a computer-readable storage medium (“non-transitory computer-readable media,” [0116]) having a set of instructions stored therein which, when executed by a processor, causes the processor to perform a method] comprising: obtaining transaction data associated with a target account (“generates a heterogenous network (e.g., a graph) that includes nodes corresponding to digital identities (e.g., email addresses, etc.) and edges representing associated digital transactions. The system can utilize a graph convolutional neural network to analyze the heterogenous network…” [0015], “the term ‘digital transaction’ refers to a transfer of money or other asset via electronic means. 
In particular, a digital transaction can refer to a transaction in which money or some other asset is electronically withdrawn from one financial account and deposited into another financial account.” [0031], “the set of nodes associated with a node u (the set of nodes represented as Cu Δt-also referred to as the temporal context of node u) at temporal distance Δt… within the graph G (i.e., the transaction graph 402)…” [0077], see Fig.3); dividing the transaction data into a plurality of time windows (“the set of nodes associated with a node u (the set of nodes represented as Cu Δt-also referred to as the temporal context of node u) at temporal distance Δt… within the graph G (i.e., the transaction graph 402)…” [0077], see Fig.3) representing the transaction data as a graph comprising a plurality of nodes respectively corresponding to a plurality of accounts including the target account (“generates a heterogenous network (e.g., a graph) that includes nodes corresponding to digital identities (e.g., email addresses, etc.) and edges representing associated digital transactions. The system can utilize a graph convolutional neural network to analyze the heterogenous network…” [0015], “the term ‘digital transaction’ refers to a transfer of money or other asset via electronic means. 
In particular, a digital transaction can refer to a transaction in which money or some other asset is electronically withdrawn from one financial account and deposited into another financial account.” [0031]); extracting, by a graph convolutional network, temporal information (“zu” at step 2 in algorithm 1, see [0086], “zu…incorporates the information about the node itself and its importance weighted temporal context” [0087], and see [0087]) of the transaction data (“time-dependent graph convolutional neural network 404 can transform the concatenated vector through another fully connected layer…new representation zu for node u” [0087]) from a first graph corresponding to a first time window (see Fig.3 which represents a plurality of subgraphs comprised of two nodes and one edge and a time window which is the temporal distance, within graph 300, “the set of nodes associated with a node u (the set of nodes represented as Cu Δt-also referred to as the temporal context of node u) at temporal distance Δt… within the graph G (i.e., the transaction graph 402)…” [0077], see Fig.3), and extracting, spatial information (i.e.: visits, “the fraudulent transaction detection system 106 can, for each neighbor node associated with the node v, divide the visit count of the neighbor node by the combined visit count from all neighbor nodes associated with the node v in order to provide a normalized weight value.” [0085]) of the first graph to create spatial-temporal information of the transaction (“Algorithm I, presented below, is another characterization of how the time-dependent graph convolutional neural network 404 generates a node embedding corresponding to a given node.” [0086]), the spatial-temporal information including transactions between the target account and additional accounts during the first time window (“Algorithm I, presented below, is another characterization of how the time-dependent graph convolutional neural network 404 generates a node embedding corresponding 
to a given node.” [0086] see Δt in Algorithm 1, “a heterogenous network…edges representing associated digital transactions.” [0015]), generating a feature representation for the target account based on the extracted spatial-temporal information (“The fraudulent transaction detection system 106 passes the output of the final convolution layer through a fully connected layer to generate the final node embeddings zu, ⱯuϵV.” [0096]), and determining whether transaction behaviors of the target account, according to the feature representation of the plurality of time windows, match an abnormal pattern of behavior (“Upon generating the node embeddings, the fraudulent transaction detection system 106 can determine whether a digital identity corresponds to a fraudulent entity using a machine learning technique (e.g., a machine learning model). In particular, the fraudulent transaction detection system 106 can provide the node embeddings generated by the time-dependent graph convolutional neural network 404 to a trained machine learning model. The fraudulent transaction detection system 106 can utilize the trained machine learning model to determine whether a given digital identity corresponds to a fraudulent entity based on those node embeddings. For example, the fraudulent transaction detection system 106 can use the node embeddings as features to identify fraudulent entities.” [0097]). 
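The first two claim steps mapped above (dividing transaction data into time windows, then representing it as a graph of accounts) can be illustrated with a minimal sketch. This is not Shekhar's method or the applicant's implementation. The `Transaction` record, the fixed-width windows, and the function name `build_window_graphs` are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """Hypothetical record: a transfer between two accounts at a time (seconds)."""
    sender: str
    receiver: str
    timestamp: int
    amount: float


def build_window_graphs(transactions, window_seconds):
    """Group transactions into fixed-width time windows.

    Returns {window_index: {account: {neighbor: total_amount}}}, that is,
    one weighted, undirected adjacency map per time window.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    graphs = defaultdict(lambda: defaultdict(lambda: defaultdict(float)))
    for tx in transactions:
        window = tx.timestamp // window_seconds
        # Each transaction adds an edge in both directions (undirected graph).
        graphs[window][tx.sender][tx.receiver] += tx.amount
        graphs[window][tx.receiver][tx.sender] += tx.amount
    # Convert nested defaultdicts to plain dicts, ordered by window index.
    return {
        w: {acct: dict(nbrs) for acct, nbrs in adj.items()}
        for w, adj in sorted(graphs.items())
    }


txs = [
    Transaction("A", "B", 10, 50.0),
    Transaction("A", "C", 70, 20.0),
    Transaction("B", "A", 75, 5.0),
]
graphs = build_window_graphs(txs, window_seconds=60)
```

With 60-second windows, the first transaction falls in window 0 and the other two in window 1, so the target account "A" has a different neighborhood in each window's graph.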
Shekhar does not directly disclose: a plurality of graphs comprising the plurality of nodes, each graph respectively corresponding to an associated time window of the plurality of time windows; the extracting is from the plurality of graphs; the transactions between the target account and additional accounts is during respective time windows of the plurality of graphs; generating a plurality of query vectors, a plurality of key vectors and a plurality of value vectors corresponding to the plurality of graphs by projecting the respective spatial information of the plurality of graphs into different spaces; determining respective attention weights of the plurality of graphs based on the plurality of query vectors and the plurality of key vectors; determining the spatial-temporal information of the transaction data by aggregating the plurality of value vectors based on the attention weights; the extracting is using a transformer framework; and the generating is within a transformer framework. Sankar teaches a plurality of graphs comprising a plurality of nodes (“A dynamic graph can include a series of observed snapshots, G={ ꞔ1, . . . , ꞔ T} where T can be a number of time-steps. Each snapshot ꞔ1 =(V, ϵt, Wt) can be a weighted undirected graph including a shared node set V, a link (e.g., edge) set ϵT and weights Wt, depicting the graph structure at time t.” C.14, L.39-44), each graph respectively corresponding to an associated time window of the plurality of time windows (“A dynamic graph can include a series of observed snapshots, G={ ꞔ1, . . . , ꞔ T} where T can be a number of time-steps. 
Each snapshot ꞔ1 =(V, ϵt, Wt) can be a weighted undirected graph including a shared node set V, a link (e.g., edge) set ϵT and weights Wt, depicting the graph structure at time t.” C.14, L.39-44, see graphs as times 1-T depicted in Fig.7); extracting from the plurality of graphs (“For each node v, the input to the temporal self-attention layer can be a set of intermediate vector representations {xv1, xv2, ... , xvT}, xvt ϵ ℝD’ where T can be a number of time-steps (e.g., graph snapshots) and D' can be a dimensionality of the input vector representations. The output of the layer can be a new set of vector representations (e.g., final node representations) for each node vat each time step (e.g., zv={zv1, zv2, ... , zvt}, zvt ϵ ℝF’ with dimensionality F'). The input and output representations of v, packed together across all graph snapshots, can be denoted by matrices Xv ϵ ℝTXD’ and Zv ϵ ℝTXF’ |respectively.” C.17, L.19-29); transactions between the target account and additional accounts during respective time windows of the plurality of graphs (“As an illustrative example, graph data can include interaction data (e.g., transaction data, etc.). The graph data can be a dynamic graph comprising a plurality of graph snapshots. Each graph snapshot can include any suitable number of nodes and edges. The nodes of the graph data can represent resource providers and users. Edges may connect a resource provider node to a user node when the two have performed a transaction.” C.13, L.10-17, “A dynamic graph can include a series of observed snapshots, G={ ꞔ1, . . . , ꞔ T} where T can be a number of time-steps. 
Each snapshot ꞔ1 =(V, ϵt, Wt) can be a weighted undirected graph including a shared node set V, a link (e.g., edge) set ϵT and weights Wt, depicting the graph structure at time t.” C.14, L.39-44), generating a plurality of query vectors, a plurality of key vectors and a plurality of value vectors corresponding to the plurality of graphs by projecting the respective spatial information of the plurality of graphs into different spaces (“The structural self-attention method of FIG. 4 can accept three inputs xu(Q), xv(K), and xv(V). An attention function can be described as mapping a query Q and a set of key-value pairs (e.g., K and V, respectively) to an output, where the query, keys, values, and output can all be vectors, or in some embodiments matrices.” C.16, L.1-6, “The input to the structural self-attention layer can be a graph snapshot ꞔ ϵ G where G can be a dynamic graph (e.g., graph data), and a set of input node representations {xvϵℝD ,ⱯvϵV } where D can be the dimensionality of the input embeddings. The structural self-attention layer can output a new set of node representations { zvϵℝF v, Ɐv ϵ V} with dimensionality F.” C.16, L.10-16); determining respective attention weights of the plurality of graphs based on the plurality of query vectors and the plurality of key vectors (“The structural self-attention layer can attend over the immediate neighbors of a node v at time t, by computing attention weights as a function of their input node embeddings.” C.16, L.19-22, “In equation (1), above, Nv={u ϵ V:(u, v) ϵ Et} can be a set of immediate neighbors of node v in the graph snapshot, WS ϵ ℝDXF can be a shared weight transformation applied to each node in the graph snapshot. In terms of FIG. 4, an analysis computer can apply different linear transformations at steps 402, 404, and 406 to the query Q, the keys K, and the values V, respectively. The linear transformations can be any suitable linear transformation applied to the query Q, the keys K, and the values V. 
In equation (1), the linear transformations may be applied, for example, by the shared weight transformation WS ϵ ℝDX . a ϵ ℝ2D can be a weight vector parameterizing the attention function implemented as feed-forward layer.” C.16, L.33-45, “The node representations computed by the structural block can be input to a temporal self-attention layer, which can compute a temporal self-attention independently for each node v over all time steps (e.g., over each graph snapshot).” C.17, L.11-16); and determining the spatial-temporal information of the transaction data by aggregating the plurality of value vectors based on the attention weights (“Embodiments can use multiple attention heads, followed by concatenation, in both structural and temporal self-attention layers” C.18, L.61-63, “Then at step 608, the attention score between each query and key can be computed and then used to weight the values and sum them. Then at step 610, The output of the attention process at step 608, can be concatenated for each head of attention that is performed.” C.19, L.17-22). 
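The query/key/value limitations discussed above can be pictured with a small sketch of scaled dot-product attention applied across per-window feature vectors. It is an illustration only, not Sankar's implementation or the claimed method. The projection matrices are hypothetical stand-ins for learned weights, and the helper names (`matvec`, `softmax`, `temporal_attention`) are invented here.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(window_vectors, w_q, w_k, w_v, target):
    """Attend from one time window (index `target`) over all windows.

    window_vectors: one spatial feature vector per time-window graph.
    w_q, w_k, w_v: projection matrices (hypothetical; normally learned).
    Returns (attention_weights, aggregated_value_vector).
    """
    # Project each window's vector into query, key, and value spaces.
    queries = [matvec(w_q, x) for x in window_vectors]
    keys = [matvec(w_k, x) for x in window_vectors]
    values = [matvec(w_v, x) for x in window_vectors]
    scale = math.sqrt(len(keys[0]))
    q = queries[target]
    # Score the target query against every key, then normalize.
    scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
    weights = softmax(scores)
    # Aggregate the value vectors as a weighted sum.
    dim = len(values[0])
    aggregated = [
        sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)
    ]
    return weights, aggregated


identity = [[1.0, 0.0], [0.0, 1.0]]
window_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, aggregated = temporal_attention(
    window_vectors, identity, identity, identity, target=2
)
```

In this toy input the third window receives the largest weight because its key aligns most closely with the query. The weights sum to one, so the aggregated vector is a weighted average of the value vectors.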
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Shekhar by the features of Sankar, and in particular to include in Shekhar, the feature of a plurality of graphs comprising the plurality of nodes, as taught by Sankar; to include in Shekhar, the feature of each graph respectively corresponding to an associated time window of the plurality of time windows, as taught by Sankar; to include in Shekhar, the feature of the extracting is from the plurality of graphs, as taught by Sankar; to include in Shekhar, the feature of the transactions between the target account and additional accounts is during respective time windows of the plurality of graphs, as taught by Sankar; to include in Shekhar’s extracting, the feature of generating a plurality of query vectors, a plurality of key vectors and a plurality of value vectors corresponding to the plurality of graphs by projecting the respective spatial information of the plurality of graphs into different spaces, as taught by Sankar; to include in Shekhar, the feature of determining respective attention weights of the plurality of graphs based on the plurality of query vectors and the plurality of key vectors, as taught by Sankar; and to include in Shekhar, the feature of determining the spatial-temporal information of the transaction data by aggregating the plurality of value vectors based on the attention weights, as taught by Sankar. A person having ordinary skill in the art would have been motivated to combine these features because it would help to “capture temporal dependencies at a fine-grained node-level granularity” (Sankar, C.7, L.3-4).
Yun teaches extracting using a transformer framework (see extraction of edge types and weights in equation 4, on page 4, “The adjacency matrix of arbitrary length l meta-paths can be calculated by where AP denotes the adjacency matrix of meta-paths, Te denotes a set of edge types and α(l) t1 is the weight for edge type tl at the lth GT layer.” Page 4); and generating within a transformer framework (“Graph Transformer Network (GTN) that learns to transform a heterogeneous input graph into useful meta-path graphs…” Page 2, 2nd paragraph). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Shekhar’s extracting of the spatial information and generating of a feature representation, in the Shekhar/Sankar combination, by the feature of Yun, and in particular to include in Shekhar’s extracting step, the extracting to be applied using a transformer framework, as taught by Yun, and to include in Shekhar’s generating step, the generating to be applied within a transformer framework, as taught by Yun. A person having ordinary skill in the art would have been motivated to combine these features because it would help improve the node representation (Yun, “Abstract”). Claims 2, 9, and 16 Shekhar further discloses wherein: a first edge between a first node for the target account and a second node for a second account corresponds to a transaction between the target account and a second account (“…edges representing associated digital transactions…” [0015]). 
Claims 5, 12, and 19 Sankar teaches wherein aggregating the plurality of value vectors based on the attention weights includes: smoothing the attention weights by using a smoothing attention layer in the transformer framework (the transformer framework of the combination of Shekhar/Sankar/Yun discussed above)(“applying the softmax function…” C.18, L.21-26); and aggregating the plurality of value vectors based on the smoothed attention weights (“Embodiments can use multiple attention heads, followed by concatenation, in both structural and temporal self-attention layers” C.18, L.61-63, “Then at step 608, the attention score between each query and key can be computed and then used to weight the values and sum them. Then at step 610, The output of the attention process at step 608, can be concatenated for each head of attention that is performed.” C.19, L.17-22). Claims 6 and 13 Sankar further teaches the plurality of key vectors includes a first key vector corresponding to the first graph (“The structural self-attention method of FIG. 4 can accept three inputs xu(Q), xv(K), and xv(V). 
An attention function can be described as mapping a query Q and a set of key-value pairs (e.g., K and V, respectively) to an output, where the query, keys, values, and output can all be vectors, or in some embodiments matrices.” C.16, L.1-6), and determining respective attention weights of the plurality of graphs includes: determining time intervals between the plurality of time windows and the first time window (“The node representations computed by the structural block can be input to a temporal self-attention layer, which can compute a temporal self-attention independently for each node v over all time steps (e.g., over each graph snapshot).” C.17, L.11-16); generating relative position vectors by encoding positions of the plurality of time windows relative to the first time window (“To compute the output representation of node v at t, embodiments can use a scaled dot-product form of attention (Vaswani et al., 2017) where the queries, the keys, and the values may all come from the input vector representations. The queries, the keys, and the values can first be transformed to a different space using linear projections matrices…” C.17, L.17, L.44-50); determining a plurality of attention scores between the plurality of query vectors and the first key vector based on the relative position vectors and parameter vectors corresponding to the time intervals (“At step 508, a Matmul function can be applied to the linear projected matrices from the query Q and the keys K. Matmul can be a transformational function that works on arrays…” C.18, L.1-20); and determining an attention weight of the first graph based on the plurality of attention scores (“After applying the softmax function at step 514, the output of the softwax function can be multiplied by the linear projected matrix from the values V, yielding the output vector representative of the change in the local structure of the node of the query Q over time” C.18, L.21-25). 
Response to Arguments

Applicant’s arguments filed on August 21, 2025 have been fully considered and addressed below. On page 9, Applicant argues that Shekhar “… does not disclose extracting spatial-temporal information of a plurality of graphs, ‘each graph…,’ by ‘generating a plurality of query vectors…,’ ‘determining respective…weights.’ For example, Shekhar does not disclose ‘generating a plurality of query vectors…graphs,’ but merely discloses a time-dependent graph convolutional neural network using the node embeddings as input to a second neural network layer. Further, the notion of determining ‘respective attention weights…graphs’ is not disclosed by Shekhar, but merely associating individual weights to neighbors of a given node.” However, Sankar is now applied in the rejection to teach those features. Therefore, the argument is now moot.

Conclusion

Applicant’s amendment filed on August 21, 2025 necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
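The reply periods described above are plain calendar arithmetic, sketched below for orientation only. The action text does not state its mailing date, so the sketch assumes the Feb 03, 2026 final rejection entry from the prosecution timeline. The helper `add_months` is hypothetical, and the sketch ignores weekend and federal-holiday rollover as well as the advisory-action adjustment, so it is not a substitute for docketing.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the same day-of-month `months` later, clamped to month end."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


# Assumption: mailing date taken from the timeline entry, not the action text.
mailing_date = date(2026, 2, 3)

two_month_mark = add_months(mailing_date, 2)        # first-reply window
shortened_period_end = add_months(mailing_date, 3)  # THREE MONTHS
statutory_limit = add_months(mailing_date, 6)       # SIX MONTHS outer limit

print("Two-month mark:      ", two_month_mark.isoformat())
print("Shortened period end:", shortened_period_end.isoformat())
print("Six-month limit:     ", statutory_limit.isoformat())
```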
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MONICA A MANDEL whose telephone number is (571) 270-7046. The examiner can normally be reached Monday and Thursday 10:00 AM-6:00 PM. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ilana Spar can be reached at (571) 270-7537. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/M.A.M/
Examiner, Art Unit 3622

/ILANA L SPAR/
Supervisory Patent Examiner, Art Unit 3622

Prosecution Timeline

Oct 08, 2024: Response after Non-Final Action
May 21, 2025: Non-Final Rejection mailed (§103)
Aug 08, 2025: Interview Requested
Aug 21, 2025: Response Filed
Feb 03, 2026: Final Rejection mailed (§103)
Feb 26, 2026: Examiner Interview Summary
Feb 26, 2026: Applicant Interview (Telephonic)
Mar 16, 2026: Response after Non-Final Action

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12437279: TECHNIQUES FOR PERFORMING AUTHENTICATION IN ECOMMERCE TRANSACTIONS (4y 4m to grant; granted Oct 07, 2025)
Patent 11941591: DEVICE INCLUDING ENCRYPTED DATA FOR EXPIRATION DATE AND VERIFICATION VALUE CREATION (2y 11m to grant; granted Mar 26, 2024)
Patent 11868170: SIMPLE NONAUTONOMOUS PEERING MEDIA CLONE DETECTION (3y 5m to grant; granted Jan 09, 2024)
Patent 11763335: REAL-TIME DISTRIBUTION OF CRYPTOCURRENCY REWARDS FOR A LOYALTY PROGRAM (4y 0m to grant; granted Sep 19, 2023)
Patent 11734393: CONTENT DISTRIBUTION WITH RENEWABLE CONTENT PROTECTION (7y 7m to grant; granted Aug 22, 2023)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 4-5
Grant Probability: 18%
With Interview: 27% (+8.9%)
Median Time to Grant: 5y 8m (~8m remaining)
PTA Risk: High
Based on 324 resolved cases by this examiner. Grant probability derived from career allowance rate.
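
The headline figures can be reproduced from the counts shown on this page. The sketch below assumes that the interview lift is added as percentage points to the career allowance rate; the page does not state its exact formula, so treat this as a plausibility check rather than the tool's actual calculation.

```python
granted = 59
resolved = 324
interview_lift_points = 8.9  # assumption: additive percentage points

# Career allowance rate as a percentage of resolved cases.
allowance_rate = 100 * granted / resolved
# Projected grant probability if an interview is held.
with_interview = allowance_rate + interview_lift_points

print(f"Career allowance rate: {allowance_rate:.1f}%")
print(f"With interview: {with_interview:.1f}%")
```

Rounded to whole percentages, these values agree with the 18% and 27% shown above.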
