Prosecution Insights
Last updated: April 19, 2026
Application No. 17/389,039

Graph Based Discovery on Deep Learning Embeddings

Current Status: Non-Final OA (§103)
Filed: Jul 29, 2021
Examiner: SHINE, NICHOLAS B
Art Unit: 2126
Tech Center: 2100 — Computer Architecture & Software
Assignee: Microsoft Technology Licensing, LLC
OA Round: 5 (Non-Final)
Grant Probability: 38% (At Risk)
Expected OA Rounds: 5-6
Expected Time to Grant: 5y 1m
Grant Probability With Interview: 82%

Examiner Intelligence

This examiner grants only 38% of cases.

Career Allow Rate: 38% (14 granted / 37 resolved; -17.2% vs Tech Center average)
Interview Lift: +44.6% (strong lift, measured over resolved cases with an interview)
Avg Prosecution Time: 5y 1m (25 applications currently pending)
Career History: 62 total applications across all art units
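As a sanity check on the headline numbers above, here is a minimal sketch of how they could be derived from the underlying counts. The 14 granted / 37 resolved figures come from this page; the no-interview cohort rate is a hypothetical placeholder chosen only to reproduce the displayed +44.6% lift, since the page does not disclose per-cohort counts.

```python
# Career allow rate from the counts shown on this page.
granted, resolved = 14, 37
allow_rate = granted / resolved
print(f"{allow_rate:.1%}")   # 37.8%, displayed on the page rounded to 38%

# Interview lift = allow rate with an interview minus the rate without.
# 0.374 is an assumed no-interview rate (not disclosed on this page),
# picked so the arithmetic matches the displayed figures.
with_interview_rate = 0.82   # shown as "Grant Probability With Interview: 82%"
lift = with_interview_rate - 0.374
print(f"{lift:+.1%}")        # +44.6%
```

The "Grant probability derived from career allow rate" note in the projections section suggests this same ratio feeds the 38% headline figure.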

Statute-Specific Performance

§101: 34.9% (-5.1% vs TC avg)
§103: 46.0% (+6.0% vs TC avg)
§102: 5.3% (-34.7% vs TC avg)
§112: 13.4% (-26.6% vs TC avg)
Tech Center averages are estimates. Based on career data from 37 resolved cases.

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 01/12/2026 has been entered. Claims 1, 13, and 20 are amended. No claims have been cancelled, and there are no new claims. Claims 1–20 are pending for examination.

Response to Arguments

In reference to 35 USC § 103: Applicant's arguments filed on 01/12/2026 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:

1. Determining the scope and contents of the prior art.
2.
Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1–20 are rejected under 35 U.S.C. 103 as being unpatentable over Silva et al. (US 20210124780 A1), hereinafter "Silva", in view of Wu et al. (US 20220107799 A1), hereinafter "Wu".

Regarding claim 1, Silva teaches:

a computer implemented method of querying data, the method comprising (Silva ¶0034: "Graph embedding and clustering using the processes and systems more fully described below results in a highly scalable and low latency solution for graph searching and exploration. Embodiments of the present disclosure provide a full pipeline to run queries, visualize the results, and provide recommendations based on historical patterns. Thus, the disclosed graph search and visualization techniques are able to handle transaction data such as in fraud detection contexts where conventional methods typically cannot");

executing a deep learning model to obtain an embedding from a middle or lower layer of the deep learning model for each instance of data of a dataset (Silva Figs. 2–3, ¶0041, ¶0058–0066: "The process can implement any embedding method such as direct embedding and node embedding. When embedding a graph directly, a function converts each graph into a numeric vector. When using node embedding, a function converts each of the graph's nodes into a numeric vector and then aggregates the node embeddings for each graph using differential pooling. Other embedding methods include unsupervised embedding method that uses the topology of the graph (FIG. 2), and supervised embedding method that takes into account node features, such as timestamp and amount (FIG. 3)" and "A node embedding algorithm such as GraphSAGE learns node representations at each hierarchical layer, and the differential pooling mechanism successively aggregates these embeddings into clusters until arriving at a single representation"—[wherein the execution involves learning models with layers in a hierarchical structure, including processing the data through middle and lower layers]);

constructing a similarity graph object having instances of data of the dataset represented by points and similarity distance, based on the obtained embedding of each instance of data, represented by edges between points (Silva Figs. 2–4, 13–14, ¶0040, ¶0050–0064, ¶0074–0075, ¶0081–0087: "In various embodiments, the process computes the Euclidean distance between the calculated vectors in vector space, and stores this information to represent a corresponding graph or portion of a graph" and "The process outputs the model diagnosis analysis for each specific setup (410). For example, the output can be in the form of a model diagnosis report with one or more of the following components. FIGS. 14A-14G show examples of metrics that can be output in a report. Graph representatives plot (FIG. 14A), which is a plot with the center graph of each cluster. Silhouette by cluster plot (FIG. 14B), which is a plot with each cluster's average Silhouette coefficient and its standard deviation. Cluster size and fraud rate analysis (FIGS. 14C and 14D), which may be represented by three plots. The first shows the number of graphs in each cluster, the second shows the graph fraud rate in each cluster, and the third shows a ROC curve for the clustering method. The cluster size and fraud rate analysis can be displayed at the graph level or the transaction level. Graph features distribution plot (FIG. 14E), which is a plot with the mean and the standard deviation of a set of graph features for each cluster. The features are number of nodes, number of edges, mean degree, mean betweenness centrality and mean eccentricity. Node type distribution plot (FIGS. 14F and 14G), which is a plot with the mean and the standard deviation of each node type's ratio for each cluster. UMAP plot (1050 of FIG. 10A), which is a scatter plot with the two-dimensional UMAP representation of each embedding, colored by the assigned cluster. The UMAP (Uniform Manifold Approximation and Projection) is a dimensionality reduction technique that preserves the local structure of the data in their original dimensions"—[wherein the system constructs graph objects including plots representative of similarities between the data based on distances]);

receiving a query that includes an identifier of a first point in the similarity graph object, the query specifying a constraint on a walk through the similarity graph object (Silva Figs. 1, 5, ¶0046–0049, ¶0058–0059, ¶0064: "The process identifies a representative graph for each of the at least one group of vectors (108). In various embodiments, the representative graph for a cluster is the graph with the lowest average distance to every other graph in the cluster, and so it is sometimes referred to as the "center" of the cluster. As further described below, the representative graph may be used to respond to graph queries by comparing the query graph to clusters usings the representative graph for each cluster" and "Diffpool is a graph embedding method that combines a node embedding algorithm with a differential pooling mechanism. For purposes of illustration, the node embedding algorithm is GraphSAGE. However, any node embedding method can be used with the differential pooling mechanism, which provides flexibility to the end user to use any node embedding suitable for their use case" and "Whether to learn node representations for more layers can be based on a predefined number of layers as specified by a user or automatically selected based on available computational resources or desired speed"—[wherein the system receives queries that include graph embeddings that specify constraints on the GraphSAGE algorithm]);

processing the query against the similarity graph object to determine concept similarity distances based on the respective embeddings of points in the similarity graph object to identify at least one point representative of an instance of data responsive to the query (Silva Figs. 1, 5, 7, ¶0030–0034, ¶0048, ¶0088–0093: "The disclosed techniques leverage graph embedding to speed graph similarity queries, while also providing a full system for querying and exploring the results … Graph embedding and clustering using the processes and systems more fully described below results in a highly scalable and low latency solution for graph searching and exploration" and "The process then extracts the initial groups of graphs with similar patterns (the clusters), e.g., by running a Hierarchical Agglomerative Clustering model on the set of the new and old embeddings (106)"—[wherein the system runs the agglomerative clustering model on the embeddings to process the query to find similarities in the graphs and data]).

Silva does not appear to explicitly teach: the embedding selected from an embedding space of the model with semantically similar data having embeddings incorporating a measure of concept similarity. 
However, Wu teaches: the embedding selected from an embedding space of the model with semantically similar data having embeddings incorporating a measure of concept similarity (Wu ¶0028, ¶0058, ¶0067, Claim 8: "Further, one or more embodiments described herein can employ a semantic matching operation based on cross-attention to explore fine-grained semantic relations between the text graph and corresponding code graph and update the embedding of each node in the graphs" and "In various embodiments, the code searching component 602 can perform code searching over the graph representations of both the code snippets and query texts by aggregating a graph-level embedding for each graph" and "wherein the code searching component further computes the amount of similarity based on a distance measure between the first aggregation and the second aggregation"—[(emphasis added)]).

The methods of Silva, the teachings of Wu, and the instant application are analogous art because they pertain to analyzing data with graphs and machine learning techniques. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the methods of Silva with the teachings of Wu to provide for a searchable distance representing a measure of how closely the embeddings are related based on semantic similarity. One would be motivated to do so to explore fine-grained semantic relations between the text graph and corresponding code graph and update the embedding of each node in the graphs (Wu ¶0028).

Regarding claim 2, Silva in view of Wu teaches all the limitations of claim 1. Silva teaches: accessing the at least one point based on the at least one point's embedding (Silva Fig. 6, ¶0103: "User interface engine 630 is configured to render a user interface with which client 650 interacts to access system 610 (a graph search and visualization tool)"); and displaying content representative of the at least one point (Silva Fig. 6, ¶0103: "Client 650 may be operated by a user such as an analyst or data scientist. The user interface engine outputs a graph visualization and, optionally, additional information about the pattern (graph) being queried").

Regarding claim 3, Silva in view of Wu teaches all the limitations of claim 1. Silva teaches: wherein the similarity distance comprises a Euclidean distance between the respective embeddings (Silva Fig. 1, ¶0040: "In another aspect, vectors can be easily searched and compared because comparing vectors typically uses fewer computational resources than comparing graphs. In various embodiments, the process computes the Euclidean distance between the calculated vectors in vector space, and stores this information to represent a corresponding graph or portion of a graph").

Regarding claim 4, Silva in view of Wu teaches all the limitations of claim 1. Silva teaches: wherein processing the query further comprises progressively identifying a list of points from the first point to a target point, the list including points representing a fewest number of hops to progress from the first point to the target point (Silva Fig. 3, ¶0060–0067: "A node embedding algorithm such as GraphSAGE learns node representations at each hierarchical layer, and the differential pooling mechanism successively aggregates these embeddings into clusters until arriving at a single representation"—[wherein the algorithm progressively learns node representations in a hierarchically structured progressive process that optimizes (i.e., fewest) the pool of embeddings from clusters into a target representation]).

Regarding claim 5, Silva in view of Wu teaches all the limitations of claim 1. Silva teaches: wherein processing the query further comprises identifying a list of points within a selected similarity distance from the first point (Silva Figs. 1, 4, ¶0043–0046: "The process groups the calculated vector(s) into at least one group of vectors (106). Identifying groups of vectors is also called "clustering" because the process extracts groups (clusters) of vectors that match patterns. The process defines a cluster of vectors by determining that vectors match a data pattern of interest. The number of groups identified can be based on a model report as further described below … The process identifies a representative graph for each of the at least one group of vectors (108). In various embodiments, the representative graph for a cluster is the graph with the lowest average distance to every other graph in the cluster"—[(emphasis added)]).

Regarding claim 6, Silva in view of Wu teaches all the limitations of claim 1. Silva teaches: wherein processing the query further comprises progressively identifying a path between the first and a second point identified as a constraint as a function of concept similarities in the embeddings (Silva Figs. 1, 4, ¶0043–0046, ¶0050–0067: "The process groups the calculated vector(s) into at least one group of vectors (106). Identifying groups of vectors is also called "clustering" because the process extracts groups (clusters) of vectors that match patterns. The process defines a cluster of vectors by determining that vectors match a data pattern of interest. The number of groups identified can be based on a model report as further described below … The process identifies a representative graph for each of the at least one group of vectors (108). In various embodiments, the representative graph for a cluster is the graph with the lowest average distance to every other graph in the cluster" and "The first pattern is more suspicious than a second pattern because the first pattern indicates a stolen card, while the second pattern can simply be a public IP address that is then used by multiple people. The conventional NetLSD method could give two different graphs the same embedding because node type is not taken into consideration. Referring to the two patterns above, NetLSD would give both graphs the same embedding because it does not differentiate between a card node and an IP address node … One of the advantages of Diffpool is that it is able to incorporate node attributes such as timestamps, amounts, and node types and allow a tailoring of the final representations to a given label by using one or more losses"—[(emphasis added)]).

Regarding claim 7, Silva in view of Wu teaches all the limitations of claim 6. Wu teaches: identifying a path comprises excluding selected concepts from the path (Silva Figs. 1, 5, 14A–14G, ¶0062–0064, ¶0090–0093: "The process outputs the identified similar graphs (506). The number of similar graphs to be returned in response to the query graph can be predetermined. For example, the process returns the n closest graphs as the query result. The process can also output any extra information related to those graphs (e.g., fraud label, total amount, etc.). The output can be rendered on a graphical user interface as further described below").

Regarding claim 8, Silva in view of Wu teaches all the limitations of claim 6. Silva teaches: wherein identifying a path comprises including all points of all concept types (Silva Figs. 1, 5, 14A–14G, ¶0062–0064, ¶0090–0093: "The process outputs the identified similar graphs (506). The number of similar graphs to be returned in response to the query graph can be predetermined. For example, the process returns the n closest graphs as the query result. The process can also output any extra information related to those graphs (e.g., fraud label, total amount, etc.). The output can be rendered on a graphical user interface as further described below").

Regarding claim 9, Silva in view of Wu teaches all the limitations of claim 6. Silva teaches: wherein identifying a path comprises constraining the path to a number of hops or a total distance between the first and second points (Silva Figs. 1–3, ¶0050–0057, ¶0062–0064: "The process learns node representations at a next hierarchical layer for the smaller graph (304). In this hierarchical layer (which follows the first layer of 300), the process will run on top of this coarsened graph, treating clusters as nodes and learning representations for them. In various embodiments, learning node representations for the next hierarchical layer is optional. That is, a single pooling layer can map directly to a final embedding so that all nodes are clustered into a single node"—[wherein the learning of the next layer is optional and configurable (i.e., constraining the total)]).

Regarding claim 10, Silva in view of Wu teaches all the limitations of claim 6. Silva teaches: wherein identifying a path comprises including points of specified concepts in the path (Silva ¶0063–0064, ¶0067: "The process determines whether to learn node representations for more layers (306). This process can be repeated for as many hierarchical layers as the user wishes to be encoded. The purpose of encoding information per hierarchical level is that both local as well as global graph information is embedded. Whether to learn node representations for more layers can be based on a predefined number of layers as specified by a user or automatically selected based on available computational resources or desired speed. User interaction with a model such as Diffpool is further described with respect to FIG. 4").

Regarding claim 11, Silva in view of Wu teaches all the limitations of claim 6. Silva teaches: wherein the path comprises a queryable object (Silva Figs. 1, 5, ¶0088–0093: "The process calculates one or more vectors for the query graph (502). In other words, the process calculates embeddings of the query graph and may use the same process as 104 of FIG. 1 to calculate the embeddings").

Regarding claim 12, Silva in view of Wu teaches all the limitations of claim 6. Silva teaches: accessing points on the path (Silva Figs. 1, 5–6, ¶0091–0097: "The process identifies one or more graphs similar to the query graph including by comparing the calculated one or more vectors for the query graph with one or more previously-calculated vectors for a different set of graphs (504). The process refers to a knowledge base such as the one generated by the process of FIG. 1 to find graphs similar to the query graph. The knowledge base contains previously-calculated vectors for a different set of graphs (e.g., graph embeddings of other graphs)"); and displaying content representative of the instances of data corresponding to the points along with an indication of the similarity distance between successive entities, wherein the data comprises images (Silva Figs. 1, 5–7, ¶0091–0111: "The graph embeddings can be used to efficiently serve graph similarity searches. Search engine 620 is configured to determine similarity between graphs and store historical patterns (clusters) to provide a faster response to graph queries compared with conventional techniques. The search engine computes the distance between the vectors (embeddings) calculated by the graph representation engine to determine similarity between graphs. In various embodiments, the search engine is configured to perform 108-110 of FIG. 1 … One way the user interface enables a user to make comparisons and navigate is using a card system in which cards are displayed to a user and the user can interact with the card. A card system maps a visual rectangular pane (a card) to each network. These cards can then be positioned side by side allowing a visual comparison of the information they display. As further described below, a user can navigate the networks by scrolling through a list view. Alternatively, a Map View is provided to support an overview of the set of similar graphs"). 
Regarding claim 13, Silva teaches: a machine-readable storage device having instructions for execution by a processor of a machine to cause the processor to perform operations to perform a method, the operations comprising (Silva Fig. 12, ¶0128–0134: "programming instructions and data … Processor 1202 … memory 1210 … any suitable computer readable storage media"). Although varying in scope, the remaining limitations of claim 13 are substantially the same as the limitations of claim 1. Thus, the remaining limitations of claim 13 are rejected using the same reasoning and analysis as claim 1 above.

Regarding claims 14–19, although varying in scope, the limitations of claims 14–19 are substantially the same as the limitations of claims 2, 4, and 6–12. Thus, claims 14–19 are rejected using the same reasoning and analysis as claims 2, 4, and 6–12 above.

Regarding claim 20, Silva teaches: a device comprising: a processor; and a memory device coupled to the processor and having a program stored thereon for execution by the processor to perform operations comprising (Silva Fig. 12, ¶0128–0134: "programming instructions and data … Processor 1202 … memory 1210 … any suitable computer readable storage media"). Although varying in scope, the remaining limitations of claim 20 are substantially the same as the limitations of claim 1. Thus, the remaining limitations of claim 20 are rejected using the same reasoning and analysis as claim 1 above.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to NICHOLAS SHINE, whose telephone number is (571) 272-2512. The examiner can normally be reached M-F, 9a-5p ET. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, David Yi, can be reached at (571) 270-7519. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/N.B.S./
Examiner, Art Unit 2126

/DAVID YI/
Supervisory Patent Examiner, Art Unit 2126
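The rejection leans heavily on Silva's embedding-based similarity search (¶0040, ¶0062–0064): graphs are converted to vectors, Euclidean distances between the vectors are computed, and the n closest graphs in a knowledge base are returned for a query graph. A minimal sketch of that cited mechanism, with all function names and data hypothetical:

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors (Silva ¶0040)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def n_closest(query_vec, knowledge_base, n=2):
    """Return ids of the n graphs whose stored embeddings are closest
    to the query embedding (Silva ¶0062-0064: "the process returns the
    n closest graphs as the query result")."""
    ranked = sorted(knowledge_base.items(),
                    key=lambda item: euclidean(query_vec, item[1]))
    return [graph_id for graph_id, _ in ranked[:n]]

# Toy knowledge base of previously-calculated 2-D graph embeddings.
kb = {"g1": [0.0, 0.0], "g2": [1.0, 1.0], "g3": [5.0, 5.0]}
print(n_closest([0.9, 1.1], kb))  # -> ['g2', 'g1']
```

This is only an illustration of the distance-ranking step; Silva's actual pipeline additionally clusters the embeddings and compares queries against cluster representatives to avoid scanning every stored graph.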

Prosecution Timeline

Jul 29, 2021
Application Filed
Mar 18, 2024
Non-Final Rejection — §103
Jun 18, 2024
Examiner Interview Summary
Jun 18, 2024
Applicant Interview (Telephonic)
Jul 01, 2024
Response Filed
Oct 18, 2024
Final Rejection — §103
Jan 10, 2025
Examiner Interview Summary
Jan 10, 2025
Applicant Interview (Telephonic)
Jan 30, 2025
Request for Continued Examination
Feb 07, 2025
Response after Non-Final Action
May 17, 2025
Non-Final Rejection — §103
Aug 13, 2025
Applicant Interview (Telephonic)
Aug 13, 2025
Examiner Interview Summary
Aug 14, 2025
Response Filed
Nov 06, 2025
Final Rejection — §103
Jan 12, 2026
Response after Non-Final Action
Jan 12, 2026
Examiner Interview Summary
Jan 12, 2026
Applicant Interview (Telephonic)
Feb 24, 2026
Request for Continued Examination
Mar 07, 2026
Response after Non-Final Action
Mar 21, 2026
Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12579449: HYDROCARBON OIL FRACTION PREDICTION WHILE DRILLING
Granted Mar 17, 2026 (2y 5m to grant)

Patent 12572440: AUTOMATICALLY DETECTING WORKLOAD TYPE-RELATED INFORMATION IN STORAGE SYSTEMS USING MACHINE LEARNING TECHNIQUES
Granted Mar 10, 2026 (2y 5m to grant)

Patent 12561554: ERROR IDENTIFICATION FOR AN ARTIFICIAL NEURAL NETWORK
Granted Feb 24, 2026 (2y 5m to grant)

Patent 12533800: TRAINING REINFORCEMENT LEARNING AGENTS TO LEARN FARSIGHTED BEHAVIORS BY PREDICTING IN LATENT SPACE
Granted Jan 27, 2026 (2y 5m to grant)

Patent 12536428: KNOWLEDGE GRAPHS IN MACHINE LEARNING DECISION OPTIMIZATION
Granted Jan 27, 2026 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 38% (82% with interview, a +44.6% lift)
Median Time to Grant: 5y 1m
PTA Risk: High
Based on 37 resolved cases by this examiner. Grant probability is derived from the career allow rate.
