Prosecution Insights
Last updated: April 19, 2026
Application No. 18/258,523

IMPROVED DISTRIBUTED TRAINING OF GRAPH-EMBEDDING NEURAL NETWORKS

Status: Non-Final OA (§103)
Filed: Jun 20, 2023
Examiner: FIGUEROA, KEVIN W
Art Unit: 2124
Tech Center: 2100 — Computer Architecture & Software
Assignee: Orange
OA Round: 1 (Non-Final)
Grant Probability: 70% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 4y 0m
With Interview: 91%

Examiner Intelligence

Career Allow Rate: 70% (above average; 252 granted / 362 resolved; +14.6% vs TC avg)
Interview Lift: +21.0% (strong; based on resolved cases with interview)
Avg Prosecution: 4y 0m (typical timeline; 20 currently pending)
Total Applications: 382 (career history, across all art units)

Statute-Specific Performance

§101: 24.4% (-15.6% vs TC avg)
§103: 52.0% (+12.0% vs TC avg)
§102: 9.1% (-30.9% vs TC avg)
§112: 7.1% (-32.9% vs TC avg)
Based on career data from 362 resolved cases; deltas are measured against the Tech Center average estimate.

Office Action (§103)

DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

"A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made."

Claims 1-15 are rejected under 35 U.S.C. 103 as being unpatentable over Zheng, Da, et al., "DistDGL: Distributed graph neural network training for billion-scale graphs," in view of Zhu, Rong, et al., "AliGraph: A comprehensive graph neural network platform" [herein Ali].

Regarding claims 1 and 14, Zheng teaches "a computer-implemented method for distributed training of a graph-embedding neural network, the method performed at a first server and comprising" (abstract: "DistDGL distributes the graph and its associated data (initial features and embeddings) across the machines and uses this distribution to derive a computational decomposition by following an owner-compute rule.
DistDGL follows a synchronous training approach and allows ego-networks forming the mini-batches to include non-local nodes"); "computing, based on a first input data sample, first model data and first embedding data of a first graph neural network, the first graph neural network corresponding to a first set of nodes of a graph that are visible to the first server" (pg. 2, left col., ¶1: "It distributes graph data (both graph structure and the associated data, such as node and edge features) across all machines and run trainers, sampling servers (for sampling subgraphs to generate mini-batches) and in-memory KVStore servers (for serving node data and edge data) all on the same set of machines"); "sharing the first model data and the first embedding data with a second server" (pg. 3, left col.: "A number of trainers that compute the gradients of the model parameters over a mini-batch. At each iteration, they first fetch the mini-batch graphs from the samplers and the corresponding vertex/edge features from the KVStore. They then run the forward and backward computation on their own mini-batches in parallel to compute the gradients," i.e., sharing).

While Zheng generally teaches the remaining limitations, Ali, in the same field of endeavor, teaches "receiving second embedding data from a third server" (Ali, pg. 3, left col.: "If the neighbors of a vertex are not cached, a call to remote graph server is needed. When getting the context of a batch of vertices, we first partition the vertices into sub-batches, and the context of each sub-batch will be stitched together after being returned from the corresponding graph server"), and "the second embedding data comprising embedding data of a second graph neural network corresponding to a second set of nodes of the graph that are invisible to the first server" (pg. 5, §3.3: "Recall that, GNN algorithms rely on aggregating neighborhood information to generate embeddings of each vertex.
However, the degree distribution of real-world graphs is often skewed [48], which makes the convolution operation hard to operate. To tackle this, existing GNNs usually adopt various sampling strategies to sample a subset of neighbors with aligned sizes"); and "computing second model data of the first graph neural network based on a second input data sample and the embedding data of the second graph neural network" (Ali, previous citation: "existing GNNs usually adopt various sampling strategies to sample a subset of neighbors with aligned sizes," i.e., using the other data to compute the current data).

It would have been obvious to one having ordinary skill in the art at the time that the invention was effectively filed to combine the teachings of Zheng with that of Ali since "AliGraph runs 40%-50% faster with the novel caching strategy and demonstrates around 12 times speed up with the improved runtime. In addition, our in-house developed GNN models all showcase their statistically significant superiorities in terms of both effectiveness and efficiency (e.g., 4.12%–17.19% lift by F1 scores)." (Ali, abstract). That is, by combining the two references, one would have faster distributed GNN training.

Note that independent claim 14 recites the same substantial subject matter as independent claim 1, differing only in embodiment. The difference in embodiment, a computer-implemented method as opposed to a computer server executing the method, is an obvious variation of the other. The additional limitations of a processor and memory are inherent components of any computing system, such as the system of Zheng and Ali.

Regarding claim 2, the Zheng and Ali references have been addressed above. Ali further teaches "computing third embedding data of the first graph neural network based on the second input data sample and the second embedding data" (Ali, pg. 5, §3.3: "Recall that, GNN algorithms rely on aggregating neighborhood information to generate embeddings of each vertex.
However, the degree distribution of real-world graphs is often skewed [48], which makes the convolution operation hard to operate. To tackle this, existing GNNs usually adopt various sampling strategies to sample a subset of neighbors with aligned sizes"; the system is not limited to any particular number of servers/data, and therefore the functionality is the same whether it is the third, fourth, fifth data, etc.); and Zheng teaches "sharing the third embedding data with the second server" (pg. 8, right col.: "GNN models are composed of multiple operators organized into multiple graph convolution network layers shared among all nodes and edges").

Regarding claim 3, the Zheng and Ali references have been addressed above. Ali further teaches "wherein the embedding data of the second graph neural network is computed by a fourth server" (Ali, pg. 5, §3.3: "Recall that, GNN algorithms rely on aggregating neighborhood information to generate embeddings of each vertex. However, the degree distribution of real-world graphs is often skewed [48], which makes the convolution operation hard to operate. To tackle this, existing GNNs usually adopt various sampling strategies to sample a subset of neighbors with aligned sizes"; the system is not limited to any particular number of servers/data, and therefore the functionality is the same whether it is the third, fourth, fifth data, etc.).

Regarding claim 4, the Zheng and Ali references have been addressed above. Zheng further teaches "wherein the third server is a parameter server that receives the embedding data of the second graph neural network from the fourth server" (pg. 3, left col.: "A number of trainers that compute the gradients of the model parameters over a mini-batch. At each iteration, they first fetch the mini-batch graphs from the samplers and the corresponding vertex/edge features from the KVStore.").

Regarding claim 5, the Zheng and Ali references have been addressed above.
Zheng further teaches "wherein the third server is the fourth server" (abstract: "we develop DistDGL, a system for training GNNs in a mini-batch fashion on a cluster of machines," i.e., servers/machines can be the same or not).

Regarding claim 6, the Zheng and Ali references have been addressed above. Zheng further teaches "wherein the second server is different than the fourth server" (abstract: "we develop DistDGL, a system for training GNNs in a mini-batch fashion on a cluster of machines," i.e., servers/machines can be the same or not, and fig. 3, which shows two distinct machines/servers).

Regarding claim 7, the Zheng and Ali references have been addressed above. Zheng further teaches "wherein sharing the third embedding data with the second server comprises sharing the computed third embedding data and the second embedding data received from the third server" (pg. 8, right col.: "GNN models are composed of multiple operators organized into multiple graph convolution network layers shared among all nodes and edges").

Regarding claim 8, the Zheng and Ali references have been addressed above. Ali further teaches "wherein the third server combines the first embedding data and the embedding data of the second graph neural network to form the second embedding data" (Ali, pg. 3, Algorithm 1, which shows combination of vertices/embeddings).

Regarding claim 9, the Zheng and Ali references have been addressed above. Zheng further teaches "further comprising sharing the second model data of the first graph neural network with the second server" (pg. 8, right col.: "GNN models are composed of multiple operators organized into multiple graph convolution network layers shared among all nodes and edges").

Regarding claim 10, the Zheng and Ali references have been addressed above.
Zheng further teaches "further comprising receiving third model data, comprising a model of the graph-embedding neural network, from the third server, said third model data being used when computing said second model data" (pg. 2, right col.: "This sampling strategy forms a computation graph for passing messages on. Figure 1b depicts such a graph for computing representation of one target vertex when the GNN has two layers. The sampled graph and together with the extracted features are called a mini-batch in GNN training," which can be and is received by any of the servers and used by any).

Regarding claim 11, the Zheng and Ali references have been addressed above. Zheng further teaches "wherein said third model data comprises aggregate model data obtained by aggregating, at the third server, a plurality of model data received from different servers" (pg. 3, left col.: "A dense model update component for aggregating dense GNN parameters to perform synchronous SGD. DistDGL reuses the existing components depending on DGL's backend deep learning frameworks").

Regarding claim 12, the Zheng and Ali references have been addressed above. Zheng further teaches "further comprising aggregating the third model data with the first model data to produce aggregate model data; and using the aggregate model data when computing the second model data" (pg. 3, left col.: "A dense model update component for aggregating dense GNN parameters to perform synchronous SGD. DistDGL reuses the existing components depending on DGL's backend deep learning frameworks," and pg. 6, left col.: "This hybrid approach is potentially more advantageous than the multiprocessing approach for synchronous SGD because we need to aggregate gradients of model parameters from all trainer processes and broadcast new model parameters to all trainers. More trainer processes result in more communication overhead for model parameter updates").

Regarding claim 13, the Zheng and Ali references have been addressed above.
Zheng further teaches "wherein computing the second model data of the first graph neural network comprises integrating the embedding data of the second graph neural network into the first graph neural network beginning at a first convolutional layer of the first graph neural network" (pg. 2, right col.: "Similar to convolutional neural networks (CNNs), a GNN model iteratively applies Equations (1) to generate vertex representations for multiple layers.").

Regarding claim 15, the Zheng and Ali references have been addressed above. Zheng further teaches "the computer server of claim 14; and at least one server, connected to the computer server" (abstract: "we develop DistDGL, a system for training GNNs in a mini-batch fashion on a cluster of machines"; machines, i.e., servers). Ali teaches "said at least one server configured to receive model data and embedding data from the computer server and to return aggregate model data and aggregate embedding data to the computer server" (pg. 3, right col.: "Specifically, we apply the SAMPLE function to fetch a subset S of vertices based on the neighbor set Nb(v) of vertex v, aggregate the embeddings of all vertices u ∈ S by the AGGREGATE function to obtain a vector h'_v, and combine h'_v with h_v^(k-1) to generate the embedding vector h_v^(k) by the COMBINE function. After processing all vertices, the embedding vectors are normalized. Finally, after k_max hops, h_v^(k_max) is returned as the embedding result h_v of vertex v").

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to KEVIN W FIGUEROA, whose telephone number is (571) 272-4623. The examiner can normally be reached Monday-Friday, 10AM-6PM EST. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, MIRANDA HUANG, can be reached at (571) 270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

KEVIN W FIGUEROA
Primary Examiner
Art Unit 2124

/Kevin W Figueroa/
Primary Examiner, Art Unit 2124
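The two technical mechanisms the examiner leans on, AliGraph's SAMPLE/AGGREGATE/COMBINE loop (cited for claim 15) and DistDGL's synchronous-SGD gradient aggregation (cited for claims 11-12), can be illustrated with a minimal sketch. This is not either reference's actual API: the function names, the mean aggregator, and the elementwise-average combiner are illustrative assumptions; the papers leave the concrete operators open.

```python
import random

def khop_embedding(v, neighbors, h0, k_max, sample_size):
    """Sketch of the AliGraph-style loop quoted for claim 15:
    SAMPLE a fixed-size neighbor subset, AGGREGATE its embeddings,
    COMBINE with the previous hop's embedding, then normalize."""
    h = dict(h0)            # hop-0 embeddings for every vertex
    vertices = list(h0)
    for _ in range(k_max):
        h_next = {}
        for u in vertices:
            # SAMPLE: fixed-size subset of Nb(u), taming skewed degrees
            s = random.sample(neighbors[u], min(sample_size, len(neighbors[u])))
            # AGGREGATE: mean of the sampled neighbors' embeddings
            agg = [sum(x) / len(s) for x in zip(*(h[w] for w in s))] if s else h[u]
            # COMBINE: merge h'_u with h_u^(k-1) (elementwise average here)
            h_next[u] = [(a + b) / 2 for a, b in zip(agg, h[u])]
        # normalize each embedding vector (L2)
        for u, vec in h_next.items():
            n = sum(x * x for x in vec) ** 0.5 or 1.0
            h_next[u] = [x / n for x in vec]
        h = h_next
    return h[v]             # h_v^(k_max), the embedding result for v

def aggregate_gradients(per_trainer_grads):
    """Sketch of the synchronous-SGD step quoted for claims 11-12:
    average gradients from all trainers before one shared model update."""
    n = len(per_trainer_grads)
    return [sum(gs) / n for gs in zip(*per_trainer_grads)]
```

The point of the sketch for prosecution purposes: in this decomposition the "sharing" and "aggregating" steps are generic to any parameter-server design, which is why distinguishing the claims will likely turn on the visible/invisible node partition rather than on aggregation per se.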

Prosecution Timeline

Jun 20, 2023: Application Filed
Mar 21, 2026: Non-Final Rejection under §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12586093: SYSTEMS AND METHODS FOR FACILITATING NETWORK CONTENT GENERATION VIA A DYNAMIC MULTI-MODEL APPROACH. Granted Mar 24, 2026 (2y 5m to grant).
Patent 12573477: MOLECULAR STRUCTURE ACQUISITION METHOD AND APPARATUS, ELECTRONIC DEVICE AND STORAGE MEDIUM. Granted Mar 10, 2026 (2y 5m to grant).
Patent 12570281: METHOD FOR EVALUATING DRIVING RISK LEVEL IN TUNNEL BASED ON VEHICLE BUS DATA AND SYSTEM THEREFOR. Granted Mar 10, 2026 (2y 5m to grant).
Patent 12554964: CIRCUIT FOR HANDLING PROCESSING WITH OUTLIERS. Granted Feb 17, 2026 (2y 5m to grant).
Patent 12547873: METHOD AND APPARATUS WITH NEURAL NETWORK INFERENCE OPTIMIZATION IMPLEMENTATION. Granted Feb 10, 2026 (2y 5m to grant).
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 70%
With Interview: 91% (+21.0%)
Median Time to Grant: 4y 0m
PTA Risk: Low
Based on 362 resolved cases by this examiner. Grant probability derived from career allow rate.
