Last updated: May 29, 2026
Application No. 18/143,430
LEARNING GRAPH REPRESENTATIONS USING HIERARCHICAL TRANSFORMERS FOR CONTENT RECOMMENDATION

Final Rejection (§101, §103)
Filed: May 04, 2023
Priority: Aug 31, 2020 (provisional 63/072,770, +1 more)
Examiner: CHUANG, SU-TING
Art Unit: 2146
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Microsoft Technology Licensing, LLC
OA Round: 2 (Final)
Interview: Optional (+37.2% interview lift). An interview has already been conducted in this application's prosecution history. This examiner has a 50% grant rate with a +37.2% interview lift. Since an interview has already been tried, the recommendation is a written response with narrowed claims based on precedent claim evolution patterns.
Based on 104 resolved cases, 2023–2026
Examiner Intelligence

Examiner: CHUANG, SU-TING
Career Allowance Rate: 50% of resolved cases (52 granted / 104 resolved), -5.0% vs TC avg
Interview Lift: +37.2% (resolved cases with an interview vs. without)
Avg Prosecution (typical timeline): 4y 6m
Currently pending: 18
Total Applications (career history, across all art units): 130
Statute-Specific Performance

§101: 11.7% (-28.3% vs TC avg)
§103: 75.8% (+35.8% vs TC avg)
§102: 8.7% (-31.3% vs TC avg)
§112: 2.0% (-38.0% vs TC avg)
Based on career data from 104 resolved cases; Tech Center averages are estimates.
Office Action

§101 §103
DETAILED ACTION
This action is in response to the communications filed on 03/05/2026, in which claims 21, 28, 30 and 36-39 are amended and the second claim 36 is canceled; claims 21-40 are therefore pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 21-40 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1: Claims 21-29 recite a method. Claims 30-36 recite a non-transitory medium. Claims 37-40 recite a system comprising a processor and memory. Therefore, claims 21-29 are directed to a process, claims 30-36 are directed to a manufacture, and claims 37-40 are directed to a machine.

With respect to claims 21, 30 and 37:
2A Prong 1: The claim recites a judicial exception.
capturing… interaction information for the source entity-relation pair information and the neighborhood entity-relation pair information (mental process – evaluation or judgement, a human can manually evaluate interaction information between the source and the neighbor)
providing… link predictions for the incomplete triplet based on the interaction information (mental process – evaluation or judgement, a human can manually predict/determine link predictions based on the interaction information)
selecting one of the link predictions to be a target node for the incomplete triplet (mental process – evaluation or judgement, a human can manually select one of the predictions)
adding the target node to the incomplete triplet in the knowledge graph (mental process – evaluation or judgement, a human can manually add the node to the triplet)

2A Prong 2: The judicial exception is not integrated into a practical application.
(claim 30) A non-transitory computer-readable medium storing instructions for completing an incomplete triplet in a knowledge graph, the instructions when executed by one or more processors of a computing device, cause the computing device to (claim 37) A system for completing an incomplete triplet in a knowledge graph, the system comprising: a processor; memory storing computer-executable instructions, which when executed, cause the system to (mere instructions to apply an exception – MPEP 2106.05(f), (2) invoking generic computer components)
receiving… a source entity-relation pair information from the knowledge graph (insignificant extra-solution activity – MPEP 2106.05(g), (3) data gathering and outputting)
receiving… neighborhood entity-relation pair information from the knowledge graph (insignificant extra-solution activity – MPEP 2106.05(g), (3) data gathering and outputting)
by a first transformer…  by the first transformer… by the first transformer… by a second transformer different from the first transformer… (insignificant extra-solution activity – MPEP 2106.05(g), (1) Whether the extra-solution limitation is well known)

Since the claim as a whole, looking at the additional elements individually and in combination, does not contain any other additional elements that are indicative of integration into a practical application, the claim is directed to an abstract idea.

2B: The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception.
(claim 30) A non-transitory computer-readable medium storing instructions for completing an incomplete triplet in a knowledge graph, the instructions when executed by one or more processors of a computing device, cause the computing device to (claim 37) A system for completing an incomplete triplet in a knowledge graph, the system comprising: a processor; memory storing computer-executable instructions, which when executed, cause the system to (mere instructions to apply an exception – MPEP 2106.05(f), (2) invoking generic computer components)
receiving… a source entity-relation pair information from the knowledge graph (insignificant extra-solution activity – MPEP 2106.05(g), (3) data gathering and outputting, and WURC: receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 - MPEP 2106.05(d)(II)(i))
receiving… neighborhood entity-relation pair information from the knowledge graph (insignificant extra-solution activity – MPEP 2106.05(g), (3) data gathering and outputting, and WURC: receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 - MPEP 2106.05(d)(II)(i))
by a first transformer…  by the first transformer… by the first transformer… by a second transformer different from the first transformer… (insignificant extra-solution activity – MPEP 2106.05(g), (1) Whether the extra-solution limitation is well known; in light of spec. [0006] and [0026] “The model includes two different Transformer blocks.” The hierarchical two-transformer architecture is an architecture of multiple transformer blocks, which is well-known and in common use, i.e. multi-layer transformer is a common use in Large Language Models (LLMs). BERT, GPT, UNILM are all architectures with multiple transformer blocks.)
Devlin (“BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”) teaches (Devlin, p. 3, Model Architecture “In this work, we denote the number of layers (i.e., Transformer blocks) as L”) 
Horsuwan (“A Comparative Study of Pretrained Language Models on Thai Social Text Categorization”) teaches (Horsuwan, p. 2, 5.1 Implementation Details “GPT… The resulting model has 12 layers of transformer each with 12 self-attention heads… BERT. We used the publicly available BERT_BASE… 12 self-attention heads, and 12 transformer blocks.”) 

Dong (“Unified Language Model Pre-training for Natural Language Understanding and Generation”) teaches transformer Block 1, transformer Block 2… transformer Block L in Figure 1. 

Considering the additional elements individually and in combination, and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. Therefore, the claim is not patent eligible.

With respect to claims 22, 31 and 39:
2A Prong 1: The claim recites a judicial exception.
further comprising: determining neighborhood relational information from the knowledge graph (mental process – evaluation or judgement, a human can manually determine relational information)
converting the neighborhood relational information into the neighborhood entity-relation pair information (mental process – evaluation or judgement, a human can manually convert the relational information)

With respect to claims 23 and 32:
2A Prong 1: The claim recites a judicial exception.
wherein providing the link predictions for the incomplete triplet includes: aggregating the interaction information for the source entity-relation pair information and the interaction information for the neighborhood entity-relation pair information (mental process – evaluation or judgement, a human can manually aggregate the interaction information)
providing target entity predictions based on the aggregated interaction information (mental process – evaluation or judgement, predicting/determining target entity predictions based on the aggregated information, a human can manually provide predictions)

With respect to claims 24, 33 and 38:
2A Prong 1: The claim recites a judicial exception.
further comprising: converting the incomplete triplet from the knowledge graph to the source entity-relation pair information, wherein the incomplete triplet is missing one of a subject or an object (mental process – evaluation or judgement, a human can manually convert the triplet)

With respect to claims 25, 34 and 40:
2A Prong 1: The claim recites a judicial exception.
wherein selecting the one of the link predictions to be a target node for the incomplete triplet comprises (mental process – evaluation or judgement, a human can manually select one of the link predictions)
ranking the link predictions based on a plausibility score (mental process – evaluation or judgement, a human can manually rank the link predictions)
selecting the highest ranked link prediction to be the target node for the incomplete triplet (mental process – evaluation or judgement, a human can manually select the highest link prediction)

With respect to claims 26 and 35:
2A Prong 2: The judicial exception is not integrated into a practical application.
wherein the source entity-relation pair information further comprises a token embedding, a source embedding, and a predicate embedding (insignificant extra-solution activity – MPEP 2106.05(g), (3) data gathering and outputting. Claim 21 recites “receiving a source entity-relation pair information” which is insignificant extra-solution activity.)

Since the claim as a whole, looking at the additional elements individually and in combination, does not contain any other additional elements that are indicative of integration into a practical application, the claim is directed to an abstract idea.

2B: The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception.
wherein the source entity-relation pair information further comprises a token embedding, a source embedding, and a predicate embedding (insignificant extra-solution activity – MPEP 2106.05(g), (3) data gathering and outputting, and WURC: receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 - MPEP 2106.05(d)(II)(i). Claim 21 recites “receiving a source entity-relation pair information” which is insignificant extra-solution activity.)

Considering the additional elements individually and in combination, and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. Therefore, the claim is not patent eligible.

With respect to claim 27:
2A Prong 2: The judicial exception is not integrated into a practical application.
wherein the token embedding is a classification token (insignificant extra-solution activity – MPEP 2106.05(g), (3) data gathering and outputting. Claim 26 recites “the source entity-relation pair information further comprises a token embedding…” which is insignificant extra-solution activity.)

Since the claim as a whole, looking at the additional elements individually and in combination, does not contain any other additional elements that are indicative of integration into a practical application, the claim is directed to an abstract idea.

2B: The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception.
wherein the token embedding is a classification token (insignificant extra-solution activity – MPEP 2106.05(g), (3) data gathering and outputting, and WURC: receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 - MPEP 2106.05(d)(II)(i). Claim 26 recites “the source entity-relation pair information further comprises a token embedding…” which is insignificant extra-solution activity.)

Considering the additional elements individually and in combination, and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. Therefore, the claim is not patent eligible.

With respect to claims 28 and 36:
2A Prong 1: The claim recites a judicial exception.
wherein providing link predictions further comprises: providing a token for each link prediction, wherein the token comprises an aggregation of the source embedding and the predicate embedding (mental process – evaluation or judgement, using a token for link prediction, a human can manually provide/determine a token for link prediction)
determining a plausibility score for the link prediction based on the token (mental process – evaluation or judgement, a human can manually determine a score)

With respect to claim 29:
2A Prong 2: The judicial exception is not integrated into a practical application.
wherein the knowledge graph comprises a plurality of nodes connected by edges, wherein each of the plurality of nodes comprises an entity and each of the edges represents a relationship between two of the plurality of entities (a particular technological environment or field of use – MPEP 2106.05(h))

Since the claim as a whole, looking at the additional elements individually and in combination, does not contain any other additional elements that are indicative of integration into a practical application, the claim is directed to an abstract idea.

2B: The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception.
wherein the knowledge graph comprises a plurality of nodes connected by edges, wherein each of the plurality of nodes comprises an entity and each of the edges represents a relationship between two of the plurality of entities (a particular technological environment or field of use – MPEP 2106.05(h))

Considering the additional elements individually and in combination, and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. Therefore, the claim is not patent eligible.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 21-25, 29-34 and 37-40 are rejected under 35 U.S.C. 103 as being unpatentable over Nathani ("Learning Attention-based Embeddings for Relation Prediction in Knowledge Graphs" 20190604) in view of Peters ("Knowledge Enhanced Contextual Word Representations" 20191031) and further in view of Lecue (US 20200242484 A1, filed on 20190124).

In regard to claims 21, 30 and 37, Nathani teaches: A computer-implemented method of completing an incomplete triplet in a knowledge graph comprising: (Nathani, p. 1 "The recent proliferation of knowledge graphs (KGs) [a knowledge graph] coupled with incomplete or partial information, in the form of missing relations (links) between entities..."; p. 6, 4.3 Evaluation Protocol "In the relation prediction task, the aim is to predict a triple (ei, rk, ej) with ei or ej missing, i.e., predict ei given (rk; ej) or predict ej given (ei; rk). [completing an incomplete triplet]")

receiving, by a first transformer, a source entity-relation pair information from the knowledge graph; (Nathani, p. 1 "Our idea is: 1) to capture multi-hop relations... surrounding a given node,"; p. 3, 3.3 Relations are important "we propose a novel embedding approach to incorporate relation and neighboring node features in the attention mechanism [a first transformer]... a particular triple t_ijk = (ei, rk, ej), as is shown in Equation 5… c_ijk = W1[hi||hj||gk] (5) [a source entity-relation pair information, [hi||hj||gk] when j is 1-hop] where c_ijk is the vector representation of a triple t_ijk. Vectors hi, hj, and gk denote embeddings of entities ei, ej and relation rk, respectively... where Ni denotes the neighborhood of entity ei... j∈Ni"; see Fig. 3, when j is 1-hop neighbor, [hi||hj||gk] is provided to the attention mechanism, i.e. a source entity-relation pair information is received by a first transformer; when j is 1-hop neighbor, the source pair is a source node i and its 1-hop neighbors, i.e. its direct neighbors)

receiving, by the first transformer, neighborhood entity-relation pair information from the knowledge graph; (Nathani, p. 1 "Our idea is: 1) to capture multi-hop relations... surrounding a given node,"; p. 3, 3.3 Relations are important "we propose a novel embedding approach to incorporate relation and neighboring node features in the attention mechanism [the first transformer]... a particular triple t_ijk = (ei, rk, ej), as is shown in Equation 5… c_ijk = W1[hi||hj||gk] (5) [neighborhood entity-relation pair information, [hi||hj||gk] when j is n-hop, n > 1] where c_ijk is the vector representation of a triple t_ijk. Vectors hi, hj, and gk denote embeddings of entities ei, ej and relation rk, respectively... where Ni denotes the neighborhood of entity ei... j∈Ni"; see Fig. 3, when j is n-hop neighbor n>1, [hi||hj||gk] is provided to the attention mechanism, i.e. neighborhood entity-relation pair information is received by the first transformer; when j is n-hop neighbor n>1, the neighborhood pair is a source node i and its n-hop neighbors, n>1, i.e. 2-hop, 3-hop, etc.)
capturing, by the first transformer, interaction information for the source entity-relation pair information and the neighborhood entity-relation pair information; (Nathani, p. 3, 3.3 Relations are important "The new embedding of the entity ei is the sum of each triple representation weighted by their attention values as shown in Equation 8... As suggested by (Velickovic et al., 2018), multi-head attention which was first introduced by (Vaswani et al., 2017), is used to stabilize the learning process and encapsulate more information about the neighborhood. Essentially, M independent attention mechanisms calculate the embeddings, which are then concatenated, resulting in the following representation: hi' = || σ(Σ_j∈Ni α_ijk_m c_ijk_m) (9) This is the graph attention layer shown in Figure 4."; attention mechanism uses dot products to capture interaction information between different parts of an input data/embeddings [capturing interaction information]; specifically, the results of Eq. (8) and (9) are the 'interaction' for a source node i and all neighbors j (j∈Ni) in the graph, i.e. [the interaction information (Eq. 8-9) for the source pair (a source node and its 1-hop neighbors, i.e. its direct neighbors) and the neighborhood pair (a source node and its n-hop neighbors, n > 1, i.e. 2-hop, 3-hop, etc.)])
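
As an illustration of the attention-weighted aggregation the examiner relies on, the following is a minimal Python sketch, not Nathani's implementation. It follows Eq. 5 as quoted (c_ijk = W1[hi||hj||gk]) and the weighted sum described for Eq. 8. The scoring step that turns each c_ijk into an attention weight is not quoted in the action, so the LeakyReLU-plus-softmax scoring, the tanh nonlinearity standing in for σ, and all names and numeric values (`attend`, `W1`, `w2`) are assumptions made for this example.

```python
import math

def matvec(W, x):
    # Plain matrix-vector product on nested lists.
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(h_i, neighbors, W1, w2):
    """neighbors: list of (h_j, g_k) embedding pairs for entity i.
    Returns (attention weights, new embedding of entity i)."""
    # Eq. 5 as quoted: c_ijk = W1 [h_i || h_j || g_k]
    cs = [matvec(W1, h_i + h_j + g_k) for h_j, g_k in neighbors]
    # Assumed scoring step: one scalar per triple, normalized over the neighborhood.
    scores = [leaky_relu(sum(a * b for a, b in zip(w2, c))) for c in cs]
    alphas = softmax(scores)
    # Eq. 8 as described: attention-weighted sum of triple representations,
    # followed by a nonlinearity (tanh is an assumption here).
    dim = len(cs[0])
    summed = [sum(a * c[d] for a, c in zip(alphas, cs)) for d in range(dim)]
    return alphas, [math.tanh(s) for s in summed]

# Hypothetical toy values: 2-d entity embeddings, 1-d relation embeddings.
h_i = [1.0, 0.0]
neighbors = [([0.0, 1.0], [1.0]), ([1.0, 1.0], [0.0])]
W1 = [[0.5, 0.0, 0.5, 0.0, 0.1],
      [0.0, 0.5, 0.0, 0.5, -0.1]]
w2 = [1.0, -1.0]
alphas, new_h = attend(h_i, neighbors, W1, w2)
```

The multi-head form of Eq. 9 would repeat this computation M times with independent parameters and concatenate the results.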

Nathani does not teach, but Peters teaches: providing, by a second transformer different from the first transformer, link predictions for the incomplete triplet based on the interaction information; (Peters, p. 3, 3.2 Knowledge Bases "we adopt a broad definition for a KB in the most general sense as fixed collection of K entity nodes, ek, from which it is possible to compute entity embeddings, ek ∈ RE. This includes KBs with a typical (subj, rel, obj) [a triple or entity nodes in a KB, represented as entity embeddings] graph structure..."; p. 3, 3.1 Pretrained BERT "The masked LM objective randomly replaces a percentage of input word pieces with a special [MASK] token [incomplete, hidden] and computes the negative log-likelihood of the missing token with a linear layer and softmax over all possible word pieces."; p. 7, 4.2 Intrinsic Evaluation "we evaluated whether a model could recover the masked entity [a second transformer providing link predictions for the incomplete triplet] by computing the mean reciprocal rank (MRR) of the masked word pieces."; p. 4, Figure 1 "The Knowledge Attention and Recontextualization (KAR) component. BERT word piece representations (Hi) are first projected to H_proj_i (1), then pooled over candidate mentions spans (2) to compute S, and contextualized into Se using mention-span self-attention (3). [the first transformer] An integrated entity linker... enhance the span representations with knowledge from the KB (5), computing S'e. Finally, the BERT word piece representations are recontextualized with word-to-entity-span attention (6) [a second transformer] and projected back..."; p. 4, Entity linker "It first runs mention-span self-attention to compute Se = TransformerBlock(S). (2) [the first transformer]"; p. 5, Recontextualization "The word-to-entity-span attention in KnowBert... H'i_proj = MLP(MultiHeadAttn(H_proj, S'e, S'e) [a second transformer]"; in light of spec. 
[0006] and [0026] “The model includes two different Transformer blocks.”; masked word (token) pieces in the masked entity is predicted by the model, where the model comprises two self-attention mechanisms, i.e. two transformer blocks, i.e. a second transformer predicts the masked/missing token based on the information captured by the first transformer)
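
The mean reciprocal rank (MRR) metric cited from Peters can be illustrated with a short Python sketch. This is a generic MRR computation under the usual definition (the average of 1/rank of the correct answer), not code from Peters; the function name and the toy data are hypothetical.

```python
def mean_reciprocal_rank(ranked_candidates, gold):
    """ranked_candidates: one candidate list per query, best guess first.
    gold: the correct answer for each query."""
    total = 0.0
    for candidates, answer in zip(ranked_candidates, gold):
        if answer in candidates:
            # Rank is 1-based, so the top guess contributes 1.0.
            total += 1.0 / (candidates.index(answer) + 1)
        # An answer missing from the list contributes 0.
    return total / len(gold)

# Hypothetical predictions for three masked entities.
rankings = [
    ["Paris", "Lyon", "Nice"],   # correct answer at rank 1
    ["Lyon", "Paris", "Nice"],   # correct answer at rank 2
    ["Nice", "Lyon", "Paris"],   # correct answer at rank 3
]
answers = ["Paris", "Paris", "Paris"]
mrr = mean_reciprocal_rank(rankings, answers)  # (1 + 1/2 + 1/3) / 3
```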

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified Nathani to incorporate the teachings of Peters by including entity spans and recontextualization. Doing so would allow long range interactions between contextual word representations and all entity spans in the context. (Peters, p. 1, 1 Introduction "The key idea is to explicitly model entity spans in the input text and use an entity linker to retrieve relevant entity embeddings from a KB to form knowledge enhanced entity-span representations. Then, the model recontextualizes the entity-span representations with word-to-entity attention to allow long range interactions between contextual word representations and all entity spans in the context.")

Nathani and Peters do not teach, but Lecue teaches: selecting one of the link predictions to be a target node for the incomplete triplet; and (Lecue, [0018] "The KG system 100 further includes discovery score computation circuitry 140 for computing an aggregate score for one or more discovery candidate nodes that were filtered down by the embedding space slicing circuitry 130. After the substitution scores are computed for the discovery candidate nodes, a node having the highest substitution score may be selected, [selecting the one of the link predictions to be a target node, ranking the link predictions based on a plausibility score, selecting the highest ranked link prediction to be the target node]"; [0028] "An exemplary vector triple may include the following format: <head entity, relationship, tail entity>")
adding the target node to the incomplete triplet in the knowledge graph. (Lecue, [0018] "the compound represented by the selected node may be selected to be the discovery compound that will either replace the selected compound [adding the target node to the incomplete triplet] or be added to the formulation."; [0019] "Furthermore, the utilization of the knowledge graph representation of formulations provides a more efficient data structure to allow for the discovery process to be implemented within the embedding space.")
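
The last two claimed steps (selecting one link prediction and adding it to the incomplete triplet) can be sketched in Python as follows. This is a hypothetical illustration of the claim language as mapped by the examiner, not code from Lecue or the application; the scores, entity names, and the `complete_triplet` helper are invented for the example, and the graph is assumed to be a plain set of (head, relation, tail) tuples.

```python
def complete_triplet(head, relation, scored_candidates, graph):
    """Rank candidate tails by score, pick the best, and add the
    completed triplet to the graph (a set of 3-tuples)."""
    ranked = sorted(scored_candidates.items(), key=lambda kv: kv[1], reverse=True)
    target, _ = ranked[0]              # highest-scoring candidate
    triplet = (head, relation, target)
    graph.add(triplet)                 # write the completed fact back
    return triplet, [name for name, _ in ranked]

# Hypothetical graph with one known fact and an incomplete
# triplet (Marie_Curie, born_in, ?).
graph = {("Marie_Curie", "field", "Physics")}
scores = {"Warsaw": 0.91, "Paris": 0.47, "Physics": 0.02}
triplet, ranking = complete_triplet("Marie_Curie", "born_in", scores, graph)
```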

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified Nathani and Peters to incorporate the teachings of Lecue by including the node being added to the formulation. Doing so would allow a new and oftentimes unexpected formulation to be created while still satisfying desired compound attributes. (Lecue, [0019] "By including the discovery node as described herein, a new and oftentimes unexpected formulation may be created while still satisfying desired compound attributes.")

Claims 30 and 37 recite substantially the same limitations as claim 21; therefore the rejection applied to claim 21 also applies to claims 30 and 37. In addition, Nathani teaches: (claim 30) A non-transitory computer-readable medium storing instructions for completing an incomplete triplet in a knowledge graph, the instructions when executed by one or more processors of a computing device, cause the computing device to (claim 37) A system for completing an incomplete triplet in a knowledge graph, the system comprising: a processor; memory storing computer-executable instructions, which when executed, cause the system to (Nathani, p. 9, Acknowledgments "We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research"; p. 6, 4.2 Training Protocol "We use Adam to optimize all the parameters with initial learning rate set at 0.001."; the implementation with a GPU inherently teaches all the computer components)

In regard to claims 22, 31 and 39, Nathani teaches: further comprising: determining neighborhood relational information from the knowledge graph; and (Nathani, p. 3, 3.3 Relations are important "where Ni denotes the neighborhood of entity ei and Rij denotes the set of relations connecting entities ei and ej. [neighborhood relational information r]")
converting the neighborhood relational information into the neighborhood entity-relation pair information. (Nathani, p. 1 "Our idea is: 1) to capture multi-hop relations... surrounding a given node,"; p. 3, 3.3 Relations are important "a particular triple t_ijk = (ei, rk, ej), as is shown in Equation 5… c_ijk = W1[hi||hj||gk] (5) [the neighborhood entity-relation pair information, [hi||hj||gk] when j is n-hop, n > 1] where c_ijk is the vector representation of a triple t_ijk. Vectors hi, hj, and gk denote embeddings of entities ei, ej and relation rk, [the neighborhood relational information r] respectively... where Ni denotes the neighborhood of entity ei... j∈Ni"; when j is n-hop neighbor n>1, [hi||hj||gk] is [the neighborhood entity-relation pair information]. The neighborhood pair is a source node and its n-hop neighbors, n > 1, i.e. 2-hop, 3-hop, etc.)
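
The examiner's mapping turns on the difference between 1-hop neighbors (read as the source pair) and n-hop neighbors with n > 1 (read as the neighborhood pair). A minimal Python sketch of that distinction, using a breadth-first search over directed edges, is shown below. It is an illustration only and not Nathani's method; the edge list, the `hop_distances` helper, and the choice to follow edges in the head-to-tail direction are assumptions.

```python
from collections import deque

def hop_distances(edges, source, max_hops):
    """Breadth-first search over (head, relation, tail) edges.
    Returns {entity: hop count} for entities within max_hops of source."""
    adjacency = {}
    for head, _, tail in edges:
        adjacency.setdefault(head, []).append(tail)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in adjacency.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    del dist[source]
    return dist

# Hypothetical toy graph.
edges = [("A", "r1", "B"), ("B", "r2", "C"), ("C", "r3", "D"), ("A", "r4", "C")]
hops = hop_distances(edges, "A", max_hops=2)
one_hop = sorted(n for n, d in hops.items() if d == 1)    # direct neighbors
multi_hop = sorted(n for n, d in hops.items() if d > 1)   # n-hop, n > 1
```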

In regard to claims 23 and 32, Nathani teaches: aggregating the interaction information for the source entity-relation pair information and the interaction information for the neighborhood entity-relation pair information; and (Nathani, p. 3, 3.3 Relations are important "The new embedding of the entity ei is the sum of each triple representation weighted by their attention values as shown in Equation 8... As suggested by (Velickovic et al., 2018), multi-head attention which was first introduced by (Vaswani et al., 2017), is used to stabilize the learning process and encapsulate more information about the neighborhood. Essentially, M independent attention mechanisms calculate the embeddings, which are then concatenated, resulting in the following representation: hi' = || σ(Σ_j∈Ni α_ijk_m c_ijk_m) (9) This is the graph attention layer shown in Figure 4."; attention mechanism aggregates input information by creating a weighted sum of the input data/embeddings [aggregating the interaction information]; specifically, the results of Eq. (8) and (9) are the 'aggregated interaction' for a source node i and all neighbors j (j∈Ni) in the graph, i.e. [the interaction information (Eq. 8-9) for the source pair (a source node and its 1-hop neighbors, i.e. its direct neighbors) and the neighborhood pair (a source node and its n-hop neighbors, n > 1, i.e. 2-hop, 3-hop, etc.)])
Nathani does not teach, but Peters teaches: wherein providing the link predictions for the incomplete triplet includes: (Peters, p. 3, 3.2 Knowledge Bases "we adopt a broad definition for a KB in the most general sense as fixed collection of K entity nodes, ek, from which it is possible to compute entity embeddings, ek ∈ RE. This includes KBs with a typical (subj, rel, obj) [the triple] graph structure..."; p. 3, 3.1 Pretrained BERT "The masked LM objective randomly replaces a percentage of input word pieces with a special [MASK] token [incomplete triple] and computes the negative log-likelihood of the missing token with a linear layer and softmax over all possible word pieces."; p. 7, 4.2 Intrinsic Evaluation "we evaluated whether a model could recover the masked entity [providing link predictions for the incomplete triplet] by computing the mean reciprocal rank (MRR) of the masked word pieces.")
… providing target entity predictions based on the aggregated interaction information. (Peters, p. 4, Entity linker "It first runs mention-span self-attention to compute Se = TransformerBlock(S). (2) [the interaction information]"; p. 5, Recontextualization "The word-to-entity-span attention in KnowBert... H'i_proj = MLP(MultiHeadAttn(H_proj, S'e, S'e) [attention mechanism: aggregating the interaction information]"; p. 7, 4.2 Intrinsic Evaluation "we evaluated whether a model could recover the masked entity [providing target entity predictions] by computing the mean reciprocal rank (MRR) of the masked word pieces."; attention mechanism aggregates input information by creating a weighted sum of the input data/embeddings; both Nathani and Peters teach the concept of the aggregated interaction information)
The rationale for combining the teachings of Nathani and Peters is the same as set forth in the rejection of claim 21.

In regard to claims 24, 33 and 38, Nathani teaches: further comprising: converting the incomplete triplet from the knowledge graph to the source entity-relation pair information, (Nathani, p. 1 "Our idea is: 1) to capture multi-hop relations... surrounding a given node,"; p. 3, 3.3 Relations are important "a particular triple t_ijk = (ei, rk, ej), as is shown in Equation 5… c_ijk = W1[hi||hj||gk] (5) [the source entity-relation pair information] where c_ijk is the vector representation of a triple t_ijk... where Ni denotes the neighborhood of entity ei... j∈Ni"; see Fig. 3, when j is 1-hop neighbor, [hi||hj||gk] is the source entity-relation pair information; when j is 1-hop neighbor, the source pair is a source node i and its 1-hop neighbors, i.e. its direct neighbors)
wherein the incomplete triplet is missing one of a subject or an object. (Nathani, p. 6, 4.3 Evaluation Protocol "In the relation prediction task, the aim is to predict a triple (ei, rk, ej) with ei or ej missing, [incomplete triplet, missing a subject (ei) or an object (ej)] i.e., predict ei given (rk; ej) or predict ej given (ei; rk). We generate a set of (N-1) corrupt triples for each entity... by replacing it with every other entity...")
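
Equation 5 of Nathani as quoted above, c_ijk = W1[hi||hj||gk], concatenates the two entity embeddings and the relation embedding and applies a linear map. A minimal sketch follows, assuming two-dimensional toy embeddings and a hand-picked 3x6 matrix `W1`; the helper names and all numeric values are hypothetical, and in Nathani the embeddings and `W1` are learned.

```python
def concat(*vectors):
    """Concatenate vectors end to end, i.e. the || operator in Equation 5."""
    joined = []
    for vector in vectors:
        joined.extend(vector)
    return joined


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def triple_vector(W1, h_i, h_j, g_k):
    """Compute c_ijk = W1 [h_i || h_j || g_k] for one triple (e_i, r_k, e_j)."""
    return matvec(W1, concat(h_i, h_j, g_k))


# Illustrative values only.
h_i = [1.0, 0.0]   # source entity embedding
h_j = [0.0, 1.0]   # 1-hop neighbor entity embedding
g_k = [0.5, 0.5]   # relation embedding
W1 = [
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0, 1.0],
]
c_ijk = triple_vector(W1, h_i, h_j, g_k)
```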

In regard to claims 25, 34 and 40, Nathani does not teach, but Lecue teaches: wherein selecting the one of the link predictions to be a target node for the incomplete triplet comprises: ranking the link predictions based on a plausibility score; and selecting the highest ranked link prediction to be the target node for the incomplete triplet. (Lecue, [0018] "The KG system 100 further includes discovery score computation circuitry 140 for computing an aggregate score for one or more discovery candidate nodes that were filtered down by the embedding space slicing circuitry 130. After the substitution scores are computed for the discovery candidate nodes, a node having the highest substitution score may be selected, [selecting the one of the link predictions to be a target node, ranking the link predictions based on a plausibility score, selecting the highest ranked link prediction to be the target node]")
The rationale for combining the teachings of Nathani, Peters and Lecue is the same as set forth in the rejection of claim 21.
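
The ranking and selection recited in claims 25, 34 and 40 can be sketched as below. This is an illustration under stated assumptions, not Lecue's discovery score computation circuitry: the candidate entities, their scores, and the function name `select_target` are hypothetical.

```python
def select_target(candidate_scores):
    """Rank candidates by plausibility score and return (ranking, top candidate).

    candidate_scores maps a candidate entity to its plausibility score.
    """
    if not candidate_scores:
        raise ValueError("at least one link prediction is required")
    ranked = sorted(candidate_scores.items(), key=lambda item: item[1], reverse=True)
    return ranked, ranked[0][0]


# Hypothetical link predictions for one incomplete triplet.
scores = {"Seattle": 0.12, "Redmond": 0.81, "Paris": 0.07}
ranked, target = select_target(scores)
```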

In regard to claim 29, Nathani teaches: wherein the knowledge graph comprises a plurality of nodes connected by edges, wherein each of the plurality of nodes comprises an entity and each of the edges represents a relationship between two of the plurality of entities. (Nathani, p. 1, 1 Introduction "Knowledge graphs (KGs) represent knowledge bases (KBs) as a directed graph whose nodes and edges represent entities and relations between entities, respectively."; p. 2, Figure 1 "Subgraph of a knowledge graph contains actual relations between entities (solid lines) and inferred relations that are initially hidden (dashed lines)."; p. 3, 3.1 Background "A knowledge graph is denoted by G = (E,R), where E and R represent the set of entities (nodes) and relations (edges), respectively.")
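
The structure G = (E, R) described in the quoted passages can be represented as a set of (subject, relation, object) triples. The sketch below is one possible representation, assuming string identifiers; the class name `KnowledgeGraph`, its methods, and the example entities are hypothetical.

```python
class KnowledgeGraph:
    """Directed graph G = (E, R) stored as (subject, relation, object) triples."""

    def __init__(self):
        self.entities = set()    # E: the nodes
        self.relations = set()   # R: the edge labels
        self.triples = set()     # directed, labeled edges

    def add_triple(self, subject, relation, obj):
        """Add one edge and register its endpoints and label."""
        self.entities.update((subject, obj))
        self.relations.add(relation)
        self.triples.add((subject, relation, obj))

    def neighbors(self, entity):
        """Return (relation, object) pairs for edges leaving `entity`."""
        return {(r, o) for (s, r, o) in self.triples if s == entity}


# Illustrative triples only.
kg = KnowledgeGraph()
kg.add_triple("Ada", "works_at", "AcmeCorp")
kg.add_triple("AcmeCorp", "located_in", "Springfield")
```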

Claims 26-28 and 35-36 are rejected under 35 U.S.C. 103 as being unpatentable over Nathani, Peters and Lecue as applied to claims 21, 30 and 37, and in further view of Yao ("KG-BERT: BERT for Knowledge Graph Completion", Sep. 11, 2019).

In regard to claims 26 and 35, Nathani, Peters and Lecue do not teach, but Yao teaches: wherein the source entity-relation pair information further comprises a token embedding, a source embedding, and a predicate embedding. (Yao, p. 2, Knowledge Graph BERT (KG-BERT) "The architecture of the KG-BERT for modeling triples is shown in Figure 1. We name this KG-BERT version KGBERT(a). The first token of every input sequence is always a special classification token [CLS]… [a token embedding]... The head entity [a source embedding] is represented as a sentence containing tokens... the relation [a predicate embedding] is represented as a sentence containing tokens... the tail entity is represented... Different elements separated by [SEP] have different segment embeddings, the tokens in sentences of head and tail entity share the same segment embedding eA, while the tokens in relation sentence have a different segment embedding eB.")

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified Nathani, Peters and Lecue to incorporate the teachings of Yao by including KG-BERT for link prediction. Doing so would achieve the lowest mean ranks. (Yao, p. 5, Link Prediction "Table 3 shows link prediction performance of various models... We can observe that: 1) KGBERT(a) can achieve lower MR than baseline models, and it achieves the lowest mean ranks on WN18RR and FB15k237 to our knowledge")
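
The KG-BERT input layout quoted in the rejection of claims 26 and 35 can be sketched as follows. This is a simplified illustration, assuming pre-tokenized word lists rather than BERT word pieces; the function name `pack_triple` and the example tokens are hypothetical. The quoted passage does not state which segment the [CLS] and [SEP] tokens receive, so the sketch assumes [CLS] takes segment A and each [SEP] takes the segment of the sentence it closes.

```python
def pack_triple(head_tokens, relation_tokens, tail_tokens):
    """Pack (h, r, t) as [CLS] h [SEP] r [SEP] t [SEP] with segment labels.

    Head and tail tokens share segment "A" (eA); relation tokens use "B" (eB).
    Segment labels for [CLS] and [SEP] are an assumption of this sketch.
    """
    tokens = (
        ["[CLS]"] + head_tokens + ["[SEP]"]
        + relation_tokens + ["[SEP]"]
        + tail_tokens + ["[SEP]"]
    )
    segments = (
        ["A"] * (len(head_tokens) + 2)        # [CLS], head, [SEP]
        + ["B"] * (len(relation_tokens) + 1)  # relation, [SEP]
        + ["A"] * (len(tail_tokens) + 1)      # tail, [SEP]
    )
    return tokens, segments


# Illustrative tokens only.
tokens, segments = pack_triple(["steve", "jobs"], ["founded"], ["apple", "inc"])
```

In KG-BERT the final hidden state at the [CLS] position of such a sequence serves as the aggregate representation used to score the triple.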

In regard to claim 27, Nathani, Peters and Lecue do not teach, but Yao teaches: wherein the token embedding is a classification token. (Yao, p. 2, Knowledge Graph BERT (KG-BERT) "The architecture of the KG-BERT for modeling triples is shown in Figure 1. We name this KG-BERT version KGBERT(a). The first token of every input sequence is always a special classification token [CLS]… [a classification token]")

The rationale for combining the teachings of Nathani, Peters, Lecue and Yao is the same as set forth in the rejection of claim 26.

In regard to claims 28 and 36, Nathani, Peters and Lecue do not teach, but Yao teaches: wherein providing link predictions further comprises: (Yao, p. 5, Link Prediction "The link (entity) prediction task predicts the head entity h given (?, r, t) or predicts the tail entity t given (h, r, ?) where ? means the missing element.")
providing a token for each link prediction, wherein the token comprises an aggregation of the source embedding and the predicate embedding; and (Yao, p. 2, Knowledge Graph BERT (KG-BERT) "To model the plausibility of a triple, we packed the sentences of (h, r, t) as a single sequence. A sequence means the input token sequence to BERT, which may be two entity name/description sentences or three sentences of (h, r, t) [the source embedding (h) and the predicate embedding (r)] packed together... The final hidden state C corresponding to [CLS] [token] is used as the aggregate sequence representation [an aggregation of the source embedding and the predicate embedding] for computing triple scores.")
determining a plausibility score for the link prediction based on the token. (Yao, p. 5, Link Prediction "Each correct test triple (h, r, t) is corrupted by replacing either its head or tail entity with every entity e ∈ E, then these candidates are ranked in descending order of their plausibility score…"; p. 2, Knowledge Graph BERT (KG-BERT) "The first token of every input sequence is always a special classification token [CLS]… [the token]")

The rationale for combining the teachings of Nathani, Peters, Lecue and Yao is the same as set forth in the rejection of claim 26.
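
The evaluation procedure quoted from Yao (corrupt the test triple, rank the candidates in descending order of plausibility score) and the two metrics cited in this action, mean rank (MR) and mean reciprocal rank (MRR), can be sketched as below. The sketch assumes tail corruption only, no filtering of other known-true triples, and a lookup table standing in for a learned scoring model; all names and scores are hypothetical.

```python
def rank_of_correct(score_fn, head, relation, tail, entities):
    """Rank the correct tail among all candidate tails (1 is best)."""
    candidates = [(head, relation, entity) for entity in entities]
    ranked = sorted(candidates, key=score_fn, reverse=True)
    return ranked.index((head, relation, tail)) + 1


def mean_rank(ranks):
    """Average rank of the correct entity; lower is better."""
    return sum(ranks) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank; higher is better."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Toy plausibility scores standing in for a trained model.
entities = ["a", "b", "c", "d"]
toy_scores = {
    ("h1", "r", "a"): 0.2, ("h1", "r", "b"): 0.7,
    ("h1", "r", "c"): 0.9, ("h1", "r", "d"): 0.1,
    ("h2", "r", "a"): 0.8, ("h2", "r", "b"): 0.3,
    ("h2", "r", "c"): 0.2, ("h2", "r", "d"): 0.1,
}
score_fn = toy_scores.get

test_triples = [("h1", "r", "b"), ("h2", "r", "a")]
ranks = [rank_of_correct(score_fn, h, r, t, entities) for h, r, t in test_triples]
mr = mean_rank(ranks)
mrr = mean_reciprocal_rank(ranks)
```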

Response to Arguments
Applicant's amendments with respect to 112(b) rejections have been fully considered and are sufficient to overcome the rejections. The 112(b) rejections have been withdrawn.

Applicant's arguments with respect to the rejection of the claims under 35 U.S.C. 103 have been fully considered but they are moot:
Applicant argues: (p. 8) Nathani's stacked GAT layers are successive layers within a single encoder that iteratively update entity embeddings, not a distinct "first transformer" whose output is provided to a separate downstream transformer. Moreover, Nathani explicitly describes that link prediction is performed by a ConvKB decoder, a convolution-based scoring model using convolutional filters over concatenated embeddings…
Examiner answers: the arguments do not apply to the references (Peters) being used in the current rejection.

Applicant's arguments with respect to the rejection of the claims under 35 U.S.C. 101 have been fully considered but they are not persuasive:
Applicant argues: (p. 10 top) amended claim 21 recites specific machine-learning operations performed by a "first transformer" and a "second transformer different from the first transformer" to process "source entity-relation pair information" and "neighborhood entity-relation pair information," "captur[e] ... interaction information," and generate "link predictions" used to complete an incomplete triplet in a knowledge graph. These transformer-based computations over knowledge-graph neighborhood inputs are not the type of steps that can practically be performed in the human mind, and therefore the claims do not fall within the "mental processes" grouping.
Examiner answers: In step 2A, prong One, a human with the aid of pen and paper can evaluate interaction information, determine link predictions, select one of the link predictions and add the node to the graph. If a claim recites a limitation that can practically be performed in the human mind, with or without the use of a physical aid such as pen and paper, the limitation falls within the mental processes grouping, and the claim recites an abstract idea. See MPEP 2106.04(a)(2)(III)(B).

Applicant argues: (p. 10 bottom) Step 2A, Prong Two: Even if somehow the claims are deemed to recite a judicial exception, the alleged exception is integrated into a practical application. Even assuming, arguendo, that the claims are deemed to recite a judicial exception, the claims clearly integrate such alleged exception into a practical application. Specifically, the claims are directed to a technical solution for improving knowledge-graph completion (link prediction) by leveraging graph neighborhood context in a hierarchical transformer architecture that generates improved entity embeddings and link-prediction performance.
Examiner answers: In step 2A, prong Two, the additional element of leveraging a hierarchical transformer architecture is well known and is therefore identified as insignificant extra-solution activity. See MPEP 2106.05(g), consideration (1): whether the extra-solution limitation is well known.

Applicant argues: (p. 11 top) Step 2B: Even if somehow the claims are determined to be directed to an abstract idea, the claims as a whole amount to significantly more than the abstract idea itself… the claimed solution represents a technical improvement in knowledge-graph completion, namely, improving link prediction for completing incomplete triplets by using a hierarchical two-transformer architecture.
Examiner answers: In step 2B, the additional element of leveraging a hierarchical transformer architecture is well known, and the examiner has provided a Berkheimer analysis citing publications (Devlin, Horsuwan, Dong) that demonstrate the well-understood, routine, conventional nature of the additional element(s). Therefore, the hierarchical transformer architecture is not considered to reflect an improvement to the technology.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
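
The date arithmetic in the preceding paragraph can be sketched as follows, assuming the April 22, 2026 mailing date shown in the prosecution timeline. This is a rough illustration only: it ignores the advisory-action exception described above, extension fees, and any adjustment for weekends or federal holidays, and the helper `add_months` is hypothetical.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the same day-of-month `months` later, clamped to month end."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


# Assumed mailing date of the final action (from the timeline entry).
mailing_date = date(2026, 4, 22)

first_reply_cutoff = add_months(mailing_date, 2)    # TWO MONTHS window
shortened_period_end = add_months(mailing_date, 3)  # THREE MONTHS shortened period
statutory_maximum = add_months(mailing_date, 6)     # SIX MONTHS outer limit
```

Under these assumptions the three dates fall on June 22, July 22 and October 22, 2026, respectively.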

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SU-TING CHUANG whose telephone number is (408)918-7519. The examiner can normally be reached Monday - Thursday 8-5 PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Usmaan Saeed, can be reached at (571) 272-4046. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/S.C./ Examiner, Art Unit 2146
/USMAAN SAEED/Supervisory Patent Examiner, Art Unit 2146
Prosecution Timeline

May 04, 2023: Application Filed
Jan 14, 2026: Non-Final Rejection mailed (§101, §103)
Feb 24, 2026: Interview Requested
Mar 03, 2026: Examiner Interview Summary
Mar 03, 2026: Applicant Interview (Telephonic)
Mar 05, 2026: Response Filed
Apr 22, 2026: Final Rejection mailed (§101, §103)
May 20, 2026: Interview Requested
Precedent Cases

Applications granted by this same examiner with similar technology

17/726,040 (Patent 12626164): SYSTEM AND METHOD FOR REDUCTION OF DATA TRANSMISSION BY DATA RECONSTRUCTION; 4y 0m to grant; granted May 12, 2026
17/828,778 (Patent 12626106): MACHINE LEARNING MODELS FOR BEHAVIOR UNDERSTANDING; 3y 11m to grant; granted May 12, 2026
17/871,819 (Patent 12626140): SYSTEMS AND METHODS FOR ONLINE TIME SERIES FORCASTING; 3y 9m to grant; granted May 12, 2026
16/655,202 (Patent 12619890): LEARNING PATTERN DICTIONARY FROM NOISY NUMERICAL DATA IN DISTRIBUTED NETWORKS; 6y 6m to grant; granted May 05, 2026
17/131,035 (Patent 12619882): DECISION TREE OF MODELS: USING DECISION TREE MODEL, AND REPLACING THE LEAVES OF THE TREE WITH OTHER MACHINE LEARNING MODELS; 5y 4m to grant; granted May 05, 2026
Based on the 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 50%
With Interview (+37.2%): 87%
Median Time to Grant: 4y 6m (~1y 5m remaining)
PTA Risk: Moderate
Based on 104 resolved cases by this examiner. Grant probability derived from career allowance rate.