Prosecution Insights
Last updated: April 19, 2026
Application No. 18/154,637

KNOWLEDGE-GRAPH EXTRAPOLATING METHOD AND SYSTEM BASED ON MULTI-LAYER PERCEPTION

Non-Final OA: §101, §103, §112
Filed: Jan 13, 2023
Examiner: LEWIS, MATTHEW LEE
Art Unit: 2144
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Huazhong University Of Science And Technology
OA Round: 1 (Non-Final)
Grant Probability: 0% (At Risk)
Projected OA Rounds: 1-2
Projected Time to Grant: 3y 3m
Grant Probability with Interview: 0%

Examiner Intelligence

Career Allow Rate: 0% (0 granted / 3 resolved; -55.0% vs TC avg). Grants only 0% of cases.
Interview Lift: +0.0% (minimal lift; based on resolved cases with interview).
Avg Prosecution (typical timeline): 3y 3m.
Career History: 33 total applications across all art units; 30 currently pending.

Statute-Specific Performance

§101: 33.9% (-6.1% vs TC avg)
§103: 35.9% (-4.1% vs TC avg)
§102: 20.8% (-19.2% vs TC avg)
§112: 9.4% (-30.6% vs TC avg)

Tech Center averages are estimates. Based on career data from 3 resolved cases.

Office Action

§101 §103 §112
Detailed Action

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 3 & 13 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

The term “…inside to outside…” in claims 3 & 13 is a relative term which renders the claims indefinite. The term “…inside to outside…” is not defined by the claims, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. The claims recite scalars represented “from inside to outside”, without clarifying what they are inside and outside of. The scalars are then described to be in descending order, followed by being normalized. This has a variety of possible interpretations but for the sake of compact prosecution, this will be interpreted herein to mean that the scalars are all different sizes, in order to be able to be ordered in descending order.

Claim Rejections - 35 USC § 101

35 U.S.C.
101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea (mental process) without significantly more.

Regarding claim 1, in Step 1 of the 101-analysis set forth in MPEP 2106, the claim recites “A knowledge-graph extrapolating method”. A method is within one of the four statutory categories of invention.

In Step 2a Prong 1 of the 101-analysis set forth in MPEP 2106, the examiner has determined that the following limitations recite a process that, under the broadest reasonable interpretation, covers a mental process but for recitation of generic computer components:

“A knowledge-graph extrapolating method based on multi-layer perception, the method comprising… to learn embedding representations of entities, relations, and timestamps, and capturing dynamic evolution of a fact” (A person can mentally evaluate entities, relations, and timestamps and make a judgement to learn embedding representations of them and capture dynamic evolution of a fact (MPEP 2106).)

“assigning a matching historical relevance degree to the entity sets at each of the multiple layers” (A person can mentally evaluate the multiple layers of entity sets and make a judgement to mentally “assign” values that represent how historically relevant they are (MPEP 2106).)

“classifying prediction tasks into different classes of reasoning scenes” (A person can mentally evaluate prediction tasks and make a judgement to classify them into different “reasoning scenes” (MPEP 2106).)
“using a multi-class task solving method to acquire predicted probability distributions of target entities, and taking the entity having a highest level of probability as a prediction answer, so as to accomplish extrapolation of a temporal knowledge graph, wherein the prediction tasks are classified into different classes of reasoning scenes according to whether it contains any entity or any relation that has never appeared historically” (In view of the specification at [0085-0099], the use of the multi-task solving method to acquire probability distributions extrapolated from the graph, and taking the highest probability, is a positive recitation of a mathematical process, which recites an abstract idea (MPEP 2106).)

If claim limitations, under their broadest reasonable interpretation, cover performance of the limitations as a mental process but for the recitation of generic computer components, then they fall within the mental process grouping of abstract ideas. Accordingly, the claim “recites” an abstract idea.

In Step 2a Prong 2 of the 101-analysis set forth in MPEP 2106, the examiner has determined that the following additional elements do not integrate this judicial exception into a practical application:

“...using relational graph convolutional network encoders…” (Mere instructions to apply the judicial exception (MPEP 2106.05(f)).)

“designing emerging task processing units to construct multiple layers of entity sets” (Mere instructions to apply the judicial exception (MPEP 2106.05(f)).)

“connecting each of the classes of the reasoning scenes to the processing unit for the corresponding layer, so as to accomplish partition of the multiple layers of the historically relevant entity sets” (Mere instructions to apply the judicial exception (MPEP 2106.05(f)).)
Since the claim as a whole, looking at the additional elements individually and in combination, does not contain any other additional elements that are indicative of integration into a practical application, the claim is “directed” to an abstract idea.

In Step 2b of the 101-analysis set forth in the 2019 PEG, the examiner has determined that the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above, additional elements v, vi, & vii recite mere instructions to apply the judicial exception, which is not indicative of significantly more. Considering the additional elements individually and in combination, and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. Therefore, the claim is not patent eligible.

Regarding claim 2, it is dependent upon claim 1, and thereby incorporates the limitations of, and corresponding analysis applied to claim 1. Further, claim 2 recites the following additional mental processes:

“wherein the emerging task processing units search in the current fact for entities related to the prediction tasks and group the searched entities as entity sets of first, second, and third layers… wherein at the first layer are the entity sets directly connected to relation predicates of the prediction tasks, at the second layer are the entity sets that can be reached by subject entities of the prediction tasks in one hop or two hops, at the third layer are the entity sets that can be reached through remaining paths in the current fact in multiple hops” (A person can mentally evaluate a “current fact” to search for entities related to a prediction task and then make a judgement to mentally group those entities into sets based on how closely related they are (MPEP 2106).)
“by comparing all the entity sets of a data set, identify and group the entities that have never appeared into entity sets of a fourth layer… and at the fourth layer are the entity sets that have never appeared historically” (A person can mentally evaluate a “current fact” to search for entities related to a prediction task and make a judgement to identify entities which they’ve never encountered, and group those separately from other entities (MPEP 2106).)

Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible.

Regarding claim 3, it is dependent upon claim 2, and thereby incorporates the limitations of, and corresponding analysis applied to claim 2. Further, claim 3 recites the following additional abstract idea:

“wherein the historical relevance degrees assigned to the entity sets at the four layers are represented by scalars from inside to outside as α, β, γ and δ, wherein α > β > γ > δ, and α + β + γ + δ = 1” (This appears to be directed to a mathematical process, which is an abstract idea (MPEP 2106).)

Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible.

Regarding claim 4, it is dependent upon claim 3, and thereby incorporates the limitations of, and corresponding analysis applied to claim 3.
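As the §101 analysis notes, the claim 3 limitation is a normalization constraint: four per-layer relevance scalars in descending order that sum to 1. A minimal sketch of one way to produce such scalars; the softmax-style weighting and the function name are illustrative assumptions, not taken from the claim or the cited art:

```python
import numpy as np

def layer_relevance(raw_scores):
    """Normalize raw per-layer scores into positive scalars that sum to 1."""
    w = np.exp(np.asarray(raw_scores, dtype=float))  # softmax-style positive weights
    return w / w.sum()

# Four layers scored in strictly descending order (alpha for the innermost layer)
alpha, beta, gamma, delta = layer_relevance([3.0, 2.0, 1.0, 0.0])
assert alpha > beta > gamma > delta
assert abs(alpha + beta + gamma + delta - 1.0) < 1e-9
```

Any strictly decreasing positive weighting followed by division by the sum satisfies both recited conditions at once.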
Further, claim 4 recites the following additional abstract idea:

“wherein the prediction tasks are at least classified into at least the four classes of reasoning scenes: Class 1, having neither never-appeared entities nor never-appeared relations, Class 2, having only never-appeared entities, Class 3, having only never-appeared relations, and Class 4, having both never-appeared entities and never-appeared relations” (A person can mentally evaluate the prediction tasks and make a judgement to classify them based on their composition of never-appeared entities and never-appeared relations (MPEP 2106).)

Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible.

Regarding claim 5, it is dependent upon claim 4, and thereby incorporates the limitations of, and corresponding analysis applied to claim 4. Further, claim 5 recites the following additional abstract idea:

“processing the different prediction tasks at the different number of layers for the multi-layer path extrapolation” (In view of the specification at [0085-0099], the use of the multi-task solving method to acquire probability distributions extrapolated from the graph, and taking the highest probability, is a positive recitation of a mathematical process, which recites an abstract idea (MPEP 2106).)

Further, claim 5 recites “connecting each of the classes of the reasoning scenes to the processing unit for the corresponding layer, wherein the number of layers for the multi-layer path extrapolation corresponds to the entity sets at the multiple layers” (In Step 2A, Prong 2, this recites mere instructions to apply the judicial exception (MPEP 2106.05(f)). In Step 2B, mere instructions to apply the judicial exception is not indicative of significantly more.)
Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible.

Regarding claim 6, it is dependent upon claim 5, and thereby incorporates the limitations of, and corresponding analysis applied to claim 5. Further, claim 6 recites the following additional abstract ideas:

“mapping each of the entities, relations, and timestamps data of the data set into a low-dimension dense vector space” (Mapping data values into a “dense vector space” is a mathematical process rooted in linear algebra and vector space theory, as explained at https://milvus.io/docs/dense-vector.md. A mathematical process recites an abstract idea (MPEP 2106).)

“initializing all pre-determined parameters through Xavier initialization” (Xavier initialization is a mathematical process, as explained at https://365datascience.com/tutorials/machine-learning-tutorials/what-is-xavier-initialization/. A mathematical process recites an abstract idea (MPEP 2106).)

“using Cross-Entropy Loss Function to minimize global loss thereby optimizing parameter learning” (Cross-Entropy is a well-known mathematical process, as defined at https://en.wikipedia.org/wiki/Cross-entropy. A mathematical process is an abstract idea (MPEP 2106).)

Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible.

Regarding claim 7, it is dependent upon claim 6, and thereby incorporates the limitations of, and corresponding analysis applied to claim 6.
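The claim 6 operations cited in the rejection above, Xavier initialization and cross-entropy loss, are standard numerical routines. A minimal NumPy sketch of both, for reference; the function names and dimensions are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def xavier_uniform(fan_in, fan_out):
    """Xavier/Glorot uniform init: U(-a, a) with a = sqrt(6 / (fan_in + fan_out))."""
    a = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-a, a, size=(fan_in, fan_out))

def cross_entropy(probs, target):
    """Cross-entropy (negative log-likelihood) of the true class."""
    return -np.log(probs[target])

W = xavier_uniform(128, 64)
assert W.shape == (128, 64)
assert np.abs(W).max() <= np.sqrt(6.0 / (128 + 64))  # samples stay inside the bound

p = np.array([0.7, 0.2, 0.1])                     # a predicted class distribution
assert cross_entropy(p, 0) < cross_entropy(p, 2)  # confident correct class, lower loss
```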
Further, claim 7 recites the following additional abstract idea:

“wherein the ω-layer relational graph convolutional network encoder is represented as: [equation image] where [symbol image] and [symbol image] are embeddings of the entities s and o in a graph snapshot [symbol image] at the lth layer with a timestamp [symbol image], respectively, and [symbol image] and [symbol image] are weight matrixes for converging the features from the different relations and a self-loop matrix for the lth layer” (This appears to be directed to a mathematical process, which is an abstract idea (MPEP 2106).)

Further, claim 7 recites “wherein a ω-layer relational graph convolutional network encoder is used for representation learning to converge and extract features of the different relations” (In Step 2A, Prong 2, this recites mere instructions to apply the judicial exception (MPEP 2106.05(f)). In Step 2B, mere instructions to apply the judicial exception is not indicative of significantly more.)

Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible.

Regarding claim 8, it is dependent upon claim 7, and thereby incorporates the limitations of, and corresponding analysis applied to claim 7.
Further, claim 8 recites “wherein the multi-class task solving method uses a multilayer perceptron and a SoftMax logistic regression model to convert the prediction tasks into entity multi-class tasks, wherein each of the classes corresponds to the level of probability of one of the target entities, so that the entity having the highest level of probability is taken as the prediction answer” (In Step 2A, Prong 2, this recites mere instructions to apply the judicial exception (MPEP 2106.05(f)). In Step 2B, mere instructions to apply the judicial exception is not indicative of significantly more.)

Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible.

Regarding claim 9, it is dependent upon claim 8, and thereby incorporates the limitations of, and corresponding analysis applied to claim 8. Further, claim 9 recites the following additional abstract idea:

“wherein a final prediction [symbol image] is the entity having a highest level of combined probability, defined as: [equation image]” (This appears to be directed to a mathematical process, which is an abstract idea (MPEP 2106).)

Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible.

Regarding claim 10, it is dependent upon claim 9, and thereby incorporates the limitations of, and corresponding analysis applied to claim 9.
Further, claim 10 recites the following additional abstract idea:

“wherein in order to close the loop of the tested training, in the training process, the function Cross-Entropy Loss is used for updating of the tensors and the parameters, so as to appropriately adjust the prediction result” (Cross-Entropy is a well-known mathematical process, as defined at https://en.wikipedia.org/wiki/Cross-entropy. A mathematical process is an abstract idea (MPEP 2106).)

Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible.

Regarding claim 11, it recites similar limitations as claim 1, and is rejected under the same rationale, with the exception that it is directed to “A knowledge-graph extrapolating system based on multi-layer perception, the system comprising at least one processor” instead of the method of claim 1. A system of the described configuration is within one of the four statutory categories of invention. Further, under Step 2a Prong Two, this limitation recites use of a computer as a tool to perform an abstract idea, which does not provide evidence of integration into a practical application or significantly more (MPEP 2106.05(f)). Since the claim does not recite additional elements that either integrate the judicial exception into a practical application, nor provide significantly more than the judicial exception, the claim is not patent eligible.

Regarding claims 12-20, they are dependent upon claim 11 and thereby incorporate the limitations of, and corresponding analysis applied to claim 11. Further, claims 12-20 comprise similar additional limitations as claims 2-10, respectively, and are rejected under the same rationale.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C.
102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:

1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C.
102(a)(2) prior art against the later invention.

Claims 1-2 & 11-12 are rejected under 35 U.S.C. 103 as being unpatentable over Li, Z. et al., “Temporal Knowledge Graph Reasoning Based on Evolutional Representation Learning,” available 21 April 2021 (hereafter, LI), and further in view of Oh, B., “Open-world knowledge graph completion for unseen entities and relations via attentive feature aggregation,” available March 2022 (hereafter, OH).

Regarding claim 1, LI teaches “A knowledge-graph extrapolating method based on multi-layer perception, the method comprising: using relational graph convolutional network encoders to learn embedding representations of entities, relations, and timestamps, and capturing dynamic evolution of a fact”: ([Abstract] “…reasoning over Temporal KG (TKG) that predicts facts in the future is still far from resolved. The key to predict future facts is to thoroughly understand the historical facts. A TKG is actually a sequence of KGs corresponding to different timestamps, where all concurrent facts in each KG exhibit structural dependencies and temporally adjacent facts carry informative sequential patterns. To capture these properties effectively and efficiently, we propose a novel Recurrent Evolution network based on Graph Convolution Network (GCN), called RE-GCN (the use of relational graph convolutional network encoders), which learns the evolutional representations of entities and relations at each timestamp by modeling the KG sequence recurrently...”)

Further, LI teaches “designing emerging task processing units to construct multiple layers of entity sets, and assigning a matching historical relevance degree to the entity sets at each of the multiple layers”: ([4. The RE-GCN MODEL] “Based on the learned entity and relation representations, temporal reasoning at future timestamps can be made with various score functions.
Thus RE-GCN contains an evolution unit (emerging task processing unit) and multi-task score functions, as illustrated in Figure 2. The former is employed to encode the historical KG sequence and obtain the evolutional representations of entities and relations (construct multiple layers of entity sets). The latter contains score functions for corresponding tasks with the evolutional representations (i.e., embeddings) at the final timestamp as the input (scores based on the historical sequence, which equate to the claimed “historical relevance degrees”).”)

Further, LI teaches “using a multi-class task solving method to acquire predicted probability distributions of target entities, and taking the entity having a highest level of probability as a prediction answer, so as to accomplish extrapolation of a temporal knowledge graph”: ([4.2, Score Functions for Different Tasks] “…the probability vector of all entities is below: [equation image] Similarly, the probability vector of all the relations is below: [equation image]”)

Different functions being used to calculate probability for entities as well as relations signifies a “multi-class task solving method” which is used to acquire probability distributions of target entities. Further, [4.3 Parameter Learning] further elaborates the method, showing the two being used in combination by the multi-task learning framework to retrieve the probability distributions of the entities, and thus, use the highest as the answer.
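The scoring pattern described above, producing a probability vector over all candidate entities and taking the highest-scoring one as the answer, can be sketched minimally as follows; the names and dimensions are illustrative assumptions, not taken from LI:

```python
import numpy as np

def softmax(z):
    """Convert raw scores into a probability distribution."""
    z = z - z.max()              # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def predict_entity(query_embedding, entity_score_weights):
    """Score all candidate entities, softmax the scores, return the argmax."""
    probs = softmax(query_embedding @ entity_score_weights)
    return int(np.argmax(probs)), probs

rng = np.random.default_rng(1)
h = rng.normal(size=4)           # hypothetical query embedding
W = rng.normal(size=(4, 5))      # hypothetical weights for 5 candidate entities
answer, probs = predict_entity(h, W)
assert probs.shape == (5,) and abs(probs.sum() - 1.0) < 1e-9
assert probs[answer] == probs.max()
```

A second weight matrix scored the same way would give the separate relation-probability vector the citation mentions.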
LI fails to explicitly teach “classifying prediction tasks into different classes of reasoning scenes, and connecting each of the classes of the reasoning scenes to the processing unit for the corresponding layer, so as to accomplish partition of the multiple layers of the historically relevant entity sets… wherein the prediction tasks are classified into different classes of reasoning scenes according to whether it contains any entity or any relation that has never appeared historically.”

However, analogous art, OH, does teach “classifying prediction tasks into different classes of reasoning scenes… wherein the prediction tasks are classified into different classes of reasoning scenes according to whether it contains any entity or any relation that has never appeared historically”: ([6.1.1 Real-world datasets, Paragraph 2] “In a transductive setup, all entities and relations in test sets must be in-KG like the existing KG datasets. On the other hand, since IKGE supports out-of-KG entities and relations (unseen/never appeared historically), test sets for an inductive setup contain at least one out-of-KG entity or relation. To do this, we randomly selected out-of-KG relations based on uniform sampling over all in-KG relations. As a result, the test set is split into 8 fact types: O-O-O, O-O-X, O-X-O, X-O-O, O-X-X, X-X-O, X-O-X, and X-X-X, where O and X indicate in-KG and out-of-KG, respectively... (This shows prediction tasks being split into different “reasoning scenes” according to whether or not they contain any entity or any relation that hasn’t appeared historically.)”)

Further, OH teaches “connecting each of the classes of the reasoning scenes to the processing unit for the corresponding layer, so as to accomplish partition of the multiple layers of the historically relevant entity sets”: ([5. The IKGE model] “As shown in Fig.
2, in the training phase, (a) given a sample KG, (b) we first extract fact feature information for every fact from word-level side information and construct a line graph where a node and an edge are a fact and a pair of adjacent edges, respectively. (c) After applying an attention-based GCN, a fact feature extractor for fact feature information extraction, aggregator functions for attentive feature aggregation, and fully-connected (FC) layers for scoring facts, are trained via supervised learning.”)

Since the layers of the IKGE model, used above with the split prediction tasks, are “fully connected”, it can be inferred that the connections among all the classes are made so as to accomplish partition of the multiple layers.

It would be obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention, to combine the base reference of LI with the teachings of OH because both references teach methods for prediction using knowledge graphs and machine learning. One of ordinary skill in the art would be motivated to do so because, as pointed out in the conclusions of OH, “IKGE preserves global structure information, leading to more accurate KGC in open-world scenarios. Experimental results show that IKGE outperforms existing approaches in both transductive and inductive setups by successfully aggregating neighborhood features.”

Regarding claim 2, LI in view of OH teaches the limitations of claim 1.
Further, OH teaches “wherein the emerging task processing units search in the current fact for entities related to the prediction tasks and group the searched entities as entity sets of first, second, and third layers, and… wherein at the first layer are the entity sets directly connected to relation predicates of the prediction tasks, at the second layer are the entity sets that can be reached by subject entities of the prediction tasks in one hop or two hops, at the third layer are the entity sets that can be reached through remaining paths in the current fact in multiple hops”: ([5.2 Attentive feature aggregation, paragraph 3] “We set the search depth from the target fact as [expression image], where K denotes the maximum depth for aggregating features of the target fact’s k-hop neighborhoods. For each search depth k, we build an attentive feature aggregator function denoted by AGGREGATE k, which accumulates exactly 1-hop neighbors’ fact features and then passes the aggregated neighborhood features to the next aggregator function AGGREGATE k+1 at depth k + 1. Thus, as the search depth is deeper, the target fact gains more and more neighborhood features while containing graph structural information.”)

This citation explicitly shows the “search depth” which captures the relevance of historical relations between entities based on hop-distance, which is equivalent to the details as claimed.

Further, OH teaches “by comparing all the entity sets of a data set, identify and group the entities that have never appeared into entity sets of a fourth layer… and at the fourth layer are the entity sets that have never appeared historically”: ([6.1.1 Real-world datasets, Paragraph 2] “In a transductive setup, all entities and relations in test sets must be in-KG like the existing KG datasets.
On the other hand, since IKGE supports out-of-KG entities and relations (unseen/never appeared historically), test sets for an inductive setup contain at least one out-of-KG entity or relation. To do this, we randomly selected out-of-KG relations based on uniform sampling over all in-KG relations. As a result, the test set is split into 8 fact types: O-O-O, O-O-X, O-X-O, X-O-O, O-X-X, X-X-O, X-O-X, and X-X-X, where O and X indicate in-KG and out-of-KG, respectively...” (Here, we see OH teaching the idea of separating entity sets that have never appeared historically.))

Regarding claim 11, it comprises similar limitations as claim 1, and is rejected under the same rationale with the following addition: LI teaches “A knowledge-graph extrapolating system based on multi-layer perception, the system comprising at least one processor”: ([Abstract] “…To capture these properties effectively and efficiently, we propose a novel Recurrent Evolution network based on Graph Convolution Network (GCN), called RE-GCN, which learns the evolutional representations of entities and relations at each timestamp by modeling the KG sequence recurrently.” (A convolution network is a type of neural network which is well known to require a system with at least one processor in order to function.))

Regarding claim 12, LI in view of OH teaches the limitations of claim 11. Further, claim 12 comprises similar additional limitations as claim 2, and is rejected under the same rationale.

Claims 3-5 & 13-15 are rejected under 35 U.S.C. 103 as being unpatentable over LI in view of OH, as applied to the claims above, and further in view of Tsang, S., “Review — Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks,” available 4 June 2022 (hereafter, TSANG).

Regarding claim 3, LI in view of OH teaches the limitations of claim 2.
Further, OH teaches “wherein the… relevance degrees assigned to the entity sets at the four layers are represented by scalars… as α, β, γ and δ, wherein α > β > γ > δ”: ([5.2 Attentive feature aggregation, paragraph 3] The cited section explains the extraction of groups and entities using k-hop neighborhoods around the query’s target nodes, resulting in multiple layers of entity sets based on hop reachability. The first layer, k1, would include immediate neighbors, while the second layer, k2, would include those reachable in two hops, and so on. Therefore, the weight assigned to each layer is based on reachability, thus resulting in each layer naturally having different values which can be ordered in descending order, as described. Further, the scalars representing the relevance degrees being explicitly represented as “α, β, γ and δ” is merely functional language and does not restrict scope. Further, the degrees are historical relevance degrees, as taught by ZHU at [3.2 Model Components, paragraph 1] cited in claim 1 above.)

Further, LI in view of OH fails to explicitly teach “α + β + γ + δ = 1.” However, analogous art, TSANG, does teach this, as this limitation describes normalization of weights, which is shown by TSANG to be a standard feature in the art: ([1.
Weight Normalization, bullets 1-2] “In a standard artificial neural network, the computation of each neuron consists in taking a weighted sum of input features x, followed by an elementwise nonlinearity Φ… Weight normalization re-parameterizes each weight vector w in terms of a parameter vector v and a scalar parameter g and to perform stochastic gradient descent with respect to those parameters instead.”) It would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention, to combine the base reference of LI in view of OH with the teachings of TSANG, because the base reference uses neural machine-learning models with weights while TSANG explores possibilities to accelerate training of these models by manipulating the weights. One of ordinary skill in the art would be motivated to do so because, as pointed out in the second bullet point of TSANG, weight normalization “…speeds up convergence while it does not introduce any dependencies between the examples in a minibatch. This means that weight norm can also be applied successfully to recurrent models such as LSTMs and to noise-sensitive applications such as deep reinforcement learning.”

Regarding claim 4, LI in view of OH & TSANG teaches the limitations of claim 3. Further, OH teaches “wherein the prediction tasks are at least classified into at least the four classes of reasoning scenes: Class 1, having neither never-appeared entities nor never-appeared relations, Class 2, having only never-appeared entities, Class 3, having only never-appeared relations, and Class 4, having both never-appeared entities and never-appeared relations”: ([6.1.1 Real-world datasets, Paragraph 2] “In a transductive setup, all entities and relations in test sets must be in-KG like the existing KG datasets.
On the other hand, since IKGE supports out-of-KG entities and relations (unseen/never appeared historically), test sets for an inductive setup contain at least one out-of-KG entity or relation. To do this, we randomly selected out-of-KG relations based on uniform sampling over all in-KG relations. As a result, the test set is split into 8 fact types: O-O-O, O-O-X, O-X-O, X-O-O, O-X-X, X-X-O, X-O-X, and X-X-X, where O and X indicate in-KG and out-of-KG, respectively... (This shows prediction tasks being split into different “reasoning scenes” according to whether or not they contain any entity or any relation that hasn’t appeared historically, and every combination thereof, as claimed.)”)

Regarding claim 5, LI in view of OH & TSANG teaches the limitations of claim 4. Further, LI teaches “…at the different number of layers…”: ([Abstract] “…a relation-aware GCN is leveraged to capture the structural dependencies within the KG at each timestamp…”) And further: ([4.1.2 Sequential Patterns across Temporally Adjacent Facts] “Therefore, the potential sequential patterns are modeled by stacking the 𝜔-layer relation-aware GCN.”) Each evolution unit includes graph convolution layers, enabling reasoning at multiple depths. Further, LI teaches “…for the multi-layer path extrapolation…”: ([4.1 The Evolution Unit] The cited section shows that the aggregation is across hops and time, which constitutes multi-layer path extrapolation.) Further, OH teaches “processing the different prediction tasks…”: ([Introduction, paragraph 7] “…the proposed model is inductive by fundamentally learning an embedding generator function to compute the vector representations of the facts containing out-of-KG entities and relations…”) OH describes addressing facts containing unseen entities and relations and distinguishes these from conventional closed-world KGC tasks.
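The O/X fact-type split that OH describes above can be sketched as a simple classification of each test triple by whether its components are in-KG. The triple ordering (subject, relation, object) and the toy KG contents below are assumptions for illustration, not data from OH.

```python
# Sketch of OH's O/X fact-type split: each test triple (subject, relation,
# object) is labelled per position as "O" (in-KG) or "X" (out-of-KG),
# yielding the 8 types O-O-O through X-X-X quoted above.
# The example KG contents here are hypothetical.

in_kg_entities = {"Paris", "France"}
in_kg_relations = {"capital_of"}

def fact_type(subject, relation, obj):
    mark = lambda known: "O" if known else "X"
    return "-".join([
        mark(subject in in_kg_entities),
        mark(relation in in_kg_relations),
        mark(obj in in_kg_entities),
    ])

print(fact_type("Paris", "capital_of", "France"))  # O-O-O: fully in-KG
print(fact_type("Lyon", "capital_of", "France"))   # X-O-O: unseen subject
print(fact_type("Lyon", "located_in", "Rhone"))    # X-X-X: fully out-of-KG
```

Grouping these eight types by which positions are "X" yields exactly the four claimed reasoning scenes: neither unseen, unseen entities only, unseen relations only, or both.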
Further, OH teaches “connecting each of the classes of the reasoning scenes to the processing unit for the corresponding layer”: ([6.1.1. Real-world datasets] The cited section describes classification of reasoning scenarios based on the presence/absence of unseen entities, in addition to the basis of presence/absence of unseen relations, which defines different reasoning scenes. This, combined with LI’s modular, layered processing units, will achieve the limitation as claimed.) Further, OH teaches “…wherein the number of layers for the multi-layer path extrapolation corresponds to the entity sets at the multiple layers”: ([5.2 Attentive feature aggregation, paragraph 3] The cited section shows the k-layer setup, showing the cited correspondence.)

Regarding claims 13-15, LI in view of OH teaches the limitations of claim 12. Further, claims 13-15 comprise similar additional limitations as claims 3-5, respectively, and are rejected under the same rationale.

Claims 6 & 16 are rejected under 35 U.S.C. 103 as being unpatentable over LI in view of OH & TSANG, as applied to claims above, and further in view of Jones, A. “An Explanation of Xavier Initialization,” andy’s blog, available on 14 February 2015 (hereafter, JONES), and further in view of Koppert-Anisimova, I. “Cross-Entropy Loss in ML,” available on 3 January 2021 (hereafter, KOPPERT-ANISIMOVA).

Regarding claim 6, LI in view of OH & TSANG teaches the limitations of claim 5. Further, LI teaches “mapping each of the entities, relations, and timestamps data of the data set into a low-dimension dense vector space”: ([4.1 The Evolution Unit] The cited section describes the mapping of each of the entities, relations, and timestamps into a vector space, as seen in equations 5 and 6 and the correlated text explanations.)
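The mapping of entities, relations, and timestamps into a low-dimension dense vector space, as cited from LI's evolution unit, amounts to a lookup in learnable embedding tables. A minimal sketch follows; the sizes, symbol names, and table structure are illustrative assumptions, not LI's actual hyperparameters or equations.

```python
import numpy as np

# Sketch of mapping entities, relations, and timestamps into a
# low-dimension dense vector space: each symbol is assigned a row of a
# (normally learnable) embedding matrix. All sizes here are illustrative.

rng = np.random.default_rng(0)
dim = 8                                  # embedding dimension
entities   = ["e0", "e1", "e2"]
relations  = ["r0", "r1"]
timestamps = [0, 1, 2, 3]

ent_emb = rng.standard_normal((len(entities),   dim))
rel_emb = rng.standard_normal((len(relations),  dim))
ts_emb  = rng.standard_normal((len(timestamps), dim))

def embed(kind, index):
    """Look up the dense vector for one entity, relation, or timestamp."""
    table = {"entity": ent_emb, "relation": rel_emb, "timestamp": ts_emb}[kind]
    return table[index]

vec = embed("entity", 1)
print(vec.shape)   # (8,) — one dense low-dimension vector per symbol
```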
LI in view of OH & TSANG fails to explicitly teach “initializing all pre-determined parameters through Xavier initialization” & “using Cross-Entropy Loss Function to minimize global loss thereby optimizing parameter learning.” However, analogous art JONES does teach “initializing all pre-determined parameters through Xavier initialization”: ([Okay, hit me with it. What’s Xavier initialization?] The given citation explains what Xavier initialization is and how it is used to initialize weights in an ML model.) It would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention, to combine the base reference of LI in view of OH & TSANG with the teachings of JONES, because LI in view of OH & TSANG teaches a machine-learning method while JONES teaches a widely used process for initializing machine-learning parameters, Xavier initialization. One of ordinary skill in the art would be motivated to do so because, as pointed out by JONES in [Why’s Xavier initialization important?], “it helps signals reach deep into the network” and “makes sure the weights are ‘just right’, keeping the signal in a reasonable range of values through many layers.”

Further, LI in view of OH, TSANG, & JONES still fails to explicitly teach “using Cross-Entropy Loss Function to minimize global loss thereby optimizing parameter learning.” However, analogous art KOPPERT-ANISIMOVA does teach this: ([When do we use it?] “Cross-entropy loss is used when adjusting model weights during training. The aim is to minimize the loss, i.e, the smaller the loss the better the model. A perfect model has a cross-entropy loss of 0. Normally its serves for multi-class and multi-label classifications.”) Cross-entropy loss is shown here to be a common function which is “normally used” to minimize the loss of ML models/global loss.
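The two conventional techniques relied on from JONES and KOPPERT-ANISIMOVA, Xavier (Glorot) initialization and cross-entropy loss, can be sketched together. The shapes, label, and random inputs below are illustrative assumptions, not taken from any cited reference.

```python
import numpy as np

# Sketch of the two techniques cited from JONES and KOPPERT-ANISIMOVA:
# Xavier (Glorot) uniform initialization of a weight matrix, and
# cross-entropy loss over a softmax output. All shapes are illustrative.

rng = np.random.default_rng(0)
fan_in, fan_out = 8, 4

# Xavier initialization: sample uniformly within a limit scaled by
# fan-in/fan-out, so signal magnitude stays roughly constant across layers.
limit = np.sqrt(6.0 / (fan_in + fan_out))
W = rng.uniform(-limit, limit, size=(fan_in, fan_out))

def cross_entropy(logits, label):
    """Softmax followed by negative log-likelihood of the true class."""
    z = logits - logits.max()            # shift for numerical stability
    probs = np.exp(z) / np.exp(z).sum()
    return -np.log(probs[label])

x = rng.standard_normal(fan_in)
loss = cross_entropy(x @ W, label=2)
print(loss)   # non-negative; 0 only for a perfect prediction
```

Minimizing this loss over training examples is the standard way the "global loss" of such a classifier is driven down during parameter learning.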
It would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention, to combine the base reference of LI in view of OH, TSANG, & JONES with the teachings of KOPPERT-ANISIMOVA, because LI in view of OH, TSANG, & JONES teaches a machine-learning method while KOPPERT-ANISIMOVA teaches a common feature used in machine learning. One of ordinary skill in the art would be motivated to do so because, as pointed out by KOPPERT-ANISIMOVA in the citation above, “the smaller the loss the better the model.”

Regarding claim 16, LI in view of OH & TSANG teaches the limitations of claim 15. Further, claim 16 comprises similar additional limitations as claim 6, and is rejected under the same rationale.

Conclusion

Claim 7 has been searched, but has not been rejected with respect to prior art. This is because claim 7 recites an extremely specific mathematical process, as cited in the 101 rejections above, which has not been found in any prior art, including the applicant’s own prior art. Claim 17 comprises similar additional limitations as claim 7, and thus the search for claim 7 also covers claim 17. Claims 8-10 and 18-20 depend upon claims 7 and 17, and thereby inherit the same limitations of claims 7 and 17 which have not been found in the prior art.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MATTHEW LEE LEWIS, whose telephone number is (571) 272-1906. The examiner can normally be reached Monday 12:00 PM - 4:00 PM and Tuesday - Friday 12:00 PM - 9:00 PM. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tamara Kyle, can be reached at (571) 272-4241.
The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /Matthew Lee Lewis/ Examiner, Art Unit 2144 /TAMARA T KYLE/Supervisory Patent Examiner, Art Unit 2144

Prosecution Timeline

Jan 13, 2023
Application Filed
Jan 26, 2026
Non-Final Rejection — §101, §103, §112 (current)
