Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1, 3-11 and 19-26 are pending. Claims 22-26 are new. This Office Action is responsive to the amendment filed on 02/02/2026, which has been entered into the above identified application.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1, 3-11 and 19-26 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Independent Claims
Step 1 – Claim 1 is drawn to a method and claim 19 is drawn to a system. Therefore, each of these claims falls under one of the four categories of statutory subject matter (process/method, machine/product/apparatus, manufacture, or composition of matter).
Step 2A Prong 1 – Claims 1 and 19 are directed to a judicially recognized exception of an abstract idea without significantly more. Claims 1 and 19 recite:
performing inference for each content type inferring a structure of the content – This limitation is directed towards the abstract idea of a mental process, or a concept that can be performed in the human mind, including observation, evaluation, judgement or opinion (see MPEP § 2106.04(a)(2), subsection III, C). In Paragraph [0025] of the specification, it states “The systems and method of the present disclosure train a machine learning model architecture with relevant data in the training input to infer various types of graphs (e.g., molecules, process flows, maps, scene graphs, document graphs, etc.) from information in the input. Different training approaches are possible based on the type of graph to be output.” BRI in light of the specification would support that “performing inference inferring a structure of the content” would encompass a mental process with or without the assistance of pen and paper of evaluating a graph type that would be most appropriate for the information in the input.
predicting all possible combinations of nodes and edges for a predicted graph based on information inferred from the input and the structure of the content – This limitation is directed towards the abstract idea of a mental process, or a concept that can be performed in the human mind, including observation, evaluation, judgement or opinion (see MPEP § 2106.04(a)(2), subsection III, C). In Paragraph [0031] of the specification, it states “the input 10 includes a sketch of a molecule or a map of a city and the graph 16 is derived based on the sketch of the molecule or the map of the city. In this case, the machine learning model 102 is conceptually similar to an object detection model, where nodes are localized in the image like objects, and edges are additionally predicted between them.” BRI in light of the specification would support that “predicting all possible combinations of nodes and edges” would encompass a mental process with or without the assistance of pen and paper of drawing a graph based on an input that connects all nodes to each other.
selecting a subset of nodes and edges from the possible combinations of nodes and edges based on the information inferred from the input – This limitation is directed towards the abstract idea of a mental process, or a concept that can be performed in the human mind, including observation, evaluation, judgement or opinion (see MPEP § 2106.04(a)(2), subsection III, C). In Paragraph [0036] of the specification, it states “The machine learning model 102 classifies the different possible nodes 18 and/or edges 20 for the graph 16 and selects a set of nodes 24 from the possible nodes 18 and a set of edges 26 from the possible edges 20 for a predicted graph 22. The machine learning model 102 is trained to select the set of nodes 24 and/or the set of edges 26 selected for the predicted graph 22 based on the information received in the input 10.” Paragraph [0076] provides the example in which “the machine learning model 102 identifies the set of nodes 24 and the set of edges 26 based on relationships expressed in natural langue of the text or the document of the input 10. In some implementations, the machine learning model 102 identifies the set of nodes 24 based on identified entities in the image or the video of the input 10 and the set of edges 26 based on identified relationships between the identified entities.” BRI in light of the specification would support that “selecting a subset of nodes and edges” would encompass a mental process with or without the assistance of pen and paper of analyzing an input to identify relevant portions of a graph.
generate a predicted graph with predicted node labels for the subset of nodes and predicted edge labels for the subset of edges – This limitation is directed towards the abstract idea of a mental process, or a concept that can be performed in the human mind, including observation, evaluation, judgement or opinion (see MPEP § 2106.04(a)(2), subsection III, C). In Paragraph [0062] of the specification, it states “Referring now to Fig. 5, illustrated is an example of an image of a molecule graph 500 inferred by the machine learning system 100 (Fig. 1). In the molecule graph 500, the set of nodes 24 (Fig. 1) included in the molecule graph 500 are atoms and the set of edges 26 (Fig. 1) are the bonds between the atoms.” Additionally, Paragraph [0034] states “Attributes (e.g., shapes, label, color, etc.) of the possible nodes 18 may be used to represent the different types of nodes in a visual representation of the graph 16. The different types of nodes include, for example, atoms in a molecule graph, process steps in a flow or process graph, cities in a map graph, stops on a transit route map graph, entities in a scene graph, and/or document portions in a document graph.” BRI in light of the specification would support that “generating a predicted graph with predicted node labels” would encompass a mental process with or without the assistance of pen and paper of drawing a labeled graph.
performing a task using information in the predicted graph – This limitation is directed towards the abstract idea of a mental process, or a concept that can be performed in the human mind, including observation, evaluation, judgement or opinion (see MPEP § 2106.04(a)(2), subsection III, C). In Paragraph [0056] of the specification, it states “The predicted graph 22 may include knowledge and/or information in node labels, edge weights, etc., that may be used by the applications 36 to perform various tasks 38 on the information, such as, search and/or question-answering. The display 108 may present the predicted graph 22 to the user 104 and the user 104 may perform one or more tasks 38 on the information provided in the predicted graph 22. An example task 38 includes the user 104 providing a query (e.g., query 40) on the predicted graph 22 (e.g., what is the shortest path from A to B). The application 36 may use the information (e.g., node labels, edges, etc.) in the predicted graph 22 to answer the query provided by the user 104.” BRI in light of the specification would support that “performing a task using information in the predicted graph” would encompass a mental process with or without the assistance of pen and paper of interpreting a graph to answer questions.
Step 2A Prong 2 – The following additional limitations recited do not integrate the abstract idea into a practical application:
receiving an input with content – This limitation recites an insignificant extra-solution activity of mere data gathering (see MPEP § 2106.05(g)) and thus, fails to integrate the exception into a practical application.
the content comprising at least one or images, text documents, maps, audio, or speech – This limitation recites the insignificant extra-solution activity of selecting a particular data source or type of data to be manipulated (see MPEP § 2106.05(g)) and thus, fails to integrate the exception into a practical application.
by the machine learning model; using a machine learning model – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the above abstract ideas to a machine learning model and thus, fails to integrate the exception into a practical application.
providing an output of a complete representation of the predicted graph with predicted node labels for the subset of nodes indicating a type of node and the predicted edge labels for the subset of edges output in parallel – This limitation recites an insignificant extra-solution activity of mere data output (see MPEP § 2106.05(g)) and thus, fails to integrate the exception into a practical application.
wherein the machine learning model outputs different types of nodes based on information in the input – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the above abstract ideas to multiple data types and thus, fails to integrate the exception into a practical application.
Claim 19:
at least one processor; memory in electronic communication with the at least one processor; and instructions stored in the memory – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It recites a generic computer or generic computer components that merely act as a tool on which the method operates.
Step 2B – The additional elements in Step 2A Prong 2, viewed individually or holistically, do not provide an inventive concept or otherwise amount to significantly more than the abstract idea itself.
receiving an input with content – This limitation recites an insignificant extra-solution activity of mere data gathering (see MPEP § 2106.05(g)), which is well-understood, routine and conventional activity similar to cases reviewed by the courts involving receiving or transmitting data over a network (see MPEP § 2106.05(d)(II)) and thus, fails to provide significantly more to the judicial exception.
the content comprising at least one or images, text documents, maps, audio, or speech – This limitation recites the insignificant extra-solution activity of selecting a particular data source or type of data to be manipulated (see MPEP § 2106.05(g)), which is well-understood, routine and conventional activity similar to cases reviewed by the courts involving receiving or transmitting data over a network (see MPEP § 2106.05(d)(II)) and thus, fails to provide significantly more to the judicial exception.
by the machine learning model; using a machine learning model – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to a machine learning model and thus, fails to provide significantly more to the judicial exception.
providing an output of a complete representation of the predicted graph with predicted node labels for the subset of nodes indicating a type of node and the predicted edge labels for the subset of edges output in parallel – This limitation recites an insignificant extra-solution activity of mere data output (see MPEP § 2106.05(g)), which is well-understood, routine and conventional activity similar to cases reviewed by the courts involving receiving or transmitting data over a network (see MPEP § 2106.05(d)(II)) and thus, fails to provide significantly more to the judicial exception.
wherein the machine learning model outputs different types of nodes based on information in the input – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to multiple data types and thus, fails to provide significantly more to the judicial exception.
Claim 19:
at least one processor; memory in electronic communication with the at least one processor; and instructions stored in the memory – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept.
As such, Claims 1 and 19 are not patent eligible.
Dependent Claims
Claims 3-11 and 20-26 merely narrow the previously cited abstract idea limitations. For the reasons described above with respect to independent claims 1 and 19, these judicial exceptions are not meaningfully integrated into a practical application, nor do they amount to significantly more than the abstract idea itself. The claims recite limitations similar to those described for the independent claims above and do not provide anything more than mental processes that are practically capable of being performed in the human mind with the assistance of pen and paper. Therefore, claims 3-11 and 20-26 also recite abstract ideas that are not integrated into a practical application and do not amount to significantly more than the judicial exception, and are rejected under 35 U.S.C. § 101.
Step 1 – Claims 3-11 and 21-26 are drawn to a method and claim 20 is drawn to a system. Therefore, each of these claims falls under one of the four categories of statutory subject matter (process/method, machine/product/apparatus, manufacture, or composition of matter).
Step 2A Prong 1 – These claims are directed to a judicially recognized exception of an abstract idea without significantly more.
Claim 4:
wherein the machine learning model identifies the set of nodes and the set of edges based on relationships expressed in natural langue of the text or the document of the input – This limitation is directed towards the abstract idea of a mental process, or a concept that can be performed in the human mind, including observation, evaluation, judgement or opinion (see MPEP § 2106.04(a)(2), subsection III, C). In Paragraph [0071] of the specification, it states “Inferring the document structure from formats, such as, PDF may have many applications for document understanding and extraction pipelines. Entities and their relationships are often expressed in natural language, for example, within a sentence or paragraph of text. The machine learning system 100 may identify the entities and their relationships and extract a predicted document graph 900 based on the identified entities and relationships. Examples of predicted document graphs 900 extracted from text include parse trees and knowledge graphs.” BRI in light of the specification would support that “identif[ying] the set of nodes and edges based on relationships expressed in natural langue of the text” would encompass a mental process with or without the assistance of pen and paper of analyzing a text file.
Claim 5:
wherein the machine learning model identifies the set of nodes based on identified entities in the image or the video of the input and the set of edges based on identified relationships between the identified entities – This limitation is directed towards the abstract idea of a mental process, or a concept that can be performed in the human mind, including observation, evaluation, judgement or opinion (see MPEP § 2106.04(a)(2), subsection III, C). BRI would support that “identif[ying] the set of nodes based on identified entities in the image or the video and the set of edges based on identified relationships between the identified entities” would encompass a mental process with or without the assistance of pen and paper of analyzing an image/video.
Claim 6:
wherein the machine learning model identifies the set of nodes based on identified entities in the audio or the speech of the input and the set of edges based on actions performed between identified entities in the audio or the speech – This limitation is directed towards the abstract idea of a mental process, or a concept that can be performed in the human mind, including observation, evaluation, judgement or opinion (see MPEP § 2106.04(a)(2), subsection III, C). BRI would support that “identif[ying] the set of nodes based on identified entities in the audio or speech and the set of edges based on action performed between the identified entities” would encompass a mental process with or without the assistance of pen and paper of analyzing an audio file.
Claim 7:
wherein the machine learning model identifies the set of nodes based on identified stops in the map of the input and the set of edges based on a travel mode between the identified stops – This limitation is directed towards the abstract idea of a mental process, or a concept that can be performed in the human mind, including observation, evaluation, judgement or opinion (see MPEP § 2106.04(a)(2), subsection III, C). BRI would support that “identif[ying] the set of nodes based on identified stops in the map and the set of edges based on travel mode between the identified stops” would encompass a mental process with or without the assistance of pen and paper of reading a map.
Claim 10:
using the information in the set of nodes and the set of edges of the predicted graph to provide an answer to the query – This limitation is directed towards the abstract idea of a mental process, or a concept that can be performed in the human mind, including observation, evaluation, judgement or opinion (see MPEP § 2106.04(a)(2), subsection III, C). In Paragraph [0056] of the specification, it states “An example task 38 includes the user 104 providing a query (e.g., query 40) on the predicted graph 22 (e.g., what is the shortest path from A to B). The application 36 may use the information (e.g., node labels, edges, etc.) in the predicted graph 22 to answer the query provided by the user 104.” BRI in light of the specification would support that “provid[ing] an answer to the query” would encompass a mental process with or without the assistance of pen and paper of interpreting a graph.
Claim 11:
updating the representation of the predicted graph based on the modifications of the set of nodes or the set of edges – This limitation is directed towards the abstract idea of a mental process, or a concept that can be performed in the human mind, including observation, evaluation, judgement or opinion (see MPEP § 2106.04(a)(2), subsection III, C). In Paragraph [0057] of the specification, it states “Another example task 38 includes the user 104 modifying the predicted graph 22 (e.g., changing nodes, removing nodes, adding nodes, changing a location of the nodes, changing labels of the graph, changing edges, removing edges, adding edges, changing a location of the edges, etc.). The representation of the predicted graph 22 is updated on the display 108 based on the modifications made to the predicted graph 22.” BRI in light of the specification would support that “updating the representation of the predicted graph” would encompass a mental process with or without the assistance of pen and paper of modifying a graph based on feedback.
Claim 24:
wherein the machine learning model infers relationships between the entities that are both spatial and conceptual in predicting the possible combination of the nodes and the edges for the predicted graph – This limitation is directed towards the abstract idea of a mental process, or a concept that can be performed in the human mind, including observation, evaluation, judgement or opinion (see MPEP § 2106.04(a)(2), subsection III, C). In Paragraph [0068] of the specification, it states “The predicted scene graph 800 may capture relationships between entities that are both spatial and conceptual. In a visual scene, entities or objects may be represented in the predicted scene graph 800 by nodes and the relationships between the entities or objects, including physical relationships in space and actions occurring between the entities or objects, may be represented as directed edges. For example, different color of nodes represent different types of nodes (e.g., green represents verbs and blue represents objects or individuals).” BRI in light of the specification would support that “inferring relationships between the entities that are both spatial and conceptual” would encompass a mental process with or without the assistance of pen and paper of evaluating how entities are related.
Claim 25:
wherein the machine learning infers relationships that are physical relationships in space between the entities in predicting the possible combination of the nodes and the edges for the predicted graph – This limitation is directed towards the abstract idea of a mental process, or a concept that can be performed in the human mind, including observation, evaluation, judgement or opinion (see MPEP § 2106.04(a)(2), subsection III, C). In Paragraph [0068] of the specification, it states “The predicted scene graph 800 may capture relationships between entities that are both spatial and conceptual. In a visual scene, entities or objects may be represented in the predicted scene graph 800 by nodes and the relationships between the entities or objects, including physical relationships in space and actions occurring between the entities or objects, may be represented as directed edges. For example, different color of nodes represent different types of nodes (e.g., green represents verbs and blue represents objects or individuals).” BRI in light of the specification would support that “inferring relationships that are physical relationships in space between the entities” would encompass a mental process with or without the assistance of pen and paper of representing relationships between entities in a graph based on their location (e.g. organizing a graph like a subway system map).
Claim 26:
wherein the machine learning models infers relationships between the entities based on actions occurring between the entities in predicting the possible combination of the nodes and the edges for the predicted graph – This limitation is directed towards the abstract idea of a mental process, or a concept that can be performed in the human mind, including observation, evaluation, judgement or opinion (see MPEP § 2106.04(a)(2), subsection III, C). In Paragraph [0068] of the specification, it states “The predicted scene graph 800 may capture relationships between entities that are both spatial and conceptual. In a visual scene, entities or objects may be represented in the predicted scene graph 800 by nodes and the relationships between the entities or objects, including physical relationships in space and actions occurring between the entities or objects, may be represented as directed edges. For example, different color of nodes represent different types of nodes (e.g., green represents verbs and blue represents objects or individuals).” BRI in light of the specification would support that “inferring relationships between entities based on actions occurring between the entities” would encompass a mental process with or without the assistance of pen and paper of representing relationships between entities in a graph based on their interactions with each other (e.g. organizing a graph with edges describing what an entity does to another).
Step 2A Prong 2 – These limitations do not recite any additional elements which integrate the abstract idea into a practical application.
Claims 3 and 20:
wherein the input includes one or more of an image, a document, a video, a scene, a map, audio, speech, or text – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to a specific type of input and thus, fails to integrate the exception into a practical application.
Claim 8:
wherein the input includes any combination of documents, video, audio, speech, or text – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to a specific type of input and thus, fails to integrate the exception into a practical application.
Claim 9:
wherein the representation of the predicted graph includes different shapes and labels for the set of nodes and one or more of undirected edges, directed edges, or weighted edges in the set of edges – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to specific graph formatting and thus, fails to integrate the exception into a practical application.
Claim 10:
receiving a query – This limitation recites an insignificant extra-solution activity of mere data gathering (see MPEP § 2106.05(g)) and thus, fails to integrate the exception into a practical application.
Claim 11:
presenting the representation of the predicted graph – This limitation recites an insignificant extra-solution activity of mere data output (see MPEP § 2106.05(g)) and thus, fails to integrate the exception into a practical application.
receiving a modification of the set of nodes or the set of edges in the predicted graph – This limitation recites an insignificant extra-solution activity of mere data gathering (see MPEP § 2106.05(g)) and thus, fails to integrate the exception into a practical application.
Claim 21:
wherein the input is a PDF image of a table and the machine learning extracts information from the table and provides the representation of the predicted graph of the table in the PDF image – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to a specific type of input and thus, fails to integrate the exception into a practical application.
Claim 22:
wherein the type of node is an atom, and the predicted graph is a molecule graph – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to a specific type of graph and thus, fails to integrate the exception into a practical application.
Claim 23:
wherein the type of node is a city, and the predicted graph is a transportation graph – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to a specific type of graph and thus, fails to integrate the exception into a practical application.
Step 2B – These limitations, as a whole, do not amount to significantly more than the judicial exception.
Claims 3 and 20:
wherein the input includes one or more of an image, a document, a video, a scene, a map, audio, speech, or text – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to a specific type of input and thus, fails to provide significantly more to the judicial exception.
Claim 8:
wherein the input includes any combination of documents, video, audio, speech, or text – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to a specific type of input and thus, fails to provide significantly more to the judicial exception.
Claim 9:
wherein the representation of the predicted graph includes different shapes and labels for the set of nodes and one or more of undirected edges, directed edges, or weighted edges in the set of edges – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to specific graph formatting and thus, fails to provide significantly more to the judicial exception.
Claim 10:
receiving a query – This limitation recites the well-understood, routine, conventional activity of receiving or transmitting data over a network (see MPEP § 2106.05(d)) and thus, fails to provide significantly more to the judicial exception.
Claim 11:
presenting the representation of the predicted graph – This limitation recites an insignificant extra-solution activity of mere data output (see MPEP § 2106.05(g)), which is well-understood, routine and conventional activity similar to cases reviewed by the courts involving receiving or transmitting data over a network (see MPEP § 2106.05(d)(II)) and thus, fails to provide significantly more to the judicial exception.
receiving a modification of the set of nodes or the set of edges in the predicted graph – This limitation recites the well-understood, routine, conventional activity of receiving or transmitting data over a network (see MPEP § 2106.05(d)) and thus, fails to provide significantly more to the judicial exception.
Claim 21:
wherein the input is a PDF image of a table and the machine learning extracts information from the table and provides the representation of the predicted graph of the table in the PDF image – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to a specific type of input and thus, fails to provide significantly more to the judicial exception.
Claim 22:
wherein the type of node is an atom, and the predicted graph is a molecule graph – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to a specific type of graph and thus, fails to provide significantly more to the judicial exception.
Claim 23:
wherein the type of node is a city, and the predicted graph is a transportation graph – This limitation amounts to no more than generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)). It merely limits the use of the abstract idea to a specific type of graph and thus, fails to provide significantly more to the judicial exception.
As such, Claims 3-11 and 20-26 are not patent eligible.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 3-5, 7-11, 19-20, and 24-26 are rejected under 35 U.S.C. 103 as being unpatentable over Yang et al. (“Graph R-CNN for Scene Graph Generation”, published 9/10/2018), hereinafter Yang, in view of Hadar et al. (US 20220051111 A1, filed 08/17/2020), hereinafter Hadar. Yang and Hadar were cited in previous Office Actions.
Regarding Claim 1, Yang teaches a method, comprising:
receiving an input with content, the content comprising at least one of images, text documents, maps, audio, or speech (Yang: “we use the Faster R-CNN [32] framework to extract a set of n object proposals from an input image.” [Section 3.1 Object Proposals]);
predicting, by the machine learning model, all possible combinations of nodes and edges for a predicted graph based on information inferred from the input and the structure of the content by the machine learning model (Yang: “Given an image (a), our proposed approach first extracts a set of objects visible in the scene and considers possible relationships between all nodes (b)” [Fig. 1]);
selecting, by the machine learning model, a subset of nodes and edges from the possible combinations of nodes and edges based on the information inferred from the input (Yang: “To model these regularities, we introduce a relation proposal network (RePN) which learns to efficiently estimate the relatedness of an object pair. By pruning edges corresponding to unlikely relations, the RePN can efficiently sparsify the candidate scene graph – retaining likely edges and suppressing noise introduced from unlikely ones.” [Section 3.2 Relation Proposal Network]; “The remaining m object pairs are considered as candidates having meaningful relationships E. With E, we obtain a graph G = (V, E), which is much sparser than the original fully connected graph.” [Section 3.2 Relation Proposal Network]);
using a machine learning model to generate a predicted graph with predicted node labels for the subset of nodes and predicted edge labels for the subset of edges (Yang: “Then it prunes unlikely relationships using a learned measure of ‘relatedness’, producing a sparser candidate graph structure (c). Finally, an attentional graph convolution network is applied to integrate global context and update object node and relationship edge labels.” [Fig. 1]; “given the resulting sparsely connected scene graph candidate, we apply an attentional graph convolution network (aGCN) to propagate higher-order context throughout the graph – updating each object and relationship representation based on its neighbors.” [Section I. Introduction]);
providing an output of a complete representation of the predicted graph with predicted node labels for the subset of nodes indicating a type of node and the predicted edge labels for the subset of edges, wherein the machine learning model outputs different types of nodes based on information in the input (Yang: “we model scene graphs as graphs consisting of image regions, relationships, and their labellings.” [Section 3. Approach]; See “Scene Graph” in [Fig. 2]; “Recall that from the previous sections we have a set of N object regions and m relationships. From these, we construct a graph G with nodes corresponding to object and relationship proposals. We insert edges between relation nodes and their associated objects. We also add skip-connect edges directly between all object nodes. These connections allow information to flow directly between object nodes. Recent work has shown that reasoning about object correlation can improve detection performance [10]. We apply aGCN to this graph to update object and relationship representations based on global context. Note that our graph captures a number of different types of connections (i.e.object ↔ relationship, relationship ↔ subject and object ↔ object). In addition, the information flow across each connection may be asymmetric (the informativeness of subject on relationship might be quite different from relationship to subject). We learn different transformations for each type and ordering – denoting the linear transform from node type a to node type b as Wab with s=subjects, o=objects, and r=relationships.” [Section 3.3 Attentional GCN]); and
performing a task using information in the predicted graph (Yang: “These scene graphs form an interpretable structured representation of the image that can support higher-level visual intelligence tasks such as captioning [24, 39], visual question answering [1, 11, 35, 37–39], and image-grounded dialog [3].” [Section 1. Introduction]).
However, Yang fails to expressly disclose performing inference, by a machine learning model, for each content type inferring a structure of the content; and providing a predicted graph output in parallel.
In the same field of endeavor, Hadar teaches performing inference, by a machine learning model, for each content type inferring a structure of the content (Hadar: “Another example technique is a machine learning-based technique. In this approach, the knowledge graph generation engine 136 can use one or more machine learning models to generate the knowledge graph based on the data in the discovery database 134. This approach can be especially advantageous when the data includes unstructured or semi-structured text, images, and/or videos.” [0024]); and
providing a predicted graph output in parallel (Hadar: “The ontology can be metadata of the knowledge graph and can define the kind of entities and relationships may exist in the knowledge graph. The ontology can also define the kind of relationships that are valid between every pair of entities, which the knowledge graph generation engine 136 can use to generate the edges between the nodes in the knowledge graph.” [0023]; “while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous.” [0079]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated performing inference, by a machine learning model, for each content type inferring a structure of the content; and providing a predicted graph output in parallel, as taught by Hadar, to the method of Yang because both of these methods are directed towards utilizing machine learning to generate a predicted relationship graph to represent information in an input. In making this combination and determining the structure of the content before outputting a predicted graph in parallel, it would allow the method of Yang to interpret multiple types of inputs, allowing for wide applicability (Hadar: [0024]), as well as multitask, thereby speeding up computation while still achieving desirable results (Hadar: [0079]).
Regarding Claim 3, Yang and Hadar teach the method of Claim 1, wherein the input includes one or more of an image, a document, a video, a scene, a map, audio, speech, or text (Yang: “Given an image, our model first uses RPN to propose object regions, and then prunes the connections between object regions through our relation proposal network (RePN).” [Fig. 2]; Hadar: “the data includes unstructured or semi-structured text, images, and/or videos.” [0024]).
Regarding Claims 19 and 20, they are system claims that correspond to Claims 1 and 3. Therefore, they are rejected for the same reasons as Claims 1 and 3 above.
Regarding Claim 4, Yang and Hadar teach the method of Claim 3, wherein the machine learning model identifies the set of nodes and the set of edges based on relationships expressed in natural language of the text or the document of the input (Hadar: “A knowledge graph can represent a real world system, such as a computer network, roadways in a geographic area, or a population of people during an epidemic outbreak. The nodes of the knowledge graph can represent the real world elements in the system, e.g., computing devices in a computer network, roads in the geographic area, or people in the population. The edges between the nodes can represent the relationships between the real world elements, e.g., pathways between pairs of elements and the characteristics of the pathways.” [0016]; “the data includes unstructured or semi-structured text, images, and/or videos.” [0024]).
Regarding Claim 5, Yang and Hadar teach the method of Claim 3, wherein the machine learning model identifies the set of nodes based on identified entities in the image or the video of the input and the set of edges based on identified relationships between the identified entities (Yang: “We propose a novel scene graph generation model called Graph R-CNN, that is both effective and efficient at detecting objects and their relations in images.” [Abstract]).
Regarding Claim 7, Yang and Hadar teach the method of Claim 3, wherein the machine learning model identifies the set of nodes based on identified stops in the map of the input and the set of edges based on a travel mode between the identified stops (Hadar: “A knowledge graph can represent a real world system, such as a computer network, roadways in a geographic area, or a population of people during an epidemic outbreak. The nodes of the knowledge graph can represent the real world elements in the system, e.g., computing devices in a computer network, roads in the geographic area, or people in the population. The edges between the nodes can represent the relationships between the real world elements, e.g., pathways between pairs of elements and the characteristics of the pathways.” [0016]; “the data includes unstructured or semi-structured text, images, and/or videos.” [0024]; BRI of map is that a map is a type of image that shows geographic locations, including roads).
Regarding Claim 8, Yang and Hadar teach the method of Claim 1, wherein the input includes any combination of documents, video, audio, speech, or text (Hadar: “the data includes unstructured or semi-structured text, images, and/or videos.” [0024]).
Regarding Claim 9, Yang and Hadar teach the method of Claim 1, wherein the representation of the predicted graph includes different shapes and labels for the set of nodes and one or more of undirected edges, directed edges, or weighted edges in the set of edges (Hadar: “An example, knowledge graph 150 illustrated in FIG. 1. This knowledge graph 150 includes nodes 151 represented by circles and edges 152 represented by arrows. As described in more detail below, the knowledge graph 150 includes regular nodes (without shading), a cardinal node 153, and target nodes 154 and 155. The knowledge graph 150 will be used as an example for the remaining description of FIG. 1, although the techniques can be applied to knowledge graphs having different arrangements, sizes, numbers of nodes, different edges, etc.” [0026]; See [Figure 1]).
Regarding Claim 10, Yang and Hadar teach the method of Claim 1, further comprising:
receiving a query (Hadar: “The analytical engine 142 can receive queries 117 from user terminals 116 (e.g., client computers)” [0038]); and
using the information in the set of nodes and the set of edges of the predicted graph to provide an answer to the query (Hadar: “the analytical engine 142 can evaluate the knowledge graph 150 to identify the nodes that are on a path to the specified targets and that have a high (e.g., greater than threshold or higher than other nodes) cardinal value. The analytical engine 142 can respond to each query 117 with node data 118 specifying the nodes that match the query 117.” [0038]).
Regarding Claim 11, Yang and Hadar teach the method of Claim 1, further comprising:
presenting the representation of the predicted graph (Hadar: “the node prioritization engine 146 can generate a graph that plots the aggregate cardinal values over time so that a user can assess the effectiveness of the efforts and resource utilization to improve the system.” [0041]);
receiving a modification of the set of nodes or the set of edges in the predicted graph (Hadar: “The update may happen either as a next round of system scan, or, alternatively, as simulation run by a user.” [0041]); and
updating the representation of the predicted graph based on the modifications of the set of nodes or the set of edges (Hadar: “The knowledge graph generation engine 136 can update the knowledge graph 150 after elements corresponding to the nodes 151 in the knowledge graph 150 are improved, removed from the system, or the system is otherwise altered.” [0041]).
Regarding Claim 24, Yang and Hadar teach the method of Claim 1, wherein the machine learning model infers relationships between the entities that are both spatial and conceptual in predicting the possible combination of the nodes and the edges for the predicted graph (Yang: “The pipeline of our proposed Graph R-CNN framework. Given an image, our model first uses RPN to propose object regions, and then prunes the connections between object regions through our relation proposal network (RePN). Attentional GCN is then applied to integrate contextual information from neighboring nodes in the graph. Finally, the scene graph is obtained on the right side.” [Fig. 2]; See “Scene Graph” in [Fig. 2]).
Regarding Claim 25, Yang and Hadar teach the method of Claim 1, wherein the machine learning infers relationships that are physical relationships in space between the entities in predicting the possible combination of the nodes and the edges for the predicted graph (Yang: “In our approach, we use the Faster R-CNN [32] framework to extract a set of n object proposals from an input image. Each object proposal i is associated with a spatial region roi = [xi, yi, wi, hi], a pooled feature vector xoi, and an initial estimated label distribution poi over classes C={1, . . . , k}. We denote the collection of these vectors for all n proposals as the matrices Ro∈ Rn×4, Xo∈ Rn×d, and Po∈ Rn×|C| respectively.” [Section 3.1 Object Proposals]).
Regarding Claim 26, Yang and Hadar teach the method of Claim 1, wherein the machine learning models infers relationships between the entities based on actions occurring between the entities in predicting the possible combination of the nodes and the edges for the predicted graph (Yang: “Given the n proposed object nodes from the previous step, there are O(n2) possible connections between them; however, as previously discussed, most object pairs are unlikely to have relationships due to regularities in real-world object interactions. To model these regularities, we introduce a relation proposal network (RePN) which learns to efficiently estimate the relatedness of an object pair. By pruning edges corresponding to unlikely relations, the RePN can efficiently sparsify the candidate scene graph – retaining likely edges and suppressing noise introduced from unlikely ones.” [Section 3.2 Relation Proposal Network]).
Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Yang in view of Hadar, as applied to Claim 3 above, in further view of Zhang et al. (US 20210104234 A1, filed 10/08/2020), hereinafter Zhang. Zhang was cited in a previous Office Action.
Regarding Claim 6, Yang and Hadar teach the method of Claim 3. However, they fail to expressly disclose wherein the machine learning model identifies the set of nodes based on identified entities in the audio or the speech of the input and the set of edges based on actions performed between identified entities in the audio or the speech.
In the same field of endeavor, Zhang teaches wherein the machine learning model identifies the set of nodes based on identified entities in the audio or the speech of the input and the set of edges based on actions performed between identified entities in the audio or the speech (Zhang: “the knowledge graph is created by: receiving one or more phrases; for each of the one or more phrases: performing lemmatization on the respective phrase to reduce the phrase; tagging parts of speech in the reduced phrase; extracting action words and forming an action list; and extracting object words and forming an object list, wherein the one or more nodes added to the knowledge graph correspond to the extracted action words and the extracted object word” [0027]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated wherein the machine learning model identifies the set of nodes based on identified entities in the audio or the speech of the input and the set of edges based on actions performed between identified entities in the audio or the speech, as taught by Zhang, to the method of Yang and Hadar because both methods are directed towards using machine learning to generate graphical representations of inputs. Audio/speech is another form of language that can be interpreted through natural language processing, just like text. In making this combination and expanding the type of input to include relationships between actions and entities in audio/speech data, it would allow the method of Yang and Hadar to expand its use to applications in “commercial and academic sectors… such as voice searching in mobile devices and meeting user summarization” (Zhang: [0010]).
Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over Yang in view of Hadar, as applied to Claim 1 above, in further view of Chua et al. (US 20200410231 A1; filed 06/28/2019), hereinafter Chua. Chua was cited in the previous Office Action.
Regarding Claim 21, Yang and Hadar teach the method of Claim 1. However, they fail to expressly disclose wherein the input is a PDF image of a table and the machine learning model extracts information from the table and provides the representation of the predicted graph of the table in the PDF image.
In the same field of endeavor, Chua teaches wherein the input is a PDF image of a table and the machine learning model extracts information from the table and provides the representation of the predicted graph of the table in the PDF image (Chua: “A method for extracting data from lineless tables includes storing an image including a table in a memory. A processor operably coupled to the memory identifies a plurality of text-based characters in the image, and defines multiple bounding boxes based on the characters. Each of the bounding boxes is uniquely associated with at least one of the text-based characters. A graph including multiple nodes and multiple edges is generated based on the bounding boxes, using a graph construction algorithm.” [Abstract]; “In some embodiments, a method for extracting data from lineless tables includes obtaining, at a processor, a portable document format (PDF) file including formatted data. The PDF file is converted to an image file, and OCR is performed on the image file to produce a scanned file.” [0005]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated wherein the input is a PDF image of a table and the machine learning model extracts information from the table and provides the representation of the predicted graph of the table in the PDF image, as taught by Chua, to the method of Yang and Hadar because both methods are directed towards using machine learning to generate graphical representations of inputs. In making this combination and expanding the type of input to include information extracted from PDF tables, it would allow the method of Yang and Hadar to expand its use to address the issue that existing technologies have with OCR software, in that such software “typically cannot determine associations among the recovered characters/text or between the recovered characters/text and locations (e.g. cells) within a table” (Chua: [0037]).
Claims 22 and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Yang in view of Hadar, as applied to Claim 1 above, in further view of Hamilton (“Graph Representation Learning”; published 2020).
Regarding Claim 22, Yang and Hadar teach the method of Claim 1. However, they fail to expressly disclose wherein the type of node is an atom, and the predicted graph is a molecule graph.
In the same field of endeavor, Hamilton teaches wherein the type of node is an atom, and the predicted graph is a molecule graph (Hamilton: “The same graph formalism can be used to represent social networks, interactions between drugs and proteins, the interactions between atoms in a molecule, or the connections between terminals in a telecommunications network—to name just a few examples.” [Chapter 1. Introduction]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated wherein the type of node is an atom, and the predicted graph is a molecule graph, as taught by Hamilton, to the method of Yang and Hadar because both methods are directed towards using machine learning to generate graphical representations of multiple types of inputs to represent relationships between multiple types of entities. In making this combination and expanding the type of interpreted graphs to include molecular relationships between atoms, it would allow the method of Yang and Hadar to provide “a mathematical foundation that we can build upon to analyze, understand, and learn from real-world complex systems” (Hamilton: [Chapter 1. Introduction]).
Regarding Claim 23, Yang and Hadar teach the method of Claim 1. However, they fail to expressly disclose wherein the type of node is a city, and the predicted graph is a transportation graph.
In the same field of endeavor, Hamilton teaches wherein the type of node is a city, and the predicted graph is a transportation graph (Hamilton: “For instance, in a multiplex transportation network, each node might represent a city and each layer might represent a different mode of transportation (e.g., air travel or train travel). Intra-layer edges would then represent cities that are connected by different modes of transportation, while inter-layer edges represent the possibility of switching modes of transportation within a particular city.” [Chapter 1.1.1 Multi-relational Graph]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have incorporated wherein the type of node is a city, and the predicted graph is a transportation graph, as taught by Hamilton, to the method of Yang and Hadar because both methods are directed towards using machine learning to generate graphical representations of multiple types of inputs to represent relationships between multiple types of entities. In making this combination and expanding the type of interpreted graphs to include transportation between cities, it would allow the method of Yang and Hadar to provide “a mathematical foundation that we can build upon to analyze, understand, and learn from real-world complex systems” (Hamilton: [Chapter 1. Introduction]).
Response to Arguments
Examiner acknowledges the Applicant’s amendments to Claims 1 and 19, as well as newly added Claims 22-26.
Applicant's arguments, filed 02/02/2026, traversing the rejection of Claims 1, 3-11 and 19-21 under 35 U.S.C. § 101 have been fully considered but are not persuasive.
Applicant alleges, on Pages 7-8 of the Remarks, that the features of the amended independent claims are not capable of being performed in the human mind, specifically because the human mind cannot perform inference by a machine learning model, nor can the human mind “provide, using a parallel decoder, an output of a complete representation of the predicted graph with predicted node labels,” as recited in amended independent claim 19. Additionally, Applicant submits that the amended independent claims recite elements that are directed to an improvement in computing, as “parallel decoding of the graph has significant advantages over sequential decoding, including being faster at inference time and simplifying the learning procedure by eliminating the dependence on the many arbitrary orders in which the graph can be produced. The parallel decoding of the graph also enables the graph generation task to be abstracted completely as an end-to-end learning problem, which among other things eliminates the need for custom-engineered approaches for each application”, as described in the specification. In addition, the features recited in the amended independent claims have “the advantage of not only being faster at inference time, but also means the machine learning model 102 to be permutation invariant (invariant to the ordering of the nodes). This eliminates the additional complexity encountered by autoregressive approaches, which must commit the model to producing the nodes and edges in a particular arbitrary order.” As such, the Applicant submits that the amended independent claims are patent eligible.
Examiner respectfully disagrees.
Firstly, Step 2A Prong One of the Alice/Mayo test for patent eligibility instructs examiners to determine whether a claim recites an abstract idea by (1) identifying the specific limitation(s) in the claim under examination that the examiner believes recites an abstract idea, and (2) determining whether the identified limitation(s) fall within at least one of the groupings of abstract ideas. If the identified limitation(s) falls within at least one of the groupings of abstract ideas, it is reasonable to conclude that the claim recites an abstract idea in Step 2A Prong One. As noted in the § 101 analysis above, the human mind is perfectly capable of, and regularly does, perform inference as well as generate a representation of a graph. That the inference or graph generation is being done by a machine learning model or using a parallel decoder does not preclude the limitations from reciting abstract ideas because, as noted in MPEP § 2106.04(a)(2)(III)(C), when claims describe a concept that can be performed in the human mind and the concept is merely “performed 1) on a generic computer, or 2) in a computer environment, or 3) is merely using a computer as a tool to perform the concept”, then those claims are considered to recite a mental process. As such, Examiner asserts that the independent claims as amended recite an abstract idea at least with the claimed “performing inference…”, as well as other recitations of abstract ideas as listed in the Step 2A Prong One analysis of the independent claims above.
Secondly, in the Step 2A Prong Two and Step 2B analyses, if the alleged improvement comes exclusively from the use of parallel decoding, not only is this a well-known, widely used, preexisting computational technique that can be applied on any generic computer, but it merely invokes the use of a computer to perform the existing mental process of interpreting information in a non-graphical format to represent it in a graphical format. As per MPEP § 2106.05(f), “Use of a computer or other machinery in its ordinary capacity for economic or other tasks (e.g., to receive, store, or transmit data) or simply adding a general purpose computer or computer components after the fact to an abstract idea (e.g., a fundamental economic practice or mathematical equation) does not integrate a judicial exception into a practical application or provide significantly more… Similarly, "claiming the improved speed or efficiency inherent with applying the abstract idea on a computer" does not integrate a judicial exception into a practical application or provide an inventive concept.” As the provided limitation of parallel decoding is both well-understood, routine, and conventional in the field of computer architecture and added as a mere token addition to the claim without imposing any meaningful limitations on the claimed invention of using a machine learning model to interpret relational information and generate a knowledge graph, the amended independent claims recite an abstract idea that is merely performed on a general purpose computer and is not integrated into a practical application nor amounts to an inventive concept, and thus Examiner asserts that the 35 U.S.C. § 101 rejection set forth above is proper and establishes a prima facie case of patent ineligibility for the independent claims 1 and 19. Dependent claims 3-11 and 20-26 are similarly ineligible for their dependence on an ineligible claim as well as for their own deficiencies as outlined in the 35 U.S.C. § 101 analysis above.
Applicant’s arguments, filed 02/02/2026, regarding the rejection of Claims 1-11 and 19-21 under 35 U.S.C. § 102/103 have been fully considered and are found moot in light of the new grounds of rejection (see rejection above).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Ankisettipalli et al. (US 20190332956 A1) teaches generating dynamic knowledge graphs indicating relationships between data and analyzing the knowledge graphs to respond to user queries.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MEGAN E HWANG whose telephone number is (703)756-1377. The examiner can normally be reached Monday-Thursday 10:00-7:30 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jennifer Welch can be reached at (571) 272-7212. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/M.E.H./Examiner, Art Unit 2143
/JENNIFER N WELCH/Supervisory Patent Examiner, Art Unit 2143