DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .
Claim Objections
Claims 1 and 13 are objected to because of the following informalities: the limitation “the graph is stored one or more vertex tables and one or more edge data structures” appears to be missing a word between “stored” and “one.” Appropriate correction is required.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 1-10 and 13-19 are rejected under 35 U.S.C. 103 as being unpatentable over Arnaboldi et al. (U.S. Patent No. US 20210224235 A1), hereinafter “Arnaboldi” in view of Haprian et al. (U.S. Patent No. US 20210182285 A1), hereinafter “Haprian.”
With regards to Claim 1, Arnaboldi teaches:
A method comprising: performing in-memory processing of a pattern matching query for matching a path pattern on a graph (Paragraphs 14-15 and 26, “CSR encodings are graph topology indices that make navigation through graph elements quick, speeding up workloads such as graph pattern matching queries and computationally or navigationally intensive graph algorithms such as PageRank. Herein are ways to efficiently construct an in-memory index for a graph defined over existing relational tables that can seamlessly integrate with an in-memory storage feature provided by a relational database management system (RDBMS)… Explained herein is how to efficiently build a main memory representations of graph indices that exploit CSR representations and seamlessly integrate with in-memory columnar representations of relational database tables that participate in providing content of the graph… Construction of the main memory representation for a graph may be performed concurrent to user requests (especially, requests for executing graph algorithms or graph pattern match querying).” The main memory representation for a graph being built with in-memory columnar representations of relational database tables and constructed concurrently with requests for graph pattern match querying and navigationally intensive graph algorithms correlates to performing in-memory processing of a pattern matching query for matching a path pattern on a graph), wherein:
the graph is stored one or more vertex tables and one or more edge data structures (Paragraphs 20 and 42, “A graph data model herein is defined over many relational tables (not just two) acting as graph vertex or graph edge containers… Graph 105 is loaded from a relational database having relational schema 160 that defines vertex tables 171-172 and edge tables 181-182. Relational schema 160 defines the persistence format of data for graph 105, and graph data model 130 defines an analytical format that is suitable for graph analytics in memory. For example, each row of vertex table 171 may be a persistent representation of a respective vertex of vertex type 141. For example, vertex A may be stored as a row in vertex table 171, and vertex D may be stored in vertex table 172.” The graph being loaded from a relational database having a relational schema with persistent representations of vertexes and edges in vertex and edge tables correlates to the graph being stored in one or more vertex tables and one or more edge data structures),
performing the in-memory processing of the pattern matching query comprises:
partitioning vertices of a graph table row source into a plurality of granules of work (Paragraphs 74, 100, “In another example as described later herein: a) horizontal and/or vertical slices of vertex tables and/or edge tables have their data stored into memory chunks; b) each chunk may be individually loaded or evicted; and c) multiple chunks may load in parallel from same or different relational table(s)… However in memory, an implementation may store any array, such as property vector(s) or lookup tables, in segments known herein as chunks that need not be contiguous. In an embodiment, populating or later using a chunk may be assigned as a unit of work to a thread.” Horizontal slices of vertex tables being stored as chunks, which may be assigned as a unit of work to a thread, correlates to partitioning vertices of a graph table row source into a plurality of granules of work); and
executing the plurality of granules of work on a plurality of worker processes using inter-process parallelism (Paragraphs 105 and 133, “In step 402, each of many threads concurrently processes a respective subset of rows of one vertex table of a relational database. Simultaneously, different threads may perform different respective sub-steps. If there is a backlog of unprocessed chunks because of fewer threads than chunks, than a thread that finishes a chunk may take and process another chunk until all chunks are processed… Depending on the database system, parallelism can be achieved using processes, or lightweight threads, or a combination of both.” Each of the processes concurrently processing an unprocessed chunk to achieve parallelism correlates to executing the plurality of granules of work on a plurality of worker processes using inter-process parallelism), executing each given granule of work on a given worker process comprises:
and scheduling execution of the plurality of tasks on a plurality of threads from a pool of threads using intra-process parallelism (Paragraphs 9, 105 and 133, “FIG. 4 is a flow diagram that depicts an example multithreaded process… In step 402, each of many threads concurrently processes a respective subset of rows of one vertex table of a relational database. Simultaneously, different threads may perform different respective sub-steps. If there is a backlog of unprocessed chunks because of fewer threads than chunks, than a thread that finishes a chunk may take and process another chunk until all chunks are processed… Depending on the database system, parallelism can be achieved using processes, or lightweight threads, or a combination of both.” Each of the many threads concurrently processing a respective subset of rows of one vertex table or a backlog to achieve parallelism, where the many threads can be part of one process, correlates to scheduling execution of the plurality of tasks on a plurality of threads from a pool of threads using intra-process parallelism);
wherein the method is performed by one or more computing devices (Fig. 1, paragraphs 29 and 213, “FIG. 1 is a block diagram that depicts an example computer 100, in an embodiment. Computer 100 uses parallelism to accelerate population of compressed sparse row (CSR) encoding 110 of logical graph 105. Computer 100 may be at least one rack server such as a blade, a personal computer, a mainframe, a virtual computer, or other computing device… According to one embodiment, the techniques described herein are implemented by one or more special-purpose computing devices.” The special-purpose computing devices implementing the techniques correlates to the method being performed by one or more computing devices).
Arnaboldi does not explicitly teach:
executing each given granule of work on a given worker process comprises:
decomposing graph neighbor matching operations into a plurality of tasks that can be processed independently;
However, Haprian teaches:
executing each given granule of work on a given worker process comprises:
decomposing graph neighbor matching operations into a plurality of tasks that can be processed independently (Fig. 19, paragraphs 70, 72, 193, and 195, “Path pattern match processing with graph table operator is performed using multiple match operators, which are level-specialized, computing units. Each match operator is responsible for computing vertices for a specific hop of a target pattern… Another type of match operator is an Intermediate Neighbor Match (NM) operator. The NM operator will start from a set of input vertices and produces a fixed number of neighbors for each of them…. In an embodiment, the query engine may exploit parallelism to efficiently process a path pattern. One consideration is to split work across execution threads. The main unit of work to parallelize is the neighbor iteration. Level sync parallelism entails parallelizing neighbor iteration for every operator... By exploiting parallelism, work balance is achieved since each thread iterates over the same number of neighbors. No synchronization is needed because different threads write to different result chunks.” The path pattern match processing being performed using match operators such as an intermediate neighbor match operator correlates to graph neighbor matching operations. Parallelism being used to efficiently process a path pattern where work is split across execution threads that do not require synchronization correlates to decomposing graph neighbor matching operations into a plurality of tasks that can be processed independently);
Therefore, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Arnaboldi with executing each given granule of work on a given worker process comprises: decomposing graph neighbor matching operations into a plurality of tasks that can be processed independently as taught by Haprian because query engines can exploit parallelism to efficiently process a path pattern. Work balance can also be achieved when high-degree nodes are present in the data set by ensuring that every thread iterates over a similar number of neighbors. Additionally, synchronization is not needed due to different threads writing to different result chunks (Haprian: paragraph 193-195).
With regards to Claim 13, the method of Claim 1 performs the same steps as the manufacture of Claim 13, and Claim 13 is therefore rejected using the same rationale set forth above in the rejection of Claim 1.
With regards to Claim 2, Arnaboldi in view of Haprian teaches the method of Claim 1 above. Haprian further teaches:
wherein partitioning the vertices into the plurality of granules of work comprises: compiling the pattern matching query into one or more specialization trees (Paragraphs 128, 131, and 157, “When running a pattern-matching query on a heterogeneous graph, a given pattern can correspond to multiple specializations… Specializations are part of the same specialization tree if they have an overlapping common prefix (e.g., (a IS VT1).fwdarw.(b IS VT2)). In an embodiment, a graph table operator computes the paths matching the pattern for each of these specializations... In an embodiment, sharing the processing of the common path pattern prefix reduces processing and memory usage and turns the specializations into a specialization tree of match operators including the branching NM operator.” A specialization tree being created for each specialization that can be a result of a pattern-matching query correlates to compiling the pattern matching query into one or more specialization trees);
for each of the one or more specialization trees:
partitioning vertices of a root vertex table of the specialization tree into one or more granules of work (Paragraphs 128, 131, 138, “When running a pattern-matching query on a heterogeneous graph, a given pattern can correspond to multiple specializations. As an example, consider a graph that has two different vertex tables VT1 and VT2. When matching the pattern (a IS VT1).fwdarw.(b IS VT2).fwdarw.(c), two pattern specializations can be generated… The specializations are generated at compile time, based on the metadata tracking the vertex and edge tables of the graph. Specializations are part of the same specialization tree if they have an overlapping common prefix (e.g., (a IS VT1).fwdarw.(b IS VT2)). In an embodiment, a graph table operator computes the paths matching the pattern for each of these specializations… An important aspect to observe is that the simple path pattern is now transformed into a specialization tree of operators, which might have multiple leaf nodes. Each path from a tree root to a tree leaf corresponds to a path specialization, which generates different paths… Furthermore, each leaf node should produce results into the graph table result table.” The specializations in the same specialization tree having an overlapping prefix based on the vertex table, where each path or specialization goes from the tree root to a leaf and the leaf node produces results in a result table, correlates to partitioning vertices of a root vertex table of the specialization tree into one or more granules of work) and
executing the one or more granules of work on one or more worker processes (Paragraphs 143-144, “In heterogeneous graphs processing, a path pattern is compiled into a set of independent specialization trees. A graph table operator executes each of those specialization trees to produce its output row set. The different specialization trees can be computed sequentially or in parallel, in which case, different threads are synchronized when writing results to the output row set. As discussed above, a specialization tree contains multiple leaf operators. In order to return paths from all possible specializations, the graph table operator sequentially calls the pull function on the leaf match operators and writes the matches to the graph table result table.” The graph table operator calling the pull function of all the leaf match operators which correspond to a different specialization path using different threads correlates to executing the one or more granules of work on one or more worker processes),
wherein each worker process in the one or more worker processes creates a result set of matching paths for the specialization tree (Paragraphs 143-144, “In heterogeneous graphs processing, a path pattern is compiled into a set of independent specialization trees. A graph table operator executes each of those specialization trees to produce its output row set. The different specialization trees can be computed sequentially or in parallel, in which case, different threads are synchronized when writing results to the output row set. As discussed above, a specialization tree contains multiple leaf operators. In order to return paths from all possible specializations, the graph table operator sequentially calls the pull function on the leaf match operators and writes the matches to the graph table result table.” The graph table operator executing each specialization tree and its corresponding path specializations to write the results to a graph table result table correlates to each worker process in the one or more worker processes creating a result set of matching paths for the specialization tree).
Therefore, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Arnaboldi with wherein partitioning the vertices into the plurality of granules of work comprises: compiling the pattern matching query into one or more specialization trees; for each of the one or more specialization trees: partitioning vertices of a root vertex table of the specialization tree into one or more granules of work; and executing the one or more granules of work on one or more worker processes, wherein each worker process in the one or more worker processes creates a result set of matching paths for the specialization tree as taught by Haprian because sharing the processing of common path pattern prefixes reduces processing and memory usage. Path patterns can be compiled into a set of independent specialization tress to produce output row sets for all possible specialization paths (Haprian: paragraphs 143-144 and 157).
With regards to Claim 14, the method of Claim 2 performs the same steps as the manufacture of Claim 14, and Claim 14 is therefore rejected using the same rationale set forth above in the rejection of Claim 2.
With regards to Claim 3, Arnaboldi in view of Haprian teaches the method of Claim 2 above. Arnaboldi further teaches:
wherein each granule of work has an associated range of internal vertex identifiers (Paragraphs 77 and 130, “For example, such identifying column(s) of vertex table 171 may be a primary key, a secondary key such as a natural key, a pseudocolumn such as a row identifier (ROWID), and/or a compound key from multiple columns of vertex table 171… Each chunk has a metadata header associated to it, which contains useful information about the chunk itself, like the number of rows stored in it, and information allowing to identify the range of ROWIDs for the rows stored in the chunk.” Each chunk having a metadata header associated to it which contains information about the range of ROWIDs for rows stored in the chunk, where a ROWID can be an identifying column of a vertex table, correlates to each granule of work having an associated range of internal vertex identifiers) and a pointer to a vertex table over which a range of internal vertex identifiers is defined (Paragraphs 130 and 136, “Each chunk has a metadata header associated to it, which contains useful information about the chunk itself, like the number of rows stored in it, and information allowing to identify the range of ROWIDs for the rows stored in the chunk… When the in-memory store is organized in chunks, the position can be relative to the beginning of the chunk. Further, the in-memory store maintains a mapping between the physical address of a row on disk (e.g., a ROWID in Oracle) and the position assigned to that row in the chunk of the in-memory column store holding the column values of that row.” The in-memory store being organized in chunks and maintaining mappings which include the physical address of a row on a disk, such as the ROWID, and are specified as a range in the metadata, correlates to a pointer to a vertex table over which a range of internal vertex identifiers is defined).
With regards to Claim 4, Arnaboldi in view of Haprian teaches the method of Claim 3 above. Arnaboldi further teaches:
wherein the pointer to the vertex table references a vertex table (Paragraphs 77, 130 and 136, “For example, such identifying column(s) of vertex table 171 may be a primary key, a secondary key such as a natural key, a pseudocolumn such as a row identifier (ROWID), and/or a compound key from multiple columns of vertex table 171… Each chunk has a metadata header associated to it, which contains useful information about the chunk itself, like the number of rows stored in it, and information allowing to identify the range of ROWIDs for the rows stored in the chunk… When the in-memory store is organized in chunks, the position can be relative to the beginning of the chunk. Further, the in-memory store maintains a mapping between the physical address of a row on disk (e.g., a ROWID in Oracle) and the position assigned to that row in the chunk of the in-memory column store holding the column values of that row.” The in-memory store maintaining mappings which include the physical address of a row of a vertex table on a disk, such as the ROWID, correlates to wherein the pointer to the vertex table references a vertex table).
Arnaboldi does not explicitly teach that the vertex table is a root vertex table of a specialization tree. However, root vertex table[s] of a specialization tree[s] are a popular type of table used for specialization trees as evidenced by Haprian (Paragraphs 131 and 147, “Specializations are part of the same specialization tree if they have an overlapping common prefix (e.g., (a IS VT1).fwdarw.(b IS VT2)). In an embodiment, a graph table operator computes the paths matching the pattern for each of these specializations… The process 1500 assumes that a heterogeneous property graph is defined from relational tables, and an in-memory representation of the heterogeneous graph uses a CSR format. A path pattern expression is compiled into multiple specializations, with each specialization rooted by a starting vertex table in the graph.” Each specialization being part of a specialization tree and rooted by a starting vertex table correlate to a root vertex table of a specialization tree).
Therefore, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Arnaboldi with a root vertex table of a specialization tree as taught by Haprian because pattern path expressions can be evaluated against an in-memory graph representation, which supports graph pattern matching queries inside a relational database system. The path pattern expressions can be compiled into multiple specializations, with each specialization rooted by a starting vertex table in the graph for evaluation against a sequence of match operators (Haprian: paragraphs 147-148).
With regards to Claim 5, Arnaboldi in view of Haprian teaches the method of Claim 1 above. Haprian further teaches:
wherein each task within the plurality of tasks is associated with a graph match operator comprising one of: a root vertex match (RM) operator for finding a first level of vertices of the path pattern, an intermediate neighbor match (NM) operator for finding non-leaf neighbor matches of the path pattern, or a leaf neighbor match (LNM) operator for finding leaf level vertices matching the path pattern (Paragraphs 73, 148, and 175, “The LNM operator finds the last or leaf level vertices that match the pattern… At step 1505, a path pattern expression is evaluated against an in-memory graph representation by at least executing a sequence of match operators. The sequence of match operators includes a root vertex match (RNM) operator, an intermediate neighbor match (NM) operator that is a branching NM operator, and a plurality of leaf neighbor match (LNM) operators… In an embodiment, the path pattern expression comprises a plurality of nodes that includes a root node and a plurality of parent nodes. Each parent node of the plurality of parent nodes has at least one child node. Each child node is a child of one of the plurality of parent nodes. In an embodiment, each of the plurality of nodes is associated with a match operator of the sequence of match operators.” Each of the plurality of nodes in a path pattern expression being associated with a match operator that is executed against the graph representation correlates to each task within the plurality of tasks is associated with a graph match operator. The graph match operators including a leaf neighbor match operator which finds last or leaf level vertices matching the pattern correlates to a leaf neighbor match (LNM) operator for finding leaf level vertices matching the path pattern).
Therefore, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Arnaboldi with wherein each task within the plurality of tasks is associated with a graph match operator comprising one of: a root vertex match (RM) operator for finding a first level of vertices of the path pattern, an intermediate neighbor match (NM) operator for finding non-leaf neighbor matches of the path pattern, or a leaf neighbor match (LNM) operator for finding leaf level vertices matching the path pattern as taught by Haprian because path pattern match processing with graph table operator can be performed using multiple match operators, which are level-specialized, computing units. Leaf neighbor match operators can be used specifically for matching the last node in the pattern to find the last or leaf level vertices that match the pattern. Each type of match operators has a different control flow where they have an additional specialization, depending on whether a vertex predicate, an edge predicate, or both vertex and edge predicate must be evaluated (Haprian: paragraphs 70 and 73-74).
With regards to Claim 15, the method of Claim 5 performs the same steps as the manufacture of Claim 15, and Claim 15 is therefore rejected using the same rationale set forth above in the rejection of Claim 5.
With regards to Claim 6, Arnaboldi in view of Haprian teaches the method of Claim 1 above. Haprian further teaches:
wherein each task within the plurality of tasks is associated with a graph match operator that maintains one or more level contexts that can be used for input or output (Paragraphs 70, 83, 145, and 175, “Path pattern match processing with graph table operator is performed using multiple match operators, which are level-specialized, computing units… The first or root match operator—RNM operator—of a path pattern expression uses a CSR and fills its first or root level data structure storing a set of matching vertices. This first or root level data structure acts as input for the next operator (intermediate match operator—NM operator) in the path pattern expression which, using the input and the CSR, computes the vertices matching the path until that point, and stores the results in its intermediate level data structure… In an embodiment, the graph table operator may maintain a map <NODE, STATE> that keeps track of the state of every node in the specialization tree, wherein STATE can be ACTIVE or WAITING. The initial state for all nodes in the specialization tree may be ACTIVE… In an embodiment, the path pattern expression comprises a plurality of nodes that includes a root node and a plurality of parent nodes. Each parent node of the plurality of parent nodes has at least one child node. Each child node is a child of one of the plurality of parent nodes. In an embodiment, each of the plurality of nodes is associated with a match operator of the sequence of match operators.” Each of the plurality of nodes in a path pattern expression being associated with a match operator that is executed against the graph representation correlates to each task within the plurality of tasks is associated with a graph match operator. The first root match operator filling its root level data structure which is used as input for the next operator and maintaining a map of the state of every node correlates to wherein each task within the plurality of tasks is associated with a graph match operator that maintains one or more level contexts that can be used for input or output).
Therefore, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Arnaboldi with wherein each task within the plurality of tasks is associated with a graph match operator that maintains one or more level contexts that can be used for input or output as taught by Haprian because each match operator is associated with a data structure where it stores the current state, such as the matched vertices by the match operator, the number of valid rows, etc. The control flow enables match operators to move data through the different level data structures and interact with the graph table operator. The data structures avoid duplicating vertices along a path, providing enough information for path reconstruction, controlling the memory footprint and support pipelined execution and memory locality. Sharing the computation of prefix of a path pattern in the specialization tree can be performed by synchronizing all leaf operators feeding on the same branching operators in the specialization tree (Haprian: paragraphs 75, 77-80 and 145).
With regards to Claim 16, the method of Claim 6 performs the same steps as the manufacture of Claim 16, and Claim 16 is therefore rejected using the same rationale set forth above in the rejection of Claim 6.
With regards to Claim 7, Arnaboldi in view of Haprian teaches the method of Claim 6 above. Haprian further teaches:
wherein each level context has an associated state comprising one of: an empty state indicating the level context does not hold any valid vertices and is ready to be filled (Paragraphs 110 and 145, “When the parent of the graph table operator 904 requests rows to process, the newly created operator will call a pull function on the LNM operator 904c to request rows from the graph table operator 904. If the LNM operator 904c still has valid rows in its data structure, then it writes them to the graph table operator's result table 906. If the LNM operator 904c does not have valid rows in its data structure, then the LNM operator 904c invokes the previous match operator 904b to fetch more neighbors from the previous operator 904b. This is done by calling a pull function on the previous match operator 904b and, then, processing the neighbors of the new vertices and storing results in its data structure… In an embodiment, the graph table operator may maintain a map <NODE, STATE> that keeps track of the state of every node in the specialization tree, wherein STATE can be ACTIVE or WAITING. The initial state for all nodes in the specialization tree may be ACTIVE.” The operator calling a pull function of the LNM operator, which if the LNM does not have valid rows in its data structure, such as in the initial state in the specialization tree where the node has an active state, to invoke the previous match operator to fetch more neighbors to process, correlates to wherein each level context has an associated state comprising one of: an empty state indicating the level context does not hold any valid vertices and is ready to be filled. Additionally, based on paragraph 59 of the specification, the empty state is described as “the initial state after allocation”), a ready state indicating the level context stores valid vertices that have neighbor iterators, or a waiting state indicating the level context has at least one child context but all neighbor iterators have been exhausted.
Therefore, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Arnaboldi with wherein each level context has an associated state comprising one of: an empty state indicating the level context does not hold any valid vertices and is ready to be filled, a ready state indicating the level context stores valid vertices that have neighbor iterators, or a waiting state indicating the level context has at least one child context but all neighbor iterators have been exhausted as taught by Haprian because each match operator is associated with a data structure where it stores the current state, such as the matched vertices by the match operator, the number of valid rows, etc. The control flow enables match operators to move data through the different level data structures and interact with the graph table operator. The data structures avoid duplicating vertices along a path, providing enough information for path reconstruction, controlling the memory footprint and support pipelined execution and memory locality (Haprian: paragraphs 75 and 77-80).
With regards to Claim 17, the method of Claim 7 performs the same steps as the manufacture of Claim 17, and Claim 17 is therefore rejected using the same rationale set forth above in the rejection of Claim 7.
With regards to Claim 8, Arnaboldi in view of Haprian teaches the method of Claim 7 above. Haprian further teaches:
wherein decomposing graph neighbor matching operations comprises: scanning levels of the graph neighbor matching operations backwards (Fig. 9, paragraphs 10 and 109-110, “Once the whole pattern is matched, the path is backtracked to iterate over more vertices… Note that the control flow and the data flow are in opposite directions. The control flow is from relational operators 908 to CSR 902, while the data flow is from CSR 902 to relational operators 908... Accordingly, each match operator in the chain will either consume more neighbors from its predecessor in the chain or call its predecessor's pull function to fetch the next batch of neighbors.” The path being backtracked to iterate over more vertices, where each match operator call its predecessor's pull function to fetch the next batch of neighbors, correlates to scanning levels of the graph neighbor matching operations backwards);
identifying a level context that has an empty state and an input level context with valid vertices to be processed (Paragraphs 83, 110, and 145 “The first or root match operator—RNM operator—of a path pattern expression uses a CSR and fills its first or root level data structure storing a set of matching vertices. This first or root level data structure acts as input for the next operator (intermediate match operator—NM operator) in the path pattern expression which, using the input and the CSR, computes the vertices matching the path until that point, and stores the results in its intermediate level data structure… When the parent of the graph table operator 904 requests rows to process, the newly created operator will call a pull function on the LNM operator 904c to request rows from the graph table operator 904. If the LNM operator 904c still has valid rows in its data structure, then it writes them to the graph table operator's result table 906. If the LNM operator 904c does not have valid rows in its data structure, then the LNM operator 904c invokes the previous match operator 904b to fetch more neighbors from the previous operator 904b. This is done by calling a pull function on the previous match operator 904b and, then, processing the neighbors of the new vertices and storing results in its data structure… In an embodiment, the graph table operator may maintain a map <NODE, STATE> that keeps track of the state of every node in the specialization tree, wherein STATE can be ACTIVE or WAITING. The initial state for all nodes in the specialization tree may be ACTIVE.” The LNM operator not having valid rows in its data structure, such as during its initial active state, correlates to identifying a level context that has an empty state. The LNM operator invoking a previous match operator to fetch more neighbors, which is used as input for the LNM operator to process the neighbors, correlates to identifying an input level context with valid vertices to be processed);
creating a matching task to process the identified level context using the input level context (Paragraphs 83 and 110, “In order to compute the pattern matches for a graph, vertices are matched starting at the root node and finishing at the leaf node of the pattern. The first or root match operator—RNM operator—of a path pattern expression uses a CSR and fills its first or root level data structure storing a set of matching vertices. This first or root level data structure acts as input for the next operator (intermediate match operator—NM operator) in the path pattern expression which, using the input and the CSR, computes the vertices matching the path until that point, and stores the results in its intermediate level data structure… If the LNM operator 904c does not have valid rows in its data structure, then the LNM operator 904c invokes the previous match operator 904b to fetch more neighbors from the previous operator 904b. This is done by calling a pull function on the previous match operator 904b and, then, processing the neighbors of the new vertices and storing results in its data structure. This process happens for each match operator in the path until reaching the RNM operator 904a, for which a pull function is simply loading the next chunk of vertices existing in a graph, by referencing the CSR 902 which represents the graph.” The LNM operator not having valid rows in its data structure and calling a pull function on previous match operators to use the previous level data structure as input correlates to creating a matching task to process the identified level context using the input level context); and scheduling execution of the matching task (Paragraphs 107 and 110, “Match operators are chained according to a pull-based control flow… This is done by calling a pull function on the previous match operator 904b and, then, processing the neighbors of the new vertices and storing results in its data structure. This process happens for each match operator in the path until reaching the RNM operator 904a, for which a pull function is simply loading the next chunk of vertices existing in a graph, by referencing the CSR 902 which represents the graph. Accordingly, each match operator in the chain will either consume more neighbors from its predecessor in the chain or call its predecessor's pull function to fetch the next batch of neighbors.” Each match operator being chained according to a pull-based flow, where pull functions of previous match operators are incrementally called based on their chain and executed when they have neighbors to process, until the current match operator has new neighbors to process, correlates to scheduling execution of the matching task).
Therefore, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Arnaboldi with wherein decomposing graph neighbor matching operations comprises: scanning levels of the graph neighbor matching operations backwards; identifying a level context that has an empty state and an input level context with valid vertices to be processed; creating a matching task to process the identified level context using the input level context; and scheduling execution of the matching task as taught by Haprian because each match operator is associated with a data structure where it stores the current state, such as the matched vertices by the match operator, the number of valid rows, etc. The control flow enables match operators to move data through the different level data structures and interact with the graph table operator. The data structures avoid duplicating vertices along a path, providing enough information for path reconstruction, controlling the memory footprint and support pipelined execution and memory locality (Haprian: paragraphs 75 and 77-80).
With regards to Claim 18, the method of Claim 8 performs the same steps as the manufacture of Claim 18, and Claim 18 is therefore rejected using the same rationale set forth above in the rejection of Claim 8.
With regards to Claim 9, Arnaboldi in view of Haprian teaches the method of Claim 8 above. Haprian further teaches:
wherein decomposing graph neighbor matching operations further comprises: in response to completing execution of the matching task:
marking an output level context of the matching task as ready if one or more neighbors were produced or empty if no neighbors were produced (Paragraphs 83, 85, and 110-111, “The first or root match operator—RNM operator—of a path pattern expression uses a CSR and fills its first or root level data structure storing a set of matching vertices. This first or root level data structure acts as input for the next operator (intermediate match operator—NM operator) in the path pattern expression which, using the input and the CSR, computes the vertices matching the path until that point, and stores the results in its intermediate level data structure… If the LNM operator 904c does not have valid rows in its data structure, then the LNM operator 904c invokes the previous match operator 904b to fetch more neighbors from the previous operator 904b. This is done by calling a pull function on the previous match operator 904b and, then, processing the neighbors of the new vertices and storing results in its data structure. This process happens for each match operator in the path until reaching the RNM operator 904a, for which a pull function is simply loading the next chunk of vertices existing in a graph, by referencing the CSR 902 which represents the graph… This process continues until reaching the leaf node of the pattern… For example, the match operator at the level may pull from the parent until either the level data structure is full or the parent runs out of data.” The LNM not having valid rows in its data structure and invoking a previous match operator to fetch more neighbors, such as an NM operator, correlates to completing execution of the matching task. The NM operator storing results in its intermediate level data structure, which is later pulled and used as input for the LNM operator, correlates to marking an output level context of the matching task as ready if one or more neighbors were produced. The parent running out of data and the match operator stopping pulling from the parent, which can result in the match operator not having data, correlates to marking an output level context of the matching task as empty if no neighbors were produced);
marking the input level context as empty if the input level context does not have another child context (Paragraphs 111-112, “For example, the match operator at the level may pull from the parent until either the level data structure is full or the parent runs out of data… For example, rather than iterating over all entries in the parent structure and jumping over invalid ones, an index of valid entries in a list is kept and referenced to iterate only over the indices present in the list. When an index becomes invalid (e.g., all neighbors for that entry were consumed), it is removed from the list.” A parent match operator having all neighbors for every entry consumed by child contexts and therefore deleted would result in the parent match operator running out of data and therefore correlates to marking the input level context as empty if the input level context does not have another child) or waiting if the input level context has another child context (Paragraph 155, “In an embodiment, the branching NM operator controls the state of its children match operators (e.g., ACTIVE/WAITING) to synchronize consumption of matches from the common path pattern prefix. A pull request from all but the last active child match operator results in the children match operators to wait. A pull request from the last active child match operator triggers pulling more matches from the parent match operator (e.g., RNM operator) of the branching NM operator and reactivating all of the children match operators of the branching NM operator if additional matches are available.” The branching NM operator getting a pull request from a non-last active child match operator would involve the branching NM operator having at least one more active child match operator and therefore correlates to marking the input level context as waiting if the input level context has another child context); and
scheduling a next set of matching tasks at a next level of the graph neighbor matching operations (Paragraph 192, “As shown, when an operator, in a sequence of match operators, is asked for tuples or rows by a child operator, if the operator has tuples, then the child operator will directly iterate over the neighbors of its parent operator. However, if the operator does not have tuples, then the operator will ask the same request from its parent operator.” If the operator receives a request for tuples by a child operator and does not have tuples, the operator asking the same request from its parent operator correlates to scheduling a next set of matching tasks at a next level of the graph neighbor matching operations).
Therefore, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Arnaboldi with wherein decomposing graph neighbor matching operations further comprises: in response to completing execution of the matching task: marking an output level context of the matching task as ready if one or more neighbors were produced or empty if no neighbors were produced; marking the input level context as empty if the input level context does not have another child context or waiting if the input level context has another child context; and scheduling a next set of matching tasks at a next level of the graph neighbor matching operations as taught by Haprian because each match operator is associated with a data structure where it stores the current state, such as the matched vertices by the match operator, the number of valid rows, etc. The control flow enables match operators to move data through the different level data structures and interact with the graph table operator. The data structures avoid duplicating vertices along a path, providing enough information for path reconstruction, controlling the memory footprint and support pipelined execution and memory locality (Haprian: paragraphs 75 and 77-80).
With regards to Claim 10, Arnaboldi in view of Haprian teaches the method of Claim 6 above. Haprian further teaches:
wherein executing each given granule of work on the given worker process comprises: scanning leaf level graph match operators to identify a particular level context having valid vertices (Paragraphs 110 and 144, “When the parent of the graph table operator 904 requests rows to process, the newly created operator will call a pull function on the LNM operator 904c to request rows from the graph table operator 904. If the LNM operator 904c still has valid rows in its data structure, then it writes them to the graph table operator's result table 906… As discussed above, a specialization tree contains multiple leaf operators. In order to return paths from all possible specializations, the graph table operator sequentially calls the pull function on the leaf match operators and writes the matches to the graph table result table.” The graph table operator calling the pull function on multiple leaf match operators, where one of the LNM operators can have valid rows in its data structure, correlates to scanning leaf level graph match operators to identify a particular level context having valid vertices);
projecting property values of all paths ending at the particular level context into an output row set (Paragraphs 138 and 143-144, “An important aspect to observe is that the simple path pattern is now transformed into a specialization tree of operators, which might have multiple leaf nodes. Each path from a tree root to a tree leaf corresponds to a path specialization, which generates different paths… A graph table operator executes each of those specialization trees to produce its output row set. The different specialization trees can be computed sequentially or in parallel, in which case, different threads are synchronized when writing results to the output row set… As discussed above, a specialization tree contains multiple leaf operators. In order to return paths from all possible specializations, the graph table operator sequentially calls the pull function on the leaf match operators and writes the matches to the graph table result table. The change at the row set level is one change to the control flow execution when there are multiple leaf operators in a specialization tree.” The graph table operator calling the leaf match operators to write the matches to the graph table result table to product its output row set corresponding to its path from the tree root correlates to projecting property values of all paths ending at the particular level context into an output row set); and marking the particular level context as empty (Paragraphs 111-112, “For example, the match operator at the level may pull from the parent until either the level data structure is full or the parent runs out of data… For example, rather than iterating over all entries in the parent structure and jumping over invalid ones, an index of valid entries in a list is kept and referenced to iterate only over the indices present in the list. When an index becomes invalid (e.g., all neighbors for that entry were consumed), it is removed from the list.” If all of the neighbors for each entry are consumed and the indexes are removed from the list, the match operator would have an empty level data structure and therefore correlates to marking the particular level context as empty).
Therefore, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Arnaboldi with wherein executing each given granule of work on the given worker process comprises: scanning leaf level graph match operators to identify a particular level context having valid vertices; projecting property values of all paths ending at the particular level context into an output row set; and marking the particular level context as empty as taught by Haprian because each match operator is associated with a data structure where it stores the current state, such as the matched vertices by the match operator, the number of valid rows, etc. The control flow enables match operators to move data through the different level data structures and interact with the graph table operator. The data structures avoid duplicating vertices along a path, providing enough information for path reconstruction, controlling the memory footprint and support pipelined execution and memory locality. The graph table operator can then compile path patterns into a set of independent specialization trees to produce its output row set. Different specialization trees can be computed sequentially or in parallel, in which case, different threads are synchronized when writing results to the output row set (Haprian: paragraphs 75, 77-80 and 143).
With regards to Claim 19, the method of Claim 10 performs the same steps as the manufacture of Claim 19, and Claim 19 is therefore rejected using the same rationale set forth above in the rejection of Claim 10.
Claim(s) 11-12 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Arnaboldi in view of Haprian and Ionescu et al. (U.S. Patent No. US 20160301766 A1), hereinafter “Ionescu.”
With regards to Claim 11, Arnaboldi in view of Haprian teaches the method of Claim 1 above. Arnaboldi further teaches:
wherein: the pool of threads comprises a coordinator thread and a plurality of worker threads (Paragraphs 100, 103, and 105 “In an embodiment, populating or later using a chunk may be assigned as a unit of work to a thread… Step 401 is preparatory and performed by a controller such as a master thread. Step 402 and its sub-steps are performed by each of many threads... Step 402 has sub-steps 403-406. In step 402, each of many threads concurrently processes a respective subset of rows of one vertex table of a relational database.” The master thread and the each of the many threads which process rows of one vertex table correlates to the pool of threads comprising a coordinator thread and a plurality of worker threads respectively), and the coordinator thread assigns the plurality of tasks to the plurality of worker threads (Paragraphs 103-105 and 115, “Step 401 is preparatory and performed by a controller such as a master thread. Step 402 and its sub-steps are performed by each of many threads. Step 401 associates, with a vertex type, a respective vertex counter that contains a non-negative integer value that is initially zero and ascending in increments of a fixed or varied amount. Somewhat similar to counter thread safety presented earlier herein, an atomic instruction may be used for counter integrity protection as discussed below. Step 402 has sub-steps 403-406. In step 402, each of many threads concurrently processes a respective subset of rows of one vertex table of a relational database… Step 502 is preparatory. In step 502, each thread executes a database query that retrieves data only for the thread's respective subset of rows of a same edge table, which may be based on a thread safe edge counter that is used in a similar way as the thread safe vertex counter for FIG. 4.” The master thread assigning each thread’s respective subset of rows to process in the vertex table of a relational database correlates to the coordinator thread assigns the plurality of tasks to the plurality of worker threads).
Haprian further teaches:
and the coordinator thread manages level contexts (Paragraphs 145, 215, “In an embodiment, the graph table operator may maintain a map <NODE, STATE> that keeps track of the state of every node in the specialization tree, wherein STATE can be ACTIVE or WAITING. The initial state for all nodes in the specialization tree may be ACTIVE… An operator may be executed by one or more computer processes or threads. Referring to an operator as performing an operation means that a process or thread executing functions or routines of an operator are performing the operation.” The graph table operator maintaining a map to keep track of the state of every node in the specialization tree, where the graph table operator may be executed by one or more threads, correlates to the coordinator thread manages level contexts).
Arnaboldi in view of Haprian does not explicitly teach:
and the coordinator thread performs task creation
However, Ionescu teaches:
and the coordinator thread performs task creation (Paragraph 175, “Master thread at protocol module 327 may then create work items for the job, where a corresponding work item for each of the chunks may be placed in a queue. The work items may be numbered or otherwise identified with a particular chunk of the file and indicate whether the file is to be downloaded or uploaded.” The master thread creating work items for the job correlates to the coordinator thread performing task creation).
Therefore, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Arnaboldi with and the coordinator thread manages level contexts as taught by Haprian because sharing the computation of prefix of a path pattern in the specialization tree requires the graph table operator to implement additional control flow to synchronize all leaf operators feeding on the same branching operators in the specialization tree through maintaining a map of the state of every node in the tree. This control flow enables match operators to move data through the different level data structures and interact with the graph table operator. Additionally, operators can be executed by one or more computer processes or threads (Haprian: paragraphs 75, 145 and 215).
Additionally, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Arnaboldi with and the coordinator thread performs task creation as taught by Ionescu because master threads can send requests or other notifications to protocol modules indicating information about a job such as an upload or download job, the size of the content, number of chunks of the content, metadata associated with the content, including one or more identifiers associated with the content or other metadata. In response to this notification, the primary protocol module may obtain the content in the case of a download request or create a buffer space or other location to store the content in the case of an upload request. Once the work items are created with an indication of the file type and added to a queue, a number of work threads can start to process the work items (Ionescu: paragraphs 174-175).
With regards to Claim 20, the method of Claim 11 performs the same steps as the manufacture of Claim 20, and Claim 20 is therefore rejected using the same rationale set forth above in the rejection of Claim 11.
With regards to Claim 12, Arnaboldi in view of Haprian teaches the method of Claim 1 above. Arnaboldi in view of Haprian does not explicitly teach:
wherein each worker process within the plurality of worker processes requests a new granule of work from a coordinator process after completing execution of a previous granule of work.
However, Ionescu teaches:
wherein each worker process within the plurality of worker processes requests a new granule of work from a coordinator process after completing execution of a previous granule of work (Paragraphs 175 and 177-179, “Master thread at protocol module 327 may then create work items for the job, where a corresponding work item for each of the chunks may be placed in a queue… Each work thread then proceeds to obtain a work item associated with a chunk of the file and request that chunk from the corresponding protocol module 337 at the primary content transfer module 334. When the protocol module 337 at the primary content transfer module 334 receives a request for the chunk of the content, it can access the requested chunk of the content at the primary content server module 334 and return the requested chunk to the requesting work thread... Once the work thread has stored the chunk and updated the indication that the chunk has been downloaded, the work thread may then obtain another work item off the work queue if any more exist. If the all chunks of the content are downloaded successfully (e.g., all work items processed by work threads and all chunks are indicated as downloaded) …” The work thread storing the chunk and updating the indication that the chunk has been downloaded includes the work item being processed by the work thread and correlates to after completing execution of a previous granule of work. The work thread then obtaining another work item off the work queue by making a request to the protocol module, which comprises the master thread, correlates to wherein each worker process within the plurality of worker processes requests a new granule of work from a coordinator process after completing execution of a previous granule of work).
Therefore, it would have been obvious to one of ordinary skill in the art to which said subject matter pertains before the effective filing date of the claimed invention to combine Arnaboldi with wherein each worker process within the plurality of worker processes requests a new granule of work from a coordinator process after completing execution of a previous granule of work as taught by Ionescu because work threads can obtain work items from the corresponding protocol module, which accesses the requested chunk of the content at the primary content server module and returns the requested chunk to the requesting work thread. Once the work thread has stored the chunk and processed the work item, the work thread may then obtain another work item off the work queue if any more exist. If the all chunks of the content are downloaded successfully, the work threads may be cleaned up by the master thread and the master thread may mark the status associated with the content as content downloaded (Ionescu: paragraphs 177-179).
Prior Art Made of Record
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
Xia et al. (U.S. Patent No. US 20210004374 A1); teaching a method of executing concurrent property graph queries with each node storing data for a subgraph shard that contains a range of local vertices that are a subset of all vertices of the property graph. Each subgraph shard also has boundary vertices having edges that connect the subgraph shard to boundary vertices of another subgraph shard. Concurrent queries of the property graph cause queries of the subgraph shards to be scheduled based on the initial vertex, and the property graph is traversed by traversing edge-sets within a subgraph shard on each node.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SELINA HU whose telephone number is (571)272-5428. The examiner can normally be reached Monday-Friday 8:30-5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chat Do can be reached at (571) 272-3721. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
The publicPAIR and privatePAIR systems are no longer available. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
SELINA HU
Examiner
Art Unit 2193
/Chat C Do/Supervisory Patent Examiner, Art Unit 2193