DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office action is in response to the RCE, filed on 11/19/2025, and the amendments, arguments, and remarks, filed on 10/30/2025, in which claims 1-18 are presented for further examination.
Claims 1 and 10 have been amended.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant’s submission filed on 10/30/2025 has been entered.
Response to Amendment
Applicant’s amendments to claims 1 and 10 have been accepted.
Note: The examiner requests that applicant cite where in the specification there is support for applicant’s amendments/additions. Prosecution will be expedited if the examiner does not have to search the entire specification to ensure that applicant has not introduced new matter.
Response to Arguments
Applicant's arguments, filed on 10/30/2025, have been fully considered but they are not persuasive.
Applicant’s arguments with respect to the rejection of claims 1-18 under 35 U.S.C. 102(a)(2), see page 10 of applicant’s remarks, filed on 10/30/2025, have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Specification
The new title of the invention is not descriptive. A new title is required that is clearly indicative of the invention to which the claims are directed.
The title of the invention does not limit the scope of the claims in any way; however, a descriptive title is important because it accurately characterizes the invention and makes it easier to locate and understand during a patent search. Applicant’s title (“Distributed Database System”) does describe the invention, but only at a high level of abstraction. The examiner requests a more descriptive title that gives the reader a better idea of the invention — in particular, one that indicates how applicant’s distributed database system differs from other distributed database systems. It is applicant’s invention, and the Office is allowing applicant the opportunity to amend the title to whatever applicant feels best describes the invention; however, if a satisfactory title is not supplied by applicant, the examiner may, at the time of allowance, change the title by examiner’s amendment. See MPEP § 1302.04.
Claim Objections
Claim 10 is objected to because of the following informalities: in line 18, “by different computer devices” should be corrected to “by different computing devices”. Appropriate correction is required.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-18 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.
Claim 1 is directed to “a database” with a “first plurality of computing devices” and a “second plurality of computing devices”, where the “database” and the “computing devices” are not limited in any fashion. The plain meaning of the term “database” would include software under the broadest reasonable interpretation. The plain meaning of the term “device” would include software under the broadest reasonable interpretation. As such, claim 1 could be interpreted to be implemented purely as software and, thus, is rejected as software per se.
Claims 2-9 inherit the deficiencies of claim 1, from which they depend.
Claim 10 is directed to “a computer readable memory device” with a “first memory section”, a “second memory section”, a “third memory section”, a “first computing device” and a “computing device”, where the “memory device”, “memory sections” and the “computing devices” are not limited in any fashion. The plain meaning of the term “device” would include software under the broadest reasonable interpretation. “Memory section” has not been defined in the claim and includes software under the broadest reasonable interpretation. As such, claim 10 could be interpreted to be implemented purely as software and, thus, is rejected as software per se.
Claims 11-18 inherit the deficiencies of claim 10, from which they depend.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-18 are rejected under 35 U.S.C. 103 as being unpatentable over Petropoulos et al., US 2018/0285418 A1 (hereinafter “Petro”), in view of Watari, US 2013/0297646 A1 (hereinafter “Watari”).
Claim 1
Petro discloses a database system that comprises:
a data input sub-system includes a first plurality of computing devices (Petro, Fig. 3, processing cluster 320a), wherein a data input cluster of computing devices of the first plurality of computing devices is operably coupled to organize respective portions of a data set into a plurality of segments (Petro, [0057], see the ingestion processing 442 [i.e., “data input sub-system”] implements partition assignment 530 to determine which partition 540 of not-structured data set 550 to store the data objects 542 [i.e., where the data objects assigned to a corresponding/respective partition are being interpreted as “a plurality of segments”] in … , partition assignment 530 may generate a hash value (e.g., based on a timestamp or other unique value that may be determined for each data object, such as the order in which the data objects are parsed by parser 510), … . Partition assignment 530 may then implement a consistent hashing technique to assign the data objects based on the generated hash value. … The data objects may be stored 506 according to the partition assignment in not-structured data set 550 in data storage service 230. Each partition may be stored as collection of objects, such as buckets for partitions 540 a, 540 b and 540 n storing data object(s) 542 a, 542 b, and 542 n respectively);
a data store, retrieve, and processing (SRP) sub-system includes a second plurality of computing devices (Petro, Fig. 3, processing cluster 320b), wherein an SRP cluster of computing devices of the second plurality of computing devices is operably coupled to store a first set of segments of the plurality of segments (Petro, [0057], see the ingestion processing 442 [i.e., “data input sub-system”] implements partition assignment 530 to determine which partition 540 of not-structured data set 550 to store the data objects 542 [i.e., where the data objects assigned to a corresponding/respective partition are being interpreted as “a plurality of segments”] in … , partition assignment 530 may generate a hash value (e.g., based on a timestamp or other unique value that may be determined for each data object, such as the order in which the data objects are parsed by parser 510), … . Partition assignment 530 may then implement a consistent hashing technique to assign the data objects based on the generated hash value. … The data objects may be stored 506 according to the partition assignment in not-structured data set 550 in data storage service 230. Each partition may be stored as collection of objects, such as buckets for partitions 540 a, 540 b and 540 n storing data object(s) 542 a, 542 b, and 542 n respectively, see how the partitioning stores the data objects in the corresponding/respective partitions [i.e., “storage clusters”]), wherein a first computing device of the SRP cluster of computing devices stores a first segment of the first set of segments, wherein "x" nodes of the first computing device store "x" divisions of the first segment (Petro, [0058], see FIG. 6 is a logical block diagram illustrating an example processing cluster of a data warehouse service using a not-structured data processing service to perform operations at a remote data store to execute a query, … . 
Processing cluster 600 may be data warehouse service cluster, like processing clusters 320 discussed above with regard to FIG. 3, or another processing cluster that distributes execution of a query among multiple processing nodes, … . As illustrated in this example, a processing cluster 600 may include a leader node 610 and compute nodes 620 a, 620 b, and 620 n, which may communicate with each other over an interconnect (not illustrated), in one embodiment. Leader node 610 may implement query planning 612 (discussed in detail below with regard to FIG. 7) to generate query plan(s) and instructions 614 for executing queries on processing cluster 600 that perform tiered data processing, … . As described herein, each node in a processing cluster 600 may include attached storage, such as attached storage 622 a, 622 b, and 622 n, on which a database (or portions thereof) may be stored [i.e., where the corresponding/respective compute node stores its part of the data to process] on behalf of clients (e.g., users, client applications, and/or storage service subscribers), …); and
a query and result sub-system includes a third plurality of computing devices (Petro, Fig. 3, processing cluster 320n), wherein a QR cluster of computing devices of the third plurality of computing devices is operably coupled to:
generate an optimized query plan in accordance with a query or an adjusted query (Petro, [0068], see FIG. 7 is a logical block diagram illustrating an example query planning engine that generates query plans for performing tiered data processing, according to some embodiments. Query planning 712 may implement parser 710 to receive a query statement, such as a SQL statement, and determine the various requested operations to perform as a result of the query. For example, parser 710 may generate a query tree for a given query input string to separate out the various query clauses, fields, predicates, conditions, commands, or other query information for planning and optimization. Query planning 712 may implement query optimizer 720 to rewrite the parsed query based on metadata that describes both the local data and remote data);
generate a distribution plan regarding the optimized query plan, wherein the distribution plan allocates a first portion of the optimized query plan to the SRP cluster of computing devices (Petro, [0071], see the optimized query plan provided to plan generator 730. Plan generator 730 may perform various operations to generate a query execution plan (e.g., a tree of plan operation nodes, which may be later used to generate query execution code). For example, plan generator may perform a cost-based optimization to select one of various combinations or orderings of plan operator nodes in a tree produces a least costly plan to execute. Plan generator 730 may also implement not-structured data operation selection, which may use local 740 or remote 760 metadata to determine what operations to perform in order to satisfy the query (e.g., full text searches, predicates, or other conditions, etc.), … . For example, not-structured data operation selection may receive a list of predicates as part of query 702 and along with a list of partitions (for local and/or remote data) along with range values or other information describing the values stored within the partitions (e.g., timestamp values). If an evaluation of a predicate compared with the range values or other value description information were to exclude that partition from satisfying the query predicate (e.g., timestamp values in the partition are out of a timestamp range for the query predicate), then operations to evaluate (e.g., scan) the partition may be removed, in one embodiment. In scenarios where the partitions removed are partitions of remote data, in addition to saving processing costs, removal of partitions would save transmission costs (e.g., network bandwidth) to move results from remote data))
send the optimized query plan and the distribution plan to the SRP sub-system, wherein the SRP cluster of computing devices receives the first portion of the optimized query plan (Petro, Fig. 10, step 1030 “Initiate performance of the stateless operation(s) at remote query processing engine(s) with respect to not-structured portion as part of executing the query plan”, where in order to process the query, it must first be sent to the respective query processing engine(s));
receive first result components regarding the first portion of the optimized query plan from the SRP cluster of computing devices (Petro, Fig. 10, step 1040 “Receive result(s) from the remote query processing engine(s) for the stateless operation(s)”); and
generate a query result at least partially based on the first result components (Petro, Fig. 10, step 1050 “Generate a final result based, at least in part, on the result(s) from the remote query engine(s)”).
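For clarity of the record, the partition-pruning behavior cited above from Petro [0071] — comparing a query predicate’s range against per-partition value ranges and removing partitions that cannot satisfy the predicate — can be illustrated by the following simplified sketch. This is an illustration only, not code from Petro; the function name and partition dictionaries are hypothetical.

```python
def prune_partitions(partitions, predicate_min, predicate_max):
    """Keep only partitions whose [min, max] value range overlaps the predicate range."""
    kept = []
    for part in partitions:
        # A partition can satisfy the predicate only if the ranges overlap.
        if part["max"] >= predicate_min and part["min"] <= predicate_max:
            kept.append(part["name"])
    return kept

# Hypothetical partitions with timestamp-like range metadata.
parts = [
    {"name": "540a", "min": 0, "max": 10},
    {"name": "540b", "min": 20, "max": 30},
]
# A predicate over [5, 15] excludes partition 540b, so no operation
# (and, for remote data, no network transfer) is scheduled for it.
pruned = prune_partitions(parts, 5, 15)  # -> ['540a']
```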
Petro does not appear to explicitly disclose concurrently organize respective portions of a data set into a plurality of segments;
process, in parallel, a first set of segments of the plurality of segments;
for parallel execution with other portions of the optimized query plan on corresponding sets of segments by different computing devices of the SRP sub-system.
Watari discloses concurrently organize respective portions of a data set into a plurality of segments (Watari, [0013], see reading a query execution plan, assigning at least one thread [i.e., threads run concurrently] to each of the plurality of data sources [i.e., each data source is a “respective portion of a data set” and where each data source is a “segment”], and generating a result);
process, in parallel, a first set of segments of the plurality of segments (Watari, [0008], see execute a query execution plan using a first processor of the plurality of processors concurrently with a second processor of the plurality of processors, wherein the first processor processes a first data source of the plurality of data sources identified in the query execution plan and the second processor processes a second data source of the plurality of data sources identified in the query execution plan);
for parallel execution with other portions of the optimized query plan on corresponding sets of segments by different computing devices of the SRP sub-system (Watari, [0008], see execute a query execution plan using a first processor of the plurality of processors concurrently with a second processor of the plurality of processors, wherein the first processor processes a first data source of the plurality of data sources identified in the query execution plan and the second processor processes a second data source of the plurality of data sources identified in the query execution plan).
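The parallel execution Watari [0008] describes — distinct processors concurrently applying the same query execution plan to different data sources — may be illustrated by the following simplified sketch. Thread-based workers and all identifiers here are the examiner’s hypothetical choices for illustration, not Watari’s implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def execute_plan_on_source(source):
    """Apply the plan's operation (here, an even-value filter) to one data source."""
    return [row for row in source if row % 2 == 0]

# One data source per worker, processed concurrently.
data_sources = [[1, 2, 3, 4], [5, 6, 7, 8]]
with ThreadPoolExecutor(max_workers=len(data_sources)) as pool:
    partial_results = list(pool.map(execute_plan_on_source, data_sources))

# Partial results from each source are merged into the final result.
result = [row for partial in partial_results for row in partial]  # -> [2, 4, 6, 8]
```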
Petro and Watari are analogous art because they are from the same field of endeavor of processing large amounts of data.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention, having the teachings of Petro and Watari before him/her, to modify the data processing of Petro to include the parallel processing of Watari because it would speed processing.
The suggestion/motivation for doing so would have been to increase efficiency over conventional query execution processes, see Watari, [0024].
Therefore, it would have been obvious to combine Watari with Petro to obtain the invention as specified in the instant claim(s).
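The hash-based partition assignment cited above from Petro [0057] — generating a hash from a per-object timestamp or other unique value and using it to select the partition (bucket) in which the object is stored — may be sketched as follows. This simplified illustration uses modulo hashing rather than the consistent hashing technique Petro names, and every identifier is hypothetical.

```python
import hashlib

NUM_PARTITIONS = 3  # e.g., partitions 540a, 540b, 540n

def assign_partition(unique_value, num_partitions=NUM_PARTITIONS):
    """Deterministically map a data object's unique value to a partition index."""
    digest = hashlib.sha256(unique_value.encode()).hexdigest()
    return int(digest, 16) % num_partitions

# Group parsed data objects into their assigned partitions.
partitions = {p: [] for p in range(NUM_PARTITIONS)}
for name, timestamp in [("obj_542a", "t0"), ("obj_542b", "t1"), ("obj_542n", "t2")]:
    partitions[assign_partition(timestamp)].append(name)
```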
Claim 2
With respect to claim 2, the combination of Petro and Watari further discloses:
the first computing device of the SRP cluster of computing devices executes a first sub-portion of the first portion of the optimized query plan on the first segment of the first set of segments (See below); and
a first node of the "x" nodes of the first computing device executes a first division of the first sub-portion of the first portion of the optimized query plan on a first division of the first segment of the first set of segments (Petro, [0058], see processing cluster 600 may be data warehouse service cluster, like processing clusters 320 discussed above with regard to FIG. 3, or another processing cluster that distributes execution of a query among multiple processing nodes, … . As illustrated in this example, a processing cluster 600 may include a leader node 610 and compute nodes 620 a, 620 b, and 620 n, which may communicate with each other over an interconnect (not illustrated), in one embodiment. Leader node 610 may implement query planning 612 (discussed in detail below with regard to FIG. 7) to generate query plan(s) and instructions 614 for executing queries on processing cluster 600 that perform tiered data processing, … . As described herein, each node in a processing cluster 600 may include attached storage, such as attached storage 622 a, 622 b, and 622 n, on which a database (or portions thereof) may be stored [i.e., where the corresponding/respective compute node stores its part of the data to process] on behalf of clients (e.g., users, client applications, and/or storage service subscribers), …; and Petro, [0062], see processing cluster 600 may also include compute nodes, such as compute nodes 620 a, 620 b, and 620 n. Compute nodes 620 be implemented on servers or other computing devices, such as those described below with regard to computer system 2000 in FIG. 13, and each may include individual query processing “slices” defined, for example, for each core of a server's multi-core processor, one or more query processing engine(s), such as query engine(s) 624 a, 624 b, and 624 n, to execute the instructions 614 or otherwise perform the portions of the query plan assigned to the compute node, ... . 
Query engine(s) 624 may access a certain memory and disk space in order to process a portion of the workload for a query (or other database operation) that is sent to one or more of the compute nodes 620. Query engine 624 may access attached storage, such as 622 a, 622 b, and 622 n, to perform local operation(s), …).
Claims 3 and 12
With respect to claims 3 and 12, the combination of Petro and Watari further discloses:
"n" processing core resources of the first node of the "x" nodes of the first computing device store "n" partitions of first division of the first segment of the first set of segments (Petro, [0062], see processing cluster 600 may also include compute nodes, such as compute nodes 620 a, 620 b, and 620 n. Compute nodes 620 be implemented on servers or other computing devices, such as those described below with regard to computer system 2000 in FIG. 13, and each may include individual query processing “slices” defined, for example, for each core of a server's multi-core processor, one or more query processing engine(s), such as query engine(s) 624 a, 624 b, and 624 n, to execute the instructions 614 or otherwise perform the portions of the query plan assigned to the compute node, ... . Query engine(s) 624 may access a certain memory and disk space in order to process a portion of the workload for a query (or other database operation) that is sent to one or more of the compute nodes 620. Query engine 624 may access attached storage, such as 622 a, 622 b, and 622 n, to perform local operation(s), …, see assigning a portion of the query to a compute node and each core of a server’s multi-core processor in the processing cluster).
Claims 4 and 13
With respect to claims 4 and 13, the combination of Petro and Watari further discloses:
a first processing core of the "n" processing core resources of the first node of the "x" nodes of the first computing device executes a first sub-division of the first division of the first sub-portion of the first portion of the optimized query plan on a first sub-division of the first division of the first segment of the first set of segments (Petro, [0062], see processing cluster 600 may also include compute nodes, such as compute nodes 620 a, 620 b, and 620 n. Compute nodes 620 be implemented on servers or other computing devices, such as those described below with regard to computer system 2000 in FIG. 13, and each may include individual query processing “slices” defined, for example, for each core of a server's multi-core processor, one or more query processing engine(s), such as query engine(s) 624 a, 624 b, and 624 n, to execute the instructions 614 or otherwise perform the portions of the query plan assigned to the compute node, ... . Query engine(s) 624 may access a certain memory and disk space in order to process a portion of the workload for a query (or other database operation) that is sent to one or more of the compute nodes 620. Query engine 624 may access attached storage, such as 622 a, 622 b, and 622 n, to perform local operation(s), …, see assigning a portion of the query to a compute node and each core of a server’s multi-core processor in the processing cluster).
Claim 5
With respect to claim 5, the combination of Petro and Watari further discloses:
a second SRP cluster operably coupled to store a second set of segments of the plurality of segments, wherein a first computing device of the second SRP cluster stores a first segment of the second set of segments, wherein "x" nodes of the first computing device of the second SRP cluster store "x" divisions of the first segment of the second set of segments (Petro, [0058], see FIG. 6 is a logical block diagram illustrating an example processing cluster of a data warehouse service using a not-structured data processing service to perform operations at a remote data store to execute a query, … . Processing cluster 600 may be data warehouse service cluster, like processing clusters 320 discussed above with regard to FIG. 3, or another processing cluster that distributes execution of a query among multiple processing nodes, … . As illustrated in this example, a processing cluster 600 may include a leader node 610 and compute nodes 620 a, 620 b, and 620 n, which may communicate with each other over an interconnect (not illustrated), in one embodiment. Leader node 610 may implement query planning 612 (discussed in detail below with regard to FIG. 7) to generate query plan(s) and instructions 614 for executing queries on processing cluster 600 that perform tiered data processing, … . As described herein, each node in a processing cluster 600 may include attached storage, such as attached storage 622 a, 622 b, and 622 n, on which a database (or portions thereof) may be stored [i.e., where the corresponding/respective compute node stores its part of the data to process] on behalf of clients (e.g., users, client applications, and/or storage service subscribers), …; and Petro, Fig. 6, see compute node 620a with query engine(s) 624a and attached storage 622a [i.e., “first storage cluster”] separate from compute node 620b with query engine(s) 624b and attached storage 622b [i.e., “second storage cluster”]).
Claim 6
With respect to claim 6, the combination of Petro and Watari further discloses:
the SRP cluster of computing devices is operably coupled to:
receive the optimized query plan and the distribution plan (Petro, [0071], see the optimized query plan may then be provided to plan generator 730. Plan generator 730 may perform various operations to generate a query execution plan (e.g., a tree of plan operation nodes, which may be later used to generate query execution code));
identify a plurality of SRP clusters based on the distribution plan (Petro, [0012], see FIG. 10 is a high-level flowchart illustrating methods and techniques to generate a query execution plan at a query engine [i.e., “storage cluster”] to apply a query to not-structured data at one or more remote query engines [i.e., “storage clusters”], … .);
send the first portion of the optimized query plan and a first corresponding portion of the distribution plan to a computing device of the first SRP cluster (See below); and
send a second portion of the optimized query plan and a second corresponding portion of the distribution plan to a computing device of a second SRP cluster (Petro, [0084], see a query execution plan [i.e., “distribution plan”] generated for the query that include(s) stateless operations to apply the query to the not-structured portion of the data set, … . Local or remote metadata (from a metadata store like data catalog service 240) may be evaluated to determine what operations to perform in order to satisfy the query (e.g., full text searches, predicates, or other conditions, etc.), … .; and Petro, [0085], see performance of the operation(s) may be initiated at one or more remote query processing engine(s) with respect to the not-structured portion as part of executing the query plan, in some embodiments. For example, the query engine may begin performing local operations according to the query plan [i.e., “first portion of the optimized query plan and a first corresponding portion of the distribution plan to a computing device of the first storage cluster”] and upon reaching a remote operation, send one or more requests to complete the remote operation to a remote query engine (or engines) [i.e., “second portion of the optimized query plan and a second corresponding portion of the distribution plan to a computing device of a second storage cluster”], such as processing nodes 640 in not-structured data processing service 220 in FIG. 2 above. The query engine may, in some embodiments, continue performing local operations until results for the remote operations are needed to continue executing in accordance with the query execution plan).
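The local/remote split cited from Petro [0085] — a query engine performing local plan operations itself while forwarding remote operations to a remote query engine and combining the returned results — may be sketched as follows. This is illustrative only; the plan structure and all identifiers are hypothetical.

```python
def execute_plan(plan, send_remote):
    """Run local operations inline; forward remote operations to a remote engine."""
    results = []
    for op in plan:
        if op["where"] == "local":
            results.append(op["run"]())       # first portion: executed locally
        else:
            results.append(send_remote(op))   # second portion: sent to remote cluster
    return results

# Hypothetical two-operation plan: one local scan, one remote scan.
plan = [
    {"where": "local", "run": lambda: "local-scan-result"},
    {"where": "remote", "name": "scan-partition-540b"},
]
out = execute_plan(plan, send_remote=lambda op: "remote:" + op["name"])
# -> ['local-scan-result', 'remote:scan-partition-540b']
```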
Claim 7
With respect to claim 7, the combination of Petro and Watari further discloses:
the computing device of the first SRP cluster is operably coupled to:
receive the first portion of the optimized query plan and the first corresponding portion of the distribution plan (See below);
partition the first portion of the optimized query plan into a plurality of sub-portions based on the corresponding portion of the distribution plan (See below);
send a first sub-portion of the plurality of sub-portions of the first portion of the optimized query plan and a first sub-portion of the first corresponding portion of the distribution plan to the first computing device of the first SRP cluster (See below); and
send a second sub-portion of the plurality of sub-portions of the first portion of the optimized query plan and a second sub-portion of the first corresponding portion of the distribution plan to a second computing device of the first SRP cluster (Petro, [0058], see a processing cluster 600 includes a leader node 610 and compute nodes 620 a, 620 b, and 620 n, which may communicate with each other over an interconnect (not illustrated), in one embodiment. Leader node 610 implement query planning 612 (discussed in detail below with regard to FIG. 7) to generate query plan(s) and instructions 614 for executing queries on processing cluster 600 that perform tiered data processing, where this discloses that the leader node directs the other nodes to process the query against a specific portion of the data; Petro, [0064], see the processing nodes 640 access metadata stored along with the data object in the data storage service to check whether the operation will return any results out the data object. … , range values for a timestamp may be used as an index to check whether any data objects in a partition have a timestamp value within a range specified by the operation [i.e., data objects are partitioned among storage and the processing nodes only processes the data objects assigned to its assigned partition]. If not, then processing node 640 may return an empty result without accessing the assigned partition. … , multiple processing nodes 640 may receive operations from a single compute node 620, so that processing operations for not-structured data may be highly parallelized. In at least some embodiments, an operation requests for each partition of a not-structured data set may be processed by a different processing node, which could initiate processing operations at a large number of nodes (e.g., 1,000 processing nodes 640) reporting results to a significantly smaller number of compute nodes 620 (e.g., 4 compute nodes). 
… ; and Petro, [0068], see query planning 712 may implement query optimizer 720 to rewrite the parsed query based on metadata that describes both the local data and remote data. …, as illustrated in FIG. 7, query optimizer 720 has access to local metadata 740 (e.g., table descriptions or definitions, including the names and data types of each column, physical information (e.g., partitions information), number of rows, number of distinct values, value ranges, value cardinality, value distribution, indexes, views, etc.), to rewrite portions of a query tree, such as changing the location or ordering of predicates, join operations, or other portions or operations in the query tree, which discloses that queries are optimized based on which partition the data objects are stored in).
Claim 8
With respect to claim 8, the combination of Petro and Watari further discloses:
the first computing device of the first SRP cluster is operably coupled to:
receive the first sub-portion of the plurality of sub-portions of the first portion of the optimized query plan and the first sub-portion of the first corresponding portion of the distribution plan (See below);
allocate a first division of the first sub-portion of the first portion of the optimized query plan to a first node of the "x" nodes of the first computing device based on the first sub-portion of the first corresponding portion of the distribution plan (See below); and
allocate a second division of the first sub-portion of the first portion of the optimized query plan to a second node of the "x" nodes of the first computing device based on the first sub-portion of the first corresponding portion of the distribution plan (Petro, [0058], see a processing cluster 600 includes a leader node 610 and compute nodes 620 a, 620 b, and 620 n, which may communicate with each other over an interconnect (not illustrated), in one embodiment. Leader node 610 implement query planning 612 (discussed in detail below with regard to FIG. 7) to generate query plan(s) and instructions 614 for executing queries on processing cluster 600 that perform tiered data processing, where this discloses that the leader node directs the other nodes to process the query against a specific portion of the data; and Petro, [0062], see processing cluster 600 may also include compute nodes, such as compute nodes 620 a, 620 b, and 620 n. Compute nodes 620 be implemented on servers or other computing devices, such as those described below with regard to computer system 2000 in FIG. 13, and each may include individual query processing “slices” defined, for example, for each core of a server's multi-core processor, one or more query processing engine(s), such as query engine(s) 624 a, 624 b, and 624 n, to execute the instructions 614 or otherwise perform the portions of the query plan assigned to the compute node, ... . Query engine(s) 624 may access a certain memory and disk space in order to process a portion of the workload for a query (or other database operation) that is sent to one or more of the compute nodes 620. Query engine 624 may access attached storage, such as 622 a, 622 b, and 622 n, to perform local operation(s), …, see assigning a portion of the query to a compute node and each core of a server’s multi-core processor in the processing cluster).
Claims 9 and 18
With respect to claims 9 and 18, the combination of Petro and Watari further discloses:
the first node of the "x" nodes of the first computing device is operably coupled to:
allocate a first sub-division of the first division of the first sub-portion of the first portion of the optimized query plan to a first processing core resource of "n" processing core resources of the "x" nodes of the first computing device based on a first sub-division of the first sub-portion of the first corresponding portion of the distribution plan (See below); and
allocate a second sub-division of the first division of the first sub-portion of the first portion of the optimized query plan to a second processing core resource of the "n" processing core resources of the "x" nodes of the first computing device based on a second sub-division of the first sub-portion of the first corresponding portion of the distribution plan (Petro, [0062], see processing cluster 600 may also include compute nodes, such as compute nodes 620 a, 620 b, and 620 n. Compute nodes 620 may be implemented on servers or other computing devices, such as those described below with regard to computer system 2000 in FIG. 13, and each may include individual query processing “slices” defined, for example, for each core of a server's multi-core processor, one or more query processing engine(s), such as query engine(s) 624 a, 624 b, and 624 n, to execute the instructions 614 or otherwise perform the portions of the query plan assigned to the compute node, ... . Query engine(s) 624 may access a certain memory and disk space in order to process a portion of the workload for a query (or other database operation) that is sent to one or more of the compute nodes 620. Query engine 624 may access attached storage, such as 622 a, 622 b, and 622 n, to perform local operation(s), …, see assigning a portion of the query to a compute node and each core of a server’s multi-core processor in the processing cluster).
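For purposes of illustration only, and not as evidence of record, the per-core "slice" assignment that Petro [0062] describes may be sketched roughly as follows. All identifiers and the round-robin policy are hypothetical and do not appear in Petro:

```python
# Illustrative sketch (not of record): dividing a query-plan sub-portion
# among per-core processing "slices" in the manner Petro [0062] describes.
# The round-robin policy and all names are hypothetical.

def allocate_to_slices(divisions, num_cores):
    """Assign each division of a plan sub-portion to a core slice, round-robin."""
    slices = {core: [] for core in range(num_cores)}
    for i, division in enumerate(divisions):
        slices[i % num_cores].append(division)
    return slices

# Example: four divisions spread across two processing core resources.
assignment = allocate_to_slices(["div-1", "div-2", "div-3", "div-4"], 2)
# slice 0 receives div-1 and div-3; slice 1 receives div-2 and div-4
```

The sketch merely shows one way a node could map divisions of its assigned sub-portion onto "n" processing core resources; Petro does not specify the assignment policy.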
Claim 10
Petro discloses a computer readable memory device (Petro, [0096], see system memory 2020) comprises:
a first memory section (Petro, [0096], see system memory 2020) that stores operational instructions that, when executed by a data input sub-system of a database system, causes the data input sub-system to:
organize respective portions of a data set into a plurality of segments (Petro, [0057], see the ingestion processing 442 [i.e., “data input sub-system”] implements partition assignment 530 to determine which partition 540 of not-structured data set 550 to store the data objects 542 [i.e., where the data objects assigned to a corresponding/respective partition are being interpreted as “a plurality of segments”] in … , partition assignment 530 may generate a hash value (e.g., based on a timestamp or other unique value that may be determined for each data object, such as the order in which the data objects are parsed by parser 510), … . Partition assignment 530 may then implement a consistent hashing technique to assign the data objects based on the generated hash value. … The data objects may be stored 506 according to the partition assignment in not-structured data set 550 in data storage service 230. Each partition may be stored as collection of objects, such as buckets for partitions 540 a, 540 b and 540 n storing data object(s) 542 a, 542 b, and 542 n respectively);
a second memory section (Petro, [0096], see system memory 2020) that stores operational instructions that, when executed by a data store, retrieve, and processing (SRP) sub-system of a database system, causes the data SRP sub-system to:
store the plurality of segments, wherein a first storage cluster stores a first set of segments of the plurality of segments (Petro, [0057], see the ingestion processing 442 [i.e., “data input sub-system”] implements partition assignment 530 to determine which partition 540 of not-structured data set 550 to store the data objects 542 [i.e., where the data objects assigned to a corresponding/respective partition are being interpreted as “a plurality of segments”] in … , partition assignment 530 may generate a hash value (e.g., based on a timestamp or other unique value that may be determined for each data object, such as the order in which the data objects are parsed by parser 510), … . Partition assignment 530 may then implement a consistent hashing technique to assign the data objects based on the generated hash value. … The data objects may be stored 506 according to the partition assignment in not-structured data set 550 in data storage service 230. Each partition may be stored as collection of objects, such as buckets for partitions 540 a, 540 b and 540 n storing data object(s) 542 a, 542 b, and 542 n respectively, see how the partitioning stores the data objects in the corresponding/respective partitions [i.e., “storage clusters”]), wherein a first computing device of the first storage cluster stores a first segment of the first set of segments, wherein "x" nodes of the first computing device store "x" divisions of the first segment (Petro, [0058], see FIG. 6 is a logical block diagram illustrating an example processing cluster of a data warehouse service using a not-structured data processing service to perform operations at a remote data store to execute a query, … . Processing cluster 600 may be data warehouse service cluster, like processing clusters 320 discussed above with regard to FIG. 3, or another processing cluster that distributes execution of a query among multiple processing nodes, … .
As illustrated in this example, a processing cluster 600 may include a leader node 610 and compute nodes 620 a, 620 b, and 620 n, which may communicate with each other over an interconnect (not illustrated), in one embodiment. Leader node 610 may implement query planning 612 (discussed in detail below with regard to FIG. 7) to generate query plan(s) and instructions 614 for executing queries on processing cluster 600 that perform tiered data processing, … . As described herein, each node in a processing cluster 600 may include attached storage, such as attached storage 622 a, 622 b, and 622 n, on which a database (or portions thereof) may be stored [i.e., where the corresponding/respective compute node stores its part of the data to process] on behalf of clients (e.g., users, client applications, and/or storage service subscribers), …); and
a third memory section (Petro, [0096], see system memory 2020) that stores operational instructions that, when executed by a query and result sub-system of a database system, causes the query and result sub-system to:
generate an optimized query plan in accordance with a query or an adjusted query (Petro, [0068], see FIG. 7 is a logical block diagram illustrating an example query planning engine that generates query plans for performing tiered data processing, according to some embodiments. Query planning 712 may implement parser 710 to receive a query statement, such as a SQL statement, and determine the various requested operations to perform as a result of the query. For example, parser 710 may generate a query tree for a given query input string to separate out the various query clauses, fields, predicates, conditions, commands, or other query information for planning and optimization. Query planning 712 may implement query optimizer 720 to rewrite the parsed query based on metadata that describes both the local data and remote data);
generate a distribution plan regarding the optimized query plan, wherein the distribution plan allocates a first portion of the optimized query plan to the first storage cluster (Petro, [0071], see the optimized query plan may then be provided to plan generator 730. Plan generator 730 may perform various operations to generate a query execution plan (e.g., a tree of plan operation nodes, which may be later used to generate query execution code). For example, plan generator may perform a cost-based optimization to select one of various combinations or orderings of plan operator nodes in a tree that produces a least costly plan to execute. Plan generator 730 may also implement not-structured data operation selection, which may use local 740 or remote 760 metadata to determine what operations to perform in order to satisfy the query (e.g., full text searches, predicates, or other conditions, etc.), … . For example, not-structured data operation selection may receive a list of predicates as part of query 702 and along with a list of partitions (for local and/or remote data) along with range values or other information describing the values stored within the partitions (e.g., timestamp values). If an evaluation of a predicate compared with the range values or other value description information were to exclude that partition from satisfying the query predicate (e.g., timestamp values in the partition are out of a timestamp range for the query predicate), then operations to evaluate (e.g., scan) the partition may be removed, in one embodiment. In scenarios where the partitions removed are partitions of remote data, in addition to saving processing costs, removal of partitions would save transmission costs (e.g., network bandwidth) to move results from remote data);
send the optimized query plan and the distribution plan to a computing device of the data SRP sub-system (Petro, Fig. 10, step 1030 “Initiate performance of the stateless operation(s) at remote query processing engine(s) with respect to not-structured portion as part of executing the query plan”, where in order to process the query, it must first be sent to the respective query processing engine(s));
receive first result components regarding a first portion of the optimized query plan from the first storage cluster (Petro, Fig. 10, step 1040 “Receive result(s) from the remote query processing engine(s) for the stateless operation(s)”); and
generate a query result at least partially based on the first result components (Petro, Fig. 10, step 1050 “Generate a final result based, at least in part, on the result(s) from the remote query engine(s)”).
Petro does not appear to explicitly disclose concurrently organize respective portions of a data set into a plurality of segments;
process, in parallel, the plurality of segments;
for parallel execution with other portions of the optimized query plan on corresponding sets of segments by different computer devices of the SRP sub-system.
Watari discloses concurrently organize respective portions of a data set into a plurality of segments (Watari, [0013], see reading a query execution plan, assigning at least one thread [i.e., threads run concurrently] to each of the plurality of data sources [i.e., each data source is a “respective portion of a data set” and where each data source is a “segment”], and generating a result);
process, in parallel, the plurality of segments (Watari, [0008], see execute a query execution plan using a first processor of the plurality of processors concurrently with a second processor of the plurality of processors, wherein the first processor processes a first data source of the plurality of data sources identified in the query execution plan and the second processor processes a second data source of the plurality of data sources identified in the query execution plan);
for parallel execution with other portions of the optimized query plan on corresponding sets of segments by different computer devices of the SRP sub-system (Watari, [0008], see execute a query execution plan using a first processor of the plurality of processors concurrently with a second processor of the plurality of processors, wherein the first processor processes a first data source of the plurality of data sources identified in the query execution plan and the second processor processes a second data source of the plurality of data sources identified in the query execution plan).
See claim 1 above for the motivation to combine.
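As an aside, and purely to illustrate the consistent hashing technique cited from Petro [0057] (not as evidence of record), a minimal sketch follows. The partition names, the choice of a timestamp as the hashed value, and the use of MD5 are hypothetical; Petro does not specify a hash function:

```python
import bisect
import hashlib

# Illustrative sketch (not of record) of consistent-hashing partition
# assignment as described in Petro [0057]: each data object is hashed
# (e.g., on a timestamp) and assigned to a partition on a hash ring.
# All identifiers are hypothetical.

def _ring_position(key: str) -> int:
    """Deterministic position on the hash ring for a given key."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

def build_ring(partitions):
    """Place each partition at its hashed position, sorted around the ring."""
    return sorted((_ring_position(p), p) for p in partitions)

def assign_partition(ring, object_key: str):
    """Assign a data object to the first partition at or after its hash,
    wrapping around to the start of the ring if necessary."""
    pos = _ring_position(object_key)
    idx = bisect.bisect(ring, (pos, "")) % len(ring)
    return ring[idx][1]

ring = build_ring(["partition-540a", "partition-540b", "partition-540n"])
bucket = assign_partition(ring, "2025-11-19T08:00:00Z")  # e.g., a timestamp key
```

Because the assignment depends only on the hashed key, the same data object always maps to the same partition, which is the property the cited passage relies on.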
Claim 11
With respect to claim 11, the combination of Petro and Watari discloses wherein the second memory section further stores operational instructions that, when executed by a first computing device of the first storage cluster of the data SRP sub-system of the database system, causes the first computing device to:
execute a first sub-portion of the first portion of the optimized query plan on the first segment of the first set of segments (See below), wherein a first node of the "x" nodes of the first computing device executes a first division of the first sub-portion of the first portion of the optimized query plan on a first division of the first segment of the first set of segments (Petro, [0058], see processing cluster 600 may be data warehouse service cluster, like processing clusters 320 discussed above with regard to FIG. 3, or another processing cluster that distributes execution of a query among multiple processing nodes, … . As illustrated in this example, a processing cluster 600 may include a leader node 610 and compute nodes 620 a, 620 b, and 620 n, which may communicate with each other over an interconnect (not illustrated), in one embodiment. Leader node 610 may implement query planning 612 (discussed in detail below with regard to FIG. 7) to generate query plan(s) and instructions 614 for executing queries on processing cluster 600 that perform tiered data processing, … . As described herein, each node in a processing cluster 600 may include attached storage, such as attached storage 622 a, 622 b, and 622 n, on which a database (or portions thereof) may be stored [i.e., where the corresponding/respective compute node stores its part of the data to process] on behalf of clients (e.g., users, client applications, and/or storage service subscribers), …; and Petro, [0062], see processing cluster 600 may also include compute nodes, such as compute nodes 620 a, 620 b, and 620 n. Compute nodes 620 may be implemented on servers or other computing devices, such as those described below with regard to computer system 2000 in FIG. 13, and each may include individual query processing “slices” defined, for example, for each core of a server's multi-core processor, one or more query processing engine(s), such as query engine(s) 624 a, 624 b, and 624 n, to execute the instructions 614 or otherwise perform the portions of the query plan assigned to the compute node, ... . Query engine(s) 624 may access a certain memory and disk space in order to process a portion of the workload for a query (or other database operation) that is sent to one or more of the compute nodes 620. Query engine 624 may access attached storage, such as 622 a, 622 b, and 622 n, to perform local operation(s), …).
Claim 14
With respect to claim 14, the combination of Petro and Watari discloses wherein the second memory section further stores operational instructions that, when executed by a second storage cluster of the data SRP sub-system of the database system, causes the second storage cluster to:
store a second set of segments of the plurality of segments, wherein a first computing device of the second storage cluster stores a first segment of the second set of segments, wherein "x" nodes of the second computing device store "x" divisions of the second segment (Petro, [0058], see FIG. 6 is a logical block diagram illustrating an example processing cluster of a data warehouse service using a not-structured data processing service to perform operations at a remote data store to execute a query, … . Processing cluster 600 may be data warehouse service cluster, like processing clusters 320 discussed above with regard to FIG. 3, or another processing cluster that distributes execution of a query among multiple processing nodes, … . As illustrated in this example, a processing cluster 600 may include a leader node 610 and compute nodes 620 a, 620 b, and 620 n, which may communicate with each other over an interconnect (not illustrated), in one embodiment. Leader node 610 may implement query planning 612 (discussed in detail below with regard to FIG. 7) to generate query plan(s) and instructions 614 for executing queries on processing cluster 600 that perform tiered data processing, … . As described herein, each node in a processing cluster 600 may include attached storage, such as attached storage 622 a, 622 b, and 622 n, on which a database (or portions thereof) may be stored [i.e., where the corresponding/respective compute node stores its part of the data to process] on behalf of clients (e.g., users, client applications, and/or storage service subscribers), …; and Petro, Fig. 6, see compute node 620a with query engine(s) 624a and attached storage 622a [i.e., “first storage cluster”] separate from compute node 620b with query engine(s) 624b and attached storage 622b [i.e., “second storage cluster”]).
Claim 15
With respect to claim 15, the combination of Petro and Watari further discloses:
a fourth memory section (Petro, [0096], see system memory 2020) that stores operational instructions that, when executed by the computing device of the data SRP sub-system, causes the computing device to:
receive the optimized query plan and the distribution plan (Petro, [0071], see the optimized query plan may then be provided to plan generator 730. Plan generator 730 may perform various operations to generate a query execution plan (e.g., a tree of plan operation nodes, which may be later used to generate query execution code));
identify a plurality of storage clusters of the data SRP sub-system based on the distribution plan (Petro, [0012], see FIG. 10 is a high-level flowchart illustrating methods and techniques to generate a query execution plan at a query engine [i.e., “storage cluster”] to apply a query to not-structured data at one or more remote query engines [i.e., “storage clusters”], … .);
send the first portion of the optimized query plan and a first corresponding portion of the distribution plan to a computing device of the first storage cluster (See below); and
send a second portion of the optimized query plan and a second corresponding portion of the distribution plan to a computing device of a second storage cluster (Petro, [0084], see a query execution plan [i.e., “distribution plan”] generated for the query that include(s) stateless operations to apply the query to the not-structured portion of the data set, … . Local or remote metadata (from a metadata store like data catalog service 240) may be evaluated to determine what operations to perform in order to satisfy the query (e.g., full text searches, predicates, or other conditions, etc.), … .; and Petro, [0085], see performance of the operation(s) may be initiated at one or more remote query processing engine(s) with respect to the not-structured portion as part of executing the query plan, in some embodiments. For example, the query engine may begin performing local operations according to the query plan [i.e., “first portion of the optimized query plan and a first corresponding portion of the distribution plan to a computing device of the first storage cluster”] and upon reaching a remote operation, send one or more requests to complete the remote operation to a remote query engine (or engines) [i.e., “second portion of the optimized query plan and a second corresponding portion of the distribution plan to a computing device of a second storage cluster”], such as processing nodes 640 in not-structured data processing service 220 in FIG. 2 above. The query engine may, in some embodiments, continue performing local operations until results for the remote operations are needed to continue executing in accordance with the query execution plan).
Claim 16
With respect to claim 16, the combination of Petro and Watari discloses wherein the fourth memory section further stores operational instructions that, when executed by the computing device of the first storage cluster, causes the computing device of the first storage cluster to:
receive the first portion of the optimized query plan and the first corresponding portion of the distribution plan (See below);
partition the first portion of the optimized query plan into a plurality of sub-portions based on the corresponding portion of the distribution plan (See below);
send a first sub-portion of the plurality of sub-portions of the first portion of the optimized query plan and a first sub-portion of the first corresponding portion of the distribution plan to the first computing device of the first storage cluster (See below); and
send a second sub-portion of the plurality of sub-portions of the first portion of the optimized query plan and a second sub-portion of the first corresponding portion of the distribution plan to a second computing device of the first storage cluster (Petro, [0058], see a processing cluster 600 includes a leader node 610 and compute nodes 620 a, 620 b, and 620 n, which may communicate with each other over an interconnect (not illustrated), in one embodiment. Leader node 610 may implement query planning 612 (discussed in detail below with regard to FIG. 7) to generate query plan(s) and instructions 614 for executing queries on processing cluster 600 that perform tiered data processing, where this discloses that the leader node directs the other nodes to process the query against a specific portion of the data; Petro, [0064], see the processing nodes 640 access metadata stored along with the data object in the data storage service to check whether the operation will return any results out of the data object. … , range values for a timestamp may be used as an index to check whether any data objects in a partition have a timestamp value within a range specified by the operation [i.e., data objects are partitioned among storage and the processing nodes only process the data objects assigned to their assigned partition]. If not, then processing node 640 may return an empty result without accessing the assigned partition. … , multiple processing nodes 640 may receive operations from a single compute node 620, so that processing operations for not-structured data may be highly parallelized. In at least some embodiments, operation requests for each partition of a not-structured data set may be processed by a different processing node, which could initiate processing operations at a large number of nodes (e.g., 1,000 processing nodes 640) reporting results to a significantly smaller number of compute nodes 620 (e.g., 4 compute nodes).
… ; and Petro, [0068], see query planning 712 may implement query optimizer 720 to rewrite the parsed query based on metadata that describes both the local data and remote data. …, as illustrated in FIG. 7, query optimizer 720 has access to local metadata 740 (e.g., table descriptions or definitions, including the names and data types of each column, physical information (e.g., partitions information), number of rows, number of distinct values, value ranges, value cardinality, value distribution, indexes, views, etc.), to rewrite portions of a query tree, such as changing the location or ordering of predicates, join operations, or other portions or operations in the query tree, …, which discloses that queries are optimized based on the partition in which the data objects are stored).
Claim 17
With respect to claim 17, the combination of Petro and Watari discloses wherein the fourth memory section further stores operational instructions that, when executed by the computing device of the first storage cluster, causes the computing device of the first storage cluster to:
receive the first sub-portion of the plurality of sub-portions of the first portion of the optimized query plan and the first sub-portion of the first corresponding portion of the distribution plan (See below);
allocate a first division of the first sub-portion of the first portion of the optimized query plan to a first node of the "x" nodes of the first computing device based on the first sub-portion of the first corresponding portion of the distribution plan (See below); and
allocate a second division of the first sub-portion of the first portion of the optimized query plan to a second node of the "x" nodes of the first computing device based on the first sub-portion of the first corresponding portion of the distribution plan (Petro, [0058], see a processing cluster 600 includes a leader node 610 and compute nodes 620 a, 620 b, and 620 n, which may communicate with each other over an interconnect (not illustrated), in one embodiment. Leader node 610 may implement query planning 612 (discussed in detail below with regard to FIG. 7) to generate query plan(s) and instructions 614 for executing queries on processing cluster 600 that perform tiered data processing, where this discloses that the leader node directs the other nodes to process the query against a specific portion of the data; and Petro, [0062], see processing cluster 600 may also include compute nodes, such as compute nodes 620 a, 620 b, and 620 n. Compute nodes 620 may be implemented on servers or other computing devices, such as those described below with regard to computer system 2000 in FIG. 13, and each may include individual query processing “slices” defined, for example, for each core of a server's multi-core processor, one or more query processing engine(s), such as query engine(s) 624 a, 624 b, and 624 n, to execute the instructions 614 or otherwise perform the portions of the query plan assigned to the compute node, ... . Query engine(s) 624 may access a certain memory and disk space in order to process a portion of the workload for a query (or other database operation) that is sent to one or more of the compute nodes 620. Query engine 624 may access attached storage, such as 622 a, 622 b, and 622 n, to perform local operation(s), …, see assigning a portion of the query to a compute node and each core of a server’s multi-core processor in the processing cluster).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
– Goerzig et al. (US 2018/0113905) for optimization of split queries;
– Howes et al. (US 10,929,388) for distributed multi-version partitioned MapReduce for a data fabric;
– Kong et al. (US 10,366,082) for parallel processing of queries with inverse distribution function;
– Attaluri et al. (US 11,615,083) for storage level parallel query processing;
– Uppala (US 7,860,865) for query processing of column chunks in a distributed column chunk data store; and
– Bradshaw et al. (CN 111158693) for incremental data parallel processing.
Point of Contact
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HUBERT G CHEUNG whose telephone number is (571) 270-1396. The examiner can normally be reached M-R 8:00A-5:00P EST; alt. F 8:00A-4:00P EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Neveen Abel-Jalil, can be reached at (571) 270-0474. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
HUBERT G. CHEUNG
Assistant Examiner
Art Unit 2152
/Hubert Cheung/Assistant Examiner, Art Unit 2152
Date: March 5, 2026
/NEVEEN ABEL JALIL/Supervisory Patent Examiner, Art Unit 2152