DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on December 12, 2025, was filed after the August 11, 2025 mailing date of the non-final rejection. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Response to Amendment
Applicant's Amendment, filed November 11, 2025, has been fully considered and entered. Accordingly, Claims 1-20 are pending in this application. Claims 1, 12, and 20 are independent claims and have been amended.
In view of Applicant’s Amendment, the rejection of Claims 1-20 under 35 U.S.C. 112(b) has been withdrawn.
Claim Interpretation
In view of paragraph [0014] of the Specification, the “computer program product comprising one or more computer readable storage media” of Claim 20 is limited to non-transitory media.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-5, 12-16, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Baby (PG Pub. No. 2017/0116278 A1), and further in view of Marathe (PG Pub. No. 2024/0220499 A1), Li (PG Pub. No. 2020/0364226 A1), and Maurya (PG Pub. No. 2018/0285389 A1).
Regarding Claim 1, Baby discloses a computer-implemented method of planning and executing a query with a JOIN statement in one or more remote databases, the method comprising:
obtaining, by one or more processors of a local computing system, the query comprising the JOIN statement (see Baby, paragraph [0177], where, in Fig. 10, at block 1000, the database server 100 receives a query to be applied at the application root 900);
determining, by the one or more processors, that the JOIN statement references two or more tables in the one or more remote databases (see Baby, paragraph [0062], where a database link is a pointer that defines a one-way communication path from one database to another; the database link allows a user of a first database to access and manipulate objects stored on another database … links to local databases (such as PDBs within the same CDB) may specify the ID of the target database whereas links to remote databases may specify network information for the responsible database server (e.g., domain name, IP address, port number, etc.) as well as the ID number of the target database within the remote CDB; see also paragraph [0209], Push Down of Local Joins; see also paragraph [0224], Push Down of Local Joins Based on Statistics);
wherein the one or more remote databases are physically remote from the one or more processors (see Baby, paragraph [0062], where a database link is a pointer that defines a one-way communication path from one database to another; the database link allows a user of a first database to access and manipulate objects stored on another database … links to local databases (such as PDBs within the same CDB) may specify the ID of the target database whereas links to remote databases may specify network information for the responsible database server (e.g., domain name, IP address, port number, etc.) as well as the ID number of the target database within the remote CDB), and wherein executing the JOIN statement without optimization comprises:
fetching datasets from the two or more remote tables (see Baby, paragraphs [0084] – [0086], 1.4 Query Optimization … in many cases, the order in which a query is computed can have drastic consequences related to the size of the result that would need to be transferred back across the network; for example, consider two tables that each have 1000 rows and suppose a query is received to compute the cross join of the two tables [join statement comprising fetching datasets from the two or more remote tables]);
wherein the first datasets are of a first size (see Baby, paragraph [0087], where in other cases, computing a join at the remote site actually decreases the amount of data that would need to be transferred back across the network … as a result, it is entirely possible that the execution of the inner join will result in less data being returned than issuing each query separately. In such cases, it is more efficient to execute the join at the remote site and then ship the result back to the original site); and
applying, by the one or more processors, the filter, to the at least one table to fetch results for the query and exclude unrelated data from the fetching, wherein the results comprise a filtered dataset (see Baby, paragraph [0091], where in addition to pushing down joins, there are also advantages to pushing down other query operators, such as predicates, grouping operations, and sorting operations. By pushing the aforementioned operators down to be executed by the slave processes more efficient execution of the query can be achieved due to the fact that those operations will be performed in parallel by the slave processes, rather than serially by the query coordinator process. Furthermore, in the case of proxy PDBs, pushing down predicate filtering will often result in a smaller result set that would need to be returned through the network);
wherein the filtered dataset is of a second size, and the first size is larger than the second size (see Baby, paragraph [0087], where in other cases, computing a join at the remote site actually decreases the amount of data that would need to be transferred back across the network … as a result, it is entirely possible that the execution of the inner join will result in less data being returned than issuing each query separately. In such cases, it is more efficient to execute the join at the remote site and then ship the result back to the original site).
Baby does not disclose:
utilizing a local computing resource to translate the datasets to a format usable by a database host local to the one or more processors;
determining, by the one or more processors, if executing the JOIN statement without optimization utilizes the system resources above a pre-defined allotment, comprising:
for each of the two tables: obtaining, by the one or more processors, statistics related to the table;
based on the statistics related to the table, determining, by the one or more processors, whether executing the JOIN statement without optimization utilizes system resources of the local computing system above a pre-defined allotment, wherein the system resources comprise system bandwidth and computing resources local to the one or more processors; and
based on determining that executing the JOIN statement without optimization utilizes the system resources above a pre-defined allotment, for at least one of the two or more tables, generating, by the one or more processors, a filter for at least one table based on an intersection between predicates in the query.
Baby in view of Li and Marathe discloses:
determining, by the one or more processors, if executing the JOIN statement without optimization utilizes the system resources (see Baby, paragraphs [0084] – [0086], 1.4 Query Optimization … in many cases, the order in which a query is computed can have drastic consequences related to the size of the result that would need to be transferred back across the network; for example, consider two tables that each have 1000 rows and suppose a query is received to compute the cross join of the two tables [join statement comprising fetching datasets from the two or more remote tables] … much of this overhead can be avoided if the query is instead broken into separate queries for the rows of the first table and the second table respectively, with the cross-join being computed by the original site. Thus, in this case the remote site would only need to ship back 2,000 rows (1,000 rows for each table) and then the cross join could be computed without the need to transfer the bulk of the results from the remote site to the original site where the query was first received; see also paragraph [0089], where factors that the query optimizer uses to develop the query plan may include … an estimated cost of the result (resource footprint of executing the query according to a particular plan), search heuristics [it is the position of the Examiner that an unoptimized query is not patentably distinguishable from the Applicant’s ‘without optimization’ query]) above a pre-defined allotment (see Marathe, paragraph [0051], where given that during the “cost-based optimization” 418 stage an estimated execution cost of a query becomes known, and therefore, complex queries are those whose estimated execution costs exceed a threshold value), comprising
for each of the two tables: obtaining, by the one or more processors, statistics related to the table (see Li, paragraph [0004], where the method includes … acquiring statistics information of one or more database tables; see also paragraph [0027], where statistics information may include how many distinct values are in each table and how many rows of each table satisfy the criteria of the query); and
based on the statistics related to the table, determining (see Li, paragraph [0004], where the method includes … determining a selectivity of the one or more database tables based on the statistics information, determining whether the selectivity satisfies a threshold condition, and pushing down the one or more filters to the one or more database tables based on the determination of whether the selectivity satisfies a threshold condition), by the one or more processors, whether executing the JOIN statement without optimization utilizes system resources of the local computing system (see Baby, paragraphs [0084] – [0086], 1.4 Query Optimization … in many cases, the order in which a query is computed can have drastic consequences related to the size of the result that would need to be transferred back across the network; for example, consider two tables that each have 1000 rows and suppose a query is received to compute the cross join of the two tables [join statement comprising fetching datasets from the two or more remote tables] … much of this overhead can be avoided if the query is instead broken into separate queries for the rows of the first table and the second table respectively, with the cross-join being computed by the original site. 
Thus, in this case the remote site would only need to ship back 2,000 rows (1,000 rows for each table) and then the cross join could be computed without the need to transfer the bulk of the results from the remote site to the original site where the query was first received; see also paragraph [0089], where factors that the query optimizer uses to develop the query plan may include … an estimated cost of the result (resource footprint of executing the query according to a particular plan), search heuristics [it is the position of the Examiner that an unoptimized query is not patentably distinguishable from the Applicant’s ‘without optimization’ query]) above a pre-defined allotment (see Marathe, paragraph [0051], where given that during the “cost-based optimization” 418 stage an estimated execution cost of a query becomes known, and therefore, complex queries are those whose estimated execution costs exceed a threshold value), wherein the system resources comprise system bandwidth (see Baby, paragraph [0087], where in other cases, computing a join at the remote site actually decreases the amount of data that would need to be transferred back across the network … as a result, it is entirely possible that the execution of the inner join will result in less data being returned than issuing each query separately. In such cases, it is more efficient to execute the join at the remote site and then ship the result back to the original site [it is the position of the Examiner that transmitting less data across a network suggests reducing utilization of system bandwidth]) and computing resources local to the one or more processors (see Marathe, paragraph [0038], where cost associated with a query can be unit-less, however the cost may correlate with the query's actual execution time, however other costs may include complexity of the actions to be performed, resources needed in order to perform the actions); and
based on determining that executing the JOIN statement without optimization utilizes the system resources (see Baby, paragraphs [0084] – [0086], 1.4 Query Optimization … in many cases, the order in which a query is computed can have drastic consequences related to the size of the result that would need to be transferred back across the network; [it is the position of the Examiner that an unoptimized query is not patentably distinguishable from the Applicant’s ‘without optimization’ query]) above the pre-defined allotment (see Marathe, paragraph [0051], where given that during the “cost-based optimization” 418 stage an estimated execution cost of a query becomes known, and therefore, complex queries are those whose estimated execution costs exceed a threshold value), for at least one table of the two or more tables, generating, by the one or more processors, a filter for the at least one table based on an intersection between predicates in the query (see Marathe, paragraph [0040], where output of the “cost-based optimization” stage is input to the “plan refinement” stage 218. This stage can include the handling of a plurality of data processing operations. The data processing operations can include data aggregations, group-level filtering, predicate pushdown).
Baby discloses predicate pushdown if it lowers query cost (see Baby, paragraph [0089]). Marathe discloses predicate pushdown in response to a cost threshold (see Marathe, paragraphs [0040], [0051]), which is not patentably distinguishable from pre-defined allotment of system resources. Li discloses predicate pushdown based on table statistics (see Li, paragraph [0004]) which lowers query cost (see Li, paragraph [0038]). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Baby with Marathe and Li because it amounts to combining known prior art elements according to known methods to yield predictable results (MPEP 2143(I)(A)).
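For illustration only, and not as part of the record, the combined teaching characterized above (Baby: pushdown when it lowers cost; Marathe: a cost threshold triggering optimization; Li: statistics-based selectivity) may be sketched as a simple decision routine. All function names, statistics fields, and numeric thresholds below are hypothetical:

```python
# Illustrative sketch of the combined teaching: push a predicate filter down
# when the estimated cost of the unoptimized plan exceeds a pre-defined
# threshold (cf. Marathe, paragraph [0051]) and table statistics indicate the
# filter is selective enough to shrink the data shipped across the network
# (cf. Li, paragraph [0004]; Baby, paragraph [0087]).

def should_push_down(estimated_cost: float,
                     cost_threshold: float,
                     rows_matching: int,
                     rows_total: int,
                     selectivity_threshold: float = 0.5) -> bool:
    """Return True when predicate pushdown is expected to lower query cost."""
    if estimated_cost <= cost_threshold:
        # Cheap query: execute without optimization (cf. Marathe's threshold).
        return False
    # Selectivity computed from table statistics (cf. Li's statistics).
    selectivity = rows_matching / rows_total
    # Push the filter down only when it excludes enough rows to reduce the
    # result set returned over the network (cf. Baby, paragraph [0087]).
    return selectivity <= selectivity_threshold

# A filter matching 100 of 1,000,000 rows on a costly query is pushed down.
print(should_push_down(estimated_cost=5000.0, cost_threshold=1000.0,
                       rows_matching=100, rows_total=1_000_000))  # True
```

The sketch only illustrates the threshold-and-statistics logic the rejection maps onto the claims; it does not purport to reproduce any reference's actual implementation.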
Baby in view of Marathe and Li does not disclose utilizing a local computing resource to translate the datasets to a format usable by a database host local to the one or more processors. Maurya discloses utilizing a local computing resource to translate the datasets to a format usable by a database host local to the one or more processors (see Maurya, paragraph [0007], where a system for translating data, extracted from disparate data sources, into a homogeneous dataset in accordance with a database schema to provide meaningful information is disclosed; see also paragraph [0020], where the one or more disparate data sources comprises raw data stored in distributed location and in disparate formats).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Baby, Marathe, and Li with Maurya for the benefit of obtaining meaningful information from a homogenized dataset created from disparate heterogeneous data sources (see Maurya, Abstract).
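As an illustrative sketch outside the record, the translation limitation mapped to Maurya (normalizing data from disparate sources into a homogeneous dataset usable by the local database host) might look as follows; the source formats, field names, and target schema are invented for illustration:

```python
# Hypothetical sketch of translating rows fetched from disparate remote
# sources into one format usable by the local database host (cf. Maurya,
# paragraphs [0007] and [0020]).

def translate_row(row: dict, source_format: str) -> dict:
    """Normalize a row from a remote source into the local schema."""
    if source_format == "csv_export":
        return {"id": int(row["ID"]), "name": row["NAME"].strip()}
    if source_format == "json_api":
        return {"id": row["userId"], "name": row["displayName"]}
    raise ValueError(f"unknown source format: {source_format}")

# Two rows in disparate formats become one homogeneous dataset.
rows = [({"ID": "7", "NAME": " Ada "}, "csv_export"),
        ({"userId": 8, "displayName": "Grace"}, "json_api")]
homogeneous = [translate_row(r, fmt) for r, fmt in rows]
print(homogeneous)  # [{'id': 7, 'name': 'Ada'}, {'id': 8, 'name': 'Grace'}]
```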
Regarding Claim 2, Baby in view of Marathe, Li, and Maurya discloses the computer-implemented method of Claim 1, further comprising returning, by the one or more processors, query results (see Baby, Claim 1, where the method includes the database server aggregating the respective result set of results from each pluggable database of the one or more particular pluggable databases into an aggregated result set for the query and the database server sending the aggregated result set to the database client).
Regarding Claim 3, Baby in view of Marathe, Li, and Maurya discloses the computer-implemented method of Claim 2, wherein applying the filter comprises:
applying results of the JOIN statement (see Baby, paragraph [0091], where in addition to pushing down joins, there are also advantages to pushing down other query operators, such as predicates, grouping operations, and sorting operations. By pushing the aforementioned operators down to be executed by the slave processes more efficient execution of the query can be achieved due to the fact that those operations will be performed in parallel by the slave processes, rather than serially by the query coordinator process. Furthermore, in the case of proxy PDBs, pushing down predicate filtering will often result in a smaller result set that would need to be returned through the network), and wherein returning the query results further comprises:
fetching, by the one or more processors, a dataset for predicate columns in the query and a dataset for the JOIN statement (see Baby, paragraph [0091], where in addition to pushing down joins, there are also advantages to pushing down other query operators, such as predicates, grouping operations, and sorting operations. By pushing the aforementioned operators down to be executed by the slave processes more efficient execution of the query can be achieved due to the fact that those operations will be performed in parallel by the slave processes, rather than serially by the query coordinator process. Furthermore, in the case of proxy PDBs, pushing down predicate filtering will often result in a smaller result set that would need to be returned through the network); and
merging, by the one or more processors, the dataset for the predicate columns in the query and for the dataset for the JOIN statement with the filtered results to produce the query results (see Baby, paragraphs [0084] – [0086], 1.4 Query Optimization … in many cases, the order in which a query is computed can have drastic consequences related to the size of the result that would need to be transferred back across the network; for example, consider two tables that each have 1000 rows and suppose a query is received to compute the cross join of the two tables … much of this overhead can be avoided if the query is instead broken into separate queries for the rows of the first table and the second table respectively, with the cross-join being computed by the original site. Thus, in this case the remote site would only need to ship back 2,000 rows (1,000 rows for each table) and then the cross join could be computed without the need to transfer the bulk of the results from the remote site to the original site where the query was first received).
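For illustration only, the merge step recited in Claim 3 (combining the dataset fetched for the predicate columns and the dataset fetched for the JOIN statement with the filtered results) can be sketched as below; the keys, column names, and values are invented:

```python
# Hypothetical sketch of the Claim 3 merge step: join the predicate-column
# dataset and the JOIN-statement dataset on a shared key, keeping only rows
# surviving the pushed-down filter.

predicate_rows = {1: {"region": "EU"}, 2: {"region": "US"}}   # predicate columns
join_rows = {1: {"order_total": 40}, 2: {"order_total": 75}}  # JOIN dataset
filtered_ids = {2}  # filtered results produced by the pushed-down filter

query_results = [
    {"id": k, **predicate_rows[k], **join_rows[k]}
    for k in sorted(filtered_ids)
]
print(query_results)  # [{'id': 2, 'region': 'US', 'order_total': 75}]
```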
Regarding Claim 4, Baby in view of Marathe, Li, and Maurya discloses the computer-implemented method of Claim 1, wherein the determining whether executing the JOIN statement utilizes system resources above a pre-defined threshold comprises:
Baby does not disclose:
for each table of the two or more tables:
based on a portion of the statistics and pre-defined system limits, determining, by the one or more processors, if the table is small table;
based on determining that the table is not a small table, utilizing, by the one or more processors, the statistics to calculate a compression ratio; and
based on the compression ratio being less than or equal to a threshold value, determining that fetching a full dataset from the table utilizes the system resources above the pre-defined allotment.
Li discloses:
for each table of the two or more tables: based on a portion of the statistics and pre-defined system limits, determining, by the one or more processors, if the table is small table (see Li, paragraph [0021], where a small table is defined as the table with the smaller size based on the number of rows and columns from each table that are part of the result set);
based on determining that the table is not a small table, utilizing, by the one or more processors, the statistics to calculate a compression ratio (see Li, paragraph [0004], where the present disclosure provides a method for dynamic filter pushdown for massive parallel processing databases on the cloud; the method includes acquiring one or more filters corresponding to a query, acquiring statistics information of one or more database tables, determining a selectivity of the one or more database tables based on the statistics information, determining whether the selectivity satisfies a threshold condition, and pushing down the one or more filters to the one or more database tables based on the determination of whether the selectivity satisfies a threshold condition);
based on the compression ratio being less than or equal to a threshold value, determining that fetching a full dataset from the table utilizes the system resources above the pre-defined allotment (see Li, paragraph [0004], where the present disclosure provides a method for dynamic filter pushdown for massive parallel processing databases on the cloud; the method includes acquiring one or more filters corresponding to a query, acquiring statistics information of one or more database tables, determining a selectivity of the one or more database tables based on the statistics information, determining whether the selectivity satisfies a threshold condition, and pushing down the one or more filters to the one or more database tables based on the determination of whether the selectivity satisfies a threshold condition).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Baby with Li for the benefit of implementing dynamic filter pushdown in massive parallel processing databases in cloud computing environments (see Li, Abstract).
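As a sketch outside the record, the Claim 4 logic as characterized above (classify each table as small from statistics and pre-defined limits; for a non-small table, compare a compression ratio against a threshold to decide whether a full fetch exceeds the resource allotment) might be expressed as follows, with all statistics fields and thresholds hypothetical:

```python
# Illustrative-only sketch of the Claim 4 determination. A table at or below
# the small-table limit is fetched unfiltered (cf. Claim 5); otherwise the
# compression ratio computed from statistics is compared against a threshold,
# and a ratio at or below the threshold indicates that fetching the full
# dataset would exceed the pre-defined resource allotment.

def full_fetch_exceeds_allotment(num_rows: int,
                                 raw_bytes: int,
                                 compressed_bytes: int,
                                 small_table_rows: int = 10_000,
                                 ratio_threshold: float = 0.2) -> bool:
    if num_rows <= small_table_rows:
        return False  # small table: fetch the unfiltered dataset
    compression_ratio = compressed_bytes / raw_bytes
    return compression_ratio <= ratio_threshold

# A 5M-row table compressing 4 GB down to 400 MB has ratio 0.1 <= 0.2,
# so a full fetch is deemed to exceed the allotment.
print(full_fetch_exceeds_allotment(num_rows=5_000_000,
                                   raw_bytes=4_000_000_000,
                                   compressed_bytes=400_000_000))  # True
```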
Regarding Claim 5, Baby in view of Marathe, Li, and Maurya discloses the computer-implemented method of Claim 4, further comprising:
Baby does not disclose based on determining that the table is small table, fetching, by the one or more processors, an unfiltered dataset from the table. Li discloses based on determining that the table is small table, fetching, by the one or more processors, an unfiltered dataset from the table (see Li, paragraph [0038], where the server may pushdown the dynamic filters from the small table to the big table in order to reduce the number of tuples fetched and thereby reduce input/output cost [it is the position of the Examiner that applying filter pushdown to the big table and not the small table is not patentably distinguishable from returning an unfiltered dataset from the small table]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Baby with Li for the benefit of implementing dynamic filter pushdown in massive parallel processing databases in cloud computing environments (see Li, Abstract).
Regarding Claim 11, Baby in view of Marathe, Li, and Maurya discloses the computer-implemented method of Claim 1, wherein the one or more remote databases comprise at least two databases (see Baby, Claim 1, where the method includes the database server identifying one or more particular pluggable databases of the plurality of member pluggable databases that are implicated by the query based on a container map stored within the application root).
Regarding Claim 12, Baby discloses a computer system for planning and executing a query with a JOIN statement in one or more remote databases, the computer system comprising:
a memory (see Baby, Fig. 13, for memory 1306); and
one or more processors in communication with the memory (see Baby, Fig. 13, for processor 1304), wherein the computer system is configured to perform a method, said method comprising:
obtaining, by one or more processors of a local computing system, the query comprising the JOIN statement (see Baby, paragraph [0177], where, in Fig. 10, at block 1000, the database server 100 receives a query to be applied at the application root 900);
determining, by the one or more processors, that the JOIN statement references two or more tables in the one or more remote databases (see Baby, paragraph [0062], where a database link is a pointer that defines a one-way communication path from one database to another; the database link allows a user of a first database to access and manipulate objects stored on another database … links to local databases (such as PDBs within the same CDB) may specify the ID of the target database whereas links to remote databases may specify network information for the responsible database server (e.g., domain name, IP address, port number, etc.) as well as the ID number of the target database within the remote CDB; see also paragraph [0209], Push Down of Local Joins; see also paragraph [0224], Push Down of Local Joins Based on Statistics);
wherein the one or more remote databases are physically remote from the one or more processors (see Baby, paragraph [0062], where a database link is a pointer that defines a one-way communication path from one database to another; the database link allows a user of a first database to access and manipulate objects stored on another database … links to local databases (such as PDBs within the same CDB) may specify the ID of the target database whereas links to remote databases may specify network information for the responsible database server (e.g., domain name, IP address, port number, etc.) as well as the ID number of the target database within the remote CDB), and wherein executing the JOIN statement without optimization comprises:
fetching datasets from the two or more remote tables (see Baby, paragraphs [0084] – [0086], 1.4 Query Optimization … in many cases, the order in which a query is computed can have drastic consequences related to the size of the result that would need to be transferred back across the network; for example, consider two tables that each have 1000 rows and suppose a query is received to compute the cross join of the two tables [join statement comprising fetching datasets from the two or more remote tables]);
wherein the first datasets are of a first size (see Baby, paragraph [0087], where in other cases, computing a join at the remote site actually decreases the amount of data that would need to be transferred back across the network … as a result, it is entirely possible that the execution of the inner join will result in less data being returned than issuing each query separately. In such cases, it is more efficient to execute the join at the remote site and then ship the result back to the original site); and
applying, by the one or more processors, the filter, to the at least one table to fetch results for the query and exclude unrelated data from the fetching, wherein the results comprise a filtered dataset (see Baby, paragraph [0091], where in addition to pushing down joins, there are also advantages to pushing down other query operators, such as predicates, grouping operations, and sorting operations. By pushing the aforementioned operators down to be executed by the slave processes more efficient execution of the query can be achieved due to the fact that those operations will be performed in parallel by the slave processes, rather than serially by the query coordinator process. Furthermore, in the case of proxy PDBs, pushing down predicate filtering will often result in a smaller result set that would need to be returned through the network);
wherein the filtered dataset is of a second size, and the first size is larger than the second size (see Baby, paragraph [0087], where in other cases, computing a join at the remote site actually decreases the amount of data that would need to be transferred back across the network … as a result, it is entirely possible that the execution of the inner join will result in less data being returned than issuing each query separately. In such cases, it is more efficient to execute the join at the remote site and then ship the result back to the original site).
Baby does not disclose:
utilizing a local computing resource to translate the datasets to a format usable by a database host local to the one or more processors;
determining, by the one or more processors, if executing the JOIN statement without optimization utilizes the system resources above a pre-defined allotment, comprising:
for each of the two tables: obtaining, by the one or more processors, statistics related to the table;
based on the statistics related to the table, determining, by the one or more processors, whether executing the JOIN statement without optimization utilizes system resources of the local computing system above a pre-defined allotment, wherein the system resources comprise system bandwidth and computing resources local to the one or more processors; and
based on determining that executing the JOIN statement without optimization utilizes the system resources above a pre-defined allotment, for at least one of the two or more tables, generating, by the one or more processors, a filter for at least one table based on an intersection between predicates in the query.
Baby in view of Li and Marathe discloses:
determining, by the one or more processors, if executing the JOIN statement without optimization utilizes the system resources (see Baby, paragraphs [0084] – [0086], 1.4 Query Optimization … in many cases, the order in which a query is computed can have drastic consequences related to the size of the result that would need to be transferred back across the network; for example, consider two tables that each have 1000 rows and suppose a query is received to compute the cross join of the two tables [join statement comprising fetching datasets from the two or more remote tables] … much of this overhead can be avoided if the query is instead broken into separate queries for the rows of the first table and the second table respectively, with the cross-join being computed by the original site. Thus, in this case the remote site would only need to ship back 2,000 rows (1,000 rows for each table) and then the cross join could be computed without the need to transfer the bulk of the results from the remote site to the original site where the query was first received; see also paragraph [0089], where factors that the query optimizer uses to develop the query plan may include … an estimated cost of the result (resource footprint of executing the query according to a particular plan), search heuristics [it is the position of the Examiner that an unoptimized query is not patentably distinguishable from the Applicant’s ‘without optimization’ query]) above a pre-defined allotment (see Marathe, paragraph [0051], where given that during the “cost-based optimization” 418 stage an estimated execution cost of a query becomes known, and therefore, complex queries are those whose estimated execution costs exceed a threshold value), comprising
for each of the two tables: obtaining, by the one or more processors, statistics related to the table (see Li, paragraph [0004], where the method includes … acquiring statistics information of one or more database tables; see also paragraph [0027], where statistics information may include how many distinct values are in each table and how many rows of each table satisfy the criteria of the query); and
based on the statistics related to the table, determining (see Li, paragraph [0004], where the method includes … determining a selectivity of the one or more database tables based on the statistics information, determining whether the selectivity satisfies a threshold condition, and pushing down the one or more filters to the one or more database tables based on the determination of whether the selectivity satisfies a threshold condition), by the one or more processors, whether executing the JOIN statement without optimization utilizes system resources of the local computing system (see Baby, paragraphs [0084] – [0086], 1.4 Query Optimization … in many cases, the order in which a query is computed can have drastic consequences related to the size of the result that would need to be transferred back across the network; for example, consider two tables that each have 1000 rows and suppose a query is received to compute the cross join of the two tables [join statement comprising fetching datasets from the two or more remote tables] … much of this overhead can be avoided if the query is instead broken into separate queries for the rows of the first table and the second table respectively, with the cross-join being computed by the original site. 
Thus, in this case the remote site would only need to ship back 2,000 rows (1,000 rows for each table) and then the cross join could be computed without the need to transfer the bulk of the results from the remote site to the original site where the query was first received; see also paragraph [0089], where factors that the query optimizer uses to develop the query plan may include … an estimated cost of the result (resource footprint of executing the query according to a particular plan), search heuristics [it is the position of the Examiner that an unoptimized query is not patentably distinguishable from the Applicant’s ‘without optimization’ query]) above a pre-defined allotment (see Marathe, paragraph [0051], where, given that during the “cost-based optimization” 418 stage an estimated execution cost of a query becomes known, complex queries are those whose estimated execution costs exceed a threshold value), wherein the system resources comprise system bandwidth (see Baby, paragraph [0087], where in other cases, computing a join at the remote site actually decreases the amount of data that would need to be transferred back across the network … as a result, it is entirely possible that the execution of the inner join will result in less data being returned than issuing each query separately. In such cases, it is more efficient to execute the join at the remote site and then ship the result back to the original site [it is the position of the Examiner that transmitting less data across a network suggests reducing utilization of system bandwidth]) and computing resources local to the one or more processors (see Marathe, paragraph [0038], where cost associated with a query can be unit-less; however, the cost may correlate with the query's actual execution time, and other costs may include complexity of the actions to be performed, resources needed in order to perform the actions); and
based on determining that executing the JOIN statement without optimization utilizes the system resources (see Baby, paragraphs [0084] – [0086], 1.4 Query Optimization … in many cases, the order in which a query is computed can have drastic consequences related to the size of the result that would need to be transferred back across the network; [it is the position of the Examiner that an unoptimized query is not patentably distinguishable from the Applicant’s ‘without optimization’ query]) above the pre-defined allotment (see Marathe, paragraph [0051], where given that during the “cost-based optimization” 418 stage an estimated execution cost of a query becomes known, and therefore, complex queries are those whose estimated execution costs exceed a threshold value), for at least one table of the two or more tables, generating, by the one or more processors, a filter for the at least one table based on an intersection between predicates in the query (see Marathe, paragraph [0040], where output of the “cost-based optimization” stage is input to the “plan refinement” stage 218. This stage can include the handling of a plurality of data processing operations. The data processing operations can include data aggregations, group-level filtering, predicate pushdown).
Baby discloses predicate pushdown if it lowers query cost (see Baby, paragraph [0089]). Marathe discloses predicate pushdown in response to a cost threshold (see Marathe, paragraphs [0040], [0051]), which is not patentably distinguishable from a pre-defined allotment of system resources. Li discloses predicate pushdown based on table statistics (see Li, paragraph [0004]), which lowers query cost (see Li, paragraph [0038]). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Baby with Marathe and Li because it amounts to combining known prior art elements according to known methods to yield predictable results (MPEP 2143(I)(A)).
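For illustration of the combined rationale only (the function names and values below are hypothetical and are drawn from none of the cited references), the teaching reduces to a simple cost comparison: an unoptimized cross join ships the full cross product across the network, and a filter is generated only when the estimated cost exceeds the pre-defined allotment.

```python
# Illustrative sketch only; names and thresholds are hypothetical.

def estimate_transfer_rows(rows_a: int, rows_b: int, push_down: bool) -> int:
    # Without pushdown, the remote site ships the full cross product; with
    # the query split per table, it ships only the two row sets (cf. Baby's
    # example: 1,000 x 1,000 = 1,000,000 rows versus 2,000 rows).
    return rows_a + rows_b if push_down else rows_a * rows_b

def should_generate_filter(estimated_cost: int, allotment: int) -> bool:
    # Marathe-style threshold test: only "complex" queries, whose estimated
    # execution cost exceeds the threshold, trigger filter generation.
    return estimated_cost > allotment

# Baby's worked example: two 1,000-row tables.
unoptimized = estimate_transfer_rows(1000, 1000, push_down=False)  # 1,000,000 rows
optimized = estimate_transfer_rows(1000, 1000, push_down=True)     # 2,000 rows
```

Under this sketch, the unoptimized cross join exceeds any allotment set below one million rows, which is the condition mapped to the claimed filter generation.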
Baby in view of Marathe and Li does not disclose utilizing a local computing resource to translate the datasets to a format usable by a database host local to the one or more processors. Maurya discloses utilizing a local computing resource to translate the datasets to a format usable by a database host local to the one or more processors (see Maurya, paragraph [0007], where a system for translating data, extracted from disparate data sources, into a homogeneous dataset in accordance with a database schema to provide meaningful information is disclosed; see also paragraph [0020], where the one or more disparate data sources comprises raw data stored in distributed location and in disparate formats).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Baby, Marathe, and Li with Maurya for the benefit of obtaining meaningful information from a homogenized dataset created from disparate heterogeneous data sources (see Maurya, Abstract).
Regarding Claim 13, Baby in view of Marathe, Li, and Maurya discloses the computer system of Claim 12, further comprising returning, by the one or more processors, query results (see Baby, Claim 1, where the method includes the database server aggregating the respective result set of results from each pluggable database of the one or more particular pluggable databases into an aggregated result set for the query and the database server sending the aggregated result set to the database client).
Regarding Claim 14, Baby in view of Marathe, Li, and Maurya discloses the computer system of Claim 13, wherein applying the filter comprises:
applying results of the JOIN statement (see Baby, paragraph [0091], where in addition to pushing down joins, there are also advantages to pushing down other query operators, such as predicates, grouping operations, and sorting operations. By pushing the aforementioned operators down to be executed by the slave processes more efficient execution of the query can be achieved due to the fact that those operations will be performed in parallel by the slave processes, rather than serially by the query coordinator process. Furthermore, in the case of proxy PDBs, pushing down predicate filtering will often result in a smaller result set that would need to be returned through the network), and wherein returning the query results further comprises:
fetching, by the one or more processors, a dataset for predicate columns in the query and a dataset for the JOIN statement (see Baby, paragraph [0091], where in addition to pushing down joins, there are also advantages to pushing down other query operators, such as predicates, grouping operations, and sorting operations. By pushing the aforementioned operators down to be executed by the slave processes more efficient execution of the query can be achieved due to the fact that those operations will be performed in parallel by the slave processes, rather than serially by the query coordinator process. Furthermore, in the case of proxy PDBs, pushing down predicate filtering will often result in a smaller result set that would need to be returned through the network); and
merging, by the one or more processors, the dataset for the predicate columns in the query and the dataset for the JOIN statement with the filtered results to produce the query results (see Baby, paragraphs [0084] – [0086], 1.4 Query Optimization … in many cases, the order in which a query is computed can have drastic consequences related to the size of the result that would need to be transferred back across the network; for example, consider two tables that each have 1000 rows and suppose a query is received to compute the cross join of the two tables … much of this overhead can be avoided if the query is instead broken into separate queries for the rows of the first table and the second table respectively, with the cross-join being computed by the original site. Thus, in this case the remote site would only need to ship back 2,000 rows (1,000 rows for each table) and then the cross join could be computed without the need to transfer the bulk of the results from the remote site to the original site where the query was first received).
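The fetch-and-merge flow recited in Claim 14 may be sketched as follows (the key name and row layout are hypothetical; Baby's cited paragraphs describe the pushdown rationale, not this exact data structure):

```python
# Illustrative sketch only; "key" and the row dictionaries are hypothetical.

def merge_results(predicate_rows, join_rows, filtered_rows):
    # Only rows surviving the pushed-down filter contribute to the final
    # result set; the predicate-column dataset and the JOIN dataset are
    # merged locally on the shared key.
    filtered_keys = {row["key"] for row in filtered_rows}
    join_by_key = {row["key"]: row for row in join_rows}
    return [
        {**join_by_key[p["key"]], **p}
        for p in predicate_rows
        if p["key"] in filtered_keys and p["key"] in join_by_key
    ]

predicate_cols = [{"key": 1, "name": "a"}, {"key": 2, "name": "b"}]  # predicate columns
join_rows = [{"key": 1, "qty": 10}, {"key": 2, "qty": 20}]           # JOIN dataset
filtered = [{"key": 1}]                                              # filtered dataset
results = merge_results(predicate_cols, join_rows, filtered)
# results == [{'key': 1, 'qty': 10, 'name': 'a'}]
```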
Regarding Claim 15, Baby in view of Marathe, Li, and Maurya discloses the computer system of Claim 12, wherein the determining whether executing the JOIN statement utilizes system resources above a pre-defined threshold comprises:
Baby does not disclose:
for each table of the two or more tables:
based on a portion of the statistics and pre-defined system limits, determining, by the one or more processors, if the table is a small table;
based on determining that the table is not a small table, utilizing, by the one or more processors, the statistics to calculate a compression ratio; and
based on the compression ratio being less than or equal to a threshold value, determining that fetching a full dataset from the table utilizes the system resources above the pre-defined allotment.
Li discloses:
for each table of the two or more tables: based on a portion of the statistics and pre-defined system limits, determining, by the one or more processors, if the table is a small table (see Li, paragraph [0021], where a small table is defined as the table with the smaller size based on the number of rows and columns from each table that are part of the result set);
based on determining that the table is not a small table, utilizing, by the one or more processors, the statistics to calculate a compression ratio (see Li, paragraph [0004], where the present disclosure provides a method for dynamic filter pushdown for massive parallel processing databases on the cloud; the method includes acquiring one or more filters corresponding to a query, acquiring statistics information of one or more database tables, determining a selectivity of the one or more database tables based on the statistics information, determining whether the selectivity satisfies a threshold condition, and pushing down the one or more filters to the one or more database tables based on the determination of whether the selectivity satisfies a threshold condition);
based on the compression ratio being less than or equal to a threshold value, determining that fetching a full dataset from the table utilizes the system resources above the pre-defined allotment (see Li, paragraph [0004], where the present disclosure provides a method for dynamic filter pushdown for massive parallel processing databases on the cloud; the method includes acquiring one or more filters corresponding to a query, acquiring statistics information of one or more database tables, determining a selectivity of the one or more database tables based on the statistics information, determining whether the selectivity satisfies a threshold condition, and pushing down the one or more filters to the one or more database tables based on the determination of whether the selectivity satisfies a threshold condition).
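The small-table and compression-ratio test recited in Claim 15, as read onto Li's selectivity threshold in the mapping above, may be sketched as follows (the row limit and ratio threshold are hypothetical placeholders for the claimed pre-defined system limits and threshold value, and the ratio is modeled on Li's selectivity statistics rather than on any data-compression scheme):

```python
# Illustrative sketch only; the limits below are hypothetical placeholders.

SMALL_TABLE_ROW_LIMIT = 10_000  # pre-defined system limit (hypothetical)
RATIO_THRESHOLD = 0.5           # threshold value (hypothetical)

def is_small_table(total_rows: int) -> bool:
    # Cf. Li, paragraph [0021]: a table below the size limit is "small".
    return total_rows <= SMALL_TABLE_ROW_LIMIT

def compression_ratio(qualifying_rows: int, total_rows: int) -> float:
    # Modeled on Li's selectivity: the fraction of rows satisfying the
    # query's criteria, per the acquired statistics.
    return qualifying_rows / total_rows

def full_fetch_exceeds_allotment(total_rows: int, qualifying_rows: int) -> bool:
    if is_small_table(total_rows):
        return False  # small tables are fetched unfiltered (cf. Claim 16)
    return compression_ratio(qualifying_rows, total_rows) <= RATIO_THRESHOLD
```

In this sketch, a large table in which few rows qualify (a low ratio) is the case where fetching the full dataset would exceed the allotment, triggering filter pushdown.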
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Baby with Li for the benefit of implementing dynamic filter pushdown in massive parallel processing databases in cloud computing environments (see Li, Abstract).
Regarding Claim 16, Baby in view of Marathe, Li, and Maurya discloses the computer system of Claim 15, further comprising:
Baby does not disclose based on determining that the table is a small table, fetching, by the one or more processors, an unfiltered dataset from the table. Li discloses based on determining that the table is a small table, fetching, by the one or more processors, an unfiltered dataset from the table (see Li, paragraph [0038], where the server may pushdown the dynamic filters from the small table to the big table in order to reduce the number of tuples fetched and thereby reduce input/output cost [it is the position of the Examiner that applying filter pushdown to the big table and not the small table is not patentably distinguishable from returning an unfiltered dataset from the small table]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Baby with Li for the benefit of implementing dynamic filter pushdown in massive parallel processing databases in cloud computing environments (see Li, Abstract).
Regarding Claim 20, Baby discloses a computer program product for planning and executing a query with a JOIN statement in one or more remote databases, the computer program product comprising:
one or more computer readable storage media and program instructions collectively stored on the one or more computer readable storage media (see Baby, paragraph [0245], where various forms of media may be involved in carrying one or more sequences of instructions to processor 1304 for execution) readable by at least one processing circuit of a local computing system to:
obtaining, by one or more processors of a local computing system, the query comprising the JOIN statement (see Baby, paragraph [0177], where, in Fig. 10, at block 1000, the database server 100 receives a query to be applied at the application root 900);
determining, by the one or more processors, that the JOIN statement references two or more tables in the one or more remote databases (see Baby, paragraph [0062], where a database link is a pointer that defines a one-way communication path from one database to another; the database link allows a user of a first database to access and manipulate objects stored on another database … links to local databases (such as PDBs within the same CDB) may specify the ID of the target database whereas links to remote databases may specify network information for the responsible database server (e.g., domain name, IP address, port number, etc.) as well as the ID number of the target database within the remote CDB; see also paragraph [0209], Push Down of Local Joins; see also paragraph [0224], Push Down of Local Joins Based on Statistics);
wherein the one or more remote databases are physically remote from the one or more processors (see Baby, paragraph [0062], where a database link is a pointer that defines a one-way communication path from one database to another; the database link allows a user of a first database to access and manipulate objects stored on another database … links to local databases (such as PDBs within the same CDB) may specify the ID of the target database whereas links to remote databases may specify network information for the responsible database server (e.g., domain name, IP address, port number, etc.) as well as the ID number of the target database within the remote CDB), and wherein executing the JOIN statement without optimization comprises:
fetching datasets from the two or more remote tables (see Baby, paragraphs [0084] – [0086], 1.4 Query Optimization … in many cases, the order in which a query is computed can have drastic consequences related to the size of the result that would need to be transferred back across the network; for example, consider two tables that each have 1000 rows and suppose a query is received to compute the cross join of the two tables [join statement comprising fetching datasets from the two or more remote tables]);
wherein the first datasets are of a first size (see Baby, paragraph [0087], where in other cases, computing a join at the remote site actually decreases the amount of data that would need to be transferred back across the network … as a result, it is entirely possible that the execution of the inner join will result in less data being returned than issuing each query separately. In such cases, it is more efficient to execute the join at the remote site and then ship the result back to the original site); and
applying, by the one or more processors, the filter to the at least one table to fetch results for the query and exclude unrelated data from the fetching, wherein the results comprise a filtered dataset (see Baby, paragraph [0091], where in addition to pushing down joins, there are also advantages to pushing down other query operators, such as predicates, grouping operations, and sorting operations. By pushing the aforementioned operators down to be executed by the slave processes more efficient execution of the query can be achieved due to the fact that those operations will be performed in parallel by the slave processes, rather than serially by the query coordinator process. Furthermore, in the case of proxy PDBs, pushing down predicate filtering will often result in a smaller result set that would need to be returned through the network);
wherein the filtered dataset is of a second size, and the first size is larger than the second size (see Baby, paragraph [0087], where in other cases, computing a join at the remote site actually decreases the amount of data that would need to be transferred back across the network … as a result, it is entirely possible that the execution of the inner join will result in less data being returned than issuing each query separately. In such cases, it is more efficient to execute the join at the remote site and then ship the result back to the original site).
Baby does not disclose:
utilizing a local computing resource to translate the datasets to a format usable by a database host local to the one or more processors;
determining, by the one or more processors, if executing the JOIN statement without optimization utilizes the system resources above a pre-defined allotment, comprising:
for each of the two tables: obtaining, by the one or more processors, statistics related to the table;
based on the statistics related to the table, determining, by the one or more processors, whether executing the JOIN statement without optimization utilizes system resources of the local computing system above a pre-defined allotment, wherein the system resources comprise system bandwidth and computing resources local to the one or more processors; and
based on determining that executing the JOIN statement without optimization utilizes the system resources above a pre-defined allotment, for at least one of the two or more tables, generating, by the one or more processors, a filter for at least one table based on an intersection between predicates in the query.
Baby in view of Li and Marathe discloses:
determining, by the one or more processors, if executing the JOIN statement without optimization utilizes the system resources (see Baby, paragraphs [0084] – [0086], 1.4 Query Optimization … in many cases, the order in which a query is computed can have drastic consequences related to the size of the result that would need to be transferred back across the network; for example, consider two tables that each have 1000 rows and suppose a query is received to compute the cross join of the two tables [join statement comprising fetching datasets from the two or more remote tables] … much of this overhead can be avoided if the query is instead broken into separate queries for the rows of the first table and the second table respectively, with the cross-join being computed by the original site. Thus, in this case the remote site would only need to ship back 2,000 rows (1,000 rows for each table) and then the cross join could be computed without the need to transfer the bulk of the results from the remote site to the original site where the query was first received; see also paragraph [0089], where factors that the query optimizer uses to develop the query plan may include … an estimated cost of the result (resource footprint of executing the query according to a particular plan), search heuristics [it is the position of the Examiner that an unoptimized query is not patentably distinguishable from the Applicant’s ‘without optimization’ query]) above a pre-defined allotment (see Marathe, paragraph [0051], where given that during the “cost-based optimization” 418 stage an estimated execution cost of a query becomes known, and therefore, complex queries are those whose estimated execution costs exceed a threshold value), comprising
for each of the two tables: obtaining, by the one or more processors, statistics related to the table (see Li, paragraph [0004], where the method includes … acquiring statistics information of one or more database tables; see also paragraph [0027], where statistics information may include how many distinct values are in each table and how many rows of each table satisfy the criteria of the query); and
based on the statistics related to the table, determining (see Li, paragraph [0004], where the method includes … determining a selectivity of the one or more database tables based on the statistics information, determining whether the selectivity satisfies a threshold condition, and pushing down the one or more filters to the one or more database tables based on the determination of whether the selectivity satisfies a threshold condition), by the one or more processors, whether executing the JOIN statement without optimization utilizes system resources of the local computing system (see Baby, paragraphs [0084] – [0086], 1.4 Query Optimization … in many cases, the order in which a query is computed can have drastic consequences related to the size of the result that would need to be transferred back across the network; for example, consider two tables that each have 1000 rows and suppose a query is received to compute the cross join of the two tables [join statement comprising fetching datasets from the two or more remote tables] … much of this overhead can be avoided if the query is instead broken into separate queries for the rows of the first table and the second table respectively, with the cross-join being computed by the original site. 
Thus, in this case the remote site would only need to ship back 2,000 rows (1,000 rows for each table) and then the cross join could be computed without the need to transfer the bulk of the results from the remote site to the original site where the query was first received; see also paragraph [0089], where factors that the query optimizer uses to develop the query plan may include … an estimated cost of the result (resource footprint of executing the query according to a particular plan), search heuristics [it is the position of the Examiner that an unoptimized query is not patentably distinguishable from the Applicant’s ‘without optimization’ query]) above a pre-defined allotment (see Marathe, paragraph [0051], where, given that during the “cost-based optimization” 418 stage an estimated execution cost of a query becomes known, complex queries are those whose estimated execution costs exceed a threshold value), wherein the system resources comprise system bandwidth (see Baby, paragraph [0087], where in other cases, computing a join at the remote site actually decreases the amount of data that would need to be transferred back across the network … as a result, it is entirely possible that the execution of the inner join will result in less data being returned than issuing each query separately. In such cases, it is more efficient to execute the join at the remote site and then ship the result back to the original site [it is the position of the Examiner that transmitting less data across a network suggests reducing utilization of system bandwidth]) and computing resources local to the one or more processors (see Marathe, paragraph [0038], where cost associated with a query can be unit-less; however, the cost may correlate with the query's actual execution time, and other costs may include complexity of the actions to be performed, resources needed in order to perform the actions); and
based on determining that executing the JOIN statement without optimization utilizes the system resources (see Baby, paragraphs [0084] – [0086], 1.4 Query Optimization … in many cases, the order in which a query is computed can have drastic consequences related to the size of the result that would need to be transferred back across the network; [it is the position of the Examiner that an unoptimized query is not patentably distinguishable from the Applicant’s ‘without optimization’ query]) above the pre-defined allotment (see Marathe, paragraph [0051], where given that during the “cost-based optimization” 418 stage an estimated execution cost of a query becomes known, and therefore, complex queries are those whose estimated execution costs exceed a threshold value), for at least one table of the two or more tables, generating, by the one or more processors, a filter for the at least one table based on an intersection between predicates in the query (see Marathe, paragraph [0040], where output of the “cost-based optimization” stage is input to the “plan refinement” stage 218. This stage can include the handling of a plurality of data processing operations. The data processing operations can include data aggregations, group-level filtering, predicate pushdown).
Baby discloses predicate pushdown if it lowers query cost (see Baby, paragraph [0089]). Marathe discloses predicate pushdown in response to a cost threshold (see Marathe, paragraphs [0040], [0051]), which is not patentably distinguishable from a pre-defined allotment of system resources. Li discloses predicate pushdown based on table statistics (see Li, paragraph [0004]), which lowers query cost (see Li, paragraph [0038]). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Baby with Marathe and Li because it amounts to combining known prior art elements according to known methods to yield predictable results (MPEP 2143(I)(A)).
Baby in view of Marathe and Li does not disclose utilizing a local computing resource to translate the datasets to a format usable by a database host local to the one or more processors. Maurya discloses utilizing a local computing resource to translate the datasets to a format usable by a database host local to the one or more processors (see Maurya, paragraph [0007], where a system for translating data, extracted from disparate data sources, into a homogeneous dataset in accordance with a database schema to provide meaningful information is disclosed; see also paragraph [0020], where the one or more disparate data sources comprises raw data stored in distributed location and in disparate formats).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Baby, Marathe, and Li with Maurya for the benefit of obtaining meaningful information from a homogenized dataset created from disparate heterogeneous data sources (see Maurya, Abstract).
Claims 6-8 and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Baby, Marathe, Li, and Maurya as applied to Claims 1-5, 12-16, and 20 above, and further in view of McConnell (PG Pub. No. 2021/0133193 A1).
Regarding Claim 6, Baby in view of Marathe, Li, and Maurya discloses the computer-implemented method of Claim 4, wherein:
Baby does not disclose the statistics related to the table comprise a size of a fetched dataset from the table responsive to the query, central processing unit usage in a local host of the table in the system, and bandwidth to transmit the fetched dataset to a data adaptor in the local database host. McConnell discloses the statistics related to the table comprise a size of a fetched dataset from the table responsive to the query, central processing unit usage in a local host of the table in the system, and bandwidth to transmit the fetched dataset to a data adaptor in the local database host (see McConnell, paragraph [0023], where metrics may include size of operands, selectivity of predicates, cardinality of tables or columns, performance and execution metrics such as query latency or CPU usage, and other metrics resulting from the execution of the queries).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Baby with McConnell for the benefit of optimizing future queries based on compiled statistics of previous queries (see McConnell, Abstract).
Regarding Claim 7, Baby in view of Marathe, Li, and Maurya discloses the computer-implemented method of Claim 4, wherein:
Baby does not disclose the portion of the statistics related to the table comprise parameters of columns referenced in the JOIN statement. McConnell discloses the portion of the statistics related to the table comprise parameters of columns referenced in the JOIN statement (see McConnell, paragraph [0023], where metrics may include size of operands, selectivity of predicates, cardinality of tables or columns, performance and execution metrics such as query latency or CPU usage, and other metrics resulting from the execution of the queries).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Baby with McConnell for the benefit of optimizing future queries based on compiled statistics of previous queries (see McConnell, Abstract).
Regarding Claim 8, Baby in view of Marathe, Li, Maurya, and McConnell discloses the computer-implemented method of Claim 7, wherein:
Baby does not disclose the parameters of the columns are selected from the group consisting of: quantile distribution, average length, and column cardinality. McConnell discloses the parameters of the columns are selected from the group consisting of: quantile distribution, average length, and column cardinality (see McConnell, paragraph [0023], where metrics may include size of operands, selectivity of predicates, cardinality of tables or columns, performance and execution metrics such as query latency or CPU usage, and other metrics resulting from the execution of the queries).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Baby with McConnell for the benefit of optimizing future queries based on compiled statistics of previous queries (see McConnell, Abstract).
Regarding Claim 17, Baby in view of Marathe, Li, and Maurya discloses the computer system of Claim 15, wherein:
Baby does not disclose the statistics related to the table comprise a size of a fetched dataset from the table responsive to the query, central processing unit usage in a local host of the table in the system, and bandwidth to transmit the fetched dataset to a data adaptor in the local database host. McConnell discloses the statistics related to the table comprise a size of a fetched dataset from the table responsive to the query, central processing unit usage in a local host of the table in the system, and bandwidth to transmit the fetched dataset to a data adaptor in the local database host (see McConnell, paragraph [0023], where metrics may include size of operands, selectivity of predicates, cardinality of tables or columns, performance and execution metrics such as query latency or CPU usage, and other metrics resulting from the execution of the queries).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Baby with McConnell for the benefit of optimizing future queries based on compiled statistics of previous queries (see McConnell, Abstract).
Regarding Claim 18, Baby in view of Marathe, Li, and Maurya discloses the computer system of Claim 15, wherein:
Baby does not disclose the portion of the statistics related to the table comprise parameters of columns referenced in the JOIN statement. McConnell discloses the portion of the statistics related to the table comprise parameters of columns referenced in the JOIN statement (see McConnell, paragraph [0023], where metrics may include size of operands, selectivity of predicates, cardinality of tables or columns, performance and execution metrics such as query latency or CPU usage, and other metrics resulting from the execution of the queries).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Baby with McConnell for the benefit of optimizing future queries based on compiled statistics of previous queries (see McConnell, Abstract).
Regarding Claim 19, Baby in view of Marathe, Li, Maurya, and McConnell discloses the computer system of Claim 18, wherein:
Baby does not disclose the parameters of the columns are selected from the group consisting of: quantile distribution, average length, and column cardinality. McConnell discloses the parameters of the columns are selected from the group consisting of: quantile distribution, average length, and column cardinality (see McConnell, paragraph [0023], where metrics may include size of operands, selectivity of predicates, cardinality of tables or columns, performance and execution metrics such as query latency or CPU usage, and other metrics resulting from the execution of the queries).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Baby with McConnell for the benefit of optimizing future queries based on compiled statistics of previous queries (see McConnell, Abstract).
Claims 9 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Baby, Marathe, Li, and Maurya as applied to Claims 1-5, 12-16, and 20 above, and further in view of Chowdhuri (PG Pub. No. 2006/0218123 A1) and Keller (PG Pub. No. 2007/0022136 A1).
Regarding Claim 9, Baby in view of Marathe, Li, and Maurya discloses the computer-implemented method of Claim 1, wherein generating the filter comprises:
generating, by the one or more processors, the filter based on the intersection (see Baby, paragraph [0091], where in addition to pushing down joins, there are also advantages to pushing down other query operators, such as predicates, grouping operations, and sorting operations. By pushing the aforementioned operators down to be executed by the slave processes more efficient execution of the query can be achieved due to the fact that those operations will be performed in parallel by the slave processes, rather than serially by the query coordinator process. Furthermore, in the case of proxy PDBs, pushing down predicate filtering will often result in a smaller result set that would need to be returned through the network);
Baby does not disclose:
determining, by the one or more processors, based on the query, that the query does not comprise constants in a predicate list of the query; and
wherein the intersection comprises an intersection between ranges of the predicates via a quantile distribution.
Chowdhuri discloses determining, by the one or more processors, based on the query, that the query does not comprise constants in a predicate list of the query (see Chowdhuri, paragraph [0224], [0225], for Subqueries (in Conjunction with Filters) … subqueries can be classified into two groups: quantified predicate subqueries and expression subqueries; quantified predicate subqueries are ones that have a predicate such as [NOT] IN, [NOT] EXISTS, or [=, !=, >, >=, <, <=] ANY/ALL. In general, such subqueries are processed using a set of internal guidelines).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Baby with Chowdhuri for the benefit of performing query fragments concurrently (see Chowdhuri, paragraph [0012]).
Baby in view of Chowdhuri does not disclose wherein the intersection comprises an intersection between ranges of the predicates via a quantile distribution. Keller discloses wherein the intersection comprises an intersection between ranges of the predicates via a quantile distribution (see Keller, paragraph [0106], where value range of each SQL table is split into some predetermined number of attributes, for instance using quantile statistics. Each attribute represents a certain number of records within the database table. For example, a column variable AGE may be subdivided into an adequate number of subranges, e.g., 20<AGE, 30<AGE, 40<AGE, etc., having an open upper limit, or both limits closed as 20<AGE<30, 30<AGE<40, etc.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Baby and Chowdhuri with Keller for the benefit of self-tuning database retrieval optimization (see Keller, Abstract).
Regarding Claim 10, Baby in view of Marathe, Li, and Maurya discloses the computer-implemented method of Claim 1, wherein generating the filter comprises:
generating, by the one or more processors, the filter based on the intersection (see Baby, paragraph [0091], where in addition to pushing down joins, there are also advantages to pushing down other query operators, such as predicates, grouping operations, and sorting operations. By pushing the aforementioned operators down to be executed by the slave processes more efficient execution of the query can be achieved due to the fact that those operations will be performed in parallel by the slave processes, rather than serially by the query coordinator process. Furthermore, in the case of proxy PDBs, pushing down predicate filtering will often result in a smaller result set that would need to be returned through the network);
Baby does not disclose:
determining, by the one or more processors, based on the query, that the query comprises constant predicates in a predicate list of the query; and
wherein the intersection comprises an intersection between ranges of all the predicates via quantile distribution.
Chowdhuri discloses determining, by the one or more processors, based on the query, that the query comprises constant predicates in a predicate list of the query (see Chowdhuri, paragraph [0224], [0225], for Subqueries (in Conjunction with Filters) … subqueries can be classified into two groups: quantified predicate subqueries and expression subqueries; quantified predicate subqueries are ones that have a predicate such as [NOT] IN, [NOT] EXISTS, or [=, !=, >, >=, <, <=] ANY/ALL. In general, such subqueries are processed using a set of internal guidelines).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Baby with Chowdhuri for the benefit of performing query fragments concurrently (see Chowdhuri, paragraph [0012]).
Baby in view of Chowdhuri does not disclose wherein the intersection comprises an intersection between ranges of all the predicates via quantile distribution. Keller discloses wherein the intersection comprises an intersection between ranges of all the predicates via quantile distribution (see Keller, paragraph [0106], where value range of each SQL table is split into some predetermined number of attributes, for instance using quantile statistics. Each attribute represents a certain number of records within the database table. For example, a column variable AGE may be subdivided into an adequate number of subranges, e.g., 20<AGE, 30<AGE, 40<AGE, etc., having an open upper limit, or both limits closed as 20<AGE<30, 30<AGE<40, etc.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Baby and Chowdhuri with Keller for the benefit of self-tuning database retrieval optimization (see Keller, Abstract).
Response to Arguments
Applicant’s Arguments, filed November 11, 2025, have been fully considered, but they are not persuasive.
Applicant argues in Applicant’s Remarks that Baby, alone or in combination with Marathe, Li, and Maurya, does not teach, disclose, or fairly suggest all of the elements of Independent Claims 1, 12, and 20 as traversed by the Applicant. The Examiner respectfully disagrees.
Baby is directed to, inter alia, query optimization via predicate pushdown (see Baby, paragraphs [0084] – [0086], 1.4 Query Optimization … in many cases, the order in which a query is computed can have drastic consequences related to the size of the result that would need to be transferred back across the network; for example, consider two tables that each have 1000 rows and suppose a query is received to compute the cross join of the two tables [join statement comprising fetching datasets from the two or more remote tables] … much of this overhead can be avoided if the query is instead broken into separate queries for the rows of the first table and the second table respectively, with the cross-join being computed by the original site. Thus, in this case the remote site would only need to ship back 2,000 rows (1,000 rows for each table) and then the cross join could be computed without the need to transfer the bulk of the results from the remote site to the original site where the query was first received; see also paragraph [0089], where factors that the query optimizer uses to develop the query plan may include … an estimated cost of the result (resource footprint of executing the query according to a particular plan), search heuristics [it is the position of the Examiner that an unoptimized query is not patentably distinguishable from the Applicant’s ‘without optimization’ query]) and explicitly discloses that pushdown can optimize JOIN statements (see Baby, paragraph [0209], Push Down of Local Joins). Marathe discloses predicate pushdown in response to a cost threshold (see Marathe, paragraphs [0040], [0051]), which is not patentably distinguishable from pre-defined allotment of system resources. Li discloses predicate pushdown based on table statistics (see Li, paragraph [0004]) which lowers query cost (see Li, paragraph [0038]).
Applicant further argues that Marathe is not suitable for a combination with Baby and Li because it teaches away from utilizing table statistics to determine resource cost thresholds. However, “the prior art’s mere disclosure of more than one alternative does not constitute a teaching away from any of these alternatives because such disclosure does not criticize, discredit, or otherwise discourage the solution claimed….” In re Fulton, 391 F.3d 1195, 1201, 73 USPQ2d 1141, 1146 (Fed. Cir. 2004). In this case, paragraph [0051] of Marathe merely discloses an alternative rudimentary cost measure rather than criticizing, discrediting, or otherwise discouraging measuring cost by estimated execution, which is explicitly disclosed as “more realistic” (see Marathe, paragraph [0051]). Accordingly, it is the position of the Examiner that Marathe is not unsuitable for a combination with Baby and Li.
Applicant further argues that the combination of Baby, Li, and Marathe is counterintuitive. The Examiner respectfully disagrees. The test for obviousness is not whether the features of a secondary reference may be bodily incorporated into the structure of the primary reference; nor is it that the claimed invention must be expressly suggested in any one or all of the references. Rather, the test is what the combined teachings of the references would have suggested to those of ordinary skill in the art. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981). As stated above, Baby discloses query optimization through predicate (or filter) pushdown and pushdowns of joins. Marathe discloses pushdown in response to a cost threshold, which is not patentably distinguishable from pre-defined allotment of system resources. Li discloses predicate pushdown based on table statistics, which lowers query cost. For at least these reasons, it is the position of the Examiner that the combination of references discloses the elements of Independent Claims 1, 12, and 20 as traversed by the Applicant.
All other arguments are moot in light of the new grounds of rejection.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FARHAD AGHARAHIMI whose telephone number is (571)272-9864. The examiner can normally be reached M-F 9am - 5pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Apu Mofiz can be reached at 571-272-4080. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/FARHAD AGHARAHIMI/Examiner, Art Unit 2161
/APU M MOFIZ/Supervisory Patent Examiner, Art Unit 2161