DETAILED ACTION
1. Claims 1, 3-11 and 13-19 are pending in this application.
Notice of Pre-AIA or AIA Status
2. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. §102 and §103 (or as subject to pre-AIA 35 U.S.C. §102 and §103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Response to Amendment
3. This Office action is in response to applicant’s amendment filed on 12/11/2025 in response to the final rejection mailed on 07/11/2025. Claims 1, 5-6, 11, 15-16 and 19 have been amended. Claims 3-4, 7-10, 13-14 and 17-18 remain as originally presented. Claims 2, 12 and 20 have been cancelled. The amendment has been entered.
Response to Arguments
4. Applicant's arguments, filed on 12/11/2025, with respect to the rejection of claims 1, 3-11 and 13-19 under 35 U.S.C. §101 as directed to an abstract idea (mental process) (Applicant's arguments, page 1), have been fully considered but are not persuasive. Respectfully, the Examiner disagrees; see the clarification below.
The Applicant argues that “no subject matter rejections under 35 U.S.C. § 101 were asserted in the previous office action (nor in the parent case).” However, the Examiner can only speak to examinations that the Examiner personally completed. In the present action, the Examiner rejected the claims under 35 U.S.C. § 101 as directed to an abstract idea (mental process), following the analysis set forth in the 2019 Revised Patent Subject Matter Eligibility Guidance (2019 PEG).
The Applicant argues that “the claimed approach applies and performs improvements in data access to a subject dataset by generating chunking schemes in advance of a query operation or data operations on the subject dataset for allowing the computer (database manager) to run more efficiently and improve performance.” (Applicant's arguments, page 1). However, the claims do not recite elements that reflect an improvement to the functioning of a computer or to any other technology; see MPEP 2106.05(a). Simply asserting that generating chunking schemes in advance of query or data operations allows a computer to run more efficiently does not demonstrate an improvement to computer functioning, especially when the claims fail to recite the specific steps required to achieve that improved performance; see MPEP 2106.05(f): “(1) Whether the claim recites only the idea of a solution or outcome i.e., the claim fails to recite details of how a solution to a problem is accomplished.”
The Applicant argues that “The applied chunking scheme (selected from a number of performance improving candidates) implements this chunking scheme precisely for improving database access performance, as discussed at [0029] and throughout the specification with respect to performance data.” Simply applying a chunking scheme without disclosing the detailed process for doing so is a vague and broad statement that fails to demonstrate any improvement to computer functioning.
The Applicant cited specification paragraph [0029] in their argument, with the apparent intent to establish an improvement to a computer system. However, paragraph [0029] simply says “FIG. 2 depicts a data statement chunking technique 200 as implemented in systems that facilitate rule-based chunking of data statements for operation over large datasets in data storage environments. As an option, one or more variations of data statement chunking technique 200 or any aspect thereof may be implemented in the context of the architecture and functionality of the embodiments described herein. The data statement chunking technique 200 or any aspect thereof may be implemented in any environment.” The claim fails to recite details regarding how the solution to the problem is accomplished, and it does not reflect all the essential elements within the claim language.
For those reasons, claims 1, 3-11 and 13-19 remain rejected under 35 U.S.C. §101 (abstract idea - mental process). See the rejection below.
Applicant’s arguments, filed on 12/11/2025, with respect to the 35 U.S.C. § 112(b) rejection of claims 1, 3-11 and 13-19 regarding the indefinite “or” condition element have been fully considered and are persuasive. It is respectfully noted that the Applicant's amendments and clarifications have convinced the Examiner that this 35 U.S.C. § 112(b) rejection of claims 1, 3-11 and 13-19 is overcome (Applicant’s arguments, page 2). Therefore, the 35 U.S.C. § 112(b) rejection of claims 1, 3-11 and 13-19 regarding the indefinite “or” condition element is withdrawn from the record.
Applicant’s arguments, filed on 12/11/2025, with respect to the rejection of claims 11 and 13-18 under 35 U.S.C. § 112(b) (claim 11 recites the limitation of “A computer readable medium, embodied in a non-transitory computer readable medium ...”; however, it is unclear how the computer-readable medium can be embodied in a non-transitory computer-readable medium, the specification does not clarify this capability of the computer-readable medium, and for this reason the claims are rendered indefinite) have been fully considered but are not persuasive. The claims remain indefinite because it is unclear how computer program code can ‘embody’ a non-transitory computer-readable medium. In computer science, a computer-readable medium (CRM) refers to a tangible substrate; ‘non-transitory computer readable medium’ refers to the medium's tangibility rather than data persistence. The distinction is that a non-transitory medium does not encompass transitory signals. The specification fails to clarify how a computer-readable medium (CRM) is capable of being ‘embodied’ in a non-transitory computer-readable medium, rendering the claims indefinite. For these reasons, the rejection is maintained.
Applicant’s arguments, filed on 12/11/2025, with respect to the 35 U.S.C. § 112(b) rejection of claims 5-6 and 15-16 regarding “… the portion of the client-specific data …” have been fully considered and are persuasive. It is respectfully noted that the Applicant's amendments and clarifications have convinced the Examiner that the 35 U.S.C. § 112(b) rejection of claims 5-6 and 15-16 is overcome (Applicant’s arguments, page 2). Therefore, the 35 U.S.C. § 112(b) rejection of claims 5-6 and 15-16 regarding “… the portion of the client-specific data …” is withdrawn from the record.
Applicant's arguments, filed on 12/11/2025, with respect to the rejection of claims 1, 3-11 and 13-19 under 35 U.S.C. §103 (Applicant's arguments, pages 3-5), have been fully considered but are moot because the independent claims have been amended to introduce new limitations that were not previously presented, and newly found prior art has been applied.
It is also noted that the Applicant raises several arguments without pointing to any specific element of the claims. The Applicant focuses on showing what the prior art references teach and fails to explain how the claimed invention differs from what is disclosed in those references.
Claim Rejections - 35 USC § 101
5. 35 U.S.C. §101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1, 3-11 and 13-19 are rejected under 35 U.S.C. §101 because the claimed invention is directed to an abstract idea (Mental Process) without significantly more. The claims similarly recite steps for data statement chunking.
The following is an analysis based on 2019 Revised Patent Subject Matter Eligibility Guidance (2019 PEG).
Step 1, Statutory Category?
Claims 1 and 3-10 are directed to a method.
Claims 11 and 13-18 are directed to a computer readable medium.
Claim 19 is directed to a system.
Therefore, claims 1, 3-11 and 13-19 fall into at least one of the four statutory categories.
Step 2A, Prong I: Judicial Exception Recited?
The Examiner submits that the following claim limitations constitute a “Mental Process”, as the claims cover performance of the limitations in the human mind, given the broadest reasonable interpretation.
As per claims 1, 11 and 19, the claims similarly recite the limitations of: “applying at least a portion of a set of client-specific data to the data statements to determine at least one chunking scheme, the client specific data further comprising one or more statement chunking rules and a set of performance data, based on the one or more attributes, and referencing at least one dimension in the subject dataset;” A human can mentally apply rules to data in order to determine a scheme that represents the data. For example, a human can mentally apply rules to data, such as organizing a list of names alphabetically, to determine a scheme that represents the data without needing a computer. The one or more statement chunking rules and the set of performance data are merely components used to implement the abstract idea. The “based on the one or more attributes” and “referencing at least one dimension in the subject dataset” recitations are likewise merely components used to implement the abstract idea. There is nothing so complex in the limitation that could not be performed in the human mind.
“accessing performance data to generate performance estimates for a set of candidate chunking schemes from the at least one chunking scheme;” A human can read data and mentally analyze it to create an estimate based on the data. There is nothing so complex in the limitation that could not be performed in the human mind.
“selecting a predetermined chunking scheme from the set of candidate chunking schemes based on the performance estimates prior to applying the chunking scheme to the subject dataset;” A human can observe data, mentally establish criteria for operations based on pre-established restrictions, and select portions of that data that meet those restrictions. The recitation of performance estimates prior to applying the chunking scheme to the dataset is merely an element used to implement the abstract idea. For example, a human can observe a list of emails and mentally select only those from a specific sender based on a pre-established rule, such as 'important,' which is a mental process that can be performed without a computer. There is nothing so complex in the limitation that could not be performed in the human mind.
“prior to accessing the subject dataset, generating one or more data operations from the data statements, the data operations generated based at least in part on the referenced dimension, the chunking scheme and on the performance estimates;” A human can observe data, make judgments, and define tasks that may be used to manipulate the observed data. The data operations generated based at least in part on the referenced dimension, the chunking scheme and the performance estimates are merely instructions used to implement the abstract idea. For example, a human can observe a list of sales figures, judge which are highest, and define the task of highlighting them in a report, demonstrating that the claimed data manipulation is an ineligible mental process. There is nothing so complex in the limitation that could not be performed in the human mind.
“executing the data operations over the subject dataset to generate a result set.” A human can observe data and apply criteria operations to mentally produce a result set for the observed data. For example, a human can observe a list of names and mentally apply a 'starts with A' criterion to produce a filtered result set; the claimed process is an ineligible mental process that can be performed without a computer. There is nothing so complex in the limitation that could not be performed in the human mind.
As per dependent claims 3 and 13, the claims recite the limitation of:
“wherein the data operations are executed at one or more query engines, and wherein the client-specific data is inaccessible by the query engines.” The one or more query engines recited above are merely a component used for the mental steps recited in claims 1 and 11.
As per dependent claims 4 and 14, the claims recite the limitation of:
“consulting the expanded dataset metadata to perform at least one of, determining the at least one chunking scheme, or generating the one or more data operations.” A human can observe data to determine an outline of it. There is nothing so complex in the limitation that could not be performed in the human mind.
As per dependent claims 5 and 15, the claims recite the limitation of:
“analyzing the data statements to determine one or more statement attributes associated with the data statements;” A human can observe data to define attributes related to the observed data. There is nothing so complex in the limitation that could not be performed in the human mind.
“applying a portion of the client-specific data to at least one of the statement attributes to perform at least one of, determining the at least one chunking scheme, or generating the one or more data operations.” A human can mentally apply rules to data in order to determine a scheme that represents the data. There is nothing so complex in the limitation that could not be performed in the human mind.
As per dependent claims 6 and 16, the claims recite the limitation of:
“generating one or more performance estimates associated with the data statements;” A human can read data and mentally analyze it to create an estimate based on the data. There is nothing so complex in the limitation that could not be performed in the human mind.
“applying a portion of the client-specific data to at least one of the performance estimates to perform at least one of, determining the at least one chunking scheme, or generating the one or more data operations.” A human can mentally apply rules to data in order to determine a scheme that represents the data. There is nothing so complex in the limitation that could not be performed in the human mind.
As per dependent claims 7 and 17, the claims recite the limitation of:
“wherein at least one of the performance estimates is based at least in part on a set of performance data.” The set of performance data recited above is merely a component used for the mental steps recited in claims 1 and 11.
As per dependent claims 8 and 18, the claims recite the limitation of:
“wherein the performance data comprises at least one of, a set of historical data operations performance statistics, or a set of historical data operations behavioral characteristics.” The set of historical data operations performance statistics and the set of historical data operations behavioral characteristics recited above are merely components used for the mental steps recited in claims 1 and 11.
As per dependent claim 9, the claim recites the limitation of:
“further comprising merging two or more results from the data operations into the result set.” A human can observe two or more pieces of data and mentally combine them. There is nothing so complex in the limitation that could not be performed in the human mind.
As per dependent claim 10, the claim recites the limitation of:
“wherein the data operations are executed in accordance with one or more execution directives, the execution directives indicating that one or more of the data operations be executed in parallel, in sequence, asynchronously, or synchronously.” The “in parallel, in sequence, asynchronously, or synchronously” recitations above are merely components used for the mental steps recited in claims 1 and 11.
Accordingly, claims 1, 3-11 and 13-19 recite at least one abstract idea.
Step 2A, Prong II: Integrated into a Practical Application?
The claims recite the following additional limitations/elements:
As per claims 1, 11 and 19, the claims recite the additional limitation of:
“… chunking data statements based at least in part on a set of client-specific information in a client data statement processing layer …” The set of client-specific information in the client data statement processing layer recited above is merely a component used for the mental steps recited in claims 1, 11 and 19.
“receiving one or more data statements issued by at least one client, the data statements issued by the client to operate over a subject dataset for querying one or more attributes of the subject dataset;” This limitation is an example of adding insignificant extra-solution activity to the judicial exception (see MPEP § 2106.05(g)). Specifically, the additional limitation exemplifies mere data gathering, without any further processing or analysis.
As per dependent claims 4 and 14, the claims recite the limitation of:
“receiving a set of dataset metadata associated with the subject dataset; expanding the dataset metadata into a set of expanded dataset metadata;” This limitation is an example of adding insignificant extra-solution activity to the judicial exception (see MPEP § 2106.05(g)). Specifically, the additional limitation exemplifies mere data gathering, without any further processing or analysis.
As per claim 11, the claim recites the additional elements of:
“a computer readable medium, a non-transitory computer readable medium, one or more processors” This element is an example of mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (see MPEP § 2106.05(f)). Specifically, the additional elements of the limitations invoke computers or other machinery merely as a tool to perform an existing process. Use of a computer or other machinery in its ordinary capacity for economic or other tasks (e.g., to receive, store, or transmit data), or simply adding a general-purpose computer or computer components after the fact to an abstract idea (e.g., a fundamental economic practice or mathematical equation), does not provide improvements to the functioning of a computer or to any other technology or technical field, and does not integrate a judicial exception into a practical application.
“a sequence of instructions and a set of acts for chunking data statements” The sequence of instructions and the set of acts for chunking data statements recited above are merely components used for the mental steps recited in claim 11.
As per claim 19, the claim recites the additional elements of:
“a storage medium and one or more processors” This element is an example of mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (see MPEP § 2106.05(f)). Specifically, the additional elements of the limitations invoke computers or other machinery merely as a tool to perform an existing process. Use of a computer or other machinery in its ordinary capacity for economic or other tasks (e.g., to receive, store, or transmit data), or simply adding a general-purpose computer or computer components after the fact to an abstract idea (e.g., a fundamental economic practice or mathematical equation), does not provide improvements to the functioning of a computer or to any other technology or technical field, and does not integrate a judicial exception into a practical application.
“a sequence of instructions and a set of acts” The sequence of instructions and the set of acts recited above are merely components used for the mental steps recited in claim 19.
Therefore, claims 1, 3-11 and 13-19 do not integrate the recited abstract ideas into a practical application.
Step 2B: Claim provides an Inventive Concept?
With respect to the limitations identified as insignificant extra-solution activity above, the conclusions are carried over, and the “receiving …” limitations are well-understood, routine, and conventional operations.
For support that the “receiving …” limitations are well-understood, routine, and conventional, as noted by the courts, see MPEP 2106.05(d)(II) “i. Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information); … buySAFE, Inc. v. Google, Inc., 765 F.3d 1350, 1355, 112 USPQ2d 1093, 1096 (Fed. Cir. 2014) (computer receives and sends information over a network);” and/or MPEP 2106.05(d)(II) “iv. Storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93;” and/or MPEP 2106.05(d)(II) “iii. Ultramercial, 772 F.3d at 716, 112 USPQ2d at 1755 (updating an activity log);”.
Looking at the limitations in combination and the claim as a whole does not change this conclusion and the claim is ineligible.
Therefore, claims 1, 3-11 and 13-19 are not patent eligible.
Claim Rejections - 35 USC § 112
6. The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
Claims 11 and 13-18 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.
Claim 11 recites the limitation of “A computer readable medium, including computer program code embodied in a non-transitory computer readable storage medium ...” However, it is unclear how a computer-readable medium including computer program code can be embodied in a non-transitory computer-readable medium. The specification does not clarify this capability of the computer-readable medium. For this reason, the claims are rendered indefinite.
The Examiner gave the best reasonable interpretation to the claims. Therefore, for purposes of this examination, the recited “A computer readable medium, including computer program code embodied in a non-transitory computer readable medium ...” will be interpreted as “A non-transitory computer readable medium storing computer program code …”.
Applicant is encouraged to review the claims for similar inconsistencies.
Claims 13-18 are rejected for incorporating the deficiencies of parent claim 11.
Claim Rejections - 35 USC § 103
7. In the event the determination of the status of the application as subject to AIA 35 U.S.C. § 102 and § 103 (or as subject to pre-AIA 35 U.S.C. § 102 and § 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under pre-AIA 35 U.S.C. § 103(a) are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
8. Claims 1, 3-11 and 13-19 are rejected under 35 U.S.C. § 103 as being unpatentable over Gerweck et al. (US 20160335318 A1) in view of Wilding et al. (US 20200250182 A1).
As per claim 1, Gerweck teaches a method (i.e. “systems, methods, and in computer program products for dynamic aggregate generation and updating for high performance querying of large datasets.”; Abstract)
for chunking data statements (i.e. “generate various instances of database statements 104 to be interpreted on associated datasets.”; fig. 3A, para. [0033], [0051]; Examiner note: the chunking data statements are interpreted as the database statements)
based at least in part on a set of client-specific information in a client data statement processing layer (i.e. “a user 102 (e.g., business intelligence analyst) interacting with certain instances of analysis tools 103 (e.g., Tableau, Excel, QlikView, etc.)”; fig. 2, para. [0033]; Examiner note: the client data statement processing layer is interpreted as the analysis tools), the method comprising:
receiving one or more data statements (i.e. “The database statements 104 configured for the virtual multidimensional data model 124 can be received by the query planner 120 …”; para. [0007], [0033])
issued by at least one client (i.e. “a computing environment 201 comprises one or more instances of a client device 204 (e.g., a desktop computer)”; fig. 2, para. [0044]),
the data statements issued by the client to operate over a subject dataset for querying one or more attributes of the subject dataset (i.e. “the query monitor 302 might intercept any or all of the database statements 104 to build a query history 312 comprising the query attributes 322 associated with the stream of incoming queries.”; fig. 3A, para. [0048]-[0051]; Examiner note: the one or more attributes of the subject dataset is interpreted as the query attributes. Further, i.e. “a subject database the user would like to analyze”; para. [0024], [0028], [0038], [0041]; Examiner note: the subject dataset is interpreted as the subject database);
applying at least a portion of a set of client-specific data to the data statements (i.e. “For example, modifications to rule values, rule logic, rule application (e.g., when to apply a rule, the order of applying rules, etc.), and/or other rule attributes are possible.”; figs. 3A-B, para. [0051]; Examiner note: the applying at least a portion of a set of client-specific data is interpreted as the modifications to rule values, rule logic, rule application and/or other rule attributes; where the client-specific data can be interpreted as the rule values, rule logic, rule application and/or other rule attributes)
to determine at least one chunking scheme (i.e. “The aggregate simulator 306 can also use the rules 318 when determining and/or simulating the recommended aggregates 326. In some cases, certain instances of the rules 318 can be determined, in part, by the user 102.”; fig. 3A-B, para. [0051]; Examiner note: the determine at least one chunking scheme is interpreted as the determining and/or simulating the recommended aggregates),
the client specific data further comprising one or more statement chunking rules and a set of performance data, based on the one or more attributes (i.e. “When an aggregate has been selected, the aggregate selection logic 310 can further construct the aggregate logical plans (e.g., Aggregate1 logical plan 342.sub.1, . . . , AggregateN logical plan 342.sub.N) associated with each selected aggregate for processing according to the herein disclosed techniques.”; fig. 3A, para. [0034], [0052]; Examiner note: the one or more statement chunking rules is interpreted as the aggregate logical plans. Further, i.e. “For example, the rules 318 might comprise a set of values (e.g., thresholds) to compare to a corresponding set of attributes (e.g., performance metrics, record size, distinct count, fractional reduction, performance score, redundancy, relative performance gain, etc.) for the recommended aggregates 326 to determine which of the recommended aggregates 326 should be identified as selected aggregates 320.”; para. [0050]-[0051]; Examiner note: the set of performance data is interpreted as the set of attributes), and
referencing at least one dimension in the subject dataset (i.e. “As another example, the AggregateN logical plan 342.sub.N refers to a “distinct-count” type aggregate of the “cust ID” virtual cube attribute (e.g., a dimension in the virtual cube)”; fig. 3A, para. [0034], [0052]. Further, i.e. “The virtual order quantity per month cube 428 is defined by the dimensions “Product Name”, “Order YearMonth”, and “Other Dimension” (e.g., geographic region), with each cell holding an “Order Quantity” amount for a respective combination of dimension values (e.g., “Widget A”, “July 2005”, and “North America”, respectively).”; para. [0060], [0064]);
accessing performance data (i.e. “the aggregate selection technique 3B00 can commence with the aggregate selector 122 capturing incoming queries configured to operate on a subject database (see step 352).”; fig. 3B, para. [0055])
to generate performance estimates for a set of candidate chunking schemes from the at least one chunking scheme (i.e. “The performance gain (e.g., as compared to the basis attributes of the incoming queries) of such recommended aggregates can be estimated (see step 360).”; fig. 3B, para. [0053], [0056]-[0057]; Examiner note: the generate performance estimates for a set of candidate chunking schemes from the at least one chunking scheme is interpreted as the performance gain of such recommended aggregates can be estimated. Further, i.e. “a set of aggregate selection logic 310.”; fig. 3A, para. [0051]; Examiner note: the set of candidate chunking schemes from the at least one chunking scheme is interpreted as the set of aggregate selection logic. Furthermore, i.e. “the aggregate selection logic 310 can further construct the aggregate logical plans (e.g., Aggregate1 logical plan 342.sub.1, . . . , AggregateN logical plan 342.sub.N)”; fig. 3A, para. [0052]);
selecting a predetermined chunking scheme from the set of candidate chunking schemes based on the performance estimates prior to applying the chunking scheme to the subject dataset (i.e. “The aggregate logical plan can be delivered and/or scheduled for delivery for use by systems implementing dynamic aggregate generation and updating for high performance querying of large datasets (see step 368).”; fig. 3B, para. [0051], [0057]; Examiner note: the selecting a predetermined chunking scheme from the set of candidate chunking schemes is interpreted as the aggregate logical plan can be delivered. Further, i.e. “determine one or more instances of selected aggregates 320”; para. [0051]. Further, i.e. “the aggregate logical plans can be based in part on existing aggregate logical plans associated with a prior update of the subject aggregate.”; para. [0052], [0075], [0080]);
the data operations generated based at least in part on the referenced dimension, the chunking scheme and on the performance estimates (“The virtual order quantity per quarter cube 458 is defined by the dimensions “Product Name”, “Order YearQuarter”, and “Other Dimension” (e.g., geographic region), with each cell holding a “Order Quantity” amount for a respective combination of dimension values (e.g., “Widget A”, “2005 Q2”, and “North America”, respectively).”; fig. 4B, para. [0064]-[0065], [0075]); and
executing the data operations reflecting the predetermined chunking scheme over the subject dataset to generate a result set (i.e. “For large sets of subject data 101 stored in the subject database 118, a query response time 109 to return a result set 108 can be long (e.g., several minutes to hours).”; fig. 1C, para. [0034], [0038], [0041]. Further, i.e. “the generated aggregate physical plan can be an aggregate database statement comprising certain subject database statements conforming to a query language that can be executed by a distributed data query engine on a subject database to return an aggregate result set to be received by the aggregate generator 132 (see step 472)”; fig. 4, [0068]).
However, it is noted that the prior art of Gerweck does not explicitly teach “prior to accessing the subject dataset, generating one or more data operations from the data statements;”
On the other hand, in the same field of endeavor, Wilding teaches prior to accessing the subject dataset (i.e. “the statements identified by outlier filter 170 can be subject to further analysis before operation.”; fig. 1, para. [0022]),
generating one or more data operations from the data statements (i.e. “Client threads (e.g., 610, 612, 618) operate to perform various operations including execution of database statements.”; figs. 6-7, para. [0034]-[0035]; Examiner note: the one or more data operations is interpreted as the operations performed by the client threads (e.g., 610, 612, 618); it is known that prior to performing an operation, the operation must be generated/created. Further, i.e. “In the example of FIG. 7, elements (having identifiers and associated statistics) 710 can be utilized to supply both hash table 730 and array 760. In one embodiment, hash table 730 utilizes hash function 735 to map captured statements from the client threads to a corresponding entry in hash table 730. Hashing can be performed based on the statistical value (e.g., “141”, “14”) stored in the element.”; fig. 7, [0037]; Examiner note: where the elements are associated with the client threads);
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Wilding, which teaches techniques for monitoring database statements to identify outlier statements, into the prior art of Gerweck, which teaches systems, methods, and computer program products for dynamic aggregate generation and updating for high performance querying of large datasets. Additionally, this can improve query performance when new queries are matched to the cached query results.
The motivation for doing so would be to have a database statement monitoring architecture because it can provide deep visibility and proactive management (Wilding, para. [0018]).
As per claim 3, Gerweck and Wilding teach all the limitations as discussed in claim 1 above.
Additionally, Gerweck teaches wherein the data operations are executed at one or more query engines (i.e. “The database statements 104 configured for the virtual multidimensional data model 124 can be received by the query planner 120 to produce associated instances of subject database statements 107 that can be issued to the distributed data query engine 117.”; figs. 1A-C, para. [0033]. Further, i.e. “Various query languages and query engines (e.g., Impala, SparkSQL, Tez, Drill, Presto, etc.) are available to users for querying data stored in data warehouses and/or distributed file systems.”; para. [0003]), and
wherein the client-specific data is inaccessible by the query engines (i.e. “In other cases, when the number of dependent attributes is small (e.g., relative to a threshold in the rule 318), dependent attributes that have not been detected in incoming queries might be included in the selected aggregate.”; para. [0057]; Examiner note: the limitation that the client-specific data is inaccessible by the query engines is interpreted as the dependent attributes that have not been detected in incoming queries).
As per claim 4, Gerweck and Wilding teach all the limitations as discussed in claim 1 above.
Additionally, Gerweck teaches further comprising: receiving a set of dataset metadata associated with the subject dataset (i.e. “a user receiving metadata associated with one or more virtual cubes comprising a multidimensional data model of a subject database the user would like to analyze (see step 162).”; figs. 1C, 4A, para. [0041]. Further, i.e. “the raw table data 406.sub.1 comprises rows of data (e.g., comma-delimited log entries) that span a temporal period 401 (e.g., July 2005), and the raw table data 406.sub.2 comprises rows of data that span the temporal period 401 (e.g., July 2005) and a temporal period 402 (e.g., August 2005).”; para. [0060]; Examiner note: the set of dataset metadata can be interpreted as the raw table data 406.sub.1);
expanding the dataset metadata into a set of expanded dataset metadata (i.e. “For example, certain attributes can be included in the selected aggregate based in part on certain information (e.g., the attributes will not increase the aggregate row count) derived from the virtual multidimensional data model 124.”; figs. 3A-B, 4A, para. [0057], [0060]-[0061]; Examiner note: the set of expanded dataset metadata can be interpreted as the raw table data 406.sub.2); and
consulting the expanded dataset metadata to perform at least one of, determining the at least one chunking scheme, or generating the one or more data operations (i.e. “In the second aggregate view 420, the metadata comprising the second view schema 422 can included certain view attributes describing, in part, the T1 partition 414, the T2 partition 424, and the association (e.g., logical mapping 415) between the partitions.”; figs. 4A-B, para. [0060]-[0061], [0064]).
As per claim 5, Gerweck and Wilding teach all the limitations as discussed in claim 1 above.
Additionally, Gerweck teaches further comprising: analyzing the data statements to determine one or more statement attributes associated with the data statements (i.e. “The incoming queries can be analyzed (e.g., parsed) to determine the various query attributes associated with the incoming queries (see step 354).”; para. [0055], [0057], [0068]); and
applying a portion of the client-specific data to at least one of the statement attributes to perform at least one of, determining the at least one chunking scheme, or generating the one or more data operations (i.e. “a first aggregate view 440 can be generated from the batch 431 of the raw table data 436.sub.1 to comprise a first view schema 442 and a B1 partition 444. Specifically, the data comprising the B1 partition 444 includes the aggregate of the “orderqty” for each “productkey” for the batch 431 of runids mapped to the quarter “2005Q2” (e.g., as specified by the virtual cube attributes, by the user, etc.). ”; figs. 4A-B, para. [0065]).
As per claim 6, Gerweck and Wilding teach all the limitations as discussed in claim 1 above.
Additionally, Gerweck teaches further comprising: generating one or more performance estimates associated with the data statements (i.e. “The performance gain (e.g., as compared to the basis attributes of the incoming queries) of such recommended aggregates can be estimated (see step 360).”; figs. 3A-B, para. [0050], [0056]-[0057]); and
applying a portion of the client-specific data to at least one of the performance estimates to perform at least one of, determining the at least one chunking scheme, or generating the one or more data operations (i.e. “For example, the performance gain might be compared to a threshold in the rules 318 to determine adequacy. If the estimated performance gain is adequate (see “Yes” path of decision 362), then the dependent attributes for the selected aggregate can be determined (see step 364).”; figs. 3A-B, para. [0057]).
As per claim 7, Gerweck and Wilding teach all the limitations as discussed in claim 6 above.
Additionally, Gerweck teaches wherein at least one of the performance estimates is based at least in part on a set of performance data (i.e. “In some cases, certain performance metrics and/or statistics (e.g., performance statistics 314) associated with, for example, certain historical and/or simulated aggregates, can be used to estimate the performance of the recommended aggregates.”; para. [0056]; Examiner note: the set of performance data is interpreted as the performance metrics and/or statistics (e.g., performance statistics 314)).
As per claim 8, Gerweck and Wilding teach all the limitations as discussed in claim 7 above.
Additionally, Gerweck teaches wherein the performance data comprises at least one of, a set of historical data operations performance statistics, or a set of historical data operations behavioral characteristics (i.e. “For example, the aggregate logical plan can be received in response to an aggregate selector identifying an aggregate based on historical queries, predictions that the aggregate can improve query performance, user specification, and other factors.”; figs. 3A-B, 4, para. [0048]-[0049], [0068]; Examiner note: the set of historical data operations performance statistics or a set of historical data operations behavioral characteristics is interpreted as the historical queries, predictions …).
As per claim 9, Gerweck and Wilding teach all the limitations as discussed in claim 1 above.
Additionally, Gerweck teaches further comprising merging two or more results from the data operations into the result set (i.e. “the generated aggregate physical plan can be an aggregate database statement comprising certain subject database statements conforming to a query language that can be executed by a distributed data query engine on a subject database to return an aggregate result set to be received by the aggregate generator 132 (see step 472).”; figs. 1B-C, para. [0038], [0068]; Examiner note: where aggregating data is known as a form of merging data).
As per claim 10, Gerweck and Wilding teach all the limitations as discussed in claim 1 above.
Additionally, Gerweck teaches wherein the data operations are executed in accordance with one or more execution directives, the execution directives indicating that one or more of the data operations be executed in parallel, in sequence, asynchronously, or synchronously (i.e. “As shown in FIG. 6A, the grace period partitioning technique 6A00 comprises four aggregate views, aggregate view 604, aggregate view 605, aggregate view 606, and aggregate view 607, generated (e.g., by the herein disclosed techniques) in sequence at progressively later times, T.sub.n+4, T.sub.n+5, T.sub.n+6, and T.sub.n+7, respectively.”; figs. 3A-B, 6A, para. [0050], [0072], [0075]).
As per claim 11, Gerweck teaches a computer readable medium (i.e. “computer readable/usable medium”; figs. 8A-B, para. [0085], [0089]),
including computer program code embodied in a non-transitory computer readable storage medium (i.e. “non-transitory medium from which a computer can read data.”; para. [0090]),
the non-transitory computer readable medium having stored thereon a sequence of instructions which (i.e. “execution of the sequences of instructions to practice the disclosure is performed by a single instance of the computer system 8A00.”; para. [0091]),
when stored in memory and executed by one or more processors causes the one or more processors (i.e. “The term “computer readable medium” or “computer usable medium” as used herein refers to any medium that participates in providing instructions to processor 807 for execution.”; para. [0089])
to perform a set of acts for chunking data statements (i.e. “generate various instances of database statements 104 to be interpreted on associated datasets.”; fig. 3A, para. [0033], [0051]; Examiner note: the chunking data statements is interpreted as the database statements)
based at least in part on a set of client-specific information (i.e. “a set of rules 318”; fig. 3A, para. [0051]; Examiner note: the set of client-specific information is interpreted as the set of rules)
in a client data statement processing layer (i.e. “a user 102 (e.g., business intelligence analyst) interacting with certain instances of analysis tools 103 (e.g., Tableau, Excel, QlikView, etc.)”; fig. 2, para. [0033]; Examiner note: the client data statement processing layer is interpreted as the analysis tools), the acts comprising:
receiving one or more data statements (i.e. “The database statements 104 configured for the virtual multidimensional data model 124 can be received by the query planner 120 …”; para. [0007], [0033])
issued by at least one client (i.e. “a computing environment 201 comprises one or more instances of a client device 204 (e.g., a desktop computer)”; fig. 2, para. [0044]),
the data statements issued by the client to operate over a subject dataset for querying one or more attributes of the subject dataset (i.e. “the query monitor 302 might intercept any or all of the database statements 104 to build a query history 312 comprising the query attributes 322 associated with the stream of incoming queries.”; fig. 3A, para. [0048]-[0051]; Examiner note: the one or more attributes of the subject dataset is interpreted as the query attributes. Further, i.e. “a subject database the user would like to analyze”; para. [0024], [0028], [0038], [0041]; Examiner note: the subject dataset is interpreted as the subject database);
applying at least a portion of a set of client-specific data to the data statements (i.e. “For example, modifications to rule values, rule logic, rule application (e.g., when to apply a rule, the order of applying rules, etc.), and/or other rule attributes are possible.”; figs. 3A-B, para. [0051]; Examiner note: the applying at least a portion of a set of client-specific data is interpreted as the modifications to rule values, rule logic, rule application and/or other rule attributes; where the client-specific data can be interpreted as the rule values, rule logic, rule application and/or other rule attributes)
to determine at least one chunking scheme (i.e. “The aggregate simulator 306 can also use the rules 318 when determining and/or simulating the recommended aggregates 326. In some cases, certain instances of the rules 318 can be determined, in part, by the user 102.”; fig. 3A-B, para. [0051]; Examiner note: the determine at least one chunking scheme is interpreted as the determining and/or simulating the recommended aggregates),
the client specific data further comprising one or more statement chunking rules and a set of performance data, based on the one or more attributes (i.e. “When an aggregate has been selected, the aggregate selection logic 310 can further construct the aggregate logical plans (e.g., Aggregate1 logical plan 342.sub.1, . . . , AggregateN logical plan 342.sub.N) associated with each selected aggregate for processing according to the herein disclosed techniques.”; fig. 3A, para. [0034], [0052]; Examiner note: the one or more statement chunking rules is interpreted as the aggregate logical plans. Further, i.e. “For example, the rules 318 might comprise a set of values (e.g., thresholds) to compare to a corresponding set of attributes (e.g., performance metrics, record size, distinct count, fractional reduction, performance score, redundancy, relative performance gain, etc.) for the recommended aggregates 326 to determine which of the recommended aggregates 326 should be identified as selected aggregates 320.”; para. [0050]-[0051]; Examiner note: the set of performance data is interpreted as the set of attributes), and
referencing at least one dimension in the subject dataset (i.e. “As another example, the AggregateN logical plan 342.sub.N refers to a “distinct-count” type aggregate of the “cust ID” virtual cube attribute (e.g., a dimension in the virtual cube)”; fig. 3A, para. [0034], [0052]. Further, i.e. “The virtual order quantity per month cube 428 is defined by the dimensions “Product Name”, “Order YearMonth”, and “Other Dimension” (e.g., geographic region), with each cell holding an “Order Quantity” amount for a respective combination of dimension values (e.g., “Widget A”, “July 2005”, and “North America”, respectively).”; para. [0060], [0064]);
accessing performance data (i.e. “the aggregate selection technique 3B00 can commence with the aggregate selector 122 capturing incoming queries configured to operate on a subject database (see step 352).”; fig. 3B, para. [0055])
to generate performance estimates for a set of candidate chunking schemes from the at least one chunking scheme (i.e. “The performance gain (e.g., as compared to the basis attributes of the incoming queries) of such recommended aggregates can be estimated (see step 360).”; fig. 3B, para. [0053], [0056]-[0057]; Examiner note: the generate performance estimates for a set of candidate chunking schemes from the at least one chunking scheme is interpreted as the performance gain of such recommended aggregates can be estimated. Further, i.e. “a set of aggregate selection logic 310.”; fig. 3A, para. [0051]; Examiner note: the set of candidate chunking schemes from the at least one chunking scheme is interpreted as the set of aggregate selection logic. Furthermore, i.e. “the aggregate selection logic 310 can further construct the aggregate logical plans (e.g., Aggregate1 logical plan 342.sub.1, . . . , AggregateN logical plan 342.sub.N)”; fig. 3A, para. [0052]);
selecting a predetermined chunking scheme from the set of candidate chunking schemes based on the performance estimates prior to applying the chunking scheme to the subject dataset (i.e. “The aggregate logical plan can be delivered and/or scheduled for delivery for use by systems implementing dynamic aggregate generation and updating for high performance querying of large datasets (see step 368).”; fig. 3B, para. [0051], [0057]; Examiner note: the selecting a predetermined chunking scheme from the set of candidate chunking schemes is interpreted as the aggregate logical plan can be delivered. Further, i.e. “determine one or more instances of selected aggregates 320”; para. [0051]. Further, i.e. “the aggregate logical plans can be based in part on existing aggregate logical plans associated with a prior update of the subject aggregate.”; para. [0052], [0075], [0080]);
the data operations generated based at least in part on the referenced dimension, the chunking scheme and on the performance estimates (“The virtual order quantity per quarter cube 458 is defined by the dimensions “Product Name”, “Order YearQuarter”, and “Other Dimension” (e.g., geographic region), with each cell holding a “Order Quantity” amount for a respective combination of dimension values (e.g., “Widget A”, “2005 Q2”, and “North America”, respectively).”; fig. 4B, para. [0064]-[0065], [0075]); and
executing the data operations reflecting the predetermined chunking scheme over the subject dataset to generate a result set (i.e. “For large sets of subject data 101 stored in the subject database 118, a query response time 109 to return a result set 108 can be long (e.g., several minutes to hours).”; fig. 1C, para. [0034], [0038], [0041]. Further, i.e. “the generated aggregate physical plan can be an aggregate database statement comprising certain subject database statements conforming to a query language that can be executed by a distributed data query engine on a subject database to return an aggregate result set to be received by the aggregate generator 132 (see step 472)”; fig. 4, [0068]).
However, it is noted that the prior art of Gerweck does not explicitly teach “prior to accessing the subject dataset, generating one or more data operations from the data statements;”
On the other hand, in the same field of endeavor, Wilding teaches prior to accessing the subject dataset (i.e. “the statements identified by outlier filter 170 can be subject to further analysis before operation.”; fig. 1, para. [0022]),
generating one or more data operations from the data statements (i.e. “Client threads (e.g., 610, 612, 618) operate to perform various operations including execution of database statements.”; figs. 6-7, para. [0034]-[0035]; Examiner note: the one or more data operations is interpreted as the operations performed by the client threads (e.g., 610, 612, 618); it is known that prior to performing an operation, the operation must be generated/created. Further, i.e. “In the example of FIG. 7, elements (having identifiers and associated statistics) 710 can be utilized to supply both hash table 730 and array 760. In one embodiment, hash table 730 utilizes hash function 735 to map captured statements from the client threads to a corresponding entry in hash table 730. Hashing can be performed based on the statistical value (e.g., “141”, “14”) stored in the element.”; fig. 7, [0037]; Examiner note: where the elements are associated with the client threads);
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Wilding, which teaches techniques for monitoring database statements to identify outlier statements, into the prior art of Gerweck, which teaches systems, methods, and computer program products for dynamic aggregate generation and updating for high performance querying of large datasets. Additionally, this can improve query performance when new queries are matched to the cached query results.
The motivation for doing so would be to have a database statement monitoring architecture because it can provide deep visibility and proactive management (Wilding, para. [0018]).
As per claim 13, Gerweck and Wilding teach all the limitations as discussed in claim 11 above.
Additionally, Gerweck teaches wherein the data operations are executed at one or more query engines (i.e. “The database statements 104 configured for the virtual multidimensional data model 124 can be received by the query planner 120 to produce associated instances of subject database statements 107 that can be issued to the distributed data query engine 117.”; figs. 1A-C, para. [0033]. Further, i.e. “Various query languages and query engines (e.g., Impala, SparkSQL, Tez, Drill, Presto, etc.) are available to users for querying data stored in data warehouses and/or distributed file systems.”; para. [0003]), and
wherein the client-specific data is inaccessible by the query engines (i.e. “In other cases, when the number of dependent attributes is small (e.g., relative to a threshold in the rule 318), dependent attributes that have not been detected in incoming queries might be included in the selected aggregate.”; para. [0057]; Examiner note: the limitation that the client-specific data is inaccessible by the query engines is interpreted as the dependent attributes that have not been detected in incoming queries).
As per claim 14, Gerweck and Wilding teach all the limitations as discussed in claim 11 above.
Additionally, Gerweck teaches further comprising instructions which, when stored in memory and executed by the one or more processors causes the one or more processors to perform acts of: receiving a set of dataset metadata associated with the subject dataset (i.e. “a user receiving metadata associated with one or more virtual cubes comprising a multidimensional data model of a subject database the user would like to analyze (see step 162).”; figs. 1C, 4A, para. [0041]. Further, i.e. “the raw table data 406.sub.1 comprises rows of data (e.g., comma-delimited log entries) that span a temporal period 401 (e.g., July 2005), and the raw table data 406.sub.2 comprises rows of data that span the temporal period 401 (e.g., July 2005) and a temporal period 402 (e.g., August 2005).”; para. [0060]; Examiner note: the set of dataset metadata can be interpreted as the raw table data 406.sub.1);
expanding the dataset metadata into a set of expanded dataset metadata (i.e. “For example, certain attributes can be included in the selected aggregate based in part on certain information (e.g., the attributes will not increase the aggregate row count) derived from the virtual multidimensional data model 124.”; figs. 3A-B, 4A, para. [0057], [0060]-[0061]; Examiner note: the set of expanded dataset metadata can be interpreted as the raw table data 406.sub.2); and
consulting the expanded dataset metadata to perform at least one of, determining the at least one chunking scheme, or generating the one or more data operations (i.e. “In the second aggregate view 420, the metadata comprising the second view schema 422 can included certain view attributes describing, in part, the T1 partition 414, the T2 partition 424, and the association (e.g., logical mapping 415) between the partitions.”; figs. 4A-B, para. [0060]-[0061], [0064]).
As per claim 15, Gerweck and Wilding teach all the limitations as discussed in claim 11 above.
Additionally, Gerweck teaches further comprising instructions which, when stored in memory and executed by the one or more processors causes the one or more processors to perform acts of: analyzing the data statements to determine one or more statement attributes associated with the data statements (i.e. “The incoming queries can be analyzed (e.g., parsed) to determine the various query attributes associated with the incoming queries (see step 354).”; para. [0055], [0057], [0068]); and
applying a portion of the client-specific data to at least one of the statement attributes to perform at least one of, determining the at least one chunking scheme, or generating the one or more data operations (i.e. “a first aggregate view 440 can be generated from the batch 431 of the raw table data 436.sub.1 to comprise a first view schema 442 and a B1 partition 444. Specifically, the data comprising the B1 partition 444 includes the aggregate of the “orderqty” for each “productkey” for the batch 431 of runids mapped to the quarter “2005Q2” (e.g., as specified by the virtual cube attributes, by the user, etc.). ”; figs. 4A-B, para. [0065]).
As per claim 16, Gerweck and Wilding teach all the limitations as discussed in claim 11 above.
Additionally, Gerweck teaches further comprising instructions which, when stored in memory and executed by the one or more processors causes the one or more processors to perform acts of: generating one or more performance estimates associated with the data statements (i.e. “The performance gain (e.g., as compared to the basis attributes of the incoming queries) of such recommended aggregates can be estimated (see step 360).”; figs. 3A-B, para. [0050], [0056]-[0057]); and
applying a portion of the client-specific data to at least one of the performance estimates to perform at least one of, determining the at least one chunking scheme, or generating the one or more data operations (i.e. “For example, the performance gain might be compared to a threshold in the rules 318 to determine adequacy. If the estimated performance gain is adequate (see “Yes” path of decision 362), then the dependent attributes for the selected aggregate can be determined (see step 364).”; figs. 3A-B, para. [0057]).
As per claim 17, Gerweck and Wilding teach all the limitations as discussed in claim 16 above.
Additionally, Gerweck teaches wherein at least one of the performance estimates is based at least in part on a set of performance data (i.e. “In some cases, certain performance metrics and/or statistics (e.g., performance statistics 314) associated with, for example, certain historical and/or simulated aggregates, can be used to estimate the performance of the recommended aggregates.”; para. [0056]; Examiner note: the set of performance data is interpreted as the performance metrics and/or statistics (e.g., performance statistics 314)).
As per claim 18, Gerweck and Wilding teach all the limitations as discussed in claim 17 above.
Additionally, Gerweck teaches wherein the performance data comprises at least one of, a set of historical data operations performance statistics, or a set of historical data operations behavioral characteristics (i.e. “For example, the aggregate logical plan can be received in response to an aggregate selector identifying an aggregate based on historical queries, predictions that the aggregate can improve query performance, user specification, and other factors.”; figs. 3A-B, 4, para. [0048]-[0049], [0068]; Examiner note: the set of historical data operations performance statistics or a set of historical data operations behavioral characteristics is interpreted as the historical queries, predictions …).
As per claim 19, Gerweck teaches a system (i.e. “systems, methods, and in computer program products for dynamic aggregate generation and updating for high performance querying of large datasets.”; Abstract, fig. 1A-C, para. [0034]-[0035])
for chunking data statements (i.e. “generate various instances of database statements 104 to be interpreted on associated datasets.”; fig. 3A, para. [0033], [0051]; Examiner note: the chunking data statements is interpreted as the database statements)
based at least in part on a set of client-specific information (i.e. “a set of rules 318”; fig. 3A, para. [0051]; Examiner note: the set of client-specific information is interpreted as the set of rules)
in a client data statement processing layer (i.e. “a user 102 (e.g., business intelligence analyst) interacting with certain instances of analysis tools 103 (e.g., Tableau, Excel, QlikView, etc.)”; fig. 2, para. [0033]; Examiner note: the client data statement processing layer is interpreted as the analysis tools), the system comprising:
a storage medium (i.e. “computer readable/usable medium”; figs. 8A-B, para. [0085], [0089])
having stored thereon a sequence of instructions (i.e. “execution of the sequences of instructions to practice the disclosure is performed by a single instance of the computer system 8A00.”; para. [0091]); and
one or more processors that execute the instructions to cause the one or more processors (i.e. “The term “computer readable medium” or “computer usable medium” as used herein refers to any medium that participates in providing instructions to processor 807 for execution.”; para. [0089])
to perform a set of acts (i.e. “generate various instances of database statements 104 to be interpreted on associated datasets.”; fig. 3A, para. [0033], [0051]; Examiner note: the chunking data statements is interpreted as the database statements), the acts comprising:
receiving one or more data statements (i.e. “The database statements 104 configured for the virtual multidimensional data model 124 can be received by the query planner 120 …”; para. [0007], [0033])
issued by at least one client (i.e. “a computing environment 201 comprises one or more instances of a client device 204 (e.g., a desktop computer)”; fig. 2, para. [0044]),
the data statements issued by the client to operate over a subject dataset for querying one or more attributes of the subject dataset (i.e. “the query monitor 302 might intercept any or all of the database statements 104 to build a query history 312 comprising the query attributes 322 associated with the stream of incoming queries.”; fig. 3A, para. [0048]-[0051]; Examiner note: the one or more attributes of the subject dataset is interpreted as the query attributes. Further, i.e. “a subject database the user would like to analyze”; para. [0024], [0028], [0038], [0041]; Examiner note: the subject dataset is interpreted as the subject database);
applying at least a portion of a set of client-specific data to the data statements (i.e. “For example, modifications to rule values, rule logic, rule application (e.g., when to apply a rule, the order of applying rules, etc.), and/or other rule attributes are possible.”; figs. 3A-B, para. [0051]; Examiner note: the applying at least a portion of a set of client-specific data is interpreted as the modifications to rule values, rule logic, rule application and/or other rule attributes; where the client-specific data can be interpreted as the rule values, rule logic, rule application and/or other rule attributes)
to determine at least one chunking scheme (i.e. “The aggregate simulator 306 can also use the rules 318 when determining and/or simulating the recommended aggregates 326. In some cases, certain instances of the rules 318 can be determined, in part, by the user 102.”; fig. 3A-B, para. [0051]; Examiner note: the determine at least one chunking scheme is interpreted as the determining and/or simulating the recommended aggregates),
the client specific data further comprising one or more statement chunking rules and a set of performance data, based on the one or more attributes (i.e. “When an aggregate has been selected, the aggregate selection logic 310 can further construct the aggregate logical plans (e.g., Aggregate1 logical plan 342.sub.1, . . . , AggregateN logical plan 342.sub.N) associated with each selected aggregate for processing according to the herein disclosed techniques.”; fig. 3A, para. [0034], [0052]; Examiner note: the one or more statement chunking rules is interpreted as the aggregate logical plans. Further, i.e. “For example, the rules 318 might comprise a set of values (e.g., thresholds) to compare to a corresponding set of attributes (e.g., performance metrics, record size, distinct count, fractional reduction, performance score, redundancy, relative performance gain, etc.) for the recommended aggregates 326 to determine which of the recommended aggregates 326 should be identified as selected aggregates 320.”; para. [0050]-[0051]; Examiner note: the set of performance data is interpreted as the set of attributes), and
referencing at least one dimension in the subject dataset (i.e. “As another example, the AggregateN logical plan 342.sub.N refers to a “distinct-count” type aggregate of the “cust ID” virtual cube attribute (e.g., a dimension in the virtual cube)”; fig. 3A, para. [0034], [0052]. Further, i.e. “The virtual order quantity per month cube 428 is defined by the dimensions “Product Name”, “Order YearMonth”, and “Other Dimension” (e.g., geographic region), with each cell holding an “Order Quantity” amount for a respective combination of dimension values (e.g., “Widget A”, “July 2005”, and “North America”, respectively).”; para. [0060], [0064]);
accessing performance data (i.e. “the aggregate selection technique 3B00 can commence with the aggregate selector 122 capturing incoming queries configured to operate on a subject database (see step 352).”; fig. 3B, para. [0055])
to generate performance estimates for a set of candidate chunking schemes from the at least one chunking scheme (i.e. “The performance gain (e.g., as compared to the basis attributes of the incoming queries) of such recommended aggregates can be estimated (see step 360).”; fig. 3B, para. [0053], [0056]-[0057]; Examiner note: the generate performance estimates for a set of candidate chunking schemes from the at least one chunking scheme is interpreted as the performance gain of such recommended aggregates can be estimated. Further, i.e. “a set of aggregate selection logic 310.”; fig. 3A, para. [0051]; Examiner note: the set of candidate chunking schemes from the at least one chunking scheme is interpreted as the set of aggregate selection logic. Furthermore, i.e. “the aggregate selection logic 310 can further construct the aggregate logical plans (e.g., Aggregate1 logical plan 342.sub.1, . . . , AggregateN logical plan 342.sub.N)”; fig. 3A, para. [0052]);
selecting a predetermined chunking scheme from the set of candidate chunking schemes based on the performance estimates prior to applying the chunking scheme to the subject dataset (i.e. “The aggregate logical plan can be delivered and/or scheduled for delivery for use by systems implementing dynamic aggregate generation and updating for high performance querying of large datasets (see step 368).”; fig. 3B, para. [0051], [0057]; Examiner note: the selecting a predetermined chunking scheme from the set of candidate chunking schemes is interpreted as the aggregate logical plan can be delivered. Further, i.e. “determine one or more instances of selected aggregates 320”; para. [0051]. Further, i.e. “the aggregate logical plans can be based in part on existing aggregate logical plans associated with a prior update of the subject aggregate.”; para. [0052], [0075], [0080]);
the data operations generated based at least in part on the referenced dimension, the chunking scheme and on the performance estimates (“The virtual order quantity per quarter cube 458 is defined by the dimensions “Product Name”, “Order YearQuarter”, and “Other Dimension” (e.g., geographic region), with each cell holding a “Order Quantity” amount for a respective combination of dimension values (e.g., “Widget A”, “2005 Q2”, and “North America”, respectively).”; fig. 4B, para. [0064]-[0065], [0075]); and
executing the data operations reflecting the predetermined chunking scheme over the subject dataset to generate a result set (i.e. “For large sets of subject data 101 stored in the subject database 118, a query response time 109 to return a result set 108 can be long (e.g., several minutes to hours).”; fig. 1C, para. [0034], [0038], [0041]. Further, i.e. “the generated aggregate physical plan can be an aggregate database statement comprising certain subject database statements conforming to a query language that can be executed by a distributed data query engine on a subject database to return an aggregate result set to be received by the aggregate generator 132 (see step 472)”; fig. 4, [0068]).
However, it is noted that the prior art of Gerweck does not explicitly teach “prior to accessing the subject dataset, generating one or more data operations from the data statements;”.
On the other hand, in the same field of endeavor, Wilding teaches prior to accessing the subject dataset (i.e. “the statements identified by outlier filter 170 can be subject to further analysis before operation.”; fig. 1, para. [0022]),
generating one or more data operations from the data statements (i.e. “Client threads (e.g., 610, 612, 618) operate to perform various operations including execution of database statements.”; figs. 6-7, para. [0034]-[0035]; Examiner note: the one or more data operations is interpreted as the operations performed by the client threads (e.g., 610, 612, 618); it is known that prior to performing an operation, the operation must be generated/created. Further, i.e. “In the example of FIG. 7, elements (having identifiers and associated statistics) 710 can be utilized to supply both hash table 730 and array 760. In one embodiment, hash table 730 utilizes hash function 735 to map captured statements from the client threads to a corresponding entry in hash table 730. Hashing can be performed based on the statistical value (e.g., “141”, “14”) stored in the element.”; fig. 7, [0037]; Examiner note: where the elements are associated with the client threads);
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Wilding, which teaches techniques for monitoring database statements to identify outlier statements, into the prior art of Gerweck, which teaches systems, methods, and computer program products for dynamic aggregate generation and updating for high performance querying of large datasets. Additionally, this can improve query performance when new queries are matched to the cached query results.
The motivation for doing so would be to have a database statement monitoring architecture because it can provide deep visibility and proactive management (Wilding, para. [0018]).
Prior Art of Record
9. The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Lee et al. (US 20250156458 A1), teaches smart chunking techniques for more efficient retrieval.
Wilding et al. (US 20180046665 A1), teaches techniques for detecting intrusion attempts via SQL injection by detecting statistically improbable SQL statements.
Conclusion
10. Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANTONIO CAIA DO whose telephone number is (469)295-9251. The examiner can normally be reached on Monday - Friday / 06:30 to 16:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Amy Ng, can be reached at (571) 270-1698. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ANTONIO J CAIA DO/
Examiner, Art Unit 2164
/MARK E HERSHLEY/Primary Examiner, Art Unit 2164