Last updated: April 19, 2026
Application No. 18/060,632
DATA SAMPLING METHOD THAT MAINTAINS ACCURACY FOR DATA ANALYSIS

Non-Final OA §101 §103
Filed: Dec 01, 2022
Examiner: CAO, PHUONG THAO
Art Unit: 2164
Tech Center: 2100 (Computer Architecture & Software)
Assignee: International Business Machines Corporation
OA Round: 1 (Non-Final)

Interview optional: +13.9% interview lift. This examiner has a relatively high allow rate; a written response may suffice.
Based on 760 resolved cases, 2023–2026
Examiner Intelligence

CAO, PHUONG THAO
Career Allow Rate: 78% (above average); 592 granted / 760 resolved; +22.9% vs TC avg
Interview Lift: +13.9% (moderate), resolved cases with interview
Avg Prosecution (typical timeline): 3y 0m; 22 currently pending
Total Applications (career history): 782 across all art units
Statute-Specific Performance

§101: 14.8% (-25.2% vs TC avg)
§103: 37.6% (-2.4% vs TC avg)
§102: 15.8% (-24.2% vs TC avg)
§112: 21.9% (-18.1% vs TC avg)
Based on career data from 760 resolved cases
Office Action

DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This action is in response to the application filed on 12/01/2022.
No priority date is claimed. Therefore, the effective filing date is 12/01/2022.
Claims 1-20 are pending.

Information Disclosure Statement

The Information Disclosure Statement (IDS) filed by Applicant on 12/01/2022 has been considered.  A copy of the considered IDS is enclosed with this Office action.

Claim Objections

Claims 19-20 are objected to because of the following informalities: 

Regarding claim 19, the paragraph “one or more processors…to perform operations comprising” in lines 4-6 should be removed.

Dependent claim 20 is objected to as incorporating the informality of objected independent claim 19, upon which it depends.

Appropriate correction is required.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea of determining whether to update a sampling database. 
The claims recite an abstract idea of determining whether to update a sampling database based on broadly recited steps of comparing, determining, initiating an update, etc., which can be performed in the human mind and/or with the aid of pencil and paper and which fall within the mental processes grouping of abstract ideas. This judicial exception is not integrated into a practical application because the additional elements, including generic computer components and common computer functionality (e.g., accessing, storing, displaying, etc.) and/or insignificant extra-solution activity (e.g., mere data gathering) for implementing the abstract idea, are not sufficient to integrate the abstract idea into a practical application. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements include only generic/common computer components (e.g., memory, processor, program instructions, etc.) and generic/common computer functions (e.g., accessing, storing, displaying, etc.) and/or insignificant extra-solution activity (e.g., mere data gathering), which are not sufficient to amount to significantly more than the recited abstract idea.
Abstract idea analysis as follows:
Step 1:
According to the first part of the analysis, in the instant claims, claims 1-9 are directed to a method (i.e., a process), claims 10-18 are directed to a system (i.e., a machine), and claims 19-20 are directed to a computer program product comprising a computer readable storage medium having program instructions embodied therewith (i.e., article of manufacture). Thus, each of the claims 1-20 falls within one of the four statutory categories (i.e. process, machine, manufacture or composition of matter).  

Step 2a Prong 1 (claims 1, 10 and 19):
The following limitations recited in claims 1, 10 and 19 are abstract ideas that fall under mental processes:
periodically comparing the statistics of the original database and the statistics of the sampling database to determine whether the sampling database is within a predetermined threshold of the original database (these steps of comparing and determining as broadly recited can be mentally performed in the human mind or with the aid of pencil and paper through mental processes such as observation, evaluation, judgment and opinion performed with respect to data (i.e., statistics) associated with a dataset (i.e., original database) and a subset of the dataset (i.e., sampling database)); and 
in response to determining that the sampling database is not within the predetermined threshold of the original database, initiating an update to the sampling database (the step of initiating an update as broadly recited indicates a plan/intention which can be mentally performed in the human mind or with the aid of pencil and paper; the recitation of initiating an update to the sampling database without specifying how to initiate the update and/or how to update the sampling database (i.e., a subset of dataset) is directed to an abstract concept or mental step).
All the limitations above are mental steps that can be performed in the human mind or with the aid of pencil and paper.

Step 2a Prong 2 (Claims 1, 10 and 19): 
The following limitations in claims 1, 10 and 19 are additional elements: 
collecting statistics of an original database (the step of collecting statistics as broadly recited is directed to mere data gathering recited at high level of generality, as such being insignificant extra-solution activity); 
collecting statistics of a sampling database that comprises a subset of the original database (the step of collecting statistics as broadly recited is directed to mere data gathering recited at high level of generality, as such being insignificant extra-solution activity); 
periodically updating the statistics of the original database (the step of periodically updating the statistics as broadly recited is directed to mere data gathering recited at high level of generality, as such being insignificant extra-solution activity);
a memory having computer readable instructions (these elements are directed to generic computer components and mere instructions for implementing the abstract idea and/or insignificant extra-solution activity); 
one or more processors for executing the computer readable instructions, the computer readable instructions controlling the one or more processors to perform operations comprising (these elements are directed to generic computer components and mere instructions for implementing the abstract idea and/or insignificant extra-solution activity); and
a computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform operations comprising (these elements are directed to generic computer components and mere instructions for implementing the abstract idea and/or insignificant extra-solution activity).
These are a generic computer and/or generic computer components used to perform generic computer functions or insignificant extra-solution activity, such that they amount to no more than generic computer components or generic computer used to execute mere instructions for implementing the abstract idea. Accordingly, these additional elements do not integrate the abstract idea(s) into a practical application because they do not impose any meaningful limits on practicing the abstract idea(s).

Step 2b (Claims 1, 10 and 19): 
The following limitations in claims 1, 10 and 19 are additional elements: 
collecting statistics of an original database (the step of collecting statistics as broadly recited is directed to mere data gathering recited at high level of generality, as such being insignificant extra-solution activity or well-understood, routine, conventional activity); 
collecting statistics of a sampling database that comprises a subset of the original database (the step of collecting statistics as broadly recited is directed to mere data gathering recited at high level of generality, as such being insignificant extra-solution activity or well-understood, routine, conventional activity); 
periodically updating the statistics of the original database (the step of periodically updating the statistics as broadly recited is directed to mere data gathering recited at high level of generality, as such being insignificant extra-solution activity or well-understood, routine, conventional activity);
a memory having computer readable instructions (these elements are directed to generic computer components and mere instructions for implementing the abstract idea and/or insignificant extra-solution activity or well-understood, routine, conventional activity); 
one or more processors for executing the computer readable instructions, the computer readable instructions controlling the one or more processors to perform operations comprising (these elements are directed to generic computer components and mere instructions for implementing the abstract idea and/or insignificant extra-solution activity or well-understood, routine, conventional activity); and
a computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform operations comprising (these elements are directed to generic computer components and mere instructions for implementing the abstract idea and/or insignificant extra-solution activity or well-understood, routine, conventional activity).
These are a generic computer and/or generic computer components and/or insignificant extra-solution activity (i.e., well-understood, routine, conventional activity) for implementing or applying the abstract idea, and do not amount to significantly more, see MPEP 2106.05(d)(II).

Regarding claims 2 and 11, claims 2 and 11 depend on claims 1 and 10 respectively.  As such, claims 2 and 11 recite the abstract idea as presented in claims 1 and 10.
In addition, claims 2 and 11 include additional elements:
wherein the initiating an update to the sampling database comprises outputting a recommendation to recreate the sampling database based at least in part on the original database (this element specifying the step of initiating an update by outputting a recommendation to recreate, which is directed to mere outputting/displaying information, as such being insignificant extra-solution activity).
These are additional elements directed to insignificant extra-solution activity (i.e., well-understood, routine, conventional activity) for implementing the abstract idea, which do not integrate the judicial exception into a practical application and do not amount to significantly more, see MPEP 2106.05(d)(II). 

Regarding claims 3 and 12, claims 3 and 12 depend on claims 1 and 10 respectively.  As such, claims 3 and 12 recite the abstract idea as presented in claims 1 and 10.
In addition, claims 3 and 12 include additional elements:
wherein the initiating an update to the sampling database comprises automatically recreating the sampling database based at least in part on the original database (this element specifies the step of initiating an update by automatically recreating, wherein the step of automatically recreating, as broadly recited without specifying how, can be mentally performed in the human mind or with the aid of pencil and paper).
These are additional elements directed to mental step or abstract idea, which do not integrate the judicial exception into a practical application and do not amount to significantly more, see MPEP 2106.05(d)(II).

Regarding claims 4 and 13, claims 4 and 13 depend on claims 1 and 10 respectively.  As such, claims 4 and 13 recite the abstract idea as presented in claims 1 and 10.
In addition, claims 4 and 13 include additional elements:
wherein the initiating an update to the sampling database comprises one of outputting a recommendation to update a subset of the sampling database based at least in part on the original database and automatically updating a subset of the sampling database based at least in part on the original database (this element specifies that the step of initiating an update comprises one of outputting a recommendation or automatically updating, wherein the step of outputting a recommendation as broadly recited is directed to mere outputting/displaying information, as such being insignificant extra-solution activity, and wherein the step of automatically updating, as broadly recited without specifying how, can be mentally performed in the human mind or with the aid of pencil and paper).
These are additional elements directed to insignificant extra-solution activity (i.e., well-understood, routine, conventional activity) or mental process (i.e., abstract idea), which do not integrate the judicial exception into a practical application and do not amount to significantly more, see MPEP 2106.05(d)(II). 

Regarding claims 5 and 14, claims 5 and 14 depend on claims 1 and 10 respectively.  As such, claims 5 and 14 recite the abstract idea as presented in claims 1 and 10.
In addition, claims 5 and 14 include additional elements:
creating a sampling definition structure (this step of creating as broadly recited can be mentally performed in the human mind or with the aid of pencil and paper); and 
creating the sampling database based at least in part on the sampling definition structure and the original database (this step of creating as broadly recited can be mentally performed in the human mind or with the aid of pencil and paper; it should be noted that a database can be broadly interpreted as any set of data (e.g., a set of numbers/words, etc.)).
These are additional elements directed to mental steps (i.e., abstract idea), which do not integrate the judicial exception into a practical application and do not amount to significantly more, see MPEP 2106.05(d)(II).

Regarding claims 6 and 15, claims 6 and 15 depend on claims 5 and 14 respectively.  As such, claims 6 and 15 recite the abstract idea as presented in claims 5 and 14.
In addition, claims 6 and 15 include additional elements:
wherein the sampling definition structure comprises a sampling rate, the predetermined threshold, a frequency of the comparing, and an action performed in response to the initiating an update (these elements are directed to mere descriptive data/information).
These are additional elements directed to mere information/data, which do not integrate the judicial exception into a practical application and do not amount to significantly more, see MPEP 2106.05(d)(II). 
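
For illustration only, the four fields recited in claims 6 and 15 could be held in a structure like the following sketch. It is not taken from the application or the cited references; the names SamplingDefinition, UpdateAction, and compare_every_seconds, and the validation rules, are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class UpdateAction(Enum):
    # Hypothetical action names; the claims only recite "an action".
    RECOMMEND = "recommend"     # output a recommendation (cf. claims 2 and 11)
    AUTO_RECREATE = "recreate"  # recreate automatically (cf. claims 3 and 12)


@dataclass(frozen=True)
class SamplingDefinition:
    sampling_rate: float        # fraction of original rows kept, in (0, 1]
    threshold: float            # tolerated deviation between statistics
    compare_every_seconds: int  # frequency of the comparing
    action: UpdateAction        # action performed when an update is initiated

    def __post_init__(self):
        # Assumed sanity checks; the claims do not specify value ranges.
        if not 0.0 < self.sampling_rate <= 1.0:
            raise ValueError("sampling_rate must be in (0, 1]")
        if self.threshold < 0.0:
            raise ValueError("threshold must be non-negative")
        if self.compare_every_seconds <= 0:
            raise ValueError("compare_every_seconds must be positive")


definition = SamplingDefinition(0.1, 0.05, 3600, UpdateAction.RECOMMEND)
```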

Regarding claims 7 and 16, claims 7 and 16 depend on claims 1 and 10 respectively.  As such, claims 7 and 16 recite the abstract idea as presented in claims 1 and 10.
In addition, claims 7 and 16 include additional elements:
wherein the statistics of the original database and the statistics of the sampling database each comprise an overall record count, a cardinality of a column, and a histogram statistic for the column (these elements are directed to mere descriptive data/information).
These are additional elements directed to mere information/data, which do not integrate the judicial exception into a practical application and do not amount to significantly more, see MPEP 2106.05(d)(II).
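
As a hedged illustration of the three statistics recited in claims 7 and 16, the sketch below computes an overall record count, a column cardinality, and an equi-width histogram for one numeric column. The function name collect_statistics, the row format (a list of dictionaries), and the choice of equi-width buckets are assumptions; the claims do not specify how the statistics are computed.

```python
from collections import Counter


def collect_statistics(rows, column, bucket_width):
    """Return record count, column cardinality, and an equi-width histogram."""
    values = [row[column] for row in rows]
    # Map each value to the lower bound of its bucket, then count per bucket.
    histogram = Counter((v // bucket_width) * bucket_width for v in values)
    return {
        "record_count": len(rows),
        "cardinality": len(set(values)),  # number of distinct column values
        "histogram": dict(histogram),     # bucket lower bound -> row count
    }


rows = [{"age": 21}, {"age": 25}, {"age": 25}, {"age": 42}]
stats = collect_statistics(rows, "age", bucket_width=10)
```

The same function could be applied to both the original database and the sampling database so that the two results are directly comparable.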

Regarding claims 8 and 17, claims 8 and 17 depend on claims 1 and 10 respectively.  As such, claims 8 and 17 recite the abstract idea as presented in claims 1 and 10.
In addition, claims 8 and 17 include additional elements:
wherein the comparing comprises comparing a relative frequency of a range of values in a column (this step of comparing as broadly recited can be mentally performed in the human mind or with the aid of pencil and paper).
These are additional elements directed to mental process (i.e., abstract idea), which do not integrate the judicial exception into a practical application and do not amount to significantly more, see MPEP 2106.05(d)(II).
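
One possible reading of the comparison recited in claims 8 and 17 is sketched below: the relative frequency of a value range is computed for a column of the original database and of the sampling database, and the two are checked against a threshold. The helper names, the half-open range convention, and the use of an absolute difference are assumptions, not details from the application.

```python
def relative_frequency(values, low, high):
    """Fraction of values falling in the half-open range [low, high)."""
    if not values:
        return 0.0
    return sum(1 for v in values if low <= v < high) / len(values)


def within_threshold(original, sample, low, high, threshold):
    """True when the two relative frequencies differ by at most threshold."""
    gap = abs(relative_frequency(original, low, high)
              - relative_frequency(sample, low, high))
    return gap <= threshold


original = [21, 25, 25, 42, 47, 58, 63, 64]
sample = [25, 42, 64]
```

With these values, the range [20, 30) covers 3 of 8 original values (0.375) and 1 of 3 sampled values (about 0.333), so the sample passes a threshold of 0.05 but fails a threshold of 0.01.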

Regarding claims 9, 18 and 20, claims 9, 18 and 20 depend on claims 1, 10 and 19 respectively.  As such, claims 9, 18 and 20 recite the abstract idea as presented in claims 1, 10 and 19.
In addition, claims 9, 18 and 20 include additional elements:
wherein the updating the statistics of the original database is performed in response to an update to the original database and is based at least in part on a log of the original database (this step of updating the statistics as broadly recited is directed to mere data gathering recited at high level of generality, as such being insignificant extra-solution activity or well-understood, routine, conventional activity).
These are additional elements directed to insignificant extra-solution activity or well-understood, routine, conventional activity, which do not integrate the judicial exception into a practical application and do not amount to significantly more, see MPEP 2106.05(d)(II).
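
To make the limitation of claims 9, 18 and 20 concrete, the following sketch updates previously collected statistics from a change log rather than rescanning the table. The log format (a list of ("insert" | "delete", row) pairs), the stats layout, and the function name apply_log are hypothetical; the application's actual log structure is not described in this action.

```python
from collections import Counter


def apply_log(stats, log, column):
    """Update record count and per-value counts in place from log entries."""
    for op, row in log:
        value = row[column]
        if op == "insert":
            stats["record_count"] += 1
            stats["value_counts"][value] += 1
        elif op == "delete":
            stats["record_count"] -= 1
            stats["value_counts"][value] -= 1
            if stats["value_counts"][value] == 0:
                # Drop exhausted values so len() stays the true cardinality.
                del stats["value_counts"][value]
        else:
            raise ValueError(f"unknown log operation: {op}")
    return stats


stats = {"record_count": 3, "value_counts": Counter({"a": 2, "b": 1})}
log = [("insert", {"k": "c"}), ("delete", {"k": "b"}), ("insert", {"k": "a"})]
apply_log(stats, log, "k")
cardinality = len(stats["value_counts"])
```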

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
 
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.



Claims 1-20 (effective filing date 12/01/2022) are rejected under 35 U.S.C. 103 as being unpatentable over Burger (U.S. Publication No. 2009/0083215, Publication date 03/26/2009), in view of Lightstone et al. (U.S. Publication No. 2006/0112093, Publication date 05/25/2006), and further in view of Le et al. (U.S. Publication No. 2019/0095487, Publication date 03/28/2019).

As to claim 1, Burger teaches:
“A computer-implemented method” (see Burger, Abstract) comprising: 
“collecting statistics of an original database” (see Burger, [0043] for performing a statistics collection from a full scan, wherein a full scan represents an entire dataset or original database as recited); 
“collecting statistics of a sampling database that comprises a subset of the original database” (see Burger, [0043] for performing a statistics collection from a sampling, wherein a sampling represents a subset of dataset or a sampling database as recited); 
“periodically updating the statistics of the original database” (see Burger, [0031] for updating statistics of the database within the data dictionary; also see [0029]); 
“periodically comparing the statistics of the original database and the statistics of the sampling database to determine whether the sampling database is within a predetermined threshold of the original database” (see Burger, [0043] for a comparison of the statistics generated from sampling and those from the full scan, which represent the actual, or correct, statistics, made to determine the level of inaccuracy introduced by sampling, wherein the threshold for determining whether the level of inaccuracy introduced by sampling is tolerable can be interpreted as a predetermined threshold as recited).
Burger teaches a feature for updating statistics of the database (see [0031]).
However, Burger does not explicitly teach a feature for periodically updating statistics of the database.
On the other hand, Lightstone et al. explicitly teaches a feature for periodically updating statistics of the database (see Lightstone et al., [0026] and [0042] for periodic automated statistics collection operation to collect statistics for tables of data in the database).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Lightstone et al.'s teaching into Burger’s system by implementing a feature for performing a periodic automated statistics collection on tables of the database. An ordinarily skilled artisan would have been motivated to do so, as suggested by Lightstone et al. (see [0040]), to provide Burger’s system with an effective way to enhance statistics collection for prioritized tables in a database. In addition, both references (Burger and Lightstone et al.) are analogous art directed to the same field of endeavor, namely database management including statistics collection and the use of sampling for collecting statistics for the database. This close relation between the references strongly suggests an expectation of success when combined.
Burger as modified by Lightstone et al. teaches determining whether the level of inaccuracy introduced by sampling is tolerable (i.e., determining whether the sampling database is within the predetermined threshold of the original database) (see Burger, [0043]).
However, Burger as modified by Lightstone et al. does not explicitly teach a feature of initiating an update to sampling in response to the tolerable inaccuracy/error threshold as equivalently recited as follows:
“in response to determining that the sampling database is not within the predetermined threshold of the original database, initiating an update to the sampling database”. 
On the other hand, Le et al. explicitly teaches a feature of initiating an update to sampling in response to the tolerable inaccuracy/error threshold (see Le et al., [0111] for recommending re-constructing a heavy hitter summary to reflect the augmented dataset when the size of the dataset or the number of partitions increases substantially or higher than a pre-determined threshold number and the previously generated heavy hitter summaries may become less accurate, wherein a heavy hitter summary can be interpreted as a subset representing the dataset (see [0125]); also see [0132] for determining/adjusting the sampling rate (i.e., sampling size) based on the tolerable error rate).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Le et al.'s teaching into Burger’s system (as modified by Lightstone et al.) by implementing a feature for adjusting the data sampling or re-constructing a sampling subset of the dataset if the inaccuracy level introduced by the sampling is not tolerable. An ordinarily skilled artisan would have been motivated to do so, as suggested by Le et al. (see [0111]), to provide Burger’s system with an effective way to generate and use a sampling of the dataset/database that is likely to reflect or represent the dataset/database. In addition, both references (Burger and Le et al.) are analogous art directed to the same field of endeavor, namely database management including the use of sampling for representing the database. This close relation between the references strongly suggests an expectation of success when combined.
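
For readers mapping the limitations of claim 1 to a concrete flow, the sketch below shows one way the comparing and initiating steps might fit together. It is an illustration under stated assumptions, not the method of the application or of Burger, Lightstone et al., or Le et al.: the statistics are assumed to be histograms (bucket to row count), drift is assumed to be the largest per-bucket difference in relative frequency, and initiate_update is a hypothetical callback. Scheduling of the periodic check is omitted.

```python
def max_drift(original_hist, sample_hist):
    """Largest per-bucket gap in relative frequency between two histograms."""
    total_o = sum(original_hist.values())
    total_s = sum(sample_hist.values())
    if total_o == 0 or total_s == 0:
        raise ValueError("both histograms must be non-empty")
    buckets = set(original_hist) | set(sample_hist)
    return max(
        abs(original_hist.get(b, 0) / total_o - sample_hist.get(b, 0) / total_s)
        for b in buckets
    )


def check_sampling_database(original_hist, sample_hist, threshold, initiate_update):
    """Return True if the sample is within threshold; otherwise initiate an update."""
    drift = max_drift(original_hist, sample_hist)
    if drift > threshold:
        initiate_update(drift)  # e.g. recommend or recreate (assumed actions)
        return False
    return True


events = []
original_hist = {20: 30, 40: 50, 60: 20}
sample_hist = {20: 3, 40: 5, 60: 2}
ok_before = check_sampling_database(original_hist, sample_hist, 0.05, events.append)

# Simulate growth of the original database in one bucket, then re-check.
original_hist[60] += 100
ok_after = check_sampling_database(original_hist, sample_hist, 0.05, events.append)
```

In this example the first check passes because the sample mirrors the original distribution, while the second check detects a drift of about 0.4 in the 60 bucket and invokes the callback once.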

As to claim 10, Burger teaches:
“A system” (see Burger, Abstract) comprising: 
“a memory having computer readable instructions” (see Burger, [0065]); and 
“one or more processors for executing the computer readable instructions, the computer readable instructions controlling the one or more processors to perform operations comprising” (see Burger, [0065]):
“collecting statistics of an original database” (see Burger, [0043] for performing a statistics collection from a full scan, wherein a full scan represents an entire dataset or original database as recited); 
“collecting statistics of a sampling database that comprises a subset of the original database” (see Burger, [0043] for performing a statistics collection from a sampling, wherein a sampling represents a subset of dataset or a sampling database as recited); 
“periodically updating the statistics of the original database” (see Burger, [0031] for updating statistics of the database within the data dictionary; also see [0029]); 
“periodically comparing the statistics of the original database and the statistics of the sampling database to determine whether the sampling database is within a predetermined threshold of the original database” (see Burger, [0043] for a comparison of the statistics generated from sampling and those from the full scan, which represent the actual, or correct, statistics, made to determine the level of inaccuracy introduced by sampling, wherein the threshold for determining whether the level of inaccuracy introduced by sampling is tolerable can be interpreted as a predetermined threshold as recited).
Burger teaches a feature for updating statistics of the database (see [0031]).
However, Burger does not explicitly teach a feature for periodically updating statistics of the database.
On the other hand, Lightstone et al. explicitly teaches a feature for periodically updating statistics of the database (see Lightstone et al., [0026] and [0042] for periodic automated statistics collection operation to collect statistics for tables of data in the database).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Lightstone et al.'s teaching into Burger’s system by implementing a feature for performing a periodic automated statistics collection on tables of the database. An ordinarily skilled artisan would have been motivated to do so, as suggested by Lightstone et al. (see [0040]), to provide Burger’s system with an effective way to enhance statistics collection for prioritized tables in a database. In addition, both references (Burger and Lightstone et al.) are analogous art directed to the same field of endeavor, namely database management including statistics collection and the use of sampling for collecting statistics for the database. This close relation between the references strongly suggests an expectation of success when combined.
Burger as modified by Lightstone et al. teaches determining whether the level of inaccuracy introduced by sampling is tolerable (i.e., determining whether the sampling database is within the predetermined threshold of the original database) (see Burger, [0043]).
However, Burger as modified by Lightstone et al. does not explicitly teach a feature of initiating an update to sampling in response to the tolerable inaccuracy/error threshold as equivalently recited as follows:
“in response to determining that the sampling database is not within the predetermined threshold of the original database, initiating an update to the sampling database”. 
On the other hand, Le et al. explicitly teaches a feature of initiating an update to sampling in response to the tolerable inaccuracy/error threshold (see Le et al., [0111] for recommending re-constructing a heavy hitter summary to reflect the augmented dataset when the size of the dataset or the number of partitions increases substantially or higher than a pre-determined threshold number and the previously generated heavy hitter summaries may become less accurate, wherein a heavy hitter summary can be interpreted as a subset representing the dataset (see [0125]); also see [0132] for determining/adjusting the sampling rate (i.e., sampling size) based on the tolerable error rate).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Le et al.'s teaching into Burger’s system (as modified by Lightstone et al.) by implementing a feature for adjusting the data sampling or re-constructing a sampling subset of the dataset if the inaccuracy level introduced by the sampling is not tolerable. An ordinarily skilled artisan would have been motivated to do so, as suggested by Le et al. (see [0111]), to provide Burger’s system with an effective way to generate and use a sampling of the dataset/database that is likely to reflect or represent the dataset/database. In addition, both references (Burger and Le et al.) are analogous art directed to the same field of endeavor, namely database management including the use of sampling for representing the database. This close relation between the references strongly suggests an expectation of success when combined.

As to claim 19, Burger teaches:
“A computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform operations comprising” (see Burger, Abstract and [0065]): 
“collecting statistics of an original database” (see Burger, [0043] for performing a statistics collection from a full scan, wherein a full scan represents an entire dataset or original database as recited); 
“collecting statistics of a sampling database that comprises a subset of the original database” (see Burger, [0043] for performing a statistics collection from a sampling, wherein a sampling represents a subset of dataset or a sampling database as recited); 
“periodically updating the statistics of the original database” (see Burger, [0031] for updating statistics of the database within the data dictionary; also see [0029]); 
“periodically comparing the statistics of the original database and the statistics of the sampling database to determine whether the sampling database is within a predetermined threshold of the original database” (see Burger, [0043] for a comparison of the statistics generated from sampling and those from the full scan, which represent the actual, or correct, statistics, made to determine the level of inaccuracy introduced by sampling, wherein the threshold for determining whether the level of inaccuracy introduced by sampling is tolerable can be interpreted as a predetermined threshold as recited).
Burger teaches a feature for updating statistics of the database (see [0031]).
However, Burger does not explicitly teach a feature for periodically updating statistics of the database.
On the other hand, Lightstone et al. explicitly teaches a feature for periodically updating statistics of the database (see Lightstone et al., [0026] and [0042] for periodic automated statistics collection operation to collect statistics for tables of data in the database).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Lightstone et al.'s teaching into Burger’s system by implementing a feature for performing a periodic automated statistics collection on tables of the database. An ordinarily skilled artisan would have been motivated to do so, as suggested by Lightstone et al. (see [0040]), to provide Burger’s system with an effective way to enhance statistics collection for prioritized tables in a database. In addition, both references (Burger and Lightstone et al.) are analogous art directed to the same field of endeavor, namely database management including statistics collection and the use of sampling for collecting statistics for the database. This close relation between the references strongly suggests an expectation of success when combined.
Burger as modified by Lightstone et al. teaches determining whether the level of inaccuracy introduced by sampling is tolerable (i.e., determining whether the sampling database is within the predetermined threshold of the original database) (see Burger, [0043]).
However, Burger as modified by Lightstone et al. does not explicitly teach a feature of initiating an update to sampling in response to the tolerable inaccuracy/error threshold as equivalently recited as follows:
“in response to determining that the sampling database is not within the predetermined threshold of the original database, initiating an update to the sampling database”. 
On the other hand, Le et al. explicitly teaches a feature of initiating an update to sampling in response to the tolerable inaccuracy/error threshold (see Le et al., [0111] for recommending re-constructing a heavy hitter summary to reflect the augmented dataset when the size of the dataset or the number of partitions increases substantially or exceeds a pre-determined threshold number such that the previously generated heavy hitter summaries may become less accurate, wherein a heavy hitter summary can be interpreted as a subset representing the dataset (see [0125]); also see [0132] for determining/adjusting the sampling rate (i.e., sampling size) based on the tolerable error rate).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Le et al.'s teaching into Burger’s system (modified by Lightstone et al.) by implementing a feature for adjusting the data sampling or re-constructing a sampling subset of the dataset if the inaccuracy level introduced by the sampling is not tolerable.  An ordinarily skilled artisan would have been motivated to do so, as suggested by Le et al. (see [0111]), to provide Burger’s system with an effective way to generate and use a sampling of the dataset/database that is likely to reflect or represent the dataset/database.  In addition, both references (Burger and Le et al.) teach features that are directed to analogous art and to the same field of endeavor, namely database management including the use of sampling for representing the database.  This close relation between the references strongly suggests an expectation of success when combined.
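For illustration only, the compare-then-update limitation discussed above can be sketched as follows. This is a minimal sketch and is not taken from the application or from Burger, Lightstone et al., or Le et al.; the function name, the use of record count as the compared statistic, and the relative-error test are all assumptions.

```python
def check_sampling_database(original_count: int,
                            sample_count: int,
                            sampling_rate: float,
                            threshold: float) -> str:
    """Decide whether a sampling database still tracks the original.

    Hypothetical helper: compares the sample's record count with the
    count expected from the original at the given sampling rate.
    """
    expected = original_count * sampling_rate
    if expected == 0:
        # Empty original: the sample is in tolerance only if it is also empty.
        in_tolerance = sample_count == 0
    else:
        relative_error = abs(sample_count - expected) / expected
        in_tolerance = relative_error <= threshold
    # An out-of-tolerance result is what triggers "initiating an update".
    return "no_action" if in_tolerance else "initiate_update"
```

Under these assumptions, a 10% sample of 10,000 records that holds 1,000 records is in tolerance, while the same sample is out of tolerance once the original grows to 20,000 records.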

As to claims 2 and 11, these claims are rejected based on the same arguments as above to reject claims 1 and 10, and are similarly rejected including the following:
Burger as modified by Lightstone et al. and Le et al. teaches:
“wherein the initiating an update to the sampling database comprises outputting a recommendation to recreate the sampling database based at least in part on the original database” (see Burger, [0043] for using sampling (i.e., creating a sampling database) in statistic collection; also see Le et al., [0111] for recommending re-constructing a heavy hitter summary (i.e., recreating a sampling database)). 

As to claims 3 and 12, these claims are rejected based on the same arguments as above to reject claims 1 and 10, and are similarly rejected including the following:
Burger as modified by Lightstone et al. and Le et al. teaches:
“wherein the initiating an update to the sampling database comprises automatically recreating the sampling database based at least in part on the original database” (see Burger, [0043] for using sampling (i.e., creating a sampling database) in statistic collection; also see Le et al., [0111] for recommending re-constructing a heavy hitter summary (i.e., recreating a sampling database) and also see [0114] for re-constructing a dataset-level heavy hitter summary (i.e., recreating a sampling database)).

As to claims 4 and 13, these claims are rejected based on the same arguments as above to reject claims 1 and 10, and are similarly rejected including the following:
Burger as modified by Lightstone et al. and Le et al. teaches:
“wherein the initiating an update to the sampling database comprises one of outputting a recommendation to update a subset of the sampling database based at least in part on the original database and automatically updating a subset of the sampling database based at least in part on the original database” (see Burger, [0043] for using sampling (i.e., creating a sampling database) in statistic collection; also see Le et al., [0111] for recommending re-constructing a heavy hitter summary (i.e., recreating a sampling database) and also see [0114] for re-constructing a dataset-level heavy hitter summary (i.e., recreating a sampling database)).

As to claims 5 and 14, these claims are rejected based on the same arguments as above to reject claims 1 and 10, and are similarly rejected including the following:
Burger as modified by Lightstone et al. and Le et al. teaches:
“creating a sampling definition structure” (see Burger, [0031] for the sample size; also see Le et al., [0034] for a uniform sampling rate or a progressive sampling scheme, wherein storage for any attribute(s) for defining the sampling can be interpreted as a sampling definition structure as recited); and 
“creating the sampling database based at least in part on the sampling definition structure and the original database” (see Burger, [0031] for creating sampling of a column or index based on the sample size; also see Le et al., [0012] and [0125] for generating a dataset-level heavy hitter summary (i.e., sampling dataset/database) based on a uniform sampling rate and/or a progressive sampling scheme). 

As to claims 6 and 15, these claims are rejected based on the same arguments as above to reject claims 5 and 14, and are similarly rejected including the following:
Burger as modified by Lightstone et al. and Le et al. teaches:
“wherein the sampling definition structure comprises a sampling rate, the predetermined threshold, a frequency of the comparing, and an action performed in response to the initiating an update” (see Burger, [0031] for sample size, and [0043] for threshold to determine whether the inaccuracies (i.e., level of inaccuracy) are tolerable; also see Le et al., [0111] for a pre-determined threshold number used in determining whether to recommend re-constructing a heavy hitter summary to reflect the augmented/updated dataset).
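For illustration only, a sampling definition structure holding the four recited elements (sampling rate, threshold, comparison frequency, and action) might look like the sketch below. All names are hypothetical, and uniform random sampling is an assumption rather than the method of the application or of any cited reference.

```python
import random
from dataclasses import dataclass
from enum import Enum


class UpdateAction(Enum):
    """Hypothetical actions taken when an update is initiated."""
    RECOMMEND = "recommend"
    AUTO_RECREATE = "auto_recreate"


@dataclass(frozen=True)
class SamplingDefinition:
    sampling_rate: float       # fraction of original rows to keep
    threshold: float           # tolerated deviation from the original
    compare_every_hours: int   # frequency of the comparing step
    action: UpdateAction       # what to do when out of tolerance


def create_sample(rows: list, definition: SamplingDefinition, seed: int = 0) -> list:
    """Draw a uniform random sample of the rows (assumed sampling scheme)."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    k = round(len(rows) * definition.sampling_rate)
    return rng.sample(rows, k)
```

For example, a definition with `sampling_rate=0.1` applied to 1,000 rows yields a 100-row sampling database.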

As to claims 7 and 16, these claims are rejected based on the same arguments as above to reject claims 1 and 10, and are similarly rejected including the following:
Burger as modified by Lightstone et al. and Le et al. teaches:
“wherein the statistics of the original database and the statistics of the sampling database each comprise an overall record count, a cardinality of a column, and a histogram statistic for the column” (see Burger, [0025] for statistics for database in the data dictionary including the number of records (i.e., an overall record count) in each data structure, see [0049] for the set of distinct values of a column (i.e., a cardinality of a column), and see [0038] for generating histogram(s) based on frequency of each distinct value (i.e., histogram statistic for the column); also see Lightstone et al., [0039] for histogram comparison and column cardinalities). 
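For illustration only, the three recited statistics (overall record count, column cardinality, and a histogram for the column) can be computed for a single column as sketched below. The function name and the returned dictionary layout are assumptions and do not reflect Burger's data dictionary.

```python
from collections import Counter


def collect_column_stats(values: list) -> dict:
    """Compute record count, cardinality, and a frequency histogram."""
    histogram = Counter(values)           # value -> number of occurrences
    return {
        "record_count": len(values),      # overall record count
        "cardinality": len(histogram),    # number of distinct values
        "histogram": dict(histogram),
    }
```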

As to claims 8 and 17, these claims are rejected based on the same arguments as above to reject claims 1 and 10, and are similarly rejected including the following:
Burger as modified by Lightstone et al. and Le et al. teaches:
“wherein the comparing comprises comparing a relative frequency of a range of values in a column” (see Burger, [0025] for statistics for database in the data dictionary including the number of records (i.e., an overall record count) in each data structure, see [0049] for the set of distinct values of a column (i.e., a cardinality of a column), and see [0038] for generating histogram(s) based on frequency of each distinct value (i.e., histogram statistic for the column); also see Lightstone et al., [0039] for histogram comparison and column cardinalities). 
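For illustration only, comparing the relative frequency of a range of values in a column between an original and a sample could be sketched as follows. The half-open range convention and the absolute-difference test are assumptions, not details taken from the claims or the cited references.

```python
def relative_frequency(values: list, low: float, high: float) -> float:
    """Fraction of values falling in the half-open range [low, high)."""
    if not values:
        return 0.0
    in_range = sum(1 for v in values if low <= v < high)
    return in_range / len(values)


def range_frequency_gap(original: list, sample: list,
                        low: float, high: float) -> float:
    """Absolute difference between the two relative frequencies."""
    return abs(relative_frequency(original, low, high)
               - relative_frequency(sample, low, high))
```

A gap at or below the predetermined threshold would indicate, under these assumptions, that the sample still represents the original for that range.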

As to claims 9, 18 and 20, these claims are rejected based on the same arguments as above to reject claims 1, 10 and 19, and are similarly rejected including the following:
Burger as modified by Lightstone et al. and Le et al. teaches:
“wherein the updating the statistics of the original database is performed in response to an update to the original database and is based at least in part on a log of the original database” (see Burger, [0031] for the statistics update utility operating to update statistics within the data dictionary; also see Lightstone et al., [0025] and [0028] for performing statistics collection/updating based on tracking/monitoring changes (i.e., log) to the tables/databases). 
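For illustration only, updating statistics from a change log rather than by a full rescan might be sketched as below. The log format (a list of `(operation, value)` pairs) and the function name are hypothetical and are not drawn from Lightstone et al.'s change tracking.

```python
from collections import Counter


def apply_log(record_count: int, histogram: dict, log: list) -> tuple:
    """Incrementally update a record count and histogram from log entries.

    Each log entry is assumed to be ("insert", value) or ("delete", value).
    """
    hist = Counter(histogram)
    for op, value in log:
        if op == "insert":
            record_count += 1
            hist[value] += 1
        elif op == "delete":
            record_count -= 1
            hist[value] -= 1
            if hist[value] <= 0:
                del hist[value]   # drop values that no longer occur
    return record_count, dict(hist)
```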

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to PHUONG THAO CAO whose telephone number is (571)272-2735. The examiner can normally be reached Monday - Friday: 9:00 am - 6:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Amy Ng, can be reached at 571-270-1698. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/Phuong Thao Cao/Primary Examiner, Art Unit 2164
Prosecution Timeline

Dec 01, 2022: Application Filed
Oct 18, 2023: Response after Non-Final Action
Dec 31, 2025: Non-Final Rejection (§101, §103)
Apr 01, 2026: Interview Requested
Apr 07, 2026: Examiner Interview Summary
Apr 07, 2026: Applicant Interview (Telephonic)
Precedent Cases

Applications granted by this same examiner with similar technology

18/359,918 (Patent 12602391): LABEL ARCHITECTURE BUILDING SYSTEM AND LABEL ARCHITECTURE BUILDING METHOD. 2y 5m to grant; granted Apr 14, 2026.
18/094,582 (Patent 12596938): SYSTEMS AND METHODS FOR IDENTIFICATION AND MANAGEMENT OF COMPLIANCE-RELATED INFORMATION ASSOCIATED WITH ENTERPRISE IT NETWORKS. 2y 5m to grant; granted Apr 07, 2026.
18/597,638 (Patent 12585615): SIMPLIFYING UNSTRUCTURED DATA FOR DATA ANALYTICS. 2y 5m to grant; granted Mar 24, 2026.
18/196,913 (Patent 12579133): GENERATING QUERY VARIANTS USING A TRAINED GENERATIVE MODEL. 2y 5m to grant; granted Mar 17, 2026.
18/444,293 (Patent 12579196): SYSTEMS AND METHODS OF RETROSPECTIVELY DETERMINING HOW SUBMITTED DATA TRANSACTION REQUESTS OPERATE AGAINST A DYNAMIC DATA STRUCTURE. 2y 5m to grant; granted Mar 17, 2026.
Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 78%
With Interview (+13.9%): 92%
Median Time to Grant: 3y 0m
PTA Risk: Low
Based on 760 resolved cases by this examiner. Grant probability derived from career allow rate.