Prosecution Insights
Last updated: April 19, 2026
Application No. 19/221,871

DATA GENERATION PROCESS FOR MULTI-VARIABLE DATA

Non-Final OA: §101 §103 §112 §DP
Filed: May 29, 2025
Examiner: THAI, HANH B
Art Unit: 2163
Tech Center: 2100 (Computer Architecture & Software)
Assignee: International Business Machines Corporation
OA Round: 1 (Non-Final)
Grant Probability: 87% (Favorable)
Predicted OA Rounds: 1-2
Predicted Time to Grant: 2y 9m
Grant Probability with Interview: 90%

Examiner Intelligence

Career Allow Rate: 87% (694 granted / 797 resolved; +32.1% vs TC avg; above average)
Interview Lift: +2.6% in resolved cases with interview (minimal lift)
Typical Timeline: 2y 9m average prosecution; 16 applications currently pending
Career History: 813 total applications across all art units

Statute-Specific Performance

§101: 23.9% (-16.1% vs TC avg)
§103: 41.2% (+1.2% vs TC avg)
§102: 9.7% (-30.3% vs TC avg)
§112: 5.7% (-34.3% vs TC avg)

Tech Center averages are estimates; based on career data from 797 resolved cases.

Office Action

§101 §103 §112 §DP
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This is a Non-Final Office Action in response to the application filed on May 29, 2025, in which claims 1-20 are presented for examination.

Information Disclosure Statement

The references listed in the IDS filed on July 18, 2025 have been considered and entered into the record. A copy of the signed or initialed IDS is hereby attached.

Examiner Notes

The examiner cites particular columns, paragraphs, figures, and line numbers in the references as applied to the claims below for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claims, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the examiner. The entire reference is considered to provide disclosure relating to the claimed invention.

The claims, and only the claims, form the metes and bounds of the invention. Office personnel are to give the claims their broadest reasonable interpretation in light of the supporting disclosure. Unclaimed limitations appearing in the specification are not read into the claims. Prior art was referenced using terminology familiar to one of ordinary skill in the art. Such an approach is broad in concept and can be either explicit or implicit in meaning. Examiner's Notes are provided with the cited references to assist the applicant in better understanding how the examiner interprets the applied prior art. Such comments are entirely consistent with the intent and spirit of compact prosecution.

Claim Rejections - 35 USC § 101

35 U.S.C.
101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims recite “dividing a first data set into a subset of continuous-type data values and a first subset of discrete-type data values based on variable types in the first data set; generating a second subset of discrete-type data values and a continuous data set based on the subset of continuous-type data values; and generating a second data set based on combining a third subset of discrete-type data values from a conditional contingency table with the continuous data set”. This judicial exception is not integrated into a practical application because the claim recites a mathematical concept (dividing data into continuous-type and discrete-type values, generating a new discrete subset, and combining statistical outputs to produce a second data set). These activities resemble statistical data analysis and probabilistic modeling, which fall under the abstract idea category of “mathematical concepts”.

ANALYSIS under the Revised Guidance of the 2019 PEG:

Statutory Category: Claims 1-20 are directed to one of the four statutory categories (claims 1-8 a system or machine; claims 9-16 a computer program product; claims 17-20 a process).

Step 2A - Prong 1: Judicial Exception Recited?
Claim 1 recites the limitations of “dividing a first data set into a subset of continuous-type data values and a first subset of discrete-type data values based on variable types in the first data set; generating a second subset of discrete-type data values and a continuous data set based on the subset of continuous-type data values; and generating a second data set based on combining a third subset of discrete-type data values from a conditional contingency table with the continuous data set”. These limitations recite a mathematical concept (dividing data into continuous-type and discrete-type values, generating a new discrete subset, and combining statistical outputs to produce a second data set). Thus, claim 1 recites an abstract idea under one of the groupings of abstract ideas, mathematical processes (mathematical relationships, formulas, statistical calculations). (MPEP 2106.05(a)).

Step 2A - Prong 2: Integrated into a Practical Application?

Claim 1 does not integrate the abstract idea into a practical application because the claim recites generic computer components such as “a processor set” and “computer-readable storage media” to perform purely data manipulation: dividing data sets, generating subsets, and combining statistical tables. The claim does not recite any improvement to computer functionality, improvement to a technological process, specific implementation details of the algorithm, or transformation of data tied to a technical application. Instead, the processor simply performs statistical data processing. Thus, the abstract idea is not integrated into a practical application.

Step 2B: Does the claim recite additional elements that are sufficient to amount to significantly more than the judicial exception, or an inventive concept?
The claim does not include additional elements sufficient to amount to significantly more than the judicial exception, nor does it recite an inventive concept. Although the claim recites a processor set and computer-readable storage media, these are well-understood, routine, and conventional computer components performing generic functions. The claim does not recite specialized hardware, unconventional data structures, or a specific algorithmic improvement to computing. Therefore, the claim lacks an inventive concept. The claim is directed to the abstract idea.

Dependent claim 2 recites “transforming the subset of continuous-type data values into the second subset of discrete-type data values based on one or more of a data binning operation and a dimension reduction operation”, an abstract idea under step 2A(ii). Therefore, the claimed elements fail to integrate the judicial exception into a practical application.

Dependent claim 3 recites “reducing a number of dimensions within the subset of continuous-type data values; and generating the second subset of discrete-type data values based on execution of the data binning operation on the reduced number of dimensions”, an abstract idea under step 2A(ii). Therefore, the claimed elements fail to integrate the judicial exception into a practical application.

Dependent claim 4 recites “splitting the first data set into a first subset of columns of the continuous-type data values and a second subset of columns of the discrete-type data values”, an abstract idea under step 2A(i). Therefore, the claimed elements fail to integrate the judicial exception into a practical application.

Dependent claim 5 recites “generating the conditional contingency table that includes rows of data within the subset of columns of continuous-type data values that share a common value for one of the discrete-type data values”, an abstract idea under step 2A(i).
Therefore, the claimed elements fail to integrate the judicial exception into a practical application.

Dependent claim 6 recites “generating a plurality of conditional contingency tables that include different subsets of discrete-type values from the second subset of discrete-type data values; and generating the continuous data set after the plurality of conditional contingency tables is generated”, an abstract idea under step 2A(i). Therefore, the claimed elements fail to integrate the judicial exception into a practical application.

Dependent claim 7 recites “determining a probability of each of rows of data within the conditional contingency table being within a different conditional contingency table from among the plurality of conditional contingency tables; and adding the probability to each of the rows of data within the conditional contingency table”, an abstract idea under step 2A(i). Therefore, the claimed elements fail to integrate the judicial exception into a practical application.

Dependent claim 8 recites “executing a machine learning model on the second data set; and displaying the predictive performance via a user interface”, an abstract idea under step 2A(ii), and “determining a predictive performance of the machine learning model”, an abstract idea under step 2A(i). Therefore, the claimed elements fail to integrate the judicial exception into a practical application.

Claims 9 and 17 are rejected under a similar analysis to claim 1. Claims 10-16 and 18-20 follow a similar analysis to claims 2-8 and do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements in claims 10-16 and 18-20 represent a further mental process step.
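As a technical aside (not part of the Office Action's record), the recited “dividing ... based on variable types” step amounts to partitioning a table's columns into continuous-type and discrete-type subsets. A minimal stdlib-Python sketch, with all names hypothetical and no details drawn from the application's specification:

```python
# Hypothetical sketch of the recited "dividing" step: partition a tabular
# data set's columns into continuous-type and discrete-type subsets.
# Illustrative only; names and the float-based type test are assumptions.

def divide_by_variable_type(columns):
    """Split {name: values} into (continuous, discrete) column subsets.

    A column is treated as continuous-type when every value is a float;
    otherwise it is treated as discrete-type (ints, strings, booleans).
    """
    continuous, discrete = {}, {}
    for name, values in columns.items():
        if values and all(isinstance(v, float) for v in values):
            continuous[name] = values
        else:
            discrete[name] = values
    return continuous, discrete


first_data_set = {
    "income": [52.3, 48.9, 61.0],     # continuous-type
    "children": [2, 0, 1],            # discrete-type
    "region": ["east", "west", "east"],  # discrete-type
}
continuous, discrete = divide_by_variable_type(first_data_set)
```

Under this reading, the operation is a plain column partition, which is consistent with the examiner's characterization of the step as generic data manipulation.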
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “mental processes” group of abstract ideas. Each additional step is considered an abstract idea (a mathematical process step) and does not integrate the judicial exception into a practical application. An additional abstract idea (mathematical process step) is not sufficient to amount to significantly more than the judicial exception. Therefore, claims 1-20 are not patent eligible.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

Claim 1, and similarly claims 9 and 17, recites the limitation "the second subset of discrete-type variables" in “generating a second data set based on combining a third subset of discrete-type data values from a conditional contingency table with the continuous data set”. However, the claims fail to provide proper antecedent basis for the limitation.
Specifically, the claims do not previously introduce “a second subset of discrete-type variables” before referring to it as “the” second subset of discrete-type variables. Moreover, “discrete-type data values” and “discrete-type variables” are not automatically interchangeable, even if they are conceptually related in statistics or data science. Therefore, the scope of the claims is unclear, raising an issue of indefiniteness.

Double Patenting

The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).

A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159.
See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).

The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional, the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to a final Office action, see 37 CFR 1.113(c). A request for reconsideration, while not provided for in 37 CFR 1.113(c), may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.

The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.

Claims 1, 3-4, 6-9, 11-17 and 19-20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-16 and 19 of U.S. Patent No. 12,353,432. Although the claims at issue are not identical, they are not patentably distinct from each other because instant application claims 1, 3-4, 6-9, 11-17 and 19-20 are anticipated by patent claims 1-16 and 19.
All limitations and elements in claim 1 of the instant application are found in claim 1 of Zhang et al., except for “convert the subset of continuous-type data values into a second subset of discrete-type data values based on a data binning operation.” However, “generating a second subset of discrete-type data values and a continuous data set based on the subset of continuous-type data values” inherently includes converting the subset of continuous-type data values into a second subset of discrete-type data values. Therefore, “generating a second subset of discrete-type data values and a continuous data set” is equivalent to “converting the subset of continuous-type data values into a second subset of discrete-type data values.”

Although the claims at issue are not identical, they are not patentably distinct from one another because they are substantially similar in scope and use similar limitations, as shown in the Claims Comparison Table below; the claims of the cited patent teach every claim of the instant application and, as such, anticipate the claims of the instant application.

Claims Comparison Table:

Instant application No. 19/221,871 claims (the corresponding claims of U.S. Patent No. 12,353,432 follow below):

Claim 1.
A computer system comprising: a processor set; one or more computer-readable storage media; and program instructions stored on the one or more computer-readable storage media to cause the processor set to perform operations comprising: dividing a first data set into a subset of continuous-type data values and a first subset of discrete-type data values based on variable types in the first data set; generating a second subset of discrete-type data values and a continuous data set based on the subset of continuous-type data values; and generating a second data set based on combining a third subset of discrete-type data values from a conditional contingency table with the continuous data set, wherein the conditional contingency table is based on the first subset of discrete-type data values and the second subset of discrete-type variables.

3. The computer system of claim 2, wherein transforming the subset of continuous-type data values comprises: reducing, based on a principal component analysis (PCA) model, a number of dimensions within the subset of continuous-type data values; and generating the second subset of discrete-type data values based on execution of the data binning operation on the reduced number of dimensions.

4. The computer system of claim 1, wherein dividing the first data set comprises: splitting the first data set into a first subset of columns of the continuous-type data values and a second subset of columns of the discrete-type data values.

5. The computer system of claim 4, wherein the operations further comprise: generating the conditional contingency table that includes rows of data within the subset of columns of continuous-type data values that share a common value for one of the discrete-type data values.

6.
The computer system of claim 1, wherein the operations further comprise: generating a plurality of conditional contingency tables that include different subsets of discrete-type values from the second subset of discrete-type data values; and generating the continuous data set after the plurality of conditional contingency tables is generated.

7. The computer system of claim 6, wherein the operations further comprise: determining a probability of each of rows of data within the conditional contingency table being within a different conditional contingency table from among the plurality of conditional contingency tables; and adding the probability to each of the rows of data within the conditional contingency table.

8. The computer system of claim 1, wherein the operations further comprise: executing a machine learning model on the second data set; determining a predictive performance of the machine learning model; and displaying the predictive performance via a user interface.

9. A computer program product comprising: one or more computer-readable storage media; and program instructions stored on the one or more computer-readable storage media to perform operations comprising: dividing a first data set into a subset of continuous-type data values and a first subset of discrete-type data values based on variable types in the first data set; generating a second subset of discrete-type data values and a continuous data set based on the subset of continuous-type data values; and generating a second data set based on combining a third subset of discrete-type data values from a conditional contingency table with the continuous data set, wherein the conditional contingency table is based on the first subset of discrete-type data values and the second subset of discrete-type variables.

11.
The computer program product of claim 10, wherein transforming the subset of continuous-type data values comprises: reducing, based on a principal component analysis (PCA) model, a number of dimensions within the subset of continuous-type data values; and generating the second subset of discrete-type data values based on execution of the data binning operation on the reduced number of dimensions.

12. The computer program product of claim 9, wherein dividing the first data set comprises: splitting the first data set into a first subset of columns of the continuous-type data values and a second subset of columns of the discrete-type data values.

13. The computer program product of claim 12, wherein the operations further comprise: generating the conditional contingency table that includes rows of data within the subset of columns of continuous-type data values that share a common value for one of the discrete-type data values.

14. The computer program product of claim 9, wherein the operations further comprise: generating a plurality of conditional contingency tables that include different subsets of discrete-type values from the second subset of discrete-type data values; and generating the continuous data set after the plurality of conditional contingency tables is generated.

15. The computer program product of claim 14, wherein the operations further comprise: determining a probability of each of rows of data within the conditional contingency table being within a different conditional contingency table from among the plurality of conditional contingency tables; and adding the probability to each of the rows of data within the conditional contingency table.

16. The computer program product of claim 9, wherein the operations further comprise: executing a machine learning model on the second data set; determining a predictive performance of the machine learning model; and displaying the predictive performance via a user interface.

17.
A method, comprising: dividing a first data set into a subset of continuous-type data values and a first subset of discrete-type data values based on variable types in the first data set; generating a second subset of discrete-type data values and a continuous data set based on the subset of continuous-type data values; and generating a second data set based on combining a third subset of discrete-type data values from a conditional contingency table with the continuous data set, wherein the conditional contingency table is based on the first subset of discrete-type data values and the second subset of discrete-type variables.

19. The method of claim 18, wherein transforming the subset of continuous-type data values comprises: reducing, based on a principal component analysis (PCA) model, a number of dimensions within the subset of continuous-type data values; and generating the second subset of discrete-type data values based on execution of the data binning operation on the reduced number of dimensions.

20. The method of claim 17, wherein dividing the first data set comprises: splitting the first data set into a first subset of columns of the continuous-type data values and a second subset of columns of the discrete-type data values.

U.S. Patent No. 12,353,432 claims:

Claim 1. An apparatus comprising: a memory configured to store an original data set; and a processor configured to: split the original data set into a subset of continuous-type data values and a subset of discrete-type data values based on variable types in the original data set, convert the subset of continuous-type data values into a second subset of discrete-type data values based on a data binning operation, generate a new subset of continuous-type data values based on the subset of continuous-type data values in the original data set, and combine a subset of discrete-type data values from a conditional contingency table within the new subset of continuous-type data values to generate a new data set.

5.
The apparatus of claim 1, wherein the processor is configured to: execute a principal analysis component (PCA) model on the subset of continuous-type data values to reduce a number of dimensions within the subset of continuous-type data values, and execute the data binning operation on the reduced number of dimensions to generate the second subset of discrete-type data values.

2. The apparatus of claim 1, wherein the original data set comprises a table, and the processor is configured to: split the table into a subset of columns of the continuous-type data values and a second subset of columns of the discrete-type data values within the table.

3. The apparatus of claim 2, wherein the processor is configured to: generate a conditional contingency table that includes rows of data within the subset of columns of continuous-type data values that share a common value for one of the discrete-type data values.

6. The apparatus of claim 1, wherein the processor is configured to: generate a plurality of conditional contingency tables that include different subsets of discrete-type values from the second subset of discrete-type data values, and generate the new subset of continuous-type data values after the plurality of conditional contingency tables is generated.

4. The apparatus of claim 3, wherein the processor is configured to: determine a probability of each of the rows of data within the conditional contingency table being within a different conditional contingency table from among a plurality of conditional contingency tables, and add the probability to each of the rows of data within the conditional contingency table.

7. The apparatus of claim 1, wherein the processor is configured to: execute a machine learning model on the new data set, determine a predictive performance of the machine learning model, and display the predictive performance via a user interface.

8.
A method comprising: storing an original data set in memory; splitting the original data set into a subset of continuous-type data values and a subset of discrete-type data values based on variable types in the original data set; converting the subset of continuous-type data values into a second subset of discrete-type data values based on a data binning operation; generating a new subset of continuous-type data values based on the subset of continuous-type data values in the original data set; and combining a subset of discrete-type data values from a conditional contingency table within the new subset of continuous-type data values to generate a new data set.

12. The method of claim 8, wherein the method further comprises: executing a principal analysis component (PCA) model on the subset of continuous-type data values to reduce a number of dimensions within the subset of continuous-type data values; and executing the data binning operation on the reduced number of dimensions to generate the second subset of discrete-type data values.

9. The method of claim 8, wherein the original data set comprises a table, and the splitting comprises splitting the table into a subset of columns of the continuous-type data values and a second subset of columns of the discrete-type data values within the table.

10. The method of claim 9, wherein the method further comprises: generating a conditional contingency table that includes rows of data within the subset of columns of continuous-type data values that share a common value for one of the discrete-type data values.

13. The method of claim 8, wherein the method further comprises: generating a plurality of conditional contingency tables that include different subsets of discrete-type values from the second subset of discrete-type data values; and generating the new subset of continuous-type data values after the plurality of conditional contingency tables is generated.

11.
The method of claim 10, wherein the method further comprises: determining a probability of each of the rows of data within the conditional contingency table being within a different conditional contingency table from among a plurality of conditional contingency tables; and adding the probability to each of the rows of data within the conditional contingency table.

14. The method of claim 8, wherein the method further comprises: executing a machine learning model on the new data set; determining a predictive performance of the machine learning model; and displaying the predictive performance via a user interface.

15. A computer-readable storage medium comprising instructions that, when executed by a processor, cause the processor to perform: storing an original data set in memory; splitting the original data set into a subset of continuous-type data values and a subset of discrete-type data values based on variable types in the original data set; converting the subset of continuous-type data values into a second subset of discrete-type data values based on a data binning operation; generating a new subset of continuous-type data values based on the subset of continuous-type data values in the original data set; and combining a subset of discrete-type data values from a conditional contingency table within the new subset of continuous-type data values to generate a new data set.

19. The computer-readable storage medium of claim 15, wherein the instructions further cause the processor to perform: executing a principal analysis component (PCA) model on the subset of continuous-type data values to reduce a number of dimensions within the subset of continuous-type data values; and executing the data binning operation on the reduced number of dimensions to generate the second subset of discrete-type data values.

16.
The computer-readable storage medium of claim 15, wherein the original data set comprises a table, and the splitting further comprises: splitting the table into a subset of columns of the continuous-type data values and a second subset of columns of the discrete-type data values within the table.

Therefore, claims 1-16 and 19 of the above patent are in essence a “species” of the generic invention of claims 1, 3-4, 6-9, 11-17 and 19-20 of the instant application. It has been held that a generic invention is anticipated by a “species” within the scope of the generic invention. See In re Goodman, 29 USPQ2d 2010 (Fed. Cir. 1993).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Aggarwal (US 20040250188 A1) in view of Gebremariam et al. (US 11281689 B1).
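As technical background to the contingency-table limitations quoted in the claims above (illustrative only; nothing here is drawn from the application, Aggarwal, or Gebremariam), a “conditional contingency table” can be read as the group of rows that share a common value for one discrete-type variable, and one simple reading of the probability step is the share of all rows that each table holds. A hypothetical stdlib-Python sketch:

```python
# Hypothetical sketch: group rows by one discrete-type variable
# ("conditional contingency tables"), then attach a probability to each
# row. Both function names and the probability reading are assumptions.
from collections import defaultdict

def conditional_tables(rows, discrete_key):
    """Partition rows (dicts) into tables keyed by one discrete value."""
    tables = defaultdict(list)
    for row in rows:
        tables[row[discrete_key]].append(row)
    return dict(tables)

def annotate_probability(tables):
    """Attach to each row the share of all rows held by its table."""
    total = sum(len(t) for t in tables.values())
    for t in tables.values():
        for row in t:
            row["probability"] = len(t) / total
    return tables
```

For example, three rows with discrete values "a", "b", "a" yield two tables, of sizes 2 and 1, and row probabilities 2/3 and 1/3 respectively.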
Regarding claim 1, Aggarwal discloses a computer system comprising: a processor set (processor 50, Fig. 1, Aggarwal); one or more computer-readable storage media (data storage 40 of server 20, Fig. 1 and ¶[0025], Aggarwal); and program instructions stored on the one or more computer-readable storage media to cause the processor set to perform operations comprising: dividing a first data set into a subset of continuous-type data values and a first subset of discrete-type data values based on variable types in the first data set (¶[0028], Aggarwal, i.e., each data set is divided into a number of grid points, wherein the grid points are determined by discretizing each dimension “discrete-type data values” and the data is divided into a number of different ranges for each dimension “continuous-type data values”); generating a second subset of discrete-type data values and a continuous data set based on the subset of continuous-type data values (¶[0027]-[0028], Aggarwal, i.e., generating a new data set of data points); and generating a second data set based on combining a third subset of discrete-type data values from a conditional contingency table with the continuous data set (¶[0032]-[0033], Aggarwal, i.e., combining different data sets to generate the final synthetic data).

Aggarwal, however, does not explicitly disclose wherein the conditional contingency table is based on the first subset of discrete-type data values and the second subset of discrete-type variables.
Gebremariam discloses creating transformed datasets (col.20, line 20 to col.21, line 28, Gebremariam) based on combining a subset of discrete-type data values from a conditional contingency table with the continuous data set (col.17, line 46 to col.19, line 62 and col.20, line 20 to col.21, line 28, Gebremariam), wherein the conditional contingency table is based on the first subset of discrete-type data values and the second subset of discrete-type variables (col.17, line 46 to col.19, line 62, col.20, line 20 to col.21, line 28 and col.25, line 65 to col.26, line 46, Gebremariam).

It would have been obvious to a person having ordinary skill in the art before the effective filing date, having both Aggarwal and Gebremariam before them, to incorporate Gebremariam's transformation of continuous-type data into discrete-type data into Aggarwal, as taught by Gebremariam. One of ordinary skill in the art would have been motivated to integrate multi-variable “continuous-type and discrete-type” data types into Aggarwal, with a reasonable expectation of success, in order to enhance synthetic data generation.

Regarding claim 2, the Aggarwal/Gebremariam combination discloses wherein generating the continuous data set comprises: transforming the subset of continuous-type data values into the second subset of discrete-type data values based on one or more of a data binning operation (col.17, line 4 to col.18, line 25, Gebremariam) and a dimension reduction operation (col.9, lines 41-62 and col.17, line 4 to col.18, line 25, Gebremariam).
Regarding claim 3, the Aggarwal/Gebremariam combination discloses reducing, based on a principal component analysis model, a number of dimensions within the subset of continuous-type data values (col.21, line 29 to col.22, line 31, Gebremariam); and generating the second subset of discrete-type data values based on execution of the data binning operation on the reduced number of dimensions (col.20, line 20 to col.21, line 28 and col.25, line 65 to col.26, line 46, Gebremariam).

Regarding claim 4, the Aggarwal/Gebremariam combination discloses wherein dividing the first data set comprises: splitting the first data set into a first subset of columns of the continuous-type data values and a second subset of columns of the discrete-type data values (col.25, line 65 to col.26, line 46, Gebremariam).

Regarding claim 5, the Aggarwal/Gebremariam combination discloses generating the conditional contingency table that includes rows of data within the subset of columns of continuous-type data values that share a common value for one of the discrete-type data values (col.33, lines 36-55, Gebremariam).

Regarding claim 6, the Aggarwal/Gebremariam combination discloses generating a plurality of conditional contingency tables that include different subsets of discrete-type values from the second subset of discrete-type data values (col.20, line 20 to col.21, line 28 and col.25, line 65 to col.26, line 46, Gebremariam); and generating the continuous data set after the plurality of conditional contingency tables is generated (col.22, lines 19-31 and col.32, lines 15-23, Gebremariam).
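The dimension-reduction-then-binning sequence recited in claim 3 (PCA on the continuous columns, then binning the reduced dimensions into discrete labels) can be sketched as follows. This is an editor's illustration under assumed parameters (one retained component, two equal-width bins), not the applicant's or the references' actual implementation:

```python
# Illustrative sketch: PCA via SVD on the continuous-type values,
# then equal-width binning of the reduced dimension into discrete labels.
import numpy as np

def pca_reduce(X, k):
    """Project X onto its top-k principal components (SVD of centered data)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)  # rows of Vt are principal axes
    return Xc @ Vt[:k].T

def bin_values(x, n_bins):
    """Equal-width binning of a 1-D array into integer bin labels 0..n_bins-1."""
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    # digitize against the interior edges only, so endpoints land in valid bins
    return np.digitize(x, edges[1:-1])

X = np.array([[1.0, 2.0], [2.0, 4.1], [3.0, 6.0], [4.0, 8.2]])  # continuous subset
reduced = pca_reduce(X, k=1)                 # shape (4, 1)
labels = bin_values(reduced[:, 0], n_bins=2)  # second subset of discrete-type values
```

Quantile-based binning or a library PCA would serve equally well; the point is only the claimed ordering (reduce dimensions first, then bin the result).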
Regarding claim 7, the Aggarwal/Gebremariam combination discloses determining a probability of each of rows of data within the conditional contingency table being within a different conditional contingency table from among the plurality of conditional contingency tables (col.17, line 46 to col.19, line 62, col.20, line 20 to col.21, line 28 and col.25, line 65 to col.26, line 46, Gebremariam); and adding the probability to each of the rows of data within the conditional contingency table (col.36, lines 53-64, Gebremariam).

Regarding claim 8, the Aggarwal/Gebremariam combination discloses executing a machine learning model on the second data set (col.40, lines 25-45, Gebremariam); determining a predictive performance of the machine learning model (col.40, lines 25-45, Gebremariam); and displaying the predictive performance via a user interface (col.39, line 51 to col.40, line 45, Gebremariam).

Regarding claim 9, Aggarwal discloses a computer program product comprising: one or more computer-readable storage media (data storage 40 of server 20, Fig.1 and ¶[0025], Aggarwal); and program instructions stored on the one or more computer-readable storage media (data storage 40 of server 20, Fig.1 and ¶[0025], Aggarwal) to perform operations comprising: dividing a first data set into a subset of continuous-type data values and a first subset of discrete-type data values based on variable types in the first data set (¶[0028], Aggarwal, i.e., each data set is divided into a number of grid points, wherein the grid points are determined by discretizing each dimension “discrete-type data values” and the data is divided into a number of different ranges for each dimension “continuous-type data values”); generating a second subset of discrete-type data values and a continuous data set based on the subset of continuous-type data values (¶[0027]-[0028], Aggarwal, i.e., generating a new data set of data points); and generating a second data set based on combining a third subset of
discrete-type data values from a conditional contingency table with the continuous data set (¶[0032]-[0033], Aggarwal, i.e., combining different data sets to generate the final synthetic data).

Aggarwal, however, does not explicitly disclose wherein the conditional contingency table is based on the first subset of discrete-type data values and the second subset of discrete-type variables.

Gebremariam discloses creating transformed datasets (col.20, line 20 to col.21, line 28, Gebremariam) based on combining a subset of discrete-type data values from a conditional contingency table with the continuous data set (col.17, line 46 to col.19, line 62 and col.20, line 20 to col.21, line 28, Gebremariam), wherein the conditional contingency table is based on the first subset of discrete-type data values and the second subset of discrete-type variables (col.17, line 46 to col.19, line 62, col.20, line 20 to col.21, line 28 and col.25, line 65 to col.26, line 46, Gebremariam).

It would have been obvious to a person having ordinary skill in the art before the effective filing date, having both Aggarwal and Gebremariam before them, to incorporate Gebremariam's transformation of continuous-type data into discrete-type data into Aggarwal, as taught by Gebremariam. One of ordinary skill in the art would have been motivated to integrate multi-variable “continuous-type and discrete-type” data types into Aggarwal, with a reasonable expectation of success, in order to enhance synthetic data generation.

Regarding claim 10, the Aggarwal/Gebremariam combination discloses transforming the subset of continuous-type data values into the second subset of discrete-type data values based on one or more of a data binning operation (col.17, line 4 to col.18, line 25, Gebremariam) and a dimension reduction operation (col.9, lines 41-62 and col.17, line 4 to col.18, line 25, Gebremariam).
Regarding claim 11, the Aggarwal/Gebremariam combination discloses reducing, based on a principal component analysis model, a number of dimensions within the subset of continuous-type data values (col.21, line 29 to col.22, line 31, Gebremariam); and generating the second subset of discrete-type data values based on execution of the data binning operation on the reduced number of dimensions (col.20, line 20 to col.21, line 28 and col.25, line 65 to col.26, line 46, Gebremariam).

Regarding claim 12, the Aggarwal/Gebremariam combination discloses splitting the first data set into a first subset of columns of the continuous-type data values and a second subset of columns of the discrete-type data values (col.25, line 65 to col.26, line 46, Gebremariam).

Regarding claim 13, the Aggarwal/Gebremariam combination discloses generating the conditional contingency table that includes rows of data within the subset of columns of continuous-type data values that share a common value for one of the discrete-type data values (col.33, lines 36-55, Gebremariam).

Regarding claim 14, the Aggarwal/Gebremariam combination discloses generating a plurality of conditional contingency tables that include different subsets of discrete-type values from the second subset of discrete-type data values (col.20, line 20 to col.21, line 28 and col.25, line 65 to col.26, line 46, Gebremariam); and generating the continuous data set after the plurality of conditional contingency tables is generated (col.22, lines 19-31 and col.32, lines 15-23, Gebremariam).
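The contingency-table limitations recurring in these claims (counting combinations of discrete-type values and attaching a probability to each row, as in claims 5, 7, 13 and 15) reduce to a simple frequency computation. The sketch below is the editor's minimal stand-in, with hypothetical field names, not code from the application:

```python
# Sketch: build a contingency table over discrete-type variables and
# attach each combination's empirical probability (cf. claims 5, 7, 13, 15).
from collections import Counter

# hypothetical discrete-type rows: (region, tier) pairs
rows = [("east", "gold"), ("east", "silver"), ("east", "gold"), ("west", "gold")]

table = Counter(rows)            # contingency counts per value combination
total = sum(table.values())

# attach a probability to each row of the contingency table
with_prob = {combo: count / total for combo, count in table.items()}
# e.g. with_prob[("east", "gold")] -> 0.5
```

A *conditional* contingency table in the claims' sense would restrict the counts to rows sharing a common value of one discrete variable; the unconditional version above shows the same mechanics.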
Regarding claim 15, the Aggarwal/Gebremariam combination discloses determining a probability of each of rows of data within the conditional contingency table being within a different conditional contingency table from among the plurality of conditional contingency tables (col.17, line 46 to col.19, line 62, col.20, line 20 to col.21, line 28 and col.25, line 65 to col.26, line 46, Gebremariam); and adding the probability to each of the rows of data within the conditional contingency table (col.36, lines 53-64, Gebremariam).

Regarding claim 16, the Aggarwal/Gebremariam combination discloses executing a machine learning model on the second data set (col.40, lines 25-45, Gebremariam); determining a predictive performance of the machine learning model (col.40, lines 25-45, Gebremariam); and displaying the predictive performance via a user interface (col.39, line 51 to col.40, line 45, Gebremariam).

Regarding claim 17, Aggarwal discloses a method, comprising: dividing a first data set into a subset of continuous-type data values and a first subset of discrete-type data values based on variable types in the first data set (¶[0028], Aggarwal, i.e., each data set is divided into a number of grid points, wherein the grid points are determined by discretizing each dimension “discrete-type data values” and the data is divided into a number of different ranges for each dimension “continuous-type data values”); generating a second subset of discrete-type data values and a continuous data set based on the subset of continuous-type data values (¶[0027]-[0028], Aggarwal, i.e., generating a new data set of data points); and generating a second data set based on combining a third subset of discrete-type data values from a conditional contingency table with the continuous data set (¶[0032]-[0033], Aggarwal, i.e., combining different data sets to generate the final synthetic data).
Aggarwal, however, does not explicitly disclose wherein the conditional contingency table is based on the first subset of discrete-type data values and the second subset of discrete-type variables.

Gebremariam discloses creating transformed datasets (col.20, line 20 to col.21, line 28, Gebremariam) based on combining a subset of discrete-type data values from a conditional contingency table with the continuous data set (col.17, line 46 to col.19, line 62 and col.20, line 20 to col.21, line 28, Gebremariam), wherein the conditional contingency table is based on the first subset of discrete-type data values and the second subset of discrete-type variables (col.17, line 46 to col.19, line 62, col.20, line 20 to col.21, line 28 and col.25, line 65 to col.26, line 46, Gebremariam).

It would have been obvious to a person having ordinary skill in the art before the effective filing date, having both Aggarwal and Gebremariam before them, to incorporate Gebremariam's transformation of continuous-type data into discrete-type data into Aggarwal, as taught by Gebremariam. One of ordinary skill in the art would have been motivated to integrate multi-variable “continuous-type and discrete-type” data types into Aggarwal, with a reasonable expectation of success, in order to enhance synthetic data generation.

Regarding claim 18, the Aggarwal/Gebremariam combination discloses transforming the subset of continuous-type data values into the second subset of discrete-type data values based on one or more of a data binning operation (col.17, line 4 to col.18, line 25, Gebremariam) and a dimension reduction operation (col.9, lines 41-62 and col.17, line 4 to col.18, line 25, Gebremariam).
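The final generation step recited in claims 1, 9 and 17 (combining discrete-type values drawn from a contingency table with continuous values to form a second, synthetic data set) can be sketched end to end as follows. All names, distributions, and probabilities here are editor's assumptions for illustration, not disclosures from Aggarwal, Gebremariam, or the application:

```python
# Sketch: draw discrete-value combinations from a contingency table in
# proportion to their probabilities, pair each with a continuous draw,
# and collect the results as a second (synthetic) data set.
import random

random.seed(0)  # reproducible for the example

# hypothetical contingency table: (region, tier) -> probability
contingency = {("east", "gold"): 0.5, ("east", "silver"): 0.25, ("west", "gold"): 0.25}

def generate(n):
    combos = list(contingency)
    weights = [contingency[c] for c in combos]
    out = []
    for _ in range(n):
        region, tier = random.choices(combos, weights=weights)[0]
        income = random.gauss(60000.0, 10000.0)  # assumed continuous component
        out.append({"region": region, "tier": tier, "income": income})
    return out

synthetic = generate(100)  # the "second data set"
```

A faithful implementation would draw the continuous component from a model fitted to the original continuous subset (conditioned on the discrete combination) rather than a fixed Gaussian; the fixed distribution here only keeps the sketch self-contained.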
Regarding claim 19, the Aggarwal/Gebremariam combination discloses reducing, based on a principal component analysis model, a number of dimensions within the subset of continuous-type data values (col.21, line 29 to col.22, line 31, Gebremariam); and generating the second subset of discrete-type data values based on execution of the data binning operation on the reduced number of dimensions (col.20, line 20 to col.21, line 28 and col.25, line 65 to col.26, line 46, Gebremariam).

Regarding claim 20, the Aggarwal/Gebremariam combination discloses splitting the first data set into a first subset of columns of the continuous-type data values and a second subset of columns of the discrete-type data values (col.25, line 65 to col.26, line 46, Gebremariam).

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Kudo et al. (US 20170060988 A1) disclose generating a data set based on combining a subset of discrete-type data values from a conditional contingency table with the continuous data set (¶[0047]-[0052], Kudo). Sturlaugson et al. (US 20160358099 A1) disclose an advanced analytical infrastructure for machine learning.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to HANH B THAI whose telephone number is (571)272-4029. The examiner can normally be reached Monday-Friday, 7:00-4:30. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Tony Mahmoudi, can be reached at 571-272-4078. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/HANH B THAI/
Primary Examiner, Art Unit 2163
February 23, 2026

Prosecution Timeline

May 29, 2025: Application Filed
Mar 06, 2026: Non-Final Rejection under §101, §103, and §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602422: METHOD AND APPARATUS FOR THE CONVERSION AND DISPLAY OF DATA (granted Apr 14, 2026; 2y 5m to grant)
Patent 12602406: ARTIFICIAL INTELLIGENCE SANDBOX FOR AUTOMATING DEVELOPMENT OF AI MODELS (granted Apr 14, 2026; 2y 5m to grant)
Patent 12596709: MACHINE LEARNING RECOLLECTION AS PART OF QUESTION ANSWERING USING A CORPUS (granted Apr 07, 2026; 2y 5m to grant)
Patent 12561391: METHODS AND SYSTEMS FOR PRESENTING USER INTERFACES TO RENDER MULTIPLE DOCUMENTS (granted Feb 24, 2026; 2y 5m to grant)
Patent 12561296: INTUITIVE DATA FLOW (IDF) (granted Feb 24, 2026; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 87%
With Interview: 90% (+2.6%)
Median Time to Grant: 2y 9m
PTA Risk: Low

Based on 797 resolved cases by this examiner. Grant probability derived from career allow rate.
