Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 11/28/2022 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Claim Objections
Claims 2-6, 10-14, and 18-20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Claim Rejections - 35 USC § 112
Claims 7, 8, 15, and 16 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Each of the rejected claims recites some form of “[the GAN is] configured to generate legitimate data.” It is the understanding of the Examiner that a GAN cannot generate truly legitimate data; it can only generate data that a discriminator has determined to be legitimate based on how closely the data resemble the truly legitimate training data. For purposes of examination, these claims are interpreted as generating data that the discriminator has determined to be legitimate, and the Examiner recommends amending the claim language to reflect this interpretation.
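For illustration only, the distinction drawn above can be sketched in a few lines of Python. The sketch assumes a discriminator that returns a score between 0 and 1 and treats a score of at least 0.5 as a determination of legitimacy. The function names, the toy scoring rule, and the threshold are all hypothetical and are not taken from the claims or from the cited art.

```python
from typing import Callable, List, Sequence

Sample = Sequence[float]


def determined_legitimate(
    samples: Sequence[Sample],
    discriminator: Callable[[Sample], float],
    threshold: float = 0.5,  # hypothetical cut-off, not from the record
) -> List[Sample]:
    """Return the generated samples that the discriminator scores as legitimate."""
    return [s for s in samples if discriminator(s) >= threshold]


def toy_discriminator(sample: Sample) -> float:
    """Hypothetical stand-in: score by closeness of the first value to 1.0."""
    return max(0.0, 1.0 - abs(sample[0] - 1.0))


# Generated (fabricated) samples; none of them is "truly legitimate" data.
generated = [(0.9,), (0.2,), (1.3,)]
accepted = determined_legitimate(generated, toy_discriminator)
```

Every sample in `accepted` remains fabricated; it is "legitimate" only in the sense that the discriminator accepted it, which is the reading applied to the claims.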
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 7-9, and 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over Denton et al. (Denton, E., Chintala, S., Szlam, A., & Fergus, R. (2015). Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks. arXiv [Cs.CV]. Retrieved from http://arxiv.org/abs/1506.05751, hereinafter Denton), in view of Nica et al. (US 20220374682 A1, hereinafter Nica), and further in view of Chen et al. (Chen, H., Jajodia, S., Liu, J., Park, N., Sokolov, V., & Subrahmanian, V. S. (2019). FakeTables: Using GANs to Generate Functional Dependency Preserving Tables with Bounded Real Data. Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI-19, 2074–2080. doi:10.24963/ijcai.2019/287, hereinafter Chen).
Regarding Claim 1:
Denton teaches
combining the first and second GANs into a combined GAN (Denton [Figure 2]: it can be seen in figure 2 that multiple GANs are trained in series with each other; (EN): in the context of this claim the term “combine” is broad and could be interpreted as training models in series or using the output of one GAN to train a subsequent GAN);
training the combined GAN (Denton [Figure 2 caption]: "The training procedure for our LAPGAN model."; (EN): the training method for the combined network is described in more detail in the Figure 2 caption);
operating the trained combined GAN to generate new fabricated data that: imitate characteristics of the original structured data (Denton [Figure 2 caption]: "It outputs a generated high-pass image $\tilde{h}_0 = G_0(z_0, l_0)$, which is input to $D_0$. In both the real/generated cases, $D_0$ also receives $l_0$ (orange arrow). Optimizing Eqn. 2, $G_0$ thus learns to generate realistic high-frequency structure $\tilde{h}_0$ consistent with the low-pass image $l_0$").
Denton does not distinctly disclose
training a second GAN based on fabricated structured data that adhere to user-defined constraints;
adhere to the user-defined constraints.
However, Nica teaches
training a second GAN based on fabricated structured data that adhere to user-defined constraints (Nica [0023]: "during the adversarial training process, generator model 120 learns the probability distribution of real data, and generates fake samples, synthetic data record 304, that can deceive discriminator model 130. At the same time, discriminator model 130 can receive real samples 306 and synthetic data record 304"; [0020]: "Data constraint 202 can include one or more atomic constraints, e.g., an atomic constraint C1, an atomic constraint C2, and an atomic constraint C3. C1 is specified as T.A>T.C, which means the value of column A in the table is greater than the value of column C in the table. In the current example, the value of column C is the education time, and the value of column A is the age of the person. Hence, the age of the person at column A must be greater than the education time of the person at column C. C2 is specified as T.C>5, which means that a person has received at least 5 years education since the value of column C represents the education time"; (EN): data constraints of Nica are analogous to the user constraints of the instant application);
adhere to the user-defined constraints (Nica [0020]: "Data constraint 202 can include one or more atomic constraints, e.g., an atomic constraint C1, an atomic constraint C2, and an atomic constraint C3. C1 is specified as T.A>T.C, which means the value of column A in the table is greater than the value of column C in the table. In the current example, the value of column C is the education time, and the value of column A is the age of the person. Hence, the age of the person at column A must be greater than the education time of the person at column C. C2 is specified as T.C>5, which means that a person has received at least 5 years education since the value of column C represents the education time"; (EN): data constraints of Nica are analogous to the user constraints of the instant application).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Denton and Nica before him or her, to modify the systems and techniques for generative image models using a Laplacian pyramid of Denton to include the techniques for supporting database constraints for synthetic data generation as shown in Nica. The motivation for doing so would have been to use the database constraints of Nica in order to guide the GAN to generate data while considering the constraints (Nica [0012]: “the current approaches based on GAN or other techniques for generating synthetic data often fail to consider database constraints of the real data, which should be maintained on generated synthetic data. Database constraints are different from the probability distribution of variables of a data table. Classical statistical distributions cannot describe complex and mixed distributions in relational databases. Instead, database constraints may often be represented by Boolean functions”).
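For illustration only, the atomic constraints C1 (T.A>T.C) and C2 (T.C>5) quoted from Nica [0020] can be expressed as Boolean functions over a table row, consistent with the statement in Nica [0012] that database constraints may be represented by Boolean functions. The sketch below is a minimal example under stated assumptions: the row representation, the function names, and the sample values are hypothetical and do not reproduce Nica's implementation.

```python
from typing import Callable, Dict, List, Sequence

Row = Dict[str, float]

# Atomic constraints as quoted from Nica [0020]: column A is the person's age,
# column C is the education time. Encoding them as predicates is an assumption.
CONSTRAINTS: Dict[str, Callable[[Row], bool]] = {
    "C1": lambda row: row["A"] > row["C"],  # T.A > T.C
    "C2": lambda row: row["C"] > 5,         # T.C > 5
}


def violated_constraints(row: Row) -> List[str]:
    """Return the names of the atomic constraints that the row fails."""
    return [name for name, check in CONSTRAINTS.items() if not check(row)]


def table_adheres(table: Sequence[Row]) -> bool:
    """True only if every row satisfies every atomic constraint."""
    return all(not violated_constraints(row) for row in table)


# Hypothetical synthetic records, used only to exercise the checks.
valid_row = {"A": 30, "C": 12}
invalid_row = {"A": 10, "C": 12}  # age is not greater than education time
```

Under this sketch, fabricated data "adhere to the user-defined constraints" when `table_adheres` returns `True` for the generated table.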
Denton + Nica does not distinctly disclose
A computer-implemented method comprising: training a first Generative Adversarial Network (GAN) based on original structured data;
However, Chen teaches
A computer-implemented method comprising: training a first Generative Adversarial Network (GAN) based on original structured data (Chen [Page 2074, section 1, par. 1]: “In this paper, we study an incomplete table synthesis (ITS) problem for tabular data augmentation, where we wish to augment the released incomplete sub-table of records X0 by synthesizing a new table Y of records, so that a machine learning model trained on the augmented table X0 ∪ Y works better for the full table X (which X0 originated from) than a model that is trained solely on X0.”; [Page 2076, section 4, par. 2]: "Second, it incorporates the table statistics and FD constraints in training the generator so that the generator establishes a trade-off between the two contradicting objectives mentioned above"; (EN): it is noted that the "structured data" of the instant application refers to tabular data).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Denton + Nica and Chen before him or her, to modify the systems and techniques for generative image models using a Laplacian pyramid of Denton + Nica to include the techniques for GAN training using real, bounded data as shown in Chen. The motivation for doing so would have been to use the bounded data of Chen in order to train the GAN with the real data to generate synthetic data while preserving any functional dependencies of the data (Chen [Abstract]: “In this paper, our goal is to find a way to augment the sub-table by generating a synthetic table from the released sub-table, under the constraints that the generated synthetic table (i) has similar statistics as the entire table, and (ii) preserves the functional dependencies of the released sub-table”).
Regarding Claim 7:
Denton does not distinctly disclose
The method of claim 1, wherein: the characteristics of the original structured data comprise properties, dependencies, and intrinsic constraints;
and the first GAN, following its training, is configured to generate legitimate data that imitate the properties, dependencies, and intrinsic constraints of the original structured data.
However, Chen teaches
The method of claim 1, wherein: the characteristics of the original structured data comprise properties, dependencies, and intrinsic constraints (Chen [Page 2076, section 4.2, par. 1]: "To handle the functional dependency and table statistics constraints, we propose the following adapted loss function for the generator: {Eqn. 2} where $L_G$ is the original loss function in Eq. (1). $\lVert \bar{Y} - \bar{X} \rVert_1$ is an error term which penalizes the difference between the column-wise average of the generated table Y and the original table X");
and the first GAN, following its training, is configured to generate legitimate data that imitate the properties, dependencies, and intrinsic constraints of the original structured data (Chen [Page 2078, section 6.2, par. 1]: "To evaluate whether a generated table Y is close to the original full table X, we first compare the cumulative distribution functions (CDFs) of X and Y. We compare with the state-of-the-art table synthesis approach, TableGAN [Park et al., 2018]. Figure 2 displays the CDFs of some selected schema (columns)").
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Denton and Chen before him or her, to modify the systems and techniques for generative image models using a Laplacian pyramid of Denton to include the techniques for GAN training using real, bounded data as shown in Chen. The motivation for doing so would have been to use the bounded data of Chen in order to train the GAN with the real data to generate synthetic data while preserving any functional dependencies of the data (Chen [Abstract]: “In this paper, our goal is to find a way to augment the sub-table by generating a synthetic table from the released sub-table, under the constraints that the generated synthetic table (i) has similar statistics as the entire table, and (ii) preserves the functional dependencies of the released sub-table”).
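For illustration only, the error term described in the passage quoted from Chen (an L1 penalty on the difference between the column-wise average of the generated table Y and that of the original table X) can be sketched as follows. The quoted text does not reproduce how Eqn. 2 weights this term or combines it with the original generator loss, so the sketch computes the penalty alone. The function names and sample tables are hypothetical.

```python
from typing import List, Sequence

Table = Sequence[Sequence[float]]  # rows of numeric column values


def column_means(table: Table) -> List[float]:
    """Column-wise average of a non-empty numeric table."""
    n_rows = len(table)
    return [sum(column) / n_rows for column in zip(*table)]


def column_mean_l1_penalty(generated: Table, original: Table) -> float:
    """L1 norm of the difference between the two tables' column-wise averages."""
    y_bar = column_means(generated)
    x_bar = column_means(original)
    return sum(abs(y - x) for y, x in zip(y_bar, x_bar))


# Hypothetical two-column tables used only to exercise the function.
original_table = [[1.0, 10.0], [3.0, 20.0]]   # column means: 2.0 and 15.0
generated_table = [[2.0, 14.0], [4.0, 18.0]]  # column means: 3.0 and 16.0
penalty = column_mean_l1_penalty(generated_table, original_table)
```

A smaller penalty indicates that the generated table more closely imitates the column-level statistics of the original table, which is the sense in which the cited passage is applied to the claimed "properties."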
Regarding Claim 8:
Denton does not distinctly disclose
The method of claim 1, wherein: the second GAN, following its training, is configured to generate legitimate data that adhere to the user-defined constraints.
However, Nica teaches
The method of claim 1, wherein: the second GAN, following its training, is configured to generate legitimate data that adhere to the user-defined constraints (Nica [0025]: "discriminator model 130 attempts to distinguish between real and generated samples, while generator model 120 attempts to generate realistic fake samples that discriminator model 130 cannot distinguish from real samples."; [0020]: "Data constraint 202 can include one or more atomic constraints, e.g., an atomic constraint C1, an atomic constraint C2, and an atomic constraint C3. C1 is specified as T.A>T.C, which means the value of column A in the table is greater than the value of column C in the table. In the current example, the value of column C is the education time, and the value of column A is the age of the person. Hence, the age of the person at column A must be greater than the education time of the person at column C. C2 is specified as T.C>5, which means that a person has received at least 5 years education since the value of column C represents the education time"; (EN): data constraints of Nica are analogous to the user constraints of the instant application).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Denton and Nica before him or her, to modify the systems and techniques for generative image models using a Laplacian pyramid of Denton to include the techniques for supporting database constraints for synthetic data generation as shown in Nica. The motivation for doing so would have been to use the database constraints of Nica in order to guide the GAN to generate data while considering the constraints (Nica [0012]: “the current approaches based on GAN or other techniques for generating synthetic data often fail to consider database constraints of the real data, which should be maintained on generated synthetic data. Database constraints are different from the probability distribution of variables of a data table. Classical statistical distributions cannot describe complex and mixed distributions in relational databases. Instead, database constraints may often be represented by Boolean functions”).
Regarding Claim 9:
Due to claim language similar to that of Claim 1, Claim 9 is rejected for the same reasons as presented above in the rejection of Claim 1, with the exception of the limitation(s) covered below.
Denton does not distinctly disclose
A system comprising: (a) at least one hardware processor;
and (b) a non-transitory computer-readable storage medium having program code embodied therewith, the program code executable by said at least one hardware processor.
However, Nica teaches
A system comprising: (a) at least one hardware processor (Nica [0056]: "Computer system 600 includes one or more processors (also called central processing units, or CPUs), such as a processor 604.");
and (b) a non-transitory computer-readable storage medium having program code embodied therewith, the program code executable by said at least one hardware processor (Nica [0059]: "Computer system 600 also includes a main or primary memory 608, such as random access memory (RAM). Main memory 608 may include one or more levels of cache. Main memory 608 has stored therein control logic (i.e., computer software) and/or data.").
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Denton and Nica before him or her, to modify the systems and techniques for generative image models using a Laplacian pyramid of Denton to include the techniques for supporting database constraints for synthetic data generation as shown in Nica. The motivation for doing so would have been to use the database constraints of Nica in order to guide the GAN to generate data while considering the constraints (Nica [0012]: “the current approaches based on GAN or other techniques for generating synthetic data often fail to consider database constraints of the real data, which should be maintained on generated synthetic data. Database constraints are different from the probability distribution of variables of a data table. Classical statistical distributions cannot describe complex and mixed distributions in relational databases. Instead, database constraints may often be represented by Boolean functions”).
Regarding Claim 15:
Due to claim language similar to that of Claim 7, Claim 15 is rejected for the same reasons as presented above in the rejection of Claim 7.
Regarding Claim 16:
Due to claim language similar to that of Claim 8, Claim 16 is rejected for the same reasons as presented above in the rejection of Claim 8.
Regarding Claim 17:
Due to claim language similar to that of Claims 1 and 9, Claim 17 is rejected for the same reasons as presented above in the rejection of Claims 1 and 9.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US 20210319090 A1 – An apparatus to facilitate an authenticator-integrated generative adversarial network (GAN) for secure deepfake generation
US 20210264280 A1 – The disclosure relates particularly to preventing mode collapse and stabilizing the training while training generative adversarial networks
US 20210142180 A1 – a computer implemented method to identify relevant feedback
Zhao, Z., Kunar, A., Birke, R., & Chen, L. Y. (2022). CTAB-GAN+: Enhancing Tabular Data Synthesis. arXiv [Cs.LG]. Retrieved from http://arxiv.org/abs/2204.00401 – CTAB-GAN+ a novel conditional tabular GAN
L. Hu, M. Kan, S. Shan and X. Chen, "Duplex Generative Adversarial Network for Unsupervised Domain Adaptation," 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018, pp. 1498-1507, doi: 10.1109/CVPR.2018.00162. – a novel GAN architecture with duplex adversarial discriminators (referred to as DupGAN), which can achieve domain-invariant representation and domain transformation
Y. Yuan and Y. Guo, "A Review on Generative Adversarial Networks," 2020 5th International Conference on Information Science, Computer Technology and Transportation (ISCTT), Shenyang, China, 2020, pp. 392-401, doi: 10.1109/ISCTT51595.2020.00074. – A Review on Generative Adversarial Networks
A. Ahmetoğlu and E. Alpaydın, "Hierarchical Mixtures of Generators for Adversarial Learning," 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 2021, pp. 316-323, doi: 10.1109/ICPR48806.2021.9413249. – we propose the hierarchical mixture of generators, inspired from the hierarchical mixture of experts model, that learns a tree structure implementing a hierarchical clustering with soft splits in the decision nodes and local generators in the leaves.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to COREY M SACKALOSKY whose telephone number is (703)756-1590. The examiner can normally be reached M-F 7:30am-3:30pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Omar Fernandez Rivas, can be reached at (571) 272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/COREY M SACKALOSKY/Examiner, Art Unit 2128
/OMAR F FERNANDEZ RIVAS/Supervisory Patent Examiner, Art Unit 2128