Last updated: May 29, 2026

Application No. 18/203,076

MODEL-SPECIFIC SYNTHETIC DATA GENERATION FOR MACHINE LEARNING MODEL TRAINING

Non-Final OA §101§103

Filed

May 30, 2023

Examiner

NILSSON, ERIC

Art Unit

2151

Tech Center

2100 — Computer Architecture & Software

Assignee

International Business Machines Corporation

OA Round

1 (Non-Final)

Interview Optional

+17.7% interview lift. An interview has already been conducted in this application's prosecution history. This examiner has an 83% grant rate. Because an interview has already been tried, a written response with narrowed claims, based on precedent claim-evolution patterns, is recommended.

Based on 501 resolved cases, 2023–2026

Examiner Intelligence

NILSSON, ERIC

Grants 83% — above average

Career Allowance Rate

415 granted / 501 resolved

+27.8% vs TC avg

Strong +18% interview lift

Interview lift: +17.7% (resolved cases with interview vs. without)

Typical timeline

3y 1m

Avg Prosecution

26 currently pending

Career history

528

Total Applications

across all art units

Statute-Specific Performance

§101: 14.4% (-25.6% vs TC avg)

§103: 63.9% (+23.9% vs TC avg)

§102: 7.7% (-32.3% vs TC avg)

§112: 1.3% (-38.7% vs TC avg)

Tech Center averages are estimates. Based on career data from 501 resolved cases.

Office Action

§101 §103

DETAILED ACTION
This action is in response to claims filed 30 May 2023 for application 18203076 filed 30 May 2023. Currently claims 1-20 are pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
In step 1, claims 1, 10 and 19 are directed to the statutory category of a method, an article of manufacture and a system. 
	In step 2a prong 1, claims 1, 10 and 19 recite, in part, receiving a trained model, extracting a set of features based on importance, generating a set of marginal queries, performing a measurement of the queries, and using the measurements to generate synthetic data. The limitations of receiving, extracting, generating, performing, and using are processes that, under their broadest reasonable interpretation, cover performance of the limitations in the mind but for the recitation of generic computer components. That is, other than reciting “computer”, “processor” and “computer-readable storage medium” in the context of the claims, the limitations encompass extracting features from a model based on importance and performing measurements to create synthetic data in the mind or with aid. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claims recite an abstract idea.
	In step 2a prong 2, this judicial exception is not integrated into a practical application. In particular, the claims recite the additional elements of “computer”, “processor” and “computer-readable storage medium”. The computer components in the claims are recited at a high level of generality (i.e., as a generic processor performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component (MPEP §2106.05(f)). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Please see MPEP §2106.04(a)(2)(III)(C). This limitation amounts to mere insignificant extra-solution activity of transmitting information. Please see MPEP §2106.05(g).
	In step 2b, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception, either alone or in combination. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of “computer”, “processor” and “computer-readable storage medium” to perform the steps of the claims amount to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.
	Claims 2-9, 11-18 and 20 recite further limitations of retraining the model, source data is private and privacy preserving measurements are used, using differentially private measurements, the model has been trained on private data, using differentially private extraction, global or local feature importance, covariance or sensitivity, and using k-way marginal queries. These limitations amount to the same abstract idea recited above in step 2a prong 1. No further additional elements are recited and thus the claims do not recite any practical application in step 2a prong 2 or amount to significantly more in step 2b.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Vietri et al. (Private Synthetic Data for Multitask Learning and Marginal Queries) in view of Mireshghallah (Not All Features Are Equal: Discovering Essential Features for Preserving Prediction Privacy) (hereinafter “Miresh”).

Regarding claims 1, 10 and 19, Vietri discloses: A computer-implemented method comprising:
extracting a set of features (“We use the five largest states (California, New York, Texas, Florida, and Pennsylvania) which together with the five tasks constitute 25 datasets. We used the folktables package [DHMS21] to extract features and tasks. In the appendix, we include a table that summarizes the number of categorical and numerical features for each ACS task and the number of rows in each of the 25 datasets.” P7 §5.Datasets)
wherein each of said extracted features is assigned a feature importance score  (“This query class is constructed to preserve the relationship between feature columns and target columns with the goal of generating synthetic data useful for training ML models. Since the possible set of queries for this class is finite we enumerate over all possible combinations of 3-way marginals in our experiments.” P8 §2.1 ¶2);
generating a set of marginal queries based, at least in part, on a selected subset of said features (p3 Definition 3 discloses mixed marginal queries);
performing a measurement of said set of marginal queries on a source database, to obtain measurements of said set of marginal queries on said source database (p3 Definition 3 discloses mixed marginal queries and Definition 4 discloses a threshold and values for the queries against the dataset/features); and
using said measurements to generate synthetic data that matches said measurements (“The quality of synthetic data can be evaluated in task-specific ways. Since the goal is to accurately answer a set of queries over numerical features, we can evaluate the difference between answers to the queries on the synthetic data and those on the real data, summarized by an ℓ1 norm.” p3 §2.1 ¶7).

Vietri does not explicitly disclose the following; however, Miresh teaches:
receiving a trained machine learning model (“In the black-box setup, we have no access to the target classifier, nor the data it was trained on. In both cases, we need labeled training data from the data distribution D, that the target classifier was trained on. We do not, however, need access to the exact same training data, nor do we need any extra collaboration from the service provider, such as a change in infrastructure or model parameters” p2 §2 ¶2);
extracting a set of features associated with said machine learning model…
a feature importance score which represents a relative explanatory power of said feature with respect to an output of said machine learning model (features extracted based on importance p2 §2 ¶3);
feature importance score (only important features are kept p2 §2 ¶3).

	Vietri and Miresh are in the same field of endeavor of extracting features in private environments. Vietri discloses a method of extracting features using marginal queries and differential privacy. Miresh discloses extracting features from a trained model using feature importance. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the known privacy-preserving feature extraction by using features related to a trained model and using feature importance scores as taught by Miresh to yield predictable results of building the model using only the most useful features to ensure model privacy and accuracy (Miresh p2 §2 ¶3).
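
For orientation, the sequence of steps mapped above for claims 1, 10 and 19 (select features by importance score, build a marginal over the selected subset, measure it on the source database, and generate synthetic data that matches the measurement) can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation and not the method of either cited reference: the function names, the importance scores and the toy rows are all hypothetical, and the synthesis step simply replays the measured counts, which is much simpler than a practical generator would be.

```python
from collections import Counter
from typing import Dict, List, Sequence

Row = Dict[str, str]


def select_top_features(importance: Dict[str, float], k: int) -> List[str]:
    """Keep the k features with the highest importance scores (name-sorted)."""
    ranked = sorted(importance, key=lambda name: importance[name], reverse=True)
    return sorted(ranked[:k])


def measure_marginal(rows: Sequence[Row], features: Sequence[str]) -> Counter:
    """Count how often each value combination of `features` occurs in `rows`."""
    return Counter(tuple(row[f] for f in features) for row in rows)


def synthesize(measurement: Counter, features: Sequence[str]) -> List[Row]:
    """Emit rows whose marginal over `features` reproduces the counts exactly."""
    synthetic: List[Row] = []
    for values, count in sorted(measurement.items()):
        for _ in range(count):
            synthetic.append(dict(zip(features, values)))
    return synthetic


# Hypothetical scores, standing in for importances derived from a trained model.
importance = {"age_band": 0.6, "state": 0.1, "employed": 0.3}

# Hypothetical source database.
source = [
    {"age_band": "30-39", "state": "CA", "employed": "yes"},
    {"age_band": "30-39", "state": "NY", "employed": "yes"},
    {"age_band": "40-49", "state": "TX", "employed": "no"},
]

selected = select_top_features(importance, k=2)
measurement = measure_marginal(source, selected)
synthetic = synthesize(measurement, selected)
```

In this toy run the low-importance `state` column is never measured, so it does not appear in the synthetic rows; only the statistics over the selected features are preserved.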


Regarding claims 2, 11 and 20, Vietri discloses: The computer-implemented method of claim 1, further comprising using said synthetic data to construct a training dataset, wherein said training dataset is used to re-train … model (“We also train linear models for multiple classification tasks using the synthetic datasets generated from different algorithms. We find that RAP++ provides the highest accuracy when the numeric features are predictive of the target label, and closely tracks all benchmark accuracy in all other cases” p2 §Empirical evaluations).

Vietri does not explicitly disclose the following; however, Miresh teaches:
said received trained machine learning model (p2 §2 ¶2)

Regarding claims 3 and 12, Vietri discloses: The computer-implemented method of claim 1, wherein said source database comprises private data, and wherein said measurement is a privacy-preserving measurement of said set of marginal queries on said source database comprising said private data, to obtain privacy-preserving measurements of said set of marginal queries on said source database (“The notion of privacy that we adopt in this paper is differential privacy, which measures the effect of small changes in a dataset on a randomized algorithm. Formally, we say that two datasets are neighboring if they are different in at most one data point.” P3 §2.2 ¶1, see also Definition 6).

Regarding claims 4 and 13, Vietri discloses: The computer-implemented method of claim 3, wherein said privacy-preserving measurement is a differentially-private measurement (“The notion of privacy that we adopt in this paper is differential privacy, which measures the effect of small changes in a dataset on a randomized algorithm. Formally, we say that two datasets are neighboring if they are different in at most one data point.” P3 §2.2 ¶1, see also Definition 6).
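
To make the "differentially-private measurement" limitation concrete, the sketch below adds Laplace noise to the cell counts of a marginal. It is an illustration only: the Laplace mechanism is a standard textbook choice swapped in here and is not asserted to be the mechanism used by Vietri or by the applicant, the function names are hypothetical, and the sensitivity of 2 assumes the replace-one-record notion of neighboring datasets quoted above (changing one record moves one unit of count from one cell to another).

```python
import random
from typing import Dict, Hashable, Optional


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_counts(
    counts: Dict[Hashable, int],
    epsilon: float,
    sensitivity: float = 2.0,  # assumption: neighbors differ by replacing one record
    rng: Optional[random.Random] = None,
) -> Dict[Hashable, float]:
    """Return noisy counts; `counts` must list every cell, including empty ones."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    scale = sensitivity / epsilon
    return {cell: count + laplace_noise(scale, rng) for cell, count in counts.items()}


# Hypothetical marginal over one binary feature.
noisy = private_counts({("yes",): 2, ("no",): 1}, epsilon=1.0, rng=random.Random(0))
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of less accurate measurements. Plain floating-point sampling like this is adequate for illustration but is not a hardened implementation.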

Regarding claims 5 and 14, Vietri discloses: The computer-implemented method of claim 1, … wherein said extracting and said assigning are performed using a privacy-preserving feature importance extraction method (“The notion of privacy that we adopt in this paper is differential privacy, which measures the effect of small changes in a dataset on a randomized algorithm. Formally, we say that two datasets are neighboring if they are different in at most one data point.” P3 §2.2 ¶1, see also Definition 6).

Vietri does not explicitly disclose the following; however, Miresh teaches: trained machine learning model is initially-trained on private data (§2 Mutual Information and §6.2 disclose that the models are trained using private data)

Regarding claims 6 and 15, Vietri discloses: The computer-implemented method of claim 5, wherein said privacy-preserving feature importance extraction method is a differentially-private feature importance extraction method  (“The notion of privacy that we adopt in this paper is differential privacy, which measures the effect of small changes in a dataset on a randomized algorithm. Formally, we say that two datasets are neighboring if they are different in at most one data point.” P3 §2.2 ¶1, see also Definition 6).

Regarding claims 7 and 16, Vietri does not explicitly disclose the following; however, Miresh teaches: The computer-implemented method of claim 1, wherein said feature importance scores comprise at least one of the following categories: global feature importance scores, and local feature importance scores (features extracted based on importance, importance for model interpreted as global importance p2 §2 ¶3).

Regarding claims 8 and 17, Vietri does not explicitly disclose the following; however, Miresh teaches: The computer-implemented method of claim 1, wherein said assigning further comprises assigning, to at least some of said features, at least one of the following measures: covariance, and sensitivity (§5.5 discloses sensitivity of features).

Regarding claims 9 and 18, Vietri discloses: The computer-implemented method of claim 1, wherein said selected subset of features comprises k features, and wherein said set of marginal queries comprises all k-way marginal queries which include said selected subset of k features (p3 Definition 2 discloses the use of k-way marginal queries).
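
As an illustration of what a set of k-way marginal queries over a selected subset of k features can look like, the sketch below enumerates one query per combination of values of the selected features and answers each as the fraction of rows that match. This follows one common formulation of marginal queries over categorical data and is not asserted to reproduce Definition 2 of Vietri; the function names, feature domains and rows are hypothetical.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

Query = Tuple[Tuple[str, str], ...]  # ((feature, value), ...) pairs


def k_way_queries(domains: Dict[str, Sequence[str]], selected: Sequence[str]) -> List[Query]:
    """Enumerate every value assignment over the selected features."""
    value_lists = [domains[f] for f in selected]
    return [tuple(zip(selected, values)) for values in product(*value_lists)]


def answer(query: Query, rows: Sequence[Dict[str, str]]) -> float:
    """Fraction of rows that satisfy every (feature, value) pair in the query."""
    if not rows:
        return 0.0
    hits = sum(all(row[f] == v for f, v in query) for row in rows)
    return hits / len(rows)


# Hypothetical categorical domains and source rows.
domains = {
    "age_band": ["30-39", "40-49", "50-59"],
    "employed": ["yes", "no"],
}
rows = [
    {"age_band": "30-39", "employed": "yes"},
    {"age_band": "30-39", "employed": "yes"},
    {"age_band": "40-49", "employed": "no"},
]

queries = k_way_queries(domains, ["age_band", "employed"])
answers = {q: answer(q, rows) for q in queries}
```

With k = 2 selected features of domain sizes 3 and 2, the enumeration yields 3 × 2 = 6 queries. Because every row falls into exactly one cell, the answers sum to 1.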

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIC NILSSON whose telephone number is (571)272-5246. The examiner can normally be reached M-F: 7-3.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, James Trujillo can be reached at (571)-272-3677. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ERIC NILSSON/           Primary Examiner, Art Unit 2151


Prosecution Timeline

May 30, 2023

Application Filed

Feb 26, 2026

Non-Final Rejection mailed — §101, §103

May 12, 2026

Interview Requested

May 20, 2026

Examiner Interview Summary

May 20, 2026

Applicant Interview (Telephonic)

Precedent Cases

Applications granted by this same examiner with similar technology

18/211,153

Patent 12626169

BAYESIAN CAUSAL RELATIONSHIP NETWORK MODELS FOR HEALTHCARE DIAGNOSIS AND TREATMENT BASED ON PATIENT DATA

2y 11m to grant Granted May 12, 2026

17/471,124

Patent 12619869

LEARNING APPARATUS, LEARNING METHOD, AND A NON-TRANSITORY COMPUTER-READABLE STORAGE MEDIUM

4y 7m to grant Granted May 05, 2026

17/792,580

Patent 12619925

CONTEXT-LEVEL FEDERATED LEARNING

3y 9m to grant Granted May 05, 2026

17/781,539

Patent 12608613

PARAMETER OPTIMIZATION DEVICE, PARAMETER OPTIMIZATION METHOD, AND PARAMETER OPTIMIZATION PROGRAM

3y 10m to grant Granted Apr 21, 2026

17/954,485

Patent 12607972

Method and Apparatus for Monitoring Machine Learning Models

3y 6m to grant Granted Apr 21, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2

Expected OA Rounds

83%

Grant Probability

99%

With Interview (+17.7%)

3y 1m (~1m remaining)

Median Time to Grant

Low

PTA Risk

Based on 501 resolved cases by this examiner. Grant probability derived from career allowance rate.