Prosecution Insights
Last updated: April 19, 2026
Application No. 17/846,113

SYNTHETICALLY GENERATED HEALTHCARE DOCUMENTS FOR CLASSIFIER TRAINING

Final Rejection §103
Filed: Jun 22, 2022
Examiner: NEHCHIRI, KOOROSH
Art Unit: 2174
Tech Center: 2100 — Computer Architecture & Software
Assignee: Concord III LLC
OA Round: 4 (Final)
Grant Probability: 43% (Moderate)
Expected OA Rounds: 5-6
Median Time to Grant: 3y 11m
With Interview: 73%

Examiner Intelligence

Grants 43% of resolved cases
Career Allow Rate: 43% (58 granted / 135 resolved; -12.0% vs TC avg)
Interview Lift: +30.3% (strong; based on resolved cases with interview)
Avg Prosecution: 3y 11m (typical timeline)
Currently Pending: 24
Total Applications: 159 (career history, across all art units)

Statute-Specific Performance

§101: 3.5% (-36.5% vs TC avg)
§103: 71.6% (+31.6% vs TC avg)
§102: 10.9% (-29.1% vs TC avg)
§112: 10.4% (-29.6% vs TC avg)
Tech Center averages are estimates. Based on career data from 135 resolved cases.

Office Action

§103
DETAILED ACTION

This action is in response to the Amendment dated 04 December 2025. No claim has been amended. No claims have been added. Claims 3, 7 and 11 were previously cancelled. Claims 1-2, 4-6, 8-10 and 12 remain pending and have been considered below.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 21 April 2025 has been entered.

Response to Arguments

Applicant argues that [“Examiner does not map the claimed 'extraction of data FROM A SPECIFIC COMMON FIELD' as claimed to any portion of Shetty. Instead, Examiner admits that Shetty merely teaches the use of 'form features for non-text field characteristics' like 'field spacing' or 'field label text'. As conceded by Examiner, Shetty uses these 'non-text field characteristics' together with 'text-based field characteristics' to produce a categorization of the form. Nowhere in Examiner's analysis is there any discussion of the 'extraction of data from a specific common field' to all of the forms that have been received” (Pages 11-12)]. Examiner respectfully disagrees. Applicant's own specification recites “a set of documents of a specific type of healthcare form are queued for processing and a specific common field is identified in each of the documents meaning that the field is present in each of the documents in the set” (par. 0014).
As recited in the previous office action, SHETTY discloses “The invention automatically classifies forms into categories using form features for non-text field characteristics (e.g., field spacing) or field-specific text characteristics (e.g., that a form has a field with field label text “Full Legal Name”) as an alternative to, or in addition to, using the words in the form” (par. 0015), and “Typically, a form includes a template of fields and additional information added by one or more persons completing the form. A form will generally provide a way for the persons entering information to enter information in a consistent way so that a receiver of multiple instances of the completed form can read or extract information at particular locations on the form and understand, based on the location, the information. Similarly, the use of fields at particular locations on forms facilitates the automatic interpretation of information entered onto the forms. A form may, for example, have a name field and a recipient or analysis application may understand based on the location of the text added to the form by a person completing the form that the added text is the name of the person. The template of a form can specify fields and field characteristics” (par. 0020). SHETTY further discloses “A field information type identifies the subject matter of the field (e.g., “first name” field, “address” field, “VIN” field, etc.). The characteristics of a field may be manually, semi-automatically, or automatically detected on a form” (par. 
0022), and “the location of a field on a page (located by scanning or already known based on the form's metadata), the locations of some or all of the fields with respect to each other (i.e., the field layout), the information type of those fields, and other field characteristics can alone or in combination with one another, form a feature set or feature vector for the form that can be compared with the feature set or feature vector of other forms (or of a particular category) to categorize the forms” (par. 0035), and “Associating the forms with the categories is then based on the feature vectors. In one embodiment, a feature vector is defined by several normalized measurements including, but not limited to, the average font size of field labels, average height of form fields, average vertical spacing between vertically stacked form fields, and percentage of form fields contained within a table” (par. 0040), and “fields are identified based on metadata associated with the form. Other embodiments use these techniques in combination with one another or using alternative or additional techniques to identify fields in the forms” (par. 0051) [emphases added]. Therefore, SHETTY teaches classifying forms through identifying fields in the forms through their metadata, such as “Full Legal Name”, “First Name”, “Address” or “VIN”, and finding the characteristics of the field to be used to classify the form. Thus, SHETTY teaches extracting data from a specific common field to help categorize the form.

Applicant further argues that [“it is clear that Examiner has interpreted Zhu to teach generative models trained locally at a site using unlabeled data observed at that site to generate synthetic unlabeled data that mimics the unlabeled data used to train the generative model. So much is plainly not the same or even similar to Applicant's claimed computation of a statistical metric for a specific common field of a multiplicity of different forms.
To be clear, the training of a generative AI model is not the computation of a statistical metric” (Page 13)]. Examiner respectfully disagrees. ZHU teaches “some machine learning techniques use an underlying model M, whose parameters are optimized for minimizing the cost function associated to M, given the input data. For instance, in the context of classification, the model M may be a straight line that separates the data into two classes (e.g., labels) such that M=a*x+b*y+c and the cost function is a function of the number of misclassified points. The learning process then operates by adjusting the parameters a,b,c such that the number of misclassified points is minimal. After this optimization/learning phase, machine learning process 248 can use the model M to classify new data points, such as information regarding new traffic flows in the network, new data samples from sensors, etc. Often, M is a statistical model, and the cost function is inversely proportional to the likelihood of M, given the input data” (par. 0035; see also pars. 0037 and 0042) (emphasis added). Therefore, ZHU teaches machine learning techniques based on a statistical model [ZHU also is cited in combination with SHETTY, UNSAL and BERSETH; see also UNSAL, par. 0129].

Applicant further argues that [“Examiner has not provided any evidence showing any 'insertion' of any 'synthetically generated value' into any 'field' of any 'training version' of any 'electronic form'. Further, Examiner's closing statement of 'including labeled samples in the training data which may include multiple labels ... which would enable inserting the synthetically generated value into the specific common field of a training version of the electronic forms' lacks any support in Zhu. Examiner has simply made up the underlined portion of Examiner's argument at page 5 of the New Non-Final Office Action” (Pages 13-14)]. Examiner respectfully disagrees.
ZHU teaches “Embodiments of the present disclosure generate results data that indicates the best determined candidate functions for each data field of the new and/or updated form, based on how well test data from the best functions match the training set data” (par. 0055), and “FIGS. 5A-5B illustrate examples of training approaches of the global model after representation learning, in various embodiments. As shown in FIG. 5A, global model 510 may be trained using only originally labeled samples, in one embodiment. More specifically, unlabeled data 504 may be mapped to features 508 by an encoder 506 and global model 510 trained using the labeled subset of data 502 and features 508 from encoder 506. In FIG. 5B, a further approach would be to train global model 510 using synthetic unlabeled samples. Notably, n-number of generative models 524 may take as input noise 522 and output a set of synthetic data 526. In turn, synthetic data 526 can be fed to encoder 528 for mapping to features 530, which can then be used to train global model 510” (par. 0060) (emphases added) [ZHU also is cited in combination with SHETTY, UNSAL and BERSETH; see also UNSAL, par. 0195; and BERSETH fig. 2, items 235 and 240, and pars. 0052-0053, wherein at 235, synthetic data may be generated and, at 240, synthetic forms may be populated with synthetic data].

Applicant further argues that [“Examiner admits that Berseth only refers to the use of random noise to modify the x-y coordinates of an IMAGE MASK. An IMAGE MASK is hardly the same as a value for a field in a form used for training a classifier. Examiner has failed to show how one of skill in the art of generating a training form for training a classifier would turn to the unrelated art of image mask use in generating synthetic images” (Page 15)]. Examiner respectfully disagrees.
BERSETH teaches “In some aspects, the process may further comprise the process step of adding noise to at least a portion of one or more of the plurality of sets of synthetic data, the first set of data, the synthetic form, or the background data, wherein adding noise is variable” (par. 0014; the same language appears at par. 0018) (see also figs. 4A-4B), and “Noise: as used herein refers to aberrations on an authentic form or a synthetic form that does not comprise background data or content data” (par. 0033) (see also pars. 0064-0065) [BERSETH also is cited in combination with SHETTY, UNSAL and ZHU; see also ZHU, figs. 5A-5B, par. 0060]. Thus, the combination of SHETTY, ZHU, UNSAL and BERSETH adequately discloses applicant's claimed limitation. Examiner respectfully reminds Applicant that during examination, the claims must be interpreted as broadly as their terms reasonably allow. In re American Academy of Science Tech Center, 367 F.3d 1359, 1369, 70 U.S.P.Q.2d 1827, 1834 (Fed. Cir. 2004).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 4-6, 8-10 and 12 are rejected under 35 U.S.C.
103 as being unpatentable over SHETTY et al. (US20170075974A1) in view of ZHU et al. (US20200090002A1), in further view of UNSAL et al. (US20180018582A1), and in further view of BERSETH et al. (US20190340466A1).

Regarding independent claim 1, SHETTY teaches a method for synthetically generating health care forms for use in training a health care form classifier, the method comprising: receiving a multiplicity of electronic forms in memory of a host computing system; extracting data from a specific common field located in each of the forms (SHETTY teaches automatic form classification uses a machine learning algorithm, which automatically classifies forms into categories using form features for non-text field characteristics (e.g., field spacing) or field-specific text characteristics (e.g., that a form has a field with field label text “Full Legal Name”) as an alternative to, or in addition to, using the words in the form, where a training set of a collection of user forms is used to create a training model and each form is represented in terms of a feature vector for categorization using the training model (pars. 0015-0017 and 0020-0025). SHETTY teaches non-text field characteristics can be used together with text-based field characteristics to provide the form features, and that the field characteristics are used for form categorization (see pars. 0034-0037)); persisting the training version of the electronic forms as part of a training data set for a classifier adapted to classify the multiplicity of electronic forms (SHETTY teaches a training set of a collection of user forms is used to create a training model and each form is represented in terms of a feature vector for categorization using the training model, where the feature vectors for the forms are generated with features based on non-text field characteristics (e.g., the number of fields, the types of fields, the locations of fields, etc.)
or field-specific text characteristics (e.g., the field label text, font, or orientation associated with a particular field, etc.), in addition to, or as an alternative to, features based on plain document text, where the feature set includes one or more of these form-specific features in addition to text-based features (par. 0016). SHETTY teaches classification of the forms into categories such as a “Medical” form category (see pars. 0016-0017)); and, submitting the training data set to the classifier so as to train the classifier with forms containing synthetically generated values (SHETTY teaches once the forms are represented as feature vectors, the feature vectors are input into the learning algorithm, e.g., a linear classifier algorithm (see par. 0059)).

SHETTY does not expressly teach computing a statistical metric for the specific common field; synthetically generating a value for the specific common field according to the computed statistical metric; generating random noise and modifying the synthetically generated value with the random noise; or inserting the synthetically generated value into the specific common field of a training version of the electronic forms.

In a similar field of endeavor, ZHU teaches computing a statistical metric for the specific common field (ZHU teaches a service receives machine learning-based generative models from a plurality of distributed sites, and each generative model is trained locally at a site using unlabeled data observed at that site to generate synthetic unlabeled data that mimics the unlabeled data used to train the generative model, and the service receives, from each of the distributed sites, a subset of labeled data observed at that site, and the service uses the generative models to generate synthetic unlabeled data (see pars. 0045-0046 and 0051-0054). ZHU teaches using statistical techniques and models to classify the gathered data (see pars.
0035-0037)); and inserting the synthetically generated value into the specific common field of a training version of the electronic forms (ZHU teaches the service trains a global machine learning-based model using the received subsets of labeled data received from the distributed sites and the synthetic unlabeled data generated by the generative models (pars. 0045-0046 and 0048). ZHU also teaches including labeled samples in the training data which may include multiple labels (see par. 0055), which would enable inserting the synthetically generated value into the specific common field of a training version of the electronic forms).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the SHETTY apparatus to include the teachings of ZHU for computing a statistical metric for the specific common field and inserting the synthetically generated value into the specific common field of a training version of the electronic forms. Such a person would have been motivated to make this combination because ZHU recognized that these techniques could be applied to a wide range of learning tasks, such as representation learning (e.g., anomaly detection), classification (e.g., employee identification, etc.), and regression (e.g., application performance score prediction) (see par. 0045), and contemplated modification of the models for use in other functions (see pars. 0076-0077), and therefore would have provided these benefits to the form classification and equation techniques as disclosed by SHETTY.

SHETTY and ZHU do not expressly teach synthetically generating a value for the specific common field according to the computed statistical metric, or generating random noise and modifying the synthetically generated value with the random noise.
In a similar field of endeavor, UNSAL teaches synthetically generating a value for the specific common field according to the computed statistical metric (UNSAL teaches machine learning and genetic algorithms to quickly and accurately determine an acceptable function needed to complete the one or more data fields in a form, gathering training set data that includes previously filled forms related to the new and/or updated form in order to assist in the machine learning process, and generating candidate functions for one or more data fields of the new and/or updated form (pars. 0007-0012). UNSAL also teaches a function may require that multiple data values be gathered from multiple places within other forms, the same form, from a user, or from other locations or databases, and a complex function may also include mathematical relationships that will be applied to the multiple data values in complex ways in order to generate the proper data value for the data field (par. 0129). UNSAL also teaches a function may include finding the minimum data value among two or more data values, finding the maximum data value among two or more data values, addition, subtraction, multiplication, division, exponential functions, logic functions, existence conditions, string comparisons, etc., where the machine learning module 113 can generate and test complex candidate functions until an acceptable function has been found for a particular data field (par. 0129). UNSAL also teaches applying a standard deviation to the fitness values while producing candidate functions (par. 0216)).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the SHETTY and ZHU apparatus to include the teachings of UNSAL for synthetically generating a value for the specific common field according to the computed statistical metric.
Such a person would have been motivated to make this combination because, by utilizing machine learning to learn and incorporate new and/or updated forms in an electronic document preparation system, users can save money and time and can better manage their finances (see UNSAL, par. 0022).

SHETTY, ZHU and UNSAL do not expressly teach generating random noise and modifying the synthetically generated value with the random noise. In a similar field of endeavor, BERSETH teaches generating random noise and modifying the synthetically generated value with the random noise (BERSETH teaches that the process may further comprise the process step of adding noise to at least a portion of one or more of the plurality of sets of synthetic data, the first set of data, the synthetic form, or the background data, wherein adding noise is variable (see pars. 0014 and 0018). BERSETH also teaches that the development steps may be used to translate the mask to different coordinates. In some aspects, synthetic data and synthetic form images may be combined to create high quality synthetic images. In some implementations, the mask may be translated from the original x, y coordinates to new x, y coordinates by adding or subtracting a small number of elements from the mask. In some implementations, a synthetic form 500 may have random noise added to the image and have the location of the data shifted along the y axis so the data is overlapping with the form elements. In some implementations, a synthetic form 550 may have random noise added to the image and have the location of the data shifted along the x axis so the data is overlapping with the form elements (see fig. 5, par. 0063)).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the SHETTY, ZHU and UNSAL apparatus to include the teachings of BERSETH for generating random noise and modifying the synthetically generated value with the random noise.
Such a person would have been motivated to make this combination because, for producing better models and to enhance or expedite machine learning, there is a need for a greater volume of reliable and varied training data (see BERSETH, par. 0007).

Regarding dependent claim 2, SHETTY, ZHU, UNSAL and BERSETH teach the limitations of claim 1. SHETTY further teaches the method of claim 1, wherein the electronic forms conform to an annotated template including an identification of the specific common field, since SHETTY teaches a form includes a template of fields and additional information added by one or more persons completing the form, and the template of a form can specify fields and field characteristics (pars. 0020-0021).

Regarding dependent claim 4, SHETTY, ZHU, UNSAL and BERSETH teach the limitations of claim 1. SHETTY suggests but does not expressly disclose the method of claim 1, wherein the computed statistical metric is a distribution of values for the specific common field, since SHETTY teaches computing form characteristics such as the distributions of field labels (par. 0023) and the font size distributions of field labels (par. 0052), and the data from the feature storage is provided in an export of the data for training purposes (pars. 0053-0054). SHETTY further teaches feature vectors that contain a distribution of values for the specific common field (see pars. 0057-0058). However, ZHU teaches examples of computing statistics or metrics using distributions of data, and distributed data sources (see pars. 0042-0043).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to have combined the automatic form classification using a machine learning algorithm taught by SHETTY, the techniques for machine learning model training using partially labeled data from across multiple geographical locations taught by ZHU, and the techniques to determine an acceptable function needed to complete the one or more data fields in a form taught by UNSAL, since ZHU recognized that these techniques could be applied to a wide range of learning tasks, such as representation learning (e.g., anomaly detection), classification (e.g., employee identification, etc.), and regression (e.g., application performance score prediction) (see par. 0045), and contemplated modification of the models for use in other functions (see pars. 0076-0077), and therefore would have provided these benefits to the form classification and equation techniques as disclosed by SHETTY and UNSAL.

Regarding independent claim 5, SHETTY teaches a data processing system comprising: a host computing platform comprising one or more computers, each with memory and one or more processing units including one or more processing cores; and, a synthetic form generation module comprising computer program instructions enabled while executing in the memory of at least one of the processing units of the host computing platform; since SHETTY teaches computer devices with a processor and memory, to execute computer program code (see pars. 0061-0064). The remaining limitations of independent claim 5 are directed to the system for implementing the method claimed in independent claim 1, and recite substantially similar subject matter; therefore these limitations of claim 5 are rejected along the same rationale as claim 1.
Regarding dependent claims 6 and 8, claims 6 and 8 are directed to the system for implementing the methods claimed in dependent claims 2 and 4, and are directed to substantially similar subject matter; therefore, claims 6 and 8 are rejected along the same rationale.

Regarding independent claim 9 and dependent claims 10 and 12, claims 9-10 and 12 are directed to the computing device for use with the system recited in independent claim 5 and dependent claims 6 and 8; therefore claims 9-10 and 12 are rejected along the same rationale as claims 5-6 and 8.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:

US 20230010686 A1 (filed 2020-12-04): Generating synthetic patient health data
US 20210357688 A1 (filed 2020-12-16): Artificial Intelligence System For Automated Extraction And Processing Of Dental Claim Forms

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to KOOROSH NEHCHIRI, whose telephone number is (408) 918-7643. The examiner can normally be reached M-F, 11-7 PST.
Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, William L. Bashore, can be reached at 571-272-4088. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/KOOROSH NEHCHIRI/
Examiner, Art Unit 2174

/WILLIAM L BASHORE/
Supervisory Patent Examiner, Art Unit 2174

Prosecution Timeline

Jun 22, 2022
Application Filed
Jun 05, 2024
Non-Final Rejection — §103
Dec 10, 2024
Response Filed
Dec 16, 2024
Final Rejection — §103
Apr 21, 2025
Request for Continued Examination
May 01, 2025
Response after Non-Final Action
May 30, 2025
Non-Final Rejection — §103
Dec 04, 2025
Response Filed
Feb 26, 2026
Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12596969: CROSS-JURISDICTION WORKLOAD CONTROL SYSTEMS AND METHODS (granted Apr 07, 2026; 2y 5m to grant)
Patent 12591360: USER INTERFACE ADJUSTMENT BASED ON PROXIMITY TO UPCOMING MANEUVER (granted Mar 31, 2026; 2y 5m to grant)
Patent 12580827: SYSTEMS AND METHODS FOR IMPROVED NETWORK SERVICES MANAGEMENT (granted Mar 17, 2026; 2y 5m to grant)
Patent 12570146: INFORMATION APPARATUS AND MENU DISPLAY METHOD (granted Mar 10, 2026; 2y 5m to grant)
Patent 12547295: SMART CAROUSEL OF IMAGE MODIFIERS (granted Feb 10, 2026; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 43%
With Interview: 73% (+30.3%)
Median Time to Grant: 3y 11m
PTA Risk: High
Based on 135 resolved cases by this examiner. Grant probability derived from career allow rate.
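As a sanity check, the headline projections are mutually consistent if the interview lift is treated as additive percentage points on top of the base allow rate. The additive treatment is our assumption; the tool does not state how it combines the figures.

```python
# Figures taken from this page's examiner data.
granted, resolved = 58, 135

# Career allow rate, rounded to a whole percent.
allow_rate = round(100 * granted / resolved)

# Interview-adjusted probability, assuming the +30.3% interview lift is
# additive in percentage points (an assumption, not stated by the tool).
interview_lift_points = 30.3
with_interview = round(allow_rate + interview_lift_points)

print(allow_rate, with_interview)  # 43 73
```

Both rounded values match the dashboard (43% base, 73% with interview), which supports reading "+30.3%" as a percentage-point delta rather than a relative multiplier.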
