Last updated: May 29, 2026
Application No. 17/480,447
SYSTEMS AND METHODS FOR SYNTHETIC DOCUMENT AND DATA GENERATION

Final Rejection §101 §103
Filed
Sep 21, 2021
Priority
Oct 17, 2018 — continuation of 11/157,816
Examiner
HONORE, EVEL NMN
Art Unit
2142
Tech Center
2100 — Computer Architecture & Software
Assignee
Capital One Services LLC
OA Round
4 (Final)
Interview Optional

Interview already conducted in this application's prosecution history. This examiner has a 46% grant rate with a +18.8% interview lift. Since an interview has already been tried, a written response with narrowed claims, based on precedent claim evolution patterns, is recommended.
Based on 22 resolved cases, 2023–2026
Examiner Intelligence

HONORE, EVEL NMN
Grants 46% of resolved cases
Career Allowance Rate
10 granted / 22 resolved
-9.5% vs TC avg
Strong +19% interview lift
+18.8%
Interview Lift
resolved cases with interview
Typical timeline
4y 2m
Avg Prosecution
13 currently pending
Career history
Total Applications
across all art units
Statute-Specific Performance

§101
11.2%
-28.8% vs TC avg
§103
85.8%
+45.8% vs TC avg
§102
2.2%
-37.8% vs TC avg
Based on career data from 22 resolved cases
Office Action

§101 §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 07/28/2025 has been entered.
 
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claim(s) 21-22 and 24-41 are rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.

When considering subject matter eligibility under 35 U.S.C. 101, it must be determined
whether the claim is directed to one of the four statutory categories of invention, i.e., process,
machine, manufacture, or composition of matter (Step 1). If the claim does fall within one of the
statutory categories, the second step in the analysis is to determine whether the claim is directed
to a judicial exception (Step 2A). The Step 2A analysis is broken into two prongs. In the first
prong (Step 2A, Prong 1), it is determined whether or not the claims recite a judicial exception (e.g., mathematical concepts, mental processes, certain methods of organizing human activity). If
it is determined in Step 2A, Prong 1 that the claims recite a judicial exception, the analysis
proceeds to the second prong (Step 2A, Prong 2), where it is determined whether or not the
claims integrate the judicial exception into a practical application. If it is determined at Step 2A,
Prong 2 that the claims do not integrate the judicial exception into a practical application, the
analysis proceeds to determining whether the claim is a patent-eligible application of the
exception (Step 2B). If an abstract idea is present in the claim, any element or combination of
elements in the claim must be sufficient to ensure that the claim integrates the judicial exception
into a practical application, or else amounts to significantly more than the abstract idea
itself. Applicant is advised to consult the 2019 PEG for more details of the analysis.
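
The sequence of steps described above can be sketched as a small decision function. This is only an illustrative reading of the paragraph, not an official USPTO procedure; the function name, parameter names, and return strings are hypothetical.

```python
def eligibility_outcome(statutory_category: bool,
                        recites_exception: bool,
                        integrated_into_practical_application: bool,
                        significantly_more: bool) -> str:
    """Walk the Step 1 / Step 2A / Step 2B sequence (illustrative only)."""
    # Step 1: the claim must fall within one of the four statutory categories.
    if not statutory_category:
        return "ineligible at Step 1"
    # Step 2A, Prong 1: does the claim recite a judicial exception?
    if not recites_exception:
        return "eligible at Step 2A, Prong 1"
    # Step 2A, Prong 2: is the exception integrated into a practical application?
    if integrated_into_practical_application:
        return "eligible at Step 2A, Prong 2"
    # Step 2B: do the additional elements amount to significantly more?
    if significantly_more:
        return "eligible at Step 2B"
    return "ineligible at Step 2B"


# The outcome the Office action reaches for the rejected claims.
result = eligibility_outcome(True, True, False, False)
```

Under this reading, the rejection corresponds to the last branch: a statutory category is found, an exception is recited, and neither Prong 2 nor Step 2B is satisfied.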

Step 1 Analysis: Is the claim to a process, machine, manufacture or composition of matter? See
MPEP § 2106.03

Claims 21-22 and 24-38 are drawn to a computer storage media comprising computer-executable instructions that when executed by a computing device cause the computing device to perform a method, claims 8-15 are drawn to a computer-implemented method and claims 16-20 are drawn to a system; therefore, each of these claim groups falls under one of the four categories of statutory
subject matter (machine/products/apparatus, process/method, manufactures and compositions of
matter; Step 1).
Nonetheless, the claims are directed to a judicially recognized exception of an abstract idea
without significantly more (Step 2A, see below). Independent claims 21, 39 and 40 are non-verbatim but similar in claim construction, and hence share the same rationale that the claimed inventions are directed to non-statutory subject matter, as follows:

Regarding claim 21:

Claim 21 recites: A system for improving machine learning by using a template to generate a corpus of synthetic document for training a machine learning model, comprising: 
at least one processor; 
and at least one non-transitory memory storing instructions that, when executed by the at least one processor, cause the system to perform operations comprising: 
receiving a plurality of documents that includes sensitive data; 
determining a distribution of positional values for a set of corresponding pixels having a same position across multiple documents of the plurality of documents;
determining a standard deviation of the distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents; 
identifying at least one common input field based on determining the standard deviation of the distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents; 
generating, based on identifying the at least one common input field in the plurality of documents, a template that is for a document type of the plurality of documents; 
extracting, based on generating the template, information from the at least one input field; 
analyzing at least one statistic associated with the extracted information to construct a data model that corresponds to each at least one input field; 
generating non-sensitive data using the data model; 
generating the corpus of synthetic documents, for training the machine learning model, by inserting the non-sensitive data based on the at least one statistic into the template; and 
training the machine learning model based on generating the corpus of synthetic documents by inserting the non-sensitive data into the template
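
One possible reading of the pixel-distribution and standard-deviation limitations recited above can be sketched as follows. The sketch assumes that the documents are equally sized grayscale images, that a "positional value" is the intensity at a given pixel position, and that high variation across documents marks a candidate input field. None of these assumptions comes from the claim itself, and the function names and threshold are hypothetical.

```python
from statistics import pstdev


def per_pixel_stdev(documents):
    """Standard deviation of each pixel position across all documents.

    documents: list of equally sized 2-D lists of grayscale values.
    """
    rows, cols = len(documents[0]), len(documents[0][0])
    return [
        [pstdev([doc[r][c] for doc in documents]) for c in range(cols)]
        for r in range(rows)
    ]


def variable_pixel_mask(documents, threshold=10.0):
    """Mark positions whose values vary across documents (candidate fields)."""
    stdev = per_pixel_stdev(documents)
    return [[s > threshold for s in row] for row in stdev]


# Three tiny 2x3 "documents": only the centre pixel of row 1 differs.
docs = [
    [[255, 255, 255], [255, 0, 255]],
    [[255, 255, 255], [255, 200, 255]],
    [[255, 255, 255], [255, 90, 255]],
]
mask = variable_pixel_mask(docs)
```

Positions that are identical in every document (the fixed form layout) have a standard deviation of zero, while positions that differ from document to document are flagged in the mask.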

Step 2A Prong One Analysis: Does the claim recite an abstract idea, law of nature, or natural
phenomenon? See MPEP § 2106.04(II)(A)(1).

Claim 21 is directed to an abstract idea, specifically, a mental process, namely concepts that can
practically be performed in the human mind, with or without the use of a physical aid such as
pen and paper (including an observation, evaluation, judgment, opinion). See MPEP §
2106.04(a)(2)(III). The claim also recites a mathematical concept: "a mathematical calculation is a mathematical operation (such as multiplication) or an act of calculating using mathematical methods to determine a variable or number." See MPEP § 2106.04(a)(2)(I)(C).


Independent claim 21 recites in part:

determining a distribution of positional values for a set of corresponding pixels having a same position across multiple documents of the plurality of documents
The limitation above is broadly and reasonably interpreted as a mental process that can
practically be performed in the human mind, with or without the use of a physical aid such as
pen and paper (including an observation, evaluation, judgment, opinion). See MPEP §
2106.04(a)(2)(III). For example, one can evaluate pixels that are received, and determine based on a judgment and opinion on how similar pixels are located in different documents. 

determining a standard deviation of the distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents
The limitation above is broadly and reasonably interpreted as a mathematical concept: "a mathematical calculation is a mathematical operation (such as multiplication) or an act of calculating using mathematical methods to determine a variable or number." See MPEP § 2106.04(a)(2)(I)(C). Standard deviation is a mathematical value defined by a specific formula involving calculating the mean, finding squared differences from the mean (variance), and taking the square root.
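
For reference, the calculation described above can be written out as follows. This is the population form; the sample form divides by N - 1 instead of N, and the Office action does not say which is meant. The symbols x_i (the values), N (their count), mu (the mean), and sigma (the standard deviation) are introduced here for illustration only.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} \left(x_i - \mu\right)^2, \qquad
\sigma = \sqrt{\sigma^2}
```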

identifying at least one common input field based on determining the standard deviation of the distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents
The limitation above is broadly and reasonably interpreted as a mental process that can
practically be performed in the human mind, with or without the use of a physical aid such as
pen and paper (including an observation, evaluation, judgment, opinion). See MPEP §
2106.04(a)(2)(III). For example, one can determine at least one common input by looking at the standard deviation of the positions of matching pixels across two documents.

analyzing at least one statistic associated with the extracted information to construct a data model that corresponds to each at least one input field
The limitation above is broadly and reasonably interpreted as a mental process that can
practically be performed in the human mind, with or without the use of a physical aid such as
pen and paper (including an observation, evaluation, judgment, opinion). See MPEP §
2106.04(a)(2)(III). For example, one can mentally analyze data that is received, and determine based on a judgment and opinion how data fields should be filled in his/her mind. 

Step 2A Prong Two Analysis: Does the claim recite additional elements that integrate the judicial
exception into a practical application? See MPEP § 2106.04(d).

Independent claim 21 recites in part:
“A system for improving machine learning by using a template to generate a corpus of synthetic document for training a machine learning model, comprising:” as drafted, amount to adding the words “apply it” (or an equivalent) with the judicial exception and reciting only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished because it is unclear how the “Natural Language Processing” is used nor the specification makes it clear how these actions are performed. Thus, these additional elements are recited in a manner that represent no more than mere instructions to apply the judicial exceptions on a computer. See MPEP § 2106.05(f) and §2106.04(d).

“at least one processor; 
and at least one non-transitory memory storing instructions that, when executed by the at least one processor, cause the system to perform operations comprising:” as drafted, amount to a high-level of generality (i.e., as a generic processor performing data gathering and mathematical calculations) such that they amount to no more than mere instructions to apply the exception using generic computer components. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.

“receiving a plurality of documents that includes sensitive data” as drafted, amount to adding insignificant extra-solution activity (e.g., pre-solution activity, a step of gathering data) to the judicial exception. See MPEP §§ 2106.04(d), 2106.05(g).

“generating, based on identifying the at least one common input field in the plurality of documents, a template that is for a document type of the plurality of documents” as drafted, amount to adding insignificant extra-solution activity (e.g., post-solution activity, a step used to output gathered input) to the judicial exception. See MPEP §§ 2106.04(d), 2106.05(g).
“extracting, based on generating the template, information from the at least one input field” as drafted, amount to adding insignificant extra-solution activity to the judicial exception. See MPEP §§ 2106.04(d), 2106.05(g).

“generating non-sensitive data using the data model” as drafted, amount to adding the words “apply it” (or an equivalent) with the judicial exception and reciting only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished because it is unclear how the “data model” is used nor the specification makes it clear how these actions are performed.  Thus, these additional elements are recited in a manner that represent no more than mere instructions to apply the judicial exceptions on a computer.  See MPEP § 2106.05(f) and § 2106.04(d).

“generating the corpus of synthetic documents, for training the machine learning model, by inserting the non-sensitive data based on the at least one statistic into the template” as drafted, amount to adding insignificant extra-solution activity (e.g., post-solution activity) to the judicial exception. See MPEP §§ 2106.04(d), 2106.05(g).

“training the machine learning model based on generating the corpus of synthetic documents by inserting the non-sensitive data into the template” as drafted, amount to adding the words “apply it” (or an equivalent) with the judicial exception and reciting only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished because it is unclear how the “ML model” is used nor the specification makes it clear how these actions are performed.  Thus, these additional elements are recited in a manner that represent no more than mere instructions to apply the judicial exceptions on a computer.  See MPEP § 2106.05(f) and § 2106.04(d).

Step 2B Analysis: Does the claim recite additional elements that amount to significantly more
than the judicial exception? See MPEP § 2106.05.

First, the additional elements directed to generally linking the use of a judicial exception to a
particular technological environment or field of use are deemed insufficient to transform the
judicial exception to a patentable invention because the claimed limitations generally link the
judicial exception to the technology environment, see MPEP 2106.05(h). However, they are
included below for the sake of completeness.

Second, the additional elements directed to mere application of the abstract idea or mere instructions to
implement an abstract idea on a computer are deemed insufficient to transform the judicial
exception to a patentable invention because the limitations generally apply the use of a generic
computer and/or process with the judicial exception. See MPEP 2106.05(f). However, they are
included below for the sake of completeness.

Independent claim 21 recites in part:

“A system for improving machine learning by using a template to generate a corpus of synthetic document for training a machine learning model, comprising:” as drafted, amount to adding the words “apply it” (or an equivalent) with the judicial exception and reciting only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished because it is unclear how the “Natural Language Processing” is used nor the specification makes it clear how these actions are performed. Thus, these additional elements are recited in a manner that represent no more than mere instructions to apply the judicial exceptions on a computer. See MPEP § 2106.05(f) and §2106.04(d).

“at least one processor; 
and at least one non-transitory memory storing instructions that, when executed by the at least one processor, cause the system to perform operations comprising:” as drafted, amount to a high-level of generality (i.e., as a generic processor performing data gathering and mathematical calculations) such that they amount to no more than mere instructions to apply the exception using generic computer components. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.

“receiving a plurality of documents that includes sensitive data” as drafted, amount to adding insignificant extra-solution activity (e.g., pre-solution activity, a step of gathering data) to the judicial exception. See MPEP §§ 2106.04(d), 2106.05(g).

“generating, based on identifying the at least one common input field in the plurality of documents, a template that is for a document type of the plurality of documents” as drafted, amount to adding insignificant extra-solution activity (e.g., post-solution activity, a step used to output gathered input) to the judicial exception. See MPEP §§ 2106.04(d), 2106.05(g).
“extracting, based on generating the template, information from the at least one input field” as drafted, amount to adding insignificant extra-solution activity to the judicial exception. See MPEP §§ 2106.04(d), 2106.05(g).

“generating non-sensitive data using the data model” as drafted, amount to adding the words “apply it” (or an equivalent) with the judicial exception and reciting only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished because it is unclear how the “data model” is used nor the specification makes it clear how these actions are performed.  Thus, these additional elements are recited in a manner that represent no more than mere instructions to apply the judicial exceptions on a computer.  See MPEP § 2106.05(f) and § 2106.04(d).

“generating the corpus of synthetic documents, for training the machine learning model, by inserting the non-sensitive data based on the at least one statistic into the template” as drafted, amount to adding insignificant extra-solution activity (e.g., post-solution activity) to the judicial exception. See MPEP §§ 2106.04(d), 2106.05(g).

“training the machine learning model based on generating the corpus of synthetic documents by inserting the non-sensitive data into the template” as drafted, amount to adding the words “apply it” (or an equivalent) with the judicial exception and reciting only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished because it is unclear how the “ML model” is used nor the specification makes it clear how these actions are performed.  Thus, these additional elements are recited in a manner that represent no more than mere instructions to apply the judicial exceptions on a computer.  See MPEP § 2106.05(f) and § 2106.04(d).

Thus, considering the additional elements individually and in combination and the claims as a
whole, the additional elements do not provide significantly more than the abstract idea. The
claims are not eligible subject matter.

Therefore, in examining elements as recited by the limitations individually and as an ordered
combination, as a whole the independent claim limitations do not recite what the courts
have identified as “significantly more”.

Regarding claim 39:

Claim 39 recites: A method, comprising: 
receiving a plurality of documents; 
determining a distribution of positional values for a set of corresponding pixels of the documents having a same position across multiple documents of the plurality of documents; 
identifying at least one input field based on determining the distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents; 
generating, based on identifying the at least one input field, a template that is for a document type; 
extracting, based on generating the template, information from the at least one input field;
 analyzing at least one statistic associated with the extracted information to construct a data model that corresponds to each at least one input field; 
generating data using the data model; 
generating a synthetic document, by inserting the data based on the at least one statistic into the template; and 
training a machine learning model using the synthetic document
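
The later steps recited above (analyzing a statistic to build a data model per field, generating data, and inserting it into a template) could look roughly like the sketch below. It assumes a simple empirical frequency model for categorical fields and a string template with named placeholders. All names are hypothetical and the approach is a stand-in, not the applicant's disclosed method. Sampling observed values verbatim would not suit free-text sensitive fields.

```python
import random
from collections import Counter


def build_field_model(values):
    """Data model for one field: relative frequency of each observed value."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def sample_field(model, rng):
    """Draw one value according to the field's frequency model."""
    values = list(model)
    weights = [model[v] for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


def make_synthetic_document(template, models, rng):
    """Insert one generated value per field into the template."""
    generated = {name: sample_field(m, rng) for name, m in models.items()}
    return template.format(**generated)


# Hypothetical extracted field values from three source documents.
extracted = {"state": ["VA", "VA", "NY"], "term": ["36", "60", "60"]}
models = {name: build_field_model(vals) for name, vals in extracted.items()}
template = "State: {state} | Term (months): {term}"

rng = random.Random(0)  # seeded so the sketch is repeatable
document = make_synthetic_document(template, models, rng)
```

Repeating the last call many times would produce a corpus whose field values follow the frequencies seen in the extracted data.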

Step 2A Prong One Analysis: Does the claim recite an abstract idea, law of nature, or natural
phenomenon? See MPEP § 2106.04(II)(A)(1).

Claim 39 is directed to an abstract idea, specifically, a mental process, namely concepts that can
practically be performed in the human mind, with or without the use of a physical aid such as
pen and paper (including an observation, evaluation, judgment, opinion). See MPEP §
2106.04(a)(2)(III).

Independent claim 39 recites in part:

determining a distribution of positional values for a set of corresponding pixels of the documents having a same position across multiple documents of the plurality of documents
The limitation above is broadly and reasonably interpreted as a mental process that can
practically be performed in the human mind, with or without the use of a physical aid such as
pen and paper (including an observation, evaluation, judgment, opinion). See MPEP §
2106.04(a)(2)(III). For example, one can evaluate pixels that are received, and determine based on a judgment and opinion on how similar pixels are located in different documents. 

identifying at least one input field based on determining the distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents
The limitation above is broadly and reasonably interpreted as a mental process that can
practically be performed in the human mind, with or without the use of a physical aid such as
pen and paper (including an observation, evaluation, judgment, opinion). See MPEP §
2106.04(a)(2)(III). For example, one can determine at least one common input by looking at the standard deviation of the positions of matching pixels across two documents.

analyzing at least one statistic associated with the extracted information to construct a data model that corresponds to each at least one input field
The limitation above is broadly and reasonably interpreted as a mental process that can
practically be performed in the human mind, with or without the use of a physical aid such as
pen and paper (including an observation, evaluation, judgment, opinion). See MPEP §
2106.04(a)(2)(III). For example, one can mentally analyze data that is received, and determine based on a judgment and opinion how data fields should be filled in his/her mind. 

Step 2A Prong Two Analysis: Does the claim recite additional elements that integrate the judicial
exception into a practical application? See MPEP § 2106.04(d).

Independent claim 39 recites in part:

“A method, comprising: 
receiving a plurality of documents” as drafted, amount to adding insignificant extra-solution activity (e.g., pre-solution activity, a step of gathering data) to the judicial exception. See MPEP §§ 2106.04(d), 2106.05(g).

“generating, based on identifying the at least one input field, a template that is for a document type” as drafted, amount to adding insignificant extra-solution activity (e.g., post-solution activity, a step used to output gathered input) to the judicial exception. See MPEP §§ 2106.04(d), 2106.05(g).

“extracting, based on generating the template, information from the at least one input field” as drafted, amount to adding insignificant extra-solution activity to the judicial exception. See MPEP §§ 2106.04(d), 2106.05(g).

“generating data using the data model” as drafted, amount to adding the words “apply it” (or an equivalent) with the judicial exception and reciting only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished because it is unclear how the “data model” is used nor the specification makes it clear how these actions are performed.  Thus, these additional elements are recited in a manner that represent no more than mere instructions to apply the judicial exceptions on a computer.  See MPEP § 2106.05(f) and § 2106.04(d).
“generating a synthetic document, by inserting the data based on the at least one statistic into the template” as drafted, amount to adding insignificant extra-solution activity (e.g., post-solution activity) to the judicial exception. See MPEP §§ 2106.04(d), 2106.05(g).

“training a machine learning model using the synthetic document” as drafted, amount to adding the words “apply it” (or an equivalent) with the judicial exception and reciting only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished because it is unclear how the “ML model” is used nor the specification makes it clear how these actions are performed.  Thus, these additional elements are recited in a manner that represent no more than mere instructions to apply the judicial exceptions on a computer.  See MPEP § 2106.05(f) and § 2106.04(d).

Step 2B Analysis: Does the claim recite additional elements that amount to significantly more
than the judicial exception? See MPEP § 2106.05.

First, the additional elements directed to generally linking the use of a judicial exception to a
particular technological environment or field of use are deemed insufficient to transform the
judicial exception to a patentable invention because the claimed limitations generally link the
judicial exception to the technology environment, see MPEP 2106.05(h). However, they are
included below for the sake of completeness.

Second, the additional elements directed to mere application of the abstract idea or mere instructions to
implement an abstract idea on a computer are deemed insufficient to transform the judicial
exception to a patentable invention because the limitations generally apply the use of a generic
computer and/or process with the judicial exception. See MPEP 2106.05(f). However, they are
included below for the sake of completeness.

Independent claim 39 recites in part:

“A method, comprising: 
receiving a plurality of documents” as drafted, amount to adding insignificant extra-solution activity (e.g., pre-solution activity, a step of gathering data) to the judicial exception. See MPEP §§ 2106.04(d), 2106.05(g).

“generating, based on identifying the at least one input field, a template that is for a document type” as drafted, amount to adding insignificant extra-solution activity (e.g., post-solution activity, a step used to output gathered input) to the judicial exception. See MPEP §§ 2106.04(d), 2106.05(g).

“extracting, based on generating the template, information from the at least one input field” as drafted, amount to adding insignificant extra-solution activity to the judicial exception. See MPEP §§ 2106.04(d), 2106.05(g).

“generating data using the data model” as drafted, amount to adding the words “apply it” (or an equivalent) with the judicial exception and reciting only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished because it is unclear how the “data model” is used nor the specification makes it clear how these actions are performed.  Thus, these additional elements are recited in a manner that represent no more than mere instructions to apply the judicial exceptions on a computer.  See MPEP § 2106.05(f) and § 2106.04(d).

“generating a synthetic document, by inserting the data based on the at least one statistic into the template” as drafted, amount to adding insignificant extra-solution activity (e.g., post-solution activity) to the judicial exception. See MPEP §§ 2106.04(d), 2106.05(g).

“training a machine learning model using the synthetic document” as drafted, amount to adding the words “apply it” (or an equivalent) with the judicial exception and reciting only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished because it is unclear how the “ML model” is used nor the specification makes it clear how these actions are performed.  Thus, these additional elements are recited in a manner that represent no more than mere instructions to apply the judicial exceptions on a computer.  See MPEP § 2106.05(f) and § 2106.04(d).

Thus, considering the additional elements individually and in combination and the claims as a
whole, the additional elements do not provide significantly more than the abstract idea. The
claims are not eligible subject matter.

Therefore, in examining elements as recited by the limitations individually and as an ordered
combination, as a whole the independent claim limitations do not recite what the courts
have identified as “significantly more”.

Regarding claim 40:

Claim 40 recites: A system for determining synthetic information for documents, comprising: 
at least one processor; 
and at least one non-transitory memory storing instructions that, when executed by the at least one processor, cause the system to perform operations comprising: 
receiving a plurality of documents associated with respective individuals; 
determining a distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents; 
identifying input fields of the plurality of documents based on determining the distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents;
 extracting data from the identified input fields; 
determine expected values for the identified input fields based on the extracted data;
 generating a document template, for a document type, based on the identified input fields; 
analyzing at least one statistic associated with the extracted data to construct a data model that corresponds to each at least one input field; 
generating data using the data model; 
generating a corpus of synthetic documents by populating the generated document template using the expected values; and 
training a machine learning model based on the corpus of synthetic documents
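
Claim 40 differs from the other independent claims in populating the template with expected values. A minimal sketch of that step is given below. It assumes that the expected value is the mean for numeric fields and the most frequent value for other fields; the claim does not define the term, and the function and field names are hypothetical.

```python
from collections import Counter
from statistics import mean


def expected_value(values):
    """Mean for numeric fields, most common value otherwise (an assumption)."""
    if all(isinstance(v, (int, float)) for v in values):
        return mean(values)
    return Counter(values).most_common(1)[0][0]


def populate(template, extracted):
    """Fill each named placeholder with the expected value of its field."""
    expected = {field: expected_value(vals) for field, vals in extracted.items()}
    return template.format(**expected)


# Hypothetical data extracted from the identified input fields.
extracted = {"income": [40000, 50000, 60000], "state": ["VA", "NY", "VA"]}
filled = populate("Income: {income} | State: {state}", extracted)
```

A template filled only with expected values yields a single representative document; building a varied corpus would also need some sampling around those values.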

Step 2A Prong One Analysis: Does the claim recite an abstract idea, law of nature, or natural
phenomenon? See MPEP § 2106.04(II)(A)(1).

Claim 40 is directed to an abstract idea, specifically, a mental process, namely concepts that can
practically be performed in the human mind, with or without the use of a physical aid such as
pen and paper (including an observation, evaluation, judgment, opinion). See MPEP §
2106.04(a)(2)(III).

Independent claim 40 recites in part:

determining a distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents
The limitation above is broadly and reasonably interpreted as a mental process that can
practically be performed in the human mind, with or without the use of a physical aid such as
pen and paper (including an observation, evaluation, judgment, opinion). See MPEP §
2106.04(a)(2)(III). For example, one can evaluate pixels that are received, and determine based on a judgment and opinion on how similar pixels are located in different documents. 

identifying input fields of the plurality of documents based on determining the distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents
The limitation above is broadly and reasonably interpreted as a mental process that can
practically be performed in the human mind, with or without the use of a physical aid such as
pen and paper (including an observation, evaluation, judgment, opinion). See MPEP §
2106.04(a)(2)(III). For example, one can determine at least one common input by looking at the standard deviation of the positions of matching pixels across two documents.
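
As a purely illustrative sketch of the kind of computation this limitation describes (and not the applicant's or any cited reference's implementation), the following Python computes, for each pixel position, the standard deviation of the pixel values across several equally sized document images and flags positions whose deviation exceeds a threshold as candidate input-field pixels. Treating the "positional values" as grayscale intensities at a shared position is an assumption, as are the function name `candidate_field_pixels`, the list-of-lists image representation, and the `threshold` value.

```python
from statistics import pstdev


def candidate_field_pixels(images, threshold=10.0):
    """Return (row, col) positions whose values vary across documents.

    `images` is a list of equally sized grayscale images, each a list of
    rows of intensities (an assumed representation for illustration).
    """
    if not images:
        return []
    rows, cols = len(images[0]), len(images[0][0])
    varying = []
    for r in range(rows):
        for c in range(cols):
            # Values of the "same position" pixel in every document.
            values = [img[r][c] for img in images]
            if pstdev(values) > threshold:
                varying.append((r, c))
    return varying


docs = [
    [[255, 255, 0], [255, 10, 255]],
    [[255, 255, 0], [255, 200, 255]],
    [[255, 255, 0], [255, 90, 255]],
]
print(candidate_field_pixels(docs))
```

Positions that are identical in every document (printed form text, blank margins) have zero deviation, while positions inside a filled-in field differ from document to document and are flagged.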

determine expected values for the identified input fields based on the extracted data;
The limitation above is broadly and reasonably interpreted as a mental process that can
practically be performed in the human mind, with or without the use of a physical aid such as
pen and paper (including an observation, evaluation, judgment, opinion). See MPEP §
2106.04(a)(2)(III). For example, one can, with pen and paper, analyze the data that is received and determine, based on judgment and opinion, expected values for the input fields from the collected data.

analyzing at least one statistic associated with the extracted data to construct a data model that corresponds to each at least one input field
The limitation above is broadly and reasonably interpreted as a mental process that can
practically be performed in the human mind, with or without the use of a physical aid such as
pen and paper (including an observation, evaluation, judgment, opinion). See MPEP §
2106.04(a)(2)(III). For example, one can mentally analyze the data that is received and determine, based on judgment and opinion, how the data fields should be filled in.
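
To make the "analyzing at least one statistic ... to construct a data model" and "generating data using the data model" limitations concrete, here is a minimal Python sketch under stated assumptions. The per-field model is assumed to be a simple mean and standard deviation with Gaussian sampling; the claim does not specify this, and the names `build_field_model` and `generate_value` and the sample `amount` field are hypothetical.

```python
import random
from statistics import mean, pstdev


def build_field_model(values):
    """Summarize one input field's extracted values (assumed model)."""
    return {"mean": mean(values), "stdev": pstdev(values)}


def generate_value(model, rng):
    """Draw one synthetic value from the assumed Gaussian model."""
    return rng.gauss(model["mean"], model["stdev"])


# Hypothetical data extracted from an "amount" field of four documents.
extracted = {"amount": [100.0, 120.0, 80.0, 100.0]}
models = {field: build_field_model(vals) for field, vals in extracted.items()}

rng = random.Random(0)  # seeded so repeated runs draw the same values
synthetic = {field: generate_value(m, rng) for field, m in models.items()}
print(models)
print(synthetic)
```

One model per input field mirrors the claim's "data model that corresponds to each at least one input field"; any other distribution could be substituted for the Gaussian.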

Step 2A Prong Two Analysis: Does the claim recite additional elements that integrate the judicial
exception into a practical application? See MPEP § 2106.04(d).

Independent claim 40 recites in part:

“A system for determining synthetic information for documents, comprising: at least one processor; and at least one non-transitory memory storing instructions that, when executed by the at least one processor, cause the system to perform operations comprising:” as drafted, is recited at a high level of generality (i.e., as a generic processor performing data gathering and mathematical calculations) such that it amounts to no more than mere instructions to apply the exception using generic computer components. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.

“receiving a plurality of documents associated with respective individuals” as drafted, amounts to adding insignificant extra-solution activity (e.g., pre-solution activity, a step of gathering data) to the judicial exception. See MPEP §§ 2106.04(d), 2106.05(g).

“extracting data from the identified input fields” as drafted, amounts to adding insignificant extra-solution activity to the judicial exception. See MPEP §§ 2106.04(d), 2106.05(g).

“generating a document template, for a document type, based on the identified input fields” as drafted, amounts to adding insignificant extra-solution activity (e.g., post-solution activity, a step used to output gathered input) to the judicial exception. See MPEP §§ 2106.04(d), 2106.05(g).

“generating data using the data model” as drafted, amounts to adding the words “apply it” (or an equivalent) with the judicial exception and reciting only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished, because it is unclear how the “data model” is used and the specification does not make clear how these actions are performed. Thus, this additional element is recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer. See MPEP §§ 2106.05(f) and 2106.04(d).

“generating a corpus of synthetic documents by populating the generated document template using the expected values” as drafted, amounts to adding insignificant extra-solution activity (e.g., post-solution activity) to the judicial exception. See MPEP §§ 2106.04(d), 2106.05(g).

“training a machine learning model based on the corpus of synthetic documents” as drafted, amounts to adding the words “apply it” (or an equivalent) with the judicial exception and reciting only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished, because it is unclear how the “ML model” is used and the specification does not make clear how these actions are performed. Thus, this additional element is recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer. See MPEP §§ 2106.05(f) and 2106.04(d).

Thus, considering the additional elements individually and in combination and the claims as a
whole, the additional elements do not provide significantly more than the abstract idea. The
claims are not eligible subject matter.

Therefore, in examining the elements recited by the limitations individually and as an ordered
combination, the independent claim limitations as a whole do not recite what the courts
have identified as “significantly more”.

Furthermore, regarding dependent claims 22 and 24-38, which depend on claim 21, and claim 41, which depends on claim 39, the claims are directed to a judicial exception without significantly more, as highlighted below by evaluating the claim limitations under Steps 2A and 2B:

Claim 22 incorporates the rejection of independent claim 21, and the additional limitation recited in dependent claim 22 does not integrate the judicial exception into a practical application.

Claim 24 incorporates the rejection of independent claim 21, and the additional limitation recited in dependent claim 24 does not integrate the judicial exception into a practical application.

Claim 25 incorporates the rejection of claim 24, and does not integrate the judicial exception into a practical application.

Claim 26 incorporates the rejection of claim 24. “A mathematical calculation is a mathematical operation (such as multiplication) or an act of calculating using mathematical methods to determine a variable or number.” See MPEP § 2106.04(a)(2)(I)(C).

Claim 27 incorporates the rejection of claim 26. “A mathematical calculation is a mathematical operation (such as multiplication) or an act of calculating using mathematical methods to determine a variable or number.” See MPEP § 2106.04(a)(2)(I)(C).

Claim 28 incorporates the rejection of claim 26. “A mathematical calculation is a mathematical operation (such as multiplication) or an act of calculating using mathematical methods to determine a variable or number.” See MPEP § 2106.04(a)(2)(I)(C).

Claim 29 incorporates the rejection of independent claim 21, and the additional limitation recited in dependent claim 29 does not integrate the judicial exception into a practical application.

Claim 30 incorporates the rejection of independent claim 21, and the additional limitation recited in dependent claim 30 does not integrate the judicial exception into a practical application.

Claim 31 incorporates the rejection of claim 30, and does not integrate the judicial exception into a practical application.

Claim 32 incorporates the rejection of claim 31, and does not integrate the judicial exception into a practical application.

Claim 33 incorporates the rejection of independent claim 21, and the additional limitation recited in dependent claim 33 does not integrate the judicial exception into a practical application.

Claim 34 incorporates the rejection of claim 33, and does not integrate the judicial exception into a practical application.

Claim 35 incorporates the rejection of independent claim 21, and the additional limitation recited in dependent claim 35 does not integrate the judicial exception into a practical application.

Claim 36 incorporates the rejection of independent claim 21, and the additional limitation recited in dependent claim 36 does not integrate the judicial exception into a practical application.

Claim 37 incorporates the rejection of independent claim 21, and the additional limitation recited in dependent claim 37 does not integrate the judicial exception into a practical application.

Claim 38 incorporates the rejection of independent claim 21, and the additional limitation recited in dependent claim 38 does not integrate the judicial exception into a practical application.

Claim 41 incorporates the rejection of independent claim 39, and the additional limitation recited in dependent claim 41 does not integrate the judicial exception into a practical application.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 21-22, 24, 29-34 and 39-41 are rejected under 35 U.S.C. 103 as being unpatentable over Hoehne et al. (US Pub. No. 20200082218 A1), hereinafter referred to as Hoehne, in view of Vyas et al. (US Patent No. 10,891,311 B1), hereinafter referred to as Vyas, and further in view of Kawai et al. (US Pub. No. 20140122495 A1), hereinafter referred to as Kawai.

With respect to claim 21, Hoehne discloses:
Determining a distribution of positional values for a set of corresponding pixels having a same position across multiple documents of the plurality of documents (In Fig. 4 and paragraph [0024], Hoehne discloses that the optical character recognition (OCR) system assigns an index value to each character in a document. These index values can replace the pixel area occupied by the characters.)
Generating, based on identifying the at least one common input field in the plurality of documents, a template that is for a document type of the plurality of documents (In paragraph [0073], Hoehne discloses generating a document with optically recognized text intended for users to interact with and review the OCR version of the document.)
Extracting, based on generating the template, information from the at least one input field (In paragraph [0034], Hoehne discloses inputting the OCR version of a document to a machine learning model to extract relevant information, such as, for example, key-values or table information.)
Analyzing at least one statistic associated with the extracted information to construct a data model that corresponds to each at least one input field (In Fig. 4 and paragraph [0042], Hoehne discloses analyzing document 200A, including index values that correspond to each character of the document. Each character of document 200A may be replaced with an index value based on OCR system 110 identifying an index value corresponding to the pixel information of the character information of the document.)
Generating non-sensitive data using the data model (In paragraph [0030], Hoehne discloses that, using the information derived by the CNN, the OCR system may generate a segmentation mask using a semantic segmentation generator. The OCR system may also generate bounding boxes using a bounding box detector. The OCR system may use the segmentation mask and/or the bounding boxes to construct a version of a document with optically recognized characters.)
With respect to claim 21, Hoehne does not explicitly disclose:
A system for improving machine learning by using a template to generate a corpus of synthetic document for training a machine learning model, comprising: at least one processor
At least one non-transitory memory storing instructions that, when executed by the at least one processor, cause the system to perform operations comprising: receiving a plurality of documents that includes sensitive data
Determining a standard deviation of the distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents
Identifying at least one common input field based on determining the standard deviation of the distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents
Generating the corpus of synthetic documents, for training the machine learning model, by inserting the non-sensitive data based on the at least one statistic into the template
Training the machine learning model based on generating the corpus of synthetic documents by inserting the non-sensitive data into the template
However, Vyas is known to disclose:
A system for improving machine learning by using a template to generate a corpus of synthetic document for training a machine learning model, comprising: at least one processor (In Fig. 1 and Col. 3, lines 37–42, Vyas discloses a system for generating synthetic data sets. The training module includes a processor and a memory.)
At least one non-transitory memory storing instructions that, when executed by the at least one processor, cause the system to perform operations comprising: receiving a plurality of documents that includes sensitive data (In Fig. 3 and Col. 10, lines 55–58, Vyas discloses receiving a plurality of data sets. Each data set of the plurality of data sets includes a plurality of attributes. In Col. 6, lines 8-15, Vyas further discloses the plurality of data sets being individual patient medical records (e.g., 1,000 individual patient records) that have been scrubbed for testing purposes.)
Generating the corpus of synthetic documents, for training the machine learning model, by inserting the non-sensitive data based on the at least one statistic into the template (In Col. 11, lines 30–42, Vyas discloses generating, using a machine learning model with a stochastic model, a synthetic data set, where the synthetic data set has generated data for each one of the plurality of attributes.)
Training the machine learning model based on generating the corpus of synthetic documents by inserting the non-sensitive data into the template (In Col. 11, lines 4–14, Vyas discloses training, using a training module, a respective stochastic model for each respective clustered data set of the plurality of clustered data sets.)

Hoehne and Vyas are analogous pieces of art because both references concern generating synthetic data sets. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the optical character recognition (OCR) system of Hoehne, which generates a document with optically recognized text, with the generation of a synthetic data set, where the synthetic data set has generated data for each of the plurality of attributes, as taught by Vyas. The motivation for doing so would have been to improve the OCR system to yield more accurate results (see [0066] of Hoehne).

With respect to claim 21, Hoehne in view of Vyas does not explicitly disclose:
Determining a standard deviation of the distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents
Identifying at least one common input field based on determining the standard deviation of the distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents
However, Kawai is known to disclose:
Determining a standard deviation of the distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents (In paragraph [0090], Kawai discloses determining a standard deviation of the documents in the cluster based on the similarities to the cluster center.)
Identifying at least one common input field based on determining the standard deviation of the distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents (In paragraph [0090], Kawai discloses identifying the outlier documents in the current group that are outside the set limit. These outlier documents are identified and then processed in a loop. Each outlier document is compared with the main points of the group using a specific method to determine how similar they are. The best match between the outlier document and the main points is then sought, subject to a minimum match standard and a set limit.)
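
The Kawai passage relied on above (a standard deviation of document similarities to a cluster center, with outliers falling outside a set limit) can be pictured with the following minimal Python sketch. It is an illustration built only from the description in this action, not Kawai's actual method: the function name `find_outliers`, the similarity scores, and the rule "more than `num_devs` standard deviations below the mean similarity" are all assumptions.

```python
from statistics import mean, pstdev


def find_outliers(similarities, num_devs=1.0):
    """Return documents whose similarity to the cluster center is low.

    `similarities` maps a document id to its similarity score with the
    cluster center (hypothetical scores in the range 0..1).
    """
    scores = list(similarities.values())
    # Assumed "set limit": num_devs standard deviations below the mean.
    limit = mean(scores) - num_devs * pstdev(scores)
    return sorted(doc for doc, s in similarities.items() if s < limit)


sims = {"doc_a": 0.90, "doc_b": 0.85, "doc_c": 0.88, "doc_d": 0.20}
print(find_outliers(sims))
```

Note that this operates on whole-document similarity scores, whereas the claim limitation concerns a distribution of positional values for corresponding pixels.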

Hoehne in view of Vyas and Kawai are analogous pieces of art because the references concern a system and method for grouping documents based on metrics. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the combination of Hoehne and Vyas with a deviation determination module to determine a standard deviation for the documents in the cluster based on the similarities to the cluster center, as taught by Kawai. The motivation for doing so would have been to optimize computing resources (e.g., multiple machines being used for data generation). (See Col. 13, lines 25-33 of Vyas.)

Regarding claim 22, Hoehne in view of Vyas and Kawai discloses the elements of claim 21. In addition, Hoehne discloses:
The system of claim 21, wherein determining the distribution of positional values comprises generating the distribution of positional values (In Fig. 4 and paragraph [0024], Hoehne discloses assigning an index value corresponding to the characters.)

Regarding claim 24, Hoehne in view of Vyas and Kawai discloses the elements of claim 21. In addition, Kawai discloses:
The system of claim 21, wherein the operations further comprise: determining a statistical value of the distribution of positional values, wherein the at least one common input field is identified based on the statistical value (In paragraph [0090], Kawai discloses determining a standard deviation of the documents in the cluster based on the similarities to the cluster center.)

Regarding claim 29, Hoehne in view of Vyas and Kawai discloses the elements of claim 21. In addition, Hoehne discloses:
The system of claim 21, wherein the operations further comprise: identifying a common feature of multiple documents of the plurality of documents, wherein the template is generated to include the common feature (In paragraph [0022], Hoehne discloses identifying desired characters and/or aspects of document 120. CNN 140 may be trained using training document examples to recognize characters as well as pixel information to identify groups of characters, such as, for example, words, lines, or sentences.)

Regarding claim 30, Hoehne in view of Vyas and Kawai discloses the elements of claim 21. In addition, Hoehne discloses:
The system of claim 21, wherein the operations further comprise: extracting data from a document using the template (In paragraph [0028], Hoehne discloses extracting characters and pixel dimensions from document 120 without first identifying word boxes.)

Regarding claim 31, Hoehne in view of Vyas and Kawai discloses the elements of claim 30. In addition, Hoehne discloses:
The system of claim 30, wherein extracting data from the document using the template comprises performing a visual characterization process to input fields of the document (In paragraph [0033], Hoehne discloses that utilizing OCR system 110, computer systems may easily recognize the character content of a document 120 as well as extract the character information from document 120.)

Regarding claim 32, Hoehne in view of Vyas and Kawai discloses the elements of claim 31. In addition, Hoehne discloses:
The system of claim 31, wherein the visual characterization process comprises optical character recognition (In paragraph [0034], Hoehne teaches that an OCR system may input the OCR version of a document to a machine learning model, such as, for example, another convolutional neural network (CNN). The other CNN may process the document to extract relevant information.)

Regarding claim 33, Hoehne in view of Vyas and Kawai discloses the elements of claim 21. In addition, Vyas discloses:
The system of claim 21, wherein at least a portion of the inserted data is retrieved from a database prior to insertion (In Col. 4, lines 50–57, Vyas discloses receiving a plurality of data sets 205 (e.g., input data) from the external data source 150 (block 210). Each data set of the plurality of data sets 205 includes a plurality of attributes.)

Regarding claim 34, Hoehne in view of Vyas and Kawai discloses the elements of claim 33. In addition, Vyas discloses:
The system of claim 33, wherein the database associates synthetic documents with respective metadata (In Col. 14, lines 5-13, Vyas discloses that each of the first synthetic data set, the second synthetic data set, and the third synthetic data set are used for testing purposes (e.g., testing an application or a database). In a related example, testing of the database includes indexing of the database using each of the first synthetic data set, the second synthetic data set, and the third synthetic data set.)

With respect to claim 39, Hoehne discloses:
Determining a distribution of positional values for a set of corresponding pixels of the documents having a same position across multiple documents of the plurality of documents (In Fig. 4 and paragraph [0024], Hoehne discloses that the optical character recognition (OCR) system assigns an index value to each character in a document. These index values can replace the pixel area occupied by the characters.)
Generating, based on identifying the at least one input field, a template that is for a document type (In paragraph [0073], Hoehne discloses generating a document with optically recognized text intended for users to interact with and review the OCR version of the document.)
Extracting, based on generating the template, information from the at least one input field (In paragraph [0034], Hoehne discloses inputting the OCR version of a document to a machine learning model to extract relevant information, such as, for example, key-values or table information.)
Analyzing at least one statistic associated with the extracted information to construct a data model that corresponds to each at least one input field (In Fig. 4 and paragraph [0042], Hoehne discloses analyzing document 200A, including index values that correspond to each character of the document. Each character of document 200A may be replaced with an index value based on OCR system 110 identifying an index value corresponding to the pixel information of the character information of the document.)
Generating data using the data model (In paragraph [0030], Hoehne discloses that, using the information derived by the CNN, the OCR system may generate a segmentation mask using a semantic segmentation generator. The OCR system may also generate bounding boxes using a bounding box detector. The OCR system may use the segmentation mask and/or the bounding boxes to construct a version of a document with optically recognized characters.)
With respect to claim 39, Hoehne does not explicitly disclose:
A method, comprising: receiving a plurality of documents
Identifying at least one input field based on determining the distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents
Generating a synthetic document, by inserting the data based on the at least one statistic into the template
Training a machine learning model using the synthetic document
However, Vyas is known to disclose:
A method, comprising: receiving a plurality of documents (In Fig. 3 and Col. 10, lines 55–58, Vyas discloses receiving a plurality of data sets. Each data set of the plurality of data sets includes a plurality of attributes. In Col. 6, lines 8-15, Vyas further discloses the plurality of data sets being individual patient medical records (e.g., 1,000 individual patient records) that have been scrubbed for testing purposes.)
Generating a synthetic document, by inserting the data based on the at least one statistic into the template (In Col. 11, lines 30–42, Vyas discloses generating, using a machine learning model with a stochastic model, a synthetic data set, where the synthetic data set has generated data for each one of the plurality of attributes.)
Training a machine learning model using the synthetic document (In Col. 11, lines 4–14, Vyas discloses training, using a training module, a respective stochastic model for each respective clustered data set of the plurality of clustered data sets.)

Hoehne and Vyas are analogous pieces of art because both references concern generating synthetic data sets. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the optical character recognition (OCR) system of Hoehne, which generates a document with optically recognized text, with the generation of a synthetic data set, where the synthetic data set has generated data for each of the plurality of attributes, as taught by Vyas. The motivation for doing so would have been to improve the OCR system to yield more accurate results (see [0066] of Hoehne).

With respect to claim 39, Hoehne in view of Vyas does not explicitly disclose:
Identifying at least one input field based on determining the distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents
However, Kawai is known to disclose:
Identifying at least one input field based on determining the distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents (In paragraph [0090], Kawai discloses identifying the outlier documents in the current group that are outside the set limit. These outlier documents are identified and then processed in a loop. Each outlier document is compared with the main points of the group using a specific method to determine how similar they are. The best match between the outlier document and the main points is then sought, subject to a minimum match standard and a set limit.)

Hoehne in view of Vyas and Kawai are analogous pieces of art because the references concern a system and method for grouping documents based on metrics. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the combination of Hoehne and Vyas with a deviation determination module to determine a standard deviation for the documents in the cluster based on the similarities to the cluster center, as taught by Kawai. The motivation for doing so would have been to optimize computing resources (e.g., multiple machines being used for data generation). (See Col. 13, lines 25-33 of Vyas.)

With respect to claim 40, Hoehne discloses:
Determining a distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents (In Fig. 4 and paragraph [0024], Hoehne discloses that the optical character recognition (OCR) system assigns an index value to each character in a document. These index values can replace the pixel area occupied by the characters.)
Extracting data from the identified input fields (In paragraph [0034], Hoehne discloses inputting the OCR version of a document to a machine learning model to extract relevant information, such as, for example, key-values or table information.)
Analyzing at least one statistic associated with the extracted data to construct a data model that corresponds to each at least one input field (In Fig. 4 and paragraph [0042], Hoehne discloses analyzing document 200A, including index values that correspond to each character of the document. Each character of document 200A may be replaced with an index value based on OCR system 110 identifying an index value corresponding to the pixel information of the character information of the document.)
Generating data using the data model (In paragraph [0030], Hoehne discloses that, using the information derived by the CNN, the OCR system may generate a segmentation mask using a semantic segmentation generator. The OCR system may also generate bounding boxes using a bounding box detector. The OCR system may use the segmentation mask and/or the bounding boxes to construct a version of a document with optically recognized characters.)
With respect to claim 40, Hoehne does not explicitly disclose:
A system for determining synthetic information for documents, comprising: at least one processor
At least one non-transitory memory storing instructions that, when executed by the at least one processor, cause the system to perform operations comprising: receiving a plurality of documents associated with respective individuals
Identifying input fields of the plurality of documents based on determining the distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents
Determine expected values for the identified input fields based on the extracted data
Generating a corpus of synthetic documents by populating the generated document template using the expected values
Training a machine learning model based on the corpus of synthetic documents
However, Vyas is known to disclose:
A system for determining synthetic information for documents, comprising: at least one processor (In Fig. 1 and Col. 3, lines 37–42, Vyas discloses a system for generating synthetic data sets. The training module includes a processor and a memory.)
At least one non-transitory memory storing instructions that, when executed by the at least one processor, cause the system to perform operations comprising: receiving a plurality of documents associated with respective individuals (In Fig. 3 and Col. 10, lines 55–58, Vyas discloses receiving a plurality of data sets. Each data set of the plurality of data sets includes a plurality of attributes. In Col. 6, lines 8-15, Vyas further discloses the plurality of data sets being individual patient medical records (e.g., 1,000 individual patient records) that have been scrubbed for testing purposes.)
Generating a corpus of synthetic documents by populating the generated document template using the expected values (In Col. 11, lines 30–42, Vyas discloses generating, using a machine learning model with a stochastic model, a synthetic data set, where the synthetic data set has generated data for each one of the plurality of attributes.)
Training a machine learning model based on the corpus of synthetic documents (In Col. 11, lines 4–14, Vyas discloses training, using a training module, a respective stochastic model for each respective clustered data set of the plurality of clustered data sets.)

Hoehne and Vyas are analogous pieces of art because both references concern generating synthetic data sets. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the optical character recognition (OCR) system of Hoehne, which generates a document with optically recognized text, with the generation of a synthetic data set, where the synthetic data set has generated data for each of the plurality of attributes, as taught by Vyas. The motivation for doing so would have been to improve the OCR system to yield more accurate results (see [0066] of Hoehne).

With respect to claim 40, Hoehne in view of Vyas does not explicitly disclose:
Determine expected values for the identified input fields based on the extracted data
Identifying input fields of the plurality of documents based on determining the distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents
However, Kawai is known to disclose:
Determine expected values for the identified input fields based on the extracted data (In paragraph [0090], Kawai discloses determining a standard deviation of the documents in the cluster based on the similarities to the cluster center.)
Identifying input fields of the plurality of documents based on determining the distribution of positional values for the set of corresponding pixels having the same position across the multiple documents of the plurality of documents (In paragraph [0090], Kawai discloses identifying the outlier documents in the current group that are outside the set limit. These outlier documents are identified and then processed in a loop. The similarity of each outlier document to the main points of the group is checked using a specific method, and the best match between the outlier document and the main points is sought, following a minimum match standard and a set limit.)

Hoehne in view of Vyas and Kawai are analogous art because the references concern a system and method for grouping documents based on metrics. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Kawai, with a deviation determination module to determine a standard deviation for the documents in the cluster based on the similarities to the cluster center as taught by Kawai. The motivation for doing so would have been to optimize computing resources (e.g., multiple machines being used for data generation) (see Col. 13, lines 25-33 of Vyas).
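
For readers assessing the claim 40 limitation, the following is a minimal Python sketch of one possible reading of "determining the distribution of positional values for the set of corresponding pixels having the same position across the multiple documents." It is illustrative only and is not drawn from the application, Kawai, or any other cited reference. The function name `variable_pixel_mask`, the use of the population standard deviation as the measure of distribution, and the grayscale sample values are all assumptions: pixels that vary strongly across aligned documents are flagged as candidate input-field pixels, while pixels that stay constant are treated as template content.

```python
from statistics import pstdev


def variable_pixel_mask(documents, threshold):
    """Flag pixel positions whose values vary across aligned documents.

    documents: list of equally sized 2D lists of grayscale values (0-255).
    threshold: hypothetical cutoff on the per-position standard deviation.
    Returns a 2D list of booleans; True marks a candidate input-field pixel.
    """
    rows, cols = len(documents[0]), len(documents[0][0])
    mask = []
    for r in range(rows):
        mask_row = []
        for c in range(cols):
            # Collect the value at the same position in every document.
            values = [doc[r][c] for doc in documents]
            mask_row.append(pstdev(values) > threshold)
        mask.append(mask_row)
    return mask


# Three tiny "documents": only position (1, 1) differs between them.
doc_a = [[255, 255, 255], [0, 10, 255]]
doc_b = [[255, 255, 255], [0, 200, 255]]
doc_c = [[255, 255, 255], [0, 90, 255]]

mask = variable_pixel_mask([doc_a, doc_b, doc_c], threshold=20)
```

In this toy input, only the position whose value changes from document to document is flagged. Whether a per-pixel statistic of this kind corresponds to what Kawai's paragraph [0090] describes (a standard deviation of document similarities to a cluster center) is a question the sketch does not answer.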

Regarding claim 41, Hoehne in view of Vyas and Kawai disclose elements of claim 33. In addition, Vyas discloses:
The method of claim 39, further comprising: identifying one or more common features shared by the documents, wherein generating the template comprises: generating the template based on identifying the one or more common features and based on identifying the at least one input field (In paragraph [0090], Kawai discloses determining a standard deviation of the documents in the cluster based on the similarities to the cluster center.)

Claims 25-26 are rejected under 35 U.S.C. 103 as being unpatentable over Hoehne in view of Vyas and Kawai, and further in view of Boroczky et al. (US Pub. No. 20190108632 A1), hereinafter referred to as Boroczky.

Regarding claim 25, Hoehne in view of Vyas and Kawai disclose elements of claim 24. Hoehne in view of Vyas and Kawai do not explicitly disclose:
The system of claim 24, wherein the statistical value is associated with positional values of the set of corresponding pixels having, wherein the set of corresponding pixels comprises a number of adjacent pixels being greater than or equal to a threshold value
However, Boroczky discloses the limitation (In paragraph [0079], Boroczky discloses that a threshold is used on an image to keep only the pixels that are brighter than a certain level. The feature measured can be the threshold value in terms of pixel brightness or the number of standard deviations from the average brightness of the detected cluster compared to the threshold.)
Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Boroczky, with grouping image pixels as taught by Boroczky. The motivation for doing so would have been to improve the feature extraction process (see [0006] of Boroczky).

Regarding claim 26, Hoehne in view of Vyas and Kawai disclose elements of claim 24. Hoehne in view of Vyas and Kawai do not explicitly disclose:
The system of claim 24, wherein the statistical value is a standard deviation
However, Boroczky discloses the limitation (In paragraph [0079], Boroczky discloses that the feature that is extracted is given by either the threshold value expressed in units of image pixel intensity, or the number of standard deviations between the mean intensity of the rest of the detected cluster and the critical threshold.)
Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Boroczky, with grouping image pixels as taught by Boroczky. The motivation for doing so would have been to improve the feature extraction process (see [0006] of Boroczky).
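
The two quantities the Office Action attributes to Boroczky's paragraph [0079] can be sketched in a few lines of Python. This is a hedged illustration of the arithmetic as paraphrased above, not Boroczky's implementation; the function names `pixels_above` and `deviations_from_threshold`, the use of the population standard deviation, and the sample intensities are assumptions introduced here.

```python
from statistics import mean, pstdev


def pixels_above(pixels, threshold):
    """Keep only the pixel intensities strictly brighter than the threshold."""
    return [p for p in pixels if p > threshold]


def deviations_from_threshold(cluster_pixels, threshold):
    """Number of standard deviations between the cluster mean and the threshold.

    Assumes the population standard deviation of the cluster intensities;
    raises ValueError when the cluster has no spread.
    """
    sigma = pstdev(cluster_pixels)
    if sigma == 0:
        raise ValueError("cluster intensities have zero standard deviation")
    return (threshold - mean(cluster_pixels)) / sigma


# Hypothetical intensities for a detected cluster and a critical threshold.
cluster = [100, 110, 120, 130, 140]
critical_threshold = 150

kept = pixels_above([90, 150, 151, 200], critical_threshold)
n_sigma = deviations_from_threshold(cluster, critical_threshold)
```

Here the cluster mean is 120 and the population standard deviation is the square root of 200, so the threshold of 150 sits roughly 2.12 standard deviations above the mean. Note that this statistic is computed over pixel intensities, whereas claims 25 and 26 recite a statistical value associated with positional values.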

Claims 35-36 are rejected under 35 U.S.C. 103 as being unpatentable over Hoehne in view of Vyas and Kawai, and further in view of Zitouni et al. (US Pub. No. 20190035387 A1), hereinafter referred to as Zitouni.

Regarding claim 35, Hoehne in view of Vyas and Kawai disclose elements of claim 21. Hoehne in view of Vyas and Kawai do not explicitly disclose:
The system of claim 21, wherein the inserted data comprises synthetic data and actual data
However, Zitouni discloses the limitation (In paragraph [0038], Zitouni discloses that training data 202 were obtained in order to train the machine learning model. Training data typically comes from one of two sources. One source is synthetic data, the other is actual data.)
Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Zitouni, with training data from two sources, one being synthetic data and the other being actual data, as taught by Zitouni. The motivation for doing so would have been that training data can be synthetic or actual data (see [0049] of Zitouni).

Regarding claim 36, Hoehne in view of Vyas and Kawai disclose elements of claim 21. Hoehne in view of Vyas and Kawai do not explicitly disclose:
The system of claim 21, wherein the inserted data comprises only synthetic data
However, Zitouni discloses the limitation (In paragraph [0148], Zitouni discloses initial training data comprising synthetic data created to cold-start the machine learning model, wherein the set of user input data comprises data input into the system after the machine learning model has been trained using the synthetic data.)
Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Zitouni, with training data from two sources, one being synthetic data and the other being actual data, as taught by Zitouni. The motivation for doing so would have been that training data can be synthetic or actual data (see [0049] of Zitouni).

Claims 37-38 are rejected under 35 U.S.C. 103 as being unpatentable over Hoehne in view of Vyas and Kawai, and further in view of Lundberg et al. (US Pub. No. 20200117718 A1), hereinafter referred to as Lundberg.

Regarding claim 37, Hoehne in view of Vyas and Kawai disclose elements of claim 21. Hoehne in view of Vyas and Kawai do not explicitly disclose:
The system of claim 21, wherein the operations further comprise: extracting metadata from the plurality of documents; and storing the metadata in a database
However, Lundberg discloses the limitation (In Fig. 6 and Col. 16-17, lines 66–4, Lundberg discloses that patent management system 102 stores docketing information for a plurality of matters, each of the plurality of matters including a plurality of activities and a plurality of documents.)
Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Lundberg, with extracting one or more features from the first document and the plurality of activities associated with the first matter. The motivation for doing so would have been to optimize the models to correctly predict the output for a given input (see Col. 12, lines 49-51 of Lundberg).

Regarding claim 38, Hoehne in view of Vyas and Kawai disclose elements of claim 21. Hoehne in view of Vyas and Kawai do not explicitly disclose:
The system of claim 21, wherein the machine learning model is configured to be used for a program to: identify handwritten information; identify typed information; identify an expected data type; or identify a document type
However, Lundberg discloses the limitation (In Col. 17, lines 20–34, Lundberg discloses that the machine learning model module 230 may access data stored in the issue that identifies the title or type of the document. For example, using image processing and extracting object features from the selected document, the machine learning model module 230 may determine that the features correspond to an Office Action.)
Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Lundberg, with extracting one or more features from the first document and the plurality of activities associated with the first matter. The motivation for doing so would have been to optimize the models to correctly predict the output for a given input (see Col. 12, lines 49-51 of Lundberg).

Response to Arguments
Applicant's arguments filed on 07/28/2025 have been fully considered and are persuasive in part.

Pertaining to Rejection under 101
Arguments are not persuasive and a full 101 analysis is set forth above.

Pertaining to Rejection under 103
Applicant’s arguments in regard to the examiner’s rejections under 35 USC 103 are moot in view of the new grounds of rejection.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EVEL HONORE, whose telephone number is (703) 756-1179. The examiner can normally be reached Monday-Friday, 8 a.m. to 5:30 p.m.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mariela D Reyes, can be reached at (571) 270-1006. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

EVEL HONORE
Examiner
Art Unit 2142



/Mariela Reyes/
Supervisory Patent Examiner, Art Unit 2142
Prosecution Timeline

May 28, 2025: Final Rejection mailed (§101, §103)
Jul 28, 2025: Request for Continued Examination
Aug 01, 2025: Response after Non-Final Action
Nov 28, 2025: Non-Final Rejection mailed (§101, §103)
Feb 12, 2026: Applicant Interview (Telephonic)
Feb 15, 2026: Examiner Interview Summary
Feb 20, 2026: Response Filed
May 27, 2026: Final Rejection mailed (§101, §103) (current)
Precedent Cases

Applications granted by this same examiner with similar technology

17/399,470, Patent 12566942: System and Method For Generating Parametric Activation Functions (4y 6m to grant; granted Mar 03, 2026)
17/484,623, Patent 12547946: SYSTEMS AND METHODS FOR FIELD EXTRACTION FROM UNLABELED DATA (4y 4m to grant; granted Feb 10, 2026)
17/687,918, Patent 12547906: METHOD, DEVICE, AND PROGRAM PRODUCT FOR TRAINING MODEL (3y 11m to grant; granted Feb 10, 2026)
17/189,160, Patent 12536156: UPDATING METADATA ASSOCIATED WITH HISTORIC DATA (4y 11m to grant; granted Jan 27, 2026)
17/331,332, Patent 12406483: ONLINE CLASS-INCREMENTAL CONTINUAL LEARNING WITH ADVERSARIAL SHAPLEY VALUE (4y 3m to grant; granted Sep 02, 2025)
Study what changed to get past this examiner. Based on 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 46%
With Interview (+18.8%): 64%
Median Time to Grant: 4y 2m (~0m remaining)
PTA Risk: High
Based on 22 resolved cases by this examiner. Grant probability derived from career allowance rate.