Prosecution Insights
Last updated: April 19, 2026
Application No. 18/630,856

ATTENTION EMBEDDED TRANSFORMER NETWORK DRIVEN DOCUMENT DATA EXTRACTION

Status: Non-Final OA (§103)
Filed: Apr 09, 2024
Examiner: VO, QUANG N
Art Unit: 2683
Tech Center: 2600 — Communications
Assignee: ADP Inc.
OA Round: 1 (Non-Final)
Grant Probability: 72% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 2y 9m
Grant Probability with Interview: 80%

Examiner Intelligence

Career Allow Rate: 72% (439 granted / 612 resolved; +9.7% vs TC avg, above average)
Interview Lift: +8.3% in resolved cases with interview (moderate lift)
Typical Timeline: 2y 9m average prosecution; 23 applications currently pending
Career History: 635 total applications across all art units

Statute-Specific Performance

§101: 13.4% (-26.6% vs TC avg)
§103: 52.8% (+12.8% vs TC avg)
§102: 22.1% (-17.9% vs TC avg)
§112: 7.6% (-32.4% vs TC avg)
Comparisons are against a Tech Center average estimate. Based on career data from 612 resolved cases.
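The per-statute deltas are stated relative to a Tech Center average estimate, so the implied baseline can be recovered by subtracting each delta from the corresponding success rate. A minimal sketch of that arithmetic, using the figures shown above (the variable names are illustrative, not from the tool, and the formula is an inference from the displayed numbers rather than a documented methodology):

```python
# Per-statute success rates and deltas vs the Tech Center average,
# as displayed in the Statute-Specific Performance section.
rates = {"§101": 13.4, "§103": 52.8, "§102": 22.1, "§112": 7.6}
deltas = {"§101": -26.6, "§103": 12.8, "§102": -17.9, "§112": -32.4}

# Implied TC-average baseline: rate minus delta, rounded to one decimal.
implied_tc_avg = {s: round(rates[s] - deltas[s], 1) for s in rates}
print(implied_tc_avg)  # every statute maps back to the same ~40.0% baseline
```

Notably, all four statutes resolve to the same implied baseline, suggesting the deltas were computed against a single Tech Center figure rather than per-statute averages.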

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement

The information disclosure statement (IDS) submitted on 07/15/2025 was filed in compliance with the provisions of 37 CFR 1.97 and 1.98. Accordingly, the information disclosure statement is being considered by the examiner. Applicant has not provided an explanation of relevance of the cited document(s) discussed below.

Reference US 2022/0405484 A1 is a general background reference covering: "A computer-implemented method and system for enrichment of responses in a multimodal conversation environment are disclosed. A Question Answer (QA) engine, such as a reinforcement document transformer, exploits a document template structure or layout, adapts the information extraction using a domain ontology, stores the enriched contents in a hierarchical form, and learns context and query patterns based on the intent and utterances of one or more queries. The region of enriched content for preparing a response to a given query is expanded or collapsed by navigating upwards or downwards in the hierarchy. The QA engine returns the most relevant answer with the proper context for one or more questions. The responses are provided to the user in one or more modalities." (See abstract.)

Claim Interpretation

Claim 18 is objected to because of the following informality: claim 18 depends on claim 14, which is a method claim, not a system claim. Appropriate correction is required.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-9 and 11-20 are rejected under 35 U.S.C. 103 as being unpatentable over Gutta et al. (Gutta) (US 11,580,150 B1).

Regarding claim 1, Gutta discloses a system (e.g., FIG. 1 is a schematic diagram that illustrates a first computing environment (or "system") 100, paragraph 13), comprising: one or more processors, coupled with memory (e.g., the system 100 includes a computer system 102 (e.g., including a classifier subsystem 110 and a data extraction subsystem 112), a set of user devices 104 (e.g., including user devices 104a-104c) and databases 132 (e.g., including document databases 134, a model database 136 and a document feature database 138) communicatively coupled by way of a network 150, paragraph 13), to: identify a document of a first type received from a client device (e.g., the classifier subsystem 110 is employed to classify a document and the data extraction subsystem 112 is used to determine features and values of features contained in a document, paragraph 17); establish a boundary of a portion of the document based on a digital overlay (e.g., the classifier subsystem 110 may use a first set of model parameters from the model database 136 for a first deep learning model to classify a document to determine a set of document features of a document, paragraph 17); extract the data from the document of the first type by inputting the query to a second trained machine learning model (e.g., the extracted data may be processed by a second deep learning model 240. Some embodiments may train the second deep learning model 240 based on corpora and a plurality of document type categories associated with the corpora, paragraph 22).

Gutta, in one embodiment, does not specifically disclose select the portion of the document based on the boundary; generate, using a trained machine learning model, a query using the portion of the document, wherein the query is designed to facilitate an extraction of data, wherein the data to be extracted is based on the document being of the first type.

Gutta, in another embodiment, discloses select the portion of the document based on the boundary; generate, using a trained machine learning model, a query using the portion of the document, wherein the query is designed to facilitate an extraction of data, wherein the data to be extracted is based on the document being of the first type (e.g., certain embodiments perform operations to extract information from a document using a set of artificial intelligence (AI) learning models (or "learning models"), such as deep learning models or other machine learning models, and to store the information in a structured database. The information may be prioritized based on predefined or dynamically defined categories. In some embodiments, training the set of learning models may include using self-learning models or semi-automated learning models. Some embodiments may further train one or more learning models to store the extracted information in a structure that is organized based on the document, paragraph 12).

Therefore, it would have been obvious to one of ordinary skill in the art at the time of the invention to have modified Gutta to include select the portion of the document based on the boundary; generate, using a trained machine learning model, a query using the portion of the document, wherein the query is designed to facilitate an extraction of data, wherein the data to be extracted is based on the document being of the first type, as taught by Gutta's other embodiment. It would have been obvious to one of ordinary skill in the art at the time of the invention to have modified Gutta's first embodiment by the teaching of Gutta's other embodiment for use in a particular application.

Regarding claim 2, Gutta discloses wherein the one or more processors are further configured to: determine a validation score for the extracted data; and display the extracted data via the client device in response to the validation score being above a threshold.

Regarding claim 3, Gutta discloses wherein the one or more processors are further configured to: determine the validation score using the trained machine learning model, wherein the trained machine learning model receives the extracted data as an input (e.g., some embodiments may further train one or more learning models to store the extracted information in a structure that is organized based on the document. Furthermore, some embodiments may use the set of learning models in a hierarchical arrangement and further dynamically structure a user interface associated with model training to increase training efficiency, paragraph 12).

Regarding claim 4, Gutta discloses wherein the one or more processors are further configured to: determine, using the second trained machine learning model, the validation score, wherein the second trained machine learning model receives the extracted data as an input (e.g., after retrieving the set of fields using the document feature retriever 220, the extracted data may be processed by a second deep learning model 240. Some embodiments may train the second deep learning model 240 based on corpora and a plurality of document type categories associated with the corpora, paragraph 22).

Regarding claim 5, Gutta discloses wherein the one or more processors are further configured to: determine, via the trained machine learning model, a first validation score, wherein the trained machine learning model receives the extracted data as a first input (e.g., the classifier subsystem 110 may use a first set of model parameters from the model database 136 for a first deep learning model to classify a document to determine a set of document features of a document, paragraph 17); determine, via the second trained machine learning model, a second validation score, wherein the second machine learning model receives the extracted data as a second input (e.g., the data extraction subsystem 112 may use a second set of model parameters from the model database 136 for a second deep learning model to determine values associated with the document features. In some embodiments, the first deep learning model is different from the second deep learning model, paragraph 17); and display the extracted data in response to a determination that the first validation score and the second validation score are both above the threshold (e.g., this may include the NLP computer system (e.g., field editor 232 of the interface 230 of the field retriever 220) displaying on a graphical user interface, for viewing and selection by a user, a listing of the fields "date" and "time" (which is the extracted data) determined to be associated with the real-estate contract document category. The listing of fields may be interactive, allowing a user to select to modify a displayed field, select to add a field to the listing, or select to delete a field from the listing, paragraph 66).

Regarding claim 6, Gutta discloses wherein the one or more processors are further configured to: determine a validation score for the extracted data (e.g., the classifier subsystem 110 is employed to classify a document and the data extraction subsystem 112 is used to determine features and values of features contained in a document, paragraph 17); extract new data from the document of the first type by inputting the query into the second trained machine learning model in response to a determination that the validation score is below a threshold (e.g., the data extraction subsystem 112 may use a second set of model parameters from the model database 136 for a second deep learning model to determine values associated with the document features, paragraph 17); determine a new validation score for the extracted new data (e.g., to determine document features based on the classifications, and to extract corresponding feature values from the documents, paragraph 18); and replace the extracted data with the extracted new data, in response to a determination that the new validation score is above the threshold (e.g., after retrieving the set of fields using the document feature retriever 220, the extracted data may be processed by a second deep learning model 240. Some embodiments may train the second deep learning model 240 based on corpora and a plurality of document type categories associated with the corpora, paragraph 22).

Regarding claim 7, Gutta discloses wherein the one or more processors are further configured to: determine a domain of a plurality of domains of the document according to the first type; and template the extracted data according to an ontological library corresponding to the domain determined (e.g., the classifier subsystem 110 may use a Naïve Bayes classifier to assign a first document type (or "category") to a document (where the first document type category is associated with a corresponding set of document features) and the data extraction subsystem 112 may then use the corresponding set of document features in conjunction with the text of the document to determine a set of document feature values using a multi-channel transformer model, such as a dual BERT model or a Siamese BERT model, paragraph 17).

Regarding claim 8, Gutta discloses wherein the one or more processors are further configured to: create at least one new document by an action performed on the document (e.g., in some embodiments, data is obtained from one or more databases 132 for training or use by a set of AI learning models (e.g., deep learning models or other types of machine learning models) to process documents, paragraph 16); and input a first training data set to train the trained machine learning model, wherein the first training data set comprises the at least one new document and the document (e.g., training operations for a deep learning model may include, for example, obtaining corpora from a document database 134 and using the corpora to determine model parameters for the deep learning model. The model parameters determined may then be stored in the model database 136 and retrieved for use in classifying a document or determining document feature values (or "field values"), which may be stored in the document feature database 138. In some embodiments, a set of document features (or "fields") for a document are retrieved from a document feature database 138 based on a set of categories assigned to the document, paragraph 16).

Regarding claim 9, Gutta discloses the action performed on the document is at least one of: a rotation; an inversion; a rescaling; a blurring; a sharpening; a modification of a quantitative aspect; and a modification of a qualitative aspect (e.g., a finer-grain model may, for example, be used to extract quantitative or categorical values of interest, where the context of the per-sentence level may be retained for the finer-grain model, paragraph 3).

Regarding claim 11, Gutta discloses wherein the one or more processors are configured to: determine a domain of a plurality of domains corresponding to the first type of the document, the plurality of domains comprising: payroll; tax; benefits; human resources; time management; or performance management (e.g., the record having the title "shipment contracts—1.1" may have an associated set of record properties such as "ship type," "ship color," and "container type," and the record having the title "shipment contracts—backup" may have an associated set of record properties such as "cost," "insured amount," and "tax.", paragraph 19).

Regarding claim 12, Gutta discloses wherein the one or more processors are further configured to: receive, via the client device, an indication of the first type of document (e.g., in some embodiments, the first deep learning model is different from the second deep learning model. For example, the classifier subsystem 110 may use a Naïve Bayes classifier to assign a first document type (or "category") to a document (where the first document type category is associated with a corresponding set of document features) and the data extraction subsystem 112 may then use the corresponding set of document features in conjunction with the text of the document to determine a set of document feature values using a multi-channel transformer model, such as a dual BERT model or a Siamese BERT model. In some embodiments, the set of document feature values are stored in the database(s) 132 or some other datastore, paragraph 17).

Regarding claim 13, Gutta discloses wherein the second trained machine learning model is a trained attention embedded transformer network model (e.g., the data extraction subsystem 112 may use a second set of model parameters from the model database 136 for a second deep learning model to determine values associated with the document features, paragraph 17).

Regarding claim 14, claim 14 is a method claim with limitations similar to the limitations of claim 1. Therefore, claim 14 is rejected as set forth above for claim 1.

Regarding claim 15, claim 15 is a method claim with limitations similar to the limitations of claim 2. Therefore, claim 15 is rejected as set forth above for claim 2.

Regarding claim 16, claim 16 is a method claim with limitations similar to the limitations of claim 6. Therefore, claim 16 is rejected as set forth above for claim 6.

Regarding claim 17, claim 17 is a method claim with limitations similar to the limitations of claim 8. Therefore, claim 17 is rejected as set forth above for claim 8.

Regarding claim 18, claim 18 is a system claim with limitations similar to the limitations of claim 12. Therefore, claim 18 is rejected as set forth above for claim 12.

Regarding claim 19, claim 19 is a non-transitory computer-readable medium claim with limitations similar to the limitations of claim 1. Therefore, claim 19 is rejected as set forth above for claim 1.

Regarding claim 20, claim 20 is a non-transitory computer-readable medium claim with limitations similar to the limitations of claim 2. Therefore, claim 20 is rejected as set forth above for claim 2.

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Gutta et al. (Gutta) (US 11,580,150 B1) as applied to claim 1 above, and further in view of Represas et al. (Represas) (US 12,106,140 B2).

Regarding claim 10, Gutta does not specifically disclose wherein the one or more processors are further configured to: create at least one new document, wherein the new document is a rotation of the document; input a first training data set to a machine learning model to train the machine learning model, wherein the first training data set comprises the at least one new document and the document.

Represas discloses wherein the one or more processors are further configured to: create at least one new document, wherein the new document is a rotation of the document (e.g., if necessary: segment, merge, rotate or otherwise transform various document pages so as to enhance their legibility and allow for posterior storage according to the institution's ontology, paragraph 65); and input a first training data set to a machine learning model to train the machine learning model, wherein the first training data set comprises the at least one new document and the document (e.g., the model-hosting services implementing a plurality of different trained machine learning classifiers and/or inference models, the stored program instructions of the main service being configured when executed by one or more computing instances of a virtual computing environment to cause the one or more computing instances to execute: receiving a digitally stored electronic document; invoking the model-hosting services to execute automatically inferring at least a subject and a date in the electronic document, paragraph 32).

Therefore, it would have been obvious to one of ordinary skill in the art at the time of the invention to have modified Gutta to include wherein the one or more processors are further configured to: create at least one new document, wherein the new document is a rotation of the document; input a first training data set to a machine learning model to train the machine learning model, wherein the first training data set comprises the at least one new document and the document, as taught by Represas. It would have been obvious to one of ordinary skill in the art at the time of the invention to have modified Gutta by the teaching of Represas to effectively use it for a particular application.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to QUANG N VO, whose telephone number is (571) 270-1121. The examiner can normally be reached Monday-Friday, 7 AM-4 PM EST.

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Abderrahim Merouan, can be reached at 571-270-5254. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format.

For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/QUANG N VO/
Primary Examiner, Art Unit 2683

Prosecution Timeline

Apr 09, 2024: Application Filed
Mar 06, 2026: Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12592002: COLOR CONVERSION SYSTEM, COLOR CONVERSION METHOD, AND INFORMATION PROCESSING APPARATUS (2y 5m to grant; granted Mar 31, 2026)
Patent 12577842: METHOD AND SYSTEM FOR MEASURING VOLUME OF A DRILL CORE SAMPLE (2y 5m to grant; granted Mar 17, 2026)
Patent 12581023: GREYSCALE IMAGES (2y 5m to grant; granted Mar 17, 2026)
Patent 12572996: FRACTIONALIZED TRANSFERS OF SENSOR DATA FOR STREAMING AND LATENCY-SENSITIVE APPLICATIONS (2y 5m to grant; granted Mar 10, 2026)
Patent 12573172: IMAGE OUTPUTTING DEVICE AND IMAGE OUTPUTTING METHOD (2y 5m to grant; granted Mar 10, 2026)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 72%
With Interview: 80% (+8.3%)
Median Time to Grant: 2y 9m
PTA Risk: Low
Based on 612 resolved cases by this examiner. Grant probability is derived from the career allow rate.
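Since the grant probability is stated to come from the career allow rate, the headline figures can be reproduced directly from the examiner's career counts. A minimal sketch, assuming the probability is simply granted/resolved and the interview figure adds the lift in percentage points (the function names are illustrative, not the vendor's documented methodology):

```python
# Derive the dashboard's headline figures from the examiner's career data.
# Assumption: grant probability == career allow rate, and the interview
# number is the base rate plus an additive lift in percentage points.

def allow_rate(granted: int, resolved: int) -> float:
    """Career allow rate as a percentage of resolved cases."""
    return 100 * granted / resolved

def with_interview(base_rate: float, lift_points: float) -> float:
    """Grant probability assuming an additive interview lift."""
    return base_rate + lift_points

base = allow_rate(439, 612)              # 439 granted of 612 resolved
print(round(base))                       # 72 -> "Grant Probability: 72%"
print(round(with_interview(base, 8.3)))  # 80 -> "With Interview: 80%"
```

The exact rate is 71.7%, which the dashboard rounds to 72%; adding the +8.3-point interview lift gives 80.0%, matching the 80% shown.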
