Prosecution Insights
Last updated: April 18, 2026
Application No. 18/671,218

Systems and Methods for Extracting Information from a Physical Document

Non-Final OA §101 §103 §DP
Filed: May 22, 2024
Examiner: HAUSMANN, MICHELLE M
Art Unit: 2671
Tech Center: 2600 — Communications
Assignee: Google LLC
OA Round: 1 (Non-Final)
Grant Probability: 76% (Favorable)
OA Rounds: 1-2
To Grant: 3y 1m
With Interview: 98%
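These headline figures appear to be related: the 76% baseline plus the examiner's reported +21.6-point interview lift (see Examiner Intelligence below) comes to roughly the 98% shown for interviews. The small Python check below illustrates that assumed relationship; the additive, capped model is my assumption, not the tool's documented methodology.

# Back-of-the-envelope check, assuming (my assumption, not the tool's documented
# model) that the "With Interview" figure is the baseline grant probability plus
# the examiner's historical interview lift, capped at 100%.
baseline = 76.0        # % grant probability (reported above)
interview_lift = 21.6  # percentage points (Examiner Intelligence section)

with_interview = min(baseline + interview_lift, 100.0)
print(f"Estimated with-interview probability: {with_interview:.0f}%")  # ~98%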

Examiner Intelligence

Career Allow Rate: 76%, above average (658 granted / 863 resolved; +14.2% vs TC avg)
Interview Lift: +21.6%, strong (resolved cases with interview vs. without)
Typical Timeline: 3y 1m average prosecution; 23 applications currently pending
Career History: 886 total applications across all art units
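The lift figure is reported without its formula. Below is a minimal sketch assuming the lift is the difference in allowance rate between resolved cases with and without an examiner interview; only the 658 granted / 863 resolved career totals come from the report, and the with/without split is hypothetical, chosen so the totals still add up.

# Rough illustration of how an interview-lift figure could be computed,
# assuming lift = allowance rate (with interview) - allowance rate (without).
# The 658/863 career totals are from the report; the with/without split
# below is hypothetical and chosen only so the totals still add up.

def allow_rate(granted: int, resolved: int) -> float:
    """Allowance rate as a percentage of resolved cases."""
    return 100.0 * granted / resolved

career = allow_rate(658, 863)             # ~76.2% career allow rate

with_interview = allow_rate(155, 165)     # hypothetical: ~93.9%
without_interview = allow_rate(503, 698)  # hypothetical: ~72.1%
lift = with_interview - without_interview # ~+21.9 points

print(f"Career allow rate: {career:.1f}%")
print(f"Interview lift:    {lift:+.1f} points")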

Statute-Specific Performance

§101: 14.6% (-25.4% vs TC avg)
§103: 61.2% (+21.2% vs TC avg)
§102: 5.7% (-34.3% vs TC avg)
§112: 10.1% (-29.9% vs TC avg)
Tech Center averages are estimates. Based on career data from 863 resolved cases.

Office Action

§101 §103 §DP
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement

The information disclosure statements (IDS) submitted on May 22, 2024 and September 12, 2024 are in compliance with the provisions of 37 CFR 1.97. While some of the non-patent literature and search reports were not included in this application, they were included in the parent case. Accordingly, the information disclosure statements are being considered by the examiner.

Double Patenting

The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).

A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).

The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional, the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to a final Office action, see 37 CFR 1.113(c). A request for reconsideration, while not provided for in 37 CFR 1.113(c), may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.

The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission.
For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer. Claims 1 (and by dependency claims 2-18) are rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent No. 12033412. Although the claims at issue are not identical, they are not patentably distinct from each other because the current claims are a broader version of those in U.S. Patent No. 12033412 (difference underlined). See additional note on claim 17 below. Current Application 1. A computer-implemented method for extracting information from documents, the method comprising: obtaining, at a computing system comprising one or more processors, data representing one or more units of text extracted from an image of a document; determining, by the computing system, one or more annotated values from the one or more units of text; determining, by the computing system, a label for each annotated value of the one or more annotated values, wherein the label for each annotated value comprises a key that explains the annotated value, and wherein determining, by the computing system, the label for each annotated value comprises performing, by the computing system for each annotated value, a search for the label among the one or more units of text based at least in part on a location of the annotated value within the document; and mapping, by the computing system, at least one annotated value from the one or more annotated values to an action that is presented to a user based at least in part on the label associated with the at least one annotated value. U.S. Patent No. 12033412 1. A computer-implemented method for extracting information from documents, the method comprising: obtaining, at a computing system comprising one or more processors, data representing one or more units of text extracted from an image of a document; determining, by the computing system, one or more annotated values from the one or more units of text; determining, by the computing system, a label for each annotated value of the one or more annotated values, wherein the label for each annotated value comprises a key that explains the annotated value, and wherein determining, by the computing system, the label for each annotated value comprises performing, by the computing system for each annotated value, a search for the label among the one or more units of text based at least in part on a location of the annotated value within the document; wherein determining, by the computing system, the label for each annotated value comprises: determining, by the computing system based on the search, a set of one or more candidate labels for each annotated value, wherein preference is given to candidate labels that satisfy relative location characteristics, wherein the relative location characteristics comprise: inclusion in left-side region or a top-side region relative to the location associated with the annotated value in a coordinate space of the document when a language associated with the document is a Left-to-Right (LTR) language, or inclusion in a right-side region or the top-side region relative to the location associated with the annotated value in the coordinate space of the document when the language associated with the document is a Right-to-Left (RTL) language; and determining, by the computing system, a canonical label for each annotated value based at least in part on the set of one or more candidate labels associated with the annotated value; and 
mapping, by the computing system, at least one annotated value from the one or more annotated values to an action that is presented to a user based at least in part on the label associated with the at least one annotated value. Additional note on claim 17: While claims 1 + 2 + 17 are similar to claim 1 of U.S. Patent No. 12033412, the current application will have the language “searching only a left-side region and a top-side region relative to the location associated with the annotated value in a coordinate space of the document when a language associated with the document is a Left-to-Right (LTR) language, or searching only a right-side region and a top-side region relative to the location associated with the annotated value in the coordinate space of the document when the language associated with the document is a Right-to-Left (RTL) language” and the U.S. Patent No. 12033412 has the language “wherein preference is given to candidate labels that satisfy relative location characteristics, wherein the relative location characteristics comprise: inclusion in left-side region or a top-side region relative to the location associated with the annotated value in a coordinate space of the document when a language associated with the document is a Left-to-Right (LTR) language, or inclusion in a right-side region or the top-side region relative to the location associated with the annotated value in the coordinate space of the document when the language associated with the document is a Right-to-Left (RTL) language” which are different enough to not be statutory double patenting but only non-statutory double patenting. Claim 19 is rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1 + 18 of U.S. Patent No. 12033412. Although the claims at issue are not identical, they are not patentably distinct from each other because the current claims are a broader version of those in U.S. Patent No. 12033412 (difference underlined). Current Application 19. A computing system, the system comprising: one or more processors; and a computer-readable medium having instructions stored thereon that, when executed by the one or more processors, cause the system to perform operations, the operations comprising: obtaining, by the computing system, data representing one or more units of text extracted from an image of a document; determining, by the computing system, one or more annotated values from the one or more units of text; determining, by the computing system, a label for each annotated value of the one or more annotated values, wherein the label for each annotated value comprises a key that explains the annotated value, and wherein determining, by the computing system, the label for each annotated value comprises performing, by the computing system for each annotated value, a search for the label among the one or more units of text based at least in part on a location of the annotated value within the document; and mapping, by the computing system, at least one annotated value from the one or more annotated values to an action that is presented to a user based at least in part on the label associated with the at least one annotated value. U.S. Patent No. 12033412 1. 
A computer-implemented method for extracting information from documents, the method comprising: obtaining, at a computing system comprising one or more processors, data representing one or more units of text extracted from an image of a document; determining, by the computing system, one or more annotated values from the one or more units of text; determining, by the computing system, a label for each annotated value of the one or more annotated values, wherein the label for each annotated value comprises a key that explains the annotated value, and wherein determining, by the computing system, the label for each annotated value comprises performing, by the computing system for each annotated value, a search for the label among the one or more units of text based at least in part on a location of the annotated value within the document; wherein determining, by the computing system, the label for each annotated value comprises: determining, by the computing system based on the search, a set of one or more candidate labels for each annotated value, wherein preference is given to candidate labels that satisfy relative location characteristics, wherein the relative location characteristics comprise: inclusion in left-side region or a top-side region relative to the location associated with the annotated value in a coordinate space of the document when a language associated with the document is a Left-to-Right (LTR) language, or inclusion in a right-side region or the top-side region relative to the location associated with the annotated value in the coordinate space of the document when the language associated with the document is a Right-to-Left (RTL) language; and determining, by the computing system, a canonical label for each annotated value based at least in part on the set of one or more candidate labels associated with the annotated value; and mapping, by the computing system, at least one annotated value from the one or more annotated values to an action that is presented to a user based at least in part on the label associated with the at least one annotated value. 18. A computing system, the systems comprising: one or more processors; and a computer-readable medium having instructions stored thereon that, when executed by the one or more processors, cause the system to perform the computer-implemented method of claim 1. Claim 20 is rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1 + 19 of U.S. Patent No. 12033412. Although the claims at issue are not identical, they are not patentably distinct from each other because the current claims are a broader version of those in U.S. Patent No. 12033412(difference underlined). Current Application 20. 
A computer-readable medium having instructions stored thereon that, when executed by one or more processors of a computing system, cause the computing system to perform operations, the operations comprising: obtaining by the computing system, data representing one or more units of text extracted from an image of a document; determining, by the computing system, one or more annotated values from the one or more units of text; determining, by the computing system, a label for each annotated value of the one or more annotated values, wherein the label for each annotated value comprises a key that explains the annotated value, and wherein determining, by the computing system, the label for each annotated value comprises performing, by the computing system for each annotated value a search for the label among the one or more units of text based at least in part on a location of the annotated value within the document; and mapping, by the computing system, at least one annotated value from the one or more annotated values to an action that is presented to a user based at least in part on the label associated with the at least one annotated value. U.S. Patent No. 12033412 1. A computer-implemented method for extracting information from documents, the method comprising: obtaining, at a computing system comprising one or more processors, data representing one or more units of text extracted from an image of a document; determining, by the computing system, one or more annotated values from the one or more units of text; determining, by the computing system, a label for each annotated value of the one or more annotated values, wherein the label for each annotated value comprises a key that explains the annotated value, and wherein determining, by the computing system, the label for each annotated value comprises performing, by the computing system for each annotated value, a search for the label among the one or more units of text based at least in part on a location of the annotated value within the document; wherein determining, by the computing system, the label for each annotated value comprises: determining, by the computing system based on the search, a set of one or more candidate labels for each annotated value, wherein preference is given to candidate labels that satisfy relative location characteristics, wherein the relative location characteristics comprise: inclusion in left-side region or a top-side region relative to the location associated with the annotated value in a coordinate space of the document when a language associated with the document is a Left-to-Right (LTR) language, or inclusion in a right-side region or the top-side region relative to the location associated with the annotated value in the coordinate space of the document when the language associated with the document is a Right-to-Left (RTL) language; and determining, by the computing system, a canonical label for each annotated value based at least in part on the set of one or more candidate labels associated with the annotated value; and mapping, by the computing system, at least one annotated value from the one or more annotated values to an action that is presented to a user based at least in part on the label associated with the at least one annotated value. 19. A non-transitory computer-readable medium having instructions stored thereon that, when executed by the one or more processors, cause one or more processors to perform the computer-implemented method of claim 1. Claim Rejections - 35 USC § 101 35 U.S.C. 
101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

35 U.S.C. 101 requires that a claimed invention must fall within one of the four eligible categories of invention (i.e., process, machine, manufacture, or composition of matter) and must not be directed to subject matter encompassing a judicially recognized exception as interpreted by the courts. MPEP 2106. The four eligible categories of invention include: (1) process, which is an act, or a series of acts or steps, (2) machine, which is a concrete thing, consisting of parts, or of certain devices and combination of devices, (3) manufacture, which is an article produced from raw or prepared materials by giving to these materials new forms, qualities, properties, or combinations, whether by hand labor or by machinery, and (4) composition of matter, which is all compositions of two or more substances and all composite articles, whether they be the results of chemical union, or of mechanical mixture, or whether they be gases, fluids, powders or solids. MPEP 2106(I).

Claim 20 is rejected under 35 U.S.C. 101 as not falling within one of the four statutory categories of invention because the broadest reasonable interpretation of the instant claims in light of the specification encompasses transitory signals. But transitory signals are not within one of the four statutory categories (i.e., non-statutory subject matter). See MPEP 2106(I). However, claims directed toward a non-transitory computer readable medium may qualify as a manufacture and make the claim patent-eligible subject matter. MPEP 2106(I). Therefore, amending the claims to recite a “non-transitory computer-readable medium” would resolve this issue.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1, 2, 7-9, 11-14, and 19-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Glass et al. (IDS: US 20040261016 A1) in view of Yellapragada et al. (IDS: US 20180032842 A1) in view of Dakin et al. (IDS: US 20160217119 A1).

Regarding claims 1, 19, and 20, Glass et al.
disclose a computer-implemented method for extracting information from documents, the method comprising, a computing system, the system comprising: one or more processors; and a computer-readable medium having instructions stored thereon that, when executed by the one or more processors, cause the system to perform operations, the operations comprising, and computer-readable medium having instructions stored thereon that, when executed by one or more processors of a computing system, cause the computing system to perform operations, the operations comprising: obtaining, at a computing system comprising one or more processors, data representing one or more units of text extracted from an image of a document (document parsing and extraction of substrings, [0113], characters preceding the document boundary are extracted and digested, [0114], pair of extracted document text substrings and their associated digest values, [0118]); determining, by the computing system, one or more annotated values from the one or more units of text (annotion values, [0104], annotation values of yes and no, annotation values of 0, 1, 2, 3, 4, 5, and 6, [0107]); determining, by the computing system, a label for each annotated value of the one or more annotated values (a set of annotation value labels associated with each annotation value, [0107]), wherein the label for each annotated value comprises a key that explains the annotated value (stores at least one document annotation definition at the server computer, document annotation definition provides a structure for the method by which documents are annotated, [0064], “In another substring annotation definition example, FIG. 4 illustrates that, optionally, a second substring classification annotation type may be defined, such as a Substring classification 2: call to action text 189. This type of document substring, if found within a sample document and correctly annotated, enables the annotation system to record the existence within a document of specific types of content, such as URLs, email addresses, phone numbers, postal addresses or other text substrings that signify a method of contacting the document author or an entity attempting to identify themselves in a document”, [0109]), and wherein determining, by the computing system, the label for each annotated value comprises performing, by the computing system for each annotated value, a search for the label among the one or more units of text based at least in part on a location of the annotated value within the document (one or more possible document parsing boundaries, [0114], Regardless of the document parsing boundary conditions that are set, the result of applying these boundaries in steps 198 and 200 of FIG. 5 is the identification and storage of one or more document text substrings that are contiguous to each other within the original document, [0121]); and mapping, by the computing system, at least one annotated value from the one or more annotated values to an action that is presented to a user based at least in part on the label associated with the at least one annotated value (“As an example, in FIG. 
4 a set of email message documents can be classified, in a first document annotation type 184, as either junk or not junk, in a second annotation type 186 as having a selected topic, and in third and fourth annotation types 188 and 189 as having one or more document text substrings that may be annotated according as to whether the substrings are valid or not and whether the substring text represents call to action text”, [0106], pick list value labels exist to assist a human annotator, [0107], “In another substring annotation definition example, FIG. 4 illustrates that, optionally, a second substring classification annotation type may be defined, such as a Substring classification 2: call to action text 189”, [0109]). Glass et al. imply “wherein determining, by the computing system, the label for each annotated value comprises performing, by the computing system for each annotated value, a search for the label among the one or more units of text based at least in part on a location of the annotated value within the document” however as this is not made explicit, another reference is provided herein. Also, while a mapping process is implied and almost inherent, another reference is provided to also make this limitation explicit. Yellapragada et al. teach obtaining, at a computing system comprising one or more processors, data representing one or more units of text extracted from an image of a document (extracting data from documents, OCR, abstract, generate a classifier model for use in extraction of information e.g., labels and values from documents, [0021]); determining, by the computing system, one or more annotated values from the one or more units of text (At 512 OCR is performed in the cropped images to obtain values within the value regions, [0043]); determining, by the computing system, a label for each annotated value of the one or more annotated values (At 508, locations of one or more value regions corresponding to each identified label are obtained, [0042]), wherein the label for each annotated value comprises a key that explains the annotated value (Further, the user may be prompted to identify each of the regions tagged as labels. For example, a user may drag a mouse pointed on a training image to draw a rectangular box around a label “employee name”, and may tag the selected box as a label. The user may further specify that the label corresponds to “employee name”, [0036]), and wherein determining, by the computing system, the label for each annotated value comprises performing, by the computing system for each annotated value, a search for the label among the one or more units of text based at least in part on a location of the annotated value within the document (“At 506, each candidate label is classified (e.g., identified) and located within the candidate document based on the trained classifier generated during the offline processing (e.g., process block 408 of FIG. 4). For example, each region of the image 524 identified in step 504 as potentially containing a label is run through the trained classifier model. In an aspect, one or more of the regions are identified and located within the candidate document as a result of the classification operation. At 508, locations of one or more value regions corresponding to each identified label are obtained. For example, each of the regions not identified as a label during the classification step at 506 may be designated as a value region. 
Further, a designated value region may be identified as corresponding to an identified label based on the position of the value region relative to the label and other labels and value regions. The identification of correspondence between value regions and labels may be carried out based on meta data including spatial information stored during the offline process (e.g., step 404 of FIG. 4)”, [0042]). Glass et al. and Yellapragada et al. are in the same art of scanned digital forms and documents (Glass et al., [0169]; Yellapragada et al., [0021]). The combination of Yellapragada et al. with Glass et al. enables the use of searching based on a location within a document. It would have been obvious at the time of filing to one of ordinary skill in the art to combine the location of Yellapragada et al. with the invention of Glass et al. as this was known at the time of filing, the combination would have predictable results, and as Yellapragada et al. indicate, “Aspects of the present disclosure discuss techniques to efficiently and accurately extract data from semi-structured documents (e.g., tax forms). Further, some of the aspects presented herein may advantageously be designed to run on resource-constrained environments, such as mobile devices, which have limited memory and processing power. In particular, some of the embodiments presented herein may utilize little memory and processing power to identify labels in a document, and further to identify value regions in the document for performing OCR to determine values corresponding to the identified labels” ([0020]) indicating a processing savings when combined with Glass et al. which will cause the combination to be more flexible in different processing applications such as those on a cell phone. While Glass et al. and Yellapragada et al. strongly imply “mapping” another reference is provided herein. Dakin et al. teach mapping, by the computing system, at least one annotated value from the one or more annotated values to an action that is presented to a user based at least in part on the label associated with the at least one annotated value (A suggested response for populating the fillable form field candidate is selected from prior responses to other form fields having at least one attribute in common with the identified attribute of the fillable form field candidate. The prior responses are inputs obtained from or associated with the given user. The suggested response is presented to the user for subsequent acceptance or rejection, abstract, To aid in suggesting data for filling in electronic form fields, some existing form filling techniques maintain lists of user inputs gathered from prior form filling sessions. When a user opens an electronic form, these existing techniques provide suggestions for filling so-called live form fields that are defined in the form. For instance, if a form contains a live form field defined as “e-mail address,” the data entered into that field is stored in a database. 
When the user opens another form with a similar live form field, the application may suggest the same e-mail address that was previously entered by the user on the first form, [0010], The suggested responses are based on prior responses by the same user, or other contextual user information (e.g., the first and last name associated with a current login), to similar form fields in other forms, [0011], “Both the initial learning stage and subsequent response suggestion stages rely on a model for representing a mapping between a fillable form field candidate and a suggested response. To accomplish this, in an embodiment, a structure is generated that represents a decision that was made correlating a form field candidate label and the value associated with that field”, [0028]). Glass et al. and Yellapragada et al. and Dakin et al. are in the same art of scanned digital forms and documents (Glass et al., [0169]; Yellapragada et al., [0021]; Dakin et al., [0010]). The combination of Dakin et al. with Glass et al. and Yellapragada et al. enables the use of mapping to suggested actions. It would have been obvious at the time of filing to one of ordinary skill in the art to combine the action mapping of Dakin et al. with the invention of Glass et al. and Yellapragada et al. as this was known at the time of filing, the combination would have predictable results, and as Dakin et al. indicate this will allow prior information from fillable forms to be customized to a user (abstract) implying a time savings for the user and a minimizing of aggravation when filling out forms when combined with Glass et al. and Yellapragada et al.. Regarding claim 2, Glass et al. and Yellapragada et al. and Dakin et al. disclose the computer-implemented method of claim 1. Glass et al. and Yellapragada et al. and Dakin et al. further indicate wherein determining, by the computing system based on the search, a set of one or more candidate labels for each annotated value; and determining, by the computing system, a canonical label for each annotated value based at least in part on the set of one or more candidate labels associated with the annotated value (Glass et al., This type of document substring, if found within a sample document and correctly annotated, enables the annotation system to record the existence within a document of specific types of content, [0109]; Yellapragada et al. “training a machine learning classifier model using the extracted spatial attributes as training data. 
OCR app 144 or OCR app 124 may perform a run-time process including obtaining an image of a candidate document, segmenting the image to obtain regions (e.g., identified by spatial information of the obtained regions) within the image encompassing candidate labels and values, classifying and locating labels in the image based on the classifier model trained in the offline process, and performing OCR in the regions not identified as labels to obtain values corresponding to the identified labels”, [0030], obtain candidate labels, the candidate labels including regions within the image 524 that may potentially include a label, [0040], At 506, each candidate label is classified (e.g., identified) and located within the candidate document based on the trained classifier, [0042]; Dakin et al., each data record includes a label-value pair, where the label associates the value with the respective fillable form field candidate, [0023], The neuron contains specific information (e.g., context, type, label, value) while the strength of this neuron is based on historical accumulation of this information (e.g., confidence, weight, recency, time) with the remaining factors (e.g., action, source, convert) corresponding to the corrective feedback sent to each existing neuron (e.g., rules and/or weights applied to those rules in adjusting the strength of each neuron). Note that the data used for this corrective feedback (e.g., action, source, convert) may be stored for supervised training purposes, but may also be used for decision making itself (e.g., a label from OCR conversion may require loose matching due to the potential for bad OCR results), [0024], a text label value or button group label is predicted by exhaustively searching the entire synapse array looking for label overlap/ commonality and adding candidates to form a prediction array. Next, new form field elements and user value elements are stored in the form and user databases, respectively. A new synapse record is created when the data or user database records change for a given form, [0028]). Regarding claim 7, Glass et al. and Yellapragada et al. and Dakin et al. disclose the computer-implemented method of claim 1. Yellapragada et al. and Dakin et al. 
further indicate obtaining the data representing the one or more units of text extracted from an image of a document comprises: obtaining, by the computing system, image data representing the image of the document; inputting, by the computing system, the image data into an optical character recognition (OCR) model; and obtaining, by the computing system, an output of the OCR model in response to the image data, the output including the one or more units of text (Yellapragada et al., Techniques are disclosed for facilitating optical character recognition (OCR) by identifying one or more regions in an electronic document to perform the OCR, abstract, once the labels are identified, OCR is performed on value regions, [0020], perform OCR by identifying at least labels and distinguishing between labels and values within a candidate document., [0025], one of the OCR apps 134, 114, and 124, or a combination thereof may be used to implement the techniques for facilitating identifying information in a document in accordance with aspects of the present disclosure, [0030], [0032]; Dakin et al., because text is not rendered and is only distinguished visually, optical character recognition (OCR) technology may be used along with pre-processing to detect lines and other graphic patterns, [0013], object detection module 118 is configured to perform optical character recognition (OCR), graphic object detection, or both on an image-based electronic form 130 or document, [0018], each group of words is a fillable form field candidate that can be selected or unselected, inputs may, for example, come from a document (e.g., PDF) content stream, or from optical character recognition (OCR) pre-processing performed on an image of the form, [0020], identifying, using at least one of content encoded in the electronic form and optical character recognition of the electronic form: a location of the fillable form field candidate within the electronic form, [0047]). Regarding claim 8, Glass et al. and Yellapragada et al. and Dakin et al. disclose the computer-implemented method of claim 1. Glass et al. further disclose the data representing the one or more units of text comprises one or more bounding regions associated with the one or more units of text, each bounding region representing a position of the unit of text within a coordinate space associated with the document (The present invention teaches, to the contrary, that cross-document comparison capability is enhanced by pre-selecting the boundaries of specific document portions and document evaluation formats rather than leaving these choices at the discretion of document evaluators, [0044], FIG. 6 illustrates a set of document text substring boundary definitions that may be used to define the boundaries for and identify document text substrings within a document, [0074], document parsing boundaries, [0114], FIG. 6 further lists a seventh type of document parsing boundary condition in the form of an arbitrary occurrence of a selected number of characters in succession. With this arbitrary non-conjoined boundary condition, each contiguous set of, say, 100 characters within a document would be considered a document text substring, [0122]). Regarding claim 9, Glass et al. and Yellapragada et al. and Dakin et al. disclose the computer-implemented method of claim 8. Glass et al. 
further disclose each annotated value is associated with at least one of the one or more units of text and the bounding region that is associated with the at least one unit of text (The present invention teaches, to the contrary, that cross-document comparison capability is enhanced by pre-selecting the boundaries of specific document portions and document evaluation formats rather than leaving these choices at the discretion of document evaluators, [0044], FIG. 6 illustrates a set of document text substring boundary definitions that may be used to define the boundaries for and identify document text substrings within a document, [0074], document parsing boundaries, In step 202 the resulting digest value for the extracted document text substring is stored, [0114], FIG. 6 further lists a seventh type of document parsing boundary condition in the form of an arbitrary occurrence of a selected number of characters in succession. With this arbitrary non-conjoined boundary condition, each contiguous set of, say, 100 characters within a document would be considered a document text substring, [0122]). Regarding claim 11, Glass et al. and Yellapragada et al. and Dakin et al. disclose the computer-implemented method of claim 1. Yellapragada et al. further indicate determining the one or more annotated values comprises: inputting, by the computing system, the one or more units of text into an annotation model; and obtaining, by the computing system, an output of the annotation model in response to the one or more units of text, the output including the one or more annotated values (Yellapragada et al., a set of training documents for each template of a plurality of templates for the electronic document, extracting spatial attributes for at least a first label region and at least a first corresponding value region from the set, and training a classifier model based on the extracted spatial attributes, wherein the classifier model is used to identify the information in the electronic document, abstract, In certain aspects, a computing device may be configured to generate a classifier model for use in extraction of information (e.g., labels and values) from documents. In particular, the computing device may obtain electronic (also referred to as “digital”) images of the documents. The documents may correspond to different semi-structured documents, e.g., tax forms. For example, different companies may use different templates for a W2 form, [0021], extract information from a candidate document by using a trained classifier model, [0023]). Regarding claim 12, Glass et al. and Yellapragada et al. and Dakin et al. disclose the computer-implemented method of claim 11. Yellapragada et al. further indicate the annotation model includes one or more of a regular expression-based model, grammar parsing based model, machine-learned model, or heuristics model (a machine learning classifier model based on the extracted spatial information, [0022, [0024], [0030]). Regarding claim 13, Glass et al. and Yellapragada et al. and Dakin et al. disclose the computer-implemented method of claim 1. Glass et al. and Yellapragada et al. and Dakin et al. 
further indicate the one or more annotated values include one or more of: a date, a numeric value, a phone number, or an address (Glass et al., email addresses, phone numbers, postal addresses, [0109]; Yellapragada et al., For example, an element 320 may include a label 322 (e.g., text label), which may indicate the type of data (e.g., social security number (SSN)), [0034]; Dakin et al., For example, the electronic form may contain live form fields for typing in a first and last name, street address, telephone number, or other types of information that the user enters into the form, address, [0019], [0020], user ID number, [0022]). Regarding claim 14, Glass et al. and Yellapragada et al. and Dakin et al. disclose the computer-implemented method of claim 1. Yellapragada et al. and Dakin et al. further indicate determining the label for each annotated value comprises: inputting, by the computing system, the one or more annotated values into a candidate label model; and obtaining an output of the candidate label model in response to the one or more annotated values, by the computing system, the output including a set of one or more candidate labels for at least one annotated value from the one or more annotated values (Yellapragada et al., candidate labels and values, [0030], potential labels and values, [0032], obtain candidate labels, the candidate labels, [0040], each candidate label is classified, [0042], one or more of candidate labels as labels and locates these labels within the candidate image, [0046; Dakin et al., attribute of the fillable form field candidate may be a label called “city.”, [0016], correlating a form field candidate label and the value associated with that field, [0028], field candidate label list of words, [0029]). Claim(s) 3 and 4 is/are rejected under 35 U.S.C. 103 as being unpatentable over Glass et al. (IDS: US 20040261016 A1) and Yellapragada et al. (IDS: US 20180032842 A1) and Dakin et al. (IDS: US 20160217119 A1) as applied to claim 2 above, further in view of Yen et al. (US 20180373791 A1). Regarding claim 3, Glass et al. and Yellapragada et al. and Dakin et al. disclose the computer-implemented method of claim 1. Glass et al. and Yellapragada et al. and Dakin et al. do not explicitly disclose determining the canonical label for each annotated value comprises producing, by the computing system, an embedding for each of the one or more candidate labels determined for such annotated value; determining, by the computing system, a respective distance between the embedding for each of the one or more candidate labels and respective embeddings associated with a plurality of canonical labels; and selecting, by the computing system, the canonical label for the annotated value from the plurality of canonical labels based at least in part on the respective distances between the embedding for each of the one or more candidate labels and respective embeddings associated with a plurality of canonical labels. Yen et al. 
teach determining the canonical label for each annotated value comprises producing, by the computing system, an embedding for each of the one or more candidate labels determined for such annotated value; determining, by the computing system, a respective distance between the embedding for each of the one or more candidate labels and respective embeddings associated with a plurality of canonical labels; and selecting, by the computing system, the canonical label for the annotated value from the plurality of canonical labels based at least in part on the respective distances between the embedding for each of the one or more candidate labels and respective embeddings associated with a plurality of canonical labels (Related concept generator 200 may further include a selection module 250 that receives embedded target concept 232 and embedded candidate concepts 234 and selects intermediate concepts 225 that satisfy a predetermined relationship with target concept 202. In some embodiments, the selection may be based on displacement vectors between embedded target concept 232 and each of embedded candidate concepts 234 in the semantic vector space, [0028], In addition to and/or instead of the displacement vector approach described above, selection module 250 may implement a neural network model that receives embedded target concept 232 and embedded candidate concepts 234 and predicts whether a given candidate concept satisfies a desired relationship with target concept 202. For example, the neural network model may assign a probability and/or score to each of embedded concepts 234. Based on the probability and/or score, selection module 250 may select intermediate concepts 225 that correspond to embedded candidate concepts 234 with probabilities and/or scores that exceed a predetermined threshold. In some embodiments, the neural network model may receive as additional inputs relationship information 206 and/or user information 208. In some embodiments, the neural network model may be trained according to a supervised learning process, in which a plurality of labeled training examples (e.g., sets of training target concepts, training candidate concepts, and training labels indicating whether the training candidate concepts satisfy the desired relationship with the training target concepts) are provided to the neural network model and used to iteratively update the parameters of the neural network model, [0029]). Glass et al. and Yellapragada et al. and Dakin et al. and Yen et al. are in the same art of digital forms and documents (Glass et al., [0169]; Yellapragada et al., [0021]; Dakin et al., [0010]; Yen et al., [0031]). The combination of Yen et al. with Glass et al. and Yellapragada et al. and Dakin et al. enables the use of embedding. It would have been obvious at the time of filing to one of ordinary skill in the art to combine the embedding of Yen et al. with the invention of Glass et al. and Yellapragada et al. and Dakin et al. as this was known at the time of filing, the combination would have predictable results, and as Yen et al. indicate “As a result, the lack of systems that are able to automatically identify concepts that are related to a target concept may be a disservice to students. Accordingly, it would be desirable to provide systems and methods for automatically identifying concepts that are related to a target concept” ([0010]) therefore upgrading the user experience of Glass et al. and Yellapragada et al. and Dakin et al. 
and expanding applications of the invention to schools when the concepts are combined. Regarding claim 4, Glass et al., Yellapragada et al., Dakin et al. and Yen et al. disclose the computer-implemented method of claim 3. Yen et al. further teach selecting, by the computing system, the canonical label for the annotated value from the plurality of canonical labels based at least in part on the respective distances comprises selecting, by the computing system, the canonical label from the plurality of canonical labels such that the distance between the respective embeddings of the canonical label and the one or more candidate labels is the smallest distance and above a specified threshold (displacement vector approach, selection module 250 may select intermediate concepts 225 that correspond to embedded candidate concepts 234 with probabilities and/or scores that exceed a predetermined threshold, [0029]) [higher score = lower distance]. Claim(s) 5 is/are rejected under 35 U.S.C. 103 as being unpatentable over Glass et al. (IDS: US 20040261016 A1) and Yellapragada et al. (IDS: US 20180032842 A1) and Dakin et al. (IDS: US 20160217119 A1) and Yen et al. (US 20180373791 A1) as applied to claim 3 above, further in view of Corocan et al. (US 10699112 B1). Regarding claim 5, Glass et al., Yellapragada et al., Dakin et al. and Yen et al. disclose the computer-implemented method of claim 3. Glass et al., Yellapragada et al., Dakin et al. and Yen et al. do not disclose the plurality of canonical labels comprise one or more of: due date, amount due, or expiry date. Corocan et al. teach the plurality of canonical labels comprise one or more of: due date, amount due, or expiry date (As seen, invoice 200, which may be one of the document images 102, has a number of labels and associated data fields that are necessary for an invoice. The invoice is labeled as an “invoice” at 201. There is an invoice number 202 that uniquely identifies the invoice. The invoicing entity and address, seen at 203, identify the entity issuing the invoice. The recipient of the invoice is shown at 204. In addition, the invoice has a date field 205, payment terms 206, a due date 207 and a balance due 208, col. 3, lines 35-50). Glass et al. and Yellapragada et al. and Dakin et al. and Corocan et al. are in the same art of digital forms and documents (Glass et al., [0169]; Yellapragada et al., [0021]; Dakin et al., [0010]; Corocan et al., abstract, col. 1, lines 5-10). The combination of Corocan et al. with Glass et al. and Yellapragada et al. and Dakin et al. and Yen et al. enables the use of determining a balance due from a form. It would have been obvious at the time of filing to one of ordinary skill in the art to combine the balance due from a form of Corocan et al. with the invention of Glass et al. and Yellapragada et al. and Dakin et al. and Yen et al. as this was known at the time of filing, the combination would have predictable results, and as Corocan et al. indicate “Accurate identification and extraction of data from business documents is an important aspect of computerized processing of business documents” (col. 1, lines 10-20) therefore improving the accuracy of the extraction of Glass et al. and Yellapragada et al. and Dakin et al. and Yen et al., expanding applications of the invention to additional business applications. Claim(s) 6 is/are rejected under 35 U.S.C. 103 as being unpatentable over Glass et al. (IDS: US 20040261016 A1) and Yellapragada et al. (IDS: US 20180032842 A1) and Dakin et al. 
(IDS: US 20160217119 A1) as applied to claim 1 above, further in view of Corocan et al. (US 10699112 B1). Regarding claim 6, Glass et al. and Yellapragada et al. and Dakin et al. disclose the computer-implemented method of claim 1. Glass et al. and Yellapragada et al. and Dakin et al. do not disclose performing, by the computing system, the search for the label based at least in part on a location of the annotated value within the document comprises, for each annotated value: defining, by the computing system, a search space relative to the location of the annotated value within a coordinate space associated with the document, the search space defined based at least in part on a directional language convention associated with the language of the document; and searching, by the computing system, for the label only within the defined search space. Corocan et al. teach performing, by the computing system, the search for the label based at least in part on a location of the annotated value within the document comprises, for each annotated value: defining, by the computing system, a search space relative to the location of the annotated value within a coordinate space associated with the document, the search space defined based at least in part on a directional language convention associated with the language of the document; and searching, by the computing system, for the label only within the defined search space (In one embodiment, a simple model of segment overlap, seen in FIGS. 3A and 3B is employed such that a token is chosen to be a contextual feature if it physically overlaps the selected token by 50% and is the closest in the relevant direction, in order to identify the relationships explained in connection with FIGS. 2B, 2C and 2D. FIG. 3A shows the considered local context for constructing a feature vector for a given segment, 302. Segments 303, 304, 305, and 306 are representations of text segments generated by the OCR module 108, each containing digitized text and 2-d spatial coordinates relative to the document. In one embodiment a maximum of 4 context segments from 4 relative directions are considered. The 4 context segments come from the 4 contexts “immediately to the left of the given segment”, “immediately to the right of the given segment”, “immediately above the given segment” and “immediately underneath the given segment”. In one embodiment these local context segments are chosen based on the amount of X-axis or Y-axis overlap between the context segment and the given segment, here 302. “up segments” 303 and “down segments” 305 must have >50% of their length overlapped by 302, or 302 must be >50% overlapped by the considered up or down segment. In the event of multiple candidate overlap segments, the context segment is chosen so that it is vertically closest. When candidate segments overlap and are equally vertically close, the final context segment is chosen such that it its X-coordinate midpoint is closest to the given segments X-coordinate midpoint. Likewise, “left context” and “right context” segments are chosen with similar logic while exchanging coordinate axes, col. 5, lines 20-50). Glass et al. and Yellapragada et al. and Dakin et al. and Corocan et al. are in the same art of digital forms and documents (Glass et al., [0169]; Yellapragada et al., [0021]; Dakin et al., [0010]; Corocan et al., abstract, col. 1, lines 5-10). The combination of Corocan et al. with Glass et al. and Yellapragada et al. and Dakin et al. 
enables the use of a search space defined based at least in part on a directional language convention. It would have been obvious at the time of filing to one of ordinary skill in the art to combine the search space defined based at least in part on a directional language convention of Corocan et al. with the invention of Glass et al. and Yellapragada et al. and Dakin et al. as this was known at the time of filing, the combination would have predictable results, and as Corocan et al. indicate “Accurate identification and extraction of data from business documents is an important aspect of computerized processing of business documents” (col. 1, lines 10-20) therefore improving the accuracy of the extraction of Glass et al. and Yellapragada et al. and Dakin et al. and expanding applications of the invention to additional business applications. Claim(s) 10 is/are rejected under 35 U.S.C. 103 as being unpatentable over Glass et al. (IDS: US 20040261016 A1) and Yellapragada et al. (IDS: US 20180032842 A1) and Dakin et al. (IDS: US 20160217119 A1) as applied to claim 1 above, further in view of Lee et al. (US 20180314884 A1). Regarding claim 10, Glass et al. and Yellapragada et al. and Dakin et al. disclose the computer-implemented method of claim 1. Glass et al. and Yellapragada et al. and Dakin et al. do not disclose the data representing the one or more units of text comprises one or more language predictors associated with the one or more units of text. Lee et al. teach the data representing the one or more units of text comprises one or more language predictors associated with the one or more units of text (In some cases, an OCR system may receive an image of a document that is rotated in an orientation other than the proper orientation (e.g., with text flowing from left to right and from top to bottom for documents written in left-to-right script languages (e.g., English, French, German, Russian, and the like), or with text flowing from right to left and top to bottom for documents written in right-to-left script languages (e.g., Hebrew, Arabic, and so on)). The OCR system can attempt to extract text from the image as received (which, as discussed above, may be oriented in an orientation other than the proper orientation for the language in which the document is written). Upon failing to extract text from the image, the OCR system generally rotates the image in steps (e.g., by 90 degrees in an anticlockwise or clockwise direction) until the OCR system is able to extract text from the image. Because rotating an image and attempting to extract text from an image is a computationally expensive process, an OCR system may waste resources attempting to extract text from a document. For example, in a worst-case scenario, the OCR system may perform four text extraction attempts (on the image as received, rotated 90 degrees, rotated 180 degrees, and rotated 270 degrees) before the OCR system is able to successfully extract text from the document, [0017], Embodiments presented herein provide techniques for determining a probable orientation of a document included in an image before attempting to extract text from the document included in the image. 
By determining a probable orientation of a document before attempting to extract text from the document, an OCR system can reduce the number of rotations that may be needed to orient the document in a manner that allows for usable text to be extracted from the document, [0018], As discussed above, the first rotation may be a 0 degree rotation of the image if image rotation analyzer 220 determines that the document depicted in the image is oriented horizontally and that the document is written in a left-to-right or right-to-left language, [0034], OCR engine 126 can use, in some cases, localization information identifying the direction in which text is written in a local language to determine whether the probable orientation of the document is a horizontal orientation (corresponding to the pair of rotations including a 0 degree rotation and a 180 degree rotation from the captured image) or a vertical orientation (corresponding to a pair of rotations including a 90 degree clockwise rotation and a 90 degree anticlockwise rotation from the captured image). For example, in a language that uses a right-to-left or left-to-right writing convention, OCR engine 126 can determine that the document depicted in an image is oriented horizontally if OCR engine 126 identifies a plurality of patterns of successive text boxes having a substantially similar Y-axis coordinate value, [0041]). Glass et al. and Yellapragada et al. and Dakin et al. and Lee et al. are in the same art of digital forms and documents (Glass et al., [0169]; Yellapragada et al., [0021]; Dakin et al., [0010]; Lee et al., abstract). The combination of Lee et al. with Glass et al. and Yellapragada et al. and Dakin et al. enables use of language predictors. It would have been obvious at the time of filing to one of ordinary skill in the art to combine the predictors of Lee et al. with the invention of Glass et al. and Yellapragada et al. and Dakin et al. as this was known at the time of filing, the combination would have predictable results, and as Lee et al. indicate “Accelerating the extraction of text from a document may reduce the amount of time that hardware components on a mobile device spend in an active state to obtain data from an image of a document and may improve the battery life of a mobile device on which an OCR process executes” ([0059]) therefore improving the processing time of Glass et al. and Yellapragada et al. and Dakin et al. Claim(s) 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Glass et al. (IDS: US 20040261016 A1) and Yellapragada et al. (IDS: US 20180032842 A1) and Dakin et al. (IDS: US 20160217119 A1) as applied to claim 14 above, further in view of Corocan et al. (US 10896357 B1). Regarding claim 15, Glass et al. and Yellapragada et al. and Dakin et al. disclose the computer-implemented method of claim 14. Glass et al. and Yellapragada et al. and Dakin et al. do not disclose the candidate label model determines each set of one or more candidate labels based on one or more key-value pairs represented by the data representing the one or more units of text extracted from the image of the document. Corocan et al. 
teach candidate label model determines each set of one or more candidate labels based on one or more key-value pairs represented by the data representing the one or more units of text extracted from the image of the document (A system for automatically creating extraction templates consisting of spatial coordinates tagged with semantic labels for novel layout structures using deep neural networks is disclosed herein. The system employs deep-learning based object-detection for creating templates on document images and a novel way of preprocessing the document image for object detection. The system processes document images to extract key/value pairs of interest from the document, col. 2, lines 60-68). Glass et al. and Yellapragada et al. and Dakin et al. and Corocan et al. are in the same art of digital forms and documents (Glass et al., [0169]; Yellapragada et al., [0021]; Dakin et al., [0010]; Corocan et al., abstract, col. 1, lines 5-10). The combination of Corocan et al. with Glass et al. and Yellapragada et al. and Dakin et al. enables the use of key value pairs. It would have been obvious at the time of filing to one of ordinary skill in the art to combine the key value pairs described by Corocan et al. with the invention of Glass et al. and Yellapragada et al. and Dakin et al. as this was known at the time of filing, the combination would have predictable results, and as Corocan et al. indicate “Accurate identification and extraction of data from business documents is an important aspect of computerized processing of business documents” (col. 1, lines 10-20) therefore improving the accuracy of the extraction of Glass et al. and Yellapragada et al. and Dakin et al. and expanding applications of the invention to additional business applications. Claim(s) 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Glass et al. (IDS: US 20040261016 A1) and Yellapragada et al. (IDS: US 20180032842 A1) and Dakin et al. (IDS: US 20160217119 A1) as applied to claim 2 above, further in view of Kopec et al. (US 5594809 A). Regarding claim 16, Glass et al. and Yellapragada et al. and Dakin et al. disclose the computer-implemented method of claim 2. Glass et al. and Yellapragada et al. and Dakin et al. do not explicitly disclose the set of candidate labels for each annotated value includes at least a first candidate label based on a first technique and at least a second candidate label based on a second technique. Kopec et al. teach a set of candidate labels for each annotated value includes at least a first candidate label based on a first technique and at least a second candidate label based on a second technique (The present invention encompasses two novel template training techniques. The first provides for the training of character templates defined according to the sidebearing model of letterform shape description and positioning and provides for the use of a text line image source of glyph samples and any form of an associated text line transcription, col. 13, lines 5-10, The second training technique of the present invention provides for the training of character templates defined according to any model of character letter spacing and positioning, including, for example templates defined according to the segment-based character template model; this second training technique specifically uses a tag text line transcription as the transcription associated with the text line input image, col. 
17, lines 15-25, The second training technique produces labeled glyph samples of a type needed to train the bitmapped character templates according to the template model provided. Thus, if segment-based character templates are being trained, the second technique may produce, as training data, isolated, labeled character images, bounding boxes specified around glyph samples in a text line image source, each labeled with a character label; or 2D regions of the input text line image with origin positions of glyph samples identified, each labeled with a character label. If sidebearing character templates are being trained, the second technique produces image coordinate positions indicating glyph sample image origin positions in the input text line image, each labeled with a character label identifying the character in the glyph sample character set represented by the glyph sample image origin position, col. 17, lines 45-60). Glass et al. and Yellapragada et al. and Dakin et al. and Kopec et al. are in the same art of digital forms and documents (Glass et al., [0169]; Yellapragada et al., [0021]; Dakin et al., [0010]; Kopec et al., abstract). The combination of Kopec et al. with Glass et al. and Yellapragada et al. and Dakin et al. enables the use of separate techniques. It would have been obvious at the time of filing to one of ordinary skill in the art to combine the techniques described by Kopec et al. with the invention of Glass et al. and Yellapragada et al. and Dakin et al. as this was known at the time of filing, the combination would have predictable results, and as Kopec et al. indicate “Another significant advantage of this first training technique is the flexibility available to a user in the selection of the text line transcription. For example, the technique may be implemented in such a manner as to permit the user to prepare a literal transcription that results in correct character labels being assigned to specific glyph samples; however, the technique may also be implemented in a much more general manner so as to permit the user to merely select a suitable transcription that contains the information needed by the formal line image source model to map character labels to glyph samples. Thus, this first aspect of the present invention provides for the use of a wider range of transcription types for training purposes than the one-for-one sequence of character labels used in conventional supervised training systems” (col. 13, lines 30-35) therefore improving the flexibility of Glass et al. and Yellapragada et al. and Dakin et al.. Claim(s) 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Glass et al. (IDS: US 20040261016 A1) and Yellapragada et al. (IDS: US 20180032842 A1) and Dakin et al. (IDS: US 20160217119 A1) as applied to claim 1 above, further in view of Gregory et al. (US 20190005020 A1). Regarding claim 18, Glass et al. and Yellapragada et al. and Dakin et al. disclose the computer-implemented method of claim 1. Glass et al. and Yellapragada et al. and Dakin et al. do not explicitly disclose each annotated value includes or is based on a first subset of the one or more units of text and wherein the label for each annotated value includes or is based on a second subset of the one or more units of text that is different from and non-overlapping with the first subset of the one or more units of text. Gregory et al. 
teach each annotated value includes or is based on a first subset of the one or more units of text and wherein the label for each annotated value includes or is based on a second subset of the one or more units of text that is different from and non-overlapping with the first subset of the one or more units of text (To train the SVM, annotated training data 238d may be used. The annotated training data 238d for training the SVM may include examples of positive and negative text segments (i.e., paragraphs with and without funding information 238c). For example, given a text document 238a, denoted as T, the automated extraction of funding information from the text translates into two separate tasks, (i) identify all text segments t∈T, which contain funding information, and (ii) process all the funding text segments t, in order to detect the set of the funding bodies, denoted as FB, and the sets of grants, denoted as GR that appear in the text. By using training data, which contains labeled examples of funding information, the step of classifying text sections as having funding information 238c or not may be viewed as a binary text classification problem. For example, the binary text classification problem may be defined such that given T and the set of all non-overlapping text segments t.sub.i, such that the ∪.sub.iT.sub.i=T (where t.sub.i∈T), a trained binary classifier can decide for the class label of t.sub.i, i.e., C.sub.ti=1, if t.sub.i contains funding information, or C.sub.ti=0 if not. That is, the trained binary classifier using an SVM, for example implemented by the classifier logic 244c, may be trained on annotated training data that contains example sections of text predefined as having funding information 238c and example sections of text predefined as not having funding information 238c. This allows the classifier logic 244c to distinguish between paragraphs that have and do not have funding information, [0038]). Glass et al. and Yellapragada et al. and Dakin et al. and Gregory et al. are in the same art of digital forms and documents (Glass et al., [0169]; Yellapragada et al., [0021]; Dakin et al., [0010]; Gregory et al., abstract). The combination of Gregory et al. with Glass et al. and Yellapragada et al. and Dakin et al. enables the use of non-overlapping sets. It would have been obvious at the time of filing to one of ordinary skill in the art to combine the non-overlapping sets described by Gregory et al. with the invention of Glass et al. and Yellapragada et al. and Dakin et al. as this was known at the time of filing, the combination would have predictable results, and as Gregory et al. indicate “Taken together, each of the processes implemented in the system and method of automatically extracting funding information from text improve detection accuracy and lower computational cost and time” ([0022]) therefore improving the accuracy of Glass et al. and Yellapragada et al. and Dakin et al.. Allowable Subject Matter Claim 17 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims AND UPON FILING OF A TERMINAL DISCLAIMER (or otherwise addressing the current non-statutory double patenting rejection). 
Conclusion The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: US 20140306992 A1: An image processing apparatus includes: an attaching unit that attaches an annotation to a diagnostic image acquired by imaging an object; a recording unit that records, in a storing unit along with an annotation, attribute information which is information on a predetermined attribute, as information related to the annotation; a searching unit that searches a plurality of positions where annotations are attached respectively in the diagnostic image, for a target position which is a position a user has an interest in; and a displaying unit that displays the search result by the searching unit on a display. The searching unit searches for the target position using a word included in the annotation or the attribute information as a key; US 12592116 B2: Although certain exemplary embodiments have been described as including more or less “fixed” elements at the top, left, and bottom of the screen, other arrangements may be provided for such information. The arrangement shown in FIG. 3 and discussed above works well for English- and other-language localities, e.g., because a user tends to focus on the main content provided in the approximate vertical center of the screen and to the right (e.g., because many languages are written from left-to-right and top-to-bottom). However, other language localities may move these fixed elements around on the screen and/or relative to one another. For instance, for Hebrew-language localities, the main navigation elements 302a-302d may be moved to the right of the screen, e.g., because the language is written from right-to-left and top-to-bottom. For Asian languages, where a column-based approach may be more appropriate based on writing styles, etc., a 90 degree rotation of the basic elements shown in FIG. 3 may be in order. FIG. 4 is an example search screen that may be used in connection with certain exemplary embodiments. The FIG. 4 example screen may be displayed when an initial search is requested, e.g., before any search criteria is entered by the user. In addition to including the main navigation elements 302a-302d (with the search element 302b being highlighted to indicate its selection), the upper status bar may still be provided; US 10445355 B2: The non-transitory, processor-readable storage medium further includes one or more programming instructions that, when executed, cause the processing device to access metadata for each document in a plurality of search results that corresponds to the search query, annotate one or more locations in each document with a first indicator for each of the one or more search terms in a first unit of the plurality of units and a second indicator for each of the one or more search terms in a second unit of the plurality of units based on the metadata, and display a visualizable results list. The visualizable results list includes the plurality of search results and a corresponding hit pattern for each document in the plurality of search results. The hit pattern includes one or more sections of the document, a first one or more hashes corresponding to each first indicator, and a second one or more hashes corresponding to each second indicator. 
The first one or more hashes and the second one or more hashes positioned within the hit pattern in one or more locations that correspond to the one or more locations of the search terms in the document; US 20070055926 A1: In one embodiment, the users can also define custom search terms to search for annotations. The custom search terms may be used with or without the pre-defined categories and sub-categories. As discussed above, in one embodiment, the users can highlight a portion of an electronic document to be used as a search parameter. For example, the location of the highlighted portion may be used to search for annotations associated with the highlighted portion. Thus, the users can define the exact set of annotations to search for such that they download annotations from the annotation database 120 that most closely match the search parameters, text position, and permissions granted by the annotation author; US 20190317985 A1: As further shown in FIG. 1G, and by reference numbers 155 and 185, the annotation platform may map the determined annotations with coordinates in the visual structural information (e.g., the second document layer). In some implementations, the annotation platform may search for the relevant information associated with the annotations, may match the relevant information with the visual structural information, and may map the matched annotations with coordinates in the visual structural information. For example, if the relevant information includes the text “test,” the annotation platform may search for and locate the first occurrence of the text “test” at a coordinate (e.g., x, which indicates a left, top corner of the text “test”) of the visual structural information. The annotation platform may also determine a width (w) and a height (h) of the first occurrence of the text “test” in the visual structural information. In some implementations, the annotation platform may continue this process until all annotations of the text “test” are mapped with coordinates, widths, and heights of the visual structural information; US 20050154703 A1: Once the search is executed successfully, the corresponding character string area (document portion) in the input document is extracted (S707) based upon the input document edit start position and the input document edit end position in the results data, ascertains the value (label) stored in the label field in the records searched from the reference source document/label correspondence data 105 (S708), the label thus obtained is attached to the extracted character string area (document portion) and the labeled document portion is stored into the labeling result storage unit 106 (S709). The data stored into the labeling result storage unit 106 may be the type of data such as that shown in FIG. 3 that allows the generation of an output document (see FIG. 9) by using the input document in response to an output requests, or they may be the type of data that can be directly output in response to an output request, as shown in FIG. 9. 
It is to be noted that if the data adopt the former mode, processing for extracting the input document edit start position and the input document edit end position in the result data is executed in step S707; US 20230274570 A1 [does not predate]: A method of automatic extraction of values includes scanning a document and linking a first word of a plurality of words in the document, to each word of the plurality of words in the document positionally adjacent to the right, left, top, bottom of the first word. The method further includes repeating said scanning and linking for every word of the plurality of words in the document. The method further includes determining the x and y coordinate in the document for each of the plurality of words. The method further includes providing a plurality of checklist words corresponding to information of interest. The method further includes searching the plurality of words in the document for each of the plurality of checklist words. The method further includes for each checklist word found in the plurality of words of the documents determining if a linked positionally adjacent words to the right, left, top, bottom is a value that matches typical values for the plurality of checklist words and if so, extract said value. The method further includes directing a user interface to position in the document related to content of interest related to at least one of the plurality of checklist words. Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHELLE M ENTEZARI HAUSMANN whose telephone number is (571)270-5084. The examiner can normally be reached 10-7 M-F. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vincent M Rudolph can be reached at (571) 272-8243. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /MICHELLE M ENTEZARI HAUSMANN/Primary Examiner, Art Unit 2671
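Illustrating the Cited Techniques

The sketches below do not appear in the Office action, in the cited references, or in the application; they are rough editorial illustrations of the quoted passages, with hypothetical names, data formats, and thresholds throughout.

The claim 6 rejection relies on Corocan et al.'s segment-overlap rule for picking the context segments around a given OCR segment (more than 50% axis overlap, closest in the relevant direction, ties broken by midpoint distance). A minimal Python sketch of the "immediately above" case, assuming image coordinates with y growing downward, could look like the following; left, right, and below context would use the same logic with the axes or directions exchanged.

    from dataclasses import dataclass
    from typing import List, Optional


    @dataclass
    class Segment:
        """Hypothetical OCR segment: digitized text plus 2-D page coordinates
        (x grows rightward, y grows downward, as in typical image coordinates)."""
        text: str
        x0: float
        y0: float
        x1: float
        y1: float

        @property
        def x_mid(self) -> float:
            return (self.x0 + self.x1) / 2.0


    def x_overlap_ratio(a: Segment, b: Segment) -> float:
        """Fraction of the narrower segment's width shared by the two segments."""
        shared = min(a.x1, b.x1) - max(a.x0, b.x0)
        if shared <= 0:
            return 0.0
        return shared / min(a.x1 - a.x0, b.x1 - b.x0)


    def pick_up_context(given: Segment, candidates: List[Segment]) -> Optional[Segment]:
        """Pick the segment 'immediately above' the given one: more than 50%
        X-axis overlap, vertically closest, ties broken by X-midpoint distance."""
        above = [c for c in candidates
                 if c.y1 <= given.y0 and x_overlap_ratio(given, c) > 0.5]
        if not above:
            return None
        return min(above, key=lambda c: (given.y0 - c.y1, abs(c.x_mid - given.x_mid)))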
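The claim 10 rejection cites Lee et al. for predicting a document's probable orientation from the direction in which its language is written before attempting full text extraction, so that fewer rotations have to be tried. A toy version of that idea, assuming only coarse (x, y) positions of detected text boxes and an illustrative tolerance, might be:

    def probable_rotation_order(boxes, y_tolerance=5.0):
        """Guess which rotations to try first from coarse text-box positions.

        boxes: list of (x, y) tuples for detected text boxes, in reading order.
        Heuristic (illustrative only): if successive boxes mostly share a
        similar Y coordinate, lines run horizontally, so try the 0/180 degree
        rotations first; otherwise try the 90/270 degree rotations first.
        """
        if len(boxes) < 2:
            return [0, 180, 90, 270]
        horizontal_pairs = sum(
            1 for (_, y_a), (_, y_b) in zip(boxes, boxes[1:])
            if abs(y_a - y_b) <= y_tolerance
        )
        vertical_pairs = len(boxes) - 1 - horizontal_pairs
        if horizontal_pairs >= vertical_pairs:
            return [0, 180, 90, 270]
        return [90, 270, 0, 180]

An OCR pass would then attempt extraction in that order and stop at the first rotation that yields usable text, rather than always stepping through all four rotations.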
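The claim 18 rejection quotes Gregory et al.'s framing of the task as binary classification over non-overlapping text segments, with positive and negative labeled examples. A generic sketch of that setup follows; scikit-learn is used only as a convenient stand-in and is not prescribed by the cited reference, and the training sentences are invented.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    # Invented training data: non-overlapping paragraphs labeled 1 if they
    # contain the information of interest (here, funding statements), else 0.
    train_segments = [
        "This work was supported by grant ABC-123 from the Example Foundation.",
        "Figure 2 shows the measured response of the system over time.",
    ]
    train_labels = [1, 0]

    # TF-IDF features feeding a linear SVM, mirroring the binary text
    # classification framing quoted in the rejection.
    classifier = make_pipeline(TfidfVectorizer(), LinearSVC())
    classifier.fit(train_segments, train_labels)

    # Each new segment t_i receives a class label C_ti in {0, 1}.
    print(classifier.predict(["The authors acknowledge funding from grant XYZ-9."]))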
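Among the pertinent-but-not-relied-upon references, US 20230274570 A1 describes linking each word to its positionally adjacent neighbors and then searching those neighbors for values matching a checklist of words of interest. The sketch below collapses the reference's right/left/top/bottom linking into a single "nearest word to the right on the same line" rule purely for illustration; the word format, tolerances, and value pattern are assumptions.

    import re

    VALUE_PATTERN = re.compile(r"^\$?\d[\d,.]*$")  # crude "looks like a value" test


    def extract_checklist_values(words, checklist, same_line_tolerance=10.0):
        """words: list of dicts like {"text": str, "x": float, "y": float},
        one coarse coordinate per word. For each checklist word found in the
        document, return the nearest word to its right that looks numeric."""
        found = {}
        for word in words:
            if word["text"].lower() not in checklist:
                continue
            to_the_right = [
                other for other in words
                if abs(other["y"] - word["y"]) <= same_line_tolerance
                and other["x"] > word["x"]
            ]
            for candidate in sorted(to_the_right, key=lambda o: o["x"] - word["x"]):
                if VALUE_PATTERN.match(candidate["text"]):
                    found[word["text"]] = candidate["text"]
                    break
        return found

    # Hypothetical usage with OCR output already flattened to word coordinates:
    # extract_checklist_values(ocr_words, checklist={"total", "date", "invoice"})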

Prosecution Timeline

May 22, 2024
Application Filed
Apr 04, 2026
Non-Final Rejection — §101, §103, §DP (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602775
INTERPOLATION OF MEDICAL IMAGES
2y 5m to grant • Granted Apr 14, 2026
Patent 12602793
Systems and Methods for Predicting Object Location Within Images and for Analyzing the Images in the Predicted Location for Object Tracking
2y 5m to grant • Granted Apr 14, 2026
Patent 12602949
SYSTEM AND METHOD FOR DETECTING HUMAN PRESENCE BASED ON DEPTH SENSING AND INERTIAL MEASUREMENT
2y 5m to grant • Granted Apr 14, 2026
Patent 12597261
OBJECT MOVEMENT BEHAVIOR LEARNING
2y 5m to grant • Granted Apr 07, 2026
Patent 12597244
METHOD AND DEVICE FOR IMPROVING OBJECT RECOGNITION RATE OF SELF-DRIVING CAR
2y 5m to grant • Granted Apr 07, 2026
Study what changed to get past this examiner, based on the 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
76%
Grant Probability
98%
With Interview (+21.6%)
3y 1m
Median Time to Grant
Low
PTA Risk
Based on 863 resolved cases by this examiner. Grant probability derived from career allow rate.
