DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Applicant’s claim for the benefit of a prior-filed application under 35 U.S.C. 119(e) or under 35 U.S.C. 120, 121, 365(c), or 386(c) is acknowledged.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 08/10/2023, 10/09/2023, and 09/08/2025 have been received. The submissions are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional, the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to a final Office action, see 37 CFR 1.113(c). A request for reconsideration, while not provided for in 37 CFR 1.113(c), may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.
Claims 1, 6-7, 11, and 16-17 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 3-4, 10, and 12-13 of U.S. Patent No. 11,315,353 in view of CRISTESCU (US PGPUB: US 20210012102 A1).
Claim chart: U.S. Patent No. 11,315,353 B1 versus Instant Application No. 18/339,631. Each patented claim is reproduced first (labeled "Patent claim"), followed by the corresponding claim of the instant application (labeled "Instant claim").
Patent claim 1: A system configured for spatial-aware information extraction from electronic source documents, the system comprising:
one or more hardware processors configured by machine-readable instructions to:
obtain an electronic source document in electronic format,
wherein the electronic format is such that, upon presentation of the electronic source document through a user interface associated with a client computing platform,
the presentation includes human-readable information,
wherein the human-readable information includes a first group of characters and a second group of characters;
obtain extracted information that has been extracted or derived from the electronic source document,
wherein the extracted information includes: (i) sets of extracted characters and corresponding extracted spatial information, wherein the sets include a first set of extracted characters and a second set of extracted characters, wherein the first set of extracted characters corresponds to the first group of characters in the human-readable information,
and wherein the second set of extracted characters corresponds to the second group of characters of the human-readable information; (ii) sets of line segments and corresponding spatial line information;
generate a character-based representation of the electronic source document based on the extracted information,
wherein the character-based representation uses a grid of character positions,
wherein the character-based representation includes the first set of extracted characters and the second set of extracted characters positioned within the grid of character positions, wherein a first relative positioning corresponds to a second relative positioning,
wherein the first relative positioning is between (a) the first group of characters in the human-readable information and (b) the second group of characters in the human-readable information, and wherein the second relative positioning is between (c) the first set of extracted characters in the character-based representation and (d) the second set of extracted characters in the character-based representation; and present a user interface on the client computing platform to the user,
wherein the user interface is configured to enable the user, through user input, to: (i) perform a search operation in a portion of the grid of character positions, and (ii) perform a cropping operation on the electronic source document based on a result of the search operation.
Instant claim 1: A system configured for spatial-aware information extraction from electronic source documents, the system comprising:
one or more hardware processors configured by machine-readable instructions to:
obtain an electronic source document in electronic format,
wherein the electronic format is such that, upon presentation of the electronic source document through a particular user interface associated with a client computing platform,
the presentation includes human-readable information,
wherein the human-readable information includes a first group of characters and a second group of characters,
wherein the first group of characters and the second group of characters are positioned relative to each other according to a first relative positioning;
and generate a character-based representation of the electronic source document based on extracted information,
wherein the extracted information includes a first set of extracted characters and a second set of extracted characters that have been extracted from the electronic source document at least in part by using machine-learning techniques,
wherein the first set of extracted characters and the second set of extracted characters are positioned relative to each other according to a second relative positioning,
wherein the character-based representation uses a grid of character positions in which the first set of extracted characters and the second set of extracted characters are positioned, and wherein the second relative positioning corresponds to the first relative positioning.
Patent claim 3: wherein the electronic source documents include electronic files including scanned documents.
Instant claim 6: wherein the electronic source documents include electronic files including scanned documents.
Patent claim 4: wherein the first group of characters in the human-readable information include one or more of words, numbers, names, and dates.
Instant claim 7: wherein the first group of characters in the human-readable information include one or more of words, numbers, names, and dates.
Patent claim 10: same limitations as the claim 1 mapping above.
Instant claim 11: same limitations as the claim 1 mapping above.
Patent claim 12: wherein the electronic source documents include electronic files including scanned documents.
Instant claim 16: wherein the electronic source documents include electronic files including scanned documents.
Patent claim 13: wherein the first group of characters in the human-readable information include one or more of words, numbers, names, and dates.
Instant claim 17: wherein the first group of characters in the human-readable information include one or more of words, numbers, names, and dates.
As to independent claims 1 and 11, the only difference between instant application 18/339,631 and claims 1, 3-4, 10, and 12-13 of U.S. Patent No. 11,315,353 is the limitation: “wherein the extracted information includes a first set of extracted characters and a second set of extracted characters that have been extracted from the electronic source document at least in part by using machine-learning techniques,”
CRISTESCU teaches: wherein the extracted information includes a first set of extracted characters and a second set of extracted characters that have been extracted from the electronic source document at least in part by using machine-learning techniques, (CRISTESCU − [0050] In the exemplary configuration of FIG. 4, a text feature extractor 44 receives text token 30 from OCR engine 42 and outputs text feature vector 62 characterizing text token 30. FIG. 10 further illustrates the operation of extractor 44. [0052] Text feature extractor 44 may further comprise a text convolver 57 which may be structured as a convolutional neural network. In one example illustrated in FIG. 10,)
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the claims of U.S. Patent No. 11,315,353 with the teachings of CRISTESCU, because doing so would allow a convolutional neural network to identify and extract characters through classification with feature vectors, thereby improving the accuracy of textual information extraction from electronic documents.
Claims 1, 6-7, 10-11, 16-17, and 20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 5-6, 9-10, 14-15, and 18 of U.S. Patent No. 11,715,318 B2 in view of CRISTESCU (US PGPUB: US 20210012102 A1).
Claim chart: U.S. Patent No. 11,715,318 B2 versus Instant Application No. 18/339,631. Each patented claim is reproduced first (labeled "Patent claim"), followed by the corresponding claim of the instant application (labeled "Instant claim").
Patent claim 1: A system configured for spatial-aware information extraction from electronic source documents, the system comprising:
one or more hardware processors configured by machine-readable instructions to:
obtain an electronic source document in electronic format,
wherein the electronic format is such that, upon presentation of the electronic source document through a particular user interface associated with a client computing platform,
the presentation includes human-readable information,
wherein the human-readable information includes a first group of characters and a second group of characters;
obtain extracted information that is based on the electronic source document, wherein the extracted information includes sets of extracted characters and corresponding extracted spatial information, wherein the sets include a first set of extracted characters and a second set of extracted characters, wherein the first set of extracted characters corresponds to the first group of characters in the human-readable information, and wherein the second set of extracted characters corresponds to the second group of characters of the human-readable information;
generate a character-based representation of the electronic source document based on the extracted information,
wherein the character-based representation uses a grid of character positions,
wherein the character-based representation includes the first set of extracted characters and the second set of extracted characters positioned within the grid of character positions,
wherein a first relative positioning is between (a) the first group of characters in the human-readable information and (b) the second group of characters in the human-readable information, wherein a second relative positioning is between (c) the first set of extracted characters in the character-based representation and (d) the second set of extracted characters in the character-based representation, and wherein the first relative positioning corresponds to the second relative positioning;
and present a user interface to the user, wherein the user interface is configured to enable the user to perform a search operation in a portion of the grid of character positions, such that presenting a result of the search operation includes performance of a cropping operation on at least one of the electronic source document and the character-based representation of the electronic source document.
Instant claim 1: A system configured for spatial-aware information extraction from electronic source documents, the system comprising:
one or more hardware processors configured by machine-readable instructions to:
obtain an electronic source document in electronic format,
wherein the electronic format is such that, upon presentation of the electronic source document through a particular user interface associated with a client computing platform,
the presentation includes human-readable information,
wherein the human-readable information includes a first group of characters and a second group of characters,
wherein the first group of characters and the second group of characters are positioned relative to each other according to a first relative positioning;
and generate a character-based representation of the electronic source document based on extracted information,
wherein the extracted information includes a first set of extracted characters and a second set of extracted characters that have been extracted from the electronic source document at least in part by using machine-learning techniques,
wherein the first set of extracted characters and the second set of extracted characters are positioned relative to each other according to a second relative positioning,
wherein the character-based representation uses a grid of character positions in which the first set of extracted characters and the second set of extracted characters are positioned,
and wherein the second relative positioning corresponds to the first relative positioning.
Patent claim 5: wherein the electronic source documents include electronic files including scanned documents.
Instant claim 6: wherein the electronic source documents include electronic files including scanned documents.
Patent claim 6: wherein the first group of characters in the human-readable information include one or more of words, numbers, names, and dates.
Instant claim 7: wherein the first group of characters in the human-readable information include one or more of words, numbers, names, and dates.
Patent claim 9: wherein the user interface is configured to enable the user, through user input, to perform two or more search operations in a portion of the grid of character positions, and perform a cropping operation on the electronic source document, wherein the cropping operation is based on results of the two or more search operations.
Instant claim 10: wherein the user interface is configured to enable the user, through user input, to perform two or more search operations in a portion of the grid of character positions, and perform a cropping operation on the electronic source document, wherein the cropping operation is based on results of the two or more search operations.
Patent claim 10: same limitations as the claim 1 mapping above.
Instant claim 11: same limitations as the claim 1 mapping above.
Patent claim 14: wherein the electronic source documents include electronic files including scanned documents.
Instant claim 16: wherein the electronic source documents include electronic files including scanned documents.
Patent claim 15: wherein the first group of characters in the human-readable information include one or more of words, numbers, names, and dates.
Instant claim 17: wherein the first group of characters in the human-readable information include one or more of words, numbers, names, and dates.
Patent claim 18: wherein the user interface enables the user, through user input, to perform two or more search operations in a portion of the grid of character positions, and perform a cropping operation on the electronic source document, wherein the cropping operation is based on results of the two or more search operations.
Instant claim 20: wherein the user interface enables the user, through user input, to perform two or more search operations in a portion of the grid of character positions, and perform a cropping operation on the electronic source document, wherein the cropping operation is based on results of the two or more search operations.
As to independent claims 1 and 11, the only difference between instant application 18/339,631 and claims 1, 5-6, 9-10, 14-15, and 18 of U.S. Patent No. 11,715,318 is the limitation: “wherein the extracted information includes a first set of extracted characters and a second set of extracted characters that have been extracted from the electronic source document at least in part by using machine-learning techniques,”
CRISTESCU teaches: wherein the extracted information includes a first set of extracted characters and a second set of extracted characters that have been extracted from the electronic source document at least in part by using machine-learning techniques, (CRISTESCU − [0050] In the exemplary configuration of FIG. 4, a text feature extractor 44 receives text token 30 from OCR engine 42 and outputs text feature vector 62 characterizing text token 30. FIG. 10 further illustrates the operation of extractor 44. [0052] Text feature extractor 44 may further comprise a text convolver 57 which may be structured as a convolutional neural network. In one example illustrated in FIG. 10,)
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the claims of U.S. Patent No. 11,715,318 with the teachings of CRISTESCU, because doing so would allow a convolutional neural network to identify and extract characters through classification with feature vectors, thereby improving the accuracy of textual information extraction from electronic documents.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
This application currently names joint inventors. In considering patentability of the claims, the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claim(s) 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over CRISTESCU (US PGPUB: US 20210012102 A1) in view of Huang (US PGPUB: US 20090132590 A1).
Regarding independent claim 1, CRISTESCU teaches: A system configured for spatial-aware information extraction from electronic source documents,
the system comprising: one or more hardware processors configured by machine-readable instructions to:
obtain an electronic source document in electronic format, (CRISTESCU − [0029] Document image 20 comprises an encoding of an optical image of a printed document. Image 20 may be acquired using an imaging device 12 (FIG. 1) which may be of any type known in the art (e.g., scanner, digital camera, etc.). The format, size, and encoding of image 20 may vary among embodiments. [0037] In some embodiments, data scraping engine 40 includes an optical character recognition (OCR) engine 42 configured to receive document image 20 and to extract a set of text tokens 30 from image 20.) Note: a scanned document, or a digital image from a digital camera.
wherein the electronic format is such that, upon presentation of the electronic source document through a particular user interface associated with a client computing platform, the presentation includes human-readable information, (CRISTESCU − [0032] FIGS. 3-A-B show an exemplary invoice 24a and receipt 24b, respectively, [0027] Fig. 1, Fig. 2 Client systems 10a-c generically represent any computing appliance comprising a processor, a memory unit, and a communication interface. Exemplary client systems 10a-c include a corporate mainframe computer, a personal computer, a mobile computing device (e.g., tablet computer, laptop computer), a mobile telecommunications device (e.g., smartphone), a digital camera, a media player, and a wearable computing device (e.g., smartwatch), among others.)
wherein the human-readable information includes a first group of characters and a second group of characters, (CRISTESCU − [0032] Fig. 3-A-B, respectively, having a set of text fields 32a-f of various field types. In the case of an invoice, exemplary field types may include, among others: Vendor name, Vendor address, Buyer name, Billing address, Shipping address, Invoice number, Purchase order number, Invoice date, Tax due, Total due, Payment terms, Currency, Item description, Item quantity, Item unit price, Item line amount, Item purchase order number, Item number, and Item part number.) Note: element 32a is the first group of characters; element 32b is the second group of characters.
wherein the first group of characters and the second group of characters are positioned relative to each other according to a first relative positioning; (CRISTESCU − [0032], [0037], Fig. 5, Fig. 7: A token bounding box is herein defined as a geometric shape fully enclosing a region of document image 20 containing text token 30, i.e., all pixels belonging to the respective text token are inside the token bounding box.) Note: in Fig. 7, element 34b (the first group of characters) and element 34c (the second group of characters) are positioned relative to each other.
and generate a character-based representation of the electronic source document based on extracted information, (CRISTESCU − [0022] transmits a document content indicator 22 back to the respective client system. [0034] return a content of the respective fields as document content indicator 22. Other embodiments may format document content indicator 22 as a table (e.g., comma-separated values—CSV) or may use some proprietary data format such as Microsoft Excel®)
wherein the extracted information includes a first set of extracted characters and a second set of extracted characters that have been extracted from the electronic source document at least in part by using machine-learning techniques, (CRISTESCU − [0050] In the exemplary configuration of FIG. 4, a text feature extractor 44 receives text token 30 from OCR engine 42 and outputs text feature vector 62 characterizing text token 30. FIG. 10 further illustrates the operation of extractor 44. [0052] Text feature extractor 44 may further comprise a text convolver 57 which may be structured as a convolutional neural network. In one example illustrated in FIG. 10,)
wherein the first set of extracted characters and the second set of extracted characters are positioned relative to each other according to a second relative positioning, (CRISTESCU − [0032], [0037], Fig. 5, Fig. 7: A token bounding box is herein defined as a geometric shape fully enclosing a region of document image 20 containing text token 30, i.e., all pixels belonging to the respective text token are inside the token bounding box.) Note: in Fig. 7, element 34b (the first group of characters) and element 34c (the second group of characters) are positioned relative to each other.
CRISTESCU does not explicitly teach: wherein the character-based representation uses a grid of character positions in which the first set of extracted characters and the second set of extracted characters are positioned, and wherein the second relative positioning corresponds to the first relative positioning.
However, Huang teaches: wherein the character-based representation uses a grid of character positions in which the first set of extracted characters and the second set of extracted characters are positioned, and wherein the second relative positioning corresponds to the first relative positioning. (Huang − [0044] Figs. 5A-5C button 510 activates the OCR process that reads the business card file in an image format and displays it in a grid mode 512 according to each character's position in the business card; [0045] In FIG. 5B an image of the business card is shown as item 511; [0046] FIG. 5C, each character of the original business card is displayed in a grid according to its corresponding position in the business card)
Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date, to have combined the teachings of CRISTESCU and Huang, as both inventions relate to OCR of document images for display within a user interface. Adding the teaching of Huang provides an interface for displaying the extracted characters within a grid. Therefore, the motivation to combine is to improve the user's ability to validate the completeness and correctness of the OCR results.
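For illustration of the claimed "grid of character positions" and the preserved relative positioning discussed above, the following is a minimal Python sketch. It is offered as a hypothetical illustration only: the grid pitch, token coordinates, and function names are the examiner's assumptions and do not reproduce the instant application's or either cited reference's actual implementation.

```python
# Minimal sketch of a "grid of character positions": OCR tokens with pixel
# coordinates are placed into a fixed-pitch character grid so that the
# relative positioning of the extracted text mirrors the source document.
# All names and values are hypothetical illustrations.

CELL_W, CELL_H = 8, 16  # assumed pixel pitch of one character cell

def build_grid(tokens, page_w, page_h):
    """tokens: list of (text, x, y) with x, y in pixels (top-left of token)."""
    cols, rows = page_w // CELL_W, page_h // CELL_H
    grid = [[" "] * cols for _ in range(rows)]
    for text, x, y in tokens:
        row, col = y // CELL_H, x // CELL_W
        for i, ch in enumerate(text):
            if row < rows and col + i < cols:
                grid[row][col + i] = ch
    return ["".join(r) for r in grid]

# Two token groups whose on-page relative positioning (first above and left
# of the second) is preserved in the character-based representation.
page = build_grid([("INVOICE NO. 1234", 40, 32),
                   ("TOTAL DUE $56.78", 200, 96)], 640, 160)
print("\n".join(page))
```

Running the sketch prints a text page in which the first character group remains above and to the left of the second, mirroring the source layout.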
Regarding dependent claim 2, which depends on claim 1, CRISTESCU teaches: wherein the first set of extracted characters corresponds to the first group of characters in the human-readable information, and wherein the second set of extracted characters corresponds to the second group of characters in the human-readable information. (CRISTESCU − [0032] Fig. 3-A-B, respectively, having a set of text fields 32a-f of various field types. In the case of an invoice, exemplary field types may include, among others: Vendor name, Vendor address, Buyer name, Billing address, Shipping address, Invoice number, Purchase order number, Invoice date, Tax due, Total due, Payment terms, Currency, Item description, Item quantity, Item unit price, Item line amount, Item purchase order number, Item number, and Item part number.)
Regarding dependent claim 3, which depends on claim 1, CRISTESCU teaches: wherein the first relative positioning is such that the first group of characters is positioned above the second group of characters, and wherein the second relative positioning is such that the first set of extracted characters is positioned above the second set of extracted characters. (CRISTESCU − [0032], [0037], Fig. 5, Fig. 7: A token bounding box is herein defined as a geometric shape fully enclosing a region of document image 20 containing text token 30, i.e., all pixels belonging to the respective text token are inside the token bounding box.) Note: in Fig. 7, first group element 34b is positioned above second group element 34c.
Regarding dependent claim 4, which depends on claim 1, CRISTESCU teaches: wherein the first relative positioning is such that the first group of characters is positioned left of the second group of characters, and wherein the second relative positioning is such that the first set of extracted characters is positioned left of the second set of extracted characters. (CRISTESCU − [0032], [0037], Fig. 5, Fig. 7: A token bounding box is herein defined as a geometric shape fully enclosing a region of document image 20 containing text token 30, i.e., all pixels belonging to the respective text token are inside the token bounding box.) Note: in Fig. 7, first group element 34b is positioned left of second group element 34c.
Regarding dependent claim 5, which depends on claim 1, CRISTESCU teaches: wherein the one or more hardware processors are further configured to: present a user interface to the user, (CRISTESCU − [0032] FIGS. 3-A-B show an exemplary invoice 24a and receipt 24b, respectively, [0027] Fig. 1, Fig. 2 Client systems 10a-c generically represent any computing appliance comprising a processor, a memory unit, and a communication interface. Exemplary client systems 10a-c include a corporate mainframe computer, a personal computer, a mobile computing device (e.g., tablet computer, laptop computer), a mobile telecommunications device (e.g., smartphone), a digital camera, a media player, and a wearable computing device (e.g., smartwatch), among others.)
CRISTESCU does not explicitly teach: wherein the user interface is configured to enable the user to perform a search operation in a portion of the grid of character positions,
However, Huang teaches: wherein the user interface is configured to enable the user to perform a search operation in a portion of the grid of character positions, (Huang − [0027] FIG. 3 illustrates a method for searching information from the relational database of FIG. 2 according to an embodiment of the present invention. In this example, the method starts in block 302 where the user enters a search query)
such that presenting a result of the search operation includes performance of a cropping operation on at least one of the electronic source document and the character-based representation of the electronic source document. (Huang − [0045] In FIG. 5B an image of the business card is shown as item 511; [0046] FIG. 5C, each character of the original business card is displayed in a grid according to its corresponding position in the business card; [0049] In the server, the business card information can be indexed for support of subsequent searching by the user.) Note: the cropping operation corresponds to retrieving data from the source; the search operation corresponds to searching the indexed business card information.
Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date, to have combined the teachings of CRISTESCU and Huang, as both inventions relate to OCR of document images for display within a user interface. Adding the teaching of Huang provides an interface for displaying the extracted characters within a grid. Therefore, the motivation to combine is to improve the user's ability to validate the completeness and correctness of the OCR results.
Regarding dependent claim 6, which depends on claim 1, CRISTESCU teaches: wherein the electronic source documents include electronic files including scanned documents. (CRISTESCU − [0029] Document image 20 comprises an encoding of an optical image of a printed document. Image 20 may be acquired using an imaging device 12 (FIG. 1) which may be of any type known in the art (e.g., scanner, digital camera, etc.). The format, size, and encoding of image 20 may vary among embodiments. [0037] In some embodiments, data scraping engine 40 includes an optical character recognition (OCR) engine 42 configured to receive document image 20 and to extract a set of text tokens 30 from image 20.)
Regarding dependent claim 7, which depends on claim 1, CRISTESCU teaches: wherein the first group of characters in the human-readable information include one or more of words, numbers, names, and dates. (CRISTESCU − [0032] FIGS. 3-A-B show an exemplary invoice 24a and receipt 24b, respectively, having a set of text fields 32a-f of various field types. In the case of an invoice, exemplary field types may include, among others: Vendor name, Vendor address, Buyer name, Billing address, Shipping address, Invoice number, Purchase order number, Invoice date, Tax due, Total due, Payment terms, Currency, Item description, Item quantity, Item unit price, Item line amount, Item purchase order number, Item number, and Item part number. In the example of FIG. 3-A, field 32a contains a billing name and address, field 32b contains an invoice date, items 32c-d contain an item description and a total amount due, respectively.) Note: element 30d shows dates; element 32a shows words, numbers, and a name.
Regarding dependent claim 8, which depends on claim 1, CRISTESCU teaches: wherein the extracted information has been extracted from the electronic source document at least in part by using deep learning techniques. (CRISTESCU − [0050] In the exemplary configuration of FIG. 4, a text feature extractor 44 receives text token 30 from OCR engine 42 and outputs text feature vector 62 characterizing text token 30. FIG. 10 further illustrates the operation of extractor 44. [0052] Text feature extractor 44 may further comprise a text convolver 57 which may be structured as a convolutional neural network. In one example illustrated in FIG. 10,) Note: a CNN is a type of deep-learning algorithm.
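As context for mapping the "deep learning" limitation to CRISTESCU's text convolver ([0052]), the following is a minimal character-level convolution sketch in Python/NumPy. The embedding width, kernel size, and random weights are hypothetical assumptions for illustration; CRISTESCU's actual network is not reproduced here.

```python
import numpy as np

# Minimal character-level convolution in the spirit of a "text convolver"
# producing a feature vector for an OCR text token; dimensions and random
# weights are hypothetical, not taken from the reference.
rng = np.random.default_rng(0)
EMB, KERNEL, FILTERS = 16, 3, 32
embed = rng.normal(size=(128, EMB))           # one embedding per ASCII code
weights = rng.normal(size=(KERNEL * EMB, FILTERS))

def text_feature_vector(token: str) -> np.ndarray:
    """Embed characters, convolve with width-3 filters, max-pool over time."""
    x = embed[[min(ord(c), 127) for c in token]]             # (len, EMB)
    windows = np.stack([x[i:i + KERNEL].ravel()              # sliding windows
                        for i in range(len(token) - KERNEL + 1)])
    return np.maximum(windows @ weights, 0).max(axis=0)      # ReLU + max-pool

print(text_feature_vector("Invoice date: 2023-08-10").shape)  # (32,)
```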
Regarding dependent claim 9, which depends on claim 1, CRISTESCU teaches: wherein the extracted information for the first set of extracted characters includes a first set of spatial coordinates that indicate a first spatial position in the electronic source document, (CRISTESCU − [0037], [0041]-[0042]: Token box indicator 31 may comprise a set of coordinates of vertices of bounding box 34, ordered according to a pre-determined rule (for instance, counter-clockwise, starting with the lower-left vertex). Vertex coordinates X and Y may be expressed in image pixels or may be determined as a fraction of the image size along the respective direction (e.g., for an image 800 pixels wide, a coordinate X=0.1 may indicate a position located 80 pixels from the left edge of image 20). In an alternative embodiment, bounding box 34 may be specified as a tuple {X, Y, w, h}, wherein X and Y denote coordinates of a vertex 36, and w and h denote a width and a height of box 34, respectively.)
and wherein the first set of spatial coordinates corresponds to the first set of textual coordinates. (CRISTESCU − [0037], [0041]-[0042]: Token box indicator 31 may comprise a set of coordinates of vertices of bounding box 34, ordered according to a pre-determined rule (for instance, counter-clockwise, starting with the lower-left vertex). Vertex coordinates X and Y may be expressed in image pixels or may be determined as a fraction of the image size along the respective direction (e.g., for an image 800 pixels wide, a coordinate X=0.1 may indicate a position located 80 pixels from the left edge of image 20). In an alternative embodiment, bounding box 34 may be specified as a tuple {X, Y, w, h}, wherein X and Y denote coordinates of a vertex 36, and w and h denote a width and a height of box 34, respectively.)
CRISTESCU does not explicitly teach: grid of character positions,
However, Huang teaches: wherein the first set of characters in the character-based representation is associated with a first set of textual coordinates in the grid of character positions, (Huang − [0044] Figs. 5A-5C button 510 activates the OCR process that reads the business card file in an image format and displays it in a grid mode 512 according to each character's position in the business card; [0045] In FIG. 5B an image of the business card is shown as item 511; [0046] FIG. 5C, each character of the original business card is displayed in a grid according to its corresponding position in the business card)
Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date, to have combined the teachings of CRISTESCU and Huang, as both inventions relate to OCR of document images for display within a user interface. Adding the teaching of Huang provides an interface for displaying the extracted characters within a grid. Therefore, the motivation to combine is to improve the user's ability to validate the completeness and correctness of the OCR results.
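A short worked example of the coordinate conventions relied on above: CRISTESCU [0037] expresses bounding boxes either as fractional coordinates (X=0.1 on an 800-pixel-wide image is 80 pixels) or as a {X, Y, w, h} tuple, and claim 9 requires that such spatial coordinates correspond to textual coordinates in the grid. The Python sketch below illustrates both; the grid pitch is a hypothetical assumption.

```python
# Worked sketch of the coordinate conventions in CRISTESCU [0037] and the
# claimed correspondence between spatial and textual coordinates. The grid
# pitch (8 x 16 pixels per character cell) is assumed for illustration.

def frac_to_pixels(x_frac, y_frac, img_w, img_h):
    """Fractional coordinates: X=0.1 on an 800-pixel-wide image is 80 px."""
    return round(x_frac * img_w), round(y_frac * img_h)

def spatial_to_textual(box, cell_w=8, cell_h=16):
    """Map a {X, Y, w, h} bounding box to (row, col) in a character grid."""
    x, y, w, h = box
    return y // cell_h, x // cell_w

print(frac_to_pixels(0.1, 0.25, 800, 600))     # (80, 150), per [0037]'s example
print(spatial_to_textual((80, 150, 120, 16)))  # textual coordinates (9, 10)
```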
Regarding dependent claim 10, which depends on claim 1, CRISTESCU does not explicitly teach: wherein the user interface is configured to enable the user, through user input, to perform two or more search operations in a portion of the grid of character positions, and perform a cropping operation on the electronic source document, wherein the cropping operation is based on results of the two or more search operations.
However, Huang teaches: wherein the user interface is configured to enable the user, through user input, to perform two or more search operations in a portion of the grid of character positions, and perform a cropping operation on the electronic source document, wherein the cropping operation is based on results of the two or more search operations. (Huang − [0027] FIG. 3 illustrates a method for searching information from the relational database of FIG. 2 according to an embodiment of the present invention. In this example, the method starts in block 302 where the user enters a search query; [0045] In FIG. 5B an image of the business card is shown as item 511; [0046] FIG. 5C, each character of the original business card is displayed in a grid according to its corresponding position in the business card; [0049] In the server, the business card information can be indexed for support of subsequent searching by the user.) Note: the cropping operation corresponds to retrieving data from the source; the search operation corresponds to searching the indexed business card information.
Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date, to have combined the teachings of CRISTESCU and Huang, as both inventions relate to OCR of document images for display within a user interface. Adding the teaching of Huang provides an interface for displaying the extracted characters within a grid. Therefore, the motivation to combine is to improve the user's ability to validate the completeness and correctness of the OCR results.
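To make concrete the flow recited in claims 10 and 20 (two or more search operations over a portion of the grid, followed by a cropping operation based on both results), the following Python sketch is offered as a hypothetical illustration; the grid pitch, page-bitmap stand-in, and query lengths are the examiner's assumptions, not either reference's implementation.

```python
# Hypothetical sketch: two search operations over portions of the character
# grid, then one cropping operation on the source document based on the
# pixel region spanned by both results.

CELL_W, CELL_H = 8, 16  # assumed grid pitch, as in the earlier sketch

def search_grid(grid, query, rows):
    """Search `query` within a portion (a row range) of the character grid."""
    return [(r, grid[r].index(query)) for r in rows if query in grid[r]]

def crop(image, hits, qlen):
    """Crop the source document to the pixel region spanned by all hits."""
    top = min(r for r, _ in hits) * CELL_H
    bottom = (max(r for r, _ in hits) + 1) * CELL_H
    left = min(c for _, c in hits) * CELL_W
    right = (max(c for _, c in hits) + qlen) * CELL_W
    return [row[left:right] for row in image[top:bottom]]

grid = ["".ljust(80)] * 10
grid[2] = "INVOICE NO. 1234".ljust(80)
grid[6] = (" " * 25 + "TOTAL DUE $56.78").ljust(80)
image = [bytearray(640) for _ in range(160)]  # stand-in for the page bitmap

# Two search operations, then one crop based on both results.
hits = search_grid(grid, "INVOICE", range(0, 5)) + search_grid(grid, "TOTAL", range(5, 10))
region = crop(image, hits, qlen=7)
print(len(region), len(region[0]))  # cropped height and width in pixels
```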
Regarding independent claim 11: claim 11 is directed to a method and recites technical features/limitations similar or identical to those of claim 1. Claim 11 is therefore rejected under the same rationale.
Regarding dependent claim 12, which depends on claim 11, CRISTESCU teaches: wherein the first set of extracted characters corresponds to the first group of characters in the human-readable information, and wherein the second set of extracted characters corresponds to the second group of characters in the human-readable information. (CRISTESCU − [0032] Fig. 3-A-B, respectively, having a set of text fields 32a-f of various field types. In the case of an invoice, exemplary field types may include, among others: Vendor name, Vendor address, Buyer name, Billing address, Shipping address, Invoice number, Purchase order number, Invoice date, Tax due, Total due, Payment terms, Currency, Item description, Item quantity, Item unit price, Item line amount, Item purchase order number, Item number, and Item part number.)
Regarding dependent claim 13, which depends on claim 11, CRISTESCU teaches: wherein the first relative positioning is such that the first group of characters is positioned above the second group of characters, and wherein the second relative positioning is such that the first set of extracted characters is positioned above the second set of extracted characters. (CRISTESCU − [0032], [0037], Fig. 5, Fig. 7: A token bounding box is herein defined as a geometric shape fully enclosing a region of document image 20 containing text token 30, i.e., all pixels belonging to the respective text token are inside the token bounding box.) Note: in Fig. 7, first group element 34b is positioned above second group element 34c.
Regarding dependent claim 14, which depends on claim 11, CRISTESCU teaches: wherein the first relative positioning is such that the first group of characters is positioned left of the second group of characters, and wherein the second relative positioning is such that the first set of extracted characters is positioned left of the second set of extracted characters. (CRISTESCU − [0032], [0037], Fig. 5, Fig. 7: A token bounding box is herein defined as a geometric shape fully enclosing a region of document image 20 containing text token 30, i.e., all pixels belonging to the respective text token are inside the token bounding box.) Note: in Fig. 7, first group element 34b is positioned left of second group element 34c.
Regarding dependent claim 15, which depends on claim 11, CRISTESCU teaches: further comprising: presenting a user interface to the user, (CRISTESCU − [0032] FIGS. 3-A-B show an exemplary invoice 24a and receipt 24b, respectively, [0027] Fig. 1, Fig. 2 Client systems 10a-c generically represent any computing appliance comprising a processor, a memory unit, and a communication interface. Exemplary client systems 10a-c include a corporate mainframe computer, a personal computer, a mobile computing device (e.g., tablet computer, laptop computer), a mobile telecommunications device (e.g., smartphone), a digital camera, a media player, and a wearable computing device (e.g., smartwatch), among others.)
CRISTESCU does not explicitly teach: wherein the user interface is configured to enable the user to perform a search operation in a portion of the grid of character positions,
However, Huang teaches: wherein the user interface is configured to enable the user to perform a search operation in a portion of the grid of character positions, (Huang − [0027] FIG. 3 illustrates a method for searching information from the relational database of FIG. 2 according to an embodiment of the present invention. In this example, the method starts in block 302 where the user enters a search query)
such that presenting a result of the search operation includes performance of a cropping operation on at least one of the electronic source document and the character-based representation of the electronic source document. (Huang − [0045] In FIG. 5B an image of the business card is shown as item 511; [0046] FIG. 5C, each character of the original business card is displayed in a grid according to its corresponding position in the business card; [0049] In the server, the business card information can be indexed for support of subsequent searching by the user.) Note: the cropping operation corresponds to retrieving data from the source; the search operation corresponds to searching the indexed business card information.
Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date, to have combined the teachings of CRISTESCU and Huang, as both inventions relate to OCR of document images for display within a user interface. Adding the teaching of Huang provides an interface for displaying the extracted characters within a grid. Therefore, the motivation to combine is to improve the user's ability to validate the completeness and correctness of the OCR results.
Regarding dependent claim 16, which depends on claim 11, CRISTESCU teaches: wherein the electronic source documents include electronic files including scanned documents. (CRISTESCU − [0029] Document image 20 comprises an encoding of an optical image of a printed document. Image 20 may be acquired using an imaging device 12 (FIG. 1) which may be of any type known in the art (e.g., scanner, digital camera, etc.). The format, size, and encoding of image 20 may vary among embodiments. [0037] In some embodiments, data scraping engine 40 includes an optical character recognition (OCR) engine 42 configured to receive document image 20 and to extract a set of text tokens 30 from image 20.)
Regarding dependent claim 17, which depends on claim 11, CRISTESCU teaches: wherein the first group of characters in the human-readable information include one or more of words, numbers, names, and dates. (CRISTESCU − [0032] FIGS. 3-A-B show an exemplary invoice 24a and receipt 24b, respectively, having a set of text fields 32a-f of various field types. In the case of an invoice, exemplary field types may include, among others: Vendor name, Vendor address, Buyer name, Billing address, Shipping address, Invoice number, Purchase order number, Invoice date, Tax due, Total due, Payment terms, Currency, Item description, Item quantity, Item unit price, Item line amount, Item purchase order number, Item number, and Item part number. In the example of FIG. 3-A, field 32a contains a billing name and address, field 32b contains an invoice date, items 32c-d contain an item description and a total amount due, respectively.) Note: element 30d shows dates; element 32a shows words, numbers, and a name.
Regarding dependent claim 18, which depends on claim 11, CRISTESCU teaches: wherein the extracted information has been extracted from the electronic source document at least in part by using deep learning techniques. (CRISTESCU − [0050] In the exemplary configuration of FIG. 4, a text feature extractor 44 receives text token 30 from OCR engine 42 and outputs text feature vector 62 characterizing text token 30. FIG. 10 further illustrates the operation of extractor 44. [0052] Text feature extractor 44 may further comprise a text convolver 57 which may be structured as a convolutional neural network. In one example illustrated in FIG. 10,) Note: a CNN is a type of deep-learning algorithm.
Regarding dependent claim 19, which depends on claim 11, CRISTESCU teaches: wherein the extracted information for the first set of extracted characters includes a first set of spatial coordinates that indicate a first spatial position in the electronic source document, (CRISTESCU − [0037], [0041]-[0042]: Token box indicator 31 may comprise a set of coordinates of vertices of bounding box 34, ordered according to a pre-determined rule (for instance, counter-clockwise, starting with the lower-left vertex). Vertex coordinates X and Y may be expressed in image pixels or may be determined as a fraction of the image size along the respective direction (e.g., for an image 800 pixels wide, a coordinate X=0.1 may indicate a position located 80 pixels from the left edge of image 20). In an alternative embodiment, bounding box 34 may be specified as a tuple {X, Y, w, h}, wherein X and Y denote coordinates of a vertex 36, and w and h denote a width and a height of box 34, respectively.)
and wherein the first set of spatial coordinates corresponds to the first set of textual coordinates. (CRISTESCU − [0037], [0041]-[0042]: Token box indicator 31 may comprise a set of coordinates of vertices of bounding box 34, ordered according to a pre-determined rule (for instance, counter-clockwise, starting with the lower-left vertex). Vertex coordinates X and Y may be expressed in image pixels or may be determined as a fraction of the image size along the respective direction (e.g., for an image 800 pixels wide, a coordinate X=0.1 may indicate a position located 80 pixels from the left edge of image 20). In an alternative embodiment, bounding box 34 may be specified as a tuple {X, Y, w, h}, wherein X and Y denote coordinates of a vertex 36, and w and h denote a width and a height of box 34, respectively.)
CRISTESCU does not explicitly teach: grid of character positions,
However, Huang teaches: wherein the first set of characters in the character-based representation is associated with a first set of textual coordinates in the grid of character positions, (Huang − [0044] Figs. 5A-5C button 510 activates the OCR process that reads the business card file in an image format and displays it in a grid mode 512 according to each character's position in the business card; [0045] In FIG. 5B an image of the business card is shown as item 511; [0046] FIG. 5C, each character of the original business card is displayed in a grid according to its corresponding position in the business card)
Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date, to have combined the teachings of CRISTESCU and Huang, as both inventions relate to OCR of document images for display within a user interface. Adding the teaching of Huang provides an interface for displaying the extracted characters within a grid. Therefore, the motivation to combine is to improve the user's ability to validate the completeness and correctness of the OCR results.
Regarding dependent claim 20, which depends on claim 11, CRISTESCU does not explicitly teach: wherein the user interface is configured to enable the user, through user input, to perform two or more search operations in a portion of the grid of character positions, and perform a cropping operation on the electronic source document, wherein the cropping operation is based on results of the two or more search operations.
However, Huang teaches: wherein the user interface is configured to enable the user, through user input, to perform two or more search operations in a portion of the grid of character positions, and perform a cropping operation on the electronic source document, wherein the cropping operation is based on results of the two or more search operations. (Huang − [0027] FIG. 3 illustrates a method for searching information from the relational database of FIG. 2 according to an embodiment of the present invention. In this example, the method starts in block 302 where the user enters a search query; [0045] In FIG. 5B an image of the business card is shown as item 511; [0046] FIG. 5C, each character of the original business card is displayed in a grid according to its corresponding position in the business card; [0049] In the server, the business card information can be indexed for support of subsequent searching by the user.) Note: the cropping operation corresponds to retrieving data from the source; the search operation corresponds to searching the indexed business card information.
Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date, to have combined the teachings of CRISTESCU and Huang, as both inventions relate to OCR of document images for display within a user interface. Adding the teaching of Huang provides an interface for displaying the extracted characters within a grid. Therefore, the motivation to combine is to improve the user's ability to validate the completeness and correctness of the OCR results.