DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) was submitted on 1/10/2024. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Specification
The disclosure is objected to because of the following informalities:
In paragraph 0012, “for the or a deep learning system” should read, “for the deep learning system.”
In paragraphs 0061 and 0062, reference character “850” has been used to designate both a database and non-volatile storage.
Appropriate correction is required.
Claim Objections
Claims 1 and 11 are objected to because of the following informalities:
In claim 1, line 12, “for the or a deep learning system” should read, “for the deep learning system.”
In claim 11, line 15, “for the or a deep learning system” should read, “for the deep learning system.”
Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1, 2, 6, 11, 12, and 16 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Claim 1 recites the limitation "the form" in lines 3 and 5. There is insufficient antecedent basis for this limitation in the claim. For examination purposes, claim 1 will be read as if it recites “the input form image” in lines 3 and 5.
Claim 1 recites the limitation "a deep learning system" in line 7. There is insufficient antecedent basis for this limitation in the claim. It is unclear as to whether “a deep learning system” refers to the previously recited deep learning system in line 5. For examination purposes, claim 1 will be read as if it recites “the deep learning system” in line 7.
Claim 2 recites the limitation "the input form" in line 1. There is insufficient antecedent basis for this limitation in the claim. For examination purposes, claim 2 will be read as if it recites “the input form image” in line 1.
Claim 6 recites the limitation “a first entity” in line 3. There is insufficient antecedent basis for this limitation in the claim. It is unclear whether “a first entity” in line 3 refers to the “first entities” recited in line 1. For examination purposes, claim 6 will be read as if it recites “the first entity” in line 3.
Claim 11 recites the limitation "the form" in lines 6 and 8. There is insufficient antecedent basis for this limitation in the claim. For examination purposes, claim 11 will be read as if it recites “the input form image” in lines 6 and 8.
Claim 11 recites the limitation "a deep learning system" in lines 8 and 10. There is insufficient antecedent basis for this limitation in the claim. It is unclear whether “a deep learning system” in lines 8 and 10 refers to the “deep learning system” previously recited in line 2. For examination purposes, claim 11 will be read as if it recites “the deep learning system” in lines 8 and 10.
Claim 12 recites the limitation "the input form" in line 1. There is insufficient antecedent basis for this limitation in the claim. For examination purposes, claim 12 will be read as if it recites “the input form image” in line 1.
Claim 16 recites the limitation “a first entity” in line 3. There is insufficient antecedent basis for this limitation in the claim. It is unclear whether “a first entity” recited in line 3 refers to the “first entities” recited in line 1. For examination purposes, claim 16 will be read as if it recites “the first entity” in line 3.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. When reviewing independent claims 1 and 11 and based upon consideration of all of the relevant factors with respect to the claims as a whole, claims 1-20 are held to claim an abstract idea without reciting elements that amount to significantly more than the abstract idea and are therefore rejected as ineligible subject matter under 35 U.S.C. 101. The Examiner will analyze Claim 1; a similar rationale applies to independent Claim 11. The rationale for this finding, under MPEP § 2106, is explained below:
The claimed invention (1) must be directed to one of the four statutory categories, and (2) must not be wholly directed to subject matter encompassing a judicially recognized exception, as defined below. The following two step analysis is used to evaluate these criteria.
Step 1: Is the claim directed to one of the four patent-eligible subject matter categories: process, machine, manufacture, or composition of matter?
When examining the claim under 35 U.S.C. 101, the Examiner interprets that the claim is related to a process since the claim is directed to a method to generate and augment document forms.
Step 2a, Prong 1: Does the claim wholly embrace a judicially recognized exception, which includes laws of nature, physical phenomena, and abstract ideas, or is it a particular practical application of a judicial exception?
The Examiner finds that the judicial exception applies because the Claim 1 limitations of receiving an input form image, placing first bounding boxes around text in the form, inputting semantic information for text in the bounding boxes, identifying regions on the form, identifying first entities in a region containing semantic information, replacing the identified first entities with second entities, and placing second bounding boxes around the text in the second entities are directed to an abstract idea. The claim recites a mental process, that is, a process that “can be performed in the human mind, or by a human using a pen and paper,” including where the mental process is carried out on a generic computer. If the claim recites a judicial exception (i.e., an abstract idea enumerated in MPEP § 2106.04(a), a law of nature, or a natural phenomenon), the claim requires further analysis in Prong Two.
Step 2a, Prong 2: Does the claim recite additional elements that integrate the judicial exception into a practical application?
The Examiner finds that the Claim 1 limitations do not provide an additional element, or a combination of additional elements, that integrates the judicial exception into a practical application, because the claim merely adds the words “apply it,” or equivalent instructions to implement the abstract idea on a computer. See MPEP § 2106.05(f). As explained by the Supreme Court, in order to make a claim directed to a judicial exception patent-eligible, the additional element or combination of elements must do "‘more than simply stat[e] the [judicial exception] while adding the words ‘apply it’". Alice Corp. v. CLS Bank, 573 U.S. 208, 221, 110 USPQ2d 1976, 1982-83 (2014) (quoting Mayo Collaborative Servs. v. Prometheus Labs., Inc., 566 U.S. 66, 72, 101 USPQ2d 1961, 1965). Thus, for example, claims that amount to nothing more than an instruction to apply the abstract idea using a generic computer do not render an abstract idea eligible. Alice Corp., 573 U.S. at 223, 110 USPQ2d at 1983.
Specifically, the Examiner finds that “placing first bounding boxes around text in the form” can be done by mentally placing bounding boxes around text on a form. For example, a person can imagine a two-dimensional space around a distinct area to mentally organize the layout of the form. Additionally, the Examiner finds that “identifying regions on a form, the regions to contain one or more bounding boxes” can be done by simply looking at the form to find regions containing bounding boxes. Furthermore, the Examiner finds that “identifying first entities in a region containing semantic information” can be done by looking at the form to find entities containing semantic information. Additionally, the Examiner finds that “replacing the identified first entities with second entities” can be done by a person mentally removing the first entity and replacing it with a second entity. Finally, the Examiner finds that “placing second bounding boxes around the text in the second entities” can be done by mentally placing bounding boxes around the text in the second entities. A person can imagine a two-dimensional space around a distinct area to mentally organize the layout of the form.
Step 2b: If the claim does not integrate the judicial exception into a practical application, the Examiner must determine whether the claim recites additional elements that amount to significantly more than the judicial exception.
The Examiner interprets that the Claims do not amount to significantly more since the Claims merely recite a series of abstract ideas carried out by a “deep learning system.”
Furthermore, the generic computer components recited in Claim 11, namely the processor and the non-transitory memory, perform generic computer functions that are well-understood, routine, and conventional activities, and therefore amount to no more than implementing the abstract idea with a computerized system.
Claims 2-10 and 12-20, which depend on the independent claims, include all the limitations of the independent claims. The Examiner finds that Claims 2-10 and 12-20 do not recite significantly more, since these claims only recite additional steps describing the method of generating and augmenting forms.
Thus, Claims 2-10 and 12-20 recite the same abstract idea and are therefore not drawn to eligible subject matter, as they are directed to the abstract idea without significantly more.
Therefore, Claims 1-20 are rejected under 35 U.S.C. 101.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 2, 4-7, 11, 12 and 14-17 are rejected under 35 U.S.C. 103 as being unpatentable over Ast (U.S. Patent No. 11,776,244, hereafter referred to as Ast) in view of Streltsov et al. (U.S. Patent Pub. No. 2023/0334309, hereafter referred to as Streltsov).
Regarding Claim 1, Ast teaches a method comprising: responsive to receipt of an input form image (Col. 2, lines 33-35, Fig. 8A, Ast teaches receiving an input image of a document.), placing first bounding boxes around text in the form (Col. 5, lines 63-67, Fig. 8B, Ast teaches a text document containing the text of the input image and corresponding position information in the form of bounding boxes.); inputting semantic information for text in the bounding boxes (Col. 9, lines 5-15, Fig. 8C, Ast teaches a semantic image with semantic information inputted for textual information, positioned in accordance with the bounding boxes associated with the text strings.); using a deep learning system, identifying regions on the form, the regions to contain one or more of the bounding boxes (Col. 3, lines 45-51, Fig. 8D, Ast teaches a region-based convolutional neural network (R-CNN) that extracts regions from the semantic image, each of which contains one or more bounding boxes.); and forming text images to generate training data for the or a deep learning system (Col. 1, lines 37-43, Ast teaches generating semantic images from input images of documents and utilizing the semantic images to improve training of a machine learning engine.).
Although Ast discloses using a deep learning system, Ast does not explicitly disclose performing one of randomly scaling or randomly translating one or more of the bounding boxes within at least one of the regions; identifying first entities in a region containing semantic information; replacing the identified first entities with second entities; and placing second bounding boxes around the text in the second entities.
Streltsov is in the same field of art of augmenting documents to generate training datasets for deep learning systems. Further, Streltsov teaches performing one of randomly scaling or randomly translating one or more of the bounding boxes within at least one of the regions (Paragraphs [0043], [0052], Streltsov teaches user-defined rules for automatically augmenting original electronic documents to create a plurality of synthetic electronic documents. For example, a rule may be specified to apply a “shift,” which may comprise a positional shift of an annotated data field up, down, left, right, or any combination thereof, to an annotated data field containing a bounding box. The amount of shift in the annotated data field may be randomly applied. The Examiner interprets the terms “translating” and “shift(ing)” to be synonymous in this context. Additionally, the Examiner interprets “regions” broadly as any area or location within the document, since the claim is silent as to the specifications of the region.); identifying first entities in a region containing semantic information (Fig. 1A, reference character 108, Paragraph [0039], Streltsov teaches an original electronic document comprising a plurality of annotated data fields. Each section may comprise annotated data fields. Each annotated data field comprises a bounding box and an associated label. Each bounding box may have coordinate data in the form of pixel positions of the corners stored therefor. The Examiner interprets “semantic information” to include a text string including, for example, a date, a name of an industrial part, a telephone number, a street address, or the like. The Examiner interprets “region” broadly as an area or location, since the claim is silent as to the specifications of the region.); replacing the identified first entities with second entities (Paragraph [0004], Streltsov teaches “semantic augmentations,” which may comprise changing a text string in the data field. For example, an address field in the original electronic document may be changed to a random address in the synthetic electronic document. The Examiner interprets the terms “changing” and “replacing” to be synonymous in this context.); and placing second bounding boxes around the text in the second entities (Fig. 1B, reference character 108, Paragraph [0062], Streltsov teaches modifying bounding boxes to account for the new text. For example, if semantic augmentation adds two new lines of text, the size of the bounding box may be increased accordingly. The Examiner interprets "placing second bounding boxes around the text in the second entities" to include updating the position and/or size of the bounding box to account for the newly inserted text in the second entity, because the claim is silent as to how the second bounding boxes differ.).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Ast by performing one of randomly scaling or randomly translating one or more of the bounding boxes within at least one of the regions; identifying first entities in a region containing semantic information; replacing the identified first entities with second entities; and placing second bounding boxes around the text in the second entities, as taught by Streltsov, to arrive at an invention that utilizes a deep learning system to generate and augment training forms with diverse layouts and text. One of ordinary skill in the art would have been motivated to combine the references to automate the rule-based or manual process of altering forms by employing a deep learning system to carry out the semantic and geometric augmentations to the annotated data fields and bounding boxes, thereby reducing or eliminating the programming previously required to implement algorithmic rules used in training ML engines (Ast, Col. 2, lines 20-32).
Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.
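For illustration only, the bounding-box update discussed above (enlarging a box when replacement text adds lines) can be sketched as follows. This is a minimal sketch and not a characterization of how Streltsov or the present application implements the step: the function name fit_box_to_text, the (x0, y0, x1, y1) box layout, and the fixed line_height and char_width metrics are all assumptions introduced here.

```python
def fit_box_to_text(box, new_text, line_height, char_width):
    """Return a box with the same top-left corner, resized to enclose new_text.

    box is (x0, y0, x1, y1) in pixels. line_height and char_width are
    assumed fixed-pitch metrics, a simplification for illustration only.
    """
    x0, y0, _, _ = box
    lines = new_text.split("\n")
    # Width follows the longest line; height follows the number of lines.
    width = max(len(line) for line in lines) * char_width
    height = len(lines) * line_height
    return (x0, y0, x0 + width, y0 + height)


# Hypothetical example: a one-line field replaced by a three-line address.
old_box = (40, 100, 180, 112)
new_address = "Acme Corp\n12 Main St\nSpringfield"
new_box = fit_box_to_text(old_box, new_address, line_height=12, char_width=7)
```

Under these assumptions the box keeps its anchor point and grows downward by one line height per added line, which matches the "two new lines of text" example only in spirit.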
In regard to Claim 2, Ast in view of Streltsov discloses wherein the input form comprises at least one table (Fig. 8A, Ast teaches an input form with a table.), the method further comprising randomly moving one or more columns and/or one or more rows within the table (Paragraph [0052], Streltsov teaches cloning a random number of line items in a table section to create a plurality of synthetic electronic documents with various-sized tables for training the model. A parameter may be set by a user such that, for example, the random number has a lower bound of 5 clones and an upper bound of 50 clones. Similarly, rules may be defined to apply geometric augmentations such as shifts to the cloned line items. The Examiner interprets the terms “line items” and “rows” as synonymous within this context, considering both to be horizontal arrangements of related data. Further, the Examiner interprets “randomly moving” these rows or line items broadly to include the action of cloning a random number of line items. This interpretation allows for the addition of duplicate rows/line items to a table, thereby enlarging its size. This broad interpretation is applied since the claim is silent as to how the rows are randomly moved.).
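For illustration only, the cloning of a random number of line items with user-set bounds, as described above, might look like the following minimal sketch. The function name clone_line_items, the representation of a row as a dict, and the choice to clone randomly selected existing rows are assumptions made here; only the bounds of 5 and 50 come from the passage cited above.

```python
import copy
import random


def clone_line_items(rows, min_clones=5, max_clones=50, rng=None):
    """Append a random number of cloned rows to a table (hypothetical helper).

    rows is a non-empty list of dicts, one per line item. The number of
    clones is drawn uniformly from [min_clones, max_clones], inclusive.
    """
    rng = rng or random.Random()
    n_clones = rng.randint(min_clones, max_clones)
    # Assumption: each clone is a deep copy of a randomly chosen existing row.
    clones = [copy.deepcopy(rng.choice(rows)) for _ in range(n_clones)]
    return list(rows) + clones


table = [
    {"item": "Bolt", "qty": 10},
    {"item": "Nut", "qty": 25},
]
bigger_table = clone_line_items(table, rng=random.Random(0))
```

The original rows are left in place and unmodified, so the synthetic table differs from the source only in size.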
In regard to Claim 4, Ast in view of Streltsov discloses performing text spotting in the input form image (Col. 5, lines 57-60, Ast teaches identifying characters that are depicted within the input image. The Examiner interprets “text spotting” as recognition of text in the image because the claim does not define or otherwise limit the term.), and optical character recognition (OCR) on the input form image (Col. 5, lines 57-60, Fig. 1, reference character 110, Ast teaches an OCR module.).
In regard to Claim 5, Ast in view of Streltsov discloses wherein there are first bounding boxes for all of the first entities (Col. 14, lines 28-32, Fig. 8B, Ast teaches position information represented by bounding boxes for each piece of text or character string in the text document.).
In regard to Claim 6, Ast in view of Streltsov discloses wherein replacing the first entities comprises: randomly selecting a second entity (Paragraph [0052], Streltsov teaches retrieving a random address from the dictionary.); replacing a first entity with the second entity (Paragraph [0052], Streltsov teaches, within an address field, replacing the address with a random address from the dictionary.); adding the second entity to a dictionary (Paragraph [0048], Streltsov teaches a dictionary for storing randomly generated addresses. The Examiner interprets that the second entities are automatically added to the dictionary when they are stored in it.); and repeating said randomly selecting, said replacing the first entity with the second entity, and said adding for all of the first entities (Paragraphs [0004], [0062], Streltsov teaches semantically augmenting each annotated data field in a document. The semantic augmentations may comprise changing a string in the data field, requiring each of the steps mentioned above, including selecting, replacing, and adding.).
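For illustration only, the select, replace, add, and repeat sequence recited in Claim 6 can be sketched as a simple loop. This is a minimal sketch under stated assumptions: the function name replace_entities, the representation of an annotated field as a dict with "label" and "text" keys, and the use of a plain list as the "dictionary" are hypothetical and are not taken from either reference.

```python
import random


def replace_entities(fields, candidates, rng=None):
    """Replace the text of every field with a randomly chosen candidate.

    Returns the augmented fields and the dictionary (here a list) of the
    second entities that were used, in order of use.
    """
    rng = rng or random.Random()
    dictionary = []
    augmented = []
    for field in fields:                       # repeat for all first entities
        second = rng.choice(candidates)        # randomly select a second entity
        augmented.append({**field, "text": second})  # replace the first entity
        dictionary.append(second)              # add the second entity
    return augmented, dictionary


fields = [
    {"label": "address", "text": "1 Old Road"},
    {"label": "address", "text": "2 Former Lane"},
]
candidates = ["9 New Street", "7 Fresh Avenue", "3 Sample Court"]
augmented, dictionary = replace_entities(fields, candidates, rng=random.Random(0))
```

The input fields are not mutated, so the original document and the synthetic document can both be retained as training examples.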
In regard to Claim 7, Ast in view of Streltsov discloses wherein the randomly translating the one or more of the bounding boxes within one or more of the regions comprises (Paragraphs [0043], [0052], Streltsov teaches a positional shift (up, down, left, right, or any combination thereof) of an annotated data field. In some embodiments, the amount of shift is randomly applied.), for a bounding box in a region, translating the bounding box only within said region (Paragraph [0043], Streltsov teaches a positional shift of an annotated data field, in which a distance limit may be defined for the shift operation such that shifted annotated data fields may not be shifted outside of a specified region of the synthetic electronic document. For example, a 10% maximum shift of an annotated data field may be defined.).
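For illustration only, a random translation that is both capped at a maximum shift and confined to a region, as discussed above, can be sketched as follows. This is a minimal sketch: the Box type, the function name translate_within_region, and the reading of the "10% maximum shift" as 10% of the region's width and height are assumptions introduced here, not details confirmed by Streltsov.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixel coordinates (hypothetical helper type)."""
    x0: float
    y0: float
    x1: float
    y1: float


def translate_within_region(box, region, max_shift_frac=0.10, rng=None):
    """Randomly shift box without letting it leave region."""
    if not (region.x0 <= box.x0 and box.x1 <= region.x1
            and region.y0 <= box.y0 and box.y1 <= region.y1):
        raise ValueError("box must start inside region")
    rng = rng or random.Random()
    # Assumed cap: a fraction of the region's width and height.
    max_dx = max_shift_frac * (region.x1 - region.x0)
    max_dy = max_shift_frac * (region.y1 - region.y0)
    # Intersect the cap with the slack that keeps the box inside the region.
    dx = rng.uniform(max(-max_dx, region.x0 - box.x0),
                     min(max_dx, region.x1 - box.x1))
    dy = rng.uniform(max(-max_dy, region.y0 - box.y0),
                     min(max_dy, region.y1 - box.y1))
    return Box(box.x0 + dx, box.y0 + dy, box.x1 + dx, box.y1 + dy)


region = Box(0, 0, 200, 100)
field = Box(10, 10, 60, 30)
moved = translate_within_region(field, region, rng=random.Random(0))
```

Sampling the offset from the intersected interval, rather than sampling freely and clipping afterward, keeps the box size unchanged and avoids piling shifted boxes up against the region boundary.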
In regard to Claim 11, Ast discloses an apparatus comprising: a deep learning system (Col. 2, lines 39-41, Ast teaches an ML engine enhanced with a deep learning (DL) feedback loop.) comprising at least one processor (Col. 4, lines 42-43, Fig. 5, reference character 502, Ast teaches a processor.) and a non-transitory memory that contains instructions that, when executed, enable the deep learning system to perform a method (Col. 4, lines 43-46, Col. 20, lines 15-60, Ast teaches a non-transitory computer-readable medium and stored instructions translatable by the processor.) comprising: responsive to receipt of an input form image (Col. 2, lines 33-35, Fig. 8A, Ast teaches receiving an input image of a document.), placing first bounding boxes around text in the form (Col. 5, lines 63-67, Fig. 8B, Ast teaches a text document containing the text of the input image and corresponding position information in the form of bounding boxes.); inputting semantic information for text in the bounding boxes (Col. 9, lines 5-15, Fig. 8C, Ast teaches a semantic image with semantic information inputted for textual information, positioned in accordance with the bounding boxes associated with the text strings.); using a deep learning system, identifying regions on the form, the regions to contain one or more of the bounding boxes (Col. 3, lines 45-51, Fig. 8D, Ast teaches a region-based convolutional neural network that extracts regions from the semantic image, each of which contains one or more bounding boxes.); and forming text images to generate training data for the or a deep learning system (Col. 1, lines 37-43, Ast teaches generating semantic images from input images of documents and utilizing the semantic images to improve training of a machine learning engine.).
Although Ast discloses using a deep learning system, Ast does not explicitly disclose performing one of randomly scaling or randomly translating one or more of the bounding boxes within at least one of the regions; identifying first entities in a region containing semantic information; replacing the identified first entities with second entities; and placing second bounding boxes around the text in the second entities.
Streltsov is in the same field of art of augmenting documents to generate diverse training datasets for deep learning systems. Further, Streltsov teaches performing one of randomly scaling or randomly translating one or more of the bounding boxes within at least one of the regions (Paragraphs [0043], [0052], Streltsov teaches applying a “shift,” which may comprise a positional shift of an annotated data field up, down, left, right, or any combination thereof, to an annotated data field containing a bounding box. The amount of shift in the annotated data field may be randomly applied.); identifying first entities in a region containing semantic information (Fig. 1A, reference character 108, Paragraph [0039], Streltsov teaches an original electronic document comprising a plurality of annotated data fields. Each section may comprise annotated data fields. Each annotated data field comprises a bounding box and an associated label. The Examiner interprets “semantic information” to be a text string including, for example, a date, a name of an industrial part, a telephone number, a street address, or the like. The Examiner interprets “region” broadly as an area or location, since the claim is silent as to the specifications of the region.); replacing the identified first entities with second entities (Paragraph [0004], Streltsov teaches “semantic augmentations,” which may comprise changing a text string in the data field. For example, an address field in the original electronic document may be replaced with a random address in the synthetic electronic document. The Examiner interprets the terms “changing” and “replacing” to be synonymous.); and placing second bounding boxes around the text in the second entities (Fig. 1B, reference character 108, Paragraph [0062], Streltsov teaches modifying bounding boxes to account for the new text. For example, if replacing a text string adds two new lines of text, the size of the bounding box is increased accordingly. The Examiner interprets "placing second bounding boxes around the text in the second entities" to include updating the position and/or size of the bounding box to account for the newly inserted text in the second entity, because the claim is silent as to how the second bounding boxes differ.).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Ast by performing one of randomly scaling or randomly translating one or more of the bounding boxes within at least one of the regions; identifying first entities in a region containing semantic information; replacing the identified first entities with second entities; and placing second bounding boxes around the text in the second entities, as taught by Streltsov, to make the invention more efficient at extracting data from electronic documents by augmenting the synthetic electronic documents to form a sufficiently diverse training data set to provide to a deep learning model (Streltsov, Paragraph [0001]). One of ordinary skill in the art would have been motivated to combine the references to overcome issues such as failure to accurately extract data due to structural differences in the document, or label imbalance, in which the model is biased towards labels seen more often during training and struggles to identify labels that are seen less frequently during the training phase (Streltsov, Paragraph [0002]).
Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.
In regard to Claim 12, Ast in view of Streltsov discloses wherein the input form comprises at least one table (Fig. 8A, Ast teaches an input form with a table.), the method further comprising randomly moving one or more columns and/or one or more rows within the table (Paragraph [0052], Streltsov teaches cloning a random number of line items in a table section to create a plurality of synthetic electronic documents with various-sized tables for training the model. A parameter may be set by a user such that, for example, the random number has a lower bound of 5 clones and an upper bound of 50 clones. Similarly, rules may be defined to apply geometric augmentations such as shifts to the cloned line items. The Examiner interprets the terms “line items” and “rows” as synonymous within this context, considering both to be horizontal arrangements of related data. Further, the Examiner interprets “randomly moving” the rows or line items broadly to include the action of cloning a random number of line items. This interpretation allows for the addition of duplicate rows/line items to a table, thereby enlarging its size. This broad interpretation is applied since the claim is silent as to how the rows are randomly moved.).
In regard to Claim 14, Ast in view of Streltsov discloses wherein the method further comprises performing text spotting in the input form image (Col. 5, lines 57-60, Ast teaches identifying characters that are depicted within the input image. The Examiner interprets “text spotting” as recognition of text in the image because the claim does not define or otherwise limit the term.), and optical character recognition (OCR) on the input form image (Col. 5, lines 57-60, Fig. 1, reference character 110, Ast teaches an OCR module.).
In regard to Claim 15, Ast in view of Streltsov discloses wherein there are first bounding boxes for all the first entities (Col. 14, lines 29-32, Fig. 8B, Ast teaches position information represented by bounding boxes for each piece of text or character string in the text document.).
In regard to Claim 16, Ast in view of Streltsov discloses wherein replacing the first entities comprises: randomly selecting a second entity (Paragraph [0052], Streltsov teaches retrieving a random address from the dictionary.); replacing a first entity with the second entity (Paragraph [0052], Streltsov teaches, within an address field, replacing the address with a random address from the dictionary.); adding the second entity to a dictionary (Paragraph [0048], Streltsov teaches a dictionary for storing randomly generated addresses. The Examiner interprets that the second entities are automatically added to the dictionary when they are stored.); and repeating said randomly selecting, said replacing the first entity with the second entity, and said adding for all of the first entities (Paragraphs [0004], [0062], Streltsov teaches semantically augmenting each annotated data field in a document. The semantic augmentations may comprise changing a string in the data field, requiring each of the steps mentioned above, including selecting, replacing, and adding.).
In regard to Claim 17, Ast in view of Streltsov discloses wherein the randomly translating the one or more of the bounding boxes within one or more of the regions comprises (Paragraphs [0043], [0052], Streltsov teaches a positional shift (up, down, left, right, or any combination thereof) of an annotated data field. In some embodiments, the amount of shift is randomly applied.), for a bounding box in a region, translating the bounding box only within said region (Paragraph [0043], Streltsov teaches a positional shift of an annotated data field, in which a distance limit may be defined for the shift operation such that shifted annotated data fields may not be shifted outside of a specified region of the synthetic electronic document. For example, a 10% maximum shift of an annotated data field may be defined.).
Claims 3 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Ast (U.S. Patent No. 11,776,244, hereafter referred to as Ast) in view of Streltsov et al. (U.S. Patent Pub. No. 2023/0334309, hereafter referred to as Streltsov) and further in view of Gohari (U.S. Patent Pub. No. 2022/0318492, hereafter referred to as Gohari).
Regarding Claim 3, Ast in view of Streltsov teaches the method of claim 1.
Ast in view of Streltsov does not explicitly disclose updating weights of nodes in the deep learning system being trained, responsive to one or more of said randomly scaling, said randomly translating, said replacing, or said forming.
Gohari is in the same field of art of form generation and data extraction using a machine learning system. Further, Gohari teaches updating weights of nodes in the deep learning system being trained (Paragraph [0016], Fig. 7, Gohari teaches updating weights of nodes in the deep learning model), responsive to one or more of said randomly scaling, said randomly translating, said replacing, or said forming (Paragraphs [0033], [0036], Figs. 1A and 1B, reference characters 110 and 110’, Gohari teaches inputting training sets of synthetically generated blank forms to the neural network, and comparing the forms to identify differences between them. In one aspect, the synthetically generated forms within a training data set may be similar to each other, but may have minor changes from form to form, to enable the weights of nodes within the neural network to be altered properly. For example, the header in a form containing the word “Part” may be replaced with the word “Widget.”).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Ast in view of Streltsov by updating weights of nodes in the deep learning system being trained, responsive to one or more of said randomly scaling, said randomly translating, said replacing, or said forming, that is taught by Gohari, to effectively train the deep learning system to recognize relevant differences between otherwise similar forms, including types and locations of keywords and potential locations of values corresponding to keywords as well as save both time and labor associated with manual form creation (Gohari, Abstract, Paragraph [0004]); thus, one of ordinary skill in the art would be motivated to combine the references since they are in the field of generating documents to provide training data to a deep learning system to enable the deep learning system to identify types and locations of keywords in forms (Gohari, Abstract).
Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.
In regard to Claim 13, Ast in view of Streltsov teaches the apparatus of claim 11.
Ast in view of Streltsov does not explicitly disclose updating weights of nodes in the deep learning system being trained, responsive to one or more of said randomly scaling, said randomly translating, said replacing, or said forming.
Gohari is in the same field of art of form generation and field identification using a machine learning system. Further, Gohari teaches updating weights of nodes in the deep learning system being trained (Paragraph [0016], Fig. 7, Gohari teaches updating weights of nodes in the deep learning model), responsive to one or more of said randomly scaling, said randomly translating, said replacing, or said forming (Paragraphs [0033], [0036], Figs. 1A and 1B, reference characters 110 and 110’, Gohari teaches inputting training sets of synthetically generated blank forms to the neural network, and comparing the forms to identify differences between them. In one aspect, the synthetically generated forms within a training data set may be similar to each other, but may have minor changes from form to form, to enable the weights of nodes within the neural network to be altered properly. For example, the header in a form containing the word “Part” may be replaced with the word “Widget.”).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Ast in view of Streltsov by updating weights of nodes in the deep learning system being trained, responsive to one or more of said randomly scaling, said randomly translating, said replacing, or said forming, that is taught by Gohari, to effectively train the deep learning system to recognize relevant differences between otherwise similar forms, including types and locations of keywords and potential locations of values corresponding to keywords to save both time and manual labor associated with manual form creation (Gohari, Abstract, Paragraph [0004]); thus, one of ordinary skill in the art would be motivated to combine the references since they are in the field of generating documents to provide training data to a deep learning system, therefore enabling the deep learning system to identify types and locations of keywords in forms (Gohari, Abstract).
Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.
Claims 8 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Ast (U.S. Patent No. 11,776,244, hereafter referred to as Ast) in view of Streltsov et al. (U.S. Patent Pub. No. 2023/0334309, hereafter referred to as Streltsov) in further view of Ling et al. (NPL "Document Domain Randomization for Deep Learning Document Layout Extraction" Document Analysis and Recognition – ICDAR 2021, 2021, hereafter referred to as Ling).
Regarding Claim 8, Ast in view of Streltsov teaches the method of claim 1.
Ast in view of Streltsov does not explicitly disclose wherein the randomly translating the one or more of the bounding boxes within one or more of the regions comprises, for a bounding box in a region, translating the bounding box from said region to another of the one or more of the regions.
Ling is in the same field of art of document generation for training a deep learning system. Further, Ling discloses wherein the randomly translating the one or more of the bounding boxes within one or more of the regions comprises (Fig. 3, Section 3, “Document Domain Randomization,” Ling teaches randomly varying the locations of graphical components, including figures, tables, and textual content, each of which is surrounded by a bounding box. The Examiner interprets “region” broadly, to encompass any area or location within the document since the claim is silent as to the specific constraints for the term.), for a bounding box in a region, translating the bounding box from said region to another of the one or more of the regions (Fig. 3, Section 3, “Document Domain Randomization,” Ling teaches randomly varying the locations of graphical components, each of which is surrounded by a bounding box, in the synthesized documents. For example, Fig. 3 (a-c) shows three examples of synthesized pages in which the “Abstract” section, shown within a bounding box, has been moved from one region on the document to another region of the document. The Examiner interprets “region” broadly, to encompass any area or location within the document since the claim is silent as to the specific constraints for the term.).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Ast in view of Streltsov by wherein the randomly translating the one or more of the bounding boxes within one or more of the regions comprises, for a bounding box in a region, translating the bounding box from said region to another of the one or more of the regions, that is taught by Ling, to make the invention better equipped at handling documents with different structural and semantic organizations of sections and subsections; for example, with enough page appearance randomization, the real page should appear to the model as just another variant (Ling, Introduction); thus, one of ordinary skill in the art would be motivated to combine the references since this randomization would significantly lower the cost of producing training data by generating real-world page styles to quickly produce millions of training samples that infer real-world document structures (Ling, Section 1).
Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.
In regard to Claim 18, Ast in view of Streltsov teaches the apparatus of claim 13.
Ast in view of Streltsov does not explicitly disclose wherein randomly translating the one or more of the bounding boxes within one or more of the regions comprises, for a bounding box in a region, translating the bounding box from said region to another of the one or more of the regions.
Ling is in the same field of art of document generation for training a deep learning system. Further, Ling discloses wherein the randomly translating the one or more of the bounding boxes within one or more of the regions comprises (Fig. 3, Section 3, “Document Domain Randomization,” Ling teaches randomly varying the locations of graphical components, including figures, tables, and textual content, each of which is surrounded by a bounding box. The Examiner interprets “region” broadly, to encompass any area or location within the document since the claim is silent as to the specific constraints for the term.), for a bounding box in a region, translating the bounding box from said region to another of the one or more of the regions (Fig. 3, Section 3, “Document Domain Randomization,” Ling teaches randomly varying the locations of graphical components, each of which is surrounded by a bounding box, in the synthesized documents. For example, Fig. 3 (a-c) shows three examples of synthesized pages in which the “Abstract” section, shown within a bounding box, has been moved from one region on the document to another region of the document. The Examiner interprets “region” broadly, to encompass any area or location within the document since the claim is silent as to the specific constraints for the term.).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Ast in view of Streltsov by wherein the randomly translating the one or more of the bounding boxes within one or more of the regions comprises, for a bounding box in a region, translating the bounding box from said region to another of the one or more of the regions, that is taught by Ling, to force the invention to learn important structures in a document by diversifying the structure and semantic content of a training set of documents (Ling, Section 2.1); thus one of ordinary skill in the art would be motivated to combine the references since it would make the invention more accurate for segmenting both graphic and semantic content in papers with perturbed layouts (Ling, Section 5).
Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.
Claims 9, 10, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Ast (U.S. Patent No. 11,776,244, hereafter referred to as Ast) in view of Streltsov et al. (U.S. Patent Pub. No. 2023/0334309, hereafter referred to as Streltsov) in further view of Buban et al. (U.S. Patent No. 12,094,231, hereafter referred to as Buban).
Regarding Claim 9, Ast in view of Streltsov discloses the method of claim 1.
Ast in view of Streltsov does not explicitly disclose wherein the randomly scaling comprises, for a bounding box in a region, enlarging or shrinking the bounding box.
Buban is in the same field of art of form generation for training a machine learning model. Further, Buban discloses wherein the randomly scaling comprises, for a bounding box in a region, enlarging or shrinking the bounding box (Col. 5, lines 28-43, Buban teaches augmenting documents by scaling them. The bounding boxes on the document are transformed according to the augmentation applied to the document. The Examiner interprets the term “scaling” by its well-known definition in the art, in which scaling is a linear transformation that either enlarges or shrinks an object while maintaining its shape.).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Ast in view of Streltsov by wherein the randomly scaling comprises, for a bounding box in a region, enlarging or shrinking the bounding box, that is taught by Buban, to allow a smaller set of human-labeled documents to be used to generate a larger set of documents that introduces variance to the training dataset (Buban, Col. 5, lines 28-43); thus, one of ordinary skill in the art would be motivated to combine the references to improve the accuracy of identifying and classifying form fields by training the deep learning system on a set of forms with sufficient variance (Buban, Col. 5, lines 65-67).
Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.
In regard to Claim 10, Ast in view of Streltsov in view of Buban teaches the method of claim 9, further comprising enlarging or shrinking all of the bounding boxes in the region (Col. 5, lines 37-40, Buban teaches transforming the labeled bounding boxes according to the augmentation applied to the document as a whole; for example, if a document is scaled, the bounding box coordinates of the augmented version of the document are all adjusted accordingly.).
In regard to Claim 19, Ast in view of Streltsov discloses the apparatus of claim 11.
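For illustration only, the coordinate adjustment discussed for claims 9 and 10 can be sketched as below: when a document is scaled by a factor, every bounding box coordinate is multiplied by the same factor. This is not Buban's implementation; the function name, the `(x0, y0, x1, y1)` box layout, and the choice of a top-left origin are assumptions made here.

```python
def scale_boxes(boxes, factor):
    """Hypothetical sketch: apply a document-level scale to every box.

    `boxes` is a list of (x0, y0, x1, y1) tuples measured from the
    document origin. A factor above 1 enlarges each box and a factor
    below 1 shrinks it; the aspect ratio is preserved.
    """
    if factor <= 0:
        raise ValueError("scale factor must be positive")
    return [
        (x0 * factor, y0 * factor, x1 * factor, y1 * factor)
        for (x0, y0, x1, y1) in boxes
    ]


labeled = [(10, 20, 30, 40), (50, 60, 90, 80)]
halved = scale_boxes(labeled, 0.5)
```

A random variant would draw `factor` from a chosen range before calling the function; the range itself is a design parameter that the cited passages do not specify.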
Ast in view of Streltsov does not explicitly disclose wherein the randomly scaling comprises, for a bounding box in a region, enlarging or shrinking the bounding box.
Buban is in the same field of art of form generation for training a machine learning model. Further, Buban teaches wherein the randomly scaling comprises, for a bounding box in a region, enlarging or shrinking the bounding box (Col. 5, lines 28-43, Buban teaches augmenting documents by adjusting scaling. The bounding boxes on the document are transformed according to the augmentation applied to the document. The Examiner interprets the term “scaling” to mean its well-known definition, in which scaling is a linear transformation that either enlarges or shrinks an object while maintaining its shape.).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Ast in view of Streltsov by wherein the randomly scaling comprises, for a bounding box in a region, enlarging or shrinking the bounding box, that is taught by Buban, to train the invention to be able to extract text from a document despite the presence of potential human-induced variances in both document form generation and scanning or other techniques used to capture the document image (Buban, Col. 5, lines 34-36); thus, one of ordinary skill in the art would be motivated to combine the references to improve the invention’s accuracy for extracting text from forms having varied layouts by training the deep learning system on a set of forms with sufficient variance (Buban, Col. 3, lines 18-19, Col. 5, lines 65-67).
Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.
In regard to Claim 20, Ast in view of Streltsov in view of Buban teaches the apparatus of claim 19, wherein the method further comprises enlarging or shrinking all of the bounding boxes in the region (Col. 5, lines 37-40, Buban teaches transforming the labeled bounding boxes according to the augmentation applied to the document as a whole; for example, if a document is scaled, the bounding box coordinates of the augmented version of the document are all adjusted accordingly.).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SYDNEY L BLACKSTEN whose telephone number is 571-272-7651. The examiner can normally be reached 8:30am-5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Oneal Mistry, can be reached at 313-446-4912. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SYDNEY L BLACKSTEN/Examiner, Art Unit 2674
/ONEAL R MISTRY/Supervisory Patent Examiner, Art Unit 2674