Last updated: May 29, 2026
Application No. 18/065,352
DOCUMENT IMAGE TEMPLATE MATCHING

Non-Final OA §101§103
Filed
Dec 13, 2022
Examiner
SATCHER, DION JOHN
Art Unit
2676
Tech Center
2600 — Communications
Assignee
International Business Machines Corporation
OA Round
1 (Non-Final)
Interview Optional

Interview lift (+14.1%) is below the 15.0% threshold. A written response is recommended.
Based on 42 resolved cases, 2023–2026
Examiner Intelligence

SATCHER, DION JOHN
Grants 86% — above average
Career Allowance Rate
36 granted / 42 resolved
+23.7% vs TC avg
Interview Lift: +14.1% (moderate), resolved cases with interview vs. without
Typical timeline
2y 10m
Avg Prosecution
21 currently pending
Career history
Total Applications
across all art units
Statute-Specific Performance

§101
2.5%
-37.5% vs TC avg
§103
94.2%
+54.2% vs TC avg
§102
1.7%
-38.3% vs TC avg
Based on career data from 42 resolved cases
Office Action

§101 §103
DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of Claims
This communication is in response to the application filed on 12/13/2022.
Claims 1–20 are pending in this application.
Drawings
The drawing(s) filed on 12/13/2022 are accepted by the Examiner.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 12/13/2022 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claim 20 is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claim does not fall within at least one of the four categories of patent-eligible subject matter because it covers both statutory and non-statutory embodiments (under the broadest reasonable interpretation of the claim when read in light of the specification and in view of one skilled in the art) and embraces subject matter that is not eligible for patent protection; it is therefore directed to non-statutory subject matter.
“[a] transitory, propagating signal … is not a ‘process, machine, manufacture, or composition of matter.’ Those four categories define the explicit scope and reach of subject matter patentable under 35 U.S.C. § 101; thus, such a signal cannot be patentable subject matter.” (In re Petrus A.C.M. Nuijten; Fed Cir, 2006-1371, 9/20/2007).
Specifically, paragraph [0019] of Applicant’s specification recites: “A computer program product embodiment ("CPP embodiment" or "CPP") is a term used in the present disclosure to describe any set of one, or more, storage media (also called "mediums") collectively included in a set of one, or more, storage devices that collectively include machine readable code corresponding to instructions and/or data for performing computer operations specified in a given CPP claim.” As described, claim 20 is drawn to a recording medium that covers both transitory and non-transitory embodiments. Thus, the claim is not eligible subject matter. It is recommended to amend and narrow the claim to cover only statutory embodiments, and thereby avoid a rejection under 35 U.S.C. § 101, by adding the limitation "non-transitory" to the claim.
Claim Rejections - 35 USC § 101
Claims 1–7, 9–10, 12–18 and 20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The limitations, under their broadest reasonable interpretation, cover a mental process (a concept performed in the human mind, including observation, evaluation, judgment, opinion, organizing human activity, and mathematical concepts and calculations). Independent claims 1, 12 and 20 recite a method, a system, and a computer program product. This judicial exception is not integrated into a practical application because the steps do not add meaningful limitations that would be considered specifically applied to a particular technological problem to be solved. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the steps of the claimed invention can be done mentally, and no additional features in the claims would preclude them from being performed as such except for the generic computer elements recited at a high level of generality (i.e., processor, memory).
According to the USPTO guidelines, a claim is directed to non-statutory subject matter if: 
STEP 1: the claim does not fall within one of the four statutory categories of invention (process, machine, manufacture or composition of matter), or 
STEP 2: the claim recites a judicial exception, e.g. an abstract idea, without reciting additional elements that amount to significantly more than the judicial exception, as determined using the following analysis:
STEP 2A (PRONG 1): Does the claim recite an abstract idea, law of nature, or natural phenomenon?
STEP 2A (PRONG 2): Does the claim recite additional elements that integrate the judicial exception into a practical application?
STEP 2B: Does the claim recite additional elements that amount to significantly more than the judicial exception?
Using the two-step inquiry, it is clear that the independent claims 1, 12 and 20 are directed to an abstract idea as shown below:
STEP 1: Do the claims fall within one of the statutory categories? YES. Independent claims 1, 12 and 20 are directed to a method, a system, and a computer program product. 
STEP 2A (PRONG 1): Is the claim directed to a law of nature, a natural phenomenon or an abstract idea? YES, the claims are directed toward a mental process (i.e. abstract idea).
With regard to STEP 2A (PRONG 1), the guidelines provide three groupings of subject matter that are considered abstract ideas:
Mathematical concepts – mathematical relationships, mathematical formulas or equations, mathematical calculations;
Certain methods of organizing human activity – fundamental economic principles or practices (including hedging, insurance, mitigating risk); commercial or legal interactions (including agreements in the form of contracts; legal obligations; advertising, marketing or sales activities or behaviors; business relations); managing personal behavior or relationships or interactions between people (including social activities, teaching, and following rules or instructions); and
Mental processes – concepts that are practicably performed in the human mind (including an observation, evaluation, judgment, opinion).
Independent claims 1, 12 and 20 comprise a mental process that can be practicably performed in the human mind (or generic computers or components configured to perform the method) and, therefore, an abstract idea.
Regarding independent claim(s) 1: the limitations recite: 
A computer-implemented method of improving template matching in a document-image, the method comprising: 
merging, by one or more processors, a document comprising multiple pages into a single document image (mental process including observation and evaluation, and can be done mentally in the human mind and data gathering); 
processing, by the one or more processors, the single document image to identify structural elements and textual content comprising the structural elements (mental process including observation and evaluation, and can be done mentally in the human mind); 
comparing, by the one or more processors, the structural elements of the single document image to other structural elements of a group of document templates stored in a database and based on the comparing, identifying a subset of the group of documents templates with a threshold number of similarities to the single document image (mental process including observation and evaluation, and can be done mentally in the human mind); 
generating, by the one or more processors, from the single document image, a graph structure representing the document, wherein the graph structure comprises visual information and connections related to the structural elements and concepts comprising the textual content (mental process including observation and evaluation, and can be done mentally in the human mind); and 
identifying, by the one or more processors, based on comparing the graph structure to the subset of the group of documents templates, a document template that is a closest match to the document (mental process including observation and evaluation, and can be done mentally in the human mind). 
Regarding independent claim 12: the limitations recite: 
A computer system for improving template matching in a document-image, the computer system comprising: 
a memory (generic computer components); and 
one or more processors in communication with the memory, wherein the computer system is configured to perform a method, said method comprising (generic computer components): 
merging, by the one or more processors, a document comprising multiple pages into a single document image (mental process including observation and evaluation, and can be done mentally in the human mind and data gathering); 
processing, by the one or more processors, the single document image to identify structural elements and textual content comprising the structural elements (mental process including observation and evaluation, and can be done mentally in the human mind); 
comparing, by the one or more processors, the structural elements of the single document image to other structural elements of a group of document templates stored in a database and based on the comparing, identifying a subset of the group of documents templates with a threshold number of similarities to the single document image (mental process including observation and evaluation, and can be done mentally in the human mind); 
generating, by the one or more processors, from the single document image, a graph structure representing the document, wherein the graph structure comprises visual information and connections related to the structural elements and concepts comprising the textual content (mental process including observation and evaluation, and can be done mentally in the human mind); and 
identifying, by the one or more processors, based on comparing the graph structure to the subset of the group of documents templates, a document template that is a closest match to the document (mental process including observation and evaluation, and can be done mentally in the human mind). 
Regarding independent claim 20: the limitations recite: 
A computer program product for improving template matching in a document-image, the computer program product comprising: 
one or more computer readable storage media and program instructions collectively stored on the one or more computer readable storage media readable by at least one processing circuit to perform a method comprising (generic computer components): 
merging, by the one or more processors, a document comprising multiple pages into a single document image (mental process including observation and evaluation, and can be done mentally in the human mind and data gathering); 
processing, by the one or more processors, the document image to identify structural elements and textual content comprising the structural elements (mental process including observation and evaluation, and can be done mentally in the human mind); 
comparing, by the one or more processors, the structural elements of the single document image to structural elements of a group of document templates stored in a database and based on the comparing, identifying a subset of the group of documents templates with a threshold number of similarities to the single document image (mental process including observation and evaluation, and can be done mentally in the human mind); 
generating, by the one or more processors, from the single document image, a graph structure representing the document, wherein the graph structure comprises visual information and connections related to the structural elements and concepts comprising the textual content (mental process including observation and evaluation, and can be done mentally in the human mind); and 
identifying, by the one or more processors, based on comparing the graph structure to the subset of the group of documents templates, a document template that is a closest match to the document (mental process including observation and evaluation, and can be done mentally in the human mind).
These limitations, as drafted, describe a simple process that, under its broadest reasonable interpretation, covers performance of the limitations in the mind or by a human. The Examiner notes that under MPEP 2106.04(a)(2)(III), the courts consider a mental process (thinking) that “can be performed in the human mind, or by a human using a pen and paper” to be an abstract idea. CyberSource Corp. v. Retail Decisions, Inc., 654 F.3d 1366, 1372, 99 USPQ2d 1690, 1695 (Fed. Cir. 2011). As the Federal Circuit explained, “methods which can be performed mentally, or which are the equivalent of human mental work, are unpatentable abstract ideas--the ‘basic tools of scientific and technological work’ that are open to all.” 654 F.3d at 1371, 99 USPQ2d at 1694 (citing Gottschalk v. Benson, 409 U.S. 63, 175 USPQ 673 (1972)). See also Mayo Collaborative Servs. v. Prometheus Labs. Inc., 566 U.S. 66, 71, 101 USPQ2d 1961, 1965 (2012) (“‘[M]ental processes[] and abstract intellectual concepts are not patentable, as they are the basic tools of scientific and technological work’” (quoting Benson, 409 U.S. at 67, 175 USPQ at 675)); Parker v. Flook, 437 U.S. 584, 589, 198 USPQ 193, 197 (1978) (same). 
As such, a person could mentally treat multiple pages as a single page by laying them vertically above each other, then visually compare the pages to a group of other pages and filter them by color, headers, etc. to narrow down a group of documents. A person could then create a graph structure for each header or text in the document, such as each section being a node and the edges representing the connection to adjacent sections, and compare that graph to another document's graph with the same graph structure to find the best match. The mere nominal recitation that the various steps are being executed by a processor, system, or program product does not take the limitations out of the mental process grouping. Thus, the claims recite a mental process.
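For readers who want a concrete picture of the graph the Examiner describes, the following is a minimal sketch under stated assumptions, not the claimed method or anything taken from the application. It treats each section heading as a node and each pair of adjacent sections as an edge. The names `section_graph`, `edge_overlap`, and `best_match`, and the use of Jaccard overlap as the comparison, are hypothetical choices made only for illustration.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]


def section_graph(headings: List[str]) -> Set[Edge]:
    """Build edges between adjacent sections (hypothetical representation)."""
    return set(zip(headings, headings[1:]))


def edge_overlap(a: Set[Edge], b: Set[Edge]) -> float:
    """Jaccard overlap of two edge sets; an assumed comparison measure."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def best_match(doc_headings: List[str], templates: Dict[str, List[str]]) -> str:
    """Return the template name whose section graph overlaps most with the document's."""
    doc_graph = section_graph(doc_headings)
    return max(
        templates,
        key=lambda name: edge_overlap(doc_graph, section_graph(templates[name])),
    )


if __name__ == "__main__":
    doc = ["Title", "Parties", "Terms", "Signatures"]
    templates = {
        "contract": ["Title", "Parties", "Terms", "Signatures"],
        "invoice": ["Title", "Items", "Total"],
    }
    print(best_match(doc, templates))
```

In this toy example the document shares every adjacency edge with the "contract" template and none with "invoice", so "contract" is selected.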

STEP 2A (PRONG 2): Does the claim recite additional elements that integrate the judicial exception into a practical application? NO, the claims do not recite additional elements that integrate the judicial exception into a practical application.
With regard to STEP 2A (prong 2), whether the claim recites additional elements that integrate the judicial exception into a practical application, the guidelines provide the following exemplary considerations that are indicative that an additional element (or combination of elements) may have integrated the judicial exception into a practical application:
an additional element reflects an improvement in the functioning of a computer, or an improvement to other technology or technical field;
an additional element that applies or uses a judicial exception to affect a particular treatment or prophylaxis for a disease or medical condition; 
an additional element implements a judicial exception with, or uses a judicial exception in conjunction with, a particular machine or manufacture that is integral to the claim;
an additional element effects a transformation or reduction of a particular article to a different state or thing; and
an additional element applies or uses the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort designed to monopolize the exception.
While the guidelines further state that the exemplary considerations are not an exhaustive list and that there may be other examples of integrating the exception into a practical application, the guidelines also list examples in which a judicial exception has not been integrated into a practical application:
an additional element merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea; 
an additional element adds insignificant extra-solution activity to the judicial exception; and 
an additional element does no more than generally link the use of a judicial exception to a particular technological environment or field of use.
Independent claims 1, 12 and 20 do not recite any of the exemplary considerations that are indicative of an abstract idea having been integrated into a practical application. Independent claims 1, 12 and 20 recite generic computer components (for example, a system, memory, processor, and computer program product) and/or insignificant pre/post-solution extra activity, which do not add a meaningful limitation to the abstract idea because they amount to simply implementing the abstract idea in a method.
These limitations are recited at a high level of generality (i.e., as a general action or change being taken based on the results of the acquiring step) and amount to mere post-solution actions, which are a form of insignificant extra-solution activity. Further, the additional elements are claimed generically and operate in their ordinary capacity, such that they do not use the judicial exception in a manner that imposes a meaningful limit on the judicial exception. Accordingly, even in combination, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. 
STEP 2B: Does the claim recite additional elements that amount to significantly more than the judicial exception? No, the claims do not recite additional elements that amount to significantly more than the judicial exception.
With regard to STEP 2B, whether the claims recite additional elements that provide significantly more than the recited judicial exception, the guidelines specify that the pre-guideline procedure is still in effect. Specifically, that examiners should continue to consider whether an additional element or combination of elements:
adds a specific limitation or combination of limitations that are not well-understood, routine, conventional activity in the field, which is indicative that an inventive concept may be present; or 
simply appends well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception, which is indicative that an inventive concept may not be present.
Independent claims 1, 12 and 20 do not recite any additional elements that are not well-understood, routine or conventional. The use of generic computer elements is a routine, well-understood and conventional process that is performed by computers.
Thus, since independent claims 1, 12 and 20 (a) are directed toward an abstract idea, (b) do not recite additional elements that integrate the judicial exception into a practical application, and (c) do not recite additional elements that amount to significantly more than the judicial exception, it is clear that independent claims 1, 12 and 20 are not eligible subject matter under 35 U.S.C. 101.
Regarding claims 2–7, 9, 10 and 13–18: the additional limitations do not integrate the mental process into a practical application or add significantly more to the mental process. The limitations are mental processes including observation and evaluation, and can be done mentally in the human mind.
Regarding claims 8, 11 and 19: the additional limitations do integrate the mental process into a practical application or add significantly more to the mental process. 
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or non-obviousness.
Claim(s) 1–4, 10, 12–15 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Rastogi et al. (US 20220284215 A1, hereafter, "Rastogi") in view of Uppal et al. (US 20210019512 A1, hereafter, "Uppal") and further in view of Rings et al. (US 20210042343 A1, hereafter, "Rings").
Regarding claim 1, Rastogi teaches a computer-implemented method of improving template matching in a document-image (See Rastogi, [Abstract], This disclosure relates to a method and system for extracting information from images of one or more templatized documents), the method comprising: 
[merging, by one or more processors, a document comprising multiple pages into a single document image];
processing, by the one or more processors, the single document image to identify structural elements and textual content comprising the structural elements (See Rastogi, ¶ [0041], At step 206 of the method 200, the one or more hardware processors 104 of the system 100 are configured to detect (i) one or more text lines and one or more words present in each text line, for each visual element of the one or more visual elements present in the pre-processed image of templatized document, and (ii) one or more spatial elements for each word of the one or more words and one or more spatial elements for each text line of the one or more text lines, from the pre-processed image of templatized document obtained at step 206 of the method 200);
[comparing, by the one or more processors, the structural elements of the single document image to other structural elements of a group of document templates stored in a database and based on the comparing, identifying a subset of the group of documents templates with a threshold number of similarities to the single document image];
generating, by the one or more processors, from the single document image, a graph structure representing the document, wherein the graph structure comprises visual information and connections related to the structural elements and concepts comprising the textual content (See Rastogi, ¶ [0047],  At step 208 of the method 200, the one or more hardware processors 104 of the system 100 are configured to generate a knowledge graph for the pre-processed image of templatized document, based on a predefined knowledge graph schema, using (i) the identified one or more text lines and the one or more words present in each text line, for each visual element of the one or more visual elements, and (ii) the spatial elements of each word of the one or more words present in each text line and the spatial elements of each text line of the one or more text lines, present in the corresponding visual element); and
identifying, by the one or more processors, based on comparing the graph structure to the subset of the group of documents templates, a document template that is a closest match to the document (See Rastogi, ¶ [0054], Once the updated knowledge graph for the pre-processed image of templatized document against each initial closest document template, is generated, the layout structure similarity metric for the image of templatized document against each initial closest document template, is computed based on the (i) updated knowledge graph of the pre-processed image of templatized document, and (ii) the knowledge graph of the associated initial closest document template, using a lattice based approach which works based on the formal concept analysis (FCA)).
However, Rastogi fails to teach merging, by one or more processors, a document comprising multiple pages into a single document image; comparing, by the one or more processors, the structural elements of the single document image to other structural elements of a group of document templates stored in a database and based on the comparing, identifying a subset of the group of documents templates with a threshold number of similarities to the single document image.
Uppal, working in the same field of endeavor, teaches: merging, by one or more processors, a document comprising multiple pages into a single document image (See Uppal, ¶ [0021], In an embodiment, the merging module 208 may merge the pages that may belong to the same document. Considering the example above, the merging module 208 may merge pages 1-3 to form the first document, 4-5 to form the second document and so on).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Rastogi’s method to include merging, by one or more processors, a document comprising multiple pages into a single document image, based on the method of Uppal. The suggestion/motivation would have been to accurately process multipage documents for specific purposes and to process specific documents all at once (See Uppal, ¶ [0004, 0027]).
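The merging step discussed above can be pictured with a minimal sketch, assuming pages are grayscale images held as lists of pixel rows. This is not Uppal's merging module 208 or the applicant's implementation; the names `merge_pages` and `group_and_merge`, the page-range format, and the white padding value are all hypothetical. Pages are stacked top to bottom, and narrower pages are padded on the right so the merged image is rectangular.

```python
from typing import List, Tuple

Page = List[List[int]]  # hypothetical grayscale page: rows of pixel values


def merge_pages(pages: List[Page], pad_value: int = 255) -> Page:
    """Stack pages vertically into one image, right-padding narrower rows."""
    width = max((len(row) for page in pages for row in page), default=0)
    merged: Page = []
    for page in pages:
        for row in page:
            merged.append(row + [pad_value] * (width - len(row)))
    return merged


def group_and_merge(pages: List[Page], ranges: List[Tuple[int, int]]) -> List[Page]:
    """Merge each half-open page range [start, end) into its own document image."""
    return [merge_pages(pages[start:end]) for start, end in ranges]


if __name__ == "__main__":
    scanned = [[[0, 0]], [[1, 1, 1]], [[2]], [[3, 3]], [[4]]]
    # Pages 1-3 form the first document and pages 4-5 the second,
    # mirroring the example quoted from Uppal.
    documents = group_and_merge(scanned, [(0, 3), (3, 5)])
    for image in documents:
        print(image)
```

How the page ranges are determined (that is, which pages belong to the same document) is outside this sketch and is assumed to be given.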
However, Rastogi and Uppal fail to teach comparing, by the one or more processors, the structural elements of the single document image to other structural elements of a group of document templates stored in a database and based on the comparing, identifying a subset of the group of documents templates with a threshold number of similarities to the single document image.
Rings, working in the same field of endeavor, teaches: comparing, by the one or more processors, the structural elements of the single document image to other structural elements of a group of document templates stored in a database and based on the comparing, identifying a subset of the group of documents templates with a threshold number of similarities to the single document image (See Rings, ¶ [0036], The processing device may identify one classified notification document of the corpus as having a cosine similarity rating that exceeds a predetermined similarity threshold as a match to the received notification document (260), …, In an example, the foregoing predetermined threshold values may be changed to allow for use of more general document response templates by identifying a greater number of documents as being similar).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Rastogi’s method to include comparing, by the one or more processors, the structural elements of the single document image to other structural elements of a group of document templates stored in a database and based on the comparing, identifying a subset of the group of documents templates with a threshold number of similarities to the single document image, based on the method of Rings. The suggestion/motivation would have been to process large volumes of documents, automate the process, and find similar templates (See Rings, ¶ [0002–0004]).
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Rings and Uppal with Rastogi to obtain the invention as specified in claim 1.
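As an illustration of the threshold-based filtering attributed to Rings above, the following is a minimal sketch assuming each document and template is summarized as a bag of structural-element counts. It is not Rings's implementation; the feature names, the function names `cosine_similarity` and `shortlist_templates`, and the default threshold of 0.8 are hypothetical. A template is kept only when its cosine similarity to the document exceeds the threshold, and lowering the threshold admits more templates, consistent with the quoted passage.

```python
import math
from typing import Dict, List

Features = Dict[str, float]  # hypothetical: counts of structural elements by type


def cosine_similarity(a: Features, b: Features) -> float:
    """Cosine similarity between two sparse feature vectors."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def shortlist_templates(
    doc: Features, templates: Dict[str, Features], threshold: float = 0.8
) -> List[str]:
    """Return names of templates whose similarity to the document exceeds the threshold."""
    return [
        name
        for name, features in templates.items()
        if cosine_similarity(doc, features) > threshold
    ]


if __name__ == "__main__":
    doc = {"table": 2, "heading": 3}
    templates = {
        "invoice": {"table": 2, "heading": 3},
        "report": {"heading": 1, "paragraph": 4},
        "letter": {"paragraph": 5},
    }
    print(shortlist_templates(doc, templates))
    print(shortlist_templates(doc, templates, threshold=0.1))
```

The shortlisted subset would then be passed to a more expensive comparison, such as the graph-based matching discussed for Rastogi.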
Regarding claim 2, Rastogi teaches the computer-implemented method of claim 1, further comprising: 
generating, by the one or more processors, from the group of document templates stored in the database, a graph structure for each template, wherein the identifying comprises comparing the graph structure for each template in the subset of the group of document templates to the graph structure (See Rastogi, ¶ [0050], At step 210 of the method 200, the one or more hardware processors 104 of the system 100 are configured to detect a closest document template for the each image of the one or more templatized documents received at step 202 of the method 200, out of the plurality of document templates present in the document template dataset, based on a document similarity metric, using the knowledge graph of the pre-processed image of templatized document, obtained at step 206 of the method 200, and the knowledge graph of each document template of the plurality of document templates received at step 202 of the method 200).
Regarding claim 3, Rastogi teaches the computer-implemented method of claim 1, wherein processing the document image to identify the structural elements and the textual content comprising the structural elements comprises performing optical character recognition to identify text and block segments and layout types (See Rastogi, ¶ [0042], In one embodiment, the one or more text lines and the one or more words present in each text line, and associated spatial elements, of each visual element present in the pre-processed image of templatized document, may be detected by using a corresponding vision tool present in a set of vision tools. In another embodiment, the set of vision tools of the system are the optical character recognition (OCR) tools and includes a text detection tool, a table detection and tabular structure identification tool, a drawing information extraction tool and a visual cues tool).
Regarding claim 4, Rastogi teaches the computer-implemented method of claim 1, wherein the structural elements of the single document image utilized in the comparing are selected from the group (See Rastogi, ¶ [0047], At step 208 of the method 200, the one or more hardware processors 104 of the system 100 are configured to generate a knowledge graph for the pre-processed image of templatized document, based on a predefined knowledge graph schema, using (i) the identified one or more text lines and the one or more words present in each text line, for each visual element of the one or more visual elements, and (ii) the spatial elements of each word of the one or more words present in each text line and the spatial elements of each text line of the one or more text lines, present in the corresponding visual element. Note: Examiner is interpreting the block type as the visual cues) consisting of: an image hashing of the single document image, a block type (See Rastogi, ¶ [0046], The visual cues tool is used to detect font styles, lines, strokes, text structure, and so on associated with each textual element present in the one or more words, and the one or more text lines present in each visual element of the one or more visual elements of the pre-processed image of templatized document. In an embodiment, a combination of: (i) pre-trained deep neural model and (ii) a traditional vision may be used to obtain the visual cues tool), a quantity of the block type, a title of the document, and a heading of the document.
Regarding claim 10, Rastogi teaches the computer-implemented method of claim 1, wherein the document template that is the closest match to the document is the closest match based on visual similarities between the single document image and the document template that is the closest match and content similarities between the single document image and the document template that is the closest match (See Rastogi, ¶ [0051], The document similarity metric includes a textual similarity metric and a layout structure similarity metric. The textual similarity metric calculates a textual similarity for the image of templatized document, between (i) the pre-processed image of templatized document, and (ii) each document template of the plurality of document templates. In detail, the textual similarity metric for the image of the templatized document is calculated based on the number of matching entities present in (i) the knowledge graph of the pre-processed image of templatized document, and (ii) the knowledge graph of each document template of the plurality of document templates. The one or more document templates having a maximum textual similarity metric are chosen as one or more initial closest document templates for the image of templatized document).
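For illustration only, the textual similarity selection quoted from Rastogi ¶ [0051] (count the matching entities between two knowledge graphs, then keep the templates with the maximum count) might be sketched as follows. This is a minimal sketch under stated assumptions, not Rastogi's implementation: each knowledge graph is reduced to a plain set of entity strings, and the function names `textual_similarity` and `initial_closest_templates` are hypothetical.

```python
def textual_similarity(doc_entities, template_entities):
    """Count the entities shared by the document and one template.

    Assumption: each knowledge graph is flattened to a set of entity strings.
    """
    return len(set(doc_entities) & set(template_entities))


def initial_closest_templates(doc_entities, templates):
    """Return the names of all templates tied for the maximum similarity.

    `templates` maps a template name to its collection of entities.
    """
    if not templates:
        return []
    scores = {
        name: textual_similarity(doc_entities, entities)
        for name, entities in templates.items()
    }
    best = max(scores.values())
    return sorted(name for name, score in scores.items() if score == best)
```

Returning every template tied for the maximum mirrors the quoted phrase "one or more initial closest document templates"; a layout structure similarity metric, which the same paragraph mentions, would then be needed to break ties.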
Regarding claim 12, claim 12 is rejected on the same grounds as claim 1; the arguments presented above for claim 1 apply equally to claim 12, and the limitations similar to those of claim 1 are not repeated herein but are incorporated by reference. Furthermore, Rastogi teaches a computer system for improving template matching in a document-image, the computer system comprising: a memory; and one or more processors in communication with the memory, wherein the computer system is configured to perform a method, said method comprising (See Rastogi, [FIG. 1], Memory 102, Hardware processor(s) 104).
Regarding claim 13, claim 13 is rejected on the same grounds as claim 2; the arguments presented above for claim 2 apply equally to claim 13, and the limitations similar to those of claim 2 are not repeated herein but are incorporated by reference.
Regarding claim 14, claim 14 is rejected on the same grounds as claim 3; the arguments presented above for claim 3 apply equally to claim 14, and the limitations similar to those of claim 3 are not repeated herein but are incorporated by reference.
Regarding claim 15, claim 15 is rejected on the same grounds as claim 4; the arguments presented above for claim 4 apply equally to claim 15, and the limitations similar to those of claim 4 are not repeated herein but are incorporated by reference.
Regarding claim 20, claim 20 is rejected on the same grounds as claim 1; the arguments presented above for claim 1 apply equally to claim 20, and the limitations similar to those of claim 1 are not repeated herein but are incorporated by reference. Furthermore, Rastogi teaches a computer program product for improving template matching in a document-image, the computer program product comprising: one or more computer readable storage media and program instructions collectively stored on the one or more computer readable storage media readable by at least one processing circuit to perform a method comprising (See Rastogi, [FIG. 1], Memory 102, Hardware processor(s) 104).
Claim(s) 5 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Rastogi et al. (US 20220284215 A1, hereafter, "Rastogi") in view of Uppal et al. (US 20210019512 A1, hereafter, "Uppal") further in view of Rings et al. (US 20210042343 A1, hereafter, "Rings") and further in view of Dennis et al. (US 20220261144 A1, hereafter, "Dennis").
Regarding claim 5, Rastogi in view of Uppal and further in view of Rings teaches the computer-implemented method of claim 1, wherein generating the graph structure representing the document (See Rastogi, ¶ [0021], A knowledge graph with a fixed schema based on background knowledge is used to capture spatial and semantic relationships of the entities present in the scanned document image), comprises:
processing, by the one or more processors, layout blocks comprising the single document image (See Rastogi, ¶ [0042], In one embodiment, the one or more text lines and the one or more words present in each text line, and associated spatial elements, of each visual element present in the pre-processed image of templatized document, may be detected by using a corresponding vision tool present in a set of vision tools. In another embodiment, the set of vision tools of the system are the optical character recognition (OCR) tools and includes a text detection tool, a table detection and tabular structure identification tool, a drawing information extraction tool and a visual cues tool);
[redacting, by the one or more processors, specific types of text based on pre-defined business rules; and
generating, by the one or more processors, the graph structure, wherein the graph structure does not comprise the redacted text].
However, Rastogi, Uppal and Rings fail to teach redacting, by the one or more processors, specific types of text based on pre-defined business rules; and generating, by the one or more processors, the graph structure, wherein the graph structure does not comprise the redacted text.
Dennis, working in the same field of endeavor, teaches: redacting, by the one or more processors, specific types of text based on pre-defined business rules (See Dennis, ¶ [0035], In an embodiment, a redacted graph may be generated based on filtering portions of a graph 102 in a series of redaction stages 300. The series of redaction stages 300 may be ordered in such a manner that each successive redaction stage 300 filters portions of the graph 102 at an increasingly finer level of granularity. For example, in FIG. 3, the series of redaction stages 300 may be designed so that redaction criteria 302 progress from broad categories to specific sub-categories. Note: Examiner is interpreting the pre-defined business rules as the redaction criteria); and
generating, by the one or more processors, the graph structure, wherein the graph structure does not comprise the redacted text (See Dennis, ¶ [0035], In an embodiment, a redacted graph may be generated based on filtering portions of a graph 102 in a series of redaction stages 300. The series of redaction stages 300 may be ordered in such a manner that each successive redaction stage 300 filters portions of the graph 102 at an increasingly finer level of granularity. For example, in FIG. 3, the series of redaction stages 300 may be designed so that redaction criteria 302 progress from broad categories to specific sub-categories).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Rastogi’s reference to include redacting, by the one or more processors, specific types of text based on pre-defined business rules; and generating, by the one or more processors, the graph structure, wherein the graph structure does not comprise the redacted text, based on the method of Dennis’s reference. The suggestion/motivation would have been to accurately prevent the sharing of certain information and remove certain information from consideration (See Dennis, ¶ [0002–0004]).
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Dennis with Rastogi, Uppal and Rings to obtain the invention as specified in claim 5.
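For illustration only, the staged redaction quoted from Dennis ¶ [0035] (successive stages that filter a graph from broad categories down to specific sub-categories) might be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of Dennis or of the application: the graph is a dictionary of node attributes plus an edge list, each redaction criterion is a predicate, and the names `redact_graph`, `category`, and `subcategory` are hypothetical.

```python
def redact_graph(nodes, edges, stages):
    """Apply redaction stages in order and return the redacted graph.

    nodes:  dict mapping node id -> attribute dict (hypothetical schema).
    edges:  list of (source id, target id) pairs.
    stages: list of predicates ordered from broad to fine; a node is
            dropped as soon as any stage's predicate returns True for it.
    """
    kept = dict(nodes)
    for should_redact in stages:
        kept = {nid: attrs for nid, attrs in kept.items() if not should_redact(attrs)}
    # Drop edges that touch a redacted node so the graph stays consistent.
    kept_edges = [(a, b) for a, b in edges if a in kept and b in kept]
    return kept, kept_edges
```

A usage example under the same assumptions: a first stage removes every node in a broad category, and a second stage removes one specific sub-category among the survivors.

```python
nodes = {
    1: {"category": "pii", "subcategory": "ssn"},
    2: {"category": "heading", "subcategory": "title"},
    3: {"category": "body", "subcategory": "account_number"},
    4: {"category": "body", "subcategory": "paragraph"},
}
edges = [(1, 2), (2, 3), (2, 4)]
stages = [
    lambda attrs: attrs["category"] == "pii",                # broad stage
    lambda attrs: attrs["subcategory"] == "account_number",  # finer stage
]
redacted_nodes, redacted_edges = redact_graph(nodes, edges, stages)
```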
Regarding claim 16, claim 16 is rejected on the same grounds as claim 5; the arguments presented above for claim 5 apply equally to claim 16, and the limitations similar to those of claim 5 are not repeated herein but are incorporated by reference.
Claim(s) 6–8, 11 and 17–19 are rejected under 35 U.S.C. 103 as being unpatentable over Rastogi et al. (US 20220284215 A1, hereafter, "Rastogi") in view of Uppal et al. (US 20210019512 A1, hereafter, "Uppal") further in view of Rings et al. (US 20210042343 A1, hereafter, "Rings") and further in view of Wheaton et al. (US 20210110527 A1, hereafter, "Wheaton").
Regarding claim 6, Rastogi in view of Uppal and further in view of Rings teaches the computer-implemented method of claim 5, wherein processing the layout blocks comprises:
recognizing, by the one or more processors, named entities comprising the layout blocks (See Rastogi, ¶ [0041], At step 206 of the method 200, the one or more hardware processors 104 of the system 100 are configured to detect (i) one or more text lines and one or more words present in each text line, for each visual element of the one or more visual elements present in the pre-processed image of templatized document. Note: Examiner is interpreting the detection of text as recognizing named entities);
extracting, by the one or more processors, concepts in text comprising the layout blocks (See Rastogi, ¶ [0041],  In another embodiment, the set of vision tools of the system are the optical character recognition (OCR) tools and includes a text detection tool, a table detection and tabular structure identification tool, a drawing information extraction tool and a visual cues tool. Note: Examiner is interpreting the OCR of the text as extracting the concept);
connecting, by the one or more processors, the layout blocks based on position of each layout block in the single document image (See Rastogi, ¶ [0047], (ii) the spatial elements of each word of the one or more words present in each text line and the spatial elements of each text line of the one or more text lines, present in the corresponding visual element. Note: Examiner is interpreting the spatial elements as connecting the layout block or text blocks); and
[calculating, by the one or more processors, fingerprints for the layout blocks].
However, Rastogi, Uppal and Rings fail to teach calculating, by the one or more processors, fingerprints for the layout blocks.
Wheaton, working in the same field of endeavor, teaches: calculating, by the one or more processors, fingerprints for the layout blocks (See Wheaton, ¶ [0231], Further data contextualizer 1306 may apply an image hash function (e.g., dhash, phash, or whash) to each of the document structures to generate a collection of image hashes).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Rastogi’s reference to include calculating, by the one or more processors, fingerprints for the layout blocks, based on the method of Wheaton’s reference. The suggestion/motivation would have been to automatically and accurately represent the structure of the image for more accurate processing (See Wheaton, ¶ [0005, 0088]).
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Wheaton with Rastogi, Uppal and Rings to obtain the invention as specified in claim 6.
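For illustration only, one of the image hash functions named in the quoted Wheaton passage (dhash, a difference hash) might be sketched as follows. This is a minimal sketch, not Wheaton's implementation: it assumes each layout block has already been converted to grayscale and resized so that every row holds one more pixel than the number of bits wanted per row, and the function names `dhash_bits` and `bits_to_int` are hypothetical.

```python
def dhash_bits(gray):
    """Compute difference-hash bits for one layout block.

    gray: list of rows of grayscale intensities, already downscaled
          (assumption: resizing happens elsewhere). Each bit records
          whether a pixel is brighter than its right-hand neighbour.
    """
    bits = []
    for row in gray:
        for left, right in zip(row, row[1:]):
            bits.append(1 if left > right else 0)
    return bits


def bits_to_int(bits):
    """Pack a list of bits into an integer fingerprint, first bit most significant."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value
```

Because each bit depends only on relative brightness between neighbours, the fingerprint tends to stay stable under uniform brightness changes, which is one reason difference hashes are commonly used for near-duplicate image detection.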
Regarding claim 7, Rastogi in view of Uppal further in view of Rings and further in view of Wheaton teaches the computer-implemented method of claim 6, wherein the visual information of the graph structure (See Rastogi, ¶ [0047], At step 208 of the method 200, the one or more hardware processors 104 of the system 100 are configured to generate a knowledge graph for the pre-processed image of templatized document, based on a predefined knowledge graph schema, using (i) the identified one or more text lines and the one or more words present in each text line, for each visual element of the one or more visual elements, and (ii) the spatial elements of each word of the one or more words present in each text line and the spatial elements of each text line of the one or more text lines, present in the corresponding visual element) [comprises fingerprints of the blocks], wherein the connections related to the structural elements comprise the connections of the layout blocks based on the positions (See Rastogi, ¶ [0047], (ii) the spatial elements of each word of the one or more words present in each text line and the spatial elements of each text line of the one or more text lines, present in the corresponding visual element. Note: Examiner is interpreting the spatial elements as connecting the layout block or text blocks), and the concepts comprise the extracted concept in the text comprising the layout blocks (See Rastogi, ¶ [0041],  In another embodiment, the set of vision tools of the system are the optical character recognition (OCR) tools and includes a text detection tool, a table detection and tabular structure identification tool, a drawing information extraction tool and a visual cues tool. Note: Examiner is interpreting the OCR of the text as extracting the concept).
However, Rastogi, Uppal and Rings fail to teach that the visual information of the graph structure comprises fingerprints of the blocks.
Wheaton, working in the same field of endeavor, teaches: comprises fingerprints of the blocks (See Wheaton, ¶ [0231], Further data contextualizer 1306 may apply an image hash function (e.g., dhash, phash, or whash) to each of the document structures to generate a collection of image hashes).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Rastogi’s reference so that the visual information of the graph structure comprises fingerprints of the blocks, based on the method of Wheaton’s reference. The suggestion/motivation would have been to automatically and accurately represent the structure of the image for more accurate processing (See Wheaton, ¶ [0005, 0088]).
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Wheaton with Rastogi, Uppal and Rings to obtain the invention as specified in claim 7.
Regarding claim 8, Rastogi in view of Uppal further in view of Rings and further in view of Wheaton teaches the computer-implemented method of claim 7, [wherein the fingerprints comprise image hashing of the layout blocks].
However, Rastogi, Uppal and Rings fail to teach wherein the fingerprints comprise image hashing of the layout blocks.
Wheaton, working in the same field of endeavor, teaches: wherein the fingerprints comprise image hashing of the layout blocks (See Wheaton, ¶ [0231], Further data contextualizer 1306 may apply an image hash function (e.g., dhash, phash, or whash) to each of the document structures to generate a collection of image hashes).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Rastogi’s reference such that the fingerprints comprise image hashing of the layout blocks, based on the method of Wheaton’s reference. The suggestion/motivation would have been to automatically and accurately represent the structure of the image for more accurate processing (See Wheaton, ¶ [0005, 0088]).
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Wheaton with Rastogi, Uppal and Rings to obtain the invention as specified in claim 8.
Regarding claim 11, Rastogi in view of Uppal further in view of Rings and further in view of Wheaton teaches the computer-implemented method of claim 1, wherein comparing the structural elements of the single document image to other structural elements of the group of document templates stored in a database (See Rastogi, ¶ [0050], At step 210 of the method 200, the one or more hardware processors 104 of the system 100 are configured to detect a closest document template for the each image of the one or more templatized documents received at step 202 of the method 200, out of the plurality of document templates present in the document template dataset, based on a document similarity metric, using the knowledge graph of the pre-processed image of templatized document, obtained at step 206 of the method 200, and the knowledge graph of each document template of the plurality of document templates received at step 202 of the method 200) comprises [utilizing an image hash algorithm to determine a distance between the single document image and each template of the group of document templates].
However, Rastogi, Uppal and Rings fail to teach utilizing an image hash algorithm to determine a distance between the single document image and each template of the group of document templates.
Wheaton, working in the same field of endeavor, teaches: utilizing an image hash algorithm to determine a distance between the single document image and each template of the group of document templates (See Wheaton, ¶ [0231], In many embodiments, a hash may be compared to all (or a subset) of the hashes from other documents, and the hash with the lowest hamming distance may be used. In many such embodiments, this may be available as hashes take up very little storage (in memory) and computing the hamming distance is not computationally intensive).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Rastogi’s reference to include utilizing an image hash algorithm to determine a distance between the single document image and each template of the group of document templates, based on the method of Wheaton’s reference. The suggestion/motivation would have been to automatically and accurately represent the structure of the image for more accurate processing (See Wheaton, ¶ [0005, 0088]).
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Wheaton with Rastogi, Uppal and Rings to obtain the invention as specified in claim 11.
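For illustration only, the hash comparison quoted from Wheaton ¶ [0231] (compare one hash against the hashes of other documents and use the one with the lowest Hamming distance) might be sketched as follows. This is a minimal sketch under stated assumptions: hashes are stored as Python integers of equal bit length, and the function names `hamming_distance` and `closest_template` are hypothetical.

```python
def hamming_distance(a, b):
    """Number of bit positions at which two integer hashes differ."""
    return bin(a ^ b).count("1")


def closest_template(doc_hash, template_hashes):
    """Return (template name, distance) for the lowest Hamming distance.

    template_hashes: dict mapping template name -> integer hash.
    Returns None when no templates are supplied.
    """
    if not template_hashes:
        return None
    name, template_hash = min(
        template_hashes.items(),
        key=lambda item: hamming_distance(doc_hash, item[1]),
    )
    return name, hamming_distance(doc_hash, template_hash)
```

The XOR-and-count approach reflects the quoted observation that computing the Hamming distance is not computationally intensive: each comparison is a single integer operation followed by a bit count.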
Regarding claim 17, claim 17 is rejected on the same grounds as claim 6; the arguments presented above for claim 6 apply equally to claim 17, and the limitations similar to those of claim 6 are not repeated herein but are incorporated by reference.
Regarding claim 18, claim 18 is rejected on the same grounds as claim 7; the arguments presented above for claim 7 apply equally to claim 18, and the limitations similar to those of claim 7 are not repeated herein but are incorporated by reference.
Regarding claim 19, claim 19 is rejected on the same grounds as claim 8; the arguments presented above for claim 8 apply equally to claim 19, and the limitations similar to those of claim 8 are not repeated herein but are incorporated by reference.
Claim(s) 9 is rejected under 35 U.S.C. 103 as being unpatentable over Rastogi et al. (US 20220284215 A1, hereafter, "Rastogi") in view of Uppal et al. (US 20210019512 A1, hereafter, "Uppal") further in view of Rings et al. (US 20210042343 A1, hereafter, "Rings") and further in view of Sanderson (US 20200272788 A1, hereafter, "Sanderson").
Regarding claim 9, Rastogi in view of Uppal further in view of Rings and further in view of Sanderson teaches the computer-implemented method of claim 1, wherein comparing the structural elements of the single document image to other structural elements of the group of document templates stored in the database (See Rastogi, ¶ [0050], At step 210 of the method 200, the one or more hardware processors 104 of the system 100 are configured to detect a closest document template for the each image of the one or more templatized documents received at step 202 of the method 200, out of the plurality of document templates present in the document template dataset, based on a document similarity metric, using the knowledge graph of the pre-processed image of templatized document, obtained at step 206 of the method 200, and the knowledge graph of each document template of the plurality of document templates received at step 202 of the method 200) comprises [simultaneously comparing the structural elements of the single document image to at least structural elements of at least two document templates of the group of document templates].
However, Rastogi, Uppal and Rings fail to teach simultaneously comparing the structural elements of the single document image to at least structural elements of at least two document templates of the group of document templates.
Sanderson, working in the same field of endeavor, teaches: simultaneously comparing the structural elements of the single document image to at least structural elements of at least two document templates of the group of document templates (See Sanderson, ¶ [0004], In particular, the present disclosure provides for comparison of any given version of the document against a previous version (as compared to, say, an original version), such as the immediately preceding version of the document, and further provides for dynamically generating a text box that displays the differences between the given version and one or more of the previous versions of the document in sequential order. For example, simultaneously comparing Document 4 against 3, 3 against 2 and 2 against 1).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Rastogi’s reference to include simultaneously comparing the structural elements of the single document image to at least structural elements of at least two document templates of the group of document templates, based on the method of Sanderson’s reference. The suggestion/motivation would have been to decrease the processing time of comparing documents and process more documents (See Sanderson, ¶ [0004–0006]).
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Sanderson with Rastogi, Uppal and Rings to obtain the invention as specified in claim 9.
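For illustration only, comparing one document's structural elements against at least two templates at the same time might be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce the method of Sanderson, Rastogi, or the application: structural elements are reduced to sets of labels, Jaccard similarity is swapped in as a stand-in comparison, a thread pool stands in for whatever concurrency mechanism a real system would use, and the names `jaccard` and `compare_concurrently` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def jaccard(a, b):
    """Jaccard similarity of two collections of structural-element labels."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def compare_concurrently(doc_elements, templates, max_workers=4):
    """Submit one comparison per template to a thread pool and gather scores.

    templates: dict mapping template name -> collection of element labels.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            name: pool.submit(jaccard, doc_elements, elements)
            for name, elements in templates.items()
        }
        return {name: future.result() for name, future in futures.items()}
```

In CPython, threads interleave rather than run CPU-bound work in parallel, so a process pool or vectorized comparison would likely be a better fit where the comparison itself is expensive.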
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Pasterk et al. (US 20210075788 A1) teaches computer systems and methods are provided for determining authenticity of a document. A computer system receives first image data from a remote third party. The first image data includes a first image of a document. The computer system generates an image hash based on at least a portion of the received image data. The computer system compares the generated image hash with a stored image hash to determine whether the generated image hash meets image matching criteria. The stored image hash is associated with a first predetermined authentication decision. In accordance with a determination that the generated image hash meets the image matching criteria, the computer system transmits the first predetermined authentication decision to the remote third party.
Mukhopadhyay et al. (US 20200074169 A1) teaches a system and method for extracting structured information from image documents is disclosed. An input image document is obtained, and the input image document may be analyzed to determine a skeletal layout of information included in the input image document. A measure of similarity between the determined skeletal layout and each of the document templates may be determined. A document template may be selected as a matched template, based on the determined measure of similarity. Box areas from the input image document may be cropped out, and optical character recognition (OCR) may be performed on the box areas. Obtained recognized text may be automatically processed using directed search to correct errors made by the OCR. Statistical language modeling may be used to classify the input image document into a classification category, and the classified input image document may be processed according to the classification category.
Agrawal et al. (US 9235758 B1) teaches techniques for comparing documents may be provided. For example, a comparison between layouts of the documents may be performed. The comparison may include segmenting the documents into blocks, where an arrangement of blocks of a document represents a layout of the document. Once segmented, similarity metrics, such as distances, between blocks of one document and blocks of the other document may be computed. The similarity metrics may be used to match the blocks between the documents. Further, the similarity metrics between the matched blocks may be added to determine an overall similarity metric between the documents. This overall similarity metric may indicate how similar the documents may be.
Hsu et al. (See NPL attached, “Neural Graph Matching for Modification Similarity Applied to Electronic Document Comparison”) teaches In this paper, we present a novel neural graph matching approach applied to document comparison. Document comparison is a common task in the legal and financial industries. In some cases, the most important differences may be the addition or omission of words, sentences, clauses, or paragraphs.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DION J SATCHER whose telephone number is (703)756-5849. The examiner can normally be reached Monday - Thursday 5:30 am - 2:30 pm, Friday 5:30 am - 9:30 am PST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Henok Shiferaw can be reached at (571) 272-4637. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DION J SATCHER/           Patent Examiner, Art Unit 2676  

/Henok Shiferaw/           Supervisory Patent Examiner, Art Unit 2676
Prosecution Timeline

Dec 13, 2022
Application Filed
Nov 20, 2023
Response after Non-Final Action
Feb 09, 2026
Non-Final Rejection mailed — §101, §103 (current)
Precedent Cases

Applications granted by this same examiner with similar technology

18/254,733
Patent 12639835
METHOD AND APPARATUS OF FUSION OF MULTIMODAL IMAGES TO FLUOROSCOPIC IMAGES
3y 0m to grant Granted May 26, 2026
17/991,368
Patent 12620070
DETERMINING AND USING POINT SPREAD FUNCTION FOR IMAGE DEBLURRING
3y 5m to grant Granted May 05, 2026
18/462,579
Patent 12620053
IMAGE PROCESSING METHOD AND APPARATUS, MEDIUM, DEVICE AND DRIVING SYSTEM
2y 8m to grant Granted May 05, 2026
18/148,405
Patent 12611552
METHODS AND SYSTEMS FOR RADIATION THERAPY GUIDANCE
3y 4m to grant Granted Apr 28, 2026
18/340,043
Patent 12614262
IMAGE PROCESSING APPARATUS, ENDOSCOPIC APPARATUS, IMAGE PROCESSING METHOD, AND NON-TRANSITORY COMPUTER-READABLE STORAGE MEDIUM
2y 10m to grant Granted Apr 28, 2026
Based on the 5 most recent grants.
Prosecution Projections

1-2
Expected OA Rounds
86%
Grant Probability
99%
With Interview (+14.1%)
2y 10m (~0m remaining)
Median Time to Grant
Low
PTA Risk
Based on 42 resolved cases by this examiner. Grant probability derived from career allowance rate.