Last updated: May 29, 2026

Application No. 18/165,125

OCR OF TEXT OVERLAPPING SCENES THROUGH TEXT GRAPH STRUCTURING

Non-Final OA §101 §102 §103

Filed: Feb 06, 2023

Examiner: COUSO, JOSE L

Art Unit: 2667

Tech Center: 2600 (Communications)

Assignee: International Business Machines Corporation

OA Round: 1 (Non-Final)

Interview Optional

Interview lift (+8.1%) is below the 15.0% threshold. A written response is recommended.

Based on 1196 resolved cases, 2023–2026

Examiner Intelligence

COUSO, JOSE L

Grants 90% — above average

Career Allowance Rate

1080 granted / 1196 resolved

+28.3% vs TC avg

Interview Lift: +8.1% (moderate), resolved cases with interview versus without

Typical timeline

2y 2m

Avg Prosecution

16 currently pending

Career history

1210

Total Applications

across all art units

Statute-Specific Performance

§101: 17.1% (-22.9% vs TC avg)

§103: 16.6% (-23.4% vs TC avg)

§102: 44.5% (+4.5% vs TC avg)

§112: 10.1% (-29.9% vs TC avg)

Tech Center average is an estimate. Based on career data from 1196 resolved cases

Office Action

§101 §102 §103

DETAILED ACTION


Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Information Disclosure Statement
The information disclosure statement (IDS) submitted on February 6, 2023 complies with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner. 


35 USC § 101 Statutory Analysis
The claims do not recite any of the judicial exceptions enumerated in the 2019 Revised Patent Subject Matter Eligibility Guidance. Further, the claims do not recite any method of organizing human activity, such as a fundamental economic concept or managing interactions between people. Finally, the claims do not recite a mathematical relationship, formula, or calculation. Thus, the claims are eligible because they do not recite a judicial exception.


Claim Rejections - 35 USC § 101

35 U.S.C. §101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 16-20 are rejected under 35 U.S.C. §101 because the claimed invention is directed to non-statutory subject matter. The claims do not fall within at least one of the four categories of patent eligible subject matter because they are directed to a computer program product. The scope of a computer program product is broad enough to include either a computer program by itself, and/or a signal per se, both of which are non-statutory.  
To overcome the rejection, the examiner suggests amending the preamble as follows: “A non-transitory computer readable medium storing thereon a computer program, which when executed by a computer, performs a method comprising”.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. §102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1, 2, 4-12, 14-17, 19 and 20 are rejected under 35 U.S.C. §102(a)(1) as being anticipated by Walch (U.S. Patent No. US 8,452,108 B2) (hereafter referred to as “Walch”).  
	With regard to claim 1, Walch describes converting, as part of an Optical Character Recognition (OCR) process, each letter of multiple letters into graph structured data, the graph structured data comprising detected nodes and lines between the detected nodes (see Figure 1 and refer to column 5, lines 10-51); constructing a library of graph templates from the graph structured data of each letter of the multiple letters (refer for example to column 5, line 64 through column 6, line 12); identifying an image region of a document image with overlapping text (refer for example to column 17, line 49 through column 18, line 3); performing text graph structuring to convert visual content of the overlapping text image region to an overlapping text topology graph (refer for example to column 19, lines 13-26); and splitting the overlapping text topology graph into multiple subgraphs using the graph template library to match recognizable letters (refer for example to column 8, lines 1-30). 
As to claim 2, Walch describes wherein converting the letter image into the graph structured data for each letter of the multiple letters further comprises detecting joint points comprising an endpoint, a turning point, and an intersection point representing the detected nodes in the graph structured data (refer for example to column 6, line 59 through column 7, line 58). 
With regard to claim 4, Walch describes wherein converting the letter image into the graph structured data for each letter of the multiple letters further comprises converting the graph structured data of a letter into a topology diagram of a point-edge-point topology diagram (refer to column 5, line 31 through column 6, line 32).
As to claim 5, Walch describes wherein identifying the image region with the overlapping text of the image document further comprises extracting the overlapping text region from the image document, and detecting the nodes in the image content of extracted overlapping text region (refer for example to column 19, lines 13-26).
In regard to claim 6, Walch describes wherein performing text graph structuring to convert the visual content of the overlapping text image region to the overlapping text topology graph further comprises identifying detected joint points of the visual content as the nodes in the overlapping text topology graph and the lines between the nodes as edges in the overlapping text topology graph (refer for example to column 6, line 59 through column 7, line 58, and to column 19, lines 13-26). 
With regard to claim 7, Walch describes wherein performing text graph structuring to convert the visual content of the overlapping text image region further comprises encoding the nodes in the overlapping text topology graph to convert each node into an initialization vector and performing vector updates on the nodes to provide an updated node vector including neighboring nodes information and graph topology information (refer for example to column 6, lines 31-51, and to column 17, line 49 through column 18, line 3).
As to claim 8, Walch describes wherein performing text graph structuring to convert the visual content of the overlapping text image region further comprises labeling each node using node classification, updating and attaching vectors to each node in the overlapping text topology graph (refer for example to column 6, lines 13-21, and to column 10, line 66 through column 11, line 23).
In regard to claim 9, Walch describes wherein splitting the overlapping text topology graph into multiple independent subgraphs further comprises using a classification algorithm to split the overlapping region into recognizable characters (refer for example to column 17, lines 20-28).
With regard to claim 10, Walch describes wherein splitting the overlapping text topology graph into multiple independent subgraphs further comprises using topological information of the overlapping text topology graph and matching recognizable letters in the graph template library (refer for example to column 8, lines 1-30).
As to claim 11, Walch describes a processor and a memory, wherein the memory includes a computer program product configured to perform operations (see Figure 18, element 1806 and refer for example to column 22, lines 20-22) for implementing Optical Character Recognition (OCR) of text overlapping scenes, the operations comprising converting, as part of an Optical Character Recognition (OCR) process, each letter of multiple letters into graph structured data, the graph structured data comprising detected nodes and lines between the detected nodes (see Figure 1 and refer to column 5, lines 10-51); constructing a library of graph templates from the graph structured data of each letter of the multiple letters (refer for example to column 5, line 64 through column 6, line 12); identifying an image region of a document image with overlapping text (refer for example to column 17, line 49 through column 18, line 3); performing text graph structuring to convert visual content of the overlapping text image region to an overlapping text topology graph (refer for example to column 19, lines 13-26); and splitting the overlapping text topology graph into multiple subgraphs using the graph template library to match recognizable letters (refer to column 8, lines 1-30).
In regard to claim 12, Walch describes wherein converting the letter image into the graph structured data for each letter of the multiple letters further comprises detecting joint points comprising an endpoint, a turning point, and an intersection point representing the detected nodes in the graph structured data (refer for example to column 6, line 59 through column 7, line 58).
As to claim 14, Walch describes wherein converting the letter image into the graph structured data for each letter of the multiple letters further comprises converting the graph structured data of the letter into a topology diagram of a point-edge-point topology diagram (refer to column 5, line 31 through column 6, line 32).
In regard to claim 15, Walch describes wherein performing text graph structuring to convert the visual content of the overlapping text image region further comprises encoding nodes in the overlapping text topology graph to convert each node into an initialization vector and performing vector updates on the nodes to provide an updated node vector including neighboring nodes information and graph topology information (refer to column 6, lines 31-51, and to column 17, line 49 through column 18, line 3).
With regard to claim 16, Walch describes a computer program product for implementing Optical Character Recognition (OCR) of text overlapping scenes, the computer program product comprising a computer-readable storage medium having computer-readable program code embodied therewith, the computer-readable program code executable by one or more computer processors to perform an operation (see Figure 18, element 1806 and refer for example to column 22, lines 20-22) comprising converting, as part of an Optical Character Recognition (OCR) process, each letter of multiple letters into graph structured data, the graph structured data comprising detected nodes and lines between the detected nodes (see Figure 1 and refer to column 5, lines 10-51); constructing a library of graph templates from the graph structured data of each letter of the multiple letters (refer for example to column 5, line 64 through column 6, line 12); identifying an image region of a document image with overlapping text (refer for example to column 17, line 49 through column 18, line 3); performing text graph structuring to convert visual content of the overlapping text image region to an overlapping text topology graph (refer for example to column 19, lines 13-26); and splitting the overlapping text topology graph into multiple subgraphs using the graph template library to match recognizable letters (refer to column 8, lines 1-30).
As to claim 17, Walch describes wherein converting the letter image into the graph structured data for each letter of the multiple letters further comprises detecting joint points comprising an endpoint, a turning point, and an intersection point representing the detected nodes in the graph structured data (refer for example to column 6, line 59 through column 7, line 58).
With regard to claim 19, Walch describes wherein converting the letter image into the graph structured data for each letter of the multiple letters further comprises converting the graph structured data of the letter into a topology diagram of a point-edge-point topology diagram (refer to column 5, line 31 through column 6, line 32).
As to claim 20, Walch describes wherein performing text graph structuring to convert the visual content of the overlapping text image region further comprises encoding nodes in the overlapping text topology graph to convert each node into an initialization vector and performing vector updates on the nodes to provide an updated node vector including neighboring nodes information and graph topology information (refer to column 6, lines 31-51, and to column 17, line 49 through column 18, line 3).
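The claim language above turns on three ideas: letters as graphs of nodes and lines, joint points (endpoint, turning point, intersection point) as the nodes, and a library of graph templates used for matching. The following is a minimal illustrative sketch of those ideas in Python. It is not taken from Walch or from the application. The degree-based labeling rule, the `signature` helper, and the two hand-built letter graphs are all assumptions made for illustration, and the signature comparison is a crude stand-in for real graph matching.

```python
from collections import defaultdict

def build_graph(edges):
    """Build an undirected adjacency map from (node, node) stroke segments."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return dict(adjacency)

def classify_joint_points(adjacency):
    """Label each node by its degree (an assumed rule, for illustration only)."""
    labels = {}
    for node, neighbours in adjacency.items():
        degree = len(neighbours)
        if degree == 1:
            labels[node] = "endpoint"
        elif degree == 2:
            labels[node] = "turning point"
        else:  # three or more strokes meet at this node
            labels[node] = "intersection point"
    return labels

def signature(adjacency):
    """Hypothetical template key: the sorted multiset of joint-point labels."""
    return tuple(sorted(classify_joint_points(adjacency).values()))

# Hand-built point-edge-point graphs for two letters (hypothetical data).
letter_t = build_graph([("a", "b"), ("b", "c"), ("b", "d")])
letter_l = build_graph([("p", "q"), ("q", "r")])

# A toy "library of graph templates" keyed by letter.
template_library = {"T": signature(letter_t), "L": signature(letter_l)}

def match_letter(adjacency, library):
    """Return the letters whose template signature equals the candidate's."""
    candidate = signature(adjacency)
    return [letter for letter, sig in library.items() if sig == candidate]

# A candidate subgraph, as might result from splitting a larger graph.
candidate = build_graph([("x", "y"), ("y", "z"), ("y", "w")])
matches = match_letter(candidate, template_library)
```

The sketch covers only the matching of a single candidate subgraph. Deciding how to split an overlapping text topology graph into candidate subgraphs is the harder step and is not attempted here.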

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. §103(a) which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1.    Determining the scope and contents of the prior art.
2.    Ascertaining the differences between the prior art and the claims at issue.
3.    Resolving the level of ordinary skill in the pertinent art.
4.    Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 3, 13 and 18 are rejected under 35 U.S.C. §103(a) as being unpatentable over Walch (U.S. Patent No. US 8,452,108 B2) (hereafter referred to as “Walch”) in view of Cao et al. (CN 113536875 A - translation) (hereafter referred to as “Cao”).
	The arguments advanced in section 10 above, as to the applicability of Walch, are incorporated herein.
With regard to claims 3, 13 and 18, Walch discloses using a neural network (refer for example to column 12, lines 39-43, column 15, lines 29-33 and column 20, lines 59-67). Column 21, lines 19-21 imply that the neural network is a graph neural network, although Walch does not expressly describe that converting the letter image into the graph structured data for each letter of the multiple letters further comprises encoding each node in the graph structured data of a letter using a graph neural network. Such a neural network, however, is well known and widely utilized in the prior art.
Cao discloses an Optical Character Recognition system which recognizes text in identification certificate images and which provides for converting the letter image into the graph structured data for each letter of the multiple letters further comprises encoding each node in the graph structured data of a letter using a graph neural network (see Figures 4 and 7, and refer to page 15, second paragraph starting at “Step S502 …” through page 17, first partial paragraph before “Step S204 …”).
Given the teachings of the two references and the same environment of operation, namely that of Optical Character Recognition systems, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the neural network in Walch in the manner described by Cao to provide for using a graph neural network according to known methods to yield predictable results and would have been motivated to do so with a reasonable expectation of success in order to provide for increased processing efficiency and higher accuracy as suggested by Cao (refer for example to the abstract), which fails to patentably distinguish over the prior art absent some novel and unexpected result.
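For readers unfamiliar with the limitation at issue in claims 3, 13 and 18, encoding nodes with a graph neural network generally means giving each node an initial vector and then updating it with information from its neighbors. The following is a minimal sketch of one such update round in plain Python. It is not the method of Walch, Cao, or the application. The one-hot initialization, the fixed 0.5/0.5 mixing weights, and the example graph are assumptions for illustration, and a real graph neural network would learn its weights from data.

```python
# Assumed one-hot initialization vectors, one per joint-point label.
ONE_HOT = {
    "endpoint": [1.0, 0.0, 0.0],
    "turning point": [0.0, 1.0, 0.0],
    "intersection point": [0.0, 0.0, 1.0],
}

def init_vectors(labels):
    """Convert each node's label into an initialization vector."""
    return {node: list(ONE_HOT[label]) for node, label in labels.items()}

def message_pass(adjacency, vectors, self_weight=0.5):
    """One update round: mix each node's vector with the mean of its neighbors."""
    updated = {}
    for node, neighbours in adjacency.items():
        dim = len(vectors[node])
        mean = [
            sum(vectors[n][i] for n in neighbours) / len(neighbours)
            for i in range(dim)
        ]
        updated[node] = [
            self_weight * own + (1.0 - self_weight) * agg
            for own, agg in zip(vectors[node], mean)
        ]
    return updated

# Hypothetical graph for the letter "T": three endpoints meeting at node "b".
adjacency = {"a": ["b"], "b": ["a", "c", "d"], "c": ["b"], "d": ["b"]}
labels = {
    "a": "endpoint",
    "b": "intersection point",
    "c": "endpoint",
    "d": "endpoint",
}

vectors = init_vectors(labels)
updated = message_pass(adjacency, vectors)
```

After one round, the vector for endpoint `a` carries weight in the intersection-point dimension, which is the sense in which an updated node vector includes neighboring node information.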

Relevant Prior Art

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Marcelli, Kalyuzhny, Agarwal, Lu, Tang, Fu, and Zheng all disclose systems similar to applicant’s claimed invention.  

Contact Information

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jose L. Couso whose telephone number is (571) 272-7388. The examiner can normally be reached on Monday through Friday from 5:30am to 1:30pm.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Matthew Bella, can be reached on 571-272-7778. The fax phone number for the organization where this application or proceeding is assigned is (571) 273-8300.
Information regarding the status of an application may be obtained from the Patent Center information webpage on the USPTO website. For more information about the Patent Center, see https://www.uspto.gov/patents/apply/patent-center. Should you have questions about access to the Patent Center, contact the Patent Electronic Business Center (EBC) at 571-272-4100 or via email at: ebc@uspto.gov.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.




/JOSE L COUSO/ Primary Examiner, Art Unit 2667
February 19, 2026

Prosecution Timeline

Feb 06, 2023

Application Filed

Oct 16, 2023

Response after Non-Final Action

Feb 25, 2026

Non-Final Rejection mailed — §101, §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/347,390

Patent 12626508

HOUSEHOLD APPLIANCE VIDEO ANALYSIS

2y 10m to grant Granted May 12, 2026

18/046,788

Patent 12620474

MEDICAL IMAGE PROCESSING APPARATUS, ENDOSCOPE SYSTEM, METHOD OF OPERATING MEDICAL IMAGE PROCESSING APPARATUS, AND NON-TRANSITORY COMPUTER READABLE MEDIUM

3y 6m to grant Granted May 05, 2026

18/319,872

Patent 12620061

METHOD AND APPARATUS WITH IMAGE PROCESSING

2y 11m to grant Granted May 05, 2026

18/372,397

Patent 12620068

DEVICE AND METHOD FOR TEXTURE-AWARE SELF-SUPERVISED BLIND DENOISING USING SELF-RESIDUAL LEARNING

2y 7m to grant Granted May 05, 2026

18/318,164

Patent 12608771

Noise Reduction in Ultrasound Images

2y 11m to grant Granted Apr 21, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

1-2

Expected OA Rounds

90%

Grant Probability

98%

With Interview (+8.1%)

2y 2m (~0m remaining)

Median Time to Grant

Low

PTA Risk

Based on 1196 resolved cases by this examiner. Grant probability derived from career allowance rate.
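The projection figures can be checked from the counts given on this page. The following is a minimal sketch of that arithmetic in Python. The page states that grant probability is derived from the career allowance rate, but it does not state how the with-interview figure is computed. Adding the lift to the allowance rate and capping at 100% is an assumption, as are the variable names.

```python
# Counts and percentages as reported on this page.
granted = 1080
resolved = 1196
interview_lift = 8.1        # percentage points
lift_threshold = 15.0       # percentage points

# Career allowance rate as a percentage of resolved cases.
allowance_rate = granted / resolved * 100

# Assumed model: the lift is added to the allowance rate, capped at 100%.
with_interview = min(allowance_rate + interview_lift, 100.0)

# The page recommends a written response when the lift is below the threshold.
recommendation = (
    "interview" if interview_lift >= lift_threshold else "written response"
)

print(f"Allowance rate: {allowance_rate:.1f}%")
print(f"With interview: {with_interview:.1f}%")
print(f"Recommendation: {recommendation}")
```

Under these assumptions the rounded values agree with the 90% and 98% shown above.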