Prosecution Insights
Last updated: April 18, 2026
Application No. 18/393,716

A SYSTEM AND METHOD FOR VISUAL TEXT TRANSFORMATION

Final Rejection — §102, §103
Filed: Dec 22, 2023
Examiner: RHIM, WOO CHUL
Art Unit: 2676
Tech Center: 2600 — Communications
Assignee: L&T Technology Services Limited
OA Round: 2 (Final)

Grant Probability: 80% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 2y 11m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 80% (112 granted / 140 resolved) — above average, +18.0% vs TC avg
Interview Lift: +21.4% in resolved cases with interview (strong)
Avg Prosecution: 2y 11m (typical timeline)
Currently Pending: 28
Total Applications: 168 (across all art units)

Statute-Specific Performance

§101: 7.4% (-32.6% vs TC avg)
§103: 47.1% (+7.1% vs TC avg)
§102: 23.2% (-16.8% vs TC avg)
§112: 19.0% (-21.0% vs TC avg)
Tech Center averages are estimates • Based on career data from 140 resolved cases

Office Action

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendments

Submission dated 02/24/2026 amends claim 10. Claims 1-16 are pending. In view of the amendment to claim 10, the previously set forth claim objection has been withdrawn.

Response to Arguments

Applicant's arguments filed with the submission dated 02/24/2026 have been fully considered but they are not persuasive. On pages 9-11, the applicant argues that Baheti as applied does not disclose “segmenting the plurality of text characters within the at least one visual text region from a background associated with the at least one visual text region” because the segmentation in the claimed invention is “a character level separation process that distinguishes text pixels from non-text background pixels within the same region” (see page 10 of the submission). The examiner finds the arguments not persuasive for the following reasons. First, the argument is not germane because the quoted claim language does not recite the “character-level” separation process and such a process is not specifically defined in the specification. Second, under the broadest reasonable interpretation, the cited portion’s teaching reads on the quoted claim language because it discloses segmenting a text region, e.g., a region containing a plurality of text characters, from other non-text regions by extracting one or more text regions from the obtained images (see, e.g., pars. 38-40, 57-58 and 65-66 and FIGS. 2, 3A and 4A).
The examiner notes that since the cited teaching reads on the claimed language, how the claimed segmenting is different from that of Baheti is not material when such a difference is not recited in the claimed language (see MPEP 2173.01, which sets forth that under the broadest reasonable interpretation, words of the claims are given their plain meaning and claim limitations are not imported from the specification). For these reasons, the examiner finds the applicant’s argument unpersuasive.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-4, 6-9, 11-14, and 16 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by US Patent Application Publication No. 2014/0168478 to Baheti et al. (hereinafter Baheti).

For claim 1, Baheti as applied discloses a method of extracting text from a video stream, the method comprising: determining at least one visual text region in each image of a plurality of images (see, e.g., pars. 38-40 and FIG. 2, which teach identifying one or more text regions from obtained images), wherein the video stream comprises the plurality of images in sequential order (see, e.g., pars. 2, 5, 6, 36, 38, and 82-85 and claims 7, 20 and 26, which teach receiving multiple frames/images, e.g., video frames, in sequence), wherein the at least one visual text region comprises a plurality of text characters (see, e.g., pars. 40-42, 45-46, 51, and 84 and FIG. 2, which teach verifying presence of text in the identified regions), and wherein determining the at least one visual text region is based on analysis of one of (the examiner notes that to be consistent with pars. 49-50 of the specification under the broadest reasonable interpretation, the following limitations are interpreted disjunctively (see MPEP 2111.01)): lines and curves associated with each of the plurality of text characters of the at least one visual text region; a blob associated with the plurality of text characters along an axis (see, e.g., pars. 39, 49-50, 61-62 and FIGS. 2 and 3A, which teach identifying regions of pixels in the image that are connected to one another and different from surrounding pixels in one or more properties, such as color and/or intensity, e.g., maximally stable extremal regions (MSER)); distribution of the plurality of text characters along a major axis; and a common attribute associated with the plurality of text characters; segmenting the plurality of text characters within the at least one visual text region from a background associated with the at least one visual text region (see, e.g., pars. 38-40, 57-58 and 65-66 and FIGS. 2, 3A and 4A, which teach extracting one or more text regions from the obtained images); and upon segmenting, inpainting each of the plurality of text characters with a predefined color (see, e.g., pars. 55, 63 and 65, which teach binarizing the extracted region, and pars. 76-77 and FIGS. 5A, 6A-B, which teach, upon extracting the regions, performing text image enhancement, e.g., deblurring and improving contrast of the region by changing intensities of pixels in the region and correcting over/under exposure of region).
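The "blob" analysis the examiner maps to MSER amounts to grouping connected pixels that share a property such as intensity or color. As a rough illustration only — this is the editor's sketch, not Baheti's method: a plain flood fill over a dark-pixel threshold, far simpler than a real MSER implementation, with a hypothetical `find_blobs` helper name:

```python
from collections import deque

def find_blobs(img, thresh=128):
    """Toy stand-in for MSER-style blob detection: group 4-connected
    dark pixels (intensity < thresh) and return one bounding box
    (x0, y0, x1, y1) per connected component."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if img[y][x] < thresh and not seen[y][x]:
                # breadth-first flood fill from this unvisited dark pixel
                q = deque([(y, x)])
                seen[y][x] = True
                y0 = y1 = y
                x0 = x1 = x
                while q:
                    cy, cx = q.popleft()
                    y0, y1 = min(y0, cy), max(y1, cy)
                    x0, x1 = min(x0, cx), max(x1, cx)
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] < thresh and not seen[ny][nx]):
                            seen[ny][nx] = True
                            q.append((ny, nx))
                boxes.append((x0, y0, x1, y1))
    return boxes
```

A real MSER detector instead sweeps the threshold and keeps regions whose area is stable across thresholds; the fixed-threshold version above only shows the connected-component grouping the cited paragraphs describe.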
For claim 7, Baheti as applied discloses a system for extracting text from a video stream (see, e.g., FIGS. 2, 8A and 9), the system comprising: a processor (see, e.g., pars. 36-37 and FIG. 9); and a memory storing a plurality of instructions (see, e.g., FIGS. 2 and 9), wherein the plurality of instructions, upon execution by the processor, cause the processor to: determine at least one visual text region in each image of a plurality of images (see, e.g., pars. 38-40 and FIG. 2, which teach identifying one or more text regions from obtained images), wherein the video stream comprises the plurality of images in sequential order (see, e.g., pars. 2, 5, 6, 36, 38, and 82-85 and claims 7, 20 and 26, which teach receiving multiple frames/images, e.g., video frames, in sequence), wherein the at least one visual text region comprises a plurality of text characters (see, e.g., pars. 40-42, 45-46, 51, and 84 and FIG. 2, which teach verifying presence of text in the identified regions), and wherein determining the at least one visual text region is based on analysis of one of (the examiner notes that to be consistent with pars. 49-50 of the specification under the broadest reasonable interpretation, the following limitations are interpreted disjunctively (see MPEP 2111.01)): lines and curves associated with each of the plurality of text characters of the at least one visual text region; a blob associated with the plurality of text characters along an axis (see, e.g., pars. 39, 49-50 and 61-62 and FIGS. 2 and 3A, which teach identifying regions of pixels in the image that are connected to one another and different from surrounding pixels in one or more properties, such as color and/or intensity, e.g., maximally stable extremal regions (MSER)); distribution of the plurality of text characters along a major axis; and a common attribute associated with the plurality of text characters; segment the plurality of text characters within the at least one visual text region from a background associated with the at least one visual text region (see, e.g., pars. 38-40, 57-58 and 65-66 and FIGS. 2, 3A and 4A, which teach extracting one or more text regions from the obtained images); and upon segmenting, inpaint each of the plurality of text characters with a predefined color (see, e.g., pars. 55, 63 and 65, which teach binarizing the extracted region, and pars. 76-77 and FIGS. 5A and 6A-B, which teach, upon extracting the regions, performing text image enhancement, e.g., deblurring and improving contrast of the region by changing intensities of pixels in the region and correcting over/under exposure of region).

For claim 11, Baheti as applied discloses a non-transitory computer-readable medium storing computer-executable instructions for extracting text from a video stream (see, e.g., pars. 104-107 and FIG. 9), the computer-executable instructions configured for: determining at least one visual text region in each image of a plurality of images (see, e.g., pars. 38-40 and FIG. 2, which teach identifying one or more text regions from obtained images), wherein the video stream comprises the plurality of images in sequential order (see, e.g., pars. 2, 5, 6, 36, 38, and 82-85 and claims 7, 20 and 26, which teach receiving multiple frames/images, e.g., video frames, in sequence), wherein the at least one visual text region comprises a plurality of text characters (see, e.g., pars. 40-42, 45-46, 51, and 84 and FIG. 2, which teach verifying presence of text in the identified regions), and wherein determining the at least one visual text region is based on analysis of one of (the examiner notes that to be consistent with pars. 49-50 of the specification under the broadest reasonable interpretation, the following limitations are interpreted disjunctively (see MPEP 2111.01)): lines and curves associated with each of the plurality of text characters of the at least one visual text region; a blob associated with the plurality of text characters along an axis (see, e.g., pars. 39, 49-50 and 61-62 and FIGS. 2 and 3A, which teach identifying regions of pixels in the image that are connected to one another and different from surrounding pixels in one or more properties, such as color and/or intensity, e.g., maximally stable extremal regions (MSER)); distribution of the plurality of text characters along a major axis; and a common attribute associated with the plurality of text characters; segmenting the plurality of text characters within the at least one visual text region from a background associated with the at least one visual text region (see, e.g., pars. 38-40, 57-58 and 65-66 and FIGS. 2, 3A and 4A, which teach extracting one or more text regions from the obtained images); and upon segmenting, inpainting each of the plurality of text characters with a predefined color (see, e.g., pars. 55, 63 and 65, which teach binarizing the extracted region, and pars. 76-77 and FIGS. 5A and 6A-B, which teach, upon extracting the regions, performing text image enhancement, e.g., deblurring and improving contrast of the region by changing intensities of pixels in the region and correcting over/under exposure of region).
For claims 2, 8 and 12, Baheti as applied discloses that the analysis of lines and curves comprises: detecting one or more corners formed with intersection of the lines and curves associated with each of the plurality of text characters of the at least one visual text region (see, e.g., pars. 40, 52-54 and 65-66 and FIGS. 3B-C, which teach computing stroke width by detecting a number of points corresponding to intersection of lines and curves in a character); determining pixel intensity and average axial distance between the one or more corners (see, e.g., pars. 40, 52-54 and 65-66 and FIGS. 3B-C, which teach computing a variance of the stroke widths by averaging the stroke widths, which are defined by pixel intensities); and determining the at least one visual text region based on the pixel intensity and the average axial distance (see, e.g., pars. 40, 52-54 and 65-66 and FIGS. 3B-C, which teach classifying the region as text based on the variance of the stroke widths).

For claims 3 and 13, Baheti as applied discloses that the blob associated with the plurality of text characters along an axis is analyzed based on a maximally stable extremal visual text regions (MSER) model (see, e.g., pars. 39, 49-50 and 61-62 and FIGS. 2 and 3A, which teach identifying regions of pixels in the image that are connected to one another and different from surrounding pixels in one or more properties, such as color and/or intensity, e.g., maximally stable extremal regions (MSER)).

For claims 4 and 14, Baheti as applied discloses that the common attribute is one of a font, a size, a color, or a spacing associated with each of the plurality of text characters (see, e.g., pars. 65-66, which teach determining whether a line of pixels of a common binary value is present in the region, and pars. 41, 55 and 75 and FIGS. 2 and 5A-B, which teach determining text image quality of the regions by checking whether an attribute of the region meets an OCR threshold, wherein the attribute is a text/font size).

For claims 6 and 16, Baheti as applied discloses that each of the plurality of text characters is inpainted with black color (see, e.g., pars. 55, 63 and 65, which teach binarizing the extracted region, and pars. 76 and 77 and FIGS. 5A and 6A-B, which teach, upon extracting the regions, performing text image enhancement, e.g., deblurring and improving contrast of the region by changing intensities of pixels in the region and correcting over/under exposure of region).

For claim 9, Baheti as applied discloses that the blob associated with the plurality of text characters along an axis is analyzed based on a maximally stable extremal visual text regions (MSER) model (see, e.g., pars. 39, 49-50 and 61-62 and FIGS. 2 and 3A, which teach identifying regions of pixels in the image that are connected to one another and different from surrounding pixels in one or more properties, such as color and/or intensity, e.g., maximally stable extremal regions (MSER)); and wherein the common attribute is one of a font, a size, or a spacing associated with each of the plurality of text characters (see, e.g., pars. 65-66, which teach determining whether a line of pixels of a common binary value is present in the region, and pars. 41, 55 and 75 and FIGS. 2 and 5A-B, which teach determining text image quality of the regions by checking whether an attribute of the region meets an OCR threshold, wherein the attribute is a text/font size).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 5, 10 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Baheti in view of US Patent Application Publication No. 2017/0309003 to Bako et al. (hereinafter Bako).

For claims 5, 10 and 15, while Baheti as applied does not explicitly teach, Bako in the analogous art teaches: wherein the segmenting is based on k-means clustering (see, e.g., pars. 30-31 of Bako, which teach segmenting pixels of text from those of a background), and wherein the segmenting comprises: classifying each pixel associated with the at least one visual text region into a predefined category of one or more predefined categories based on the pixel intensity and color (see, e.g., pars. 29-32 and FIG. 3, which teach assigning higher color intensity values to pixels representing the background compared to the values of pixels representing text); generating one or more clusters corresponding to the one or more predefined categories (see, e.g., pars. 29-32 and FIG. 3, which teach generating a color intensity histogram with peaks corresponding to color intensities of pixels representing text and those representing a background); and identifying, from the one or more clusters, a cluster corresponding to text characters distinct from clusters corresponding to background (see, e.g., pars. 29-32 and FIG. 3, which teach identifying a cluster/peak of pixels representing text and a cluster/peak representing a background).
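Reduced to its core, the k-means-style segmentation Bako is cited for clusters pixel intensities into a text group and a darker-vs-brighter background group, after which the text cluster can be repainted with a predefined color (the "inpainting" of claims 6 and 16). A minimal sketch — editor's illustrative code with hypothetical helper names, 1-D clustering with k=2, not Bako's actual implementation:

```python
def segment_text_pixels(intensities, iters=20):
    """1-D k-means with k=2 on pixel intensities, assuming text pixels
    form the darker cluster (as in binarized documents).
    Returns (text_values, background_values)."""
    c_text, c_bg = float(min(intensities)), float(max(intensities))
    for _ in range(iters):
        # assignment step: each value joins the nearer centroid
        text = [v for v in intensities if abs(v - c_text) <= abs(v - c_bg)]
        bg = [v for v in intensities if abs(v - c_text) > abs(v - c_bg)]
        # update step: recompute centroids from the assignments
        if text:
            c_text = sum(text) / len(text)
        if bg:
            c_bg = sum(bg) / len(bg)
    return text, bg

def inpaint_text(intensities, color=0):
    """Repaint every pixel in the text cluster with a predefined color
    (black by default), leaving background pixels unchanged."""
    text, _ = segment_text_pixels(intensities)
    text_values = set(text)
    return [color if v in text_values else v for v in intensities]
```

For example, `inpaint_text([10, 12, 11, 200, 210, 205, 9])` leaves the three bright background values alone and paints the four dark text values black. Bako's histogram-peak description is the same idea viewed through the intensity histogram: each cluster centroid sits under one peak.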
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Baheti to perform the clustering as taught by Bako because doing so would yield the predictable results of detecting peaks/clusters corresponding to undesired artifacts and removing the artifacts (see par. 31 of Bako and also MPEP 2143(I)(D)).

Conclusion

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to WOO RHIM whose telephone number is (571) 272-6560. The examiner can normally be reached Mon - Fri 9:30 am - 6:00 pm ET. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Henok Shiferaw, can be reached at 571-272-4637. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/WOO C RHIM/
Examiner, Art Unit 2676

/Henok Shiferaw/
Supervisory Patent Examiner, Art Unit 2676

Prosecution Timeline

Dec 22, 2023
Application Filed
Dec 12, 2025
Non-Final Rejection — §102, §103
Feb 24, 2026
Response Filed
Mar 31, 2026
Final Rejection — §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12601667 — AUTOMATED TURF TESTING APPARATUS AND SYSTEM FOR USING SAME
2y 5m to grant · Granted Apr 14, 2026
Patent 12596134 — DEVICE, MOVEMENT SPEED ESTIMATION SYSTEM, FEEDING CONTROL SYSTEM, MOVEMENT SPEED ESTIMATION METHOD, AND RECORDING MEDIUM IN WHICH MOVEMENT SPEED ESTIMATION PROGRAM IS STORED
2y 5m to grant · Granted Apr 07, 2026
Patent 12591997 — ARRANGEMENT DEVICE AND METHOD
2y 5m to grant · Granted Mar 31, 2026
Patent 12586169 — Mass Image Processing Apparatus and Method
2y 5m to grant · Granted Mar 24, 2026
Patent 12579607 — DEMOSAICING METHOD AND APPARATUS FOR MOIRE REDUCTION
2y 5m to grant · Granted Mar 17, 2026
Study what changed to get past this examiner, based on the 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 80%
With Interview: 99% (+21.4%)
Median Time to Grant: 2y 11m
PTA Risk: Moderate
Based on 140 resolved cases by this examiner. Grant probability derived from career allow rate.
