DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office Action is responsive to communications filed on 05/23/2024. Claims 1-30 are pending in the instant application. Claims 1 and 16 are independent. An Office Action on the merits follows below.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 05/23/2024 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Kim (US 20230408417 A1).
Regarding Claim 1: Kim discloses a computer vision system for extracting information from an inspection tag (Refer to para [018]; “object of the present invention is to set a certain area as a detection area by using a tag as a reference position to improve a detection speed and a detection accuracy of a sample and to allow users to easily recognize a location where a sample is placed.”) comprising: a processor (Refer to para [068]; “the user terminal (30) may include a notebook, a mobile terminal, a smart phone, a tablet PC etc., and all types of cameras equipped with a camera unit capable of taking an image may be included therein.”) in communication with a memory (Refer to para [069 and 070]; “In addition, the user terminal (30) transmits the image acquired through the camera to the server (40) or provides the user with the diagnosis result transmitted from the server (40). Accordingly, when the application installed in the user terminal (30) is executed to drive the camera, the user terminal (30) recognizes the pattern of the tag (21) and finds the reference coordinates based on the recognized pattern.”), the processor programmed to perform the steps of: receiving an image of an inspection tag from the memory (Refer to para [029]; “…and the server that extracts an image of a sample corresponding to the urine area from the inspection image received from the user terminal, corrects a color of the image of the sample by comparing a color of the reference color area of the tag in the inspection image with a pre-registered reference color value…”); process the image to detect one or more tags in the image (Refer to para [063 and 066]; “Also, the tag (21) may be used for the purpose of providing reference coordinates for detecting the image of the sample (10) from the image photographed by the user terminal (30). The tag (21) is a white polygonal pattern marked on a black background. In this case, the application for examining the sample (10) finds the pattern shape of the tag (21) in the captured (photographed) image and sets a specific position of the pattern shape as a reference point. Also, an area located at a set distance from the corresponding reference point or an area within a set distance from the corresponding reference point may be set as the detection area, so that the sample (10) located within the corresponding detection area may be detected. The inspection application is to induce the user to position the sample in the detection area by displaying the detection area on the photographing screen of the user terminal (30) in the form of a highlight. In addition, it is to shorten the image analysis time by image processing only the area set based on the reference coordinates set through the tag (21).”); cropping and aligning the image to focus on the detected one or more tags (Refer to para [065, 072 and 073]; “The inspection application is to induce the user to position the sample in the detection area by displaying the detection area on the photographing screen of the user terminal (30) in the form of a highlight. For example, the user terminal (30) displays the detection area (70) in the form of a box at the set position as shown in FIG. 3. Alternatively, the user terminal (30) may display the image of the sample (10) in the form of a virtual image at a specific location on the detection area based on the reference coordinates, so that the user can easily recognize the location where the sample (10) should be placed.”); and process the cropped and aligned image to automatically extract information from the detected one or more tags (Refer to para [074 and 075]; “The server (40) detects the samples in the reference color area and the detection area from the image data received from the user terminal (30) and analyzes the color of the samples to diagnose the health status of the companion animal. To this end, the server (40) stores data including health status information of the companion animals corresponding to the color values of the samples in contact with urine.”).
While the Kim reference does not use the language “crop and align,” the Kim reference more than fairly discloses positioning the sample in the detection area and displaying a specific location based on the recognized reference coordinates. It would have been obvious to one of ordinary skill in the art to utilize the processor to position the image and to process the extracted information, as set forth in the rejection above over Kim.
The Examiner contends that it would have been obvious to one of ordinary skill in the art to calculate the edited image data, as set forth in the rejection above over Kim, to obtain the claimed elements of Claim 1. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to the claim in question.
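For illustration only, and not as part of the record or Kim's disclosed implementation: the crop-and-align operation mapped above corresponds to a conventional pipeline that locates the white polygonal tag pattern, takes its corners as reference coordinates, and warps the surrounding region to a canonical view. The following minimal Python/OpenCV sketch shows one such pipeline; the function name, output size, and corner-ordering shortcut are assumptions made solely to clarify the mapping.

```python
# Illustrative sketch only; not Kim's disclosed code.
import cv2
import numpy as np

def crop_and_align_to_tag(image, out_size=400):
    """Detect a white polygonal tag on a dark background, then crop and
    align (perspective-warp) the image to a canonical view of the tag."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    # Examine the largest shapes first; treat a quadrilateral as the tag.
    for contour in sorted(contours, key=cv2.contourArea, reverse=True):
        approx = cv2.approxPolyDP(contour,
                                  0.02 * cv2.arcLength(contour, True), True)
        if len(approx) == 4:
            src = approx.reshape(4, 2).astype(np.float32)
            dst = np.float32([[0, 0], [out_size, 0],
                              [out_size, out_size], [0, out_size]])
            # Corner ordering is assumed consistent here; a production
            # version would first sort corners (e.g., top-left first).
            matrix = cv2.getPerspectiveTransform(src, dst)
            return cv2.warpPerspective(image, matrix, (out_size, out_size))
    return None  # no tag-like quadrilateral detected
```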
Regarding Claim 16: Kim discloses a computer vision method for extracting information from an inspection tag (Refer to para [018]; “object of the present invention is to set a certain area as a detection area by using a tag as a reference position to improve a detection speed and a detection accuracy of a sample and to allow users to easily recognize a location where a sample is placed.”) comprising the steps of: receiving, by a processor (Refer to para [068]), an image of an inspection tag stored in memory (Refer to para [029]; “and the server that extracts an image of a sample corresponding to the urine area from the inspection image received from the user terminal, corrects a color of the image of the sample by comparing a color of the reference color area of the tag in the inspection image with a pre-registered reference color value…”); process the image by the processor to detect one or more tags in the image (Refer to para [063 and 066]; “Also, the tag (21) may be used for the purpose of providing reference coordinates for detecting the image of the sample (10) from the image photographed by the user terminal (30). The tag (21) is a white polygonal pattern marked on a black background. In this case, the application for examining the sample (10) finds the pattern shape of the tag (21) in the captured (photographed) image and sets a specific position of the pattern shape as a reference point. Also, an area located at a set distance from the corresponding reference point or an area within a set distance from the corresponding reference point may be set as the detection area, so that the sample (10) located within the corresponding detection area may be detected. The inspection application is to induce the user to position the sample in the detection area by displaying the detection area on the photographing screen of the user terminal (30) in the form of a highlight. In addition, it is to shorten the image analysis time by image processing only the area set based on the reference coordinates set through the tag (21).”); cropping and aligning the image by the processor to focus on the detected one or more tags (Refer to para [065, 072 and 073]; “The inspection application is to induce the user to position the sample in the detection area by displaying the detection area on the photographing screen of the user terminal (30) in the form of a highlight. For example, the user terminal (30) displays the detection area (70) in the form of a box at the set position as shown in FIG. 3. Alternatively, the user terminal (30) may display the image of the sample (10) in the form of a virtual image at a specific location on the detection area based on the reference coordinates, so that the user can easily recognize the location where the sample (10) should be placed.”); and process the cropped and aligned image by the processor to automatically extract information from the detected one or more tags (Refer to para [074 and 075]; “The server (40) detects the samples in the reference color area and the detection area from the image data received from the user terminal (30) and analyzes the color of the samples to diagnose the health status of the companion animal. To this end, the server (40) stores data including health status information of the companion animals corresponding to the color values of the samples in contact with urine.”).
While the Kim reference does not use the language “crop and align,” the Kim reference more than fairly discloses positioning the sample in the detection area and displaying a specific location based on the recognized reference coordinates. It would have been obvious to one of ordinary skill in the art to utilize the processor to position the image and to process the extracted information, as set forth in the rejection above over Kim.
The Examiner contends that it would have been obvious to one of ordinary skill in the art to calculate the edited image data, as set forth in the rejection above over Kim, to obtain the claimed elements of Claim 16. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to the claim in question.
Claims 2, 3, 17 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Kim (US 20230408417 A1) in combination with Yebes Torres (US 20230005286 A1).
Regarding Claim 2: Kim discloses all the claimed elements as rejected above. Kim does not expressly disclose a bounding-box calculation.
Yebes Torres teaches “computer-based image analysis and, more particularly, methods, systems, articles of manufacture, and an apparatus for decoding purchase data using an image.”
Yebes Torres also teaches a processor (Refer to para [046]; “Examples of processor circuitry include programmed microprocessors, Field Programmable Gate Arrays (FPGAs) that may instantiate instructions, Central Processor Units (CPUs), Graphics Processor Units (GPUs), Digital Signal Processors (DSPs), XPUs, or microcontrollers and integrated circuits such as Application Specific Integrated Circuits (ASICs).”) that is further programmed to perform the step of bounding each of the one or more tags by a tag-box (Refer to para [062]; “For example, the polygons can be detected by identifying groups of pixels that belong to the same line. In some examples, a merging process is applied to the detected polygons to merge unconnected polygons belonging to the same row. In some examples, the row detection techniques can be used to generate a list of polygonal regions (e.g., bounding boxes) representing the rows of the receipt including locations (e.g., coordinates) of the polygonal regions.”).
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Kim by adding, to the overall image processor, a processor that outputs bounding boxes as taught by Yebes Torres: “For example, the example OCR circuitry 116 can apply an OCR-based algorithm over the receipt image 108 to obtain text data. After applying an OCR-based algorithm over receipt image 108, the OCR circuitry 116 can return the characters and words (e.g., text) obtained from the receipt image 108 as well as their locations. For example, the OCR circuitry 116 can output bounding boxes (e.g., text boxes) corresponding to strings of characters (e.g., transcribed text) and locations (e.g., coordinates) of the bounding boxes within the receipt image 108.”
The suggestion/motivation for combining the teachings of Kim and Yebes Torres would have been in order “to identify the two main regions of interest.” (Refer to para [091], Yebes Torres).
Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Kim and Yebes Torres in order to obtain the specified claimed elements of Claim 2. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to the claim in question.
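For illustration only, and not drawn from the cited references: the row-merging step quoted from para [062] of Yebes Torres can be sketched along the following lines. The function name, the (x1, y1, x2, y2) box representation, and the vertical-tolerance heuristic are all hypothetical.

```python
# Illustrative sketch only; not the reference's actual algorithm.
def merge_row_boxes(boxes, y_tolerance=10):
    """Merge bounding boxes whose vertical centers fall within
    y_tolerance pixels, yielding one merged box per detected row."""
    rows = []
    for box in sorted(boxes, key=lambda b: (b[1] + b[3]) / 2):
        cy = (box[1] + box[3]) / 2
        for row in rows:
            if abs(row["cy"] - cy) <= y_tolerance:
                x1, y1, x2, y2 = row["box"]
                row["box"] = (min(x1, box[0]), min(y1, box[1]),
                              max(x2, box[2]), max(y2, box[3]))
                row["cy"] = (row["box"][1] + row["box"][3]) / 2
                break
        else:
            rows.append({"cy": cy, "box": box})
    return [row["box"] for row in rows]

# Two word boxes on the same text line merge into a single row box:
print(merge_row_boxes([(0, 0, 40, 20), (50, 2, 90, 22), (0, 60, 40, 80)]))
```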
Regarding Claim 17: Kim discloses all the claimed elements as rejected above. Kim does not expressly disclose a bounding-box calculation.
Yebes Torres teaches “computer-based image analysis and, more particularly, methods, systems, articles of manufacture, and an apparatus for decoding purchase data using an image.”
Yebes Torres also teaches bounding each of the one or more tags by a tag-box (Refer to para [062]; “For example, the polygons can be detected by identifying groups of pixels that belong to the same line. In some examples, a merging process is applied to the detected polygons to merge unconnected polygons belonging to the same row. In some examples, the row detection techniques can be used to generate a list of polygonal regions (e.g., bounding boxes) representing the rows of the receipt including locations (e.g., coordinates) of the polygonal regions.”).
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Kim by adding, to the overall image processor, a processor that outputs bounding boxes as taught by Yebes Torres: “For example, the example OCR circuitry 116 can apply an OCR-based algorithm over the receipt image 108 to obtain text data. After applying an OCR-based algorithm over receipt image 108, the OCR circuitry 116 can return the characters and words (e.g., text) obtained from the receipt image 108 as well as their locations. For example, the OCR circuitry 116 can output bounding boxes (e.g., text boxes) corresponding to strings of characters (e.g., transcribed text) and locations (e.g., coordinates) of the bounding boxes within the receipt image 108.”
The suggestion/motivation for combining the teachings of Kim and Yebes Torres would have been in order “to identify the two main regions of interest.” (Refer to para [091], Yebes Torres).
Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Kim and Yebes Torres in order to obtain the specified claimed elements of Claim 17. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to the claim in question.
Regarding Claim 3: Yebes Torres teaches that the processor (Refer to para [046]; “Examples of processor circuitry include programmed microprocessors, Field Programmable Gate Arrays (FPGAs) that may instantiate instructions, Central Processor Units (CPUs), Graphics Processor Units (GPUs), Digital Signal Processors (DSPs), XPUs, or microcontrollers and integrated circuits such as Application Specific Integrated Circuits (ASICs).”) is further programmed to perform the step of calculating a tag quality score for the tag-box (Refer to para [064]; “In some examples, Intersection over Union (IoU) calculations are used to map (e.g., assign) each of the words to their respective columns and rows. IoU is a metric for measuring overlap between two bounding boxes by comparing a ratio of an overlap area of the two bounding boxes to a total area of the two bounding boxes. In some examples, the words are assigned to a column and/or row if the IoU ratio reaches a threshold value. In some examples, the threshold value is approximately 0.5 (e.g., 50%). However, the threshold value can be higher or lower in additional or alternative examples. In some examples, the receipt can be substantially fully structured after the words generated by the OCR engine are assigned to the detected rows and/or columns. For example, mapping the word to the detected rows and/or columns generates an example data frame. As disclosed herein, a data frame refers to data displayed in a format as a table. While the data frame can include different types of columns, each column in the data frame should have the same type of data.”).
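For illustration only: the Intersection over Union metric quoted from para [064] of Yebes Torres corresponds to the standard computation sketched below. The helper name and the example boxes are hypothetical; the 0.5 threshold mirrors the approximate value the reference recites.

```python
# Illustrative IoU computation matching the metric described in para [064].
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# A word box is assigned to a row/column when IoU reaches the ~0.5
# threshold the reference recites; here IoU = 80/120 ~ 0.67, so it is.
assigned = iou((0, 0, 10, 10), (2, 0, 12, 10)) >= 0.5
```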
Regarding Claim 18: Yebes Torres teaches calculating a tag quality score for the tag-box (Refer to para [064]; “In some examples, Intersection over Union (IoU) calculations are used to map (e.g., assign) each of the words to their respective columns and rows. IoU is a metric for measuring overlap between two bounding boxes by comparing a ratio of an overlap area of the two bounding boxes to a total area of the two bounding boxes. In some examples, the words are assigned to a column and/or row if the IoU ratio reaches a threshold value. In some examples, the threshold value is approximately 0.5 (e.g., 50%). However, the threshold value can be higher or lower in additional or alternative examples. In some examples, the receipt can be substantially fully structured after the words generated by the OCR engine are assigned to the detected rows and/or columns. For example, mapping the word to the detected rows and/or columns generates an example data frame. As disclosed herein, a data frame refers to data displayed in a format as a table. While the data frame can include different types of columns, each column in the data frame should have the same type of data.”).
Allowable Subject Matter
Claims 4-15 and 19-30 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The prior art, either singly or in combination, does not teach, disclose, or suggest at least the following claim limitation: “…wherein the tag quality score is computed by the processor as a ratio between a tag-box area and an image area.”
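For illustration only, and solely to clarify the claim language identified above as allowable: the recited ratio corresponds to the simple computation sketched below. The function name and example values are hypothetical.

```python
# Hypothetical rendering of the allowable limitation:
# tag quality score = tag-box area / image area.
def tag_quality_score(tag_box, image_width, image_height):
    """tag_box is (x1, y1, x2, y2) in pixel coordinates."""
    box_width = max(0, tag_box[2] - tag_box[0])
    box_height = max(0, tag_box[3] - tag_box[1])
    return (box_width * box_height) / (image_width * image_height)

# A 200 x 200 pixel tag box in a 1000 x 1000 pixel image scores 0.04.
score = tag_quality_score((100, 100, 300, 300), 1000, 1000)
```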
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Janssen (US 20140064618 A1) discloses “…The system 20 further textualizes the digital image 22 with a textualization module 26. When a digital image 22 is received, textual elements (words and numbers) and metatextual elements (bar codes, QR codes, Xerox DataGlyphs, logos, etc.) in the image 22 are recognized and tagged with their content. Specifically, the binarized document image is processed by the textualization module 26 to extract meaningful textual content. The textualization process includes OCR to extract apparent text, along with logo recognition to turn logo art into text, barcode decoding to turn barcodes into text, QR code decoding to turn QR codes into text, and the like. The extracted text is tagged with its location on the digital image 22, and the textualization module 26 assigns a recognition confidence value between 0 and 1. Text extracted from an image, such as a logo or barcode, is given the bounding box of that image as its location. Note that the OCR process used produces the bounding box as an output; other information, such as font types and sizes and information on bold or italic faces would are also output. Confidences on recognition accuracy of word and character elements are also utilized if the textualization module 26 further processes the image to improve accuracy. It should also be appreciated that the textualization module 26 also enables the user to correct inaccurate work or character elements. For example, the textualization module 26 processes the image 22 through the OCR sub-system to extract the text. The textualization module 26 convert this information from the OCR system's proprietary format to a hOCR format standard for OCR output, which enables the textualization module 26 to effectively separate the particular OCR sub-system from the rest of document processing system 16. Bounding box information is present for each word, using the "bbox" hOCR tag. Character and word confidences are present in the hOCR, using the optional "x confs" and "x wconf" tags.”
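For illustration only, and not Janssen's code: the hOCR output Janssen describes (word text carrying “bbox” coordinates and “x_wconf” confidences) can be parsed along the following lines. The sample snippet and function name are hypothetical.

```python
# Illustrative parser for hOCR word entries of the kind Janssen describes.
import re

HOCR_SNIPPET = ('<span class="ocrx_word" '
                'title="bbox 10 20 110 45; x_wconf 92">TOTAL</span>')

def parse_hocr_words(hocr_html):
    pattern = re.compile(
        r'title="bbox (\d+) (\d+) (\d+) (\d+); x_wconf (\d+)"[^>]*>([^<]+)<')
    for x1, y1, x2, y2, conf, text in pattern.findall(hocr_html):
        # Word confidence is scaled to the 0-to-1 range Janssen mentions.
        yield {"text": text,
               "bbox": (int(x1), int(y1), int(x2), int(y2)),
               "confidence": int(conf) / 100.0}

print(list(parse_hocr_words(HOCR_SNIPPET)))
```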
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MIA M THOMAS whose telephone number is (571)270-1583. The examiner can normally be reached M-Th 8:30am-4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Stephen (Steve) Koziol, can be reached at (408) 918-7630. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MIA M THOMAS/
Primary Examiner, Art Unit 2665