DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claim(s) 1-2, 4, 8-10, 12, 16, and 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Dines et al. (US Patent No. 11,232,170) in view of Kumar (US Patent Application Publication No. 2019/0250891).
Regarding claims 1 and 8, Dines discloses an identification device comprising a processor configured to execute operations comprising [see col. 1, lines 56-60; the computer program is configured to cause at least one processor to analyze a UI at runtime to identify UI element attributes and compare the UI element attributes to UI descriptor attributes for an activity of an RPA workflow using one or more initial graphical element detection techniques]:
identifying first screen data including an image of a screen of an application and information regarding a screen component object that is an object of an element included in the screen [see col. 4, lines 10-24 and 45-50 and col. 6, lines 20-42; a screen is an image of an application UI or a portion of the application UI at a certain point in time. UI elements and screens are differentiated into specific types of UI elements and screens (e.g., top windows, modal windows, popup windows, etc.). An example UI tree may include a document object model (DOM) underlying a webpage rendered by a web browser application. UI descriptors are an encapsulated data/struct format that includes UI element selector(s), anchor selector(s), CV descriptor(s), OCR descriptor(s), unified target descriptor(s) combining two or more types of UI descriptors, a screen image capture (context), an element image capture, and other metadata (e.g., the application and application version); which corresponds to attribute UI hierarchy data to identify UI elements and the screen image];
outputting a first identification result associated with sample screen data that is screen data to be referred to [see col. 6, lines 15-42 and figures 3-5; in a HyperText Markup Language (HTML) example, an element ID representing an individual form field may indicate that the respective form field is a child of an HTML form, which in turn is a child of a specific section of a webpage; and using the text field example, a text field for entering a first name may appear to the right of the label “First Name”. This first name label may be set as an “anchor” to help uniquely identify the text field, which is the “target”; which corresponds to outputting an identification result associated with screen data]; however, Dines fails to explicitly teach
identifying, on a basis of the sample screen data and the first identification result, second screen data including an image of the screen and not including information regarding the screen component object; and outputting a second identification result associated with the sample screen data.
Kumar discloses identifying, on a basis of the sample screen data and the first identification result [see para. 0011-0014; the GUI model generated for a GUI for an application may encapsulate information corresponding to the one or more GUI screens for the application. For each GUI screen, the GUI model may include information identifying one or more user interface (UI) components included in the GUI screen. For each GUI screen, the model may also include information about the structure of the GUI screen, such as information identifying a hierarchical organization of the user interface components and text content items on the GUI screen], second screen data including an image of the screen and not including information regarding the screen component object [see para. 0069, 0070; the model generation system extracts text information from GUI screen images; a text detection tool may be used to determine the locations (e.g., coordinates) of the regions in a GUI screen image that may include text content, and an optical character recognition (OCR) tool may then be used to extract (e.g., recognize) text content items from these regions in the GUI screen image]; and outputting a second identification result associated with the sample screen data [see para. 0100; the object detection module detects and classifies UI components in the output image from the OCR module using machine learning-based model(s) generated by the machine learning subsystem. For example, the machine learning-based models may include a UI component detector and/or classifier that can identify and classify objects in an image. The object detection module uses the machine learning-based model to identify UI components in the GUI screen image, and classifies each UI component to determine the type of the UI component and the associated actions or functions. The output of the object detection module includes the text content items (and their location information) and information regarding the identified UI components in the GUI screen image, such as their locations, types (or classes), and associated actions or functions; which corresponds to UI elements recognized from image screen data including element location].
It would have been obvious to one of ordinary skill in the art, having the teachings of Dines and Kumar before the effective filing date of the claimed invention, to modify the UI element detection system of Dines to include automated GUI model generation from screenshots, as taught by Kumar. One would have been motivated to make such a combination in order to perform identification using UI object information based on screen attributes when available.
Regarding claims 2, 10 and 16, Kumar discloses the processor further configured to execute operations comprising: prior to identifying the second screen data: identifying a common object commonly included in a plurality of pieces of the first screen data among the screen component objects of the sample screen data, from the sample screen data and the first identification result [see para. 0161 and figures 14-15; an input GUI screen image and the corresponding GUI screen image displayed using code generated based on techniques disclosed herein according to certain embodiments. The input GUI screen image is a JPG file displayed by a photo viewer showing the designed GUI screen. The GUI screen image is displayed by a web browser based on an HTML file; a screenshot showing HTML code generated for an example input GUI screen based on the disclosed techniques; when a user selects the HTML file generated by the tool in the file structure, the source HTML code may be displayed to the user, and the user may modify the HTML code as needed]; obtaining a relative arrangement relationship for each drawing area of the common object; and generating a sample screen model including the relative arrangement relationship [see para. 0068; analyze GUI screen images to determine one or more GUI screens specified for the GUI, and for each GUI screen, the set of user interface components included on that screen and the physical arrangement of the user interface components. This GUI model generation processing includes, for example, for a GUI screen, determining a set of user interface components (e.g., buttons, drop down lists, segments, and the like) and their attributes (e.g., labels, sizes, locations), determining the physical layout of the UI components within the GUI screen (e.g., determining hierarchical containment relationships of UI components or groups of UI components), and determining functionality to be associated with one or more of the UI components; which corresponds to generating a screen layout model from visual features].
Regarding claims 4, 12 and 18, Dines discloses wherein the identifying the common object further comprises: identifying a fixed-value object that is commonly included in a plurality of pieces of the first screen data and includes a same character string among the screen component objects of the sample screen data included in an identification case [see col. 6, lines 31-42 and lines 63-67; a text field for entering a first name may appear to the right of the label “First Name”. This first name label may be set as an “anchor” to help uniquely identify the text field, which is the “target”; and consider a web form where two text fields for entering a first name appear to the right of respective labels “First Name” in different locations on the screen. One or more additional anchors may be useful to uniquely identify a given target. The geometric properties between the anchors and the target (e.g., line segment lengths, angles, and/or relative locations with tolerances) may be used to uniquely identify the target; which corresponds to a same character string commonly included in a plurality of screens].
Claim 8 is an independent claim directed to an identification method. Since the features of claim 8 are substantially the same as those of claim 1 except for the category of invention, the same reasoning as in claim 1 applies to claim 8.
Regarding claim 9, Dines discloses a computer-readable non-transitory recording medium storing computer-executable program instructions that, when executed by a processor, cause a computer system to execute operations comprising [see col. 19, lines 5-15; the computer program may be embodied on a non-transitory computer-readable medium. The computer-readable medium may be, but is not limited to, a hard disk drive, a flash device, RAM, a tape, and/or any other such medium or combination of media used to store data. The computer program may include encoded instructions for controlling processor(s) of a computing system (e.g., processor(s) of the computing system)]:
identifying first screen data including an image of a screen of an application and information regarding a screen component object that is an object of an element included in the screen [see col. 4, lines 10-24 and 45-50 and col. 6, lines 20-42; a screen is an image of an application UI or a portion of the application UI at a certain point in time. UI elements and screens are differentiated into specific types of UI elements and screens (e.g., top windows, modal windows, popup windows, etc.). An example UI tree may include a document object model (DOM) underlying a webpage rendered by a web browser application. UI descriptors are an encapsulated data/struct format that includes UI element selector(s), anchor selector(s), CV descriptor(s), OCR descriptor(s), unified target descriptor(s) combining two or more types of UI descriptors, a screen image capture (context), an element image capture, and other metadata (e.g., the application and application version); which corresponds to attribute UI hierarchy data to identify UI elements and the screen image];
outputting a first identification result associated with sample screen data that is screen data to be referred to [see col. 6, lines 15-42 and figures 3-5; in a HyperText Markup Language (HTML) example, an element ID representing an individual form field may indicate that the respective form field is a child of an HTML form, which in turn is a child of a specific section of a webpage; and using the text field example, a text field for entering a first name may appear to the right of the label “First Name”. This first name label may be set as an “anchor” to help uniquely identify the text field, which is the “target”; which corresponds to outputting an identification result associated with screen data]; however, Dines fails to explicitly teach
identifying, on a basis of the sample screen data and the first identification result, second screen data including an image of the screen and not including information regarding the screen component object; and outputting a second identification result associated with the sample screen data.
Kumar discloses identifying, on a basis of the sample screen data and the first identification result [see para. 0011-0014; the GUI model generated for a GUI for an application may encapsulate information corresponding to the one or more GUI screens for the application. For each GUI screen, the GUI model may include information identifying one or more user interface (UI) components included in the GUI screen. For each GUI screen, the model may also include information about the structure of the GUI screen, such as information identifying a hierarchical organization of the user interface components and text content items on the GUI screen], second screen data including an image of the screen and not including information regarding the screen component object [see para. 0069, 0070; the model generation system extracts text information from GUI screen images; a text detection tool may be used to determine the locations (e.g., coordinates) of the regions in a GUI screen image that may include text content, and an optical character recognition (OCR) tool may then be used to extract (e.g., recognize) text content items from these regions in the GUI screen image]; and outputting a second identification result associated with the sample screen data [see para. 0100; the object detection module detects and classifies UI components in the output image from the OCR module using machine learning-based model(s) generated by the machine learning subsystem. For example, the machine learning-based models may include a UI component detector and/or classifier that can identify and classify objects in an image. The object detection module uses the machine learning-based model to identify UI components in the GUI screen image, and classifies each UI component to determine the type of the UI component and the associated actions or functions. The output of the object detection module includes the text content items (and their location information) and information regarding the identified UI components in the GUI screen image, such as their locations, types (or classes), and associated actions or functions; which corresponds to UI elements recognized from image screen data including element location].
It would have been obvious to one of ordinary skill in the art, having the teachings of Dines and Kumar before the effective filing date of the claimed invention, to modify the UI element detection system of Dines to include automated GUI model generation from screenshots, as taught by Kumar. One would have been motivated to make such a combination in order to perform identification using UI object information based on screen attributes when available.
Claims 10, 12, 16 and 18, which depend directly or indirectly on claims 8 and 9, essentially correspond to claims 2 and 4, respectively. Accordingly, the same reasoning as in claims 2 and 4 applies to claims 10, 12, 16 and 18.
Allowable Subject Matter
Claims 3, 5-7, 11, 13-15, 17 and 19-20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure (See PTO-892).
Detiege (US 2008/0195958) discloses an apparatus having a capture system module that captures a screen of a computer as an image. An analysis system module analyzes the image, and a layout creation system module creates a layout with new virtual objects of the screen, where the apparatus is utilized to recognize and localize objects, e.g., input fields, buttons, icons, check boxes, and text, on the screen.
Li et al. (US 2014/0253559) discloses identifying and performing an action on a UI element. A different runtime image of the text string is generated by drawing the other runtime image with a different set of text property values when no UI element on the UI matches the runtime image.
A reference to specific paragraphs, columns, pages, or figures in a cited prior art reference is not limited to preferred embodiments or any specific examples. It is well settled that a prior art reference, in its entirety, must be considered for all that it expressly teaches and fairly suggests to one having ordinary skill in the art. Stated differently, a prior art disclosure reading on a limitation of Applicant's claim cannot be ignored on the ground that other disclosed embodiments were instead cited. Therefore, the Examiner's citation to a specific portion of a single prior art reference is not intended to exclusively dictate, but rather to demonstrate an exemplary disclosure commensurate with the specific limitations being addressed. In re Heck, 699 F.2d 1331, 1332-33, 216 USPQ 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 USPQ 275, 277 (CCPA 1968)); Upsher-Smith Labs. v. Pamlab, LLC, 412 F.3d 1319, 1323, 75 USPQ2d 1213, 1215 (Fed. Cir. 2005); In re Fritch, 972 F.2d 1260, 1264, 23 USPQ2d 1780, 1782 (Fed. Cir. 1992); Merck & Co. v. Biocraft Labs., Inc., 874 F.2d 804, 807, 10 USPQ2d 1843, 1846 (Fed. Cir. 1989); In re Fracalossi, 681 F.2d 792, 794 n.1, 215 USPQ 569, 570 n.1 (CCPA 1982); In re Lamberti, 545 F.2d 747, 750, 192 USPQ 278, 280 (CCPA 1976); In re Bozek, 416 F.2d 1385, 1390, 163 USPQ 545, 549 (CCPA 1969).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CAO H NGUYEN whose telephone number is (571)272-4053. The examiner can normally be reached on Mon-Fri 9am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kieu Vu can be reached on 571-272-4057. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CAO H NGUYEN/ Primary Examiner, Art Unit 2171