Prosecution Insights
Last updated: April 19, 2026
Application No. 18/501,667

APPARATUS AND METHODS FOR AUGMENTING VISION WITH REGION-OF-INTEREST BASED PROCESSING

Status: Non-Final OA (§103)
Filed: Nov 03, 2023
Examiner: DEMETER, HILINA K
Art Unit: 2617
Tech Center: 2600 — Communications
Assignee: Softeye Inc.
OA Round: 3 (Non-Final)

Grant Probability: 72% (Favorable)
Expected OA Rounds: 3-4
Median Time to Grant: 3y 1m
Grant Probability with Interview: 91%

Examiner Intelligence

Career Allow Rate: 72%, above average (472 granted / 659 resolved; +9.6% vs TC avg)
Interview Lift: +19.4% across resolved cases with an interview (a strong lift)
Typical Timeline: 3y 1m average prosecution; 27 applications currently pending
Career History: 686 total applications across all art units
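These headline figures are simple ratios. As a quick check in Python (the four inputs come from the card above; treating the +9.6% delta as percentage points is our assumption, not something the tool states):

```python
granted, resolved = 472, 659                    # from the card above
allow_rate = granted / resolved
print(f"career allow rate: {allow_rate:.1%}")   # 71.6%, shown rounded as 72%

# Assuming the +9.6% delta is in percentage points, the implied TC average:
print(f"implied TC average: {allow_rate - 0.096:.1%}")  # ~62.0%
```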

Statute-Specific Performance

§101: 8.7% (-31.3% vs TC avg)
§102: 14.5% (-25.5% vs TC avg)
§103: 61.0% (+21.0% vs TC avg)
§112: 6.7% (-33.3% vs TC avg)

Tech Center averages are estimates. Based on career data from 659 resolved cases.
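Each row pairs the examiner's rate with a delta against the Tech Center baseline, so the baseline itself can be recovered by subtraction. A short consistency check, again assuming the deltas are percentage points:

```python
# (examiner rate %, delta vs TC avg in points), copied from the table above
rows = {"101": (8.7, -31.3), "102": (14.5, -25.5),
        "103": (61.0, 21.0), "112": (6.7, -33.3)}

for statute, (rate, delta) in rows.items():
    print(f"§{statute}: examiner {rate:.1f}% vs TC avg {rate - delta:.1f}%")
# Every row recovers a TC average of 40.0%, so the four deltas are
# internally consistent with a single baseline estimate.
```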

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 01/22/2026 has been entered.

Information Disclosure Statement

The information disclosure statement (IDS) submitted has been considered by the examiner.

Response to Arguments

Applicant's arguments with respect to claims 9-19 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 9-15 are rejected under 35 U.S.C. 103 as being unpatentable over Lacey et al. (US Publication Number 2019/0362557 A1, hereinafter "Lacey") in view of Alcock et al. (US Publication Number 2020/0026949 A1, hereinafter "Alcock").

(1) Regarding claim 9: As shown in fig. 2A, Lacey disclosed a smart glasses apparatus (200, fig. 2A, para. [0097], note that the wearable system 200 includes a display 220, and various mechanical and electronic modules and systems to support the functioning of display 220), comprising: a physical frame comprising a hands-free trigger mechanism (230, fig. 2A, para. [0097], note that the display 220 may be coupled to a frame 230); a display (para. [0094], note that the wearable system 200 includes a display 220, and various mechanical and electronic modules and systems to support the functioning of display 220); a processor subsystem (para. [0109], note that the head mounted wearable component (58) also features a processor (128)); and a non-transitory computer-readable medium comprising instructions that, when executed by the processor subsystem (para. [0101], note that the local processing and data module 260 may comprise a hardware processor, as well as digital memory, such as non-volatile memory (e.g., flash memory)), cause the smart glasses apparatus to: capture an image via an outward-facing camera assembly in response to a hands-free trigger condition (para. [0357], note that the wearable system may receive a user input selecting element(s) for editing in any desired form including, but not limited to, a speech input, a gaze input (e.g., via the inward-facing imaging system 462), a gesture input (e.g., as captured by the outward-facing imaging system 464)); and display the region-of-interest on the display of the smart glasses apparatus (para. [0244], note that the virtual content 1926 is initially displayed within the boundary polygon 1972a on the wall surface 1915. The user can select the virtual object 1926, for example, through a cone cast or a multimodal input (including two or more of the gesture 1960, head pose 1920, eye gaze, or an input from the user input device 1940)).

Lacey disclosed most of the subject matter described above except for specifically teaching: crop a region-of-interest from the image, forming a cropped image; send the cropped image to a companion device; receive a request for image data adjacent to the region-of-interest from the companion device based on an active panning instruction; and send the image data adjacent to the region-of-interest to the companion device in response to the request.

However, Alcock disclosed crop a region-of-interest from the image forming a cropped image (para. [0137], note that in each image frame, an image of the Object within a bounding box surrounding the Object is extracted from the image frame and the image is a cropped bounding box); send the cropped image to a companion device (para. [0137], note that the cropped bounding box 404 is sent with the metadata to the server 406 for further processing. The cropped bounding box 404 represents the image of the Object over this tracked period of time); receive a request for image data adjacent to the region-of-interest from the companion device based on an active panning instruction (para. [0137], note that the more computationally intensive object detection and classification module may be another neural network to resolve or extract the Object from another overlapping or closely adjacent object); and send the image data adjacent to the region-of-interest to the companion device in response to the request (para. [0137], note that more than one cropped bounding box may be picked from the list of top 10 cropped bounding boxes for sending to the server 406. For example, another cropped bounding box selected by the highest confidence level may be sent as well).

At the time of filing of the invention, it would have been obvious to a person of ordinary skill in the art to crop a region-of-interest from the image, forming a cropped image; send the cropped image to a companion device; receive a request for image data adjacent to the region-of-interest from the companion device based on an active panning instruction; and send the image data adjacent to the region-of-interest to the companion device in response to the request. The suggestion/motivation for doing so would have been to optimize resource utilization by detecting and recognizing objects with a desired accuracy (para. [0005]). Therefore, it would have been obvious to combine Lacey with Alcock to obtain the invention as specified in claim 9.

(2) Regarding claim 10: Lacey further disclosed the smart glasses apparatus of claim 9, where the hands-free trigger mechanism comprises an eye-tracking camera and where the hands-free trigger condition comprises a gaze fixation (para. [0165], note that the vergence of the user's eyes can be tracked and an accommodation/vergence model can be used to determine the accommodation state of the user's eyes, which provides information on a rendering plane on which the user is focusing).

(3) Regarding claim 11: Lacey further disclosed the smart glasses apparatus of claim 10, where the eye-tracking camera is further configured to identify a gaze point (para. [0357], note that the wearable system may receive a user input in the form of a user's gaze lingering on a particular word for a threshold period of time or may receive a user's gaze on a particular word at the same time) and the instructions, when executed by the processor subsystem, further cause the smart glasses apparatus to select the region-of-interest based on the gaze point (para. [0244], note that the user can select the virtual object 1926, for example, through a cone cast or a multimodal input (including two or more of the gesture 1960, head pose 1920, eye gaze, or an input from the user input device 1940)).

(4) Regarding claim 12: Lacey further disclosed the smart glasses apparatus of claim 9, where the hands-free trigger mechanism comprises the outward-facing camera assembly and where the hands-free trigger condition comprises a user gesture (para. [0411], note that the system may fuse together head pose, eye pose, and hand gesture inputs, thereby allowing the user to look and point to select an object (while using the head and eye gaze pose inputs to increase the accuracy of the hand gesture input)).

(5) Regarding claim 13: Lacey further disclosed the smart glasses apparatus of claim 12, where the processor subsystem is further configured to identify the region-of-interest based on the user gesture (para. [0256], note that in FIG. 21, the virtual objects within the circle 2122 may have a higher confidence score than the virtual objects in-between the circle 2122 and the circle 2124 because the objects that are closer to the user's position 2120 are more likely to be the objects that the user is interested in interacting with).

(6) Regarding claim 14: Lacey further disclosed the smart glasses apparatus of claim 9, where the hands-free trigger mechanism comprises a microphone and where the hands-free trigger condition comprises a voice command (para. [0254], note that the direct user inputs may include head pose, eye gaze, voice input, gesture, inputs from a user input device, or other inputs that come directly from a user).

(7) Regarding claim 15: Lacey further disclosed the smart glasses apparatus of claim 9, where the hands-free trigger mechanism comprises an inertial measurement unit (IMU) and where the hands-free trigger condition comprises a head movement (para. [0206], note that the HMD can recognize head poses 1614 using an IMU. A head 1410 may have multiple degrees of freedom).

Claims 16-19 are rejected under 35 U.S.C. 103 as being unpatentable over Strawn et al. (US Publication Number 2024/0248527 A1, hereinafter "Strawn").

(1) Regarding claim 16: As shown in fig. 6, Strawn disclosed a method for region-of-interest based processing (para. [0071], note that FIG. 6 illustrates control of interaction with displayed content such as a 3D interactive map based on gaze focus in conjunction with secondary inputs, according to examples. Diagram 600 shows identification of a focus area 608 (location of interest) on displayed content 606 (a 3D interactive map, for example) through detection of a gaze 604 of an eye 602 and performance of an action on the focus area (612) based on a secondary input 610), comprising: monitoring for a gaze fixation (para. [0018], note that a location of interest in displayed content may be identified through the user's gaze and/or fixation/saccades); determining a duration of the gaze fixation (para. [0064], note that determining the gaze of the user may include determination of a fixation and/or saccades. During fixation, the eyes may stop scanning a scene and hold the foveal area of the field of vision in one place, which may be interpreted as a location of interest); determining a zoom level based on the duration of the gaze fixation (para. [0064], note that once a location of interest is identified (through fixation), a number of actions specific to that location may be available through secondary input. For example, the user may be enabled to zoom, rotate, pan, or move the location of interest through the finger gestures 504 and/or the hand gestures 506); capturing a first image via an outward-facing camera assembly based on gaze fixation and zoom level (para. [0071], note that diagram 600 shows identification of a focus area 608 (location of interest) on displayed content 606 (a 3D interactive map, for example) through detection of a gaze 604 of an eye 602 and performance of an action on the focus area (612) based on a secondary input 610. Also see para. [0055], note that the camera 340 captures images of the physical environment in the field of view. The captured images may be processed by a virtual reality engine to add virtual objects to the captured images or modify physical objects in the captured images).

Strawn disclosed most of the subject matter described above except for specifically teaching determining a region-of-interest within the first image via a computer vision logic. However, it would have been obvious for Strawn to teach determining a region-of-interest within the first image via a computer vision logic (para. [0084], note that the displayed content may be adjusted based on the performed action. For example, a portion of the displayed content may be rotated, zoomed, or panned. Upon completion of the adjustment of the displayed content, new actions may be made available to the user to provide secondary input and/or the user's gaze may be tracked to determine if the location of interest). At the time of filing of the invention, it would have been obvious to a person of ordinary skill in the art to determine a region-of-interest within the first image via a computer vision logic. The suggestion/motivation for doing so would have been to display content based on eye tracking, where a location of interest in displayed content may be identified through the user's gaze, fixation, and/or saccades, or other actions, such as zoom, rotate, pan, move, open actionable menus, etc. (abs.). Therefore, it would have been obvious for Strawn to obtain the invention as specified in claim 16.

(2) Regarding claim 17: Strawn further disclosed the method of claim 16, where the region-of-interest is based on a gaze point (para. [0071], note that a focus area 608 (location of interest) is on displayed content 606 (a 3D interactive map, for example) through detection of a gaze 604 of an eye 602 and performance of an action on the focus area (612) based on a secondary input 610).

(3) Regarding claim 18: Strawn further disclosed the method of claim 16, further comprising monitoring for a user gesture and where determining the region-of-interest is based on the user gesture (para. [0061], note that the user's gaze may focus on a building on the map, and a tool tip listing businesses in the building may be displayed upon detection of the focus of the user's gaze. The user may then select one of the listed businesses through secondary input (e.g., eye gesture, hand gesture, finger gesture, device input, body movement, etc.)).

(4) Regarding claim 19: Strawn further disclosed the method of claim 16, where the first image is captured at a first resolution and where the method further comprises capturing a second image at a second resolution via the outward-facing camera assembly based on the region-of-interest (para. [0098], note that the wearable system 200 can include an outward-facing imaging system 464 which observes the world in the environment around the user. Also see para. [0107], note that the system 200 can include a forward facing "world" camera as well as a relatively high-resolution. Also see para. [0196], capturing multiple objects based on the user's FOV).

Allowable Subject Matter

Claims 1-6, 8, and 21-22 are allowed. The following is an examiner's statement of reasons for allowance: the prior art made of record does not teach "determine a trigger condition has been satisfied; in response to determining the trigger condition has been satisfied, capture the image based on the trigger condition; cause the logic to determine the region-of-interest based on the gaze point; crop the region-of-interest from the image, generating a cropped region-of-interest image; augment the cropped region-of-interest image to improve perceptibility of one or more features within the region-of-interest when displayed via the heads-up display, forming an augmented region-of-interest image; and display the augmented region-of-interest image (in combination with all the limitations stated in claim 1)". Claims 2-6, 8, and 21-22 are allowed as they depend on claim 1.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Brower (US Publication Number 2021/0027103 A1) disclosed that object detections of a machine learning model are leveraged to automatically generate new ground truth data for images captured at different perspectives. Wexler et al. (US Publication Number 2017/0195629 A1) disclosed a wearable apparatus for capturing and processing images from an environment of a user.

Any inquiry concerning this communication or earlier communication from the examiner should be directed to Hilina K Demeter, whose telephone number is (571) 270-1676. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, King Y. Poon, can be reached at (571) 270-0728. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/HILINA K DEMETER/
Primary Examiner, Art Unit 2617
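For readers who follow claim language more easily as data flow, the claim 9 limitations the rejection maps to Lacey and Alcock amount to a small device-to-companion protocol: capture on a hands-free trigger, crop a region-of-interest, send the crop, then answer an active-panning request with the adjacent pixels. The Python sketch below is a reading aid only; every name in it (ROI, SmartGlasses, camera.capture, link.send) is hypothetical, not code from the application or the cited references.

```python
from dataclasses import dataclass

@dataclass
class ROI:
    x: int  # left column of the region-of-interest
    y: int  # top row
    w: int  # width in pixels
    h: int  # height in pixels

def crop(image, roi):
    """Crop a region-of-interest from a full frame (NumPy-style 2D array)."""
    y0, x0 = max(roi.y, 0), max(roi.x, 0)
    return image[y0:y0 + roi.h, x0:x0 + roi.w]

class SmartGlasses:
    """Hypothetical claim 9 flow: trigger -> capture -> crop -> send,
    then serve the companion's request for pixels adjacent to the ROI."""

    def __init__(self, camera, link):
        self.camera = camera  # outward-facing camera assembly
        self.link = link      # transport to the companion device
        self.frame = None     # last captured full frame
        self.roi = None       # last region-of-interest

    def on_trigger(self, roi):
        # Hands-free trigger fired (gaze fixation, gesture, voice command,
        # or IMU head movement per claims 10-15); ROI comes from the trigger.
        self.frame = self.camera.capture()
        self.roi = roi
        self.link.send("cropped_image", crop(self.frame, roi))

    def on_pan_request(self, dx, dy):
        # Companion sent an active panning instruction: reply with image
        # data adjacent to the ROI, sliced from the retained full frame.
        r = self.roi
        adjacent = ROI(r.x + dx, r.y + dy, r.w, r.h)
        self.link.send("adjacent_image", crop(self.frame, adjacent))
```

Note that retaining the full frame on-device is what lets the glasses answer a pan request without recapturing; the claim itself does not dictate that design choice.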
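Claim 16's method steps are similarly mechanical: monitor for a gaze fixation, time it, derive a zoom level from the duration, and capture. A minimal sketch under invented thresholds; the claim recites no particular values, and eye_tracker.sample is an assumed API, not anything from Strawn or the application.

```python
import time

# All thresholds below are invented for illustration; claim 16 recites none.
FIXATION_MIN_S = 0.3  # dwell time before a gaze counts as a fixation
ZOOM_STEP_S = 0.5     # each extra 0.5 s of dwell adds one zoom step
MAX_ZOOM = 4

def zoom_for_duration(duration_s):
    """Map gaze-fixation duration to a zoom level (claim 16, third step)."""
    if duration_s < FIXATION_MIN_S:
        return 0  # not yet a fixation; nothing to capture
    extra_steps = int((duration_s - FIXATION_MIN_S) / ZOOM_STEP_S)
    return min(1 + extra_steps, MAX_ZOOM)

def capture_on_fixation(eye_tracker, camera):
    """Claim 16 loop: monitor for a fixation, time it, derive the zoom
    level from the duration, then capture via the outward-facing camera."""
    fixation_start = None
    while True:
        gaze = eye_tracker.sample()  # hypothetical eye-tracking API
        if gaze.is_fixated:
            if fixation_start is None:
                fixation_start = time.monotonic()
            zoom = zoom_for_duration(time.monotonic() - fixation_start)
            if zoom > 0:
                return camera.capture(zoom=zoom, center=gaze.point)
        else:
            fixation_start = None  # fixation broken; restart the timer
```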

Prosecution Timeline

Nov 03, 2023 · Application Filed
Mar 04, 2025 · Non-Final Rejection — §103
Jun 25, 2025 · Interview Requested
Jul 02, 2025 · Examiner Interview Summary
Jul 02, 2025 · Applicant Interview (Telephonic)
Jul 07, 2025 · Response Filed
Sep 17, 2025 · Final Rejection — §103
Jan 05, 2026 · Examiner Interview Summary
Jan 05, 2026 · Applicant Interview (Telephonic)
Jan 22, 2026 · Request for Continued Examination
Jan 29, 2026 · Response after Non-Final Action
Feb 20, 2026 · Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602864: EVENT ROUTING IN 3D GRAPHICAL ENVIRONMENTS
Granted Apr 14, 2026 (2y 5m to grant)

Patent 12592042: SYSTEMS AND METHODS FOR MAINTAINING SECURITY OF VIRTUAL OBJECTS IN A DISTRIBUTED NETWORK
Granted Mar 31, 2026 (2y 5m to grant)

Patent 12586297: INTERACTIVE IMAGE GENERATION
Granted Mar 24, 2026 (2y 5m to grant)

Patent 12579724: EXPRESSION GENERATION METHOD AND APPARATUS, DEVICE, AND MEDIUM
Granted Mar 17, 2026 (2y 5m to grant)

Patent 12561906: METHOD FOR GENERATING AT LEAST ONE GROUND TRUTH FROM A BIRD'S EYE VIEW
Granted Feb 24, 2026 (2y 5m to grant)
Based on this examiner's 5 most recent grants; studying what changed in each case suggests how to get past this examiner.

Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 72% (91% with interview, a +19.4% lift)
Median Time to Grant: 3y 1m
PTA Risk: High
Based on 659 resolved cases by this examiner. Grant probability derived from career allow rate.
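The with-interview figure is plain addition of the lift to the base probability; this is our arithmetic reading of the card, not a formula the tool discloses:

```python
base_probability = 0.72  # career-derived grant probability
interview_lift = 0.194   # percentage-point lift with an interview
print(f"with interview: {base_probability + interview_lift:.0%}")  # 91%
```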
