Last updated: April 19, 2026

Application No. 18/777,427

USING VISUAL CONTEXT TO IMPROVE A VIRTUAL ASSISTANT

Non-Final OA §103

Filed: Jul 18, 2024

Examiner: SIDDO, IBRAHIM

Art Unit: 2681

Tech Center: 2600 (Communications)

Assignee: Apple Inc.

OA Round: 1 (Non-Final)

Interview Optional

+13.3% interview lift. This examiner has a relatively high allow rate; a written response may suffice.

Based on 474 resolved cases, 2023–2026

Examiner Intelligence

SIDDO, IBRAHIM

Career Allow Rate: 84% (397 granted / 474 resolved), above average, +21.8% vs TC avg

Interview Lift: +13.3% (moderate)

Avg Prosecution: 2y 3m (typical timeline); 17 currently pending

Total Applications: 491 across all art units (career history)

Statute-Specific Performance

§101: 7.0% (-33.0% vs TC avg)

§103: 61.8% (+21.8% vs TC avg)

§102: 17.2% (-22.8% vs TC avg)

§112: 7.6% (-32.4% vs TC avg)

Based on career data from 474 resolved cases

Office Action

§103

DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-2, 8-14 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang (US 2020/0042819) in view of Kilarapu (US 2020/0074872).
With respect to claim 13 (similarly claims 1 and 14), Zhang teaches an electronic device (e.g. an attention memory system 10 [0039] implemented as a server/electronic device [0041]) comprising:
one or more processors (e.g. a control unit 120, [0044], [0047]);
a memory (e.g. memory 130. [0044]); and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions (e.g. the control unit 120 may execute a program stored in the memory 130, [0048] to execute the method of Fig 2) for:
receiving an image (e.g. the attention memory system 10 may acquire an image Fig 2 at step S2001. [0073]);
generating, based on the image, a plurality of questions corresponding to a first object in the image (e.g. generating based on the image, a plurality of questions corresponding to a first object in the image, see Fig 2 S2003-S2005 No to loop back to S2003 [0081], [0084]-[0085], [0091]);
However, Zhang fails to teach selecting a subset of the plurality of questions corresponding to the first object in the image; and
displaying the subset of the plurality of questions corresponding to the first object in the image.
Kilarapu teaches an electronic device for generating and displaying questions for content of multimedia [0016]-[0017], selecting a subset of the plurality of questions corresponding to the first object in the image/multimedia (e.g. selecting a subset of the relevant questions corresponding to the content of the multimedia, [0020]-[0023], [0035]-[0036]); and
displaying the subset of the plurality of questions corresponding to the first object in the image (e.g. the display control unit 106 can be configured to display the question received from the question selection unit 104 to a user while watching the chapter of the multimedia [0024], [0037]).
Zhang and Kilarapu are analogous art because they all pertain to generating questions on objects in an image/multimedia. Therefore, it would have been obvious to people having ordinary skill in the art before the effective filing date of the claimed invention to modify Zhang with the teachings of Kilarapu to include: selecting a subset of the plurality of questions corresponding to the first object in the image; and
displaying the subset of the plurality of questions corresponding to the first object in the image, as taught by Kilarapu in Figs 1-2 [0016]-[0037]. The benefit of the modification would be such that displaying the questions for each chapter/object creates highly engaged viewing experience and also provides ability to the user to assess learning efficiently, Kilarapu [0037].
With respect to claim 2, Zhang in view of Kilarapu teaches the non-transitory computer-readable storage medium of claim 1, wherein the plurality of questions is a first plurality of questions (Zhang e.g. the plurality of questions is a first plurality of questions, see Fig 2 S2003-S2005 No to loop back to S2003 [0081], [0084]-[0085], [0091]), the one or more programs further including instructions for:
generating, based on the image, a second plurality of questions corresponding to a second object in the image (e.g. generating based on the image, a second plurality of questions corresponding to a second object in the image extracted among the objects at S2002, see Fig 2 S2003-S2005 No to loop back to S2003 [0081], [0084]-[0085], [0091]);
selecting a subset of the first plurality of questions and a subset of the second plurality of questions as suggested questions (e.g. selecting a subset of the first plurality of questions and a subset of the second plurality of questions as suggested questions, as modified by Kilarapu in  [0020]-[0023], [0035]-[0036] of Kilarapu); and
displaying the suggested questions (e.g. displaying the suggested questions as modified by Kilarapu in the display control unit 106 can be configured to display the question received from the question selection unit 104 to a user while watching the chapter of the multimedia [0024], [0037]).
With respect to claim 8, Zhang in view of Kilarapu teaches the non-transitory computer-readable storage medium of claim 1, the one or more programs further including instructions for:
detecting selection of a question of the subset of questions corresponding to the first object in the image (Zhang e.g.  the attention memory system 10 may generate a question for identifying a preset object on the image [0081], [0084]-[0085], [0092]); and
in response to detecting selection of the question of the subset of questions corresponding to the first object in the image (Zhang e.g. yes S2005 Fig 2): determining information about the first object based on the selected question (Zhang e.g. the attention memory system 10 may select the preset object from candidate objects on the image through comparison with the attention memory information which is information about a candidate object close to the preset object, the space information of the objects on the image, and the categories of the objects [0094]); and
providing the information about the first object (Zhang e.g. [0093]-[0094] provide information about the first object).
With respect to claim 9, Zhang in view of Kilarapu teaches the non-transitory computer-readable storage medium of claim 8, wherein the information about the first object references contextual data of the electronic device (Zhang e.g. the information about the first object references contextual data of the electronic device as suggested in [0090] i.e. the relevance of the question).
With respect to claim 10, Zhang in view of Kilarapu teaches the non-transitory computer-readable storage medium of claim 1, wherein displaying the subset of the plurality of questions corresponding to the first object in the image further comprises: displaying the subset of questions as a virtual item in a view of the electronic device (Kilarapu e.g. [0037] suggests displaying the subset of questions as a virtual item in a view of the electronic device).
With respect to claim 11, Zhang in view of Kilarapu teaches the non-transitory computer-readable storage medium of claim 1, wherein generating, based on the image, a plurality of questions corresponding to a first object in the image and selecting a subset of the plurality of questions corresponding to the first object in the image are performed by a neural network (Zhang e.g. [0008]-[0010] suggest that generating, based on the image, a plurality of questions corresponding to a first object in the image and selecting a subset of the plurality of questions corresponding to the first object in the image are performed by a neural network).
With respect to claim 12, Zhang in view of Kilarapu teaches the non-transitory computer-readable storage medium of claim 1, the one or more programs further including instructions for: determining information corresponding to each question of the subset of questions (Zhang e.g. [0081]-[0085] determine information corresponding to each question of the subset of questions); and displaying the information corresponding to each question of the subset of questions (Zhang e.g. [0045] and [0076] suggest displaying the information corresponding to each question of the subset of questions).

Claim(s) 3-7 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang (US 2020/0042819) in view of Kilarapu (US 2020/0074872) and further in view of Gupta (US 2018/0288477).
With respect to claim 3, Zhang in view of Kilarapu teaches the non-transitory computer-readable storage medium of claim 1 including wherein generating, based on the image, a plurality of questions corresponding to the first object in the image, see Fig 2 S2003-S2005 No to loop back to S2003 [0081], [0084]-[0085], [0091].
However, Zhang fails to teach wherein generating, based on the image, a plurality of questions corresponding to the first object in the image further comprises:
determining whether a gaze of a user includes the first object; and
in accordance with a determination that the gaze of the user includes the first object, selecting the first object for generating the plurality of questions.
Gupta teaches a media guidance application to: 
determining whether a gaze of a user includes a first object (e.g. determining whether a gaze of a user includes a first object/a first character, see Fig 1 [0062]-[0064], [0072]-[0074], [0084]); and
in accordance with a determination that the gaze of the user includes the first object (e.g. in accordance with a determination that the gaze of the user includes the first object/character i.e. If the user answers “yes” to the media guidance application's prompt [0068], [0099]-[0100]), selecting the first object for generating the plurality of questions (e.g. selecting Jack Gleeson for generating the plurality of questions as suggested in [0062]-[0068], [0099]-[0100]).
Zhang and Gupta are analogous art because they all pertain to asking questions about objects/characters in an image. Therefore, it would have been obvious to people having ordinary skill in the art before the effective filing date of the claimed invention to modify Zhang with the teachings of Gupta to include: wherein generating, based on the image, a plurality of questions corresponding to the first object in the image further comprises:
determining whether a gaze of a user includes the first object; and
in accordance with a determination that the gaze of the user includes the first object, selecting the first object for generating the plurality of questions, as suggested by Gupta in Figs 1-3. The benefit of the modification would be to disambiguate between the objects/characters to satisfy the user’s queries about the objects.
With respect to claim 4, Zhang in view of Kilarapu and further in view of Gupta teaches the non-transitory computer-readable storage medium of claim 3, wherein determining whether the gaze of the user includes the first object further comprises:
determining a length of time that the gaze of the user is directed at the first object (Gupta e.g. determining a length of time 6:50-6:52 PM that the gaze of the user is directed at the first object/character, as suggested in [0063]-[0068], see also the viewing log of [0078]); and
in accordance with a determination that the length of time that the gaze of the user is directed at the first object exceeds a predetermined threshold (Gupta e.g. in accordance with a determination that the length of time that the gaze of the user is directed at the first object exceeds a predetermined threshold i.e. 2 seconds from 6:50-6:52 PM as suggested in [0063]-[0068], [0078]), determining that the gaze of the user includes the first object (Gupta e.g. determining that the gaze of the user includes the first object/character Jack Gleeson as suggested in [0063]-[0068], [0099]-[0100]).
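For readers mapping the claim 4 language onto something concrete, the dwell-time test recited above might be sketched as follows. This is an illustrative reading only, not the applicant's implementation and not anything disclosed in Gupta; the `GazeSample` type, the `gaze_includes_object` function, the sample data, and the 2-second threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GazeSample:
    timestamp_s: float  # time of the sample, in seconds
    object_id: str      # hypothetical identifier of the object under the gaze


def gaze_includes_object(samples: List[GazeSample], object_id: str,
                         threshold_s: float) -> bool:
    """Return True if the gaze rests on object_id for one continuous
    run lasting longer than threshold_s."""
    run_start = None
    for sample in samples:
        if sample.object_id == object_id:
            if run_start is None:
                run_start = sample.timestamp_s  # a new run begins here
            if sample.timestamp_s - run_start > threshold_s:
                return True
        else:
            run_start = None  # gaze moved away, so the run is reset
    return False


# Hypothetical gaze traces, for illustration only.
steady = [GazeSample(t, "dog") for t in (0.0, 0.5, 1.0, 1.5, 2.5)]
interrupted = [
    GazeSample(0.0, "dog"),
    GazeSample(1.0, "dog"),
    GazeSample(1.5, "tree"),
    GazeSample(2.0, "dog"),
    GazeSample(3.0, "dog"),
]

steady_result = gaze_includes_object(steady, "dog", threshold_s=2.0)
interrupted_result = gaze_includes_object(interrupted, "dog", threshold_s=2.0)
print(steady_result, interrupted_result)
```

The sketch assumes the threshold applies to a single uninterrupted run; a cumulative-time reading of the claim would be an equally plausible alternative.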
With respect to claim 5, Zhang in view of Kilarapu teaches the non-transitory computer-readable storage medium of claim 1 including wherein generating, based on the image, a plurality of questions corresponding to the first object in the image, see Fig 2 S2003-S2005 No to loop back to S2003 [0081], [0084]-[0085], [0091].
However, Zhang fails to teach wherein generating, based on the image, a plurality of questions corresponding to the first object in the image further comprises:
determining a location of the electronic device; and
selecting the first object in the image based on the location of the electronic device.
Gupta teaches determining a location of the electronic device (e.g. distance 124 determines a location of the electronic device of Fig 1 as suggested in [0069]-[0070]); and
selecting the first object in the image based on the location of the electronic device (selecting a first object/character 118 based on the distance 124 of the electronic device, see Fig 1).
Zhang and Gupta are analogous art because they all pertain to asking questions about objects/characters in an image. Therefore, it would have been obvious to people having ordinary skill in the art before the effective filing date of the claimed invention to modify Zhang with the teachings of Gupta to include: wherein generating, based on the image, a plurality of questions corresponding to the first object in the image further comprises:
determining a location of the electronic device; and
selecting the first object in the image based on the location of the electronic device, as suggested by Gupta in Fig 1. The benefit of the modification would be to determine the position of eye 104 relative to display 114 Gupta [0070].
With respect to claim 6, Zhang in view of Kilarapu teaches the non-transitory computer-readable storage medium of claim 1 including selecting the subset of the plurality of questions corresponding to the first object in the image and selecting the predetermined number of questions as the subset of the plurality of questions in [0020]-[0023], [0035]-[0036] of Kilarapu.
However, Zhang fails to teach wherein selecting the subset of the plurality of questions corresponding to the first object in the image further comprises:
determining a corresponding weight for each of the plurality of questions;
ranking the plurality of questions based on the corresponding weights;
determining a predetermined number of questions of the plurality of questions based on the ranking;
Gupta teaches determining a corresponding weight for each of the plurality of questions (e.g. the media guidance application may also store a weighting profile in the user profile, which indicates how important an entity is (or how often the user looks at said entity) [0063]);
ranking the plurality of questions based on the corresponding weights (ranking entity 118 a weight of 40%, entity 120 a weight of 30% and a third entity a weight of 30%, see [0030]-[0031], [0081], [0098]);
determining a predetermined number of questions of the plurality of questions based on the ranking (e.g. increasing or decreasing the weight/ a predetermined number of questions of the plurality of questions based on the ranking as suggested in [0098], [0100], [0105] whereby where a negative answer is received about a question, questions about that character is decreased, whereas a positive answer increases the number of questions); 
Zhang and Gupta are analogous art because they all pertain to asking questions about objects/characters in an image. Therefore, it would have been obvious to people having ordinary skill in the art before the effective filing date of the claimed invention to modify Zhang with the teachings of Gupta to include: wherein selecting the subset of the plurality of questions corresponding to the first object in the image further comprises: determining a corresponding weight for each of the plurality of questions; ranking the plurality of questions based on the corresponding weights; determining a predetermined number of questions of the plurality of questions based on the ranking; and selecting the predetermined number of questions as the subset of the plurality of questions, as suggested by Gupta in Figs 1-4. The benefit of the modification would be to disambiguate between the objects/characters to satisfy the user’s queries about the objects. 
With respect to claim 7, Zhang in view of Kilarapu teaches the non-transitory computer-readable storage medium of claim 6, wherein the corresponding weight is an indication of the questions relevance and wherein the relevance is based on at least one of contextual data, digital assistant interaction history, and popularity of the question (e.g. the corresponding weight of 30%, 40% of [0030]-[0031], [0081], [0098], [0105] is an indication of the questions relevance and wherein the relevance is based on at least one of contextual data, digital assistant interaction history, and popularity of the question).
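The selection steps recited in claims 6 and 7 (assign a weight to each question, rank by weight, keep a predetermined number) can be pictured with a short sketch. It is offered only as one plausible reading of the claim language, not as the applicant's or any cited reference's method; the function name `select_top_questions`, the example questions, and the weights are hypothetical.

```python
from typing import Dict, List


def select_top_questions(weights: Dict[str, float], k: int) -> List[str]:
    """Rank questions by weight (highest first) and keep the first k."""
    ranked = sorted(weights, key=lambda q: weights[q], reverse=True)
    return ranked[:k]


# Hypothetical questions and relevance weights, for illustration only.
candidate_weights = {
    "What breed is this dog?": 0.40,
    "How old is this dog?": 0.30,
    "Where can I buy this leash?": 0.10,
    "Is this dog friendly?": 0.20,
}

# k plays the role of the "predetermined number" of questions.
suggested = select_top_questions(candidate_weights, k=2)
print(suggested)
```

Under this reading the weights attach to individual questions. The passages the action cites from Gupta describe weights attached to entities, which is a distinction a response may want to examine.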
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to IBRAHIM SIDDO whose telephone number is (571)272-4508. The examiner can normally be reached 9:00-5:30PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Akwasi Sarpong, can be reached at 571-270-3438. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/IBRAHIM SIDDO/Primary Examiner, Art Unit 2681

Prosecution Timeline

Jul 18, 2024

Application Filed

Feb 21, 2025

Response after Non-Final Action

Feb 17, 2026

Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/227,884

Patent 12592233

GAZE-BASED COMMAND DISAMBIGUATION

2y 5m to grant Granted Mar 31, 2026

18/519,027

Patent 12587606

METHOD FOR MANUFACTURING A DECORATIVE SHEET AND A METHOD FOR MANUFACTURING A DECORATIVE PANEL COMPRISING A DECORATIVE SHEET

2y 5m to grant Granted Mar 24, 2026

18/155,514

Patent 12572092

OPTICAL DEVICE, IMAGE READING DEVICE, AND ASSEMBLING METHOD

2y 5m to grant Granted Mar 10, 2026

18/374,575

Patent 12572573

SESSION-BASED USER AWARENESS IN LARGE LANGUAGE MODELS

2y 5m to grant Granted Mar 10, 2026

18/423,503

Patent 12574465

ELECTRONIC DEVICE

2y 5m to grant Granted Mar 10, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

1-2

Expected OA Rounds

84%

Grant Probability

97%

With Interview (+13.3%)

2y 3m

Median Time to Grant

Low

PTA Risk

Based on 474 resolved cases by this examiner. Grant probability derived from career allow rate.
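The projection figures can be checked against the counts quoted on this page. A minimal sketch, assuming the interview lift is simply added in percentage points to the career allow rate (the page does not say how the 97% figure is derived, so the additive step is an assumption):

```python
# Counts quoted on this page for the examiner.
granted = 397
resolved = 474

# Career allow rate as a percentage of resolved cases.
allow_rate = 100 * granted / resolved

# Assumption: the interview lift is added in percentage points.
interview_lift = 13.3
with_interview = allow_rate + interview_lift

print(f"Career allow rate: {allow_rate:.1f}%")
print(f"With interview (assumed additive): {with_interview:.1f}%")
```

Rounded to whole percentages, both values agree with the 84% and 97% shown above.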
