Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This Office action is in response to the correspondence received 02/12/2026 regarding application 18/646,001, in which claims 1, 5, 8, 15, and 18 were amended, claim 9 was cancelled, and new claim 21 was added. Claims 1-8 and 10-21 are pending in the application and have been considered.
Response to Arguments
Applicant has amended independent claims 1, 8, and 15 by incorporating language previously found in now-cancelled claim 9; claims 1, 8, 9, and 15 had all been previously rejected as anticipated by Shim. Applicant argues that Shim does not disclose “providing, by the user device, the intent data and one or more prompts to a large language model (LLM) system, wherein multiple intents, within the intent data, are prioritized based on weights assigned to the multiple intents,” as recited by independent claim 1. As evidence, Applicant reproduces paragraph 131 of the Shim reference and emphasizes that Shim dynamically adjusts weights based on shifts in user attention; for example, a shift in eye movement toward the term “hydroelectricity” on the screen causes the processor to update the weights associated with the tokens. Applicant then argues that this adjusting of the weights based on shifts in user attention does not disclose the claim limitation in question.
Applicant’s argument has been considered and is not persuasive. Shim plainly describes augmenting the original prompt tokens with the weighted contextual tokens to generate an enhanced prompt that is provided to the LLM, see [0131-0134]. The examiner maintains that Shim discloses providing, by the user device, the intent data and one or more prompts to a large language model (LLM) system because, in Shim, the user prompt tokens are augmented with the weighted contextual tokens, which represent the gaze and intent data, and are provided to the LXM, [0052], which is a large generative model such as an LLM, Abstract. Shim further discloses wherein multiple intents, within the intent data, are prioritized based on weights assigned to the multiple intents because Shim describes assigning weights to contextual tokens using eye-tracking data, e.g. “solar power” and “hydroelectricity”, which are updated based on shifts in user attention, [0129-0131]. It is unclear where exactly the purported distinction from Shim occurs, because there does not appear to be any particular difference between the weights in Shim and the claimed “weights assigned to the multiple intents”. It is noted that the Shim elements “hydroelectricity” and “wind turbines” correspond to the claimed “intent data” since the user’s query “Tell me about renewable energy” is interpreted using contextual information such as gaze to ascertain, e.g., which form of renewable energy the user intends to learn about, see Shim [0130]. For example, if the user utters this query while looking at the word “wind”, that term is given a higher weight in the tokens making up the enhanced prompt, since the user more likely intends to learn about wind energy versus hydroelectricity.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-5, 7, 8, 12, 13, 15-17, and 20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Shim et al. (US 20250131023).
Consider claim 1, Shim discloses a method (method, [0111]), comprising:
receiving, by a user device, a user interface that includes content (e.g. a webpage section, [0124], with content such as words, sentences, or paragraphs to be displayed on laptop computer, [0024], [0142]);
providing, by the user device, the user interface for display to a user of the user device (subject matter displayed on the computing device, [0114], such as a current paragraph, [0058]);
receiving, by the user device, a user interaction with the user interface (the user speaks e.g. “Illustrate the current paragraph for my child.”, [0058]);
calculating, by the user device and based on the user interaction, gaze data identifying a gaze of the user, a dwell time of the gaze, and an eye behavior of the user relative to the content (using eye gaze tracking to identify the word, sentence, or paragraph the user is looking at while speaking, [0058], [0114], as well as gaze duration, i.e. dwell time, and saccades and fixations, i.e. eye behavior, [0124]; these are tokenized as contextual information, i.e. calculated gaze data, [0126-0127]);
generating, by the user device, intent data based on the gaze data (identifying the paragraph the user is looking at while speaking an otherwise indefinite prompt, i.e. which paragraph the user intends for the LLM to illustrate, [0058]; this becomes enhanced prompt information, which summarizes the user’s intent, [0040]);
providing, by the user device, the intent data and one or more prompts to a large language model (LLM) system (the user prompt tokens are augmented with the weighted contextual tokens, which represent the gaze and intent data, and provided to the LXM, [0052], which is a large generative model such as an LLM, Abstract), wherein multiple intents, within the intent data, are prioritized based on weights assigned to the multiple intents (assigning weights to contextual tokens using eye-tracking data, e.g. “solar power” and “hydroelectricity”, which are updated based on shifts in user attention, [0129-0131]); and
receiving, by the user device, one or more responses from the LLM system based on providing the intent data and the one or more prompts to the LLM system (response from the LXM, [0052]).
Consider claim 8, Shim discloses a user device (laptop computer, [0142]), comprising:
one or more processors (processor, [0142]) configured to:
receive a user interface that includes content (e.g. a webpage section, [0124], with content such as words, sentences, or paragraphs to be displayed on laptop computer, [0024], [0142]);
provide the user interface for display to a user of the user device (subject matter displayed on the computing device, [0114], such as a current paragraph, [0058]);
receive a user interaction with the user interface (the user speaks e.g. “Illustrate the current paragraph for my child.”, [0058]);
calculate, based on the user interaction, gaze data identifying a gaze of the user, a dwell time of the gaze, and an eye behavior of the user relative to the content (using eye gaze tracking to identify the word, sentence, or paragraph the user is looking at while speaking, [0058], [0114], as well as gaze duration, i.e. dwell time, and saccades and fixations, i.e. eye behavior, [0124]; these are tokenized as contextual information, i.e. calculated gaze data, [0126-0127]);
filter irrelevant gaze data to focus on particular user interactions with the content (processor filters the raw gaze data to remove outliers or anomalies, [0124-125]);
generate intent data based on the gaze data (identifying the paragraph the user is looking at while speaking an otherwise indefinite prompt, i.e. which paragraph the user intends for the LLM to illustrate, [0058]; this becomes enhanced prompt information, which summarizes the user’s intent, [0040]);
provide the intent data and one or more prompts to a large language model (LLM) system (the user prompt tokens are augmented with the weighted contextual tokens, which represent the gaze and intent data, and provided to the LXM, [0052], which is a large generative model such as an LLM, Abstract), wherein multiple intents, within the intent data, are prioritized based on weights assigned to the multiple intents (assigning weights to contextual tokens using eye-tracking data, e.g. “solar power” and “hydroelectricity”, which are updated based on shifts in user attention, [0129-0131]); and
receive one or more responses from the LLM system based on providing the intent data and the one or more prompts to the LLM system (response from the LXM, [0052]).
Consider claim 15, Shim discloses a non-transitory computer-readable medium storing a set of instructions (software applications stored in memory, [0148]), the set of instructions comprising: one or more instructions that, when executed by one or more processors of a user device (memory with instructions executed by processor, [0147-0148]), cause the user device to:
receive a user interface that includes content (e.g. a webpage section, [0124], with content such as words, sentences, or paragraphs to be displayed on laptop computer, [0024], [0142]);
provide the user interface for display to a user of the user device (subject matter displayed on the computing device, [0114], such as a current paragraph, [0058]);
receive a user interaction with the user interface (the user speaks e.g. “Illustrate the current paragraph for my child.”, [0058]);
calculate, based on the user interaction, gaze data identifying a gaze of the user, a dwell time of the gaze, and an eye behavior of the user relative to the content (using eye gaze tracking to identify the word, sentence, or paragraph the user is looking at while speaking, [0058], [0114], as well as gaze duration, i.e. dwell time, and saccades and fixations, i.e. eye behavior, [0124]; these are tokenized as contextual information, i.e. calculated gaze data, [0126-0127]);
extract context-specific features from the content based on the gaze data (contextual information, [0126-0127]);
generate intent data based on the gaze data and the context-specific features of the content (identifying the paragraph the user is looking at while speaking an otherwise indefinite prompt, i.e. which paragraph the user intends for the LLM to illustrate, [0058]; this becomes enhanced prompt information, which summarizes the user’s intent, [0040], when the user prompt tokens are augmented with the weighted contextual tokens, [0052]);
provide the intent data and one or more prompts to a large language model (LLM) system (the user prompt tokens are augmented with the weighted contextual tokens, which represent the gaze and intent data, and provided to the LXM, [0052], which is a large generative model such as an LLM, Abstract), wherein multiple intents, within the intent data, are prioritized based on weights assigned to the multiple intents (assigning weights to contextual tokens using eye-tracking data, e.g. “solar power” and “hydroelectricity”, which are updated based on shifts in user attention, [0129-0131]); and
receive one or more responses from the LLM system based on providing the intent data and the one or more prompts to the LLM system (response from the LXM, [0052]).
Consider claim 2, Shim discloses: extracting context-specific features from the content based on the gaze data (contextual information, [0126-0127]), wherein generating the intent data based on the gaze data comprises: generating the intent data based on the gaze data and the context-specific features of the content (identifying the paragraph the user is looking at while speaking an otherwise indefinite prompt, i.e. which paragraph the user intends for the LLM to illustrate, [0058]; this becomes enhanced prompt information, which summarizes the user’s intent, [0040], when the user prompt tokens are augmented with the weighted contextual tokens, [0052]).
Consider claim 3, Shim discloses: refining the intent data by correlating the gaze data with historical interaction data of the user prior to providing the intent data to the LLM system (the attention-based metrics, including gaze tracking and historical interaction data, are aggregated over time to provide a longitudinal view of user behavior and focus, and are further analyzed to refine their accuracy, [0046]).
Consider claim 4, Shim discloses: updating a user profile of the user based on the intent data (the components are configured to keep track of elements the user may have missed or engaged with for an extended period as intent data, considered a user profile, in order to determine which sentence the user intends to query about when asking “please tell me more about the sentence I was looking at two minutes ago?”, [0059-0060]).
Consider claim 5, Shim discloses the intent data is utilized by the LLM system to generate the one or more responses to the one or more prompts (responding to the user based on the prompt and user intent, [0052], [0058]).
Consider claim 7, Shim discloses: modifying the intent data based on a change in the eye behavior of the user and to generate modified intent data (tracking saccades, fixations of the eye, etc. to determine current and updated areas of interest, [0124]); and providing the modified intent data to the LLM system (adaptively updating the input provided to the LXMs based on the latest areas of interest or focus, [0057]).
Consider claim 12, Shim discloses the one or more processors are further configured to: determine a sequence of user focus areas on the content (first looking at “wind turbines”, then “hydroelectricity”, [0130-0131]); and utilize the sequence of user focus areas to enhance an accuracy of the intent data (updating the contextual token weights to focus more on hydroelectricity, [0130-0132]).
Consider claim 13, Shim discloses the one or more processors are further configured to: associate emotional states of the user with the intent data based on an analysis of the eye behavior of the user (determining the attention-based metrics by analyzing facial expressions approximating emotional conditions in conjunction with the eye behavior, [0078], [0079]).
Consider claim 16, Shim discloses the one or more instructions further cause the user device to one or more of: update a user profile of the user based on the intent data (the components are configured to keep track of elements the user may have missed or engaged with for an extended period as intent data, considered a user profile, in order to determine which sentence the user intends to query about when asking “please tell me more about the sentence I was looking at two minutes ago?”, [0059-0060]); or calibrate a gaze biometric component of the user device based on an initial interaction of the user with the user device (noting the claim language “one or more of”, this limitation is required by the claim language only in the alternative).
Consider claim 17, Shim discloses the one or more instructions further cause the user device to: modify the intent data based on a change in the eye behavior of the user and to generate modified intent data (tracking saccades, fixations of the eye, etc. to determine current and updated areas of interest, [0124]); and provide the modified intent data to the LLM system (adaptively updating the input provided to the LXMs based on the latest areas of interest or focus, [0057]).
Consider claim 20, Shim discloses the one or more instructions further cause the user device to: determine a sequence of user focus areas on the content (first looking at “wind turbines”, then “hydroelectricity”, [0130-0131]); and utilize the sequence of user focus areas to enhance an accuracy of the intent data (updating the contextual token weights to focus more on hydroelectricity, [0130-0132]).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 6 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Shim et al. (US 20250131023) in view of Venkatesh (US 20070076958).
Consider claim 6, Shim does not, but Venkatesh discloses: calibrating a gaze biometric component of the user device based on an initial interaction of the user with the user device (initially the gaze direction areas are determined, a given number of images of the pupils are captured, and a reference midpoint location is determined, [0054], [0055]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Shim by calibrating a gaze biometric component of the user device based on an initial interaction of the user with the user device in order to reduce processing complexity, as suggested by Venkatesh ([0001]). Doing so would have led to predictable results of making the gaze direction system less expensive to install and maintain, as suggested by Venkatesh ([0001]). The references cited are analogous art in the same field of gaze detection.
Consider claim 14, Shim discloses the one or more processors are configured to calculate the gaze data (using eye gaze tracking to identify the word, sentence, or paragraph the user is looking at while speaking, [0058], [0114], as well as gaze duration, i.e. dwell time, and saccades and fixations, i.e. eye behavior, [0124]; these are tokenized as contextual information, i.e. calculated gaze data, [0126-0127]).
Shim does not specifically mention tracking a horizontal and vertical ratio of the gaze of the user or calculating midpoint coordinates of the gaze of the user on the content.
Venkatesh discloses: track a horizontal and vertical ratio of the gaze of the user (noting the claim language “or”, this limitation is required by the claim language only in the alternative); or calculate midpoint coordinates of the gaze of the user on the content (coordinates of the current midpoint location, [0059]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Shim by calculating midpoint coordinates of the gaze of the user on the content for reasons similar to those for claim 6.
Claims 10, 18, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Shim et al. (US 20250131023) in view of Deleuze et al. (US 20210303797).
Consider claim 10, Shim discloses the one or more processors are further configured to: generate an alert (system alerts, [0078]).
Shim does not specifically mention generating an alert based on the intent data indicating an error in the user interaction with the user interface.
Deleuze discloses generating the alert based on the intent data indicating an error in the user interaction with the user interface (making a correction based on recognizing a user intent to correct errors, [0083], in interactions with a user interface, [0085]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Shim by generating an alert based on the intent data indicating an error in the user interaction with the user interface as in Deleuze in order to clarify ambiguous or erroneous language, as suggested by Deleuze ([0001]), predictably improving the ability to process informal language, as suggested by Deleuze ([0002]). The references cited are analogous art in the same field of natural language processing.
Consider claim 18, Shim discloses the one or more processors are further configured to: generate an alert (system alerts, [0078]).
Shim does not specifically mention generating an alert based on the intent data indicating an error in the user interaction with the user interface.
Deleuze discloses generating the alert based on the intent data indicating an error in the user interaction with the user interface (making a correction based on recognizing a user intent to correct errors, [0083], in interactions with a user interface, [0085]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Shim by generating an alert based on the intent data indicating an error in the user interaction with the user interface as in Deleuze for reasons similar to those for claim 10.
Consider claim 21, Shim discloses the one or more processors are further configured to: generate an alert (system alerts, [0078]).
Shim does not specifically mention generating an alert based on the intent data indicating an error in the user interaction with the user interface.
Deleuze discloses generating the alert based on the intent data indicating an error in the user interaction with the user interface (making a correction based on recognizing a user intent to correct errors, [0083], in interactions with a user interface, [0085]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Shim by generating an alert based on the intent data indicating an error in the user interaction with the user interface as in Deleuze for reasons similar to those for claim 10.
Claims 11 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Shim et al. (US 20250131023) in view of Callegari et al. (US 20240362422).
Consider claim 11, Shim does not, but Callegari discloses one or more processors (processor, [0087]) are further configured to: receive feedback data from the LLM system (assessment report of prompt 30 from trained LLM, [0018], [0022], Fig 2); modify the intent data based on feedback data and to generate modified intent data (revised prompt 69, [0022]); and provide the modified intent data to the LLM system (revised prompt is provided to the LLM, [0022]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Shim such that one or more processors are further configured to: receive feedback data from the LLM system; modify the intent data based on feedback data and to generate modified intent data; and provide the modified intent data to the LLM system in order to improve prompt quality, as suggested by Callegari ([0003]), predictably making the experience less frustrating for users, as suggested by Callegari ([0003]). The references cited are analogous art in the same field of natural language processing.
Consider claim 19, Shim does not, but Callegari discloses the one or more instructions further cause the user device (program executed by a computer processor, [0082-0083]) to: receive feedback data from the LLM system (assessment report of prompt 30 from trained LLM, [0018], [0022], Fig 2); modify the intent data based on feedback data and to generate modified intent data (revised prompt 69, [0022]); and provide the modified intent data to the LLM system (revised prompt is provided to the LLM, [0022]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Shim such that the one or more instructions further cause the user device to: receive feedback data from the LLM system; modify the intent data based on feedback data and to generate modified intent data; and provide the modified intent data to the LLM system, for reasons similar to those for claim 11.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jesse Pullias whose telephone number is 571/270-5135. The examiner can normally be reached on M-F 8:00 AM - 4:30 PM. The examiner’s fax number is 571/270-6135.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Andrew Flanders can be reached on 571/272-7516.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Jesse S Pullias/
Primary Examiner, Art Unit 2655 03/20/26