DETAILED ACTION
This Office action is in response to the communication filed on August 4, 2025. Claims 21-50 are currently pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on August 4, 2025 has been considered by the examiner.
Claim Objections
Claims 36 and 50 are objected to because of the following informalities:
In claims 36 and 50, the phrase “the text from a transcript of the recording; and the timestamp associated with the recording” should be “the text from a transcript of the recording, and the timestamp associated with the recording”.
Appropriate correction is required.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 21-30, 32-45, and 47-50 are rejected under 35 U.S.C. 103 as being unpatentable over Chi (US Pub 2025/0095690, provisional application filing date 09/14/23) in view of Gopalakrishnan (US Pub 2025/0029603).
With respect to claim 21, Chi discloses a method for prompting a machine learning model to generate answer data based on a recording, the method comprising:
displaying a recording on a display (Chi in [0025]-[0027] and Figures 1A-B discloses graphical user interfaces enabling users to interact, presenting a video in the user interface and enabling access to a set of support videos, including question-answer videos, by hovering over user-selectable chips that correspond to a question; when a user clicks on a chip, the interface pauses the main video and plays a selected support video);
receiving a user interaction (Chi in [0025]-[0027] and Figures 1A-B discloses graphical user interfaces enabling users to interact, presenting a video in the user interface and enabling access to a set of support videos, including question-answer videos, by hovering over user-selectable chips that correspond to a question; when a user clicks on a chip, the interface pauses the main video and plays a selected support video),
based on the user interaction, presenting an input engine configured to receive a prompt (Chi in [0025]-[0027] and Figures 1A-B discloses graphical user interfaces enabling users to interact, presenting a video in the user interface and enabling access to a set of support videos, including question-answer videos, by hovering over user-selectable chips that correspond to a question; when a user clicks on a chip, the interface pauses the main video and plays a selected support video; Chi in [0041] and [0044] discloses, given an input source video, retrieving the video transcript, descriptions, and annotations, annotating the frames with time-codes to identify faces and text, generating prompts to an LLM for information relevant to the video, and using the prompts to generate a set of short-length videos);
receiving the prompt at the input engine (Chi in [0022] and [0037] discloses prompting an LLM to generate additional relevant content, such as summaries, explanations, and/or questions and answers, and processing extracted textual content and a prompt with the LLM to generate an output; Chi in [0049] and [0050] discloses each prompt containing a prefix to provide a clear ask to the LLM, a target input such as a video transcript, and a suffix to specify output format, and that prompts can be combined with video metadata and provided to an LLM or other model);
transmitting the prompt and a first data…to a system having access to a machine learning model, wherein the first data…comprises at least one of: data from a partial timeframe of the recording; text from a transcript of the recording; or a timestamp associated with the recording (Chi in [0022] and [0037] discloses extracting textual content, such as transcripts and metadata, from video content, prompting an LLM to generate additional relevant content, such as summaries, explanations, and/or questions and answers, and processing extracted textual content and a prompt with the LLM to generate an output; Chi in [0049] and [0050] discloses each prompt containing a prefix to provide a clear ask to the LLM, a target input such as a video transcript, and a suffix to specify output format, and that prompts can be combined with video metadata and provided to an LLM or other model; here Chi does not explicitly disclose transmitting the prompt and a first data domain, but the Gopalakrishnan reference discloses the feature, as discussed below);
receiving answer data from the machine learning model (Chi in [0022] and [0037] discloses prompting an LLM to generate additional relevant content, such as summaries, explanations, and/or questions and answers, and processing extracted textual content and a prompt with the LLM to generate an output); and
displaying the answer data in a visual format on the display (Chi in [0022] and [0037] discloses prompting an LLM to generate additional relevant content, such as summaries, explanations, and/or questions and answers, and processing extracted textual content and a prompt with the LLM to generate an output).
Chi discloses transmitting a prompt associated with various data to a system having access to a machine learning model; however, Chi does not explicitly disclose:
transmitting the prompt and a first data domain…;
The Gopalakrishnan reference discloses transmitting a prompt and a first data domain (Gopalakrishnan in [0017], [0018], and [0049] discloses that domain specialty prompt instructions are generated and inserted into prompts to perform tasks, transcripts for different specialties labeled with the corresponding specialty, performing text analysis tasks on different domain specialties or across multiple domains with multiple domain specialties, and a model store used to maintain different fine-tuned models for different text analysis tasks or domains; Gopalakrishnan in [0044] discloses identifying domain general entities in input text and replacing or modifying domain general entities with a domain specialty identifier).
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention and having the teachings of Chi and Gopalakrishnan, to have combined Chi and Gopalakrishnan. The motivation to combine Chi and Gopalakrishnan would be to improve the use of machine learning models to perform domain-specific text analysis tasks by augmenting a data set for tuning a pre-trained large language model using different domain specialties (Gopalakrishnan: [0015] and [0017]).
With respect to claim 22, Chi in view of Gopalakrishnan discloses the method of claim 21, wherein the first data domain comprises the text from the transcript of the recording (Chi in [0018] and [0022] discloses extracting textual content associated with videos, such as a transcript of speech that occurs within the video, textual metadata, and other textual information associated with the video; Gopalakrishnan in [0012] discloses obtaining text generated from audio or video transcripts; Gopalakrishnan in [0025] and [0027] discloses receiving a request to generate a transcript and summary of an audio conversation).
With respect to claim 23, Chi in view of Gopalakrishnan discloses the method of claim 21, wherein the recording comprises media having audio and visual components (Chi in [0039] discloses content including textual and/or visual content, where the content can be audio and/or video content; Gopalakrishnan in [0027] and [0042] discloses receiving an audio file including metadata of a conversation and transcribing text from audio or video sources).
With respect to claim 24, Chi in view of Gopalakrishnan discloses the method of claim 21, wherein the recording is displayed using at least one of a browser or a video hosting site (Chi in [0019] and [0026] discloses presenting videos in an interactive player in a user interface; Chi in [0026] and [0030] and in Figure 1A discloses the user interface providing side panels for support videos and questions and answers, similar to a web document or website).
With respect to claim 25, Chi in view of Gopalakrishnan discloses the method of claim 21, wherein the prompt comprises at least one of audio input or text input (Chi in [0037] discloses a prompt including an instruction to summarize one or more sets of textual content, to explain, to generate pairs of questions and answers, and/or other instructions; Chi in [0049] and [0050] discloses each prompt containing a prefix to provide a clear ask to the LLM, a target input such as a video transcript, and a suffix to specify output format, and that prompts can be combined with video metadata and provided to an LLM or other model).
With respect to claim 26, Chi in view of Gopalakrishnan discloses the method of claim 21, wherein the user interaction is received at a button (Chi in [0025]-[0027] and Figures 1A-B discloses graphical user interfaces enabling users to interact, where a user can hover over user-selectable chips in the interface that correspond to a question, and when the user clicks on a chip, the interface pauses the main video and plays a selected support video; Chi in [0033] discloses that viewers or users can select an information button and the interface can generate a response).
With respect to claim 27, Chi in view of Gopalakrishnan discloses the method of claim 21, further comprising displaying a user interface, wherein the user interaction is received at the user interface (Chi in [0025]-[0027] and Figures 1A-B discloses graphical user interfaces enabling users to interact, where a user can hover over user-selectable chips in the interface that correspond to a question, and when the user clicks on a chip, the interface pauses the main video and plays a selected support video; Chi in [0033] discloses that viewers or users can select an information button and the interface can generate a response).
With respect to claim 28, Chi in view of Gopalakrishnan discloses the method of claim 21, wherein user devices may only access the machine learning model through an application programming interface (API) (Chi in [0084], [0086], and [0087] discloses an application communicating with device components using an API, where the application communicates with a central intelligence layer including a number of machine learning models; Gopalakrishnan in [0022] and [0045] discloses that the interface may be one or more graphical user interfaces that implement application programming interfaces (APIs), and that an API call using inserted domain specialty identifiers in a generated instruction invokes a host system for a pre-trained language model to perform text analysis).
With respect to claim 29, Chi in view of Gopalakrishnan discloses the method of claim 21, wherein the machine learning model is configured for generative artificial intelligence (Chi in [0019] and [0022] discloses processing textual content with a generative sequence processing model, such as a large language model, to generate an output, and prompting an LLM to generate relevant content, such as summaries, explanations, and/or questions and answers; Chi in [0070] and [0087] discloses training machine learning models using various training or learning techniques and a central intelligence layer including a number of machine learning models; Gopalakrishnan in [0001] and [0013] discloses summaries created using a special class of machine learning models, such as generative large language models, that are tuned to follow natural language instructions describing any task; Gopalakrishnan in [0015] and [0017] discloses training machine learning models to accept and apply domain-specific information as part of input to perform text analysis tasks, and tuning a pre-trained large language model using different domain specialties).
With respect to claim 30, Chi in view of Gopalakrishnan discloses the method of claim 21, wherein the machine learning model is a large language model (LLM) (Chi in [0019] and [0022] discloses processing textual content with a generative sequence processing model, such as a large language model, to generate an output, and prompting an LLM to generate relevant content, such as summaries, explanations, and/or questions and answers; Chi in [0070] and [0087] discloses training machine learning models using various training or learning techniques and a central intelligence layer including a number of machine learning models; Gopalakrishnan in [0001] and [0013] discloses summaries created using a special class of machine learning models, such as generative large language models, that are tuned to follow natural language instructions describing any task; Gopalakrishnan in [0015] and [0017] discloses training machine learning models to accept and apply domain-specific information as part of input to perform text analysis tasks, and tuning a pre-trained large language model using different domain specialties).
With respect to claim 32, Chi in view of Gopalakrishnan discloses the method of claim 21, wherein the system has access to at least one second data domain with a data scope differing from the first data domain (Chi in [0031] and [0041] discloses retrieving external materials, such as a URL link to a web tutorial; Chi in [0049] and [0050] discloses each prompt containing a prefix to provide a clear ask to the LLM, a target input such as a video transcript or a URL link, and a suffix to specify output format, and that prompts can be combined with video metadata and provided to an LLM or other model; Gopalakrishnan in [0001] and [0013] discloses summaries created using a special class of machine learning models, such as generative large language models, that are tuned to follow natural language instructions describing any task; Gopalakrishnan in [0018] and [0049] discloses performing text analysis tasks on different domain specialties or across multiple domains with multiple domain specialties, and a model store used to maintain different fine-tuned models for different text analysis tasks or domains; Gopalakrishnan in [0021] and [0026] discloses a provider network implementing a natural language processing service that implements domain specialty instruction generation for performing text analysis tasks, where the provider network may be a private or closed system or may be accessible via the internet, and clients may convey network-based services requests).
With respect to claim 33, Chi in view of Gopalakrishnan discloses the method of claim 32, wherein the at least one second data domain includes information available on the internet (Chi in [0031] and [0041] discloses retrieving external materials, such as a URL link to a web tutorial; Chi in [0049] and [0050] discloses each prompt containing a prefix to provide a clear ask to the LLM, a target input such as a video transcript or a URL link, and a suffix to specify output format, and that prompts can be combined with video metadata and provided to an LLM or other model; Gopalakrishnan in [0001] and [0013] discloses summaries created using a special class of machine learning models, such as generative large language models, that are tuned to follow natural language instructions describing any task; Gopalakrishnan in [0018] and [0049] discloses performing text analysis tasks on different domain specialties or across multiple domains with multiple domain specialties, and a model store used to maintain different fine-tuned models for different text analysis tasks or domains; Gopalakrishnan in [0021] and [0026] discloses a provider network implementing a natural language processing service that implements domain specialty instruction generation for performing text analysis tasks, where the provider network may be a private or closed system or may be accessible via the internet, and clients may convey network-based services requests).
With respect to claim 34, Chi in view of Gopalakrishnan discloses the method of claim 21, wherein the answer data in the visual format is based on generating natural language corresponding to the answer data (Chi in [0019] and [0022] discloses processing textual content with a generative sequence processing model, such as a large language model, to generate an output, and prompting an LLM to generate relevant content, such as summaries, explanations, and/or questions and answers; Chi in [0070] and [0087] discloses training machine learning models using various training or learning techniques and a central intelligence layer including a number of machine learning models; Gopalakrishnan in [0001] and [0013] discloses summaries created using a special class of machine learning models, such as generative large language models, that are tuned to follow natural language instructions describing any task; Gopalakrishnan in [0015] and [0017] discloses training machine learning models to accept and apply domain-specific information as part of input to perform text analysis tasks, and tuning a pre-trained large language model using different domain specialties).
With respect to claim 35, Chi in view of Gopalakrishnan discloses the method of claim 21, wherein the input engine is presented in response to receiving the user interaction (Chi in [0025]-[0027] and Figures 1A-B discloses graphical user interfaces enabling users to interact, where a user can hover over user-selectable chips in the interface that correspond to a question, and when the user clicks on a chip, the interface pauses the main video and plays a selected support video; Chi in [0033] discloses that viewers or users can select an information button and the interface can generate a response).
With respect to claim 36, Chi in view of Gopalakrishnan discloses the method of claim 21, wherein the first data domain comprises the data from a partial timeframe of the recording, the text from a transcript of the recording; and the timestamp associated with the recording (Chi in [0022] and [0037] discloses extracting textual content, such as transcripts and metadata, from video content, prompting an LLM to generate additional relevant content, such as summaries, explanations, and/or questions and answers, and processing extracted textual content and a prompt with the LLM to generate an output; Chi in [0049] and [0050] discloses each prompt containing a prefix to provide a clear ask to the LLM, a target input such as a video transcript, and a suffix to specify output format, and that prompts can be combined with video metadata and provided to an LLM or other model).
With respect to claim 37, Chi discloses a non-transitory computer readable medium including instructions that are executable by one or more processors to perform operations (Chi in [0059] discloses one or more non-transitory computer-readable storage media storing data and instructions executed by a processor to perform operations) comprising:
displaying a recording on a display (Chi in [0025]-[0027] and Figures 1A-B discloses graphical user interfaces enabling users to interact, presenting a video in the user interface and enabling access to a set of support videos, including question-answer videos, by hovering over user-selectable chips that correspond to a question; when a user clicks on a chip, the interface pauses the main video and plays a selected support video);
receiving a user interaction (Chi in [0025]-[0027] and Figures 1A-B discloses graphical user interfaces enabling users to interact, presenting a video in the user interface and enabling access to a set of support videos, including question-answer videos, by hovering over user-selectable chips that correspond to a question; when a user clicks on a chip, the interface pauses the main video and plays a selected support video),
based on the user interaction, presenting an input engine configured to receive a prompt (Chi in [0025]-[0027] and Figures 1A-B discloses graphical user interfaces enabling users to interact, presenting a video in the user interface and enabling access to a set of support videos, including question-answer videos, by hovering over user-selectable chips that correspond to a question; when a user clicks on a chip, the interface pauses the main video and plays a selected support video; Chi in [0041] and [0044] discloses, given an input source video, retrieving the video transcript, descriptions, and annotations, annotating the frames with time-codes to identify faces and text, generating prompts to an LLM for information relevant to the video, and using the prompts to generate a set of short-length videos);
receiving the prompt at the input engine (Chi in [0022] and [0037] discloses prompting an LLM to generate additional relevant content, such as summaries, explanations, and/or questions and answers, and processing extracted textual content and a prompt with the LLM to generate an output; Chi in [0049] and [0050] discloses each prompt containing a prefix to provide a clear ask to the LLM, a target input such as a video transcript, and a suffix to specify output format, and that prompts can be combined with video metadata and provided to an LLM or other model);
transmitting the prompt and a first data…to a system having access to a machine learning model, wherein the first data…comprises at least one of: data from a partial timeframe of the recording; text from a transcript of the recording; or a timestamp associated with the recording (Chi in [0022] and [0037] discloses extracting textual content, such as transcripts and metadata, from video content, prompting an LLM to generate additional relevant content, such as summaries, explanations, and/or questions and answers, and processing extracted textual content and a prompt with the LLM to generate an output; Chi in [0049] and [0050] discloses each prompt containing a prefix to provide a clear ask to the LLM, a target input such as a video transcript, and a suffix to specify output format, and that prompts can be combined with video metadata and provided to an LLM or other model; here Chi does not explicitly disclose transmitting the prompt and a first data domain, but the Gopalakrishnan reference discloses the feature, as discussed below);
receiving answer data from the machine learning model (Chi in [0022] and [0037] discloses prompting an LLM to generate additional relevant content, such as summaries, explanations, and/or questions and answers, and processing extracted textual content and a prompt with the LLM to generate an output); and
displaying the answer data in a visual format on the display (Chi in [0022] and [0037] discloses prompting an LLM to generate additional relevant content, such as summaries, explanations, and/or questions and answers, and processing extracted textual content and a prompt with the LLM to generate an output).
Chi discloses transmitting a prompt associated with various data to a system having access to a machine learning model; however, Chi does not explicitly disclose:
transmitting the prompt and a first data domain…;
The Gopalakrishnan reference discloses transmitting a prompt and a first data domain (Gopalakrishnan in [0017], [0018], and [0049] discloses that domain specialty prompt instructions are generated and inserted into prompts to perform tasks, transcripts for different specialties labeled with the corresponding specialty, performing text analysis tasks on different domain specialties or across multiple domains with multiple domain specialties, and a model store used to maintain different fine-tuned models for different text analysis tasks or domains; Gopalakrishnan in [0044] discloses identifying domain general entities in input text and replacing or modifying domain general entities with a domain specialty identifier).
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention and having the teachings of Chi and Gopalakrishnan, to have combined Chi and Gopalakrishnan. The motivation to combine Chi and Gopalakrishnan would be to improve the use of machine learning models to perform domain-specific text analysis tasks by augmenting a data set for tuning a pre-trained large language model using different domain specialties (Gopalakrishnan: [0015] and [0017]).
With respect to claim 38, Chi in view of Gopalakrishnan discloses the non-transitory computer readable medium of claim 37, wherein the first data domain comprises the text from the transcript of the recording (Chi in [0040] discloses associating support videos with one or more timestamps of the source video, where the one or more timestamps correspond to one or more sets of textual content; Chi in [0043] discloses acquiring timecoded sentences, each with a start and end time mapped to the source video; Chi in [0044] discloses analyzing video frames to annotate relevant information, where the annotations can be time-coded; Gopalakrishnan in [0012], [0017], and [0018] discloses generating text from audio or video transcripts and labeling it with a domain specialty, and inserting domain specialty information into prompts to perform domain text analysis tasks; Gopalakrishnan in [0044] discloses identifying domain general entities in input text and replacing or modifying domain general entities with a domain specialty identifier).
With respect to claim 39, Chi in view of Gopalakrishnan discloses the non-transitory computer readable medium of claim 37, wherein the recording comprises media having audio and visual components (Chi in [0039] discloses content including textual and/or visual content, where the content can be audio and/or video content; Gopalakrishnan in [0027] and [0042] discloses receiving an audio file including metadata of a conversation and transcribing text from audio or video sources).
With respect to claim 40, Chi in view of Gopalakrishnan discloses the non-transitory computer readable medium of claim 37, wherein the recording is displayed using at least one of a browser or a video hosting site (Chi in [0019] and [0026] discloses presenting videos in an interactive player in a user interface; Chi in [0026] and [0030] and in Figure 1A discloses the user interface providing side panels for support videos and questions and answers, similar to a web document or website).
With respect to claim 41, Chi in view of Gopalakrishnan discloses the non-transitory computer readable medium of claim 37, wherein the prompt comprises at least one of audio input or text input (Chi in [0037] discloses a prompt including an instruction to summarize one or more sets of textual content, to explain, to generate pairs of questions and answers, and/or other instructions; Chi in [0049] and [0050] discloses each prompt containing a prefix to provide a clear ask to the LLM, a target input such as a video transcript, and a suffix to specify output format, and that prompts can be combined with video metadata and provided to an LLM or other model).
With respect to claim 42, Chi in view of Gopalakrishnan discloses the non-transitory computer readable medium of claim 37, wherein the user interaction is received at a button (Chi in [0025]-[0027] and Figures 1A-B discloses graphical user interfaces enabling users to interact, where a user can hover over user-selectable chips in the interface that correspond to a question, and when the user clicks on a chip, the interface pauses the main video and plays a selected support video; Chi in [0033] discloses that viewers or users can select an information button and the interface can generate a response).
With respect to claim 43, Chi in view of Gopalakrishnan discloses the non-transitory computer readable medium of claim 37, the operations further comprising displaying a user interface, wherein the user interaction is received at the user interface (Chi in [0025]-[0027] and Figures 1A-B discloses graphical user interfaces enabling users to interact, where a user can hover over user-selectable chips in the interface that correspond to a question, and when the user clicks on a chip, the interface pauses the main video and plays a selected support video; Chi in [0033] discloses that viewers or users can select an information button and the interface can generate a response).
With respect to claim 44, Chi in view of Gopalakrishnan discloses the non-transitory computer readable medium of claim 37, wherein the machine learning model is configured for generative artificial intelligence (Chi in [0019] and [0022] discloses processing textual content with a generative sequence processing model, such as a large language model, to generate an output, and prompting an LLM to generate relevant content, such as summaries, explanations, and/or questions and answers; Chi in [0070] and [0087] discloses training machine learning models using various training or learning techniques and a central intelligence layer including a number of machine learning models; Gopalakrishnan in [0001] and [0013] discloses summaries created using a special class of machine learning models, such as generative large language models, that are tuned to follow natural language instructions describing any task; Gopalakrishnan in [0015] and [0017] discloses training machine learning models to accept and apply domain-specific information as part of input to perform text analysis tasks, and tuning a pre-trained large language model using different domain specialties).
With respect to claim 45, Chi in view of Gopalakrishnan discloses the non-transitory computer readable medium of claim 37, wherein the machine learning model is a large language model (LLM) (Chi in [0019] and [0022] discloses processing textual content with a generative sequence processing model, such as a large language model, to generate an output, and prompting an LLM to generate relevant content, such as summaries, explanations, and/or questions and answers; Chi in [0070] and [0087] discloses training machine learning models using various training or learning techniques and a central intelligence layer including a number of machine learning models; Gopalakrishnan in [0001] and [0013] discloses summaries created using a special class of machine learning models, such as generative large language models, that are tuned to follow natural language instructions describing any task; Gopalakrishnan in [0015] and [0017] discloses training machine learning models to accept and apply domain-specific information as part of input to perform text analysis tasks, and tuning a pre-trained large language model using different domain specialties).
With respect to claim 47, Chi in view of Gopalakrishnan discloses the non-transitory computer readable medium of claim 37, wherein the system has access to at least one second data domain with a data scope differing from the first data domain (Chi in [0031] and [0041] discloses retrieving external materials, such as a URL link to a web tutorial; Chi in [0049] and [0050] discloses each prompt containing a prefix to provide a clear ask to the LLM, a target input such as a video transcript or a URL link, and a suffix to specify output format, and that prompts can be combined with video metadata and provided to an LLM or other model; Gopalakrishnan in [0001] and [0013] discloses summaries created using a special class of machine learning models, such as generative large language models, that are tuned to follow natural language instructions describing any task; Gopalakrishnan in [0018] and [0049] discloses performing text analysis tasks on different domain specialties or across multiple domains with multiple domain specialties, and a model store used to maintain different fine-tuned models for different text analysis tasks or domains; Gopalakrishnan in [0021] and [0026] discloses a provider network implementing a natural language processing service that implements domain specialty instruction generation for performing text analysis tasks, where the provider network may be a private or closed system or may be accessible via the internet, and clients may convey network-based services requests).
With respect to claim 48, Chi in view of Gopalakrishnan discloses the non-transitory computer readable medium of claim 47, wherein the at least one second data domain includes information available on the internet (Chi in [0031] and [0041] discloses retrieving external materials, such as a URL link to a web tutorial; Chi in [0049] and [0050] discloses each prompt containing a prefix to provide a clear ask to the LLM, a target input such as a video transcript or a URL link, and a suffix to specify output format, and that prompts can be combined with video metadata and provided to an LLM or other model; Gopalakrishnan in [0018] and [0049] discloses performing text analysis tasks on different domain specialties or across multiple domains with multiple domain specialties, and a model store used to maintain different fine-tuned models for different text analysis tasks or domains; Gopalakrishnan in [0021] and [0026] discloses a provider network implementing a natural language processing service that implements domain specialty instruction generation for performing text analysis tasks, where the provider network may be a private or closed system or may be accessible via the internet, and clients may convey network-based services requests).
With respect to claim 49, Chi in view of Gopalakrishnan discloses the non-transitory computer readable medium of claim 37, wherein the input engine is presented in response to receiving the user interaction (Chi in [0025]-[0027] and Figures 1A-B discloses graphical user interfaces enabling users to interact, where a user can hover over user-selectable chips in the interface that correspond to a question, and when the user clicks on a chip, the interface pauses the main video and plays a selected support video; Chi in [0033] discloses that viewers or users can select an information button and the interface can generate a response).
With respect to claim 50, Chi in view of Gopalakrishnan discloses the non-transitory computer readable medium of claim 37, wherein the first data domain comprises the data from a partial timeframe of the recording, the text from a transcript of the recording; and the timestamp associated with the recording (Chi in [0040] discloses associating support videos with one or more timestamps of the source video, where the one or more timestamps correspond to one or more sets of textual content; Chi in [0043] discloses acquiring timecoded sentences, each with a start and end time mapped to the source video; Chi in [0044] discloses analyzing video frames to annotate relevant information, where the annotations can be time-coded; Gopalakrishnan in [0012], [0017], and [0018] discloses generating text from audio or video transcripts and labeling it with a domain specialty, and inserting domain specialty information into prompts to perform domain text analysis tasks).
Claims 31 and 46 are rejected under 35 U.S.C. 103 as being unpatentable over Chi (US Pub 2025/0095690, provisional application filing date 09/14/23) in view of Gopalakrishnan (US Pub 2025/0029603), and further in view of Arat (US Pub 2025/0126329, provisional application filing date 10/15/23).
With respect to claim 31, Chi in view of Gopalakrishnan discloses the method of claim 21; however, Chi and Gopalakrishnan do not explicitly disclose:
further comprising displaying a feedback interface for user feedback to transmit to the system or another system.
The Arat reference discloses displaying a feedback interface for user feedback to transmit to the system or another system (Arat in [0013] and [0255] discloses posing a question while playing interactive video content and receiving a response within a graphical user interface, collecting feedback from viewers on the quality and/or relevance of responses provided while watching an interactive video, and using the feedback as another metric to score and/or rank candidate responses in connection with determining a response to a question posed by a viewer during playback of the interactive video; Arat in [0120] and [0249] discloses generating responses to questions posed by viewers, asking a viewer to rate the quality of an answer provided to a posed question, and gathering and using viewer feedback on responses to score and/or rank responses).
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention and having the teachings of Chi, Gopalakrishnan, and Arat, to have combined Chi, Gopalakrishnan, and Arat. The motivation to combine Chi, Gopalakrishnan, and Arat would be to score and/or rank responses to questions posed by viewers by asking the viewers to rate the quality of answers provided (Arat: [0120]).
With respect to claim 46, Chi in view of Gopalakrishnan discloses the non-transitory computer readable medium of claim 37; however, Chi and Gopalakrishnan do not explicitly disclose:
the operations further comprising displaying a feedback interface for user feedback to transmit to the system or another system.
The Arat reference discloses displaying a feedback interface for user feedback to transmit to the system or another system (Arat in [0013] and [0255] discloses posing a question while playing interactive video content and receiving a response within a graphical user interface, collecting feedback from viewers on the quality and/or relevance of responses provided while watching an interactive video, and using the feedback as another metric to score and/or rank candidate responses in connection with determining a response to a question posed by a viewer during playback of the interactive video; Arat in [0120] and [0249] discloses generating responses to questions posed by viewers, asking a viewer to rate the quality of an answer provided to a posed question, and gathering and using viewer feedback on responses to score and/or rank responses).
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention and having the teachings of Chi, Gopalakrishnan, and Arat, to have combined Chi, Gopalakrishnan, and Arat. The motivation to combine Chi, Gopalakrishnan, and Arat would be to score and/or rank responses to questions posed by viewers by asking the viewers to rate the quality of answers provided (Arat: [0120]).
Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to REZWANUL MAHMOOD whose telephone number is (571)272-5625. The examiner can normally be reached M-F 9-5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann J. Lo can be reached at 571-272-9767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/R.M/Examiner, Art Unit 2159 /ANN J LO/Supervisory Patent Examiner, Art Unit 2159