Last updated: April 19, 2026

Application No. 18/361,397

COLLABORATION USING CONVERSATIONAL ARTIFICIAL INTELLIGENCE DURING VIDEO CONFERENCING

Non-Final OA §103

Filed

Jul 28, 2023

Examiner

NGUYEN, PHUNG HOANG JOSEPH

Art Unit

2691

Tech Center

2600 — Communications

Assignee

Zoom Video Communications, Inc.

OA Round

3 (Non-Final)

Interview Optional

+32.1% interview lift. This examiner has a relatively high allow rate; a written response may suffice.

Based on 877 resolved cases, 2023–2026

Examiner Intelligence

NGUYEN, PHUNG HOANG JOSEPH

Grants 79% — above average

Career Allow Rate

694 granted / 877 resolved

+17.1% vs TC avg

Strong +32% interview lift

Interview lift: +32.1% (resolved cases with interview vs. without)

Typical timeline

2y 9m

Avg Prosecution

32 currently pending

Career history

909

Total Applications

across all art units

Statute-Specific Performance

§101: 5.6% (-34.4% vs TC avg)

§103: 56.8% (+16.8% vs TC avg)

§102: 15.2% (-24.8% vs TC avg)

§112: 8.2% (-31.8% vs TC avg)

Tech Center average is an estimate. Based on career data from 877 resolved cases.

Office Action

§103

DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Allowable Subject Matter
Claims 6, 7, 14 and 15 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim(s) 1, 5, 9, 13 and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Rose in view of Khosla et al. (US 2025/0005058) or Blohm et al. (US 2024/0346255).

Claims 1, 9 and 17: Rose teaches a method, a medium and a system comprising:
joining a first client device and a second client device to a first video conference, the first video conference having a first plurality of participants using a first plurality of client devices, the first client device associated with a first participant, and the second client device associated with a second participant; 
(Rose: Fig. 2… may be implemented for game streaming applications, video conference applications, virtual learning applications, social media content sharing application, video sharing applications, and/or the like…. For example user interface 200 includes an application viewing panel 202, a host streaming window 204A, a host 204B, a plurality of participant streaming windows 206, and a chat box 208. The chat box 208 includes an input field 210 and illustrates a response selection box 212, as well as several user comments 214, 216, and 218, [0038]);

receiving, from the first client device, a first prompt submitted by the first participant, [[the first prompt comprising a free-form, natural language question = X1]]; 
(Rose: based on the determined trigger phrase or action, one or more response options may be determined… For example, a teacher may ask his/her class, “Did you read ‘Macbeth’ over the weekend?”… based on the determined question, the system may generate and display a selection box that includes selection buttons that allow students to only choose “YES” or “NO” answer options in response to the teacher's question as to whether the students read Macbeth over the weekend, [0020]);

relaying the first prompt to a conversational artificial intelligence (AI) [[comprising a large language model (LLM) trained for general-purpose use = X2]];
Rose:  the one or more selected neural networks corresponding to a topic of discussion may be used in conjunction with a trigger phrase to generate a selection box with several potential answers. [0021];  the transcript generator 106 may use natural language processing (NLP) to generate a transcript from the stream data 104. For example, a machine learning model(s), a neural network(s), a NLP algorithm, and/or another type of NLP algorithm may be used to generate a transcript from the stream data 104, [0025].

outputting, to the second client device, a first data structure comprising first information about the first prompt; 

(Rose: based on the determined question, the system may generate and display a selection box that includes selection buttons that allow students to only choose “YES” or “NO” answer options in response to the teacher's question as to whether the students read Macbeth over the weekend, [0020]);

receiving, from the conversational AI, a first response responsive to the first prompt;
Rose: [0021 and 0025]: dialogue between teacher and student via the NLP; and

outputting, to the second client device, a second data structure comprising second information about the first response.  
Rose: Fig. 2, [0020] Based on the determined trigger phrase or action, one or more response options may be determined. A response option may be a set of appropriate and/or relevant responses. Further, one or more responses within a chat feature that do not correspond to the one or more response options may be filtered out. For example, a teacher may ask his/her class, “Did you read ‘Macbeth’ over the weekend?” Based on determining that the teacher asked a question (e.g., a trigger phrase), the system may only accept “YES” or “NO” comments entered into the chat feature. In some embodiments, a graphical element may be generated and populated that corresponds to the one or more response options within the chat or comment feature. For example, based on the determined question, the system may generate and display a selection box that includes selection buttons that allow students to only choose “YES” or “NO” answer options in response to the teacher's question as to whether the students read Macbeth over the weekend. 
Here, Rose clearly at least suggests the first and second data structures described in the current specification at [0117] (a first data structure comprising first information about the first prompt) and [0119] (a second data structure comprising second information about the first response).
Here, the examiner's mapping of Rose is based on the teacher's question, “Did you read ‘Macbeth’ over the weekend?” The student's response is “YES” or “NO”.
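The behavior cited from Rose [0020] (detect a question, restrict the chat to a fixed set of response options, and filter out everything else) can be sketched as follows. This is a minimal illustration under stated assumptions, not Rose's implementation: the function names, the list of question openers, and the string-matching heuristic are all hypothetical, whereas Rose describes trigger phrases and neural networks without giving code.

```python
from typing import List, Optional

# Hypothetical heuristic: words that commonly open a yes/no question.
# Rose [0020] speaks only of a "trigger phrase or action"; this list is an assumption.
YES_NO_OPENERS = ("did", "do", "does", "is", "are", "was", "were",
                  "have", "has", "can", "will")


def response_options(host_utterance: str) -> Optional[List[str]]:
    """Return the allowed answers if the utterance looks like a yes/no question."""
    words = host_utterance.strip().lower().split()
    if words and host_utterance.strip().endswith("?") and words[0] in YES_NO_OPENERS:
        return ["YES", "NO"]
    return None  # no trigger detected, so no restriction applies


def filter_chat(comments: List[str], options: Optional[List[str]]) -> List[str]:
    """Keep only comments that match one of the response options (case-insensitive)."""
    if options is None:
        return list(comments)
    allowed = {option.upper() for option in options}
    return [c for c in comments if c.strip().upper() in allowed]


if __name__ == "__main__":
    opts = response_options("Did you read 'Macbeth' over the weekend?")
    print(opts)
    print(filter_chat(["yes", "No", "maybe later"], opts))
```

Note that the output in this sketch is always drawn from a closed set of template answers, which is the point of contrast with the free-form question (X1) and general-purpose LLM (X2) limitations discussed next.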
Regarding X1 and X2, which the examiner addresses together: while Rose does not explicitly detail X1 and X2, Rose does, at least by suggestion or alternatively by obviousness, teach that “neural networks corresponding to a topic of discussion may be used in conjunction with a trigger phrase to generate a selection box with several potential answers,” [0021], and that “the transcript generator 106 may use natural language processing (NLP) to generate a transcript from the stream data 104. For example, a machine learning model(s), a neural network(s), a NLP algorithm, and/or another type of NLP algorithm may be used to generate a transcript from the stream data 104,” [0025].

To support the obviousness, Khosla teaches, “The LLM component 106 may take the prompt and the user context received from the aggregator component 104 and utilize a generative AI model (e.g., Retrieval Augmented Generation (RAG) utilizing natural language processing (NLP) architecture) to determine an answer to the natural language question. For example, if the customer computing devices 122 sends a natural language question regarding how to setup a type of network-based storage, the LLM component 106 may utilize a trained generative AI model (e.g., trained on QA pairs from a network-based storage service and customer knowledge graphs) to determine an answer (e.g., where the answer provides the instructions and potential API calls to setup the network-based storage) from the prompt received from the aggregator component 104,” [0024].
Blohm provides an interface for asking questions of knowledge bases in a freeform manner, [0045, 0056], for executing complex artificial intelligence applications such as large language models, [0071], where “The retrieved information can be accordingly utilized by the summarization module along with the initialization request to generate an instruction for the large language model. Commonly referred to as a “prompt”, the instruction can be a natural language statement that configures the large language model to execute a specified task. In a basic example, an instruction can command a large language model to “generate a summary of the following document,”” [0009, 0035, 0051].
It would have been obvious to the ordinary artisan before the effective filing date to incorporate the teaching of Khosla or Blohm into the teaching of Rose for the purpose of utilizing one of the best language models in the field of artificial intelligence to ensure a more accurate answer/response, indicating that the content of the natural language output is factual, accurate and suitable for publication.
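For orientation, the sequence of steps recited in claims 1, 9 and 17 (join, receive a prompt, relay it to the conversational AI, output prompt information to the other device, receive the response, output response information) can be sketched as follows. This is an illustration only, not the applicant's or any cited reference's implementation: `Conference`, `PromptInfo`, `ResponseInfo`, and `echo_ai` are hypothetical names, and `echo_ai` is a trivial stand-in, not an LLM.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PromptInfo:
    """Stand-in for the 'first data structure' (information about the prompt)."""
    participant: str
    prompt: str


@dataclass
class ResponseInfo:
    """Stand-in for the 'second data structure' (information about the response)."""
    participant: str
    response: str


class Conference:
    def __init__(self, ai: Callable[[str], str]) -> None:
        self.ai = ai  # hypothetical conversational AI callable
        self.outboxes: Dict[str, List[object]] = {}

    def join(self, participant: str) -> None:
        """Join a participant's client device to the conference."""
        self.outboxes[participant] = []

    def submit_prompt(self, sender: str, prompt: str) -> str:
        # Output prompt information to every other participant's device.
        for name, box in self.outboxes.items():
            if name != sender:
                box.append(PromptInfo(sender, prompt))
        # Relay the prompt and receive the response; in this synchronous
        # sketch the two steps collapse into one call.
        response = self.ai(prompt)
        # Output response information to every other participant's device.
        for name, box in self.outboxes.items():
            if name != sender:
                box.append(ResponseInfo(sender, response))
        return response


def echo_ai(prompt: str) -> str:
    """Trivial stand-in for the conversational AI."""
    return "echo: " + prompt


if __name__ == "__main__":
    conf = Conference(echo_ai)
    conf.join("first participant")
    conf.join("second participant")
    conf.submit_prompt("first participant", "How do I set up storage?")
    print(conf.outboxes["second participant"])
```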

Claims 5 and 13. The method of claim 1 wherein: the first information about the first prompt comprises: information about the first client device; and information about the first participant; and the second information about the first response comprises: information about the first client device; and information about the first participant. (Rose: The method 400, at block B402, includes receiving a data stream associated with an application. For example, a chat management system may receive a data stream (e.g., audio data, video data, metadata, etc.) associated with an application, such as from a host device during an online video conference, video stream, game stream, and/or the like, [0047], wherein the first participant is a teacher).

Claims 18 and 19. The system of claim 17, further comprising receiving a selection of a chat view video layout mode, wherein the rendered layout is configured to display the first prompt, the first response, the second prompt, and the second response as a chat dialogue; and the first information about the first response comprises: information about the first client device; and information about the first participant; the second information about the second prompt comprises: information about the second client device; and information about the second participant; and the third information about the second response comprises: information about the second client device; and information about the second participant. (See the independent claims and Rose, Fig. 2, the layout of a chat box where multiple prompts and responses are exchanged, [0037-0039]).




Claim(s) 2-4, 8, 10-12, 16 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Rose in view of Khosla et al. (US 2025/0005058) or Blohm et al. (US 2024/0346255), and further in view of Kumahara or Schwaber-Cohen.

Per the Written Opinion stating that dependent claims 2-8, 10-16, 18-20 do not contain any features which, in combination with the features of any claim to which they refer, meet the requirements of the PCT in respect of novelty and/or inventive step, see cited passages in Rose, combined when detailing the additional features of dependent claims about transcript and shared knowledge of conversation (transcripts, LLM, multiple users, chat dialogues), with Schwaber-Cohen, in particular Schwaber-Cohen page 2 points 3 and 4.
Indeed, the Rose videoconferencing chat system/method is connected to a "conversational AI", but the use case for replies from that conversational AI is to facilitate the teacher/student chat with template responses, whereas the use case of the present application is that of a ChatGPT-like conversational AI.
The examiner agrees with the Written Opinion and furnishes further detail below.
Claims 2 and 10. The method of claim 1, further comprising: receiving, from the second client device, a second prompt submitted by the second participant; outputting, to the first client device, a third data structure comprising third information about the second prompt; relaying the second prompt to the conversational AI; receiving, from the conversational AI, a second response responsive to the second prompt; and outputting, to the first client device, a fourth data structure comprising fourth information about the second response. (As observed in Rose, the conversation between teachers and students comprises many exchanges, i.e., prompts and responses to prompts; it is obvious that third or fourth data structures follow from the subsequent exchanges, as shown by Kumahara’s Fig. 11 illustrating “example social graph 1100. In particular embodiments, social networking system 1002 may store one or more social graphs 1100 in one or more data stores. In particular embodiments, social graph 1100 may include multiple nodes—which may include multiple user nodes 1102 or multiple concept nodes 1104—and multiple edges 1106 connecting the nodes. Example social graph 1100 illustrated in FIG. 11 is shown, for didactic purposes, in a two-dimensional visual map representation. In particular embodiments, a social networking system 1002, client device 1006, or third-party system 1008 may access social graph 1100 and related social-graph information for suitable applications. The nodes and edges of social graph 1100 may be stored as data objects, for example, in a data store (such as a social-graph database). Such a data store may include one or more searchable or queryable indexes of nodes or edges of social graph 1100.
[0243] In particular embodiments, a user node 1102 may correspond to a user of social networking system 1002. As an example and not by way of limitation, a user may be an individual (human user), an entity (e.g., an enterprise, business, or third-party application), or a group (e.g., of individuals or entities) that interacts or communicates with or over social networking system 1002. In particular embodiments, when a user registers for an account with social networking system 1002, social networking system 1002 may create a user node 1102 corresponding to the user, and store the user node 1102 in one or more data stores. Users and user nodes 1102 described herein may, where appropriate, refer to registered users and user nodes 1102 associated with registered users. In addition or as an alternative, users and user nodes 1102 described herein may, where appropriate, refer to users that have not registered with social networking system 1002. In particular embodiments, a user node 1102 may be associated with information provided by a user or information gathered by various systems, including social networking system 1002. As an example and not by way of limitation, a user may provide his or her name, profile picture, contact information, birth date, sex, marital status, family status, employment, education background, preferences, interests, or other demographic information. Each user node of the social graph may have a corresponding web page (typically known as a profile page). In response to a request including a user name, the social networking system can access a user node corresponding to the user name, and construct a profile page including the name, a profile picture, and other information associated with the user. 
A profile page of a first user may display to a second user all or a portion of the first user's information based on one or more privacy settings by the first user and the relationship between the first user and the second user.” Precisely, between users A-G, there would have been multiple questions asked and multiple responses.)
Similarly, Schwaber-Cohen provides multiple users during a conference call, page 6 of 27.

It would have been obvious to the ordinary artisan before the effective filing date to incorporate the teaching of Kumahara or Schwaber-Cohen into the teaching of Rose for the purpose of learning to more accurately predict and identify event details when provided with one or more electronic communications.

Claim 3. The method of claim 2, wherein the second response responsive to the second prompt is further responsive to first prompt and the first response. (Please see the independent claims and claims 2/10 above for detailed analysis).
Claim 4. The method of claim 2, wherein the first prompt and the second prompt are included in a plurality of prompts, each prompt received from a client device of the first plurality of client devices and having an associated response, further comprising outputting, to the first plurality of client devices, a fifth data structure comprising the plurality of prompts and the response associated with each prompt of the plurality of prompts. (Please see the independent claims and claims 2/10 above for detailed analysis).

Claims 8, 16 and 20. The method of claim 1, wherein the conversational AI is a transformer-based large language model, wherein the transformer-based large language model is a generative pre-trained transformer (GPT) model. (While Rose teaches conversational AI as communication exchanges between teachers and students, it does not teach GPT. DC teaches the feature.)

Claim 11. The non-transitory computer-readable medium of claim 10, wherein the second response responsive to the second prompt is further responsive to first prompt and the first response. (Please see the independent claims and claims 2/10 above for detailed analysis).

Claim 12. The non-transitory computer-readable medium of claim 10, wherein the first prompt and the second prompt are included in a plurality of prompts, each prompt received from a client device of the first plurality of client devices and having an associated response, further comprising outputting, to the first plurality of client devices, a fifth data structure comprising the plurality of prompts and the response associated with each prompt of the plurality of prompts. (Please see the independent claims and claims 2/10 above for detailed analysis).


Inquiry
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PHUNG-HOANG J. NGUYEN whose telephone number is (571)270-1949. The examiner can normally be reached Reg. Sched. 6:00-3:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Duc Nguyen can be reached at 571-272-7503. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/PHUNG-HOANG J NGUYEN/Primary Examiner, Art Unit 2691

Prosecution Timeline

Jul 28, 2023

Application Filed

Jun 20, 2025

Non-Final Rejection — §103

Aug 22, 2025

Response Filed

Nov 25, 2025

Final Rejection — §103

Jan 08, 2026

Applicant Interview (Telephonic)

Jan 08, 2026

Examiner Interview Summary

Jan 27, 2026

Response after Non-Final Action

Feb 02, 2026

Request for Continued Examination

Feb 04, 2026

Response after Non-Final Action

Feb 27, 2026

Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

17/843,731

Patent 12598256

DISRUPTED-SPEECH MANAGEMENT ENGINE FOR A MEETING MANAGEMENT SYSTEM

2y 5m to grant Granted Apr 07, 2026

18/518,577

Patent 12591408

DISPLAY APPARATUS AND METHOD INCORPORATING INTEGRATED SPEAKERS WITH ADJUSTMENTS

2y 5m to grant Granted Mar 31, 2026

17/989,972

Patent 12587612

Method and Device for Invoking Public or Private Interactions during a Multiuser Communication Session

2y 5m to grant Granted Mar 24, 2026

18/256,155

Patent 12587705

LIVESTREAMING AUDIO PROCESSING METHOD AND DEVICE

2y 5m to grant Granted Mar 24, 2026

18/629,549

Patent 12587700

GROUPING IN A SYSTEM WITH MULTIPLE MEDIA PLAYBACK PROTOCOLS

2y 5m to grant Granted Mar 24, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

3-4

Expected OA Rounds

79%

Grant Probability

99%

With Interview (+32.1%)

2y 9m

Median Time to Grant

High

PTA Risk

Based on 877 resolved cases by this examiner. Grant probability derived from career allow rate.
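The 79% figure can be checked directly from the counts reported above (694 granted out of 877 resolved). The short sketch below reproduces only that arithmetic; the function name is hypothetical, and it does not attempt to reproduce how the page derives the with-interview figure, which the source does not explain.

```python
def allow_rate(granted: int, resolved: int) -> float:
    """Career allow rate as a fraction: granted cases over resolved cases."""
    if resolved <= 0:
        raise ValueError("resolved must be positive")
    return granted / resolved


if __name__ == "__main__":
    rate = allow_rate(694, 877)
    print(f"{rate:.1%}")  # rounds to the 79% shown on the page
```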
