Prosecution Insights
Last updated: April 19, 2026
Application No. 18/160,066

Identifying A Speaking Conference Participant In A Physical Space

Status: Non-Final OA (§103)
Filed: Jan 26, 2023
Examiner: MOHAMMED, ASSAD
Art Unit: 2691
Tech Center: 2600 — Communications
Assignee: Zoom Video Communications, Inc.
OA Round: 5 (Non-Final)
Grant Probability: 73% (Favorable)
Expected OA Rounds: 5-6
Time to Grant: 3y 0m
Grant Probability With Interview: 84%

Examiner Intelligence

Career Allow Rate: 73% (430 granted / 587 resolved; above average, +11.3% vs TC avg)
Interview Lift: +11.1% (moderate), comparing allow rates with vs. without an interview among resolved cases
Typical Timeline: 3y 0m average prosecution; 24 applications currently pending
Career History: 611 total applications across all art units
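
The headline figures above are simple ratios over the examiner's resolved cases. Below is a minimal sketch of that arithmetic; the record fields (granted, had_interview) are hypothetical stand-ins for whatever schema the tool actually uses.

```python
from dataclasses import dataclass

@dataclass
class ResolvedCase:
    granted: bool        # hypothetical field: did the case end in a grant?
    had_interview: bool  # hypothetical field: was an examiner interview held?

def allow_rate(cases: list) -> float:
    """Share of resolved cases that ended in a grant."""
    return sum(c.granted for c in cases) / len(cases)

def interview_lift(cases: list) -> float:
    """Allow-rate difference between cases with and without an interview."""
    with_iv = [c for c in cases if c.had_interview]
    without_iv = [c for c in cases if not c.had_interview]
    return allow_rate(with_iv) - allow_rate(without_iv)

# 430 grants over 587 resolved cases gives allow_rate() ~= 0.733, the 73%
# career allow rate shown above; a lift of ~0.111 yields the 84%
# with-interview figure (73% + 11.1%).
```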

Statute-Specific Performance

§101: 7.3% (-32.7% vs TC avg)
§103: 67.5% (+27.5% vs TC avg)
§102: 7.8% (-32.2% vs TC avg)
§112: 9.5% (-30.5% vs TC avg)
Deltas are measured against a Tech Center average estimate (the black line in the original chart). Based on career data from 587 resolved cases.
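
Each figure reduces to the same arithmetic: the examiner's per-statute rate minus the Tech Center average estimate. The displayed deltas are all consistent with a 40% TC average for every statute (e.g., 67.5% - 40% = +27.5%). A sketch reproducing the figures, assuming that reading:

```python
# Reproduces the statute-specific figures above, assuming each percentage is
# this examiner's rejection rate for that statute and the TC average estimate
# is a flat 40%, which is what the displayed deltas imply (an inference from
# the chart, not official data).
examiner_rates = {"§101": 0.073, "§103": 0.675, "§102": 0.078, "§112": 0.095}
TC_AVERAGE = 0.40

for statute, rate in examiner_rates.items():
    print(f"{statute}: {rate:.1%} ({rate - TC_AVERAGE:+.1%} vs TC avg)")
```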

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

1. A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 11/18/2025 has been entered.

Claim Rejections - 35 USC § 103

2. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

3. Claims 1, 12, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Fukasawa (US 2020/0295959) in view of Kumar et al. (US 2019/0215464), further in view of Nagpal et al. (US 2024/0045574), and further in view of Hogan et al. (US 5,483,587).

Regarding claim 1, Fukasawa teaches a method comprising: determining, by a conferencing system used for a video conference, identities of multiple conference participants by comparing, for each conference participant of the multiple conference participants, sensor data captured during the video conference with identifying information captured upon the conference participant checking in for the video conference, by a card reader of a physical space and one or more of a camera or a microphone of the physical space (see figs. 11 and 22, ¶ 0122-0123: the authentication system at the physical location has a card reader (225) and a camera; to authenticate a participant, the system matches the participant's IC card, read by the reader, and the participant's face image, captured by the camera, against the user in the database. As shown in fig. 22, the system displays a participants list that includes a conference room containing participants who are part of the conference, which could include multiple participants at one physical location.).

Fukasawa thus teaches authenticating multiple users, who could be in one site room, via IC card and facial image capture at the conferencing station. Fukasawa is vague regarding: outputting, by the conferencing system, first data to cause a client application to present, within a portion of a graphical user interface of the video conference, a video depicting the multiple conference participants within the physical space; outputting second data to cause the client application to present, within the video, text representations of the identities of the multiple conference participants; outputting third data to cause the client application to highlight, within the video and while the text representations remain presented within the video, the text representation of the identity of a first conference participant based on speech audio of that participant, wherein the highlighting visually distinguishes that text representation from the text representations of the identities of the other conference participants within the video; and purging, by the conferencing system, after the video conference, the identifying information captured for each conference participant upon checking in for the video conference.

Kumar teaches outputting the first data (see figs. 1C, 2B-2C, 6C, Abstract, ¶ 0022, 0036-0037, 0048, 0053-0054, 0057: the conferencing session is able to identify and/or recognize one or more voices and/or faces of the participants participating from the common office location, and the conferencing system presents, on a portion of the UI, the participants in a physical space in the conferencing session) and the second and third data (see figs. 2B-2C, 5C, 6C, ¶ 0053-0054, 0057, 0059, 0067-0070, 0074: the conferencing system determines the identity of each participant in the conferencing session, and the identities are displayed on the screen. A participant's voice/speech may be tagged with his or her name to form a voice profile; such voice profiles may be provided to a voice recognizer and used to recognize the identity of the participant who is speaking, i.e., the active speaker. The client device can label each detected face in the rendered room video stream with the name of the person to whom the face belongs; in the user interface of fig. 6C, the detected faces are labeled "Rebecca", "Peter", "Wendy" and "Sandy". Based on the identity of the active speaker provided by the data processor, the client device can further label the active speaker: in fig. 6C, a rectangle indicates that Rebecca is the active speaker, and the rectangle around the name can be considered the highlight.).

The combination of Kumar with Fukasawa adds authentication of multiple participants in a conferencing session and indication of the active speaker by highlighting the name (e.g., placing a box around the active speaker's name). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Fukasawa to incorporate authenticating participants, displaying their names during the conference, and indicating the active speaker by highlighting the name. The modification enables the system to display and highlight the speaker's name on the display during the conference in the physical room.

Nagpal discloses that when a group participant at a physical location indicates that he or she plans on speaking, thereby identifying himself or herself as an active speaker of the virtual meeting, the virtual meeting management system may highlight the participant identifier (e.g., the participant name) of the group participant in the participant list of the virtual meeting to indicate that the group participant is an active speaker (see fig. 6, ¶ 0091). The combination of Nagpal with Kumar and Fukasawa provides highlighting the participant's name when that participant is the active speaker. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Kumar and Fukasawa to incorporate this highlighting, so that attendees are able to distinguish who is speaking among the plurality of participants.

Hogan teaches purging, by the conferencing system, after the video conference, the identifying information captured for each conference participant upon checking in for the video conference (see fig. 32, col. 24, lines 28-54: upon termination of the conferencing session, the system purges (deletes) the conferencing information from the database and the participant information from the conference participant database). Hogan and Fukasawa both have a database storing participant information; Fukasawa's check-in comparison can be combined with Hogan's deletion of participant data, as well as the conferencing session data, from the databases upon exit. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Kumar, Fukasawa, and Nagpal to incorporate purging the conferencing data, including participant data, from the database after terminating the session. The modification provides for deleting data from the participant and conference databases upon termination of the conferencing session.
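
To make the claim 1 mapping concrete, here is a minimal sketch of the claimed flow as the rejection characterizes it: identities resolved by comparing in-conference sensor data against check-in records, name labels presented for every participant, the active speaker's label visually distinguished, and the check-in data purged afterward. Every name here (Participant, identify, render_labels, purge) is hypothetical; this is an illustration, not code from the application or any cited reference.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

@dataclass
class Participant:
    name: str
    face_ref: bytes   # face image captured at check-in, alongside the card read
    voice_ref: bytes  # voice sample captured at check-in

def identify(face: bytes, voice: bytes, roster: Sequence[Participant],
             matches: Callable[[bytes, bytes], bool]) -> Optional[str]:
    """Compare sensor data captured during the conference with check-in data."""
    for p in roster:
        if matches(face, p.face_ref) or matches(voice, p.voice_ref):
            return p.name
    return None

def render_labels(roster: Sequence[Participant], active_speaker: Optional[str]) -> None:
    """Present every participant's name label; distinguish the active speaker's."""
    for p in roster:
        marker = ">>" if p.name == active_speaker else "  "  # ">>" stands in for the highlight
        print(f"{marker} {p.name}")

def purge(roster: list) -> None:
    """After the conference, delete the identifying information captured at check-in."""
    roster.clear()
```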
Regarding claim 12, Fukasawa teaches an apparatus comprising a memory and a processor configured to execute instructions stored in the memory to determine identities of multiple conference participants by comparing, for each conference participant, sensor data captured during the video conference with identifying information captured upon the conference participant checking in for the video conference, by a card reader of a physical space and one or more of a camera or a microphone of the physical space (see figs. 11 and 22, ¶ 0122-0123, as applied to claim 1). Fukasawa is vague regarding the remaining limitations, which recite in apparatus form the first-, second-, and third-data outputs and the post-conference purging of claim 1. Kumar teaches the output limitations (see figs. 1C, 2B-2C, 5C, 6C, Abstract, ¶ 0022, 0036-0037, 0048, 0053-0054, 0057, 0059, 0067-0070, 0074, as applied to claim 1), Nagpal teaches highlighting the participant identifier of the active speaker (see fig. 6, ¶ 0091), and Hogan teaches purging the identifying information after the conference (see fig. 32, col. 24, lines 28-54). The rejection rationale and the motivations to combine are the same as those set forth above for claim 1.

Regarding claim 17, Fukasawa teaches a non-transitory computer-readable medium storing instructions operable to cause one or more processors to perform operations comprising determining, by a conferencing system used for a video conference, identities of multiple conference participants by comparing, for each conference participant, sensor data captured during the video conference with identifying information captured at check-in by a card reader of a physical space and one or more of a camera or a microphone of the physical space (see figs. 11 and 22, ¶ 0122-0123, as applied to claim 1). Fukasawa is vague regarding the remaining limitations, which parallel claim 1 and additionally recite determining, by the conferencing system, an identity of each conference participant of the multiple conference participants. Kumar teaches that determination together with the first-, second-, and third-data outputs (see the figures and paragraphs cited for claim 1), Nagpal teaches highlighting the active speaker's participant identifier (see fig. 6, ¶ 0091), and Hogan teaches the post-conference purging (see fig. 32, col. 24, lines 28-54), for the same reasons and with the same motivations to combine given above for claim 1.

4. Claims 2, 3, 6, 7, 9, 11, 14, 15, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Fukasawa (US 2020/0295959) in view of Kumar et al. (US 2019/0215464), further in view of Nagpal et al. (US 2024/0045574), and further in view of Hogan et al. (US 5,483,587).

Regarding claim 2, Fukasawa, Nagpal, and Hogan do not teach the method of claim 1, wherein at least some of the identities of the multiple conference participants are determined using voice recognition. Kumar teaches this limitation (see fig. 3A, ¶ 0058-0059: the system, having a voice activity detector, is able to identify a participant based on voice recognition). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Fukasawa, Nagpal, and Hogan to incorporate identifying and/or recognizing one or more voices and/or faces of the participants participating from the common office location, providing for recognition of a participant's voice during the conferencing session.

Regarding claim 3, Fukasawa, Nagpal, and Hogan do not teach the method of claim 1, wherein at least some of the identities of the multiple conference participants are determined using facial recognition. Kumar teaches this limitation (see ¶ 0053: a face recognizer is able to determine the participants in the room and identify each participant's name). The motivation to combine is the same as for claim 2.

Regarding claim 6, Fukasawa, Nagpal, and Hogan do not teach the method of claim 1, wherein the text representations of the identities of the multiple conference participants include names of the multiple conference participants. Kumar teaches this limitation (see figs. 2B-2C, 5C, 6C, ¶ 0053-0054, 0057, 0067-0070, 0074: the conferencing system determines the identity of each participant in the conferencing session, and the identities are displayed on the screen). It would have been obvious to modify Fukasawa, Nagpal, and Hogan to incorporate identifying each participant in the conferencing session.

Regarding claim 7, Fukasawa, Nagpal, and Hogan do not teach the method of claim 1, wherein the portion of the graphical user interface is a tile associated with the physical space. Kumar teaches this limitation (see fig. 1C, ¶ 0048: the conferencing system presents, on a portion of the UI, the participants in a physical space; the images on the screen are tiles, including the captured physical room space of the conferencing session). It would have been obvious to modify Fukasawa, Nagpal, and Hogan accordingly.

Regarding claim 9, Fukasawa, Nagpal, and Hogan do not teach the method of claim 1, wherein the text representations of the identities of the multiple conference participants include images of the multiple conference participants. Kumar teaches representations of the identities that are images of the participants (see figs. 2B-2C, 5C, 6C, ¶ 0053-0054, 0057, 0067-0070, 0074, as cited for claim 6). It would have been obvious to modify Fukasawa, Nagpal, and Hogan to incorporate identities of multiple participants in the conferencing session.

Regarding claim 11, Fukasawa, Nagpal, and Hogan do not teach the method of claim 1, wherein the representations of the identities of the multiple conference participants are aliases of the multiple conference participants. Kumar teaches this limitation (see figs. 2B-2C, 5C, ¶ 0053-0054, 0057, 0067-0070: the conferencing system determines the identity of each participant, and the identities are displayed on the screen). It would have been obvious to modify Fukasawa, Nagpal, and Hogan so that the identity of each participant is displayed on the screen.

Regarding claim 14, Fukasawa, Nagpal, and Hogan do not teach the apparatus of claim 12, wherein the processor is further configured to execute the instructions to perform one or both of voice recognition or facial recognition to determine the identity of at least one conference participant of the multiple participants. Kumar teaches this limitation (see fig. 3A, ¶ 0053, 0058-0059: the system, having a voice activity detector, is able to identify a participant based on voice recognition, and a face recognizer is able to determine the participants in the room and identify each participant's name). The motivation to combine is the same as for claim 2.

Regarding claim 15, Fukasawa teaches the apparatus of claim 12, wherein the processor is further configured to execute the instructions to combine sensor data from at least two sensors to determine the identity of a conference participant, wherein the sensors include the card reader and at least one of the camera or the microphone (see figs. 11 and 22, ¶ 0122-0123, as applied to claim 1: the system matches the participant's IC card read by the reader and the face image captured by the camera against the user in the database).

Regarding claim 20, Fukasawa, Nagpal, and Hogan do not teach the non-transitory computer-readable medium of claim 17, the operations comprising comparing captured audio data to sample audio data to determine the identity of a conference participant of the multiple conference participants. Kumar teaches this limitation (see fig. 3A, ¶ 0058-0059: the system, having a voice activity detector, is able to identify a participant based on voice recognition). The motivation to combine is the same as for claim 2.

5. Claims 4 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Fukasawa (US 2020/0295959) in view of Kumar et al. (US 2019/0215464), further in view of Nagpal et al. (US 2024/0045574), further in view of Hogan et al. (US 5,483,587), and further in view of Krupka et al. (US 2019/0341055).

Regarding claim 4, Fukasawa, Kumar, Nagpal, and Hogan do not teach the method of claim 1, further comprising verifying, based on a difference in audio level captured by two or more microphones, that the speech audio corresponds to the first conference participant. Krupka teaches this limitation (see figs. 1, 1C, 2, ¶ 0021-0023, 0025: the system is able to detect a participant among multiple participants to identify and authenticate the speaker, and has multiple microphones to obtain audio signals from the participants). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Fukasawa, Kumar, Nagpal, and Hogan to incorporate identifying and/or recognizing one or more voices of participants participating from the common office location, providing for recognition of a participant's voice during the conferencing session.

Regarding claim 13, Fukasawa, Kumar, Nagpal, and Hogan do not teach the apparatus of claim 12, wherein the processor is further configured to execute the instructions to verify, based on a difference in audio level captured by two or more microphones, that the speech audio corresponds to the first conference participant. Krupka teaches this limitation (same citations as for claim 4), and the same motivation to combine applies.
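
Claims 4 and 13 add a verification step based on the difference in audio level captured by two or more microphones. The sketch below shows one way such a check could work, assuming a plain RMS-level comparison with an arbitrary 3 dB margin; function names are hypothetical, and a real system would more likely use calibrated arrays or time-difference-of-arrival than raw levels.

```python
import numpy as np

def level_db(frame: np.ndarray) -> float:
    """RMS level of one audio frame, in decibels."""
    return 20.0 * float(np.log10(np.sqrt(np.mean(frame ** 2)) + 1e-12))

def verify_speaker(frames_by_mic: dict, expected_mic: str,
                   margin_db: float = 3.0) -> bool:
    """True if the mic nearest the expected speaker is louder than every
    other mic by at least margin_db, i.e., the speech plausibly came from
    that participant."""
    levels = {mic: level_db(f) for mic, f in frames_by_mic.items()}
    others = [lvl for mic, lvl in levels.items() if mic != expected_mic]
    return levels[expected_mic] >= max(others) + margin_db
```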
6. Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Fukasawa (US 2020/0295959) in view of Kumar et al. (US 2019/0215464), further in view of Nagpal et al. (US 2024/0045574), further in view of Hogan et al. (US 5,483,587), and further in view of Wu et al. (US 2022/0303314).

Regarding claim 5, Fukasawa, Kumar, Nagpal, and Hogan do not teach the method of claim 1, further comprising obtaining a portion of the identifying information for each conference participant using a fingerprint reader of the physical space. Wu teaches this limitation (see ¶ 0020: participants of the conference are identified by a fingerprint reader). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Fukasawa, Kumar, Nagpal, and Hogan to incorporate identifying participants via a fingerprint reader for the conferencing session.

7. Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Fukasawa (US 2020/0295959) in view of Kumar et al. (US 2019/0215464), further in view of Nagpal et al. (US 2024/0045574), further in view of Hogan et al. (US 5,483,587), and further in view of Geng et al. (US 10,771,694).

Regarding claim 10, Fukasawa, Kumar, Nagpal, and Hogan do not teach the method of claim 1, further comprising passing the identities of the multiple conference participants to an audio-to-text generation service to distinguish between names and other words. Geng teaches this limitation (see figs. 5 and 8, col. 6, line 24 - col. 7, line 4, col. 14, lines 37-56: the conferencing session presents the participants on the screen with a name associated with each participant; when a participant is speaking, the system highlights the speaker on the screen; in addition, the conference image has subtitles, so that people with hearing problems can also participate in the conference normally). The combination of Geng with Fukasawa, Kumar, Nagpal, and Hogan provides highlighting the speaking participant, with a name associated with the highlighted speaker so that viewers know who is speaking at the time. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to make this modification, so that attendees are able to distinguish who is speaking among the plurality of participants.

8. Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Fukasawa (US 2020/0295959) in view of Kumar et al. (US 2019/0215464), further in view of Nagpal et al. (US 2024/0045574), further in view of Hogan et al. (US 5,483,587), and in view of Tadge (US 2022/0377177).

Regarding claim 16, Fukasawa, Kumar, Nagpal, and Hogan do not teach the apparatus of claim 12, wherein the processor is further configured to execute the instructions to prompt the multiple conference participants to provide other identifying information usable to determine the identities of the multiple conference participants. Tadge teaches this limitation (see figs. 2A-2B, ¶ 0050: the information, e.g., a name, is manually input into the system). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Fukasawa, Kumar, Nagpal, and Hogan to incorporate prompting a user to enter identification information in a conferencing session.

9. Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Fukasawa (US 2020/0295959) in view of Kumar et al. (US 2019/0215464), further in view of Nagpal et al. (US 2024/0045574), further in view of Hogan et al. (US 5,483,587), and in view of Ostap et al. (US 10,972,655).

Regarding claim 18, Fukasawa, Kumar, Nagpal, and Hogan do not teach the non-transitory computer-readable medium of claim 17, the operations comprising determining a source location of audio data, wherein the audio data is used to determine the identities of at least some of the multiple conference participants. Ostap teaches determining a source location of audio data (see fig. 4H, col. 29, line 60 - col. 21, line 12: the conferencing system is able to identify the participant speaker and the location of the speaker). Ostap does not disclose that the audio data is used to determine the identities of at least some of the multiple conference participants. Kumar teaches this (see fig. 3A, ¶ 0053, 0058-0059: the system, having a voice activity detector, is able to identify a participant based on voice recognition, and a face recognizer is able to determine the participants in the room and identify each participant's name). The combination of Ostap with Fukasawa, Kumar, Nagpal, and Hogan provides the identities of the participants in the conference based on the audio data. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate audio-based localization of participants in a conference in order to identify the speaker in the room and the speaker's location.

10. Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Fukasawa (US 2020/0295959) in view of Kumar et al. (US 2019/0215464), further in view of Nagpal et al. (US 2024/0045574), further in view of Hogan et al. (US 5,483,587), and in view of Matsuura (US 2023/0007060).

Regarding claim 19, Fukasawa, Kumar, Nagpal, and Hogan do not teach the non-transitory computer-readable medium of claim 17, the operations comprising generating metadata for identifying a time span during which to present the representations of the identities of the multiple conference participants. Matsuura teaches this limitation (see figs. 7-8, ¶ 0054, 0057: the system identifies a participant who will be speaking and a time span (the start time and end time of the speech) in association with that participant; this is applicable to multiple users in a conferencing session). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a time span for the identified speaker, providing information about the user and the time frame during which the user will be speaking.
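
Claim 19's metadata limitation is easy to picture as a list of per-speaker time spans consulted at presentation time. A sketch under that assumption, with hypothetical type and field names:

```python
from dataclasses import dataclass

@dataclass
class SpeechSpan:
    participant: str  # identity determined during the conference
    start_s: float    # start of the span, seconds from conference start
    end_s: float      # end of the span

def names_to_present(metadata: list, t: float) -> list:
    """Identities whose representations should be presented at time t."""
    return [m.participant for m in metadata if m.start_s <= t <= m.end_s]

# e.g. names_to_present([SpeechSpan("Rebecca", 12.0, 30.5)], 20.0) == ["Rebecca"]
```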
11. Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over Fukasawa (US 2020/0295959) in view of Kumar et al. (US 2019/0215464), further in view of Nagpal et al. (US 2024/0045574), further in view of Hogan et al. (US 5,483,587), and further in view of Margolin (US 9,613,448).

Regarding claim 21, Fukasawa, Kumar, Nagpal, and Hogan do not teach the non-transitory computer-readable medium of claim 17, wherein the highlighting of the text representation of the identity of the first conference participant is to a callout depicted within the video and within which the text representation of the identity of the first conference participant is visually presented. Margolin teaches this limitation (see fig. 5, col. 15, line 33-col, line 67, col. 20, lines 4-62: the conferencing session presents the participants on the screen with the names of participants depicted in callouts next to each participant; the callout or bubble directed at the participant, containing the name, constitutes the highlight). The combination of Margolin with Fukasawa, Kumar, Nagpal, and Hogan provides highlighting the participant with a callout bubble containing the participant's name. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate highlighting the participant with a callout bubble in which the participant's name is presented.

Conclusion

12. Any inquiry concerning this communication or earlier communications from the examiner should be directed to ASSAD MOHAMMED, whose telephone number is (571) 270-7253. The examiner can normally be reached 9:00 AM-5:00 PM. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Duc Nguyen, can be reached at 571-272-7503. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (in USA or Canada) or 571-272-1000.

/ASSAD MOHAMMED/
Examiner, Art Unit 2691

/DUC NGUYEN/
Supervisory Patent Examiner, Art Unit 2691

Prosecution Timeline

Jan 26, 2023: Application Filed
Oct 02, 2024: Non-Final Rejection (§103)
Jan 07, 2025: Applicant Interview (Telephonic)
Jan 07, 2025: Response Filed
Jan 07, 2025: Examiner Interview Summary
Jan 28, 2025: Final Rejection (§103)
Apr 02, 2025: Applicant Interview (Telephonic)
Apr 02, 2025: Examiner Interview Summary
Apr 03, 2025: Response after Non-Final Action
Apr 16, 2025: Request for Continued Examination
Apr 21, 2025: Response after Non-Final Action
Apr 30, 2025: Non-Final Rejection (§103)
Jul 31, 2025: Examiner Interview Summary
Jul 31, 2025: Applicant Interview (Telephonic)
Aug 05, 2025: Response Filed
Sep 15, 2025: Final Rejection (§103)
Nov 07, 2025: Interview Requested
Nov 13, 2025: Applicant Interview (Telephonic)
Nov 14, 2025: Examiner Interview Summary
Nov 18, 2025: Response after Non-Final Action
Dec 11, 2025: Request for Continued Examination
Jan 25, 2026: Response after Non-Final Action
Mar 17, 2026: Non-Final Rejection (§103) — current

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12604149: ELECTRONIC DEVICE AND METHOD THEREOF FOR OUTPUTTING AUDIO DATA (2y 5m to grant; granted Apr 14, 2026)
Patent 12598441: AUDIO SIGNAL PROCESSING METHOD AND AUDIO SIGNAL PROCESSING APPARATUS (2y 5m to grant; granted Apr 07, 2026)
Patent 12587801: RE-MIXING A COMPOSITE AUDIO PROGRAM FOR PLAYBACK WITHIN A REAL-WORLD VENUE (2y 5m to grant; granted Mar 24, 2026)
Patent 12587774: SYSTEM AND METHOD OF ASSEMBLING A COMPRESSION TRIGGERED HEADSET POWER SAVING SYSTEM FOR AN AUDIO HEADSET (2y 5m to grant; granted Mar 24, 2026)
Patent 12581240: Method and System for Determining Audio Channel Role of Sound Box, Electronic Device, and Storage Medium (2y 5m to grant; granted Mar 17, 2026)
Study what changed to get past this examiner, based on the 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 73%
With Interview: 84% (+11.1%)
Median Time to Grant: 3y 0m
PTA Risk: High
Based on 587 resolved cases by this examiner. Grant probability is derived from the career allow rate.
