DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 1-3, 13-15 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Sarris (US 2016/0255126) in view of Anderson et al. (US 2016/0088259), further in view of Waugh (US 2018/0176384), Bieselin et al. (US 5,559,875) or Harathi et al. (US 11,516,036), and further in view of the Indian application per IDS (2/19/26).
Regarding claims 1, 13 and 19, Sarris teaches a system, a method and a computer-readable storage media for automatically adapting meeting interfaces of an application providing a virtual meeting in response to detecting an acoustic signal subsequent to receiving an associated interruption symbol raising signal, the system comprising: a processor; and a memory, coupled to the processor, configured to store executable instructions that, when executed by the processor, cause the processor to:
receive an interruption signal indicating that a first participant, among a set of participants of the virtual meeting, is requesting permission to speak in the meeting, from a first meeting interface of the application providing the virtual meeting, wherein the first meeting interface is among a set of meeting interfaces associated with the application providing the virtual meeting on a respective set of client devices for the respective set of participants of the virtual meeting; (Sarris: videoconferencing provides participants with the ability to give visual cues to one another and thereby lessen the number of interruptions and instances of multiple parties speaking at the same time, [0008]. Sarris provides a chronological order so that the earliest participant (301) requesting to speak will be at the top of the list in the speakers' queue, [0193]. Sarris describes an orderly speaking turn: when pressed, the “ask to speak” button 2022 signals a participant's 301 desire to participate in the virtual meeting as a speaker 302 and places the participant 301 in a speaker's queue 2023, which is a virtual “waiting line” for speakers 302 who will be electronically placed in a speaker's box 2024 when it is that speaker's 302 turn to speak; when it is time for that participant 301 to speak, the “ask to speak” button 2022 is electronically converted to a stop speaking toggle 2028, which the speaker 302 toggles when he/she finishes speaking; the stop speaking toggle 2028 is then electronically converted back to an “ask to speak” button 2022, [0156, 0157, 0174]);
responsive to receiving the interruption signal, send the interruption symbol raising signal over a communication network to each meeting interface of the set of meeting interfaces to turn on a visual interruption symbol from the first participant; (Sarris, para 0175, 0201, 0203: beneath the speaker's box 2024 is the speaker's queue 2023, which shows the speakers 302 waiting in line to enter the speaker's box 2024; the speaker's queue 2023 identifies the next speaker 302 and, in ascending order, the names of the speakers 302 waiting in line behind the next speaker 302. For the participant's interface shown in the figure, after entering the room, pressing the “ask to speak” button 2022 signals a participant's 301 desire to participate in the virtual meeting as a speaker 302 and places the participant 301 in a speaker's queue 2023, which is a virtual “waiting line” for speakers 302 who will be electronically placed in a speaker's box 2024 when it is that speaker's 302 turn to speak. For the meeting interface shown in fig. 7, the moderator 303 and administrator 304, to interrupt a speaker 302 or to eject a speaker 302, are equipped with a mute participant(s) function 5106, an eject participant function 5107, and an interruption function 5010).
detect a first acoustic signal from a device used by the first participant [[indicating that the first participant has spoken in the meeting = X1]], or detect a first acoustic signal received at a network server from the device used by the first participant [[indicating that the first participant has spoken in the meeting = X1]]; and responsive to detecting the first acoustic signal from the first participant subsequent to receiving the interruption symbol raising signal: automatically generate an interruption symbol lowering signal after the first participant starts speaking, wherein: the interruption symbol lowering signal is configured to control a device to update a meeting interface by turning off the visual interruption symbol; and the interruption symbol lowering signal is generated without waiting for the first participant to stop speaking and without requiring a manual request from the first participant for generating the interruption symbol lowering signal; (Anderson teaches, per Fig. 12b, Raise Hand, Par. [0247]: “In step 1220, the request is checked for Raise Hand Request. In step 1221, a Raise Hand request is processed: the status for the attendee is changed, both in the attendee user interface and all attendee lists where the attendee is displayed. Next, the main loop is repeated. In one example, if this is the first hand raised (First Hand Up), an alert sound is generated for the speaker. The purpose of this alert sound is to enable the speaker to focus on the video camera rather than having to check to see if anyone has raised their hand. This alert sound should be distinctive, and different from other alert sounds.” Then Lower Hand, Par. [0248]: “In step 1222, the request is checked for Lower Hand Request. In step 1223, a Lower Hand request is processed by updating the status for the attendee user interface and in all attendee lists where the attendee is displayed. If there is a pending text message from this attendee, that state is reasserted. Otherwise, the default present or listening state is asserted. Next, the main loop is repeated. In one example, if this is the last hand lowered (Last Hand Down), an alert sound is generated for the speaker, for the same reason as for the raised hand case. The alert sound for First Hand Up and Last Hand Down should be different, such as a rapid rising or falling tones.”)
By obviousness, the conferencing system would control/update the status of all activities of the participants at all times, as confirmed by the non-patent literature, namely the Indian Application (IDS submitted 02/19/26); the examiner quotes the passage below (see the Indian document, page 3).
[media_image1.png: quoted passage from the Indian document, page 3]
and send to each of the set of client devices the generated interruption symbol lowering signal to automatically update each meeting interface and turn off the visual interruption symbol. (Please note that the examiner will address X1 last.)
As demonstrated above and in the previous action, Sarris teaches “detect a first acoustic signal at a device used by the first participant, or detect a first acoustic signal received at a network server from a device used by the first participant”; and Sarris does not explicitly detail “responsive to detecting the first acoustic signal from the first participant subsequent to receiving the interruption symbol raising signal: automatically generate an interruption symbol lowering signal”.
Anderson via Fig. 8, an analogous art in the same field of invention, discloses (para [0408-0410]): in step 2711 the streaming server informs all connected clients of the status change of the new client that Requested Floor, which is that the client's hand is raised; in step 2712 the streaming server sends a Hand Raised response back to the client to inform the client that the floor is owned by another client, but the requesting client's hand is raised; proceeding to step 2713, the client receives the Hand Raised message from the server and makes the appropriate UI changes. Per para [0272], in fig. 13a a “hand” icon is used to illustrate a “hand raised” status, whereas a “hand not raised” status is illustrated by a lack of a “hand” icon; the “hand” icon will be shown next to all participants in the list of clients that have the “hand raised” status to take the floor. Also see para 0279: there are two columns of icons to the left of each participant name in the Participant List 1313; the first column is used for Hand Raised Indicators 1319 and the current Speaker Indicator 1314, so the number of Hand Raised indicators will be one less than the total number of requests received for taking the floor, as the current speaker indicator is different than the “hand raised” indicator, [0279], but to suppress display of at least one of the visual interruption symbols for at least one of the subsequent speakers (Anderson, see para 0412-0414, as shown in fig. 28: in step 2801 the client sends a ‘Lower Hand’ message as the first subset of interruption signals to the streaming media server; in step 2802 the streaming server receives the request and checks to see if the client has a hand up, representing one or more received interruption signals; if the client does have a hand up, then in step 2803 the streaming server sets the client's state to LISTEN, which indicates the client does not have the floor and has no hand raised; in step 2804 the streaming server reduces the total hand raised count, which is one less than the previously received total number of hands-up signals; the current hands-up count is used to keep track of the total number of clients with hands raised after the lowering message sent by the client).
Anderson further teaches “send to each of the set of client devices the generated interruption symbol lowering signal to automatically update each meeting interface” (FIG. 28 illustrates an example of a video conference client (participant) sending a ‘Lower Hand’ request to the streaming media server. In step 2801 the client sends a ‘Lower Hand’ message to the streaming media server… and in step 2804 the streaming server reduces the total hand raised count. This count is used to keep track of the total number of clients with hands raised, [0412-0416]. Furthermore, “the attendee type facilitator or non-facilitator is checked in step 1213. If not a facilitator, the new attendee is set as a conference listener in step 1214. Step 1214 is also the entry point SL for setting an attendee as a listener. Step 1214 sets the current data stream from the speaker in the current group or subgroup to be sent to the attendee, and updates the attendee status in attendee lists of other users' interfaces,” [0245]).
Anderson, see para 0279: there are two columns of icons to the left of each participant name in the Participant List 1313; the first column is used for Hand Raised Indicators 1319 and the current Speaker Indicator 1314, so the number of Hand Raised indicators will be one less than the total number of requests received for taking the floor, as the current speaker indicator is different than the “hand raised” indicator, but to suppress display of at least one of the visual interruption symbols for at least one of the subsequent speakers. Also see para 0412-0414, as shown in fig. 28: in step 2801 the client sends a ‘Lower Hand’ message as the first subset of interruption signals to the streaming media server; in step 2802 the streaming server receives the request and checks to see if the client has a hand up, representing one or more received interruption signals; if the client does have a hand up, then in step 2803 the streaming server sets the client's state to LISTEN, which indicates the client does not have the floor and has no hand raised; in step 2804 the streaming server reduces the total hand raised count, which is one less than the previously received total number of hands-up signals; the current hands-up count is used to keep track of the total number of clients with hands raised after the lowering message sent by the client; and turn off the visual interruption symbol (Anderson, see para 0243, 0248: in step 1222, the request is checked for Lower Hand Request. In step 1223, a Lower Hand request is processed by updating the status for the attendee user interface and in all attendee lists where the attendee is displayed; if there is a pending text message from this attendee, that state is reasserted, otherwise the default present or listening state is asserted; next, the main loop is repeated. Step 1206, also the entry point L for the Main Loop, is where the system checks for an attendee request, which represents a next interruption signal for surfacing from the plurality of interruption signals other than the first subset of interruption signals of the “hand raised” list of signals; if one is found, control is transferred to entry point B in fig. 12b, else the system checks for a request to add a new attendee in step 1207).
Regarding X1.
Sarris does not teach X1. X1, however, is notoriously well-known in the art of voice recognition for determining whether a speaker/user/participant has spoken.
Waugh teaches, “If a voice collision between participants is detected, the conference call manager may implement a fairness technique that determines which of these participants has spoken the least on the conference call. The conference call manager may then provide an indication to the other participants to yield speaking in favor of this participant,” [0014]. Or, in some examples, “The lower ranked participant may be the participant that has spoken for a longer cumulative time period than at least one other participant of the multiple participants associated with the voice collision. As discussed, it may be fairer to allocate priority to those participants who have not spoken as much on the conference call compared to others to speak on the conference call. Additionally, it may be more interesting to hear from a participant who has not spoken very much on the conference call,” [0033].
Bieselin teaches, “During the conference, the user can perform certain actions, step 577, to generate information associated with the audio recording of the conference. For example, conference participants who are speaking during a particular audio data block can be identified and added to the speaker list in the current audio data file control block, step 578. When a conference participant speaks, the system detects that a participant has spoken, identifies the source of the speech. This can be achieved a number of ways using technology well known in the art… by determining voice signals on a particular line card interface... Col. 7, lines 7-20”.
Harathi teaches, “The real-time meeting effectiveness metrics may include, for example, the amount of time remaining in the meeting, the distribution of speaking time across individual speakers in a meeting (e.g., in percentage and/or in hours, minutes, or seconds), the order in which the participants have indicated they'd like to speak, participation of each participant (e.g., whether a participant has spoken or has spoken more or less than a threshold length of time), occurrence of an interruption and various other live measurements,” Col. 18, lines 44-54.
Therefore, it would have been obvious to the ordinary artisan before the effective filing date to incorporate the teaching of Anderson into the teaching of Sarris for the purpose of providing an efficient method for a highly reliable and stable video conferencing system for an interactive, live conference with multiple participants, and to optionally provide for small group formation for subgroup interactions and exercises within a larger conference, where the video conferencing system overcomes the primary hurdles to the use of video conferencing for such interactive conferences, including echoes, delays, start-stop conversations, and CPU and Internet bandwidth overload (see Anderson, [0123]). It would also have been obvious to incorporate the teaching of Waugh, Bieselin or Harathi into Sarris's teaching for the purpose of determining the interruption from the participant who has spoken and allowing other non-speaking participants to speak, and to further incorporate the teaching of the Indian document, which presents an automatic capability to control/update actions of the participants during the conferencing session, such as giving permission to speak when a hand is raised, and dropping or lowering hands when other hands cause interruption to the speaker.
Claims 2 and 14. The system of claim 1, wherein the first acoustic signal does not comprise any voice command for turning off the visual interruption symbol. (See the independent claims, specifically Anderson.)
Claims 3, 15 and 20. The system of claim 1, wherein the visual interruption symbol is a raised hand symbol indicating that the first participant raises a virtual hand in the meeting via the first meeting interface. (See the independent claims, specifically Anderson.)
Claim(s) 4-8, 10-12 and 16-18 are rejected under 35 U.S.C. 103 as being unpatentable over Sarris in view of Anderson, in view of Waugh, Bieselin or Harathi, in view of the IDS (2/19/26), and further in view of Deole.
Claims 4 and 16. The system of claim 1, wherein the executable instructions further include instructions that, when executed by the processor, cause the processor to: receive a plurality of acoustic signals from a plurality of sources associated with the virtual meeting; determine an intensity level for each of the plurality of acoustic signals; and identify, from the plurality of acoustic signals, one or more acoustic signals having the intensity level exceeding a threshold level. (Deole: [0018] The soft client would do a screening at its end before passing the data onto the server. The initial screening may include a check determining if the intensity of the speech is beyond a certain threshold, such as an audible limit of humans, and filter out audio signals that are below this limit. Signals above the limit are passed on to the conference server. [0019] When a participant is speaking on mute, the server uses the data received from the soft client to compare it against baseline data, such as training data and/or signals from the NLP module, to determine that the participant is actively speaking in the conference but is doing so on mute.)
Claims 5 and 17. The system of claim 1, wherein to detect the first acoustic signal from the first participant, the executable instructions further include instructions that, when executed by the processor, cause the processor to: recognize, from the one or more acoustic signals, one or more voices from one or more participants of the virtual meeting; and detect whether the first participant contributes to the virtual meeting based on whether the one or more voices include a voice from the first participant. (Deole: [0013] Since the data gathering, as described above, identifies the voice characteristics of each user while he is actively participating (speaking) in the conference, this data may then be used by the subsequent modules (in the process flow) to train the ML models. To reduce false positives, filtering may be performed, such as to exclude sound outside the audible level of humans. This data is used to train the models regarding the voice characteristics of each user in the conference, which helps the models to accurately identify that the user is actually participating (speaking) in the conference rather than having a sidebar conversation with someone outside of the conference (e.g., a person in the same room, using a cellular phone to conduct a separate conversation, etc.).
[0034] Other embodiments herein provide for analyzing the participants' voice characteristics using NLP/AI and voice recognition techniques to make a determination that the user is actually speaking on mute in the conference and automatically take appropriate action by the system without any manual intervention, thus preserving the rich user experience of participants in the conference. NLP or other machine intelligence may be utilized to parse a sentence spoken by one participant that addresses or references another participant. For example, asking a question directed to another participant (e.g., “Let's look at the document. Do you have it ready, Alice?”) is a question directed to Alice and, as a result, the endpoint utilized by Alice should be responding. If not, the endpoint may be automatically unmuted. If the NLP determines the reference is not directed to another participant (e.g., “Let's look at the document shared by Alice.”) then the endpoint utilized by Alice may not be expected to respond and the current mute/unmuted state is left unchanged.)
Claims 6 and 18. The system of claim 5, wherein the executable instructions further include instructions that, when executed by the processor, cause the processor to: determine whether the first meeting interface is in a mute status or in an unmute status; and in response to determining that the one or more voices include the voice from the first participant and the first meeting interface is in the unmute status, generate the interruption symbol raising signal. (Sarris: [0181] The open private chat button (5101) allows the moderator (303) to open a private text communication dialogue with any participant (301), the contents of which remain between the moderator (303) and the selected participant (301). The mute user button (5102) is a toggle control that, when pressed, mutes the selected participant (301); pressing this button/toggle (5102) again will restore the sound to the selected participant (301). [0201] There may be occasions where the moderator (303) or administrator (304) wishes to adjust the speaker's (302) volume, to interrupt a speaker (302), or to eject a speaker (302) or unruly participant (301) from the videoconference. To accomplish these tasks, the moderator (303) and administrator (304) are equipped with a mute participant(s) function (506). The examiner further notes that Deole also teaches the feature: Para [0017] When the participant is connected to the conference using a soft client (or web client) and uses the soft/web client to mute himself/herself, the data stream is still passed to the server; however, the server does not broadcast the stream to other participants. Therefore, the participant may be speaking on mute; however, the server still has access to the stream of data coming from the participant's endpoint/terminal.
[0031] In addition to automatically determining a threshold confidence, the participant, conference moderator, or other administrator may configure the threshold values and/or disable automatic muting/unmuting with or without announcement functions announcing or indicating the participant should manually initiate muting/unmuting their endpoint. It may be necessary or beneficial to warn participants that, when muted, their audio will be monitored, but that such monitoring is solely for the determination of whether audio provided while on mute indicates the audio should be unmuted, or vice versa, such as in accordance with the law/legal rules imposed by the local countries/geographies in which the invention will be used.)
Claims 7-8. The system of claim 5, wherein the interruption symbol lowering signal is generated after a specific amount of time of detecting that the one or more voices include the voice from the first participant; wherein the specific amount of time is determined from participant interactions in the virtual meeting. (See the independent claims, or at least Anderson, para [0408-0410]: in step 2711 the streaming server informs all connected clients of the status change of the new client that Requested Floor, which is that the client's hand is raised; in step 2712 the streaming server sends a Hand Raised response back to the client to inform the client that the floor is owned by another client, but the requesting client's hand is raised; proceeding to step 2713, the client receives the Hand Raised message from the server and makes the appropriate UI changes. Per para [0272], in fig. 13a a “hand” icon is used to illustrate a “hand raised” status, whereas a “hand not raised” status is illustrated by a lack of a “hand” icon; the “hand” icon will be shown next to all participants in the list of clients that have the “hand raised” status to take the floor. Also see para 0279: there are two columns of icons to the left of each participant name in the Participant List 1313; the first column is used for Hand Raised Indicators 1319 and the current Speaker Indicator 1314, so the number of Hand Raised indicators will be one less than the total number of requests received for taking the floor, as the current speaker indicator is different than the “hand raised” indicator, [0279], but to suppress display of at least one of the visual interruption symbols for at least one of the subsequent speakers. See also para 0412-0414, as shown in fig. 28: in step 2801 the client sends a ‘Lower Hand’ message as the first subset of interruption signals to the streaming media server; in step 2802 the streaming server receives the request and checks to see if the client has a hand up, representing one or more received interruption signals; if the client does have a hand up, then in step 2803 the streaming server sets the client's state to LISTEN, which indicates the client does not have the floor and has no hand raised; in step 2804 the streaming server reduces the total hand raised count, which is one less than the previously received total number of hands-up signals; the current hands-up count is used to keep track of the total number of clients with hands raised.)
Claim 10. The system of claim 5, wherein, prior to automatically generate the interruption symbol lowering signal, the executable instructions further include instructions that, when executed by the processor, cause the processor to notify, via the first meeting interface, the first participant of generating the interruption symbol lowering signal. (See the independent claims, specifically Anderson).
Claim 11. The system of claim 5, wherein, prior to detecting the first acoustic signal, the executable instructions further include instructions that, when executed by the processor, cause the processor to automatically turn off a mute button for the first participant. (See claims 6 and 18).
Claim 12. The system of claim 5, wherein, prior to detecting the first acoustic signal, the executable instructions further include instructions that, when executed by the processor, cause the processor to prompt the first participant to take an unmute action. (See claims 6 and 18).
Inquiry
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PHUNG-HOANG J. NGUYEN whose telephone number is (571)270-1949. The examiner can normally be reached on a regular schedule, 6:00-3:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Duc Nguyen can be reached on 571-272-7503. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/PHUNG-HOANG J NGUYEN/Primary Examiner, Art Unit 2691