Prosecution Insights
Last updated: April 19, 2026
Application No. 18/761,478

MULTI-USER AUDIO SIGNAL PROCESSOR FOR IMITATING A FEEDBACK SIDETONE

Status: Non-Final OA (§103)
Filed: Jul 02, 2024
Examiner: KAZEMINEZHAD, FARZAD
Art Unit: 2653
Tech Center: 2600 — Communications
Assignee: Skyworks Solutions Inc.
OA Round: 1 (Non-Final)

Grant Probability: 71% (Favorable)
Expected OA Rounds: 1-2
Expected Time to Grant: 3y 6m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 71% — above average (379 granted / 534 resolved; +9.0% vs TC avg)
Interview Lift: +67.2% higher allowance in resolved cases with an interview
Typical Timeline: 3y 6m average prosecution; 24 applications currently pending
Career History: 558 total applications across all art units
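The interview-lift figure above is presumably a relative comparison of the examiner's allowance rate in resolved cases with an interview versus those without; the analytics vendor's exact formula is not disclosed here. The following is a minimal sketch under that assumption, using made-up case counts (not this examiner's actual interview split):

```python
def interview_lift(allowed_with, total_with, allowed_without, total_without):
    """Relative allowance-rate lift for interviewed vs non-interviewed cases.

    Returns e.g. 0.67 for a 67% relative improvement. The definition
    (simple rate ratio minus one) is an assumption about the metric.
    """
    rate_with = allowed_with / total_with
    rate_without = allowed_without / total_without
    return rate_with / rate_without - 1.0

# Hypothetical split of resolved cases (illustrative numbers only).
lift = interview_lift(allowed_with=92, total_with=100,
                      allowed_without=239, total_without=434)
print(f"{lift:+.1%}")  # → +67.1%
```

The same relative-lift definition is what makes figures above 100% allowance impossible, which is why a vendor may instead compute lift on a matched subset or on odds rather than raw rates.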

Statute-Specific Performance

§101: 13.6% (-26.4% vs TC avg)
§103: 36.9% (-3.1% vs TC avg)
§102: 18.3% (-21.7% vs TC avg)
§112: 18.5% (-21.5% vs TC avg)
Tech Center averages are estimates. Based on career data from 534 resolved cases.

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Interpretation

The following is a quotation of 35 U.S.C. 112(f):

(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:

An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “audio processing system” in claim 9, where no structure is recited.

Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof. If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.

Claim Objections

Claim 9 is objected to because of the following informalities: “the audio processor system” in line 2 appears to be a misspelling of “the audio processing system”. Appropriate correction is required.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over GABA et al. (WO 2022/139077), and further in view of PEHRSSON et al. (EP1336253).

Regarding claim 1, GABA et al. do teach an audio processor for generating a non-delayed sidetone for a user (Title, Abstract, Fig.
2 displays “an electronic device” (audio processor) which according to the Abstract lines 3-4 manages a “group call” (a playback signal) among “a plurality of participants” (plurality of users)), the audio processor comprising: an input for receiving a playback signal, the playback signal including respective delayed audio signals from one or more users mixed together as the playback signal, the one or more users including a first user and the mixed audio signal from the first user being a delayed input audio signal (Abstract lines 3-4 discuss managing a “group call” (a playback signal) among “a plurality of participants” (plurality of users), which could comprise “echo” (a delayed signal (¶ 111 lines 4-5); ¶ 76 lines 1+: “the electronic device (100)” (the audio processor) “is configured to receive a private connection request from one or more” “of the plurality of identified participants” “during the group call” “to enable a private mode between the electronic device” “and the one or more candidate participant” “is configured to receive a media stream” (receiving a playback signal) “from the one or more candidate participant during the group call in the private mode and mask the media stream from the one or more candidate participant. Further, the electronic device (100) is configured to mix” (which includes mixing) “the masked media stream received from the one or more first participant” (a first user) “and a media stream received from the remaining participant” “of the plurality of participants” (and one or more users in the “media stream” (playback signal) where these include “echo” (a delayed signal as well (¶111)); a signal remover configured to generate a first processed signal by removing the delayed audio signal for the first user from the playback signal (¶ 111 lines 4-5: “the enhancement bit-steam generator” (a signal remover) “remove echo” (removes the delayed audio signal) “for a current speaker” (for the first user) “from the final mixed audio” (to generate a first processed signal)); and a signal mixer coupled to the signal remover, the signal mixer configured to generate a second processed signal by mixing the first processed signal with a non-delayed input audio signal from the first user (¶ 111 lines 6-7: “The bit-stream multiplexer (504) mixes the audio stream” (the second processed signal generated from mixing the first processed signal with a non-delayed input audio signal from the first user, because according to ¶ 111 lines 1-3 the “mixed audio stream” is formed by mixing “audio from different calling connections” which includes the “first participant” or sometimes referred to as “first speaker” (the first user input audio) included in the “mixed audio” (which comprises the first processed signal)), and output the second processed signal (¶ 111 last S: “The mixed audio stream” (the resulting second processed signal) “is transmit” (is outputted) “to the participants” (e.g., including the “first participant” (first user)).

GABA et al. do not specifically disclose: the non-delayed input audio signal acting as a sidetone for the first user. PEHRSSON et al. do teach: the non-delayed input audio signal acting as a sidetone for the first user (in a “conference call” (Col. 10 line 4) according to Col. 10 lines 15+ teach: “signals originate from different users” (a plurality of users) “generating by said summator a sum signal” (signals are summed or mixed) “of said first and second electronic signals for transmission to a transmitter/receiver (10) of said portable communication device, and generating a first sidetone” (wherein a non-delayed audio signal of each user acts as a sidetone) “of said first electronic signal for transmission” (upon output) “to first and second earphone outputs (13a,13b; 14), and a second sidetone of the second electronic signal for transmission to said earphone outputs (13a,13b; 14)”).

It would have therefore been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the “summator” module of PEHRSSON et al. into the “multiplexer” which “mixes the audio stream” of GABA et al., as doing so would enable the combined systems and their associated methods to perform in combination as they do separately, and would further enable the “caller[s]” of GABA et al. to get a sense of the volume of their own voice without generating echo of the user’s voice from the speaker back into the microphone receiving the user’s voice.

Regarding claim 2, GABA et al. do teach the audio processor according to claim 1 further comprising a measuring unit coupled to the signal remover, the measuring unit configured to determine an audio feature of the delayed audio signal of the first user relative to the playback audio signal or relative to the non-delayed input audio signal from the first user (¶ 111 lines 4-5: “the enhancement bit-steam generator” (a measuring unit) “remove echo” (to determine a feature of the audio signal including its delayed signal component) “for a current speaker” (for the first user) “from the final mixed audio”).

Regarding claim 3, GABA et al.
do not specifically disclose the audio processor according to claim 2 further comprising a first amplifier having a first input, a second input, and an output, wherein the first input of the amplifier is coupled to the measuring unit and the second input of the amplifier is coupled to the non-delayed input audio signal from the first user, the output of the amplifier being coupled to the signal mixer such that the audio feature of the non-delayed input audio signal is adjusted based on a relative level of the determined audio feature of the delayed audio signal with respect to the playback audio signal.

PEHRSSON et al. do teach the audio processor according to claim 2 further comprising a first amplifier having a first input, a second input, and an output, wherein the first input of the amplifier is coupled to the measuring unit and the second input of the amplifier is coupled to the non-delayed input audio signal from the first user, the output of the amplifier being coupled to the signal mixer such that the audio feature of the non-delayed input audio signal is adjusted based on a relative level of the determined audio feature of the delayed audio signal with respect to the playback audio signal (Col. 5 lines 42+ referring to Fig. 2: “As described above the CODEC block 1 has a speech decoder path, which operates simultaneously and independently from the speech encoder paths for duplex operation. The PCM signal accepted by the PCM input 12 from the DSP 8 is transmitted to a PCM voice decoder 22. The output signal from the decoder 22 is fed through a programmable volume control 23. Further, the volume-controlled signal is fed through a filter block 24, comprising a receiver low-pass filter, a digital-to-analogue converter for converting the PCM decoded signal into an analogue signal. Before input to a first earphone amplifier 25 for the fixed earphone 15 and/or to a second earphone amplifier 26 for the handsfree earphone 16, the signal passes a receive programmable gain stage (PGA) in the filter block 24, enabling adjustment of the circuit for different sensitivity of the earphone(s) and is spread in the path” (the entire set of Fig. 2 including the “amplifier 25” functions as an amplifier which in one instance comprises the “volume control 23” (a first input), the “SIDE TONE” “30” (a second input), and the “programmable gain stage (PGA) in the filter block 24” (an output), such that the “volume control 23” receives the “output signal from the decoder” (a non-delayed input audio signal) and itself functions as a measuring and “control” (adjusting) unit for “volume” (an audio feature), plus an input from the “SIDE TONE” “30” (a delayed audio signal), which are together fed into the “FILTER DAC” or “block 24” (the output) which “enabl[es] adjustment” (i.e., adjusts the two through “volume” (relative level)).

It would have therefore been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the functions of the “conference call” management of PEHRSSON et al. into the “group call” of GABA et al., as doing so would enable the combined systems and their associated methods to perform in combination as they do separately, and would further enable GABA et al. to support “enabling adjustment” “for different [audio] sensitivity” as disclosed in GABA et al. ¶ 0028 last sentence.

Regarding claim 4, GABA et al. do not specifically disclose the audio processor according to claim 3 wherein the determined audio feature is a volume or a frequency of the delayed audio signal. PEHRSSON et al. do teach the audio processor according to claim 3 wherein the determined audio feature is a volume or a frequency of the delayed audio signal (¶ 0028 lines 6+: “The output signal from the decoder 22 is fed through a programmable volume control 23. Further, the volume-controlled signal” (“volume” audio feature is determined) “is fed through a filter block 24”). For obviousness to combine GABA et al. and PEHRSSON et al. see claim 3.

Regarding claim 5, GABA et al. do teach the audio processor according to claim 1 further comprising a microphone coupled to the audio processor, the microphone being configured to input audio signals from the first user (¶ 68: “FIGS. 1a-1c are example illustrations in which an electronic device (100) handles a group call, according to an embodiment as disclosed herein. The electronic device (100) can be, for example, but not limited to a cellular phone, a smart phone” (i.e., devices that comprise a microphone: see PEHRSSON et al. ¶ 0002: “A conventional mobile phone is used for a single user and, consequently, the mobile phone has a single microphone for receiving sound waves of human voice of the user of the phone and a single loudspeaker for reproducing sound for the user”).

Regarding claim 6, GABA et al. do teach the audio processor according to claim 5 further comprising a speaker coupled to the audio processor, the speaker being configured to output the second processed audio signal to the first user (¶ 68: “FIGS. 1a-1c are example illustrations in which an electronic device (100) handles a group call, according to an embodiment as disclosed herein. The electronic device (100) can be, for example, but not limited to a cellular phone, a smart phone” (i.e., devices that comprise speakers for e.g., outputting the second processed audio signal: see PEHRSSON et al.
¶ 0002: “A conventional mobile phone is used for a single user and, consequently, the mobile phone has a single microphone for receiving sound waves of human voice of the user of the phone and a single loudspeaker for reproducing sound for the user”)).

Regarding claim 7, GABA et al. do not specifically disclose the audio processor according to claim 6 further comprising an analog-to-digital converter coupled to the microphone. PEHRSSON et al. do teach the audio processor according to claim 6 further comprising an analog-to-digital converter coupled to the microphone (¶ 0033: “The output of the amplifier comprises the analogue sum signal. The signal is filtered through a filter block 28, which comprises a transmit programmable gain stage (TX PGA), an analogue-to-digital converter (ADC)” (analog to digital converter) “and a transmit band pass filter, which converts the analogue signal from the microphone” (coupled to the microphone) “to a digital signal and filters”). For obviousness to combine GABA et al. and PEHRSSON et al. see claim 3.

Regarding claim 8, GABA et al. do not specifically disclose the audio processor according to claim 7 further comprising a digital-to-analog converter coupled to the speaker. PEHRSSON et al. do teach the audio processor according to claim 7 further comprising a digital-to-analog converter coupled to the speaker (¶ 0028 lines 51+: “a digital-to-analogue converter” (a digital to analog converter) “for converting the PCM decoded signal into an analogue signal. Before input to a first earphone” (coupled to a speaker)). For obviousness to combine GABA et al. and PEHRSSON et al. see claim 3.

Regarding claim 9, GABA et al. do teach an audio processing system for generating a non-delayed sidetone for a plurality of users (Title, Abstract, Fig. 2 displays “an electronic device” (audio processing system) which according to the Abstract lines 3-4 manages a “group call” (a playback signal) among “a plurality of participants” (plurality of users)), the audio processing system comprising for each respective user of the plurality of users: an input configured to receive a playback signal including delayed audio signals from the plurality of users mixed together as the playback signal (Abstract lines 3-4 discuss managing a “group call” (a playback signal) among “a plurality of participants” (plurality of users), which could comprise “echo” (a delayed signal (¶ 111 lines 4-5); ¶ 76 lines 1+: “the electronic device (100)” (the audio processor) “is configured to receive a private connection request from one or more” “of the plurality of identified participants” “during the group call” “to enable a private mode between the electronic device” “and the one or more candidate participant” “is configured to receive a media stream” (receiving a playback signal) “from the one or more candidate participant during the group call in the private mode and mask the media stream from the one or more candidate participant. Further, the electronic device (100) is configured to mix” (which includes mixing) “the masked media stream received from the one or more first participant” (a first user) “and a media stream received from the remaining participant” “of the plurality of participants” (and one or more users in the “media stream” (playback signal) where these include “echo” (a delayed signal as well (¶111)); a signal remover, coupled to the input, the signal remover being configured to generate a first processed signal by removing the delayed audio signal for the respective user from the playback signal (¶ 111 lines 4-5: “the enhancement bit-steam generator” (a signal remover) “remove echo” (removes the delayed audio signal) “for a current speaker” (for the first user) “from the final mixed audio” (to generate a first processed signal)); and a signal mixer, the signal mixer being coupled to the signal remover, the signal mixer configured to generate a second processed signal by mixing the first processed signal with a non-delayed input audio signal from the first user (¶ 111 lines 6-7: “The bit-stream multiplexer (504) mixes the audio stream” (the second processed signal generated from mixing the first processed signal with a non-delayed input audio signal from the first user, because according to ¶ 111 lines 1-3 the “mixed audio stream” is formed by mixing “audio from different calling connections” which includes the “first participant” or sometimes referred to as “first speaker” (the first user input audio) included in the “mixed audio” (which comprises the first processed signal)), and output the second processed signal (¶ 111 last S: “The mixed audio stream” (the resulting second processed signal) “is transmit” (is outputted) “to the participants” (e.g., including the “first participant” (first user)).

GABA et al. do not specifically disclose: the non-delayed input audio signal acting as a sidetone for the respective user. PEHRSSON et al.
do teach: the non-delayed input audio signal acting as a sidetone for the respective user (in a “conference call” (Col. 10 line 4) according to Col. 10 lines 15+ teach: “signals originate from different users” (a plurality of users) “generating by said summator a sum signal” (signals are summed or mixed) “of said first and second electronic signals for transmission to a transmitter/receiver (10) of said portable communication device, and generating a first sidetone” (wherein a non-delayed audio signal of e.g. the respective user acts as a sidetone) “of said first electronic signal for transmission” (upon output) “to first and second earphone outputs (13a,13b; 14), and a second sidetone of the second electronic signal for transmission to said earphone outputs (13a,13b; 14)”).

It would have therefore been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the “summator” module of PEHRSSON et al. into the “multiplexer” which “mixes the audio stream” of GABA et al., as doing so would enable the combined systems and their associated methods to perform in combination as they do separately, and would further enable the “caller[s]” of GABA et al. to get a sense of the volume of their own voice without generating echo of the user’s voice from the speaker back into the microphone receiving the user’s voice.

Regarding claim 10, GABA et al. do not specifically disclose a computer-implemented system comprising the audio processor of claim 1 and a recording device coupled to the audio processor, the recording device being configured to generate the playback signal input to the audio processor device. PEHRSSON et al. do teach a computer-implemented system comprising the audio processor of claim 1 and a recording device coupled to the audio processor, the recording device being configured to generate the playback signal input to the audio processor device (Abstract: “A portable communication device for conference calls, comprising at least first and second speech encoder paths with first and second inputs” (from input to) “for connection to first and second recording devices” (recording device) “respectively, and at least one output” (to generate playback) “connected to signal processor means for receiving and processing first and second electronic signals from said recording devices via said first and second speech encoder paths for transmission to a transmitter/receiver operatively connected to said signal processor means, wherein said apparatus is adapted to receive said first and second electronic signals simultaneously even if the signals are different, and said apparatus comprises a summator adapted to sum said first and second electronic signals into a sum signal for transmission to said transmitter/receiver”).

It would have therefore been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the functions of the “conference call” management of PEHRSSON et al. into the “group call” of GABA et al., as doing so would enable the combined systems and their associated methods to perform in combination as they do separately, and would further enable GABA et al. to acquire a recording function so as to be able to reuse an input a plurality of times should it become necessary for further processing.

Regarding claim 11, GABA et al. do not specifically disclose the computer implemented system according to claim 10 further comprising a second amplifier in the recording device for adjusting an audio feature of the delayed input audio signal in the mixed playback signal. PEHRSSON et al.
do teach the computer implemented system according to claim 10 further comprising a second amplifier in the recording device for adjusting an audio feature of the delayed input audio signal in the mixed playback signal (Col. 5 lines 53+: “Before input to a first earphone amplifier 25 for the fixed earphone 15 and/or to a second earphone amplifier 26 for the handsfree earphone 16, the signal passes a receive programmable gain stage (PGA) in the filter block 24” (there are two “amplifiers”, “25” and “26” (a second amplifier) and they are coupled to the “programmable gain stage” (a “volume control” (for adjusting audio feature (Col. 5 line 49)))). For obviousness to combine GABA et al. and PEHRSSON et al. see claim 3.

Regarding claim 12, GABA et al. do teach an audio processing method for generating a non-delayed sidetone for a user (Title, Abstract, Fig. 2 displays “an electronic device” (audio processor) which according to the Abstract lines 3-4 manages a “group call” (a playback signal) among “a plurality of participants” (plurality of users)), the audio processing method comprising: receiving a playback signal, the playback signal including respective delayed audio signals from one or more users mixed together as the playback signal, the one or more users including a first user and the mixed audio signal from the first user being a delayed input audio signal (Abstract lines 3-4 discuss managing a “group call” (a playback signal) among “a plurality of participants” (plurality of users), which could comprise “echo” (a delayed signal (¶ 111 lines 4-5); ¶ 76 lines 1+: “the electronic device (100)” (the audio processor) “is configured to receive a private connection request from one or more” “of the plurality of identified participants” “during the group call” “to enable a private mode between the electronic device” “and the one or more candidate participant” “is configured to receive a media stream” (receiving a playback signal) “from the one or more candidate participant during the group call in the private mode and mask the media stream from the one or more candidate participant. Further, the electronic device (100) is configured to mix” (which includes mixing) “the masked media stream received from the one or more first participant” (a first user) “and a media stream received from the remaining participant” “of the plurality of participants” (and one or more users in the “media stream” (playback signal) where these include “echo” (a delayed signal as well (¶111)); generating, by a signal remover, a first processed signal by removing the delayed audio signal for the first user from the playback signal (¶ 111 lines 4-5: “the enhancement bit-steam generator” (a signal remover) “remove echo” (removes the delayed audio signal) “for a current speaker” (for the first user) “from the final mixed audio” (to generate a first processed signal)); generating, by a signal mixer, a second processed signal by mixing the first processed signal with a non-delayed input audio signal from the first user (¶ 111 lines 6-7: “The bit-stream multiplexer (504) mixes the audio stream” (the second processed signal generated from mixing the first processed signal with a non-delayed input audio signal from the first user, because according to ¶ 111 lines 1-3 the “mixed audio stream” is formed by mixing “audio from different calling connections” which includes the “first participant” or sometimes referred to as “first speaker” (the first user input audio) included in the “mixed audio” (which comprises the first processed signal)), and outputting, by the signal mixer, the second processed signal (¶ 111 last S: “The mixed audio stream” (the resulting second processed signal) “is transmit” (is outputted) “to the participants” (e.g., including the “first participant” (first user)).

GABA et al. do not specifically disclose: the non-delayed input audio signal acting as a sidetone for the first user. PEHRSSON et al. do teach: the non-delayed input audio signal acting as a sidetone for the first user (in a “conference call” (Col. 10 line 4) according to Col. 10 lines 15+ teach: “signals originate from different users” (a plurality of users) “generating by said summator a sum signal” (signals are summed or mixed) “of said first and second electronic signals for transmission to a transmitter/receiver (10) of said portable communication device, and generating a first sidetone” (wherein a non-delayed audio signal of each user acts as a sidetone) “of said first electronic signal for transmission” (upon output) “to first and second earphone outputs (13a,13b; 14), and a second sidetone of the second electronic signal for transmission to said earphone outputs (13a,13b; 14)”).

It would have therefore been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the “summator” module of PEHRSSON et al. into the “multiplexer” which “mixes the audio stream” of GABA et al., as doing so would enable the combined systems and their associated methods to perform in combination as they do separately, and would further enable the “caller[s]” of GABA et al. to get a sense of the volume of their own voice without generating echo of the user’s voice from the speaker back into the microphone receiving the user’s voice.

Regarding claim 13, GABA et al. do teach the method according to claim 12 further comprising determining, by a measuring unit, an audio feature of the delayed audio signal of the first user relative to the playback audio signal or relative to the non-delayed input audio signal from the first user (¶ 111 lines 4-5: “the enhancement bit-steam generator” (a measuring unit) “remove echo” (to determine a feature of the audio signal including its delayed signal component) “for a current speaker” (for the first user) “from the final mixed audio”).

Regarding claim 14, GABA et al.
do not specifically disclose the method according to claim 13 further comprising adjusting, by a first amplifier, the audio feature of the non-delayed input audio signal based on a relative level to the playback signal, wherein a first input of the first amplifier is coupled to the measuring unit and a second input of the first amplifier is coupled to the non-delayed input audio signal from the first user, and an output of the first amplifier is coupled to the signal mixer.

PEHRSSON et al. do teach the method according to claim 13 further comprising adjusting, by a first amplifier, the audio feature of the non-delayed input audio signal based on a relative level to the playback signal, wherein a first input of the first amplifier is coupled to the measuring unit and a second input of the first amplifier is coupled to the non-delayed input audio signal from the first user, and an output of the first amplifier is coupled to the signal mixer (Col. 5 lines 42+ referring to Fig. 2: “As described above the CODEC block 1 has a speech decoder path, which operates simultaneously and independently from the speech encoder paths for duplex operation. The PCM signal accepted by the PCM input 12 from the DSP 8 is transmitted to a PCM voice decoder 22. The output signal from the decoder 22 is fed through a programmable volume control 23. Further, the volume-controlled signal is fed through a filter block 24, comprising a receiver low-pass filter, a digital-to-analogue converter for converting the PCM decoded signal into an analogue signal. Before input to a first earphone amplifier 25 for the fixed earphone 15 and/or to a second earphone amplifier 26 for the handsfree earphone 16, the signal passes a receive programmable gain stage (PGA) in the filter block 24, enabling adjustment of the circuit for different sensitivity of the earphone(s) and is spread in the path” (the entire set of Fig. 2 including the “amplifier 25” functions as a first amplifier which in one instance comprises the “volume control 23” (a first input) and the “SIDE TONE” “30” (a second input), which are mixed as a signal mixer, and the “programmable gain stage (PGA) in the filter block 24” (an output), such that the “volume control 23” receives the “output signal from the decoder” (a non-delayed input audio signal) and itself functions as a measuring and “control” (adjusting) unit for “volume” (an audio feature), plus an input from the “SIDE TONE” “30” (a delayed audio signal), which are together fed into the “FILTER DAC” or “block 24” (the output) which “enabl[es] adjustment” (i.e., adjusts the two through “volume” (relative level)).

It would have therefore been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the functions of the “conference call” management of PEHRSSON et al. into the “group call” of GABA et al., as doing so would enable the combined systems and their associated methods to perform in combination as they do separately, and would further enable GABA et al. to support “enabling adjustment” “for different [audio] sensitivity” as disclosed in GABA et al. ¶ 0028 last sentence.

Regarding claim 15, GABA et al. do not specifically disclose the method according to claim 14 wherein the audio feature is a volume or a frequency of the delayed audio signal. PEHRSSON et al. do teach the method according to claim 14 wherein the audio feature is a volume or a frequency of the delayed audio signal (¶ 0028 lines 6+: “The output signal from the decoder 22 is fed through a programmable volume control 23. Further, the volume-controlled signal” (“volume” audio feature is determined) “is fed through a filter block 24”). For obviousness to combine GABA et al. and PEHRSSON et al. see claim 14.

Regarding claim 16, GABA et al.
do teach the method according to claim 12 further comprising inputting, by a microphone, an input audio signal from the first user (¶ 68: “FIGS. 1a-1c are example illustrations in which an electronic device (100) handles a group call, according to an embodiment as disclosed herein. The electronic device (100) can be, for example, but not limited to a cellular phone, a smart phone” (i.e., devices that comprise a microphone; see PEHRSSON et al. ¶ 0002: “A conventional mobile phone is used for a single user and, consequently, the mobile phone has a single microphone for receiving sound waves of human voice of the user of the phone and a single loudspeaker for reproducing sound for the user”)).

Regarding claim 17, GABA et al. do teach the method according to claim 16 further comprising outputting, by a speaker, the second processed audio signal to the first user (¶ 68: “FIGS. 1a-1c are example illustrations in which an electronic device (100) handles a group call, according to an embodiment as disclosed herein. The electronic device (100) can be, for example, but not limited to a cellular phone, a smart phone” (i.e., devices that comprise speakers for, e.g., outputting the second processed audio signal; see PEHRSSON et al. ¶ 0002: “A conventional mobile phone is used for a single user and, consequently, the mobile phone has a single microphone for receiving sound waves of human voice of the user of the phone and a single loudspeaker for reproducing sound for the user”)).

Regarding claim 18, GABA et al. do not specifically disclose the method according to claim 17 further comprising coupling an analog-to-digital converter to the microphone and coupling a digital-to-analog converter to the speaker. PEHRSSON et al. do teach the method according to claim 17 further comprising coupling an analog-to-digital converter to the microphone and coupling a digital-to-analog converter to the speaker (¶ 0033: “The output of the amplifier comprises the analogue sum signal. The signal is filtered through a filter block 28, which comprises a transmit programmable gain stage (TX PGA), an analogue-to-digital converter (ADC)” (an analog-to-digital converter) “and a transmit band pass filter, which converts the analogue signal from the microphone” (coupled to the microphone) “to a digital signal and filters”; ¶ 0028, lines 51+: “a digital-to-analogue converter” (a digital-to-analog converter) “for converting the PCM decoded signal into an analogue signal. Before input to a first earphone” (coupled to a speaker)). For obviousness to combine GABA et al. and PEHRSSON et al., see claim 14.

Regarding claim 19, GABA et al. do not specifically disclose the method according to claim 14 further comprising adjusting, by a second amplifier, an audio feature of the delayed input audio signal in the mixed playback signal. PEHRSSON et al. do teach the method according to claim 14 further comprising adjusting, by a second amplifier, an audio feature of the delayed input audio signal in the mixed playback signal (Col. 5, lines 53+: “Before input to a first earphone amplifier 25 for the fixed earphone 15 and/or to a second earphone amplifier 26 for the handsfree earphone 16, the signal passes a receive programmable gain stage (PGA) in the filter block 24”; there are two “amplifiers,” “25” and “26” (a second amplifier), and they are coupled to the “programmable gain stage” (a “volume control” for adjusting an audio feature (Col. 5, line 49)) in a mixed signal comprising one path going to the “first earphone” and one going into the “second earphone”). For obviousness to combine GABA et al. and PEHRSSON et al., see claim 14.

Regarding claim 20, GABA et al. do not specifically disclose the method according to claim 12 further comprising generating, by a recording device, the playback signal. PEHRSSON et al.
do teach the method according to claim 12 further comprising generating, by a recording device, the playback signal (Abstract: “A portable communication device for conference calls, comprising at least first and second speech encoder paths with first and second inputs” (from input to) “for connection to first and second recording devices” (recording device) “respectively, and at least one output” (to generate playback) “connected to signal processor means for receiving and processing first and second electronic signals from said recording devices via said first and second speech encoder paths for transmission to a transmitter/receiver operatively connected to said signal processor means, wherein said apparatus is adapted to receive said first and second electronic signals simultaneously even if the signals are different, and said apparatus comprises a summator adapted to sum said first and second electronic signals into a sum signal for transmission to said transmitter/receiver”).

It would have therefore been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the functions of the “conference call” management of PEHRSSON et al. into the “group call” of GABA et al., as doing so would enable the combined systems and their associated methods to perform in combination as they do separately, and would further enable GABA et al. to acquire a recording function so as to be able to use an input a plurality of times should it become necessary for further processing.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to FARZAD KAZEMINEZHAD, whose telephone number is (571) 270-5860. The examiner can normally be reached 10:30 am to 11:30 pm. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Paras D. Shah, can be reached at (571) 270-1650. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Farzad Kazeminezhad/
Art Unit 2653
March 21, 2026
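The signal path the rejection maps onto the claimed "first amplifier" (non-delayed input plus a delayed "SIDE TONE" copy, mixed at adjustable relative levels) can be sketched as a simple digital sidetone mixer. This is an illustrative sketch only; the delay line, gain values, and function names are assumptions for clarity, not structures taken from GABA or PEHRSSON.

```python
from collections import deque

def make_sidetone_mixer(delay_samples, sidetone_gain, playback_gain):
    """Illustrative sidetone mixer: a delayed copy of the user's own
    input is level-adjusted and mixed into the playback signal.
    All names here are hypothetical, not from the cited references."""
    # A fixed-length FIFO of zeros models the delay line feeding
    # the "SIDE TONE" input of the mixer.
    delay_line = deque([0.0] * delay_samples, maxlen=delay_samples)

    def process(mic_sample, playback_sample):
        delayed = delay_line[0]        # oldest sample = delayed input audio
        delay_line.append(mic_sample)  # push the current (non-delayed) input
        # Adjust the sidetone level relative to the playback level.
        return playback_gain * playback_sample + sidetone_gain * delayed

    return process

mix = make_sidetone_mixer(delay_samples=2, sidetone_gain=0.5, playback_gain=1.0)
out = [mix(m, p) for m, p in zip([1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0])]
# out == [0.0, 0.0, 0.5, 0.0]: the input impulse reappears two samples
# later, attenuated by the sidetone gain.
```

The two gain parameters play the role of the "relative level" adjustment the claims recite; a hardware PGA would perform the same scaling in the analog or PCM domain.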

Prosecution Timeline

Jul 02, 2024
Application Filed
Mar 21, 2026
Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12603080
GAZE-BASED AND AUGMENTED AUTOMATIC INTERPRETATION METHOD AND SYSTEM
2y 5m to grant Granted Apr 14, 2026
Patent 12592242
MACHINE LEARNING (ML) BASED EMOTION, IDENTITY AND VOICE CONVERSION IN AUDIO USING VIRTUAL DOMAIN MIXING AND FAKE PAIR-MASKING
2y 5m to grant Granted Mar 31, 2026
Patent 12586596
SYSTEM AND METHOD FOR BACKGROUND NOISE SUPPRESSION BY PROJECTING AN INPUT AUDIO INTO A HIGHER DIMENSION SPACE
2y 5m to grant Granted Mar 24, 2026
Patent 12555587
APPARATUS AND METHOD FOR ENCODING AN AUDIO SIGNAL USING AN OUTPUT INTERFACE FOR OUTPUTTING A PARAMETER CALCULATED FROM A COMPENSATION VALUE
2y 5m to grant Granted Feb 17, 2026
Patent 12537019
ACTIVITY CHARTING WHEN USING PERSONAL ARTIFICIAL INTELLIGENCE ASSISTANTS INCLUDING DIFFERENTIATING A PATIENT FROM A DIFFERENT PERSON BASED ON AUDIO ASSOCIATED WITH TOILETTING
2y 5m to grant Granted Jan 27, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
71%
Grant Probability
99%
With Interview (+67.2%)
3y 6m
Median Time to Grant
Low
PTA Risk
Based on 534 resolved cases by this examiner. Grant probability derived from career allow rate.
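The headline numbers above follow from the career figures shown (379 granted of 534 resolved, +67.2% interview lift). A minimal sketch of that arithmetic, assuming the "with interview" figure is the base rate scaled by the lift and capped at 99% (the cap is my assumption, not a documented methodology):

```python
# Career allow rate from the figures shown on this page.
granted, resolved = 379, 534
base = granted / resolved                    # ~0.71 career allow rate
grant_probability = round(base * 100)        # 71 -> "Grant Probability 71%"

# "+67.2% Interview Lift": scale the base rate, cap at 99% (assumed cap).
interview_lift = 0.672
with_interview = min(base * (1 + interview_lift), 0.99)
interview_probability = round(with_interview * 100)  # 99 -> "With Interview 99%"
```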
