DETAILED ACTION
This Office action is in response to the communication filed on 12/09/2025. Claims 1-2, 4-7, 9-12, 14-17, and 19-20 are currently pending, wherein claims 1 and 11 have been amended.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendments/Arguments
Applicant’s arguments with respect to the rejection of pending claims 1-2, 4-7, 9-12, 14-17 and 19-20 under §103 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Applicant's amendment and arguments filed 12/09/2025 have been fully considered but are not persuasive. Applicant argues, on page 12, last paragraph, that “Lynch merely shows amplitude differences between 2-5 kHz, which do not overlap with 1 kHz and 10 kHz,” and therefore that the claimed “multiband triplets” are not obvious in view of Lynch. The Examiner respectfully disagrees.
Figure 6 of Lynch clearly illustrates, albeit for a single type of face mask, the amplitude difference caused by the face mask in a frequency range of 0.2 kHz to 7.4 kHz, which substantially overlaps the claimed range of 1 to 10 kHz. In addition, ¶[0018] specifically discloses compensating the masked speech signals “by selectively amplifying the frequencies in speech based on how much respective frequencies are affected by a face covering.” Accordingly, the training signals, such as that shown in Fig. 6, inherently reflect the amount, in dB, by which the amplitude of the speech is affected by each type of face covering. The specific compensation amounts recited in amended claims 1 and 11 are therefore an inherent function of the mask type and an obvious result of the teachings of Lynch, as the specific examples described in Lynch clearly establish a range of 0 to 10 dB over a frequency range of 0.2 to 8 kHz, which clearly overlaps the claimed values. Further, the Examiner notes that the specification fails to describe or suggest that the specific attenuations noted in the specification are anything other than a function/result of the different types of commonly worn masks.
Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 1, 2, 4-7, 11, 12 and 14-17 are rejected under 35 U.S.C. 103 as being unpatentable over Lynch et al. (US 2022/0343934 A1; herein “Lynch”).
Regarding claims 1 and 11, Lynch teaches a processing method of a sound signal and an apparatus comprising a memory for storing a code (Fig. 8, Memory Device 806); and a processor (Fig. 8, Processing System 803), coupled to the memory, and for loading the code, wherein the processor performs a method comprising:
receiving the sound signal (Fig. 5, user comms. 501; and ¶[0033] teaches “…user system 402 captures user communications 501 at step 1…User communications 501 at least includes audio captured of user 422 speaking…”);
identifying a respirator type according to a sound feature of the sound signal, wherein the respirator type is a type of a respirator corresponding to the sound signal (Fig. 5, step 3; ¶[0034] teaches “Communication session system 401 recognizes, at step 3, that user 422 is wearing face covering 431 when generating user communications 501 (i.e., when speaking).” and ¶[0021] teaches “Compensator 121… may determine a face covering of face coverings 131’s type (e.g., cloth mask, paper mask, plastic face shield, etc.) is positioned over user 141’s mouth…may recognize a particular attenuation pattern in audio of user 141 speaking that indicates a face covering is present”),
wherein identifying the respirator type according to a sound feature of the sound signal comprises:
distinguishing different respirator types based on an attenuation amplitude of the sound signal (¶[0021] teaches “Compensator 121 may determine that face covering 131 specifically is positioned over user 141's mouth (as opposed to another face covering), may determine a face covering of face covering 131's type (e.g., cloth mask, paper mask, plastic face shield, etc.) is positioned over user 141's mouth…may recognize a particular attenuation pattern in audio of user 141 speaking…” and ¶[0024] teaches “the predefined adjustments may include amounts for specific types of face coverings if compensator 121 determined a specific type for face covering 191. For instance, the amount in which the amplitude for a set of frequencies are adjusted may be different in the predefined amounts depending on the type of face covering 131”) at a frequency band between 2 and 5 kHz in a frequency domain (Fig. 6 and ¶[0039] teach that the mask effect on amplitude is greatest between 2 and 7 kHz), wherein the sound feature comprises the attenuation amplitude, different respirator types have differences in the attenuation amplitude between 2 and 5 kHz in the frequency domain (¶[0018] teaches “the compensation…accounts for the non-linear effects by selectively amplifying the frequencies in speech based on how much respective frequencies are affected by a face covering” and Fig. 6 teaches that the attenuation effect from the face covering is greatest between 2 and 5 kHz), and distinguishing different respirator types based on the attenuation amplitude of the sound signal at the frequency band between 2 and 5 kHz in the frequency domain comprises:
modifying the sound signal according to the respirator type (¶[0023] teaches “Compensator 121 adjusts the respective amplitudes of the affected frequencies to levels (or at least closer to the levels) that the amplitudes would have been had user 141 not been wearing face covering 131” and ¶[0024] teaches “the predefined adjustments may include amounts for specific types of face coverings…the amount in which the amplitude for a set of frequencies are adjusted may be different … depending on the type of face covering 131”),
wherein modifying the sound signal according to the respirator type comprises: superimposing a compensation signal, with compensation values between 0 dB and 10 dB corresponding to the attenuation amplitude between 1 and 10 kHz in the frequency domain, and the sound signal (¶[0024] teaches “the predefined adjustments may include amounts for specific types of face coverings…the amount in which the amplitude for a set of frequencies are adjusted may be different … depending on the type of face covering 131.” In addition, Fig. 6 and ¶¶[0038]-[0039] teach that the amplitude difference in the 2 to 5 kHz frequency range varies from 0 dB to 10 dB).
Although Lynch ¶[0028] teaches “Compensator 121 compares reference audio 301 to training audio 302 at step 3 to determine how much the frequencies of the user 141’s speech are attenuated in training audio 302 due to face covering 131…Compensator 121 then uses the differences in amplitudes across at least the range of frequencies typical for human speech (e.g., roughly 125 Hz to 8000 Hz) to create a profile at step 4 that user 141 can enable when wearing face covering 131. The profile indicates to compensator 121 frequencies and amounts in which those frequencies should be amplified in order to compensate for user 141 wearing face covering” and ¶[0021] teaches determining the covering type based on the attenuation pattern, Lynch fails to explicitly disclose distinguishing different respirator types at 4 kHz with at least a 2 dB difference, or the specific compensation values of superimposing 0.5 dB at 1 kHz, 3 dB at 4 kHz, and 2.5 dB at 10 kHz for the sound signal in response to identifying a first respirator type; superimposing 0.5 dB at 1 kHz, 10 dB at 4 kHz, and 10 dB at 10 kHz for the sound signal in response to identifying a second respirator type; and superimposing 0 dB at 1 kHz, 5 dB at 4 kHz, and 3 dB at 10 kHz for the sound signal in response to identifying a third respirator type. However, the specifically claimed adjustment values overlap with the ranges disclosed in Lynch, and the respective attenuation is an inherent function of the type of mask used; thus the claimed compensation values are obvious in view of Lynch. See MPEP 2144.05: “In the case where the claimed ranges ‘overlap or lie inside ranges disclosed by the prior art’ a prima facie case of obviousness exists. In re Wertheim, 541 F.2d 257, 191 USPQ 90 (CCPA 1976).”
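For illustration only, and not as part of Lynch's disclosure or the claim language, the following sketch shows one way a per-band compensation profile, expressed as dB gains at a few anchor frequencies, could be superimposed on a sound signal in the frequency domain. The mask labels, anchor frequencies, and gain values are hypothetical examples chosen to mirror the triplets discussed above.

```python
# Illustrative sketch only (hypothetical profiles, not Lynch's implementation).
import numpy as np

# Hypothetical per-mask profiles: {frequency in Hz: compensation gain in dB}
PROFILES = {
    "first_type":  {1000: 0.5, 4000: 3.0,  10000: 2.5},
    "second_type": {1000: 0.5, 4000: 10.0, 10000: 10.0},
    "third_type":  {1000: 0.0, 4000: 5.0,  10000: 3.0},
}

def compensate(signal: np.ndarray, sample_rate: int, mask_type: str) -> np.ndarray:
    """Boost the signal by the profile's dB values, interpolated across frequency."""
    profile = PROFILES[mask_type]
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    # Linearly interpolate the dB gain curve between the anchor frequencies.
    anchor_f = np.array(sorted(profile))
    anchor_db = np.array([profile[f] for f in anchor_f])
    gain_db = np.interp(freqs, anchor_f, anchor_db)
    spectrum *= 10.0 ** (gain_db / 20.0)  # dB -> linear amplitude factor
    return np.fft.irfft(spectrum, n=len(signal))
```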
Regarding claims 2 and 12, Lynch teaches all of the elements of claims 1 and 11 (see detailed element listing above). In addition, Lynch further teaches modifying the sound signal according to the respirator type comprises:
obtaining the compensation signal corresponding to the respirator type (¶[0024] teaches “the amounts in which certain frequencies should be adjusted may be predetermined within compensator 121…the predefined adjustments may include amounts for specific types of face coverings…”); and
modifying the sound signal according to the compensation signal (¶[0023] teaches “…compensator 121 adjusts amplitudes of frequencies in audio 111 to compensate for face covering 131 (203)…Compensator 121 adjusts the respective amplitudes of the affected frequencies to levels (or at least closer to the levels) that the amplitudes would have been had user 141 not been wearing face covering 131”).
Regarding claims 4 and 14, Lynch teaches all of the elements of claims 2 and 12 (see detailed element listing above). In addition, Lynch further teaches modifying the sound signal according to the compensation signal comprises: setting the compensation signal to zero in response to an absence of the respirator (¶[0038] teaches “communication session system 401 may stop adjusting the audio in user communications 501 because there is no longer a face covering for which to compensate”).
Regarding claims 5 and 15, Lynch teaches all of the elements of claims 2 and 12 (see detailed element listing above). In addition, Lynch further teaches obtaining an original signal, wherein the original signal is the sound signal generated without the respirator (Fig. 3, Ref. Audio 301 and ¶[0027] teaches “compensator 121 receives, via microphone 122, reference audio 301 from user 141 at step 1 while user 141 is not wearing a face covering of any kind”);
obtaining a training signal, wherein the training signal is the sound signal generated through the respirator of the respirator type (Fig. 3, Training Audio 302 and ¶[0027] teaches “…Compensator 121 then receives, via microphone 122, training audio 302 at step 2 while user 141 is wearing face covering 131…”); and
getting the compensation signal according to a difference between the original signal and the training signal (Fig. 3, step 3 and ¶[0028] teaches “Compensator 121 compares reference audio 301 to training audio 302 at step 3 to determine how much the frequencies of user 141’s speech are attenuated in training audio 302 due to face covering 131”).
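For illustration only, and not as part of Lynch's disclosure or the claim language, the following sketch shows one way a compensation profile could be derived from the difference between an unmasked reference recording and a masked training recording, in the spirit of the reference/training comparison characterized above. The band edges are hypothetical examples.

```python
# Illustrative sketch only (hypothetical band layout, not Lynch's implementation).
import numpy as np

def band_level_db(signal: np.ndarray, sample_rate: int, f_lo: float, f_hi: float) -> float:
    """Mean spectral magnitude inside [f_lo, f_hi), expressed in dB."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    band = spectrum[(freqs >= f_lo) & (freqs < f_hi)]
    return 20.0 * np.log10(np.mean(band) + 1e-12)

def build_profile(reference: np.ndarray, masked: np.ndarray, sample_rate: int) -> dict:
    """Return {band name: attenuation in dB}, i.e., how much each band lost through the mask."""
    bands = {"low (0.2-2 kHz)": (200, 2000),
             "mid (2-5 kHz)": (2000, 5000),
             "high (5-10 kHz)": (5000, 10000)}
    return {name: band_level_db(reference, sample_rate, lo, hi)
                  - band_level_db(masked, sample_rate, lo, hi)
            for name, (lo, hi) in bands.items()}
```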
Regarding claims 6 and 16, Lynch teaches all of the elements of claims 1 and 11 (see detailed element listing above). In addition, Lynch further teaches identifying the respirator type comprises: identifying the respirator type of the respirator in an image (¶[0040] teaches “In this example, user 741 is operating user system 701 on a real-time video communication session with one or more other endpoints and captures video 721, which includes a video image of user 741, at step 1. In this example, user 741 is wearing face covering 731 in video 721 and user system 701 identifies that fact at step 2.”).
Regarding claims 7 and 17, Lynch teaches all of the elements of claims 6 and 16 (see detailed element listing above). In addition, Lynch teaches identifying the respirator type comprises: identifying the respirator type through a classifier (¶[0021] teaches the compensator “may recognize a particular attenuation pattern in audio of user 141 speaking that indicates a face covering is present”; pattern recognition is interpreted as classification). However, Lynch fails to disclose that the classifier is trained based on a machine learning algorithm.
Official Notice is taken that classifiers trained based on machine learning algorithms to classify audio signals are well known in the art as evidenced by US 2022/0246146 A1 to Nouri (See ¶[0028] “The task classifier 142 classifies a task into one or more predefined classes using acoustic features…of the received audio data…the task classifier 142 may use a machine learning model to classify the task”).
Therefore, it would have been obvious to one having ordinary skill in the art, before the effective filing date of the invention, to utilize a machine learning classifier to classify the type of respirator based on the attenuation pattern, as it is merely the substitution of a well-known process to achieve the predictable result of classifying the audio/sound signals.
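For illustration only, and not as part of Lynch's or Nouri's disclosure, the following sketch shows a machine-learning classifier trained to label a respirator type from band-attenuation features. The feature layout (attenuation in dB at 1, 4, and 10 kHz), training values, and labels are hypothetical examples.

```python
# Illustrative sketch only (hypothetical features and labels).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Each row: [attenuation_dB @ 1 kHz, @ 4 kHz, @ 10 kHz]; labels: respirator type.
X_train = np.array([[0.5, 3.0, 2.5],
                    [0.5, 10.0, 10.0],
                    [0.0, 5.0, 3.0]])
y_train = ["first_type", "second_type", "third_type"]

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

def identify_respirator_type(attenuation_db: np.ndarray) -> str:
    """Predict the respirator type from a measured attenuation triplet."""
    return clf.predict(attenuation_db.reshape(1, -1))[0]

print(identify_respirator_type(np.array([0.4, 9.5, 9.0])))  # -> "second_type"
```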
Claims 9, 10, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Lynch as applied to claims 1 and 11 above, and further in view of Ostrand et al. (US 2022/0199102 A1; herein “Ostrand”).
Regarding claims 9 and 19, Lynch teaches all of the elements of claims 1 and 11 (see detailed element listing above). However, Lynch fails to teach determining whether the sound signal is a registered signal according to a modified sound signal.
Ostrand teaches a method and system for enhancing user-specific audio signals during a conversation. More specifically, Ostrand teaches determining whether the sound signal is a registered signal according to a modified sound signal (under a broadest reasonable interpretation, “a registered signal” is interpreted as a signal corresponding to a specific user, i.e., a voiceprint; Ostrand ¶[0019] teaches “This speaker-specific acoustic model may be specifically tuned to that particular user’s vocal characteristics…”, ¶[0024] teaches “the acoustic model in some embodiments may be based on the acoustic voiceprint (profile) of a user”, and ¶[0074] teaches “this acoustic profile 499a may be optimized to selectively identify and/or isolate a full dynamic range of the user’s voice from a recording stream”).
Lynch differs from the claimed invention, as defined in claims 9 and 19, in that Lynch fails to specifically disclose determining whether the modified sound signal is a registered signal, i.e., performing speaker identification/verification. Performing speaker identification in response to receiving a sound signal is well known in the art, as evidenced by Ostrand. Therefore, it would have been obvious to one having ordinary skill in the art to modify the system taught by Lynch to perform speaker identification/recognition on the sound signal as taught by Ostrand, as it merely constitutes the combination of known processes to achieve the predictable result of allowing speaker-specific models and/or signal processing to be performed.
Regarding claims 10 and 20, the combination of Lynch and Ostrand teaches all of the elements of claims 9 and 19 (see detailed element listing above). In addition, Ostrand teaches determining whether the sound signal is the registered signal according to the modified sound signal comprises:
generating a registered acoustic model of the registered signal according to a first acoustic feature of a registered sound signal (¶[0019] teaches “the user may begin by recording their voice…creating clean target recording or set of records…then analyze the clean recording to generate one or more parameters for a voice profile” and ¶[0024] teaches “…these features may include, without limitation: pitch variations and perturbations…Mel Frequency Cepstral Coefficients (MFCCs)…”);
generating a tested acoustic model of the sound signal according to a second acoustic feature of the modified sound signal (¶[0019] teaches “…then build the speaker-specific acoustic model of the user’s voice using the one or more parameters”); and
determining whether the sound signal is the registered signal according to a comparison result between the registered acoustic model and the tested acoustic model (¶[0020] teaches “Once the acoustic model is built and/or trained, the same features for creating the models may be extracted and processed from the future live audio and/or audiovisual streams.” and ¶[0074] teaches “this acoustic profile 499a may be optimized to selectively identify and/or isolate a full dynamic range of the user’s voice from a recording stream.” Identifying the user’s voice is interpreted as comparing the user-specific acoustic model and the audio from the recording stream),
wherein the sound signal is determined to be the registered signal in response to the comparison result showing the registered acoustic model is the same as the tested acoustic model (¶[0074] teaches “this acoustic profile 499a may be optimized to selectively identify and/or isolate a full dynamic range of the user’s voice from a recording stream.” Identifying the user’s voice is interpreted as showing the audio in the recording stream is the same as the user-specific acoustic model).
The rationale to combine Lynch and Ostrand set forth above with respect to claims 9 and 19 applies equally to claims 10 and 20.
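For illustration only, and not as part of Ostrand's disclosure or the claim language, the following sketch shows one way a registered voiceprint could be compared against features extracted from a (mask-compensated) test signal, in the spirit of the registered/tested acoustic model comparison discussed above. The use of mean MFCC vectors and a cosine-similarity threshold are assumptions made for this sketch.

```python
# Illustrative sketch only (hypothetical voiceprint representation and threshold).
import numpy as np
import librosa

def voiceprint(signal: np.ndarray, sample_rate: int) -> np.ndarray:
    """Mean MFCC vector used here as a simple acoustic model of the speaker."""
    mfcc = librosa.feature.mfcc(y=signal, sr=sample_rate, n_mfcc=20)
    return mfcc.mean(axis=1)

def is_registered(registered_print: np.ndarray, test_signal: np.ndarray,
                  sample_rate: int, threshold: float = 0.9) -> bool:
    """Accept the speaker when the cosine similarity of voiceprints exceeds the threshold."""
    test_print = voiceprint(test_signal, sample_rate)
    similarity = np.dot(registered_print, test_print) / (
        np.linalg.norm(registered_print) * np.linalg.norm(test_print))
    return similarity >= threshold
```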
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PENNY L CAUDLE whose telephone number is (703)756-1432. The examiner can normally be reached M-Th 8:00 am to 5:00 pm eastern.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn can be reached at 571-272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/PENNY L CAUDLE/Examiner, Art Unit 2657