DETAILED ACTION
This action is in response to the initial filing of application no. 18/762,455 on 07/02/2024.
Claims 1-20 are pending in this application, with claims 1 and 11 being independent.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Allowable Subject Matter
Claims 3 and 13 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. The prior art fails to teach or suggest the following limitations in reasonable combination: wherein: generating a time measurement for the reduced noise audio beam includes counting by a counter a number of frames of the reduced noise audio beam that includes the speech component, wherein the counting includes: incrementing the counter by one or more in response to determining that the reduced noise audio beam includes the speech component in a current frame of the reduced noise audio beam; and decrementing the counter by one or more in response to determining that the reduced noise audio beam does not include the speech component in the current frame of the reduced noise audio beam; and generating the gain for the audio beam corresponding to the reduced noise audio beam includes reducing the gain towards zero based on the counter being at zero.
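For illustration only, the counter-based logic recited above may be sketched as follows; the function name, step size, and decay factor are hypothetical choices, since the claim recites incrementing and decrementing "by one or more" and does not prescribe how the gain is reduced towards zero:

```python
def update_counter_and_gain(counter, has_speech, gain, step=1, decay=0.9):
    # Increment the counter when the current frame of the reduced noise
    # audio beam includes the speech component; otherwise decrement it
    # (floored at zero here, an illustrative choice).
    if has_speech:
        counter += step
    else:
        counter = max(0, counter - step)
    # Reduce the gain towards zero based on the counter being at zero.
    if counter == 0:
        gain *= decay
    return counter, gain

# Two frames of speech followed by four frames of silence:
counter, gain = 0, 1.0
for speech in [True, True, False, False, False, False]:
    counter, gain = update_counter_and_gain(counter, speech, gain)
```

In this sketch the gain decays geometrically only once the counter has returned to zero, so a brief speech pause does not immediately attenuate the beam.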
Claims 5 and 15 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. The prior art fails to teach or suggest the following limitations in reasonable combination: wherein for each reduced noise audio beam of the plurality of reduced noise audio beams, calculating the SNR of the reduced noise audio beam includes: calculating a first signal level of the audio beam corresponding to the reduced noise audio beam; calculating a second signal level of the reduced noise audio beam; calculating the noise level as a difference between the first signal level and the second signal level; and calculating a ratio of the second signal level to the noise level as the SNR.
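For illustration only, the claimed SNR computation can be sketched as follows; mean power is used here as the signal "level", a hypothetical choice since the claim does not specify a particular level metric:

```python
def beam_snr(audio_beam, reduced_noise_beam):
    # First signal level: level of the audio beam corresponding to the
    # reduced noise audio beam (mean power, illustrative choice).
    first = sum(x * x for x in audio_beam) / len(audio_beam)
    # Second signal level: level of the reduced noise audio beam itself.
    second = sum(x * x for x in reduced_noise_beam) / len(reduced_noise_beam)
    # Noise level as the difference between the first and second levels.
    noise = first - second
    # SNR as the ratio of the second signal level to the noise level.
    return second / noise

snr = beam_snr([2.0, 2.0, 2.0], [1.0, 1.0, 1.0])  # → 1/3
```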
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
Claims 2 and 12 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 2 and 12 recite the limitation "the speech component". There is insufficient antecedent basis for this limitation in the claims.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 1, 4, 7, 8, 10, 11, 14, 17, 18 and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Allen (US 9,532,138) in view of Mansour et al. (US 10,657,981) (“Mansour”).
For claim 1, Allen discloses a method of audio mixing (Abstract), comprising: receiving a plurality of audio beams (Fig. 3B, 200) (the beams generated by the beamformers include unidirectional beams and omnidirectional beams), wherein the audio beams are generated from a plurality of audio signals from a microphone array (Fig. 3B, 100) (column 5 line 28 – column 6 line 20); for each audio beam of the plurality of audio beams: calculating a signal characteristic (estimated acoustic energies) of the audio beam (parameter estimation, Fig. 3B, 301) (column 6 lines 21 – 63); determining, based on the plurality of signal characteristics of the plurality of audio beams, one or more audio beams that include a speech component (one or more beams are selected based on the acoustic energies, and a volume of speech received by each of the selected beams is identified; column 6 line 60 – column 7 line 30; claim 28); and for each audio beam of the plurality of audio beams, generating a gain for the audio beam based on the determination (column 7 lines 40 – 53; claim 28).
Yet, Allen fails to teach the following: generating a reduced noise audio beam from the audio beam by reducing a noise component of the audio beam; and further performing the calculating and determining steps using the reduced noise audio beam.
However, Mansour discloses a system and method for producing output audio (Abstract), comprising the following: reduced noise audio beams (ANC outputs, Fig. 8, 835) are generated from audio beams (FBF outputs, Fig. 8, 825) by reducing a noise component (column 18 lines 47 – 52; column 19 lines 28 - 51; column 20 lines 1 – 16); and the reduced noise audio beams are provided to a beam selector (Fig. 8A, 840) for further processing (column 20 lines 32 - 41).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to improve Allen’s invention in the same way that Mansour’s invention has been improved to achieve the following predictable results, for the purpose of removing device-produced sounds/noise to improve the ability to recognize user speech in received microphone signals (Mansour, column 1 lines 5 – 20): further providing a module, e.g. an adaptive noise cancellation component, to reduce noise in an acoustic signal; further generating reduced noise audio beams from the audio beams by reducing a noise component of the audio beams; and further processing the reduced noise audio beams, e.g. by performing the calculating and determining steps using the reduced noise audio beams.
For claims 4 and 14, Allen and Mansour further disclose wherein for each reduced noise audio beam of the plurality of reduced noise audio beams, the signal characteristic of the reduced noise audio beam includes one of: a signal level (Allen, acoustic energy), wherein the signal level indicates an instantaneous signal power of the reduced noise audio beam (Allen, column 6 lines 30 – 59) (Mansour, column 20 lines 32 - 41); or a signal-to-noise ratio (SNR), wherein the SNR indicates a ratio between the signal level of the reduced noise audio beam and a noise level of the noise component of the audio beam corresponding to the reduced noise audio beam (Mansour, column 20 lines 1 - 41).
For claims 7 and 17, Allen and Mansour further disclose calculating a direction of arrival (DOA) of audio to the microphone array based on the one or more reduced noise audio beams that include the speech component (Allen, the acoustic energies for unidirectional beams are estimated; a unidirectional beam “acts as a spatial filter that passes acoustic energy from some spatial directions while filtering out acoustic energy from other directions. By forming a beam that points at or near a desired source of acoustic energy (e.g., a person who is speaking), the desired acoustic energy of the speaker may be passed by the spatial filter implemented by a beam while acoustic energy from noise sources or reflections of the desired source may be rejected or attenuated. In this manner, audio quality of the communication device may be improved.”; column 1 lines 23 - 25; column 2 lines 18 – 27; column 6 lines 20 – 63) (Mansour, column 18 lines 47 – 52; column 19 lines 28 - 51; column 20 lines 1 – 16 and 32 – 41); and generating a control signal to control one or more of an audio unit (Allen, the selected beam is the control signal to control the beam mixer to generate an audio output; column 6 line 60 – column 7 line 1; column 7 line 60 – column 8 line 26) or a video unit based on the DOA.
For claims 8 and 18, Allen further discloses mixing the plurality of audio beams to generate a mixed audio signal (Allen, audio out, Fig. 3B and Fig. 4), wherein mixing the plurality of audio beams includes: for each audio beam of the plurality of audio beams, multiplying the audio beam with the gain for the audio beam to generate a processed audio beam (Allen, Fig. 4, 401 and 402; column 7 line 60 – column 8 line 11; column 8 lines 13 - 30); and combining the plurality of processed audio beams to generate the mixed audio signal (Allen, column 8 lines 8 – 13).
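For illustration only, the mixing recited for claims 8 and 18 amounts to a gain-weighted sum of the beams, which might be sketched as follows (function and variable names are hypothetical):

```python
def mix_beams(beams, gains):
    # Multiply each audio beam with its gain to generate a processed
    # audio beam, then combine the processed beams sample by sample
    # into the mixed audio signal.
    processed = [[g * s for s in beam] for beam, g in zip(beams, gains)]
    return [sum(samples) for samples in zip(*processed)]

mixed = mix_beams([[1.0, 2.0], [3.0, 4.0]], [0.5, 1.0])  # → [3.5, 5.0]
```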
For claims 10 and 20, Allen and Mansour further disclose: for each audio beam of the plurality of audio beams: receiving audio at one or more microphones of the microphone array (Allen, Fig. 3B, 100; column 5 lines 28 – 48) (Mansour, column 18 lines 43 - 51); for each microphone of the one or more microphones, generating an audio signal from the audio received at the microphone (Allen, column 5 lines 28 - 48) (Mansour, column 18 lines 43 - 51), wherein the plurality of audio signals includes the audio signal (Allen, column 5 lines 28 - 48) (Mansour, column 18 lines 43 - 51); and generating, by a fixed beamformer (Allen, beamformers, Fig. 3B, 200) (Mansour, FBF, Fig. 8A, 820), the audio beam from the one or more audio signals (Allen, column 5 line 49 – column 6 line 20) (Mansour, column 19 lines 28 - 51).
For claim 11, Allen discloses an audio mixing system (Abstract), comprising components to perform operations comprising: receiving a plurality of audio beams (Fig. 3B, 200) (the beams generated by the beamformers include unidirectional beams and omnidirectional beams), wherein the audio beams are generated from a plurality of audio signals from a microphone array (Fig. 3B, 100) (column 5 line 28 – column 6 line 20); for each audio beam of the plurality of audio beams: calculating a signal characteristic (estimated acoustic energies) of the audio beam (parameter estimation, Fig. 3B, 301) (column 6 lines 21 – 63); determining, based on the plurality of signal characteristics of the plurality of audio beams, one or more audio beams that include a speech component (one or more beams are selected based on the acoustic energies, and a volume of speech received by each of the selected beams is identified; column 6 line 60 – column 7 line 30; claim 28); and for each audio beam of the plurality of audio beams, generating a gain for the audio beam based on the determination (column 7 lines 40 – 53; claim 28).
Yet, Allen fails to teach the following: the components further comprise a processing system and a memory storing instructions that, when executed by the processing system, cause the audio mixing system to perform the operations; a reduced noise audio beam is generated from the audio beam by reducing a noise component of the audio beam; and further performing the calculating and determining steps using the reduced noise audio beam.
However, Mansour discloses a system and method for producing output audio (Abstract), comprising the following: reduced noise audio beams (ANC outputs, Fig. 8, 835) are generated from audio beams (FBF outputs, Fig. 8, 825) by reducing a noise component (column 18 lines 47 – 52; column 19 lines 28 - 51; column 20 lines 1 – 16); the reduced noise audio beams are provided to a beam selector (Fig. 8A, 840) for further processing (column 20 lines 32 - 41); and a method is performed by a processing system executing instructions stored in a memory (column 31 line 65 – column 32 line 35).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to improve Allen’s invention in the same way that Mansour’s invention has been improved to achieve the following predictable results, for the purpose of removing device-produced sounds/noise to improve the ability to recognize user speech in received microphone signals (Mansour, column 1 lines 5 – 20): further providing a module, e.g. an adaptive noise cancellation component, to reduce noise in an acoustic signal; further generating reduced noise audio beams from the audio beams by reducing a noise component of the audio beams; further processing the reduced noise audio beams, e.g. by performing the calculating and determining steps using the reduced noise audio beams; and further performing the operations by a processing system executing instructions stored in a memory.
Claim(s) 2 and 12 is/are rejected under 35 U.S.C. 103 as being unpatentable over Allen (US 9,532,138) in view of Mansour et al. (US 10,657,981) (“Mansour”) and further in view of Angkitirakul et al. (US 2020/0402499) (“Angkitirakul”).
For claims 2 and 12, the combination of Allen and Mansour further discloses the following: for each reduced noise audio beam of the plurality of reduced noise audio beams, determining that the reduced noise audio beam includes the speech component (Allen, column 7 lines 38 - 49) (Mansour, column 8 line 62 – column 9 line 6; column 20 lines 32 - 41), wherein the gain for the audio beam corresponding to the reduced noise audio beam is generated based on the determination (Allen, column 7 lines 38 - 49) (Mansour, column 20 lines 32 - 41).
Yet, the combination of Allen and Mansour fails to teach that the determination includes generating a time measurement.
However, Angkitirakul discloses a system and method for detecting speech activity in a real-time audio signal (Abstract), comprising the following: a time measurement (speech start point and speech end point) is generated to determine that a reduced noise audio signal includes a speech component ([0015], [0017] – [0026], [0030], [0031], [0033]).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to modify the combined teachings of Allen and Mansour with Angkitirakul’s teachings so that the speech component is further determined by generating a time measurement (a start point and an end point) for the purpose of accurately ascertaining speech to perform speech recognition (Allen, column 1 line 24 – column 2 line 27) (Mansour, column 1 lines 18 – 40 and column 2 lines 23 - 31) (Angkitirakul, [0001]).
Claim(s) 6, 9, 16 and 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Allen (US 9,532,138) in view of Mansour et al. (US 10,657,981) (“Mansour”) and further in view of Zhang et al. (“DeepANC: A deep learning approach to active noise control”).
For claims 6 and 16, the combination of Allen and Mansour further discloses the following: wherein for each audio beam of the plurality of audio beams, reducing the noise component of the audio beam includes: inputting the audio beam to a noise reduction unit (Mansour, ANC, Fig. 8A, 830) dedicated to processing the audio beam (Mansour, column 18 lines 47 – 52; column 19 lines 28 - 51; column 20 lines 1 – 16); and denoising the audio beam to generate the reduced noise audio beam by the noise reduction unit (Mansour, column 18 lines 47 – 52; column 19 lines 28 - 51; column 20 lines 1 – 16).
Yet, the combination of Allen and Mansour fails to teach that the noise reduction unit includes a recurrent neural network configured to receive samples of the audio beam based on a frequency spectrum of the audio beam.
However, Zhang discloses a method for controlling noise (Abstract), comprising the following: a recurrent neural network unit (a convolutional recurrent network, CRN) receives samples of an audio signal based on a frequency spectrum of the audio signal (Figure 2; 1. Introduction and 3. Deep ANC method, pg. 1 - 4).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to improve the invention disclosed by the combination of Allen and Mansour in the same way that Zhang’s invention has been improved to achieve the following predictable results, for the purpose of suppressing noise in audio signals received by microphones of an audio input device to improve speech recognition (Allen, column 1 line 14 – column 2 line 3) (Zhang, 1. Introduction, pg. 1 and 2): the noise reduction unit further includes a recurrent neural network configured to receive samples of the audio beam based on a frequency spectrum of the audio beam.
For claims 9 and 19, the combination of Allen and Mansour further discloses reducing a noise in the mixed audio signal by a noise reduction unit (Mansour, ANC, Fig. 8A, 830) to generate an output audio signal (Allen, column 6 lines 20 – 29; column 7 line 38 – column 8 line 33) (Mansour, column 18 lines 47 – 52; column 19 lines 28 - 51; column 20 lines 1 – 16).
Yet, the combination of Allen and Mansour fails to teach that the noise reduction unit includes a neural network.
However, Zhang discloses a method for controlling noise (Abstract), comprising the following: a recurrent neural network unit (CRN) receives samples of an audio signal based on a frequency spectrum of the audio signal and reduces the noise in those samples (Figure 2; 1. Introduction and 3. Deep ANC method, pg. 1 - 4).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of applicant’s filing to improve the invention disclosed by the combination of Allen and Mansour in the same way that Zhang’s invention has been improved to achieve the following predictable results, for the purpose of suppressing noise in audio signals received by microphones of an audio input device to improve speech recognition (Allen, column 1 line 14 – column 2 line 3) (Zhang, 1. Introduction, pg. 1 and 2): the noise reduction unit further includes a recurrent neural network configured to reduce noise (echo) in the mixed audio signal.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SONIA L GAY whose telephone number is (571)270-1951. The examiner can normally be reached Monday-Friday 9-5 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn can be reached at 571-272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SONIA L GAY/ Primary Examiner, Art Unit 2657