Last updated: May 29, 2026
Application No. 18/482,978
AUDIO FILTER SYSTEM FOR A VEHICLE

Non-Final OA §102 §103 §112
Filed
Oct 09, 2023
Examiner
SERRAGUARD, SEAN ERIN
Art Unit
2657
Tech Center
2600 — Communications
Assignee
GM Global Technology Operations LLC
OA Round
2 (Non-Final)
Interview Optional

Examiner has a relatively high allowance rate (70%) and a +33.0% interview lift. A written response may suffice.
Based on 142 resolved cases, 2023–2026
Examiner Intelligence

SERRAGUARD, SEAN ERIN
Career Allowance Rate: 70% (above average); 99 granted / 142 resolved; +7.7% vs TC avg
Interview Lift: +33.0% (strong), comparing resolved cases with and without an interview
Avg Prosecution (typical timeline): 3y 0m; 23 currently pending
Total Applications (career history): 180, across all art units
Statute-Specific Performance

§101: 0.5% (-39.5% vs TC avg)
§103: 95.0% (+55.0% vs TC avg)
§102: 1.4% (-38.6% vs TC avg)
§112: 2.9% (-37.1% vs TC avg)
Based on career data from 142 resolved cases. The Tech Center average is an estimate.
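The dashboard figures above can be cross-checked with a few lines of arithmetic. The sketch below assumes that "vs TC avg" means the examiner's rate minus the Tech Center average in percentage points; the page does not state this, so the implied baselines are an inference, not reported data.

```python
# Figures copied from the dashboard above.
granted, resolved = 99, 142
allowance_rate = 100 * granted / resolved  # percent

# Assumption: "vs TC avg" = examiner rate minus Tech Center average (percentage points).
implied_tc_allowance = round(70.0 - 7.7, 1)

# (examiner rate, difference vs TC avg) per statute, as listed on the page.
statute_rows = {
    "§101": (0.5, -39.5),
    "§103": (95.0, 55.0),
    "§102": (1.4, -38.6),
    "§112": (2.9, -37.1),
}
implied_tc_by_statute = {
    statute: round(rate - delta, 1) for statute, (rate, delta) in statute_rows.items()
}

print(f"allowance rate: {allowance_rate:.1f}%")
print(f"implied TC allowance average: {implied_tc_allowance}%")
print(implied_tc_by_statute)
```

Under that assumption, 99 of 142 is consistent with the displayed 70% after rounding, and every statute row implies the same Tech Center baseline, which fits the page's description of the Tech Center average as an estimate.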
Office Action

§102 §103 §112
DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
All objections/rejections not mentioned in this Office Action have been withdrawn by the Examiner.

Status of the Claims 
Prior to entry of the amendment(s) and/or consideration of the argument(s), the status of the claims is as follows. 
Claim(s) 1-20 is/are pending. 
Claim(s) 2 and 8-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 1-4, 8-11, and 15-18 are rejected under 35 U.S.C. 103 as being unpatentable over Lu (U.S. Pat. App. Pub. No. 2018/0144759, hereinafter Lu) in view of Furuta (U.S. Pat. App. Pub. No. 2022/0208206, hereinafter Furuta).
Claims 5-7, 12-14, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Lu and Furuta as applied to claims 1, 4, 8, 11, 15, and 18 above, and further in view of Wang (U.S. Pat. App. Pub. No. 2017/0365273, hereinafter Wang).

Response to Amendments 
Applicant’s amendment filed on 24 October 2025 has been entered. 
In view of the amendment to the claim(s), the amendments of claim(s) 1-6 and 8-20 have been acknowledged and entered.  
In view of the amendment to claim(s) 2 and 8-20, the rejection of claim(s) 2 and 8-20 under 35 U.S.C. §112 is withdrawn. 
In view of the amendment to claim(s) 1-6 and 8-20, the rejection of claims 1-20 under 35 U.S.C. §103 is withdrawn.
In light of the amended claims, new grounds for rejection under 35 U.S.C. §102 and 35 U.S.C. §103 are provided in the action below. 

Response to Arguments
Applicant’s arguments regarding the prior art rejections under 35 U.S.C. §103, see pages 7-8 of the Response to Non-Final Office Action dated 07 August 2025, which was received on 24 October 2025 (hereinafter Response and Office Action, respectively), have been fully considered.
With respect to the rejection(s) of claim(s) 1, 8, and 15 under 35 U.S.C. §103 in light of Lu in view of Furuta, applicant asserts that the cited references fail to teach or suggest all limitations of the claims as amended. Applicant’s arguments in light of the amended claims are persuasive. As such, the rejections of claims 1, 8, and 15 under 35 U.S.C. §103 are withdrawn.
Applicant further argues that the rejection(s) of dependent claims 2-7, 9-14, and 16-20 should be withdrawn for at least the same reasons as independent claims 1, 8, and 15. Applicant’s arguments in light of the amended claims are persuasive. As such, the rejections of claims 2-7, 9-14, and 16-20 under 35 U.S.C. §103 are withdrawn.
However, upon further consideration, new ground(s) of rejection under 35 U.S.C. §103 are made in light of combinations of Wang and newly cited reference Rigaud-Maazaoui (FR 3113537 A1, hereinafter Rigaud-Maazaoui).
The Applicant has not provided any further statement; the Examiner therefore directs the Applicant to the rationale below.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim(s) 1-3, 8-10, and 15-17 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Rigaud-Maazaoui.

Regarding claim 1, Rigaud-Maazaoui discloses A computer-implemented method (Systems and methods for “reducing multi-channel noise”; Rigaud-Maazaoui, ¶ [0001] (Title)) executed by data processing hardware that causes the data processing hardware to perform operations to design an audio filter comprising (“The invention also relates to a computer program product comprising software instructions which, when executed by a computer, implement a method.”; Rigaud-Maazaoui, ¶ [0034]):
receiving, from each sensor of a sensor array of a plurality of sensors, a respective audio signal including a target audio signal and an interference audio signal (Discloses an “array of microphones 16” which are “positioned in the form of a network in the passenger compartment” each receiving an audio signal comprising a voice portion (speech source) and a noise portion, where “The speech source 12 is for example a person speaking inside the passenger compartment”; Rigaud-Maazaoui, ¶ [0049]-[0052]);
processing the respective audio signals using a short-time Fourier transform (STFT) (“The upstream processing module 36 is configured to receive M acquired signals” from each of the “microphones 16” in the array of microphones, and “apply a Fourier transform to them” where since “the signals acquired belong to the discrete domain, the Fourier transform is advantageously a short-term Fourier transform”; Rigaud-Maazaoui, ¶ [0115], [0130]);
generating an output vector for each of the respective processed audio signals via a set of beamformers (After the “short-term Fourier transform” the “upstream frequency processing module is then configured to output M frequency signals” and “The third determination module 40 is connected... at the output of the upstream processing module 36” and “is configured to calculate the spatial filters” and “to determine the spatially filtered signal by application, to each of the frequency signals of the respective spatial gain and by adding the signals obtained after application of the spatial filters... also known as beamforming”; Rigaud-Maazaoui, ¶ [0126], [0130]);
estimating a noise variance for each of the respective processed audio signals (“From the previously estimated reference noises, the estimation module 24 is configured to estimate a power spectral density (ΦU(k,l)) of the set of reference noises (U0(k,l), U1(k,l),..., UM-1(k,l)) arranged in a column vector and a power spectral density (Φx(k,l)) of the set of in-phase signals (X0(k,l), X1(k,l),..., XM-1(k,l))” where power spectral density of the set of reference noises is the frequency domain variance of said reference noises and, as shown, the estimation is performed for each of the processed audio signals; Rigaud-Maazaoui, ¶ [0071]);
extracting a speech energy for a respective speaker corresponding to each of the respective processed audio signals using the generated output vector from the beamformers (“the estimation module 24 is configured to estimate a power spectral density (Φs(k,l)) of the voice part and a power spectral density (ΦN(k,l)) of the noise N, from the power spectral density (ΦU(k,l)) of the reference noise set (U0(k,l), U1(k,l),..., UM-1(k,l)) and the power spectral density (Φx(k,l)) of the in-phase signal set (X0(k,l), X1(k,l),..., XM-1(k,l))” where (Φx(k,l)) is calculated using X, which is the generated output vector from the beamformers; Rigaud-Maazaoui, ¶ [0073], [0075], [0077]);
determining a prior-signal-to-noise ratio (prior-SNR) using the generated output vectors, the estimated noise variance, and the speech energy for each of the respective processed audio signals (“The first calculation module 26 is also configured to calculate an a priori signal to noise ratio. The a priori signal-to-noise ratio corresponds to a smoothed version of the a posteriori signal-to-noise ratio, taking into consideration the gain calculated by the second calculation module 28 at the previous time frame index. The first calculation module 26 is configured to calculate the a priori signal to noise ratio” where the a posteriori SNR is calculated “from the power spectral density of the vocal part”, or (Φs(k,l)), “and power spectral density of the noise” or (ΦN(k,l)), each of which being calculated based on the output vectors; Rigaud-Maazaoui, ¶ [0075], [0077], [0082]-[0085]); and
determining a gain value of the audio filter based on the prior-SNR (“The second calculation module 28 is configured to calculate the variable gain”, also referred to as the Optimally Modified Log-Spectral Amplitude or G_OM-LSA, “from the a posteriori signal-to-noise ratio and the a priori signal-to-noise ratio”; Rigaud-Maazaoui, ¶ [0013], [0089]).
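The last two limitations of claim 1 (a prior-SNR that smooths the a posteriori SNR using the previous frame's gain, then a gain derived from it) can be illustrated with a minimal sketch. It uses the widely known decision-directed recursion and a plain Wiener gain as simplified stand-ins; it is not the OM-LSA gain named in Rigaud-Maazaoui and not the application's own formulation. The function name, the smoothing factor `alpha`, the gain floor, the initial state, and the test signal are all assumptions made for illustration.

```python
def enhance_bin_track(noisy_power, noise_var, alpha=0.98, gain_floor=0.1):
    """Per-frame gains for one frequency bin (hypothetical helper).

    noisy_power: |Y(k, l)|^2 for successive time frames l
    noise_var:   estimated noise variance for this bin (assumed known here)
    """
    gains = []
    prev_gain, prev_post = 1.0, 1.0  # assumed initial state
    for power in noisy_power:
        post_snr = power / noise_var  # a posteriori SNR
        # Decision-directed a priori SNR: previous frame's clean-speech
        # estimate blended with the current excess over the noise level.
        prior_snr = (alpha * prev_gain ** 2 * prev_post
                     + (1.0 - alpha) * max(post_snr - 1.0, 0.0))
        # Wiener gain, floored to limit musical-noise artifacts.
        gain = max(prior_snr / (1.0 + prior_snr), gain_floor)
        gains.append(gain)
        prev_gain, prev_post = gain, post_snr
    return gains

noise_only = [1.0] * 20  # power equal to the noise variance
speech = [25.0] * 20     # strong speech on top of the noise
gains = enhance_bin_track(noise_only + speech, noise_var=1.0)
print(gains[19], gains[-1])
```

In this toy run the gain settles at the floor during the noise-only frames and rises toward 1 once speech is present, which is the qualitative behavior the cited passages describe: noise attenuated, voice part preserved.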

Regarding claim 2, Rigaud-Maazaoui discloses further including enhancing the target audio signal using the audio filter and attenuating the interference audio signals (Discloses that “the vocal part is highlighted/enhanced thanks to the OM-LSA processing” where the system calculates the clean speech energy through “estimation of the power spectral density of the vocal part” using equation 4 and calculates the gain G_OM-LSA, which is used to calculate G_Lisse (see [0142]-[0143]), and the “calculation module 38 applies the gain (G_lisse) to the signal...by multiplying” and resulting in “the noise N {interference audio signals} is strongly attenuated”; Rigaud-Maazaoui, ¶ [0077], [0089], [0142]-[0143], [0147], [0151], [0158]). 

Regarding claim 3, Rigaud-Maazaoui discloses wherein processing the respective processed audio signals using STFT includes determining a timeframe index and a frequency bin index (Discloses the “processing of acquired signals… to obtain M frequency signals (Z0(k,l))… where k is a frequency index and l is a time frame index.” Further, examiner takes official notice that the generation of short blocks (time frames) and the analysis of the frequency content of each block (frequency bins) is the fundamental purpose of performing a STFT.; Rigaud-Maazaoui, ¶ [0010], [0055]).
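The relationship between an STFT and the two indices recited in claim 3 can be shown with a small sketch. It computes a windowed transform with a direct DFT so that it needs only the standard library; the frame length, hop size, Hann window, and test tone are illustrative assumptions, not parameters taken from Rigaud-Maazaoui or the application.

```python
import cmath
import math

def stft(signal, frame_len=64, hop=32):
    """Return Z[l][k]: l is the time frame index, k the frequency bin index."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # non-negative frequencies of a real signal
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        block = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = [
            sum(block[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len))
            for k in range(n_bins)
        ]
        frames.append(spectrum)
    return frames

# A pure tone centred on bin 8 of a 64-sample frame.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
Z = stft(tone)
peak_bins = [max(range(len(frame)), key=lambda k: abs(frame[k])) for frame in Z]
print(len(Z), len(Z[0]), peak_bins)
```

Each entry `Z[l][k]` corresponds to one (time frame, frequency bin) cell, matching the (k, l) indexing quoted from the reference.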

Regarding claim 8, Rigaud-Maazaoui discloses An audio filter system for a vehicle, the audio filter system comprising (Systems and methods for “reducing multi-channel noise” for use in a “motor vehicle”; Rigaud-Maazaoui, ¶ [0001] (Title)):
data processing hardware; and memory hardware in communication with the data processing hardware (The systems and methods may be embodied in “an electronic device” also referred to as a computing device or computer “for reducing noise in an audio signal, the audio signal comprising at least one noise a voice part and being able to be received by several microphones” and can be part of a computer program. Where a computer running a computer program is well understood in the art as including memory hardware (e.g., RAM, flash memory, disk drives, etc.); Rigaud-Maazaoui, ¶ [0002]-[0003]), the memory hardware storing instructions that, when executed on the data processing hardware, cause the data processing hardware to perform operations to design an audio filter comprising (“a computer program comprising software instructions which, when executed by a computer, implement such a noise reduction method.”; Rigaud-Maazaoui, ¶ [0003]):
receiving, from each sensor of a sensor array of a plurality of sensors, a respective audio signal including a target audio signal and an interference audio signal (Discloses an “array of microphones 16” which are “positioned in the form of a network in the passenger compartment” each receiving an audio signal comprising a voice portion (speech source) and a noise portion, where “The speech source 12 is for example a person speaking inside the passenger compartment”; Rigaud-Maazaoui, ¶ [0049]-[0052]);
determining a prior-signal-to-noise ratio (prior-SNR) for each of the respective audio signals (“The first calculation module 26 is also configured to calculate an a priori signal to noise ratio. The a priori signal-to-noise ratio corresponds to a smoothed version of the a posteriori signal-to-noise ratio, taking into consideration the gain calculated by the second calculation module 28 at the previous time frame index. The first calculation module 26 is configured to calculate the a priori signal to noise ratio” where the a posteriori SNR is calculated “from the power spectral density of the vocal part”, or (Φs(k,l)), “and power spectral density of the noise” or (ΦN(k,l)), each of which being calculated based on the output vectors for each of the audio signals; Rigaud-Maazaoui, ¶ [0075], [0077], [0082]-[0085]);
designing the audio filter using the determined prior-SNRs (“The second calculation module 28 is configured to calculate the variable gain”, also referred to as the Optimally Modified Log-Spectral Amplitude or G_OM-LSA, “from the a posteriori signal-to-noise ratio and the a priori signal-to-noise ratio”; Rigaud-Maazaoui, ¶ [0013], [0089]); and
enhancing the target audio signal using the interference audio signals and the designed audio filter by attenuating the interference audio signals using the designed audio filter (Discloses that “the vocal part is highlighted/enhanced thanks to the OM-LSA processing” where the system calculates the clean speech energy through “estimation of the power spectral density of the vocal part” using equation 4 and calculates the gain G_OM-LSA, which is used to calculate G_Lisse (see [0142]-[0143]), and the “calculation module 38 applies the gain (G_lisse) to the signal...by multiplying” and resulting in “the noise N {interference audio signals} is strongly attenuated”; Rigaud-Maazaoui, ¶ [0077], [0089], [0142]-[0143], [0147], [0151], [0158]).

Regarding claim 9, Rigaud-Maazaoui discloses further including processing the respective audio signals using a short-time Fourier transform (STFT) (“The upstream processing module 36 is configured to receive M acquired signals” from each of the “microphones 16” in the array of microphones, and “apply a Fourier transform to them” where since “the signals acquired belong to the discrete domain, the Fourier transform is advantageously a short-term Fourier transform”; Rigaud-Maazaoui, ¶ [0115], [0130]) and generating an output vector for each of the respective audio signals via a set of beamformers (After the “short-term Fourier transform” the “upstream frequency processing module is then configured to output M frequency signals” and “The third determination module 40 is connected... at the output of the upstream processing module 36” and “is configured to calculate the spatial filters” and “to determine the spatially filtered signal by application, to each of the frequency signals of the respective spatial gain and by adding the signals obtained after application of the spatial filters... also known as beamforming”; Rigaud-Maazaoui, ¶ [0126], [0130]). 
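The beamforming step quoted above (apply a spatial gain to each microphone's frequency signal, then add the results) can be sketched for a single time-frequency cell. The example uses simple delay-and-sum weights as an assumed stand-in for whatever spatial filters the reference computes; the microphone delays, the bin frequency, and the helper names are hypothetical. Running several weight sets over the same cell yields one value per beamformer, which is one way to read "output vector ... via a set of beamformers".

```python
import cmath
import math

def steering_weights(delays_s, freq_hz):
    """Delay-and-sum weights: one complex weight per microphone at one frequency."""
    m = len(delays_s)
    return [cmath.exp(-2j * math.pi * freq_hz * d) / m for d in delays_s]

def beamform_bin(mic_bins, weights):
    """Filter-and-sum for one (k, l) cell: weight each microphone's bin, then add."""
    return sum(w.conjugate() * z for w, z in zip(weights, mic_bins))

# Hypothetical 3-microphone example with known propagation delays.
delays = [0.0, 1e-4, 2e-4]  # seconds, assumed known
freq = 1000.0               # Hz, centre frequency of the bin
source = 2.0 + 1.0j         # source spectrum value in this cell
mics = [source * cmath.exp(-2j * math.pi * freq * d) for d in delays]

# Two beamformers: one steered at the source, one steered broadside.
weight_sets = [steering_weights(delays, freq),
               steering_weights([0.0, 0.0, 0.0], freq)]
output_vector = [beamform_bin(mics, w) for w in weight_sets]
print(len(output_vector), abs(output_vector[0] - source) < 1e-9)
```

The beamformer steered at the source recovers the source value, while the mis-steered one returns a reduced magnitude, so each element of the output vector reflects a different look direction.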

Regarding claim 10, Rigaud-Maazaoui discloses wherein processing the respective audio signals using STFT includes determining a timeframe index and a frequency bin index (Discloses the “processing of acquired signals… to obtain M frequency signals (Z0(k,l))… where k is a frequency index and l is a time frame index.” Further, examiner takes official notice that the generation of short blocks (time frames) and the analysis of the frequency content of each block (frequency bins) is the fundamental purpose of performing a short-time Fourier transform (STFT).; Rigaud-Maazaoui, ¶ [0010], [0055]).

Regarding claim 15, Rigaud-Maazaoui discloses A computer-implemented method (Systems and methods for “reducing multi-channel noise”; Rigaud-Maazaoui, ¶ [0001] (Title)) executed by data processing hardware that causes the data processing hardware to perform operations to design an audio filter comprising (“The invention also relates to a computer program product comprising software instructions which, when executed by a computer, implement a method.”; Rigaud-Maazaoui, ¶ [0034]):
receiving, from each sensor of a sensor array of a plurality of sensors, a respective audio signal including a target audio signal and an interference audio signal (Discloses an “array of microphones 16” which are “positioned in the form of a network in the passenger compartment” each receiving an audio signal comprising a voice portion (speech source) and a noise portion, where “The speech source 12 is for example a person speaking inside the passenger compartment”; Rigaud-Maazaoui, ¶ [0049]-[0052]);
determining a prior-signal-to-noise ratio (prior-SNR) for each of the respective audio signals (“The first calculation module 26 is also configured to calculate an a priori signal to noise ratio. The a priori signal-to-noise ratio corresponds to a smoothed version of the a posteriori signal-to-noise ratio, taking into consideration the gain calculated by the second calculation module 28 at the previous time frame index. The first calculation module 26 is configured to calculate the a priori signal to noise ratio” where the a posteriori SNR is calculated “from the power spectral density of the vocal part”, or (Φs(k,l)), “and power spectral density of the noise” or (ΦN(k,l)), each of which being calculated based on the output vectors for each of the audio signals; Rigaud-Maazaoui, ¶ [0075], [0077], [0082]-[0085]);
designing the audio filter using the determined prior-SNRs (“The second calculation module 28 is configured to calculate the variable gain”, also referred to as the Optimally Modified Log-Spectral Amplitude or G_OM-LSA, “from the a posteriori signal-to-noise ratio and the a priori signal-to-noise ratio”; Rigaud-Maazaoui, ¶ [0013], [0089]); and
enhancing the target audio signal using the interference audio signals and the designed audio filter and attenuating the interference audio signals (Discloses that “the vocal part is highlighted/enhanced thanks to the OM-LSA processing” where the system calculates the clean speech energy through “estimation of the power spectral density of the vocal part” using equation 4 and calculates the gain G_OM-LSA, which is used to calculate G_Lisse (see [0142]-[0143]), and the “calculation module 38 applies the gain (G_lisse) to the signal...by multiplying” and resulting in “the noise N {interference audio signals} is strongly attenuated”; Rigaud-Maazaoui, ¶ [0077], [0089], [0142]-[0143], [0147], [0151], [0158]).

Regarding claim 16, Rigaud-Maazaoui discloses further including processing the respective audio signals using a short-time Fourier transform (STFT) (“The upstream processing module 36 is configured to receive M acquired signals” from each of the “microphones 16” in the array of microphones, and “apply a Fourier transform to them” where since “the signals acquired belong to the discrete domain, the Fourier transform is advantageously a short-term Fourier transform”; Rigaud-Maazaoui, ¶ [0115], [0130]) and generating an output vector for each of the respective audio signals via beamformers (After the “short-term Fourier transform” the “upstream frequency processing module is then configured to output M frequency signals” and “The third determination module 40 is connected... at the output of the upstream processing module 36” and “is configured to calculate the spatial filters” and “to determine the spatially filtered signal by application, to each of the frequency signals of the respective spatial gain and by adding the signals obtained after application of the spatial filters... also known as beamforming”; Rigaud-Maazaoui, ¶ [0126], [0130]). 

Regarding claim 17, Rigaud-Maazaoui discloses wherein processing the respective audio signals using the STFT includes determining a timeframe index and a frequency bin index (Discloses the “processing of acquired signals… to obtain M frequency signals (Z0(k,l))… where k is a frequency index and l is a time frame index.” Further, examiner takes official notice that the generation of short blocks (time frames) and the analysis of the frequency content of each block (frequency bins) is the fundamental purpose of performing a STFT.; Rigaud-Maazaoui, ¶ [0010], [0055]).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 4-7, 11-14, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Rigaud-Maazaoui as applied to claims 1, 8, and 15 above, and further in view of Wang.

Regarding claim 4, the rejection of claim 1 is incorporated. Rigaud-Maazaoui discloses all of the elements of the current invention as stated above. However, Rigaud-Maazaoui fails to expressly recite wherein generating the output vector for each of the respective processed audio signals via the beamformers includes providing a number of speakers and generating dimensions for the output vector based on the provided number of speakers.
Wang discloses systems and methods of “audio source separation from audio content.” (Wang, ¶ [0002]). Regarding claim 4, Wang teaches wherein generating the output vector for each of the respective processed audio signals via the beamformers includes providing a number of speakers (Discloses that the system receives “a set of source settings” including a “predetermined source number”; Wang, ¶ [0050], [0068]) and generating dimensions for the output vector based on the provided number of speakers (“from knowledge of the predetermined source number, an initialized matrix of spatial parameters… may be constructed” where the source number is at least applied for the determination of the “underdetermined mode” and the “overdetermined mode”, as well as for the number of separated sources in the output vector; Wang, ¶ [0049], [0051], [0071]).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the multi-channel noise reduction of Rigaud-Maazaoui, to incorporate the teachings of Wang to include wherein generating the output vector for each of the respective processed audio signals via the beamformers includes providing a number of speakers and generating dimensions for the output vector based on the provided number of speakers. Wang discloses “a solution for audio source separation by jointly taking advantage of both additive source modeling and independent/uncorrelated source modeling” such that “perceptually natural audio sources are obtained while enabling a stable and rapid convergence” which can be readily applied “in any application areas which require audio source separation for mixed signal processing and analysis” and “enable[s] dealing with highly non-stationary sources with stable convergence, including fast moving objects, [and] time-varying sounds”, which, in the context of Rigaud-Maazaoui, would allow the system to adapt for different vehicle occupancy levels without processing empty channels, as recognized by Wang. (Wang, ¶ [0037], [0041]).
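The idea mapped to claim 4, that a provided source count fixes the dimensions of the separation state and of the output vector, can be sketched as follows. The helper name, the uniform initial values, and the mode thresholds are assumptions: the thresholds follow the common convention that a mixture with more sources than channels is underdetermined, and are not taken from Wang's text.

```python
def init_separation(n_channels, n_sources):
    """Size the separation state from a predetermined source count (hypothetical)."""
    if n_channels < 1 or n_sources < 1:
        raise ValueError("need at least one channel and one source")
    # Assumed convention: compare source count with channel count.
    if n_sources > n_channels:
        mode = "underdetermined"
    elif n_sources == n_channels:
        mode = "determined"
    else:
        mode = "overdetermined"
    # Spatial parameters: one row per channel, one column per source.
    spatial = [[1.0 / n_channels] * n_sources for _ in range(n_channels)]
    # One output slot per source, so the vector's dimension follows the count.
    output = [0j] * n_sources
    return mode, spatial, output

mode, spatial, output = init_separation(n_channels=4, n_sources=2)
print(mode, len(spatial), len(spatial[0]), len(output))
```

With four channels and two provided sources, the sketch allocates a 4 x 2 spatial matrix and a two-element output vector, so no slots are reserved for absent speakers.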

Regarding claim 5, the rejection of claim 4 is incorporated. Rigaud-Maazaoui discloses all of the elements of the current invention as stated above. However, Rigaud-Maazaoui fails to expressly recite wherein determining the prior-SNR includes calculating an individual prior-SNR for each of the respective processed audio signals and estimating the prior-SNR as a joint prior-SNR of each individual prior-SNR.
The relevance of Wang is described above with relation to claim 4. Regarding claim 5, Wang teaches wherein determining the prior-SNR includes calculating an individual prior-SNR for each of the respective processed audio signals (Discloses a method to estimate the power for each individual audio source, including determining the “power spectrum matrix of the audio sources to be separated... represented as ∑_s,fn”, where the matrix contains the power spectrum “∑_j”{speech energy} for each audio source j. As well, the system models and estimates the noise power, where a term b_f,n represents the additive noise and the system estimates its power, represented as “Λ_b,f”; Wang, ¶ [0049], [0079], [0083]) and estimating the prior-SNR as a joint prior-SNR of each individual prior-SNR (The spatial parameters of the audio sources are “jointly determined” using an “expectation maximization iterative process” which is a well-known method for jointly estimating parameters in a system with unknown variables. The system is configured to “jointly estimate the source spatial parameters and spectral parameters”; Wang, ¶ [0047], [0056], [0178]).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the multi-channel noise reduction of Rigaud-Maazaoui, to incorporate the teachings of Wang to include wherein determining the prior-SNR includes calculating an individual prior-SNR for each of the respective processed audio signals and estimating the prior-SNR as a joint prior-SNR of each individual prior-SNR. Wang discloses “a solution for audio source separation by jointly taking advantage of both additive source modeling and independent/uncorrelated source modeling” such that “perceptually natural audio sources are obtained while enabling a stable and rapid convergence” which can be readily applied “in any application areas which require audio source separation for mixed signal processing and analysis” and “enable[s] dealing with highly non-stationary sources with stable convergence, including fast moving objects, [and] time-varying sounds”, which, in the context of Rigaud-Maazaoui, would allow the system to adapt for different vehicle occupancy levels without processing empty channels, as recognized by Wang. (Wang, ¶ [0037], [0041]).

Regarding claim 6, the rejection of claim 1 is incorporated. Rigaud-Maazaoui discloses all of the elements of the current invention as stated above. However, Rigaud-Maazaoui fails to expressly recite wherein estimating the noise variance for each of the respective processed audio signals includes expressing each noise variance as a respective linear equation for each generated output vector from the beamformers.
The relevance of Wang is described above with relation to claim 4. Regarding claim 6, Wang teaches wherein estimating the noise variance for each of the respective processed audio signals includes expressing each noise variance as a respective linear equation (Discloses a “mixing model” for audio content that includes a term “bf,n” which “represents the additive noise” where the power of the noise signal is the parameter to be estimated. Further, the system includes a linear model for the covariance matrix, as shown in equation 7, which is a linear equation as it describes a linear relationship between the covariance matrices. The variable “Λ_b,f” represents the noise signal, which is the noise variance to be estimated.; Wang, ¶ [0049], [0079], [0090], [0097]) for each generated output vector from the beamformers (The Expectation-Maximization iterative process is designed to solve the unified model to find all unknown parameters, including both noise power (Λb,f) and source power (C_Sf,n); Wang, ¶ [0055]-[0056], [0090], [0109]).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the multi-channel noise reduction of Rigaud-Maazaoui, to incorporate the teachings of Wang to include wherein estimating the noise variance for each of the respective processed audio signals includes expressing each noise variance as a respective linear equation for each generated output vector from the beamformers. Wang discloses “a solution for audio source separation by jointly taking advantage of both additive source modeling and independent/uncorrelated source modeling” such that “perceptually natural audio sources are obtained while enabling a stable and rapid convergence” which can be readily applied “in any application areas which require audio source separation for mixed signal processing and analysis” and “enable[s] dealing with highly non-stationary sources with stable convergence, including fast moving objects, [and] time-varying sounds”, which, in the context of Rigaud-Maazaoui, would allow the system to adapt for different vehicle occupancy levels without processing empty channels, as recognized by Wang. (Wang, ¶ [0037], [0041]).

Regarding claim 7, the rejection of claim 6 is incorporated. Rigaud-Maazaoui discloses all of the elements of the current invention as stated above. However, Rigaud-Maazaoui fails to expressly recite wherein extracting the speech energy for the respective speaker includes determining the speech energy using the respective linear equation for each generated output vector from the beamformers.
The relevance of Wang is described above with relation to claim 4. Regarding claim 7, Wang teaches wherein extracting the speech energy for the respective speaker includes determining the speech energy using the respective linear equation (The disclosure is based on an orthogonality characteristic where the audio sources are assumed to be uncorrelated and, for uncorrelated sources, the covariance matrix “C_Sf,n” is diagonal, and its diagonal elements represent the power of each individual source.; Wang, ¶ [0090]-[0091], [0097]-[0098], [0109]) for each generated output vector from the beamformers (The Expectation-Maximization iterative process is designed to solve the unified model to find all unknown parameters, including both noise power (Λb,f) and source power (C_Sf,n); Wang, ¶ [0055]-[0056], [0090], [0109]).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the multi-channel noise reduction of Rigaud-Maazaoui, to incorporate the teachings of Wang to include wherein extracting the speech energy for the respective speaker includes determining the speech energy using the respective linear equation for each generated output vector from the beamformers. Wang discloses “a solution for audio source separation by jointly taking advantage of both additive source modeling and independent/uncorrelated source modeling” such that “perceptually natural audio sources are obtained while enabling a stable and rapid convergence” which can be readily applied “in any application areas which require audio source separation for mixed signal processing and analysis” and “enable[s] dealing with highly non-stationary sources with stable convergence, including fast moving objects, [and] time-varying sounds”, which, in the context of Rigaud-Maazaoui, would allow the system to adapt for different vehicle occupancy levels without processing empty channels, as recognized by Wang. (Wang, ¶ [0037], [0041]).

Regarding claim 11, the rejection of claim 8 is incorporated. Rigaud-Maazaoui discloses all of the elements of the current invention as stated above. However, Rigaud-Maazaoui fails to expressly recite wherein generating the output vector for each of the respective audio signals via the beamformers includes providing a number of speakers and generating dimensions for the output vector based on the provided number of speakers.
The relevance of Wang is described above with relation to claim 4. Regarding claim 11, Wang teaches wherein generating the output vector for each of the respective audio signals via the beamformers includes providing a number of speakers (Discloses that the system receives “a set of source settings” including a “predetermined source number”; Wang, ¶ [0050], [0068]) and generating dimensions for the output vector based on the provided number of speakers (“from knowledge of the predetermined source number, an initialized matrix of spatial parameters… may be constructed” where the source number is at least applied for the determination of the “underdetermined mode” and the “overdetermined mode”, as well as for the number of separated sources in the output vector.; Wang, ¶ [0049], [0051], [0071]).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the multi-channel noise reduction of Rigaud-Maazaoui, to incorporate the teachings of Wang to include wherein generating the output vector for each of the respective audio signals via the beamformers includes providing a number of speakers and generating dimensions for the output vector based on the provided number of speakers. Wang discloses “a solution for audio source separation by jointly taking advantage of both additive source modeling and independent/uncorrelated source modeling” such that “perceptually natural audio sources are obtained while enabling a stable and rapid convergence” which can be readily applied “in any application areas which require audio source separation for mixed signal processing and analysis” and “enable[s] dealing with highly non-stationary sources with stable convergence, including fast moving objects, [and] time-varying sounds”, which, in the context of Rigaud-Maazaoui, would allow the system to adapt for different vehicle occupancy levels without processing empty channels, as recognized by Wang. (Wang, ¶ [0037], [0041]).

Regarding claim 12, the rejection of claim 11 is incorporated. Rigaud-Maazaoui discloses all of the elements of the current invention as stated above. However, Rigaud-Maazaoui fails to expressly recite wherein determining the prior-SNRs includes calculating an individual prior-SNR for each of the respective audio signals and estimating the prior-SNR as a joint prior-SNR of each individual prior-SNR.
The relevance of Wang is described above with relation to claim 4. Regarding claim 12, Wang teaches wherein determining the prior-SNRs includes calculating an individual prior-SNR for each of the respective audio signals (Discloses a method to estimate the power for each individual audio source, including determining the “power spectrum matrix of the audio sources to be separated... represented as ∑_s,fn”, where the matrix contains the power spectrum “∑_j”{speech energy} for each audio source j. As well, the system models and estimates the noise power, where a term b_f,n represents the additive noise and the system estimates its power, represented as “Λ_b,f”; Wang, ¶ [0049], [0079], [0083]) and estimating the prior-SNR as a joint prior-SNR of each individual prior-SNR (The spatial parameters of the audio sources are “jointly determined” using an “expectation maximization iterative process” which is a well-known method for jointly estimating parameters in a system with unknown variables. The system is configured to “jointly estimate the source spatial parameters and spectral parameters”; Wang, ¶ [0047], [0056], [0178]).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the multi-channel noise reduction of Rigaud-Maazaoui, to incorporate the teachings of Wang to include wherein determining the prior-SNRs includes calculating an individual prior-SNR for each of the respective audio signals and estimating the prior-SNR as a joint prior-SNR of each individual prior-SNR. Wang discloses “a solution for audio source separation by jointly taking advantage of both additive source modeling and independent/uncorrelated source modeling” such that “perceptually natural audio sources are obtained while enabling a stable and rapid convergence” which can be readily applied “in any application areas which require audio source separation for mixed signal processing and analysis” and “enable[s] dealing with highly non-stationary sources with stable convergence, including fast moving objects, [and] time-varying sounds”, which, in the context of Rigaud-Maazaoui, would allow the system to adapt for different vehicle occupancy levels without processing empty channels, as recognized by Wang. (Wang, ¶ [0037], [0041]).
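
As a rough illustration of the quantities discussed for claim 12, the sketch below computes an individual a priori SNR per signal as speech energy divided by noise variance, and then a single joint value. The function names, the sample numbers, and especially the joint combination rule (total speech energy over total noise variance) are assumptions made for illustration only; neither the quoted claim language nor the cited passages of Wang specify a formula.

```python
from typing import List, Sequence


def individual_prior_snrs(speech_energy: Sequence[float],
                          noise_variance: Sequence[float]) -> List[float]:
    """Return one a priori SNR per signal: speech energy / noise variance."""
    if len(speech_energy) != len(noise_variance):
        raise ValueError("one noise variance is required per speech energy")
    snrs = []
    for energy, variance in zip(speech_energy, noise_variance):
        if variance <= 0:
            raise ValueError("noise variance must be positive")
        snrs.append(energy / variance)
    return snrs


def joint_prior_snr(speech_energy: Sequence[float],
                    noise_variance: Sequence[float]) -> float:
    """Hypothetical joint SNR: total speech energy over total noise variance.

    This combination rule is an assumption for illustration, not a rule
    taken from the claims or from Wang.
    """
    total_noise = sum(noise_variance)
    if total_noise <= 0:
        raise ValueError("total noise variance must be positive")
    return sum(speech_energy) / total_noise


if __name__ == "__main__":
    energies = [4.0, 1.0]    # made-up per-speaker speech energies
    variances = [2.0, 0.5]   # made-up per-signal noise variances
    print(individual_prior_snrs(energies, variances))
    print(joint_prior_snr(energies, variances))
```

Other combination rules (for example, averaging the individual values) would be equally plausible readings; the sketch only shows how per-signal estimates can feed a single joint estimate.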

Regarding claim 13, the rejection of claim 11 is incorporated. Rigaud-Maazaoui discloses all of the elements of the current invention as stated above. However, Rigaud-Maazaoui fails to expressly recite further including estimating a noise variance for each of the generated output vectors from the beamformers and extracting a speech energy for a respective speaker corresponding to each of the respective audio signals using a respective linear equation.
The relevance of Wang is described above with relation to claim 4. Regarding claim 13, Wang teaches further including estimating a noise variance for each of the generated output vectors from the beamformers (Discloses a “mixing model” for audio content that includes a term “bf,n” which “represents the additive noise” where the power of the noise signal is the parameter to be estimated. Further, the system includes a linear model for the covariance matrix, as shown in equation 7. The variable “Λ_b,f” represents the noise signal, which is the noise variance to be estimated by the linear model.; Wang, ¶ [0049], [0079], [0090], [0097]) and extracting a speech energy for a respective speaker corresponding to each of the respective audio signals (The disclosure is based on an orthogonality characteristic where the audio sources are assumed to be uncorrelated and, for uncorrelated sources, the covariance matrix “C_Sf,n” is diagonal, and its diagonal elements represent the power of each individual source.; Wang, ¶ [0090]-[0091], [0097]-[0098], [0109]) using a respective linear equation (Further, the system includes a linear model for the covariance matrix, as shown in equation 7, which is a linear equation as it describes a linear relationship between the covariance matrices.; Wang, ¶ [0090]-[0091]).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the multi-channel noise reduction of Rigaud-Maazaoui, to incorporate the teachings of Wang to include further including estimating a noise variance for each of the generated output vectors from the beamformers and extracting a speech energy for a respective speaker corresponding to each of the respective audio signals using a respective linear equation. Wang discloses “a solution for audio source separation by jointly taking advantage of both additive source modeling and independent/uncorrelated source modeling” such that “perceptually natural audio sources are obtained while enabling a stable and rapid convergence” which can be readily applied “in any application areas which require audio source separation for mixed signal processing and analysis” and “enable[s] dealing with highly non-stationary sources with stable convergence, including fast moving objects, [and] time-varying sounds”, which, in the context of Rigaud-Maazaoui, would allow the system to adapt for different vehicle occupancy levels without processing empty channels, as recognized by Wang. (Wang, ¶ [0037], [0041]).

Regarding claim 14, the rejection of claim 13 is incorporated. Rigaud-Maazaoui discloses all of the elements of the current invention as stated above. However, Rigaud-Maazaoui fails to expressly recite wherein: estimating the noise variance includes estimating each noise variance individually at the respective linear equation of the generated output vector; and extracting the speech energy for the respective speaker includes determining the speech energy using the respective linear equation for the generated output vector from the beamformers.
The relevance of Wang is described above with relation to claim 4. Regarding claim 14, Wang teaches wherein: estimating the noise variance includes estimating each noise variance individually at the respective linear equation of the generated output vector (Discloses a “mixing model” for audio content that includes a term “bf,n” which “represents the additive noise” where the power of the noise signal is the parameter to be estimated. Further, the system includes a linear model for the covariance matrix, as shown in equation 7, which is a linear equation as it describes a linear relationship between the covariance matrices. The variable “Λ_b,f” represents the noise signal, which is the noise variance to be estimated.; Wang, ¶ [0049], [0079], [0090], [0097]); and extracting the speech energy for the respective speaker includes determining the speech energy using the respective linear equation for the generated output vector from the beamformers (The Expectation-Maximization iterative process is designed to solve the unified model to find all unknown parameters, including both noise power (Λb,f) and source power (C_Sf,n); Wang, ¶ [0055]-[0056], [0090], [0109]).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the multi-channel noise reduction of Rigaud-Maazaoui, to incorporate the teachings of Wang to include wherein: estimating the noise variance includes estimating each noise variance individually at the respective linear equation of the generated output vector; and extracting the speech energy for the respective speaker includes determining the speech energy using the respective linear equation for the generated output vector from the beamformers. Wang discloses “a solution for audio source separation by jointly taking advantage of both additive source modeling and independent/uncorrelated source modeling” such that “perceptually natural audio sources are obtained while enabling a stable and rapid convergence” which can be readily applied “in any application areas which require audio source separation for mixed signal processing and analysis” and “enable[s] dealing with highly non-stationary sources with stable convergence, including fast moving objects, [and] time-varying sounds”, which, in the context of Rigaud-Maazaoui, would allow the system to adapt for different vehicle occupancy levels without processing empty channels, as recognized by Wang. (Wang, ¶ [0037], [0041]).

Regarding claim 18, the rejection of claim 15 is incorporated. Rigaud-Maazaoui discloses all of the elements of the current invention as stated above. However, Rigaud-Maazaoui fails to expressly recite wherein generating the output vector for each of the respective audio signals via the beamformers includes providing a number of speakers and generating dimensions for the output vector based on the number of speakers.
The relevance of Wang is described above with relation to claim 4. Regarding claim 18, Wang teaches wherein generating the output vector for each of the respective audio signals via the beamformers includes providing a number of speakers (Discloses that the system receives “a set of source settings” including a “predetermined source number”; Wang, ¶ [0050], [0068]) and generating dimensions for the output vector based on the number of speakers (“from knowledge of the predetermined source number, an initialized matrix of spatial parameters… may be constructed” where the source number is at least applied for the determination of the “underdetermined mode” and the “overdetermined mode”, as well as for the number of separated sources in the output vector.; Wang, ¶ [0049], [0051], [0071]).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the multi-channel noise reduction of Rigaud-Maazaoui, to incorporate the teachings of Wang to include wherein generating the output vector for each of the respective audio signals via the beamformers includes providing a number of speakers and generating dimensions for the output vector based on the number of speakers. Wang discloses “a solution for audio source separation by jointly taking advantage of both additive source modeling and independent/uncorrelated source modeling” such that “perceptually natural audio sources are obtained while enabling a stable and rapid convergence” which can be readily applied “in any application areas which require audio source separation for mixed signal processing and analysis” and “enable[s] dealing with highly non-stationary sources with stable convergence, including fast moving objects, [and] time-varying sounds”, which, in the context of Rigaud-Maazaoui, would allow the system to adapt for different vehicle occupancy levels without processing empty channels, as recognized by Wang. (Wang, ¶ [0037], [0041]).

Regarding claim 19, the rejection of claim 18 is incorporated. Rigaud-Maazaoui discloses all of the elements of the current invention as stated above. However, Rigaud-Maazaoui fails to expressly recite wherein determining the prior-SNRs includes calculating an individual prior-SNR for each of the respective audio signals and estimating the prior-SNR as a joint prior-SNR of each individual prior-SNR.
The relevance of Wang is described above with relation to claim 4. Regarding claim 19, Wang teaches wherein determining the prior-SNRs includes calculating an individual prior-SNR for each of the respective audio signals (Discloses a method to estimate the power for each individual audio source, including determining the “power spectrum matrix of the audio sources to be separated... represented as ∑_s,fn”, where the matrix contains the power spectrum “∑_j”{speech energy} for each audio source j. As well, the system models and estimates the noise power, where a term b_f,n represents the additive noise and the system estimates its power, represented as “Λ_b,f”; Wang, ¶ [0049], [0079], [0083]) and estimating the prior-SNR as a joint prior-SNR of each individual prior-SNR (The spatial parameters of the audio sources are “jointly determined” using an “expectation maximization iterative process” which is a well-known method for jointly estimating parameters in a system with unknown variables. The system is configured to “jointly estimate the source spatial parameters and spectral parameters”; Wang, ¶ [0047], [0056], [0178]).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the multi-channel noise reduction of Rigaud-Maazaoui, to incorporate the teachings of Wang to include wherein determining the prior-SNRs includes calculating an individual prior-SNR for each of the respective audio signals and estimating the prior-SNR as a joint prior-SNR of each individual prior-SNR. Wang discloses “a solution for audio source separation by jointly taking advantage of both additive source modeling and independent/uncorrelated source modeling” such that “perceptually natural audio sources are obtained while enabling a stable and rapid convergence” which can be readily applied “in any application areas which require audio source separation for mixed signal processing and analysis” and “enable[s] dealing with highly non-stationary sources with stable convergence, including fast moving objects, [and] time-varying sounds”, which, in the context of Rigaud-Maazaoui, would allow the system to adapt for different vehicle occupancy levels without processing empty channels, as recognized by Wang. (Wang, ¶ [0037], [0041]).

Regarding claim 20, the rejection of claim 18 is incorporated. Rigaud-Maazaoui discloses all of the elements of the current invention as stated above. However, Rigaud-Maazaoui fails to expressly recite further including estimating a noise variance for each of the respective audio signals; and extracting a speech energy for a respective speaker corresponding to each of the respective audio signals using a respective linear equation for each generated output vector, wherein estimating the noise variance includes estimating each noise variance by the respective linear equation and extracting the speech energy for the respective speaker includes determining the speech energy using the respective linear equation for each generated output vector from the beamformers.
The relevance of Wang is described above with relation to claim 4. Regarding claim 20, Wang teaches further including estimating a noise variance for each of the respective audio signals (Discloses a “mixing model” for audio content that includes a term “bf,n” which “represents the additive noise” where the power of the noise signal is the parameter to be estimated. Further, the system includes a linear model for the covariance matrix, as shown in equation 7. The variable “Λ_b,f” represents the noise signal, which is the noise variance to be estimated by the linear model.; Wang, ¶ [0049], [0079], [0090], [0097]); and extracting a speech energy for a respective speaker corresponding to each of the respective audio signals (The disclosure is based on an orthogonality characteristic where the audio sources are assumed to be uncorrelated and, for uncorrelated sources, the covariance matrix “C_Sf,n” is diagonal, and its diagonal elements represent the power of each individual source.; Wang, ¶ [0090]-[0091], [0097]-[0098], [0109]) using a respective linear equation for each generated output vector, (Further, the system includes a linear model for the covariance matrix, as shown in equation 7, which is a linear equation as it describes a linear relationship between the covariance matrices.; Wang, ¶ [0090]-[0091]) wherein estimating the noise variance includes estimating each noise variance by the respective linear equation (Discloses a “mixing model” for audio content that includes a term “bf,n” which “represents the additive noise” where the power of the noise signal is the parameter to be estimated. Further, the system includes a linear model for the covariance matrix, as shown in equation 7, which is a linear equation as it describes a linear relationship between the covariance matrices. The variable “Λ_b,f” represents the noise signal, which is the noise variance to be estimated.; Wang, ¶ [0049], [0079], [0090], [0097]) and extracting the speech energy for the respective speaker includes determining the speech energy using the respective linear equation for each generated output vector from the beamformers (The Expectation-Maximization iterative process is designed to solve the unified model to find all unknown parameters, including both noise power (Λb,f) and source power (C_Sf,n); Wang, ¶ [0055]-[0056], [0090], [0109]).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the multi-channel noise reduction of Rigaud-Maazaoui, to incorporate the teachings of Wang to include further including estimating a noise variance for each of the respective audio signals; and extracting a speech energy for a respective speaker corresponding to each of the respective audio signals using a respective linear equation for each generated output vector, wherein estimating the noise variance includes estimating each noise variance by the respective linear equation and extracting the speech energy for the respective speaker includes determining the speech energy using the respective linear equation for each generated output vector from the beamformers. Wang discloses “a solution for audio source separation by jointly taking advantage of both additive source modeling and independent/uncorrelated source modeling” such that “perceptually natural audio sources are obtained while enabling a stable and rapid convergence” which can be readily applied “in any application areas which require audio source separation for mixed signal processing and analysis” and “enable[s] dealing with highly non-stationary sources with stable convergence, including fast moving objects, [and] time-varying sounds”, which, in the context of Rigaud-Maazaoui, would allow the system to adapt for different vehicle occupancy levels without processing empty channels, as recognized by Wang. (Wang, ¶ [0037], [0041]).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Jensen et al. (U.S. Pat. App. Pub. No. 2021/0058713) discloses systems and methods for estimating a signal-to-noise ratio of an electric input signal representing sound, including a scheme for obtaining an a priori (or second) signal-to-noise-ratio estimate by non-linear smoothing (e.g., implemented as low-pass filtering with an adaptive low-pass cut-off frequency) of an a posteriori (or first) signal-to-noise-ratio estimate.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
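
The date arithmetic in the preceding paragraph can be sketched as follows. This is a minimal illustration, assuming a calendar-month rule that clamps to the last day of a shorter month and ignoring weekend and federal-holiday carry-over; the helper names `add_months` and `reply_deadlines` are hypothetical. The example date is the Jan 30, 2026 final rejection mailing date listed in the prosecution timeline on this page.

```python
import calendar
from datetime import date
from typing import Dict


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the target month's length."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def reply_deadlines(mailing_date: date) -> Dict[str, date]:
    """Key dates measured from the mailing date of a final action.

    Assumption: no adjustment for weekends or federal holidays.
    """
    return {
        "two_month_mark": add_months(mailing_date, 2),
        "shortened_period_end": add_months(mailing_date, 3),
        "statutory_maximum": add_months(mailing_date, 6),
    }


if __name__ == "__main__":
    for label, day in reply_deadlines(date(2026, 1, 30)).items():
        print(label, day.isoformat())
```

Under these assumptions, a Jan 30, 2026 mailing date gives a two-month mark of Mar 30, 2026, a shortened period ending Apr 30, 2026, and an outer limit of Jul 30, 2026. The advisory-action exception described above is not modeled.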
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Sean E. Serraguard whose telephone number is (313)446-6627. The examiner can normally be reached 07:00-17:00 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel C. Washburn, can be reached at (571) 272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Sean E Serraguard/
Patent Examiner, Art Unit 2657
Prosecution Timeline

Oct 09, 2023: Application Filed
Aug 07, 2025: Non-Final Rejection mailed (§102, §103, §112)
Oct 24, 2025: Response Filed
Jan 30, 2026: Final Rejection mailed (§102, §103, §112)
Mar 30, 2026: Response after Non-Final Action
Precedent Cases

Applications granted by this same examiner with similar technology

18/221,274 | Patent 12614545 | CROSS-DEVICE DATA SYNCHRONIZATION BASED ON SIMULTANEOUS HOTWORD TRIGGERS | 2y 9m to grant | Granted Apr 28, 2026
17/583,512 | Patent 12609109 | Speech Recognition Method and Apparatus, and Computer-Readable Storage Medium | 4y 2m to grant | Granted Apr 21, 2026
18/154,549 | Patent 12603095 | Stereo Audio Signal Delay Estimation Method and Apparatus | 3y 3m to grant | Granted Apr 14, 2026
17/648,548 | Patent 12598250 | SYSTEMS AND METHODS FOR COHERENT AND TIERED VOICE ENROLLMENT | 4y 2m to grant | Granted Apr 07, 2026
18/004,197 | Patent 12597429 | PACKET LOSS CONCEALMENT | 3y 3m to grant | Granted Apr 07, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 2-3
Grant Probability: 70%
With Interview (+33.0%): 99%
Median Time to Grant: 3y 0m (~5m remaining)
PTA Risk: Moderate
Based on 142 resolved cases by this examiner. Grant probability derived from career allowance rate.
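
For reference, the 70% figure is consistent with the examiner profile's count of 99 granted out of 142 resolved cases. The sketch below assumes the career allowance rate is simply granted cases divided by resolved cases; the function name is hypothetical, and how the page derives the with-interview figure is not stated, so that figure is not reproduced here.

```python
def allowance_rate(granted: int, resolved: int) -> float:
    """Career allowance rate as a percentage of resolved cases.

    Assumption: rate = granted / resolved, with no weighting or smoothing.
    """
    if resolved <= 0:
        raise ValueError("resolved must be a positive count")
    if not 0 <= granted <= resolved:
        raise ValueError("granted must be between 0 and resolved")
    return 100.0 * granted / resolved


if __name__ == "__main__":
    rate = allowance_rate(99, 142)
    print(f"{rate:.1f}% (rounds to {round(rate)}%)")
```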