Prosecution Insights
Last updated: April 19, 2026
Application No. 18/403,195

GENERATING SPATIALIZED AUDIO SIGNALS BASED ON MODAL INTERPOLATION OF IMPULSE RESPONSES

Non-Final OA: §101, §103

Filed: Jan 03, 2024
Examiner: MOHAMMED, ASSAD
Art Unit: 2691
Tech Center: 2600 — Communications
Assignee: Mitsubishi Electric Research Laboratories Inc.
OA Round: 1 (Non-Final)

Grant Probability: 73% (Favorable)
Predicted OA Rounds: 1-2
Predicted Time to Grant: 3y 0m
Grant Probability with Interview: 84%

Examiner Intelligence

Career Allow Rate: 73% (above average) — 430 granted of 587 resolved, +11.3% vs. Tech Center average
Interview Lift: +11.1% (moderate) — allowance rate with vs. without an interview, among resolved cases with an interview
Typical Timeline: 3y 0m average prosecution; 24 applications currently pending
Career History: 611 total applications across all art units
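The headline figures above are simple ratios; as a sanity check, a few lines of Python reproduce them (the interview-lift and with-interview percentages come from the tool's predictive model and cannot be re-derived from the counts shown here):

```python
# Sanity-check the examiner statistics shown above.
granted = 430
resolved = 587

career_allow_rate = granted / resolved
print(f"Career allow rate: {career_allow_rate:.1%}")   # -> 73.3%

# The +11.3% delta vs. the Tech Center average implies a TC baseline of:
implied_tc_average = career_allow_rate - 0.113
print(f"Implied TC average: {implied_tc_average:.1%}")  # -> 62.0%
```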

Statute-Specific Performance

§101: 7.3% (-32.7% vs TC avg)
§103: 67.5% (+27.5% vs TC avg)
§102: 7.8% (-32.2% vs TC avg)
§112: 9.5% (-30.5% vs TC avg)

Tech Center averages are estimates. Based on career data from 587 resolved cases.

Office Action

Rejection types: §101, §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first-inventor-to-file provisions of the AIA.

Claim Rejections - 35 USC § 101

1. Regarding claims 18-20, each claim recites "computer storage media having program instruction ...." The applicant's specification states at paragraph [0115]: "Storage system 1403 may comprise any computer readable storage media readable by processing system 1402 and capable of storing software 1405. Storage system 1403 may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data. Examples of storage media include random access memory, read only memory, magnetic disks, optical disks, flash memory, virtual memory and non-virtual memory, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other suitable storage media. In no case is the computer readable storage media a propagated signal. In the claims, the phrase 'computer storage medium' and variations thereof does not include and is not a propagated signal." The claimed computer storage media thus exclude signals per se; on the basis of the applicant's specification, the examiner finds that the claimed media constitute a non-transitory medium.

Allowable Subject Matter

2. Claim 17 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Claim Rejections - 35 USC § 103

3. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

4. Claim 1 is rejected under 35 U.S.C. 103 as being unpatentable over Binn et al. (US 2022/0014868) in view of Grimanis (US 2024/0323626).
Regarding claim 1, Binn teaches an audio processing method, wherein the method uses a processor coupled with stored instructions implementing the method, wherein the instructions, when executed by the processor, carry out steps of the method, comprising: executing a neural network to produce a modal output based at least on spatial input, wherein the spatial input comprises a sound source direction, and wherein the modal output comprises learned modal components of an impulse response generated by the neural network based on the sound source direction (see ¶ 0100, 0167-0171, 0194, 0218: an audio output provides listening areas in a public space, addresses multiple listeners with discrete sound sources, and provides spatialization of source material for a single listener (virtual surround sound); the filters may be implemented as recurrent neural networks or deep neural networks, which typically emulate the same process of spatialization; the neural networks or other statistical optimization networks may provide coefficients for a generic signal processing chain, such as a digital filter, which may have finite impulse response (FIR) characteristics and/or infinite impulse response (IIR) characteristics; under a point-to-point transfer function model, some or all of the filters must change when anything moves, whereas if the computational model were instead of the whole acoustic space, sources and listeners could be moved as desired without affecting the underlying room simulation; transforming the audio program with a spatialization model, to generate an array of audio transducer signals for an audio transducer array representing spatialized audio, the spatialization model comprising parameters defining a head-related transfer function for the listener, and an acoustic interaction of the object); and determining coefficients for an infinite impulse response (IIR) filter based on the learned modal components of the impulse response generated by the neural network (see ¶ 0194, 0218: the neural networks or other statistical optimization networks may provide coefficients for a generic signal processing chain, such as a digital filter, which may have FIR and/or IIR characteristics).

Binn does not disclose processing an anechoic audio signal with the IIR filter configured with the coefficients to produce a spatialized audio signal. Grimanis teaches processing an anechoic audio signal with the IIR filter configured with the coefficients to produce a spatialized audio signal (see ¶ 0045, 0072, 0101-0102: calculate and apply the parameters for the infinite impulse response (IIR) filters; the immersive audio processing unit may include an IIR filter; the IIR filter may be configured to process the output audio signal and generate an immersive audio signal; a clean audio sample including a voice component may have been recorded in a special environment (e.g., a studio or an anechoic chamber) to reduce the presence of background noise; alternatively, or in addition, the clean audio signal may have been extensively processed to remove noise).

The combination of Grimanis with Binn provides the clean audio sample in order to reduce the background noise with an IIR filter. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Binn to incorporate a clean signal being processed by an IIR filter for reducing the background noise of the immersive audio signal. The modification provides using an IIR filter to reduce the background noise using an anechoic (clean) signal.

5. Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Binn et al. (US 2022/0014868) in view of Leppanen et al. (US 2021/0014630).
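As a technical aside, the limitation chain that claims 1, 9, and 18 share — learned modal components (center frequency, bandwidth, gain) mapped to IIR coefficients, then applied to an anechoic signal — can be sketched numerically. This is only an illustration with invented component values and a textbook two-pole resonator per mode, not the applicant's disclosed implementation:

```python
import numpy as np

def mode_to_biquad(fc, bw, gain, fs):
    """Map one modal component (center frequency fc, bandwidth bw, linear
    gain) to two-pole IIR coefficients via the classic resonator
    H(z) = g / (1 - 2 r cos(theta) z^-1 + r^2 z^-2)."""
    r = np.exp(-np.pi * bw / fs)       # pole radius from bandwidth
    theta = 2 * np.pi * fc / fs        # pole angle from center frequency
    b = [gain]
    a = [1.0, -2 * r * np.cos(theta), r * r]
    return b, a

def iir_filter(b, a, x):
    """Direct-form IIR filtering for this one-zero/two-pole case
    (what scipy.signal.lfilter would compute)."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        y[n] = b[0] * x[n]
        if n >= 1:
            y[n] -= a[1] * y[n - 1]
        if n >= 2:
            y[n] -= a[2] * y[n - 2]
    return y

fs = 48_000
# Hypothetical modal components a network might emit: (fc Hz, bw Hz, gain).
modes = [(800, 150, 0.9), (2400, 300, 0.5), (6100, 900, 0.2)]

anechoic = np.zeros(1024)
anechoic[0] = 1.0                      # unit impulse as a stand-in signal

# Parallel sum of modal resonators approximates the impulse response.
spatialized = sum(iir_filter(*mode_to_biquad(fc, bw, g, fs), anechoic)
                  for fc, bw, g in modes)
```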
Regarding claim 9, Binn teaches a computing device comprising: processing circuitry configured to at least: execute a neural network to process a spatial input, the neural network being trained to produce a modal output, wherein the spatial input comprises a sound source direction, and wherein the modal output comprises learned modal components of an impulse response associated with the sound source direction (see ¶ 0100, 0167-0171, 0194, 0218, summarized in the discussion of claim 1 above); and determine coefficients for an infinite impulse response (IIR) filter based on the learned modal components of the impulse response obtained from the neural network (see ¶ 0194, 0218: the neural networks or other statistical optimization networks may provide coefficients for a generic signal processing chain, such as a digital filter, which may have FIR and/or IIR characteristics).

Binn does not disclose processing an audio signal with the IIR filter configured with the coefficients to increase a spatialization of the audio signal, or audio circuitry configured to output the audio signal. Leppanen teaches processing an audio signal with the IIR filter configured with the coefficients to increase a spatialization of the audio signal, and audio circuitry configured to output the audio signal (see ¶ 0197, 0200-0207: the widening of the signal is configured by the infinite impulse response). The combination of Leppanen with Binn provides widening of the audio signal with an IIR filter. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Binn to process the audio signal with an IIR filter. The modification provides using an IIR filter to widen the audio signal.

6. Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Binn et al. (US 2022/0014868) in view of Mahabub (US 2009/0046864).

Regarding claim 18, Binn teaches one or more computer readable storage media having program instructions stored thereon that, when executed by one or more processors of a computing device, direct the computing device to at least: supply spatial input to a neural network that produces a modal output based on the spatial input, wherein the spatial input comprises a sound source direction, and wherein the modal output comprises learned modal components of an impulse response generated by the neural network based on the sound source direction (see ¶ 0100, 0167-0171, 0194, 0218, summarized in the discussion of claim 1 above); and determine coefficients for an infinite impulse response (IIR) filter based on the learned modal components of the impulse response generated by the neural network (see ¶ 0194, 0218).

Binn does not disclose configuring the IIR filter based on the coefficients. Mahabub teaches configuring the IIR filter based on the coefficients (see ¶ 0046: the spatialized sound's IIR filters use the coefficients in order to modify the spatialized sound). The combination of Mahabub with Binn provides using the coefficients to configure the IIR filter for the spatialized sound. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Binn to incorporate coefficients that configure the IIR filter for the spatialized sound. The modification provides using IIR filter coefficients to modify the spatialized audio.

7. Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over Binn et al. (US 2022/0014868) in view of Tassart (US 2015/0180436) further in view of Shafran et al. (US 2019/0156819).

Regarding claim 21, Binn teaches a method of training an artificial neural network, the method comprising: extracting spatial features from impulse response samples, wherein each of the impulse response samples comprises a spatial feature and an associated impulse response; and, for each one of the impulse response samples: supplying the feature vector as input to the artificial neural network (see ¶ 0073, 0100, 0167-0171, 0194, 0218, summarized in the discussion of claim 1 above; in addition, the speech signal in the frame from the neural network can include concatenating the spectral features with corresponding features of a context window to obtain an input vector, and using the input vector as the input to the neural network); obtaining output from the artificial neural network comprising learned modal components of the impulse response; and determining coefficients for an impulse response (IR) filter based on the learned modal components of the impulse response (see ¶ 0194, 0218).

Binn does not disclose performing a comparison of the estimated frequency domain magnitude response of the IR filter to a known frequency domain magnitude response of the impulse response, or updating weights in the artificial neural network based on the comparison. Tassart teaches a comparison of the estimated frequency domain magnitude response of the IR filter to a known frequency domain magnitude response of the impulse response (see ¶ 0037, 0040: comparing weighted values in the frequency domain, evaluating the compared magnitudes, and setting weights of the filter to equalize the audio signal). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Binn to incorporate comparing filter-weighted audio signals for processing. The modification provides, in the frequency domain, the weighted values of the compared magnitudes of the audio signals. Shafran teaches updating weights in the artificial neural network based on the comparison (see ¶ 0102: the neural network updates the weights for the filter when compared data is obtained). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Binn and Tassart to incorporate updating the weights in the neural network. The modification provides a neural-network-updated audio signal.

8. Claims 2-6 are rejected under 35 U.S.C. 103 as being unpatentable over Binn et al. (US 2022/0014868) in view of Grimanis (US 2024/0323626).

Regarding claim 2, Binn teaches the audio processing method of claim 1 wherein the learned modal components of the impulse response comprise center frequency, bandwidth, and gain (see ¶ 0026-0028: the signal includes center frequency, bandwidth, and gains).

Regarding claim 3, Binn teaches the audio processing method of claim 2 wherein the sound source direction comprises a direction of a sound source relative to a listener position, and wherein the impulse response comprises a head-related transfer function (HRTF) (see ¶ 0114, 0211: the sound source is directed based on the user's position).

Regarding claim 4, Binn teaches the audio processing method of claim 3 wherein the anechoic audio signal comprises a sound associated with the sound source, and wherein the steps of the method further comprise configuring the IIR filter with the coefficients and outputting the spatialized audio signal to produce a directional effect of the sound at the listener position (see ¶ 0073, 0100, 0167-0171, 0194, 0218, summarized in the discussion of claim 1 above).

Regarding claim 5, Binn teaches the audio processing method of claim 4 wherein steps of the method further comprise training the neural network with training data to produce modal outputs based on spatial inputs (see ¶ 0100, 0167-0171, 0194, 0218, summarized in the discussion of claim 1 above).

Regarding claim 6, Binn teaches the audio processing method of claim 5 wherein the training data comprise HRTF samples associated with the listener position, and wherein the direction of the sound source in each of the HRTF samples differs relative to each other sample of the HRTF samples (see ¶ 0114, 0211: the sound source is directed based on the user's position, and the system will have different values at different directions).

9. Claims 7 and 8 are rejected under 35 U.S.C. 103 as being unpatentable over Binn et al. (US 2022/0014868) in view of Grimanis (US 2024/0323626) further in view of Tassart (US 2015/0180436) further in view of Shafran et al. (US 2019/0156819).

Regarding claim 7, Binn teaches the audio processing method of claim 6 wherein training the neural network with the training data comprises, for each one of the HRTF samples: supplying the direction of the sound source as input to the neural network; obtaining output from the neural network comprising the learned modal components of the HRTF associated with the direction of the sound source; determining the coefficients for the IIR filter based on the learned modal components; and determining an estimated frequency domain magnitude response of the IIR filter based on the coefficients (see ¶ 0073, 0100, 0114, 0167-0171, 0194, 0211, 0218, summarized in the discussions of claims 1 and 21 above).

Binn does not disclose performing a comparison of the estimated frequency domain magnitude response of the IIR filter to a known frequency domain magnitude response of the HRTF, or updating weights in the artificial neural network based on the comparison. Tassart teaches performing a comparison of the estimated frequency domain magnitude response of the IIR filter to a known frequency domain magnitude response of the HRTF (see ¶ 0037, 0040, discussed for claim 21 above). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Binn to incorporate comparing filter-weighted audio signals for processing. Shafran teaches updating weights in the artificial neural network based on the comparison (see ¶ 0102, discussed for claim 21 above). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Binn and Tassart to incorporate updating the weights in the neural network, for the same reasons given for claim 21.

Regarding claim 8, Binn teaches the audio processing method of claim 7 wherein the training data further comprise listener identities associated with the HRTF samples, and wherein the input to the neural network further includes a one of the listener identities associated with the one of the HRTF samples (see ¶ 0114, 0211: the sound source is directed based on the user's position).

10. Claims 10-15 are rejected under 35 U.S.C. 103 as being unpatentable over Binn et al. (US 2022/0014868) in view of Leppanen et al. (US 2021/0014630).
Regarding claim 10, Binn teaches the computing device of claim 9 wherein the sound source direction comprises a direction of a sound source relative to a listener position, and wherein the impulse response comprises a head-related transfer function (HRTF) modeled by the neural network for the listener position with respect to the direction of the sound source (¶ 0114, 0211. The sound source is directed based on the users position.) Regarding claim 11, Binn teaches the computing device of claim 10 wherein the audio signal comprises a sound associated with the sound source, and wherein the processing circuitry is further configured to program the IR filter with the coefficients (see ¶ 0194, 0218. The neural networks or other statistical optimization networks may provide coefficients for a generic signal processing chain, such as a digital filter, which may be finite impulse response (FIR) characteristics and/or infinite impulse response (IIR) characteristics). Regrading claim 12, Binn teaches the computing device of claim 11 wherein the neural network is trained with training data to produce modal outputs based on spatial inputs (see ¶ 0073, 0101-0102, 0114, 0100, 0167-171, 0194, 0211, 0218. An audio output provide listening areas in a public space, address multiple listeners with discrete sound sources, provide spatialization of source material for a single listener (virtual surround sound). The filters may be implemented as recurrent neural networks or deep neural networks, which typically emulate the same process of spatialization. The neural networks or other statistical optimization networks may provide coefficients for a generic signal processing chain, such as a digital filter, which may be finite impulse response (FIR) characteristics and/or infinite impulse response (IIR) characteristics. Point-to-point transfer function model is that some or all of the filters must change when anything moves. 
If instead the computational model was of the whole acoustic space, sources and listeners could be moved as desired without affecting the underlying room simulation. Transforming the audio program with a spatialization model, to generate an array of audio transducer signals for an audio transducer array representing spatialized audio, the spatialization model comprising parameters defining a head-related transfer function for the listener, and an acoustic interaction of the object. Speech signal in the frame from the neural network can include concatenating the spectral features with corresponding features of a context window to obtain an input vector; and using the input vector as the input to the neural network.). Regarding claim 13, Binn teaches the computing device of claim 12 wherein the training data comprise HRTF samples associated with the listener position (see ¶ 0114, 0211. The sound source is directed based on the users position.). Regarding claim 14, Binn teaches the computing device of claim 13 wherein the direction of the sound source in each sample of the HRTF samples differs relative to each other sample of the HRTF samples (¶ 0114, 0211. The sound source is directed based on the users position. The system will have different values at different directions.). Regarding claim 15, Binn teaches the computing device of claim 14 wherein the training data further comprise listener identities associated with the HRTF samples, and wherein the spatial input further comprises an identity of a listener (see ¶ 0019, 0097, 0102. The system is able to identify the listeners position and automatically adjust the direction of the audio signal.). 11. Claim(s) 16 is rejected under 35 U.S.C. 103 as being unpatentable over Binn et al. (US 2022/0014868) in view of Leppanen et al. (US 2021/0014630) in further view of Orban (US 6,205,225). Regarding claim 16. 
The computing device of claim 9 wherein the IIR filter comprises a cascaded IIR filter having multiple IIR filter sections, and wherein the multiple IIR filter sections include a low-frequency (LF) section, a peak frequency (PF) section, and a high-frequency (HF) section. Orban teaches wherein the IIR filter comprises a cascaded IIR filter having multiple IIR filter sections, and wherein the multiple IIR filter sections include a low-frequency (LF) section, a peak frequency (PF) section, and a high-frequency (HF) section (see col. 7, lines 40-67. Wherein the IIR has cascade of serval stages (LF, HF and PF (high frequency peak)). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Binn and Leppanen to incorporate IIR filter with different stages HF, PF, LF. The modification providing IIR filter with different stages for the audio processing. 12. Claim(s) 19, 20 are rejected under 35 U.S.C. 103 as being unpatentable over Binn et al. (US 2022/0014868) in view of Mahabub (US 2009/0046864) further in view of Grimanis (US 2024/0323626). Regarding claim 19, Binn and Mahabub does not teach the one or more computer readable storage media of claim 18 wherein the program instructions further direct the computing device to process an anechoic audio signal with the IIR filter configured with the coefficients to produce a spatialized audio signal. Grimanis teaches wherein the program instructions further direct the computing device to process an anechoic audio signal with the IIR filter configured with the coefficients to produce a spatialized audio signal (see ¶ 0045, 0072, 101-0102. Calculate and apply the parameters for the filters infinite impulse response (IIR). Immersive audio processing unit may include an infinite impulse response, IIR, filter. The IIR filter may be configured to process the output audio signal and generate an immersive audio signal. 
A clean audio sample including a voice component may have been recorded in a special environment (e.g., a studio or an anechoic chamber) to reduce the presence of background noise. Alternatively, or in addition, the clean audio signal may have been extensively processed to remove noise.) The combination of Grimanis with Binn provides the clean audio sample in order to reduce the background noise with an IIR filter. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Binn and Mahabub to incorporate a clean signal being processed by an IIR filter to reduce the background noise of the immersive audio signal. The modification provides using an IIR filter to reduce the background noise using an anechoic (clean) signal.

Regarding claim 20, Binn teaches the one or more computer readable storage media of claim 19 wherein the sound source direction comprises a direction of a sound source relative to a listener position, and wherein the impulse response comprises a head-related transfer function (HRTF) modeled by the neural network for the listener position with respect to the direction of the sound source (¶ 0114, 0211; the sound source is directed based on the user's position).

13. Claims 22 and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Binn et al. (US 2022/0014868) in view of Tassart (US 2015/0180436) and further in view of Shafran et al. (US 2019/0156819).

Regarding claim 22, Binn teaches the method of claim 21 wherein the IR filter comprises an infinite impulse response (IIR) filter, and wherein the learned modal components comprise center frequency, bandwidth, and gain (see ¶ 0026-0028; the signal includes center frequency, bandwidth, and gains).

Regarding claim 23, Binn teaches the method of claim 22 wherein the impulse response comprises one of a head-related transfer function (HRTF) or a room impulse response (RIR) (¶ 0114, 0211.
The sound source is directed based on the user's position).

Conclusion

14. Any inquiry concerning this communication or earlier communications from the examiner should be directed to ASSAD MOHAMMED, whose telephone number is (571) 270-7253. The examiner can normally be reached 9:00 AM-5:00 PM.

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Duc Nguyen, can be reached at 571-272-7503. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (in USA or Canada) or 571-272-1000.

/ASSAD MOHAMMED/
Examiner, Art Unit 2691

/DUC NGUYEN/
Supervisory Patent Examiner, Art Unit 2691
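For readers unfamiliar with the filter structure at issue in the claim 16 rejection, a cascaded IIR filter with low-frequency, peak, and high-frequency sections can be sketched in Python. This is an illustrative sketch only, not the claimed invention or any cited reference's implementation: the three sections are built here as standard RBJ-cookbook peaking biquads, and the section frequencies, bandwidths, and gains are hypothetical placeholders. Note that each section is parameterized by exactly the three modal components recited in claim 22 (center frequency, bandwidth, and gain).

```python
import numpy as np
from scipy import signal


def peaking_biquad(fc, bw_hz, gain_db, fs):
    """RBJ-cookbook peaking-EQ biquad as one SOS row [b0,b1,b2,1,a1,a2].

    fc, bw_hz, gain_db are the modal components (center frequency,
    bandwidth, gain); fs is the sample rate in Hz.
    """
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * np.pi * fc / fs
    alpha = np.sin(w0) / (2.0 * (fc / bw_hz))  # Q = fc / bandwidth
    b = np.array([1 + alpha * A, -2 * np.cos(w0), 1 - alpha * A])
    a = np.array([1 + alpha / A, -2 * np.cos(w0), 1 - alpha / A])
    return np.concatenate([b / a[0], a / a[0]])


fs = 48_000
# Hypothetical LF / PF / HF sections of the cascade.
sos = np.vstack([
    peaking_biquad(250.0, 200.0, 4.0, fs),      # low-frequency (LF) section
    peaking_biquad(3_000.0, 1_000.0, 6.0, fs),  # peak-frequency (PF) section
    peaking_biquad(12_000.0, 4_000.0, -3.0, fs) # high-frequency (HF) section
])

x = np.random.default_rng(0).standard_normal(fs)  # 1 s test signal
y = signal.sosfilt(sos, x)                        # run the cascade
```

Because the sections are applied in series (second-order sections passed to `sosfilt`), each one shapes a different region of the spectrum, which is the sense in which the rejection reads Orban's multi-stage cascade onto the claim.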
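The signal flow recited in claim 19 (an anechoic input processed by an IIR filter configured with supplied coefficients to produce a spatialized output) can likewise be sketched. This is a minimal sketch under stated assumptions, not the claimed method: the claim contemplates coefficients produced for a given sound-source direction (per claim 20, modeling an HRTF), whereas here simple low-order Butterworth filters stand in as hypothetical per-ear coefficients, and the input is synthetic noise standing in for an anechoic recording.

```python
import numpy as np
from scipy import signal

fs = 48_000
# Stand-in for an anechoic (dry, reflection-free) input signal.
anechoic = np.random.default_rng(1).standard_normal(fs // 2)

# Hypothetical per-ear IIR coefficients. In the claimed flow these would
# be the learned coefficients for a given sound-source direction; here
# placeholder Butterworth designs illustrate the mechanics only.
b_l, a_l = signal.butter(2, 8_000, btype="low", fs=fs)
b_r, a_r = signal.butter(2, 4_000, btype="low", fs=fs)

# Configure the IIR filter with the coefficients and process the
# anechoic signal to produce a two-channel spatialized output.
left = signal.lfilter(b_l, a_l, anechoic)
right = signal.lfilter(b_r, a_r, anechoic)
spatialized = np.stack([left, right], axis=-1)  # shape: (samples, 2)
```

The point of the sketch is the division of labor the claims rely on: the coefficients carry all the direction-dependent information, so the same cheap `lfilter` call spatializes any dry input once the coefficients are swapped in.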

Prosecution Timeline

Jan 03, 2024
Application Filed
Feb 03, 2026
Non-Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12604149
ELECTRONIC DEVICE AND METHOD THEREOF FOR OUTPUTTING AUDIO DATA
2y 5m to grant Granted Apr 14, 2026
Patent 12598441
AUDIO SIGNAL PROCESSING METHOD AND AUDIO SIGNAL PROCESSING APPARATUS
2y 5m to grant Granted Apr 07, 2026
Patent 12587801
RE-MIXING A COMPOSITE AUDIO PROGRAM FOR PLAYBACK WITHIN A REAL-WORLD VENUE
2y 5m to grant Granted Mar 24, 2026
Patent 12587774
SYSTEM AND METHOD OF ASSEMBLING A COMPRESSION TRIGGERED HEADSET POWER SAVING SYSTEM FOR AN AUDIO HEADSET
2y 5m to grant Granted Mar 24, 2026
Patent 12581240
Method and System for Determining Audio Channel Role of Sound Box, Electronic Device, and Storage Medium
2y 5m to grant Granted Mar 17, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
73%
Grant Probability
84%
With Interview (+11.1%)
3y 0m
Median Time to Grant
Low
PTA Risk
Based on 587 resolved cases by this examiner. Grant probability derived from career allow rate.
