Last updated: May 29, 2026

Application No. 17/679,604

DEEP NEURAL NETWORK DENOISER MASK GENERATION SYSTEM FOR AUDIO PROCESSING

Non-Final OA §103

Filed

Feb 24, 2022

Priority

Feb 25, 2021 — provisional 63/153,757

Examiner

ISKENDER, ALVIN ALIK

Art Unit

2654

Tech Center

2600 — Communications

Assignee

Shure Acquisition Holdings Inc.

OA Round

4 (Non-Final)

Interview Optional

Interview lift: +60.3%. This examiner has a 48% grant rate. An interview has already been conducted in this application's prosecution history, so the recommendation is a written response with narrowed claims based on precedent claim evolution patterns.

Based on 25 resolved cases, 2023–2026

Examiner Intelligence

ISKENDER, ALVIN ALIK

Grants 48% of resolved cases

Career Allowance Rate

12 granted / 25 resolved

-14.0% vs TC avg

Strong +60% interview lift

Interview Lift: +60.3% (resolved cases with interview, compared with those without)

Typical timeline

3y 3m

Avg Prosecution

12 currently pending

Career history

Total Applications

across all art units

Statute-Specific Performance

§101

0.8%

-39.2% vs TC avg

§103

88.8%

+48.8% vs TC avg

§102

10.4%

-29.6% vs TC avg

Based on career data from 25 resolved cases. Tech Center averages are estimates.

Office Action

§103

DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant’s arguments with respect to claim(s) 1-22 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-8, 10-15, 17-21 is/are rejected under 35 U.S.C. 103 as being unpatentable over Crow et al. (US 20190082276 A1) in view of Zhang et al. (US 20190080710 A1) in view of Fitz (US 20150092967 A1).

Regarding claim 1, Crow et al. teaches a digital signal processing (DSP) apparatus configured to reduce noise from an audio signal sample associated with at least one microphone ([0016]-[0017], FIG 2 audio sensors 212), the DSP apparatus comprising one or more processors and one or more memories storing instructions that (FIG 2, processors and storage)
input the audio signal sample to both a time-frequency domain transformation pipeline and a deep neural network (DNN) processing loop, wherein the time-frequency domain transformation pipeline forms part of a digital signal processing process and is configured to execute one or more transformation pipeline steps of the time-frequency domain transformation pipeline during a transformation time period, ([0026]-[0028]: collect audio data from audio sensors, provided to the system for processing including time/frequency domain conversion; [0035]-[0040]: applying neural network models, time-bounded filter/frequency coefficients; [0052]-[0053]: frequency masks)
wherein the DNN processing loop is configured to (i) determine a denoiser mask associated with a noise prediction for the audio signal sample ([0035]-[0040], applying neural network models, time-bounded filter/frequency coefficients, [0052]-[0053] frequency masks)
based on the denoiser mask being determined prior to an expiration of the transformation time period, output a DNN denoised version of the audio signal sample via the time-frequency domain transformation pipeline by applying the denoiser mask to the audio signal sample; ([0039], [0046], time-bounded filter; filters are applied respective to the time when the audio is received and to a “time period of validity”, thus if within the time period of validity, the filter will be applied)
Crow et al. does not disclose wherein the DNN processing loop is configured to (ii) utilize machine learning to determine the denoiser mask at least approximately in parallel to the one or more transformation pipeline steps performed by the time-frequency domain transformation pipeline;
However, Zhang et al. does disclose wherein the DNN processing loop is configured to (ii) utilize machine learning to determine the denoiser mask at least approximately in parallel to the one or more transformation pipeline steps performed by the time-frequency domain transformation pipeline; (Figure 4, [0007], claims 1-4: using feature learning to predict and identify noise components and output a matrix to serve as a denoiser mask in between time-frequency transformation steps; processing unit 204 inherently enables multithreaded implementation; Figure 3: starting from block 322, two parallel paths are shown: one path uses the audio sample to determine the denoiser mask, while the other path provides the time/frequency transformed sample; STFT is an appropriate step for both pipelines)
based on the denoiser mask not being determined prior to the expiration of the transformation time period, output a default version of the audio signal sample via the time-frequency domain transformation pipeline without applying the denoiser mask to the audio signal sample. ([0046]: time-bounded filters have a limited time period of validity to which they can be applied to the signal; [0072]: use a default earpiece filter when there isn’t a new time-valid filter available)
It would have been obvious to one with ordinary skill in the art before the effective filing date to incorporate the DNN processing steps of Zhang et al. to produce a denoiser mask because by isolating noise from speech and creating a mask using deep learning techniques, the intelligibility of speech in audio can be enhanced (see Zhang et al. [0007]).
Crow and Zhang do not teach wherein the time-frequency domain transformation pipeline forms part of a digital signal processing process and is configured to add a delay to the audio signal sample.
However, Fitz does teach wherein the time-frequency domain transformation pipeline forms part of a digital signal processing process and is configured to add a delay to the audio signal sample. (Figure 3, [0034]: in a parallel unenhanced signal branch, include a delay unit)
It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention to include a delay unit in a parallel branch as taught by Fitz in the teachings of Crow and Zhang because the delay unit compensates for processing latency introduced in the parallel enhancement branch (see Fitz [0034]).
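For illustration only, the timing condition recited in claim 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation and not the method of any cited reference: the names `transform_pipeline`, `estimate_mask`, and `process_frame` are hypothetical, a sleep stands in for DNN inference, an identity copy stands in for the time-frequency transformation, and a thread pool stands in for whatever parallel hardware a real DSP would use.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def transform_pipeline(frame):
    """Hypothetical stand-in for the time-frequency transformation steps."""
    return list(frame)  # a real pipeline would compute an STFT here


def estimate_mask(frame, latency_s):
    """Hypothetical stand-in for the DNN loop: waits, then returns per-bin gains."""
    time.sleep(latency_s)
    return [0.5] * len(frame)


def process_frame(frame, dnn_latency_s, period_s, executor):
    """Return (output, mask_applied) for one frame under a deadline of period_s."""
    start = time.monotonic()
    # Launch the mask estimate so it runs alongside the transformation steps.
    future = executor.submit(estimate_mask, frame, dnn_latency_s)
    spectrum = transform_pipeline(frame)
    # Wait only for whatever is left of the transformation time period.
    remaining = max(0.0, period_s - (time.monotonic() - start))
    try:
        mask = future.result(timeout=remaining)
    except FutureTimeout:
        # Mask missed the deadline: output the default (unmasked) version.
        return spectrum, False
    # Mask arrived in time: output the denoised version.
    return [g * x for g, x in zip(mask, spectrum)], True
```

Calling `process_frame` with a DNN latency shorter than the period yields the masked output, and calling it with a latency longer than the period yields the unmodified frame. The sketch does not model the delay unit attributed to Fitz.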

Regarding claim 2, elements of parent claim 1 are disclosed as written above. Crow et al. further teaches the DSP apparatus wherein the instructions are further operable to cause the DSP apparatus to: 
based on the denoiser mask not being determined prior to expiration of the transformation period, apply a default denoiser mask associated with a default noise prediction to the audio signal sample associated with the time-frequency domain transformation pipeline. ([0072] default earpiece filter)
Regarding claim 3, elements of parent claim 1 are disclosed as written above.  Crow et al. further teaches the DSP apparatus wherein the instructions are further operable to cause the DSP apparatus to: 
based on the denoiser mask not being determined prior to expiration of the transformation period, apply a prior denoiser mask associated with a prior noise prediction to the audio signal sample associated with the time-frequency domain transformation pipeline. ([0061] using recently used filters)
Regarding claim 4, elements of parent claim 3 are disclosed as written above. Crow et al. further teaches the DSP apparatus wherein the instructions are further operable to cause the DSP apparatus to: modify the prior denoiser mask in response to applying the prior denoiser mask to the audio signal sample associated with the time-frequency domain transformation pipeline. ([0061] create a best estimate filter by averaging previously used filters)
Regarding claim 5, elements of parent claim 1 are disclosed as written above.  Crow et al. further teaches the DSP apparatus wherein the instructions are further operable to cause the DSP apparatus to: 
based on the denoiser mask not being determined prior to expiration of the transformation period, apply a passthrough denoiser mask configured without denoising to the audio signal sample associated with the time-frequency domain transformation pipeline. ([0072] store any number of fallback filter types)
Regarding claim 6, elements of parent claim 1 are disclosed as written above.  Crow et al. further teaches the DSP apparatus wherein the instructions are further operable to cause the DSP apparatus to: 
based on the denoiser mask not being determined prior to expiration of the transformation period, apply a band-pass shape denoiser mask to the audio signal sample associated with the time-frequency domain transformation pipeline. ([0072] fallback filters include a band-pass filter)
Regarding claim 7, elements of parent claim 1 are disclosed as written above.  Crow et al. further teaches the DSP apparatus wherein the instructions are further operable to cause the DSP apparatus to: 
based on the denoiser mask not being determined prior to expiration of the transformation period, apply a low-pass shape denoiser mask to the audio signal sample associated with the time-frequency domain transformation pipeline. ([0072] fallback filters include a low-pass filter)
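For illustration only, the fallback masks recited in claims 3 through 7 can be sketched as per-bin gain vectors. This is a hedged sketch, not taken from the application or from Crow et al.: the function names are hypothetical, the brick-wall (0 or 1) gains and the bin-index cutoffs are assumptions, and averaging earlier masks is only one possible reading of "modify the prior denoiser mask" that mirrors the averaging the examiner cites at Crow [0061].

```python
def passthrough_mask(n_bins):
    """Unity gain in every bin, so applying it performs no denoising."""
    return [1.0] * n_bins


def low_pass_mask(n_bins, cutoff_bin):
    """Keep bins at or below cutoff_bin; zero the rest (assumed brick-wall shape)."""
    return [1.0 if k <= cutoff_bin else 0.0 for k in range(n_bins)]


def band_pass_mask(n_bins, lo_bin, hi_bin):
    """Keep bins between lo_bin and hi_bin inclusive; zero the rest."""
    return [1.0 if lo_bin <= k <= hi_bin else 0.0 for k in range(n_bins)]


def averaged_prior_mask(prior_masks):
    """Bin-wise mean of earlier masks, as one way to derive a best-estimate mask."""
    count = len(prior_masks)
    return [sum(column) / count for column in zip(*prior_masks)]


def apply_mask(mask, spectrum):
    """Multiply each frequency bin by its gain."""
    return [g * x for g, x in zip(mask, spectrum)]
```

Any of these vectors could be passed to `apply_mask` when the DNN mask misses its deadline; which one is used is a design choice that the claims leave to the dependent claim in question.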
Regarding claim 8, elements of parent claim 1 are disclosed as written above. Crow et al. further teaches the DSP apparatus wherein the instructions are further operable to cause the DSP apparatus to: receive user denoiser control parameters; ([0011], [0076]: user preference data)
apply the user denoiser control parameters to the denoiser mask to generate a user-modified denoiser mask; ([0039], [0072] user feedback, preference data)
based on the user-modified denoiser mask being determined prior to expiration of the transformation period, apply the user-modified denoiser mask to the frequency domain version of the audio signal sample associated with the time-frequency domain transformation pipeline to generate a user-modified denoised audio signal sample. ([0039] time-bounded filter)
Regarding claim 10, elements of parent claim 1 are disclosed as written above. Crow et al. further teaches the DSP apparatus wherein the denoiser mask is applied to a first frequency domain audio signal sample, and wherein the instructions are further operable to cause the DSP apparatus to: transform the audio signal sample into the first frequency domain audio signal sample via the time-frequency domain transformation pipeline; ([0025]-[0028] frequency domain conversion)
and transform the audio signal sample into a second frequency domain audio signal sample via the DNN processing loop. ([0025]-[0028], [0035] applying neural network model)
provide the second frequency domain audio signal sample to a DNN model that is configured to determine the denoiser mask; ([0035]-[0040], applying neural network model)
based on the denoiser mask being determined prior to expiration of the transformation period, apply the denoiser mask to the first frequency domain audio signal sample. ([0039] time-bound filter)
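For illustration only, the data flow recited in claim 10 (one frequency-domain copy in the transformation pipeline, a second copy in the DNN loop, and a mask estimated from the second but applied to the first) can be sketched as follows. This is a sketch under assumptions: `dft`, `toy_mask_model`, and `apply_mask` are hypothetical names, a naive DFT stands in for both transforms, and a magnitude threshold stands in for the DNN model. Nothing here is drawn from the application or the cited references.

```python
import cmath


def dft(samples):
    """Naive discrete Fourier transform, used here for both signal paths."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def toy_mask_model(spectrum, floor):
    """Hypothetical stand-in for the DNN: keep bins whose magnitude exceeds floor."""
    return [1.0 if abs(value) > floor else 0.0 for value in spectrum]


def apply_mask(mask, spectrum):
    """Multiply each frequency bin by its gain."""
    return [g * x for g, x in zip(mask, spectrum)]


frame = [1.0, 0.0, -1.0, 0.0]
first_spectrum = dft(frame)    # pipeline path: the copy that will be output
second_spectrum = dft(frame)   # DNN loop path: the copy fed to the model
mask = toy_mask_model(second_spectrum, floor=1.0)
denoised = apply_mask(mask, first_spectrum)
```

In a real system the two transforms need not be identical; for example, the DNN path might use a different frame size or feature representation than the output path.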
Regarding claim 11, elements of parent claim 1 are disclosed as written above. Crow et al. further teaches the DSP apparatus wherein the instructions are further operable to cause the DSP apparatus to: modify frequency of one or more portions of the audio signal sample to generate a modified audio signal sample; ([0028]-[0030], frequency pre-processing)
transform the modified audio signal sample into a frequency domain audio signal sample; modify frequency of one or more portions of the frequency domain audio signal sample to generate a modified frequency domain audio signal sample; ([0025]-[0028] frequency domain conversion)
and determine the denoiser mask associated with the noise prediction based on the modified frequency domain audio signal sample. ([0035]-[0040], applying neural network models, time-bounded filter/frequency coefficients, [0052]-[0053] frequency masks)
Regarding claim 12, elements of parent claim 1 are disclosed as written above. Crow et al. further teaches the DSP apparatus wherein the instructions are further operable to cause the DSP apparatus to: provide the modified frequency domain audio signal sample to a DNN model that is configured to determine the denoiser mask; ([0035]-[0040], applying neural network models, time-bounded filter/frequency coefficients, [0052]-[0053] frequency masks)
based on the denoiser mask being determined prior to expiration of the transformation period, apply the denoiser mask to the audio signal sample associated with the time-frequency domain transformation pipeline. ([0039], time-bounded filter)
Regarding claim 13, elements of parent claim 1 are disclosed as written above. Crow et al. further teaches the DSP apparatus wherein the instructions are further operable to cause the DSP apparatus to: perform spatial filtering of the denoiser mask to generate an optimized denoiser mask; ([0058], [0061]: spatial filtering)
based on the optimized denoiser mask being determined prior to expiration of the transformation period, apply the optimized denoiser mask to the audio signal sample associated with the time-frequency domain transformation pipeline to generate an optimized denoised audio signal sample associated with the at least one microphone. ([0039], time-bounded filter)
Regarding claim 14, elements of parent claim 1 are disclosed as written above. Crow et al. further teaches the DSP apparatus wherein the instructions are further operable to cause the DSP apparatus to: apply the denoiser mask associated with the noise prediction to the audio signal sample associated with the time-frequency domain transformation pipeline based on a user bypass input parameter associated with the time-frequency domain transformation pipeline satisfying a defined bypass criterion. ([0031], [0076], [0079]: receive user input for feedback or user-defined parameters; raw audio mode)
Regarding claim 15, elements of parent claim 1 are disclosed as written above. Crow et al. further teaches the DSP apparatus wherein the instructions are further operable to cause the DSP apparatus to: provide the audio signal sample to a DNN model of the DNN processing loop that is configured to predict whether the audio signal sample includes one or more signals of interest and to configure the denoiser mask based on the one or more signals of interest; ([0035]-[0040], applying neural network models, detecting trigger conditions)
based on the denoiser mask being determined prior to expiration of the transformation period, scale active noise cancellation associated with the audio signal sample based on the denoiser mask. ([0039], time-bounded filter)
Regarding claim 17, it is analogous to claim 1 and is rejected in a similar fashion. 
Regarding claim 18, elements of parent claim 17 are disclosed as written above. Crow et al. further teaches the computer-implemented method:
wherein outputting the default version of the audio signal sample via the time-frequency domain transformation comprises applying a default denoiser mask associated with a default noise prediction to the audio signal sample associated with the time-frequency domain transformation pipeline. ([0072] storing a default earpiece filter or default filter parameters)
Regarding claim 19, elements of parent claim 17 are disclosed as written above. Crow et al. further teaches the computer-implemented method further comprising: 
wherein outputting the default version of the audio signal sample via the time-frequency domain transformation pipeline comprises applying a predicted denoiser mask to the audio signal sample associated with the time-frequency domain transformation pipeline. ([0061] infer possible filters in case of connection problems)
Regarding claim 20, elements of parent claim 17 are disclosed as written above. Crow et al. further teaches the computer-implemented method further comprising: 
wherein outputting the default version of the audio signal sample via the time-frequency domain transformation pipeline comprises applying a prior denoiser mask configured without denoising to the audio signal sample associated with the time-frequency domain transformation pipeline. ([0061] using recently used filters)
Regarding claim 21, it is analogous to claim 1 and is rejected in a similar fashion.
Regarding claim 22, elements of parent claim 1 are disclosed as written above. Crow et al. further teaches the computer-implemented method further comprising: 
generate the default version of the audio signal sample based on a default denoiser mask associated with a default noise prediction for the time-frequency domain transformation pipeline, a prior denoiser mask associated with a prior noise prediction provided by the DNN processing loop, a passthrough mask for the time-frequency domain transformation pipeline that is configured without denoising, a band-pass shape denoiser mask associated with band-pass filtering, a low-pass shape denoiser mask associated with low-pass filtering, or default DSP processing associated with the time-frequency domain transformation pipeline. ([0072]: storing default filter parameters, cached filter parameters, high/low/band pass, etc.)

Claim(s) 9 is/are rejected under 35 U.S.C. 103 as being unpatentable over Crow et al. in view of Zhang et al. as applied to claim 1 above, and further in view of Vilermo et al. (US 20190088267 A1).
Regarding claim 9, elements of parent claim 1 are disclosed as written above. Crow et al. does not disclose the DSP apparatus wherein the instructions are further operable to cause the DSP apparatus to: generate a dynamic noise reduction interface object that is configured to cause a client device to render a dynamic noise reduction interface to visually indicate a degree of noise reduction provided by the denoiser mask;
and output the dynamic noise reduction interface object to the client device.
Vilermo et al. does disclose the DSP apparatus wherein the instructions are further operable to cause the DSP apparatus to: generate a dynamic noise reduction interface object that is configured to cause a client device to render a dynamic noise reduction interface to visually indicate a degree of noise reduction provided by the denoiser mask; ([0113], [0116]: render a menu that displays and allows the user to select a level of noise reduction)
and output the dynamic noise reduction interface object to the client device. ([0112]-[0113]: displaying noise reduction menu)
While Crow et al. discloses a client device that can take user input to interact with the noise reduction system, it does not specify an interface object to indicate degrees of noise reduction. Vilermo et al. does teach this concept, and it would have been obvious to one with ordinary skill in the art before the effective filing date to include a noise reduction interface as taught by Vilermo et al. because it enables the user to control the noise reduction systems supplied by the invention (Vilermo et al. [0110]).

Claim(s) 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Crow et al. in view of Zhang et al. as applied to claim 1 above, and further in view of Baby et al. ("Machines hear better when they have ears").
	Regarding claim 16, elements of parent claim 1 are disclosed as written above. Crow et al. does not disclose the DSP apparatus wherein the audio signal sample is an otoacoustic emissions signal sample, and wherein the instructions are further operable to cause the DSP apparatus to: provide the otoacoustic emissions signal sample to a DNN model of the DNN processing loop that is configured to predict whether the otoacoustic emissions signal sample includes one or more signals of interest and to configure the denoiser mask based on the one or more signals of interest.
Baby et al. does disclose the DSP apparatus wherein the audio signal sample is an otoacoustic emissions signal sample, and wherein the instructions are further operable to cause the DSP apparatus to: provide the otoacoustic emissions signal sample to a DNN model of the DNN processing loop that is configured to predict whether the otoacoustic emissions signal sample includes one or more signals of interest and to configure the denoiser mask based on the one or more signals of interest. (pages 4-5, nonlinear transmission line model: simulated otoacoustic emissions signals are input to a noise reduction DNN)
While Crow et al. includes inputting audio signals to a DNN as part of a noise reduction process, it does not mention inputting otoacoustic emission signals. Baby et al. does teach this concept, and it would have been obvious to one with ordinary skill in the art before the effective filing date to include otoacoustic emissions signals as an input because including biophysical features can improve the effectiveness of speech enhancement and noise reduction systems (Baby et al. page 1, 5). 

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALVIN ISKENDER whose telephone number is (703)756-4565. The examiner can normally be reached M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, HAI PHAN can be reached on (571) 272-6338. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ALVIN ISKENDER/Examiner, Art Unit 2654                  

/HAI PHAN/Supervisory Patent Examiner, Art Unit 2654


Prosecution Timeline


Jun 28, 2025

Examiner Interview Summary

Aug 18, 2025

Non-Final Rejection mailed — §103

Nov 05, 2025

Interview Requested

Nov 12, 2025

Applicant Interview (Telephonic)

Nov 15, 2025

Examiner Interview Summary

Nov 18, 2025

Response Filed

Jan 15, 2026

Final Rejection mailed — §103

Mar 16, 2026

Response after Non-Final Action

Precedent Cases

Applications granted by this same examiner with similar technology

17/650,876

Patent 12632658

SYSTEM AND METHODS FOR KEY-PHRASE EXTRACTION

4y 3m to grant Granted May 19, 2026

17/188,310

Patent 12562244

COMBINING DOMAIN-SPECIFIC ONTOLOGIES FOR LANGUAGE PROCESSING

4y 12m to grant Granted Feb 24, 2026

17/911,224

Patent 12531078

NOISE SUPPRESSION FOR SPEECH ENHANCEMENT

3y 4m to grant Granted Jan 20, 2026

17/926,994

Patent 12505825

SPONTANEOUS TEXT TO SPEECH (TTS) SYNTHESIS

3y 1m to grant Granted Dec 23, 2025

17/750,973

Patent 12456457

ALL DEEP LEARNING MINIMUM VARIANCE DISTORTIONLESS RESPONSE BEAMFORMER FOR SPEECH SEPARATION AND ENHANCEMENT

3y 5m to grant Granted Oct 28, 2025

Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

4-5

Expected OA Rounds

48%

Grant Probability

99%

With Interview (+60.3%)

3y 3m (~0m remaining)

Median Time to Grant

High

PTA Risk

Based on 25 resolved cases by this examiner. Grant probability derived from career allowance rate.
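As a quick check of the arithmetic, the 48% figure follows directly from the counts reported above (12 granted out of 25 resolved). The short sketch below reproduces only that ratio; the function name `allowance_rate` is hypothetical, and the page does not state how the with-interview figure is computed, so that figure is not modeled here.

```python
def allowance_rate(granted, resolved):
    """Share of resolved cases that ended in a grant."""
    return granted / resolved


rate = allowance_rate(12, 25)
print(f"{rate:.0%}")  # prints 48%
```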