DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments filed 10/09/2025 have been fully considered but they are not persuasive.
Applicant argues that the amendment traverses the § 102(a)(1) rejection over Wu et al. (CN 112259116 A, “Wu”), asserting that Wu does not disclose (1) two parallel noise-reduction paths and (2) “weighting circuitry” in the second path that modifies a second noise filter based on a first noise filter to generate a third noise filter.
In response:

I. WU’S DISCLOSURE OF TWO PARALLEL NOISE-REDUCTION PATHS

Applicant emphasizes Wu’s FIG. 4 modules 420 and 430 as being “in series,” but a closer reading shows they are in parallel with respect to the same input and are merged at the end.
A. Common input and merged output
• Wu ¶ [0010] and FIG. 4: Both the first module 420 and the second module 430 receive “molecular band” (sub-band) inputs derived from the same audio frame data (the “input”). They do not pass serially one into the other; rather, each processes its own sub-band(s) simultaneously.
• Wu ¶ [0066]: The noise-reduced outputs of modules 420 and 430 are recombined (“determine the noise-reduced audio frame data based on … first sub-band and other sub-bands”) and then resynthesized into a single time-domain signal. This recombination is consistent only with parallel processing.
B. Applicant’s “series” interpretation misconstrues the “first” and “second” modules
• Wu ¶ [0050]–¶ [0056] (first module 420 = model-based denoising of one sub-band) and ¶ [0059]–¶ [0066] (second module 430 = gain mapping for other sub-bands) operate on disjoint sub-band sets.
• They run in parallel on the same frame; they are not chained.
Accordingly, Wu teaches splitting the input into two (first-subband vs. other-subband) denoising paths working in parallel.
II. WU’S GAIN MAPPING = “WEIGHTING CIRCUITRY” MODIFYING THE SECOND FILTER BASED ON THE FIRST FILTER

Applicant contends that Wu lacks “weighting circuitry” that “modifies the second noise filter based … on the first noise filter.” But the entire purpose of Wu’s second module 430 is to calculate gains for the “other” sub-bands “based on the gain of the first sub-band” (¶ [0060]) and then apply those gains as noise-reduction filters (¶ [0065]).
A. Wu discloses determining second-path gains from the first-path gain
• Wu ¶ [0060]: “based on the gain of the first subband, determine the gains of other subbands.”
• Embodiment 4 (¶ [0143]–¶ [0161]): The formulas for avgGainH, avgProbH, gainH, and gain explicitly compute other-subband gains from the first-subband gain.
B. Wu applies those gains as filters in the second path
• Wu ¶ [0065]: “perform noise reduction processing on the other subbands according to the gains of the other subbands.”
• Wu ¶ [0066]: These gains are applied to the frequency-domain (or time-domain) samples of the other sub-bands as multiplicative filters.
C. This is precisely “weighting circuitry” that modifies the second noise filter based on the first noise filter to produce a third filter
• The first noise filter in Wu = the model-derived gain applied to the first sub-band.
• Wu’s second module uses that gain to compute, and thereby modify, the second-path filter gains.
• The result (the “third filter”) is the gain-adjusted filter applied to the second path’s sub-bands.
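For illustration only (this sketch is not part of the cited record), the gain-mapping relationship described above can be summarized in code. The names mirror Wu’s avgGainH, avgProbH, and gainH quantities, but the specific blending formulas and the 0.5 threshold below are hypothetical stand-ins, not Wu’s exact equations:

```python
def map_other_subband_gains(first_band_gains, vad, prob_threshold=0.5):
    """Illustrative sketch: derive a gain for the 'other' sub-bands from the
    first sub-band's model gains and VAD probability (cf. Wu's Embodiment 4).
    Formulas and threshold are hypothetical stand-ins."""
    # Average gain of the first sub-band (cf. Wu's avgGainH).
    avg_gain_h = sum(first_band_gains) / len(first_band_gains)
    # Blend VAD with the average gain as a valid-sound indicator (cf. avgProbH).
    avg_prob_h = 0.5 * (vad + avg_gain_h)
    # Candidate gain for the other sub-bands (cf. gainH).
    gain_h = avg_gain_h * vad
    # Piecewise combination (cf. Wu's final gain): preserve signal when valid
    # sound is likely, suppress more aggressively otherwise.
    if avg_prob_h > prob_threshold:
        return max(gain_h, avg_gain_h)   # avoid over-suppressing speech
    return min(gain_h, avg_gain_h)       # favor noise suppression
```

The point of the sketch is structural: the second-path gain is a function of first-path outputs (gains and VAD), which is the claimed "modification based on the first noise filter."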
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-14 and 19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Wu et al. (CN 112259116 A) (the Espacenet translation provided by applicant is being used for clarity purposes).
Regarding claim 1, Wu discloses an apparatus for generating an output audio signal (60) by suppressing noise in a first input spectrum (20) and noise in a second input spectrum (22), said first and second input spectra (20, 22) having been obtained from an input audio signal (12) (i.e., Wu discloses framing input audio and dividing it into sub-bands (sub-band spectra) derived from the input audio signal, then denoising the sub-bands and synthesizing a time-domain output; see Wu ¶[0042]–¶[0046], Embodiment 3 S310–S320 ¶[0116]–¶[0123], and ¶[0134]–¶[0139] (IFFT/synthesis to output audio)), wherein said first input spectrum (20) represents first energy, wherein said second input spectrum (22) represents second energy, wherein said first energy is energy that is present in said input audio signal (12) and that is within a first frequency band, and wherein said second energy is energy that is present in said input audio signal (12) and that is within a second frequency band (i.e., Wu discloses sub-bands corresponding to different frequency intervals; the first sub-band is chosen as a low-frequency interval and the other sub-bands correspond to other (higher) frequency intervals; see Wu ¶[0046]–¶[0049], ¶[0074]–¶[0077]), said apparatus comprising a hybrid noise-reducer (10) that splits the input audio signal into a first noise-reduction path (24) and a second noise-reduction path (36) for parallel processing relative to the first noise-reduction path (24), wherein said first noise-reduction path receives said first input spectrum (20), and wherein said second noise-reduction path (36) receives said second input spectrum (22) (i.e., Wu discloses processing the first sub-band via a noise-reduction model (first module 420) and processing the other sub-bands via gain mapping (second module 430); these modules operate on disjoint sub-bands of the same frame and their outputs are later combined, i.e., parallel per-frame processing; see Wu ¶[0050]–¶[0056] (first module), ¶[0059]–¶[0066] (second module), FIG. 4 (420, 430), and ¶[0066]–¶[0067] (combination into a denoised frame)), wherein said first noise-reduction path (24) is configured to apply a first noise-reduction method to said first input spectrum (20) by producing a first noise filter (26) for reducing noise in said first input spectrum (20) (i.e., Wu describes inputting first sub-band frequency-domain information to a pre-trained noise-reduction model (e.g., RNNoise, RNN, CNN), which outputs denoised frequency-domain information and a gain for each frequency point in the first sub-band, functionally a first-path filter; see Wu ¶[0050]–¶[0056], Embodiment 1 S120), wherein said second noise-reduction path (36) is configured to apply a second noise-reduction method to said second input spectrum (22) by producing a second noise filter (50) for reducing noise in said second input spectrum (22) (i.e., Wu discloses computing gains for the other sub-bands based on the first sub-band’s gain (and decision probabilities) and applying those gains to the other sub-bands; the computed gains act as second-path filters for those sub-bands; see Wu ¶[0059]–¶[0066], Embodiment 3 S330), and wherein said second noise-reduction path (36) comprises weighting circuitry (52) that modifies said second noise filter (50) based at least in part on said first noise filter (26), thereby generating a third noise filter (38) (i.e., Wu explicitly teaches that gains for the other sub-bands are determined based on the gain(s) output by the first sub-band model and on the decision probability (vad); Wu provides formulas (avgGainH, avgProbH, gainH, final gain) showing modification of other-subband gains using first-subband information, i.e., weighting circuitry that modifies the second-path filter to produce the applied filter; see Wu ¶[0060]–¶[0066] and Embodiment 4 ¶[0144]–¶[0155] (formulas and explanation)).
Regarding claim 2, Wu discloses the apparatus of claim 1, wherein said hybrid noise-reduction system (10) further comprises multipliers (28, 40) that are configured to apply said first noise filter (26) to said first input spectrum (20) and to apply said third noise filter (38) to said second input spectrum (22) to yield a filtered first input spectrum (30) and a filtered second input spectrum (42), respectively (i.e., Wu discloses multiplicative application of model output gains to the first sub-band and application of mapped gains to the other sub-bands; Wu’s equations and descriptions show multiplying spectral or time-domain sub-band signals by gains to produce filtered outputs (see time-domain multiplication and frequency-domain gain application); see Wu ¶[0050]–¶[0056] (first gain application), ¶[0080]–¶[0083] (time-domain application), Embodiment 3 ¶[0136]–¶[0139] (frequency-domain multiplication and splicing)).
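As an illustration only of the multiplicative gain application just described (the function below is a hypothetical sketch, not code from Wu or the application):

```python
def apply_gain_filter(spectrum, gains):
    """Apply per-bin gains (0..1) multiplicatively to spectral samples,
    as in the per-sub-band gain application Wu describes."""
    if len(spectrum) != len(gains):
        raise ValueError("spectrum and gains must be the same length")
    return [s * g for s, g in zip(spectrum, gains)]
```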
Regarding claim 3, Wu discloses the apparatus of claim 1, wherein said hybrid noise-reduction system (10) further comprises stacking circuitry (54) that combines filtered first and second input spectra (30, 42) into an output spectrum (56) that represents a frequency-domain representation (16) of said input audio signal (12) with noise having been suppressed therein (i.e., Wu discloses splicing/combining the denoised first sub-band and the denoised other sub-bands into a full-band frequency-domain representation prior to inverse transformation (S340/S350); see Wu ¶[0130]–¶[0136] and ¶[0137]–¶[0139] (splicing and IFFT)).
Regarding claim 4, Wu discloses the apparatus of claim 1, further comprising a transform circuit (14) that receives said input audio signal (12) and provides a frequency-domain representation of said input audio signal (12) from which said first and second input spectra (20, 22) are obtained (i.e., Wu discloses performing FFT or STFT to convert framed audio to frequency-domain X(k), then grouping into sub-bands. See Wu Embodiment 3 S310–S320 ¶[0116]–¶[0123]).
Regarding claim 5, Wu discloses the apparatus of claim 1, further comprising a transform circuit (14) that is configured to carry out a short-term Fourier transform of said input audio signal (12) (i.e., the frequency-domain transformation is an FFT/STFT of frames (L-point FFT) of the audio; Wu explicitly discusses FFT and IFFT for frames. See Wu ¶[0127]–¶[0134], Embodiment 3 S310).
Regarding claim 6, Wu discloses the apparatus of claim 1, wherein said hybrid noise-reduction system (10) further comprises inverse-transform circuitry (58) that converts an output spectrum (56) into said output audio signal (60), said output spectrum (56) being representative of a frequency domain representation of said input audio signal (12) with noise having been suppressed therein (i.e., Wu discloses inverse transformation (IFFT or synthesis filter bank) of the combined denoised frequency-domain information to produce time-domain enhanced audio. See Wu ¶[0134]–¶[0139]; Embodiment 2 ¶[0102]–¶[0109]).
Regarding claim 7, Wu discloses the apparatus of claim 1, wherein said hybrid noise-reduction system (10) further comprises inverse-transform circuitry (58) that carries out an inverse short-term Fourier transform to convert an output spectrum (56) into said output audio signal (60), said output spectrum (56) being representative of a frequency-domain representation (16) of said input audio signal (12) with noise having been suppressed therein (i.e., Wu expressly describes performing inverse FFT (IFFT) on the combined spectrum to yield enhanced time-domain audio; this corresponds to inverse STFT per-frame synthesis. See Wu ¶[0134]–¶[0139]).
Regarding claim 8, Wu discloses the apparatus of claim 1, wherein said first noise-reduction path (24) comprises a dynamic neural network (34) that produces said first noise filter (26) based on features extracted from said first input spectrum (20) (i.e., Wu identifies the noise reduction model as RNNoise / RNN / CNN or similar neural models; the model consumes first sub-band frequency-domain information and outputs denoised frequency-domain information and a gain; this is a neural network producing a filter/gain based on input (spectral) features. See Wu ¶[0051]–¶[0056] (model types and outputs), ¶[0055] (training)).
Regarding claim 9, Wu discloses the apparatus of claim 1, wherein said first noise-reduction path (24) is configured to provide a voice-activity signal (62) indicative of voice activity in said first input spectrum (20) and to provide said voice-activity signal (62) to said weighting circuitry (52) for use in modifying said second filter (50) (i.e., Wu states that the model can output a decision probability (vad) indicating whether the frame/sub-band contains valid sound (e.g., human voice) and uses that vad in the gain-mapping computation for the other sub-bands (avgProbH and subsequent gain calculations); see Wu ¶[0063]–¶[0064], ¶[0148]–¶[0154] (vad used in avgProbH)).
Regarding claim 10, Wu discloses the apparatus of claim 1, wherein said first noise-reduction path (24) comprises a dynamic neural network (34) that was trained using frequencies in said first band (i.e., Wu describes training the noise reduction model using sample frame data in the same frequency range as the first sub-band and indicates that training can be performed only on first-subband data to increase efficiency for high sampling rates. See Wu ¶[0055]–¶[0059]).
Regarding claim 11, Wu discloses the apparatus of claim 1, wherein said second noise-reduction path (36) comprises an estimator (44) and a filter calculator (48) that determines said second noise filter (50) based on a noise estimate provided by said estimator (44) (i.e., Wu discloses determining gains for other sub-bands using computed metrics (avgGainH, vad, avgProbH) that operate as an estimate of valid-signal/noise relationship; Wu’s described operations perform the functions of an estimator and a filter-calculation stage that determines gains to apply to other sub-bands. See Wu ¶[0144]–¶[0155] (avgGainH, avgProbH, gainH, final gain) and ¶[0059]–¶[0066] (calculation and application of gains)).
Regarding claim 12, Wu discloses the apparatus of claim 1, wherein said weighting circuitry (52) is configured to modify said second noise filter (50) to cause said third noise filter (38) to suppress noise that would not have been suppressed by said second noise filter (50) had said second noise filter (50) been applied to said input remainder-spectrum (22) (i.e., gain-mapping uses first-subband gain and vad to alter the gains applied to other sub-bands; the computed applied gain can increase suppression in other sub-bands beyond what an unmodified second-path filter might have suppressed, depending on avgGainH/avgProbH/gainH combinations. See Wu ¶[0148]–¶[0154] and ¶[0152]–¶[0154] (gain formula combining gainH and avgGainH), and ¶[0060]–¶[0066] (motivation and operation of gain mapping)).
Regarding claim 13, Wu discloses the apparatus of claim 1, wherein said weighting circuitry (52) is configured to modify said second noise filter (50) to prevent said third noise filter (38) from suppressing power present in said input remainder-spectrum that would have been suppressed by said second noise filter (50) had said second noise filter (50) been applied to said input remainder spectrum (22) (i.e., piecewise/composite gain computation (weighted mix of avgGainH and gainH based on avgProbH threshold) is expressly designed to balance between suppression and preservation depending on detected valid-sound probability (prevent oversuppression when valid sound is present). See Wu ¶[0151]–¶[0154] (piecewise combining gainH and avgGainH) and surrounding text describing rationale for preserving desired signal when vad/high avgGainH indicates presence of valid sound).
Regarding claim 14, Wu discloses the apparatus of claim 1, wherein there exists a first probability and a second probability, wherein said first probability is a probability that speech is present in said input remainder-spectrum (22), wherein said second probability is a conditional probability that speech is present in said input remainder-spectrum (22) given information concerning the presence of speech in said input base-spectrum (20), and wherein said weighting circuitry (52) is configured to modify said second noise filter (50) based on a function of said first and second probabilities (i.e., Wu discloses vad as a decision probability for the first sub-band, defines avgProbH combining vad and avgGainH (a function blending indicators), and uses avgProbH to compute gainH and the final gain; avgProbH and the final gain are functions combining probabilities/metrics of speech presence across bands (conditional/combined reasoning), and Wu explicitly uses these to modify other-subband gains; see Wu ¶[0148]–¶[0154] and ¶[0151] for formulas; ¶[0063]–¶[0064] (VAD concept)).
Regarding claim 19, Wu discloses a method comprising reducing noise in an input audio signal, wherein reducing said noise comprises splitting a frequency-domain representation of said input audio signal into first and second input spectra that are processed in parallel (i.e., FFT to obtain frequency-domain X(k), grouping into sub-bands S(m,k) (first/other sub-bands), processing first and other sub-bands separately (parallel per-frame) and later recombining. See Wu Embodiment 3 S310–S320 ¶¶[0116]–[0123]; ¶[0130]–¶[0136]; ¶[0066]–¶[0067]), using a first noise-reduction method, generating a first filter for reducing noise in said first input spectrum, thereby generating a first output spectrum (i.e., Wu: model-based denoising of first sub-band produces denoised frequency-domain information and per-frequency gains (first filter), yielding denoised first-subband output. See Wu ¶[0050]–¶[0056]; Embodiment 3 S320), using a second noise-reduction method that includes use of information obtained from having used said first noise-reduction method, generating a second filter for reducing noise in said second input spectrum, thereby generating a second output spectrum filtered by a third filter (i.e., second path uses information from the first path (first sub-band gain and VAD) to compute gains (second filter inputs) and applies the resulting weighted gain (third filter) to the other sub-bands, producing a second output spectrum. See Wu ¶¶[0060]–[0066]; Embodiment 4 ¶¶[0144]–[0155]), and outputting a time-domain signal formed from having transformed a frequency-domain signal that resulted from having combined said first and second output spectra (i.e., splice the denoised sub-band spectra into a full-band spectrum and perform IFFT/synthesis to produce the denoised time-domain audio. See Wu Embodiment 3 S340–S350 ¶¶[0126]–[0135]; ¶¶[0066]–[0067]).
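The per-frame flow recited in claim 19 as mapped to Wu (split the spectrum, filter each path, splice before the inverse transform) can be sketched as follows. This is an illustrative sketch only; the function name and the use of a single scalar second-path gain are assumptions made for clarity:

```python
def denoise_frame(spectrum, split_bin, first_gains, other_gain):
    """Illustrative per-frame flow: split one frame's spectrum at split_bin,
    filter the first sub-band with per-bin model gains (path 1) and the other
    sub-bands with a mapped scalar gain (path 2), then splice the results.
    The spliced spectrum would then go through an IFFT/synthesis stage."""
    first, other = spectrum[:split_bin], spectrum[split_bin:]
    first_out = [s * g for s, g in zip(first, first_gains)]  # path 1
    other_out = [s * other_gain for s in other]              # path 2
    return first_out + other_out                             # splice (pre-IFFT)
```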
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 15 and 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wu in view of Atti et al. (US PG-Pub No. US 20150380008 A1), hereinafter Atti.
Regarding claim 15, Wu discloses the limitations of claim 1, but does not specify the upper bound of the input base-spectrum. However, Atti teaches:
wherein said input base-spectrum has an upper bound of seven kilohertz ([20] In wideband (WB) applications, such as cellular telephony and voice over internet protocol (VoIP), signal bandwidth may span the frequency range from 50 Hz to 7 kHz.).
Wu and Atti are analogous art because they are from a similar field of endeavor, performing operations in the frequency bands spanning human speech. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to set the upper bound of the base frequency band to that of Atti. It would have been obvious to try the known technique of setting the frequency band to that of wideband audio, a commonly chosen band for human speech transmission. Setting the band to a maximum of 7 kHz would pick up much of typical human speech and some harmonics, resulting in the benefit of an accurate depiction of human speech.
Regarding claim 16, Wu discloses the limitations of claim 1, but does not specify whether the first band and the other sub-bands share a bound. However, Atti teaches:
wherein said remainder base-spectrum has a lower bound that is equal to an upper bound of said input base-spectrum ([43] For example, the low-band signal 122 and the high-band signal 124 may occupy non-overlapping frequency bands of 50 Hz-7 kHz and 7 kHz-16 kHz.).
Wu and Atti are analogous art because they are from a similar field of endeavor, performing operations in the frequency bands spanning human speech. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to set the upper bound of the base frequency band and the lower bound of the remainder band to those of Atti. It would have been obvious to try the known technique of setting adjoining band bounds (the upper bound of the base band and the lower bound of the remainder band) to the same value. This is a common solution because it covers the entire frequency band of interest. Trying the known technique would result in the entire desired spectrum being covered by the two filters together, yielding the known benefit of complete coverage of the band so as to accurately pick up all required frequencies.
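For illustration, the shared-boundary arrangement discussed above amounts to mapping one frequency bound to a single FFT bin index that both bands share. The function below is a hypothetical sketch (the name, parameters, and defaults are assumptions, not from the record):

```python
def band_split_bin(fs_hz, n_fft, split_hz=7000.0):
    """Map a band boundary in Hz (e.g., Atti's 7 kHz) to an FFT bin index k,
    so the base band covers bins [0, k) and the remainder band covers
    bins [k, n_fft // 2]; both bands meet at the same bin."""
    return round(split_hz * n_fft / fs_hz)
```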
Claim(s) 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wu in view of Romit Choudhury (NPL - “Sound Inaudible to Humans Can Be Recorded by Microphones”), hereinafter Choudhury.
Regarding claim 17, Wu discloses the limitations of claim 1, but does not specify the cap frequency. However, Choudhury teaches:
wherein said remainder base-spectrum has an upper bound that is equal to twenty-four kilohertz ([PG1] Microphones, from those in smartphones to hearing aids, are built specifically to hear the human voice — humans can’t hear at levels higher than 20 kHz, and microphones max out at around 24 kHz, meaning that microphones only capture the sound humans hear.).
Wu and Choudhury are analogous art because they are from a similar field of endeavor, capturing the full range of microphone frequencies. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to set the cap frequency to that of the microphones disclosed by Choudhury. It would have been obvious to try the known technique of setting the frequency cap to 24 kHz, as it would allow pickup of the maximum frequency of consumer microphones, which extend to around 24 kHz. Using the known technique would result in the benefit of complete frequency coverage of all such microphones while reducing computation time for any frequencies above the cap.
Claim(s) 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wu in view of audioXpress (NPL – “Sennheiser Unveils New Unified Communications and Collaboration Solutions”), hereinafter audioXpress.
Regarding claim 18, Wu discloses the limitations of claim 1, but does not specify the cap frequency. However, audioXpress teaches:
wherein said remainder base-spectrum has an upper bound that is equal to 11.5 kilohertz ([PG2] The SDW 5000 Series is capable of transmitting sounds in the super wideband range from 150 Hz all the way up to 11.5 kHz. The SDW 5000’s dual- microphone noise cancellation filters background noise, and the own-voice-detector helps suppress disturbances in between words, for quality transmissions with minimized detractions. See image for “Frequency ranges in call mode”.).
Wu and audioXpress are analogous art because they are from a similar field of endeavor, capturing microphone frequencies. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to set the cap frequency to that of the microphone disclosed by audioXpress. It would have been obvious to try the known technique of setting the frequency cap to 11.5 kHz, as it would allow capture of the full range of a typical consumer microphone, which can extend to around 11.5 kHz. Using the known technique would result in the benefit of complete frequency coverage of a typical consumer microphone while reducing computation time for any frequencies above the cap.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US-20210012767-A1, Kupryjanow et al., Real-Time Dynamic Noise Reduction Using Convolutional Networks.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PIERRE LOUIS DESIR whose telephone number is (571)272-7799. The examiner can normally be reached Monday-Friday 9AM-5:30PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/PIERRE LOUIS DESIR/Supervisory Patent Examiner, Art Unit 2659