DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments filed 17 February 2026 have been fully considered but they are not persuasive.
Applicant argues that Cohen in view of Wung does not teach the claimed equation for the prior probability of speech absence. As applicant notes, the rejection does not rely on Cohen alone to read on the claimed equation; it relies on the combination of the references. The differences between Cohen’s equation 29 and applicant’s claimed equation are addressed by the cited portions of Wung. Paragraph [0007] of Wung, partially quoted by applicant, states “The Wiener filter is updated (its filter coefficients are computed) based on a speech presence probability (SPP), and the latter in turn is computed based on an a priori speech presence probability (a priori SPP.) The latter is computed by a multi-channel voice activity detector (MVAD), whose two input thresholds are dynamically adapted and are derived from i) an instantaneous a priori signal to noise ratio (SNR), and ii) an average a priori SNR, of the multi-channel dereverberated signal”. The MVAD referred to by applicant and cited in the prior rejection calculates an a priori speech presence probability, as the claim requires. The two input thresholds are dynamically adapted and are based on “a characteristic of noise distribution” as required by the claim, because signal to noise ratio is a characteristic of noise distribution. Thus, the modification of Cohen by Wung renders the claim limitation obvious to one of ordinary skill in the art.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 4-13, 15-16, and 19-21 are rejected under 35 U.S.C. 103 as being unpatentable over Cohen in view of Wung (US 20180350379 A1).
Claim 1: Cohen discloses a method for suppressing noise by quickly calculating a speech presence probability, comprising:
obtaining an input signal, and converting the input signal from a time-domain signal to a frequency-domain signal; (Section 2 paragraph 2: observed signal converted into frequency domain using STFT)
calculating a real-time power spectrum of the frequency-domain signal, and tracking a minimum power in the real-time power spectrum; (Section 3, Paragraph 2: minimum tracking)
performing noise estimation based on the minimum power to obtain an estimated noise power spectrum; (Section 1: Noise estimation is based on the tracked minimum power)
calculating a gain coefficient based on the estimated noise power spectrum, and enhancing the frequency-domain signal based on the gain coefficient to obtain an enhanced frequency-domain signal; (Section 4 Equation 33, Section 5 Figure 1: Calculate gain based on estimated noise power)
and converting the enhanced frequency-domain signal to a time-domain signal to obtain an output signal. (Section 5 Figure 4: Output enhanced signal c in time domain)
wherein the performing noise estimation based on the minimum power to obtain an estimated noise power spectrum comprises:
calculating a ratio of a real-time power to the minimum power in the real-time power spectrum; (Section 3 Equation 18)
obtaining a threshold and comparing the ratio with the threshold to obtain a prior probability of speech absence; (Section 3 Equation 29)
calculating a posterior signal-to-noise ratio based on the real-time power spectrum, wherein the posterior signal-to-noise ratio is a ratio of a real-time power of a current frame to an estimated noise power of a previous frame; (Section 2 Equation 3)
calculating a prior signal-to-noise ratio through a decision-directed approach; (Section 3 Equation 32 and surrounding paragraphs: decision-directed approach to obtain a priori SNR)
calculating a speech presence probability based on the prior signal-to-noise ratio, the posterior signal-to-noise ratio, and the prior probability of speech absence; (Section 2 Equation 7)
and calculating the estimated noise power spectrum based on the speech presence probability. (Section 2 Equations 9-10 and surrounding paragraphs: use speech probability to obtain the estimated noise spectrum)
wherein a calculation for probability of speech absence is obtained as:
[Equation image: media_image1.png]
; (Section 3 Equation 29)
However, Cohen does not disclose wherein Δ represents a threshold set by frequencies based on a characteristic of noise distribution.
Wung does disclose wherein Δ represents a threshold set by frequencies based on a characteristic of noise distribution. (Figure 10 MVAD 50 and psiZERO/psiTideZero thresholds, [0007]: a priori speech presence probability is determined by thresholds derived from a priori SNR)
It would have been obvious to one of ordinary skill in the art before the effective filing date to dynamically adapt the threshold for q(m, k) because doing so generates less distortion and performs more noise removal than a conventional fixed speech absence probability (see Wung [0063]).
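For orientation only, the comparison relied on in the combination (a ratio of real-time power to tracked minimum power, compared against a threshold Δ that varies by frequency) can be sketched as follows. The hard 0/1 decision, the function name, and the numeric values are assumptions made for illustration; they are not taken from the claim, from Cohen, or from Wung.

```python
def speech_absence_prior(power, min_power, delta, eps=1e-12):
    """Return a per-bin prior probability of speech absence q(k).

    Hypothetical hard decision (an assumption, not the claimed equation):
    q(k) = 1 when power[k] / min_power[k] < delta[k], otherwise 0.
    """
    q = []
    for p, p_min, d in zip(power, min_power, delta):
        # Ratio of the real-time power to the tracked minimum power.
        ratio = p / max(p_min, eps)
        q.append(1.0 if ratio < d else 0.0)
    return q


# Illustrative values: three frequency bins, each with its own threshold.
q = speech_absence_prior(
    power=[1.0, 8.0, 2.0],
    min_power=[1.0, 1.0, 1.0],
    delta=[3.0, 3.0, 1.5],
)
```

In this sketch the third bin is treated as speech-present only because its threshold is lower than that of the first two bins, which is the effect of setting Δ by frequency.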
Claim 4: Elements of parent claim 1 are addressed above. Wung further teaches the method wherein the threshold is set as:
[Equation image: media_image2.png]
(Figure 10 MVAD 50 and psiZERO/psiTideZero thresholds, [0007]: a priori speech presence probability is determined by thresholds derived from a priori SNR).
Claim 5: Elements of parent claim 1 are addressed above. Cohen further teaches the method wherein the calculating a speech presence probability based on the prior signal-to-noise ratio, the posterior signal-to-noise ratio, and the prior probability of speech absence comprises:
calculating a likelihood ratio based on the prior signal-to-noise ratio and the posterior signal-to-noise ratio, wherein the likelihood ratio indicates a ratio of a probability that a received data frame conforms to a distribution of a noisy speech signal to a probability that the data frame conforms to a distribution of a noise signal; (Section 2 Equation 7, Bayes Rule)
and calculating the speech presence probability based on the likelihood ratio and the prior probability of speech absence. (Section 2 Equation 7).
Claim 6: Elements of parent claim 5 are addressed above. Cohen further teaches the method wherein the noisy speech signal and the noise signal each satisfies a Gaussian distribution (Section 2 Paragraph 2), and the likelihood ratio is expressed as:
[Equation image: media_image3.png]
(Section 2 Equation 7).
Claim 7: Elements of parent claim 6 are addressed above. Cohen further teaches the method wherein the speech presence probability is calculated as:
[Equation image: media_image4.png]
(Section 2 Equation 7)
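As a hedged illustration of the relationship mapped in claims 5-7, the sketch below computes a likelihood ratio from the prior and posterior signal-to-noise ratios and then applies Bayes' rule with the prior probability of speech absence. It uses the widely known closed form for complex Gaussian noisy-speech and noise models; whether it matches the claimed expressions term by term cannot be confirmed from this text, because the claimed equations appear here only as image references. Function and variable names are hypothetical.

```python
import math


def likelihood_ratio(xi, gamma):
    """Likelihood ratio under a complex Gaussian model (assumed form).

    xi    : prior (a priori) signal-to-noise ratio
    gamma : posterior (a posteriori) signal-to-noise ratio
    """
    v = gamma * xi / (1.0 + xi)
    return math.exp(v) / (1.0 + xi)


def speech_presence_probability(xi, gamma, q):
    """Bayes' rule with q as the prior probability of speech absence."""
    if q >= 1.0:
        return 0.0  # speech is ruled out a priori
    if q <= 0.0:
        return 1.0  # speech is certain a priori
    lam = likelihood_ratio(xi, gamma)
    return 1.0 / (1.0 + (q / (1.0 - q)) / lam)


# With xi = 0 the likelihood ratio is 1, so the result reduces to 1 - q.
p_neutral = speech_presence_probability(xi=0.0, gamma=5.0, q=0.3)
```

For a fixed prior signal-to-noise ratio greater than zero, a larger posterior signal-to-noise ratio raises the likelihood ratio and therefore the resulting speech presence probability.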
Claim 8: Elements of parent claim 6 are addressed above. Cohen further teaches the method wherein after the calculating a likelihood ratio based on the prior signal-to-noise ratio and the posterior signal-to-noise ratio, the method further comprises: performing an inter-frequency smoothing on the likelihood ratio to obtain a smoothed likelihood ratio (Section 2, Equations 8-11 and surrounding paragraphs: smoothing on hypotheses)
the calculating a speech presence probability based on the likelihood ratio and the prior probability of speech absence comprises: calculating the speech presence probability based on the smoothed likelihood ratio and the prior probability of speech absence. (Section 2, Equations 7-11).
Claim 9: Elements of parent claim 5 are addressed above. Cohen further teaches the method wherein after the calculating the speech presence probability based on the likelihood ratio, and the prior probability of speech absence, the method further comprises:
obtaining a probability threshold; (Section 3 Equation 21)
and determining whether to update the speech presence probability based on a relationship between the speech presence and the probability threshold. (Section 3 Equation 21).
Claim 10: Elements of parent claim 9 are addressed above. Cohen further teaches the method wherein a smoothed value of the speech presence probability is calculated as:
[Equation image: media_image5.png]
(Section 3 Equation 15)
and the speech presence probability is updated as:
[Equation image: media_image6.png]
(Section 3 Equations 21, 26).
Claim 11: Elements of parent claim 1 are addressed above. Cohen further teaches the method wherein in a case that the estimated noise power spectrum does not contain the estimated noise power of the previous frame, the posterior signal-to-noise ratio is calculated by using a current real-time power as the estimated noise power of the previous frame. (Section 4 Paragraph 2: The noise spectrum estimate is initialized at the first frame by the real time power)
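The fallback recited in claim 11 can be illustrated with a minimal sketch, assuming per-bin power values held in plain lists. The function name, the `None` convention for a missing previous estimate, and the small `eps` guard against division by zero are assumptions for illustration and do not come from Cohen or from the application.

```python
def posterior_snr(current_power, prev_noise_power=None, eps=1e-12):
    """Posterior SNR per bin: current frame power over previous noise estimate.

    When no previous noise estimate exists (for example, at the first frame),
    the current real-time power stands in for it.
    """
    if prev_noise_power is None:
        prev_noise_power = current_power
    return [p / max(n, eps) for p, n in zip(current_power, prev_noise_power)]


first_frame = posterior_snr([2.0, 4.0])
later_frame = posterior_snr([2.0, 4.0], prev_noise_power=[1.0, 8.0])
```

Under this sketch the first frame yields a posterior signal-to-noise ratio of 1 in every bin, since each power value is divided by itself.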
Claim 12: Elements of parent claim 1 are addressed above. Cohen further teaches the method wherein the calculating a gain coefficient based on the estimated noise power spectrum, and enhancing the frequency-domain signal based on the gain coefficient to obtain an enhanced frequency-domain signal comprises:
calculating a posterior signal-to-noise ratio of the frequency-domain signal based on the estimated noise power spectrum (Section 2 Equation 3), and updating the prior signal-to-noise ratio based on the posterior signal-to-noise ratio of the frequency-domain signal; (Section 4 Equation 32)
calculating a prior probability of speech absence based on the updated prior signal-to-noise ratio; (Section 2 Equation 7)
calculating an updated speech presence probability based on the posterior signal-to-noise ratio, the updated prior signal-to-noise ratio, and the prior probability of speech absence; (Section 2 Equation 7)
obtaining the gain coefficient based on the updated speech presence probability; (Section 4 Equation 33, Section 5 Figure 1: Calculate gain based on estimated noise power, which is based on updated speech presence probability)
and calculating a product of the frequency-domain signal and the gain coefficient to obtain the enhanced frequency-domain signal. (Section 4 Equation 33, Section 5 Figure 1: Calculate gain based on estimated noise power)
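The sequence mapped in claim 12 (update the prior signal-to-noise ratio, derive a gain from the speech presence probability, multiply the spectrum by the gain) can be sketched as below. The decision-directed update follows the commonly cited recursive form. The gain rule is a substituted stand-in, namely a Wiener gain geometrically weighted against a gain floor by the speech presence probability, and is not asserted to be Cohen's Equation 33 or the claimed gain. The names `alpha` and `g_min` and their default values are hypothetical.

```python
def decision_directed_prior_snr(gamma, prev_gain, prev_gamma, alpha=0.92):
    """Decision-directed update of the prior SNR for one frequency bin."""
    return alpha * prev_gain ** 2 * prev_gamma + (1.0 - alpha) * max(gamma - 1.0, 0.0)


def gain_from_spp(xi, p, g_min=0.1):
    """Assumed gain rule: Wiener gain weighted by speech presence probability p."""
    g_wiener = xi / (1.0 + xi)
    return (g_wiener ** p) * (g_min ** (1.0 - p))


def apply_gain(spectrum, gain):
    """Multiply each complex frequency-domain bin by its gain coefficient."""
    return [g * y for g, y in zip(gain, spectrum)]


xi = decision_directed_prior_snr(gamma=3.0, prev_gain=0.5, prev_gamma=4.0, alpha=0.5)
enhanced = apply_gain([2 + 2j], [gain_from_spp(xi=1.0, p=1.0)])
```

When the speech presence probability is 0, this assumed rule returns the floor `g_min`, so bins judged to contain only noise are attenuated rather than zeroed.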
Claim 13: Elements of parent claim 12 are addressed above. Cohen further teaches the method wherein the prior probability of speech absence is calculated as:
[Equation image: media_image7.png]
(Section 3 Equation 29)
Claims 15-16 and 19-21 are analogous to claims 1 and 12-13 and are rejected on the same grounds.
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. See PTO-892.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALVIN ISKENDER whose telephone number is (703)756-4565. The examiner can normally be reached M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, HAI PHAN, can be reached at (571) 272-6338. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ALVIN ISKENDER/ Examiner, Art Unit 2654
/HAI PHAN/ Supervisory Patent Examiner, Art Unit 2654