Last updated: April 18, 2026

Application No. 18/016,058

NOISE SUPPRESSION METHOD AND APPARATUS FOR QUICKLY CALCULATING SPEECH PRESENCE PROBABILITY, AND STORAGE MEDIUM AND TERMINAL

Final Rejection §103

Filed

Jan 13, 2023

Examiner

ISKENDER, ALVIN ALIK

Art Unit

2654

Tech Center

2600 — Communications

Assignee

UNISOC (CHONGQING) TECHNOLOGIES CO., LTD.

OA Round

2 (Final)

This examiner grants 48% of resolved cases, with a +60.3% interview lift. A telephonic interview to clarify the technical implementation could significantly improve the outcome.

Based on 25 resolved cases, 2023–2026

Examiner Intelligence

ISKENDER, ALVIN ALIK

Grants 48% of resolved cases

Career Allow Rate

12 granted / 25 resolved

-14.0% vs TC avg

Strong +60% interview lift

[Chart: Interview Lift, without vs. with interview: +60.3%]

Typical timeline

3y 4m

Avg Prosecution

20 currently pending

Career history

Total Applications

across all art units

Statute-Specific Performance

§101

15.6%

-24.4% vs TC avg

§103

53.0%

+13.0% vs TC avg

§102

25.8%

-14.2% vs TC avg

§112

5.4%

-34.6% vs TC avg

[Chart: the black line marks the estimated Tech Center average.] Based on career data from 25 resolved cases.
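The figures above can be cross-checked with a little arithmetic. The sketch below assumes that each "vs TC avg" value is the examiner's rate minus the Tech Center average, in percentage points; the dashboard does not state this, and the helper name `implied_tc_average` is hypothetical.

```python
# Figures copied from the dashboard: (examiner rate in %, delta vs TC avg in points).
career = (48.0, -14.0)
by_statute = {
    "101": (15.6, -24.4),
    "103": (53.0, +13.0),
    "102": (25.8, -14.2),
    "112": (5.4, -34.6),
}


def implied_tc_average(rate, delta):
    """Assumption: delta = examiner rate - Tech Center average (percentage points)."""
    return round(rate - delta, 1)


# Career allow rate from the raw counts: 12 granted out of 25 resolved.
allow_rate = round(100 * 12 / 25, 1)

# Tech Center baselines implied by each dashboard row.
implied = {statute: implied_tc_average(r, d) for statute, (r, d) in by_statute.items()}
career_tc = implied_tc_average(*career)

print(allow_rate, career_tc, implied)
```

Under that assumption, every statute-specific row implies the same Tech Center baseline, which suggests the statute comparisons are drawn against a single estimated average rather than per-statute averages.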

Office Action

§103

DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant's arguments filed 17 February 2026 have been fully considered but they are not persuasive.
Applicant argues that Cohen in view of Wung does not teach the claimed equation for the prior probability of speech absence. As applicant notes, the rejection does not rely on Cohen alone to read on the claimed equation, but on the combination of the references. The differences between Cohen’s Equation 29 and applicant’s claimed equation are addressed by the cited portions of Wung.
Paragraph [0007] of Wung, partially quoted by applicant, states: “The Wiener filter is updated (its filter coefficients are computed) based on a speech presence probability (SPP), and the latter in turn is computed based on an a priori speech presence probability (a priori SPP.) The latter is computed by a multi-channel voice activity detector (MVAD), whose two input thresholds are dynamically adapted and are derived from i) an instantaneous a priori signal to noise ratio (SNR), and ii) an average a priori SNR, of the multi-channel dereverberated signal”. The MVAD referred to by applicant and cited in the prior rejection calculates an a priori speech presence probability, as the claim requires. The two input thresholds are dynamically adapted and are based on “a characteristic of noise distribution” as required by the claim (signal-to-noise ratio is a characteristic of noise distribution). Thus, the modification of Cohen by Wung renders the claim limitation obvious to one of ordinary skill in the art.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4-13, 15-16, and 19-21 are rejected under 35 U.S.C. 103 as being unpatentable over Cohen in view of Wung (US 20180350379 A1).

Claim 1: Cohen discloses a method for suppressing noise by quickly calculating a speech presence probability, comprising: 
obtaining an input signal, and converting the input signal from a time-domain signal to a frequency-domain signal; (Section 2 paragraph 2: observed signal converted into frequency domain using STFT)
calculating a real-time power spectrum of the frequency-domain signal, and tracking a minimum power in the real-time power spectrum; (Section 3, Paragraph 2: minimum tracking)
performing noise estimation based on the minimum power to obtain an estimated noise power spectrum; (Section 1: Noise estimation is based on the tracked minimum power)
calculating a gain coefficient based on the estimated noise power spectrum, and enhancing the frequency-domain signal based on the gain coefficient to obtain an enhanced frequency-domain signal;  (Section 4 Equation 33, Section 5 Figure 1: Calculate gain based on estimated noise power)
and converting the enhanced frequency-domain signal to a time-domain signal to obtain an output signal.  (Section 5 Figure 4: Output enhanced signal c in time domain)
wherein the performing noise estimation based on the minimum power to obtain an estimated noise power spectrum comprises: 
calculating a ratio of a real-time power to the minimum power in the real-time power spectrum; (Section 3 Equation 18)
obtaining a threshold and comparing the ratio with the threshold to obtain a prior probability of speech absence; (Section 3 Equation 29)
calculating a posterior signal-to-noise ratio based on the real-time power spectrum, wherein the posterior signal-to-noise ratio is a ratio of a real-time power of a current frame to an estimated noise power of a previous frame; (Section 2 Equation 3)
calculating a prior signal-to-noise ratio through a decision-directed approach; (Section 3 Equation 32 and surrounding paragraphs: decision-directed approach to obtain a priori SNR)
calculating a speech presence probability based on the prior signal-to-noise ratio, the posterior signal-to-noise ratio, and the prior probability of speech absence; (Section 2 Equation 7)
and calculating the estimated noise power spectrum based on the speech presence probability.  (Section 2 Equations 9-10 and surrounding paragraphs: use speech probability to obtain the estimated noise spectrum)
wherein a calculation for probability of speech absence is obtained as: 
[Equation image: media_image1.png]; (Section 3 Equation 29)
However, Cohen does not disclose wherein Δ represents a threshold set by frequencies based on a characteristic of noise distribution.
Wung does disclose wherein Δ represents a threshold set by frequencies based on a characteristic of noise distribution. (Figure 10 MVAD 50 and psiZERO/psiTideZero thresholds, [0007]: a priori speech presence probability is determined by thresholds derived from a priori SNR)
It would have been obvious to one of ordinary skill in the art before the effective filing date to dynamically adapt the threshold for q(m, k) as taught by Wung, because doing so generates less distortion and removes more noise than a conventional fixed speech absence probability (see Wung [0063]).
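To make the disputed limitation concrete, the following is a minimal sketch of a per-frequency threshold applied to the power-to-minimum ratio, followed by a noise update driven by the speech presence probability. The claimed equation appears only as an image in this action, so the hard 0/1 decision below is a hypothetical stand-in rather than the claimed formula. The recursive-averaging form of the noise update, the function names, and the default `alpha` are likewise assumptions.

```python
def prior_speech_absence(power, min_power, delta, floor=1e-12):
    """Per-bin prior probability of speech absence q.

    Hypothetical hard decision: q = 1 where power / min_power < delta[k], else 0.
    delta holds one threshold per frequency bin.
    """
    q = []
    for p, p_min, d in zip(power, min_power, delta):
        ratio = p / max(p_min, floor)  # ratio of real-time power to tracked minimum
        q.append(1.0 if ratio < d else 0.0)
    return q


def update_noise(noise_prev, power, spp, alpha=0.85):
    """Assumed SPP-driven recursive average of the noise power spectrum.

    Where speech is likely (spp near 1) the previous estimate is held;
    where speech is unlikely the estimate moves toward the current power.
    """
    out = []
    for n, p, s in zip(noise_prev, power, spp):
        a = alpha + (1.0 - alpha) * s  # time-varying smoothing factor
        out.append(a * n + (1.0 - a) * p)
    return out


# Two bins with different thresholds: the first is judged speech-absent.
print(prior_speech_absence([1.0, 10.0], [1.0, 1.0], [2.0, 5.0]))
```

Making `delta` a per-bin list rather than a scalar reflects the limitation that the threshold is "set by frequencies", which is the feature the action attributes to Wung rather than Cohen.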

Claim 4: Elements of parent claim 1 are addressed above. Wung further teaches the method wherein the threshold set as: 
[Equation image: media_image2.png]
(Figure 10 MVAD 50 and psiZERO/psiTideZero thresholds, [0007]: a priori speech presence probability is determined by thresholds derived from a priori SNR).

Claim 5: Elements of parent claim 1 are addressed above. Cohen further teaches the method wherein the calculating a speech presence probability based on the prior signal-to-noise ratio, the posterior signal-to-noise ratio, and the prior probability of speech absence comprises: 
calculating a likelihood ratio based on the prior signal-to-noise ratio and the posterior signal-to-noise ratio, wherein the likelihood ratio indicates a ratio of a probability that a received data frame conforms to a distribution of a noisy speech signal to a probability that the data frame conforms to a distribution of a noise signal; (Section 2 Equation 7, Bayes Rule)
and calculating the speech presence probability based on the likelihood ratio and the prior probability of speech absence.  (Section 2 Equation 7).
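The two steps of claim 5 can be sketched as follows, assuming the widely used complex-Gaussian model in which the likelihood ratio is exp(v) / (1 + xi) with v = gamma * xi / (1 + xi). Here `xi` is the prior SNR, `gamma` the posterior SNR, and `q` the prior probability of speech absence. The expressions actually recited in claims 6 and 7 appear only as images in this action, so this is an illustration under that assumption, not a transcription.

```python
import math


def likelihood_ratio(xi, gamma):
    """Ratio of the noisy-speech likelihood to the noise-only likelihood
    under the assumed Gaussian model."""
    v = gamma * xi / (1.0 + xi)
    return math.exp(min(v, 50.0)) / (1.0 + xi)  # clamp the exponent to avoid overflow


def speech_presence_probability(xi, gamma, q):
    """Bayes' rule: combine the likelihood ratio with the prior absence probability q."""
    lam = likelihood_ratio(xi, gamma)
    weighted = lam * (1.0 - q)
    return weighted / (q + weighted)


# With no SNR evidence (xi = 0) the result falls back to the prior, 1 - q.
print(speech_presence_probability(0.0, 1.0, 0.5))
```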

Claim 6: Elements of parent claim 5 are addressed above. Cohen further teaches the method wherein the noisy speech signal and the noise signal each satisfies a Gaussian distribution (Section 2 Paragraph 2), and the likelihood ratio is expressed as: 
[Equation image: media_image3.png] (Section 2 Equation 7).

Claim 7: Elements of parent claim 6 are addressed above. Cohen further teaches the method wherein the speech presence probability is calculated as: 
[Equation image: media_image4.png] (Section 2 Equation 7)

Claim 8: Elements of parent claim 6 are addressed above. Cohen further teaches the method wherein after the calculating a likelihood ratio based on the prior signal-to-noise ratio and the posterior signal-to-noise ratio, the method further comprises: performing an inter-frequency smoothing on the likelihood ratio to obtain a smoothed likelihood ratio; (Section 2, Equations 8-11 and surrounding paragraphs: smoothing on hypotheses)
the calculating a speech presence probability based on the likelihood ratio and the prior probability of speech absence comprises: calculating the speech presence probability based on the smoothed likelihood ratio and the prior probability of speech absence. (Section 2, Equations 7-11).
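As an illustration of what inter-frequency smoothing of a likelihood ratio can look like, the sketch below averages each bin with its neighbours. The three-tap window and the edge handling are assumptions chosen for the example; neither the claim as quoted nor the cited passages, as summarized here, specify them.

```python
def smooth_across_frequency(values, window=(0.25, 0.5, 0.25)):
    """Weighted average of each frequency bin with its neighbours.

    Edge bins are replicated so that a constant input stays constant.
    """
    half = len(window) // 2
    n = len(values)
    out = []
    for k in range(n):
        acc = 0.0
        for j, w in enumerate(window):
            idx = min(max(k + j - half, 0), n - 1)  # clamp index at the band edges
            acc += w * values[idx]
        out.append(acc)
    return out


# An isolated spike in one bin is spread over the adjacent bins.
print(smooth_across_frequency([0.0, 0.0, 4.0, 0.0, 0.0]))
```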

Claim 9: Elements of parent claim 5 are addressed above. Cohen further teaches the method wherein after the calculating the speech presence probability based on the likelihood ratio, and the prior probability of speech absence, the method further comprises: 
obtaining a probability threshold; (Section 3 Equation 21)
and determining whether to update the speech presence probability based on a relationship between the speech presence and the probability threshold.  (Section 3 Equation 21).

Claim 10: Elements of parent claim 9 are addressed above. Cohen further teaches the method wherein a smoothed value of the speech presence probability is calculated as: 
[Equation image: media_image5.png] (Section 3 Equation 15)

and the speech presence probability is updated as:

[Equation image: media_image6.png] (Section 3 Equations 21, 26).

Claim 11: Elements of parent claim 1 are addressed above. Cohen further teaches the method wherein in a case that the estimated noise power spectrum does not contain the estimated noise power of the previous frame, the posterior signal-to-noise ratio is calculated by using a current real-time power as the estimated noise power of the previous frame. (Section 4 Paragraph 2: The noise spectrum estimate is initialized at the first frame by the real time power)
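The first-frame fallback in claim 11 amounts to a small guard around the posterior SNR calculation. A minimal sketch, in which the function name and the `floor` guard against division by zero are assumptions:

```python
def posterior_snr(power, noise_prev=None, floor=1e-12):
    """Posterior SNR per bin: current power over the previous frame's noise estimate.

    When no previous estimate exists (first frame), the current power is
    used in its place, so every bin starts with a posterior SNR of 1.
    """
    if noise_prev is None:
        noise_prev = power
    return [p / max(n, floor) for p, n in zip(power, noise_prev)]


print(posterior_snr([2.0, 4.0]))               # first frame: no previous estimate
print(posterior_snr([2.0, 4.0], [1.0, 2.0]))   # later frame
```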

Claim 12: Elements of parent claim 1 are addressed above. Cohen further teaches the method wherein the calculating a gain coefficient based on the estimated noise power spectrum, and enhancing the frequency-domain signal based on the gain coefficient to obtain an enhanced frequency-domain signal comprises: 
calculating a posterior signal-to-noise ratio of the frequency-domain signal based on the estimated noise power spectrum (Section 2 Equation 3), and updating the prior signal-to-noise ratio based on the posterior signal-to-noise ratio of the frequency-domain signal; (Section 4 Equation 32)
calculating a prior probability of speech absence based on the updated prior signal-to-noise ratio; (Section 2 Equation 7)
calculating an updated speech presence probability based on the posterior signal-to-noise ratio, the updated prior signal-to-noise ratio, and the prior probability of speech absence; (Section 2 Equation 7)
obtaining the gain coefficient based on the updated speech presence probability; (Section 4 Equation 33, Section 5 Figure 1: Calculate gain based on estimated noise power, which is based on updated speech presence probability)
and calculating a product of the frequency-domain signal and the gain coefficient to obtain the enhanced frequency-domain signal.  (Section 4 Equation 33, Section 5 Figure 1: Calculate gain based on estimated noise power)
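The enhancement path of claim 12 can be sketched in three small functions. The decision-directed update below follows the common Ephraim-Malah form. The gain rule, a Wiener gain blended with a floor `gain_min` according to the speech presence probability, is an assumption made for illustration, since the claim as quoted only says the gain is obtained "based on" that probability. All names and default constants are hypothetical.

```python
def decision_directed_prior_snr(gain_prev, gamma_prev, gamma, alpha=0.92):
    """Prior SNR: a weighted mix of the previous frame's clean-speech estimate
    and the current posterior SNR minus one, floored at zero."""
    return [
        alpha * g * g * gp + (1.0 - alpha) * max(gc - 1.0, 0.0)
        for g, gp, gc in zip(gain_prev, gamma_prev, gamma)
    ]


def spp_weighted_gain(xi, spp, gain_min=0.1):
    """Hypothetical gain rule: G = G_H1**p * gain_min**(1 - p), with Wiener G_H1."""
    out = []
    for x, p in zip(xi, spp):
        g_h1 = x / (1.0 + x)  # Wiener gain under the speech-present hypothesis
        out.append((g_h1 ** p) * (gain_min ** (1.0 - p)))
    return out


def enhance(spectrum, gain):
    """Multiply each frequency-domain coefficient by its gain coefficient."""
    return [g * y for g, y in zip(gain, spectrum)]


xi = decision_directed_prior_snr([1.0], [2.0], [3.0], alpha=0.5)
print(xi, enhance([1 + 1j], spp_weighted_gain([1.0], [1.0])))
```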

Claim 13: Elements of parent claim 12 are addressed above. Cohen further teaches the method wherein the prior probability of speech absence is calculated as: 
[Equation image: media_image7.png] (Section 3 Equation 29)

Claims 15-16, 19-21 are analogous to claims 1, 12-13 and are rejected in a similar fashion.

Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
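The reply periods described above reduce to simple date arithmetic. The sketch below takes the Apr 04, 2026 entry in the prosecution timeline as the mailing date, which is an assumption, and it ignores any adjustment for weekends or holidays as well as the advisory-action scenario. The helper `add_months` is hypothetical, written here because the standard library has no month-offset function.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Return the same day-of-month `months` later, clamped to the month's length."""
    idx = d.month - 1 + months
    year = d.year + idx // 12
    month = idx % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


mailing = date(2026, 4, 4)  # assumption: mailing date taken from the timeline

two_month_mark = add_months(mailing, 2)   # first-reply window relevant to the advisory-action rule
shortened_period = add_months(mailing, 3) # end of the shortened statutory period
outer_limit = add_months(mailing, 6)      # latest possible expiry of the statutory period

print(two_month_mark, shortened_period, outer_limit)
```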
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. See PTO-892.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALVIN ISKENDER whose telephone number is (703)756-4565. The examiner can normally be reached M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, HAI PHAN, can be reached at (571) 272-6338. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ALVIN ISKENDER/ Examiner, Art Unit 2654

/HAI PHAN/ Supervisory Patent Examiner, Art Unit 2654

Prosecution Timeline

Jan 13, 2023

Application Filed

Nov 15, 2025

Non-Final Rejection — §103

Feb 17, 2026

Response Filed

Apr 04, 2026

Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

17/188,310

Patent 12562244

COMBINING DOMAIN-SPECIFIC ONTOLOGIES FOR LANGUAGE PROCESSING

2y 5m to grant Granted Feb 24, 2026

17/911,224

Patent 12531078

NOISE SUPPRESSION FOR SPEECH ENHANCEMENT

2y 5m to grant Granted Jan 20, 2026

17/926,994

Patent 12505825

SPONTANEOUS TEXT TO SPEECH (TTS) SYNTHESIS

2y 5m to grant Granted Dec 23, 2025

17/750,973

Patent 12456457

ALL DEEP LEARNING MINIMUM VARIANCE DISTORTIONLESS RESPONSE BEAMFORMER FOR SPEECH SEPARATION AND ENHANCEMENT

2y 5m to grant Granted Oct 28, 2025

18/054,153

Patent 12407783

DOUBLE-MICROPHONE ARRAY ECHO ELIMINATING METHOD, DEVICE AND ELECTRONIC EQUIPMENT

2y 5m to grant Granted Sep 02, 2025

Based on the 5 most recent grants.

Prosecution Projections

3-4

Expected OA Rounds

48%

Grant Probability

99%

With Interview (+60.3%)

3y 4m

Median Time to Grant

Moderate

PTA Risk

Based on 25 resolved cases by this examiner. Grant probability derived from career allow rate.