Prosecution Insights
Last updated: April 19, 2026
Application No. 18/522,743

VOICE QUALITY ENHANCEMENT METHOD AND RELATED DEVICE

Status: Final Rejection (§102)
Filed: Nov 29, 2023
Examiner: ISLAM, MOHAMMAD K
Art Unit: 2653
Tech Center: 2600 — Communications
Assignee: Huawei Technologies Co., Ltd.
OA Round: 2 (Final)
Grant Probability: 83% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 2y 9m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 83% (1070 granted / 1288 resolved), +21.1% vs TC avg (above average)
Interview Lift: +16.5% among resolved cases with interview (a strong lift)
Typical Timeline: 2y 9m average prosecution; 83 applications currently pending
Career History: 1371 total applications across all art units

Statute-Specific Performance

§101: 21.4% (-18.6% vs TC avg)
§103: 32.6% (-7.4% vs TC avg)
§102: 25.0% (-15.0% vs TC avg)
§112: 14.6% (-25.4% vs TC avg)
Tech Center averages are estimates • Based on career data from 1288 resolved cases

Office Action

§102
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first-inventor-to-file provisions of the AIA.

Priority

Acknowledgment is made of applicant's claim for foreign priority under 35 U.S.C. 119(a)-(d). Certified copies have been filed for Application No. CN202110611024.0, filed 05/31/2021; CN202110694849.3, filed 06/22/2021; and CN202111323211.5, filed 11/09/2021.

Information Disclosure Statement

The information disclosure statements (IDS) submitted on 08/26/2024 and 01/06/2025 have been considered by the examiner.

Drawings

The drawings submitted on 11/29/2023 have been considered by the examiner.

Response to Amendment

Claims 1-20 are currently pending in the application; among them, claims 1 and 18-20 are independent claims and have been amended.

Response to Arguments

Applicant's arguments filed 1/22/2026 have been fully considered but are not persuasive. The arguments are directed to an amended limitation that was not addressed in the prior office action; applicant is therefore advised to review this office action with respect to the amended limitation.

Applicant's arguments: "However, Xin fails to disclose or suggest after a terminal device enters a personalized noise reduction (PNR) mode, obtaining a first noisy voice signal and target voice-related data, wherein the first noisy voice signal comprises an interfering noise signal and a voice signal of a target user, the interfering noise signal includes at least one of a voice signal of a non-target user or an ambient noise signal, and the target voice-related data indicates a voice feature of the target user."

Examiner's response: The examiner respectfully disagrees with applicant's assertion regarding the prior art's teaching of the amended limitations.
The experiments disclosed in Xin clearly show how the DNN processes a noisy speech signal to obtain an enhanced clean speech signal and, once trained, will similarly process signals for "obtaining a first noisy voice signal and target voice-related data, wherein the first noisy voice signal comprises an interfering noise signal and a voice signal of a target user, the interfering noise signal includes at least one of a voice signal of a non-target user or an ambient noise signal, and the target voice-related data indicates a voice feature of the target user." Please see the detail in the updated rejection of the amended claims, where Xin et al. clearly disclose "a first noisy voice signal and target voice-related data, wherein the first noisy voice signal comprises an interfering noise signal and a voice signal of a target user, the interfering noise signal includes at least one of a voice signal of a non-target user or an ambient noise signal, and the target voice-related data indicates a voice feature of the target user." Applicant's arguments are therefore not persuasive, and the rejection is maintained.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless - (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1 and 18-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Xin et al. ("Target Speech Signal Enhancement Based on Deep Neural Networks," 2019 2nd IEEE International Conference on Information Communication and Signal Processing).

Regarding claim 1, Xin et al. teach:

A voice quality enhancement method, comprising: after a terminal device enters a personalized noise reduction (PNR) mode, obtaining a first noisy voice signal and target voice-related data (m(t) = s(t) + α·n(t), where α is a noise gain factor, with s(t) and n(t) signifying a clean speech and a noise speech), wherein the first noisy voice signal comprises an interfering noise signal (α·n(t)) and a voice signal of a target user (s(t)), the interfering noise signal includes at least one of a voice signal of a non-target user or an ambient noise signal (babble noise, musical noise, etc.) (Page 244, "B. Experiments on the quality of noise used for training": "Fig. 6 ... display spectrograms of an utterance example in the test set, including clean speech, noise speech mixed with Babble noise ... Musical noise appeared in most speech processed by traditional enhancement algorithms ..."), and the target voice-related data indicates a voice feature (reference clean LPS features, i.e., Yn representing the corresponding clean feature vector) of the target user (Page 243, Col. 1, paragraph 1: "A collection of data consisting of pairs of noisy and corresponding clean utterance are used for the model training. More precisely, the noisy speech m(t) is constructed according to m(t) = s(t) + α n(t), where α is a noise gain factor, with s(t) and n(t) signifying a clean speech and a noise speech."; Pages 242-243, Col. 1, "A. Network Structure and Model Training": "Before model is fed with features of signal ... After extracting the LPS features and normalizing them the features of noisy speech are input to the model. Then the features of enhanced speech can be estimated by layer-by-layer calculations ... by minimizing mean squared error (MSE) objective function [16] between estimated DNN output and reference clean LPS features ... with Yn representing the corresponding clean feature vector." (equation 85)); and

performing noise reduction on the first noisy voice signal based on the target voice-related data by using a voice noise reduction model to obtain a noise-reduced voice signal of the target user, wherein the voice noise reduction model is implemented based on a neural network (Page 242, Col. 2, paragraph 2: "In the 'training stage', the regression DNN model is trained from features generated respectively by pairs of noisy and clean speech. In the 'enhancement stage', the well-trained model is fed with the features of noisy speech to generate the enhanced speech features."; Page 243, Col. 1, paragraph 1: "After extracting the LPS features and normalizing them, the features of noisy speech are input to the model."; Page 243, Col. 2, "B. Speech enhancement Stage": "After the model has been trained, the process of denoising noisy speech is performed as follows: the LPS features from noisy speech signals are extracted and presented as inputs for this model. And we use the trained parameters to calculate the enhanced signal's LPS features for each noisy features that we analyze. Then the inverse normalization and the wave form reconstruction will be used to invert the log-power spectrum back to the time domain.").

Regarding claim 18, Xin et al. teach:

A terminal device, comprising: a processor, and a memory coupled to the processor to store instructions, which when executed by the processor, cause the processor to perform operations, the operations comprising (Page 241, Col. 1, "I. Introduction": "The purpose of speech enhancement technology is to suppress noise and obtain enhance speech signals from noisy signals mixed with the speech and background noise by using some speech enhancement methods, while improving the quality of speech and increasing its intelligibility. It has been extensively used in real life, such as hearing aids, mobile communication and automatic speech recognition."
Note: It is inherent for hearing aids, mobile devices, and automatic speech recognition systems to have a processor and a memory storing instructions to perform the noise reduction process using a deep neural network.): after the terminal device enters a personalized noise reduction (PNR) mode, obtaining a first noisy voice signal and target voice-related data, wherein the first noisy voice signal comprises an interfering noise signal and a voice signal of a target user, the interfering noise signal includes at least one of a voice signal of a non-target user or an ambient noise signal, and the target voice-related data indicates a voice feature of the target user; and performing noise reduction on the first noisy voice signal based on the target voice-related data by using a voice noise reduction model to obtain a noise-reduced voice signal of the target user, wherein the voice noise reduction model is implemented based on a neural network (see rejection of claim 1).

Regarding claim 19, Xin et al. teach:

A chip system, applied to an electronic device, comprising: a processor; an interface circuit configured to receive and send data, wherein the interface circuit and the processor are interconnected through a line; and a memory coupled to the processor to store instructions which, when executed by the processor, cause the electronic device to perform operations, the operations comprising (Page 241, Col. 1, "I. Introduction": "The purpose of speech enhancement technology is to suppress noise and obtain enhance speech signals from noisy signals mixed with the speech and background noise by using some speech enhancement methods, while improving the quality of speech and increasing its intelligibility. It has been extensively used in real life, such as hearing aids, mobile communication and automatic speech recognition."
Note: It is inherent for hearing aids, mobile devices, and automatic speech recognition systems to have a processor and a memory storing instructions to perform the noise reduction process using a deep neural network. It is also inherent for hearing aids and mobile devices to have a transmitter and receiver to send and receive voice signals, interconnected with the processor of the hearing aid or mobile device.): after the electronic device enters a personalized noise reduction (PNR) mode, obtaining a first noisy voice signal and target voice-related data, wherein the first noisy voice signal comprises an interfering noise signal and a voice signal of a target user, the interfering noise signal includes at least one of a voice signal of a non-target user or an ambient noise signal, and the target voice-related data indicates a voice feature of the target user; and performing noise reduction on the first noisy voice signal based on the target voice-related data by using a voice noise reduction model to obtain a noise-reduced voice signal of the target user, wherein the voice noise reduction model is implemented based on a neural network (see rejection of claim 1).

Regarding claim 20, Xin et al. teach:

A non-transitory machine-readable storage medium, having instructions stored therein, which when executed by a processor, cause the processor to perform operations, the operations comprising (Page 241, Col. 1, "I. Introduction": "The purpose of speech enhancement technology is to suppress noise and obtain enhance speech signals from noisy signals mixed with the speech and background noise by using some speech enhancement methods, while improving the quality of speech and increasing its intelligibility. It has been extensively used in real life, such as hearing aids, mobile communication and automatic speech recognition."
Note: It is inherent for hearing aids, mobile devices, and automatic speech recognition systems to have a processor and a memory storing instructions to perform the noise reduction process using a deep neural network.): after entering a personalized noise reduction (PNR) mode, obtaining a first noisy voice signal and target voice-related data, wherein the first noisy voice signal comprises an interfering noise signal and a voice signal of a target user, the interfering noise signal includes at least one of a voice signal of a non-target user or an ambient noise signal, and the target voice-related data indicates a voice feature of the target user; and performing noise reduction on the first noisy voice signal based on the target voice-related data by using a voice noise reduction model to obtain a noise-reduced voice signal of the target user, wherein the voice noise reduction model is implemented based on a neural network (see rejection of claim 1).

Allowable Subject Matter

Claims 2-17 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.
In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Nyayate et al. (US 2021/0360349 A1) teach audio noise determination using one or more neural networks.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMAD K ISLAM, whose telephone number is (571) 270-5878. The examiner can normally be reached Monday-Friday, EST (IFP).

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Paras Shah, can be reached at 571-270-1650. The fax number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MOHAMMAD K ISLAM/
Primary Examiner, Art Unit 2653
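Editor's note: the enhancement scheme the rejection attributes to Xin (mix clean speech with gained noise, extract log-power-spectrum features, train a regression DNN toward clean features under an MSE objective) can be sketched as follows. This is a minimal illustration under assumptions: the function names, frame/hop sizes, and toy signals are mine, and the DNN itself is omitted; only the signal model m(t) = s(t) + α·n(t) and the LPS/MSE bookkeeping follow the cited description.

```python
import numpy as np

def mix(clean: np.ndarray, noise: np.ndarray, alpha: float) -> np.ndarray:
    """Construct noisy speech m(t) = s(t) + alpha * n(t), per Xin's signal model."""
    return clean + alpha * noise

def lps_features(signal: np.ndarray, frame: int = 256, hop: int = 128) -> np.ndarray:
    """Frame the signal, window it, and take the log-power spectrum per frame."""
    n_frames = 1 + (len(signal) - frame) // hop
    frames = np.stack([signal[i * hop: i * hop + frame] for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames * np.hanning(frame), axis=1)) ** 2
    return np.log(power + 1e-12)  # small floor avoids log(0)

def mse(estimated: np.ndarray, clean: np.ndarray) -> float:
    """Training objective: mean squared error against the clean LPS features Yn."""
    return float(np.mean((estimated - clean) ** 2))

# Toy data standing in for a (clean, noisy) training pair.
rng = np.random.default_rng(0)
t = np.arange(4096) / 16000.0
s = np.sin(2 * np.pi * 220 * t)        # stand-in "clean speech"
n = rng.standard_normal(t.size)        # stand-in noise
m = mix(s, n, alpha=0.3)

Y_clean = lps_features(s)              # reference clean LPS features
X_noisy = lps_features(m)              # model input features
```

In Xin's scheme the trained network maps X_noisy toward Y_clean; the enhancement stage then applies inverse normalization and waveform reconstruction to return the estimated log-power spectrum to the time domain.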

Prosecution Timeline

Nov 29, 2023
Application Filed
Nov 15, 2025
Non-Final Rejection — §102
Jan 22, 2026
Response Filed
Feb 13, 2026
Final Rejection — §102 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12601849: SYSTEMS AND METHODS FOR PLANNING SEISMIC DATA ACQUISITION WITH REDUCED ENVIRONMENTAL IMPACT (granted Apr 14, 2026; 2y 5m to grant)
Patent 12596361: FAILURE DIAGNOSIS METHOD, METHOD OF MANUFACTURING DISK DEVICE, AND RECORDING MEDIUM (granted Apr 07, 2026; 2y 5m to grant)
Patent 12596872: HOLISTIC EMBEDDING GENERATION FOR ENTITY MATCHING (granted Apr 07, 2026; 2y 5m to grant)
Patent 12596868: CREATING A DIGITAL ASSISTANT (granted Apr 07, 2026; 2y 5m to grant)
Patent 12597434: CONTROL OF SPEECH PRESERVATION IN SPEECH ENHANCEMENT (granted Apr 07, 2026; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 83%
With Interview: 99% (+16.5%)
Median Time to Grant: 2y 9m
PTA Risk: Moderate
Based on 1288 resolved cases by this examiner. Grant probability derived from career allow rate.
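The projections above appear to combine the career allow rate with the interview lift additively. A sketch of that assumed derivation follows; the additive combination and the truncation to 99% are my guesses, since the page says only that grant probability is "derived from career allow rate":

```python
# Assumed derivation of the dashboard's headline projections.
granted, resolved = 1070, 1288         # examiner career totals shown on the page
allow_rate = granted / resolved        # base grant probability
interview_lift = 0.165                 # "+16.5%" interview lift shown on the page
with_interview = min(allow_rate + interview_lift, 1.0)  # additive, capped at 100%

base_pct = round(allow_rate * 100)     # displayed as 83%
lifted_pct = int(with_interview * 100) # displayed as 99% (if truncated)
```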
