Last updated: April 19, 2026

Application No. 18/057,983

LABEL SMOOTHING TECHNIQUE FOR IMPROVING GENERALIZATION OF DEEP NEURAL NETWORK ACOUSTIC MODELS

Final Rejection §112

Filed

Nov 22, 2022

Examiner

KY, KEVIN

Art Unit

2671

Tech Center

2600 — Communications

Assignee

International Business Machines Corporation

OA Round

2 (Final)

Interview Optional

+25.3% interview lift. This examiner has a relatively high allow rate; a written response may suffice.

Based on 549 resolved cases, 2023–2026

Examiner Intelligence

KY, KEVIN

Grants 76% — above average

Career Allow Rate

420 granted / 549 resolved

+14.5% vs TC avg

Strong +25% interview lift

Interview Lift: +25.3% (resolved cases with an interview vs. without)

Typical timeline

2y 6m

Avg Prosecution

33 currently pending

Career history

582

Total Applications

across all art units

Statute-Specific Performance

§101: 17.6% (-22.4% vs TC avg)

§103: 46.5% (+6.5% vs TC avg)

§102: 20.8% (-19.2% vs TC avg)

§112: 9.9% (-30.1% vs TC avg)

Based on career data from 549 resolved cases

Office Action

§112

DETAILED ACTION
Claim Interpretation
Claims 19-20 have been analyzed under 35 USC § 101. Paragraph 76 of the specification discloses: “A computer readable storage medium, as that term is used in the present disclosure, is not to be construed as storage in the form of transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide, light pulses passing through a fiber optic cable, electrical signals communicated through a wire, and/or other transmission media.” Therefore, claims 19-20 are patent eligible as not being directed to a signal per se.

Claim Objections
Claims 1, 10, and 19 are objected to because of the following informality: “a DNN acoustic model” is claimed. Acronyms and abbreviations should be defined when first used to avoid clarity issues. Appropriate correction is required.
Claims 1, 10, and 19 are objected to because of the following informality: “dops or insert” is claimed, where “dops” is misspelled. Appropriate correction is required.
Claims 3-4 and 12-13 are objected to because of the following informality: “a deep neural network (DNN) acoustic model” is claimed. It is clear that this is the same DNN acoustic model recited in claims 1, 10, and 19. Appropriate correction is required.
Claim 4 is objected to because of the following informality: “an execution component” is claimed. It is clear that this is the same execution component recited in claim 1. Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(d):
(d) REFERENCE IN DEPENDENT FORMS.—Subject to subsection (e), a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

The following is a quotation of pre-AIA  35 U.S.C. 112, fourth paragraph:
Subject to the following paragraph [i.e., the fifth paragraph of pre-AIA  35 U.S.C. 112], a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

Claims 4 and 13 are rejected under 35 U.S.C. 112(d) or pre-AIA 35 U.S.C. 112, 4th paragraph, as being of improper dependent form for failing to further limit the subject matter of the claim upon which each depends, or for failing to include all the limitations of the claim upon which each depends.
Claims 4 and 13 depend from claims 1 and 10, respectively, and recite that “an execution component applies the one or more n-best hypotheses of the ground truth label sequence as an individual technique to assist with generalization of a DNN acoustic model”. However, claims 1 and 10 require that the execution component apply the one or more n-best hypotheses “in combination with one or more data augmentation techniques”.
The limitation in claims 4 and 13 is inconsistent with the limitation of claims 1 and 10 because applying the one or more n-best hypotheses “as an individual technique” excludes the requirement of applying the hypotheses in combination with one or more data augmentation techniques, as required by claims 1 and 10.
Accordingly, claims 4 and 13 do not further limit the subject matter of claims 1 and 10, but instead change the scope of the invention. Therefore, claims 4 and 13 are of improper dependent form.
Applicant may cancel the claim(s), amend the claim(s) to place the claim(s) in proper dependent form, rewrite the claim(s) in independent form, or present a sufficient showing that the dependent claim(s) complies with the statutory requirements.

Response to Arguments
Applicant’s arguments with respect to claims 1-4, 6-13, and 15-20 have been fully considered and are persuasive. The rejections under 35 U.S.C. 102 and 35 U.S.C. 103 of claims 1-20 have been withdrawn. However, there are multiple claim objections and rejections under 35 U.S.C. 112(d) or pre-AIA 35 U.S.C. 112, 4th paragraph, that need to be addressed.

Allowable Subject Matter
Claims 1-4, 6-13, and 15-20 would be allowable if rewritten or amended to overcome the claim objections and the rejections under 35 U.S.C. 112(d) or pre-AIA 35 U.S.C. 112, 4th paragraph, set forth in this Office action.
Regarding claim 1, and similarly regarding claims 10 and 19, the prior art of record, alone or in combination, fails to teach at least “an execution component that applies the one or more n-best hypotheses of the ground truth label sequence in combination with one or more data augmentation techniques to assist with generalization of a DNN acoustic model, wherein a one of the data augmentation techniques comprises a length perturbation technique that randomly dops or inserts frames of an utterance to alter a length of a speech feature sequence of the utterance.”
At best, Lundgaard et al. (US 20210141995) teaches in ¶25: Implementations of the disclosed subject matter may improve generalization performance of deep learning models with limited training data. Implementations of the disclosed subject matter may improve the generalization performance of deep learning models by using embeddings generated by larger, pre-trained networks; and in ¶72: From the results shown in Table 1 above, embedding augmentation may provide improved classification performance. E-Stitchup may provide models with the best performance, but adding extra label softening (e.g., Soft E-Stitchup) may improve classification performance in terms of AUROC and AUPR. The accuracy metrics provided for each of the experiments in Table 3 show that the control experiments may be competitive with embedding augmentation methods. The AUROC and AUPR measures, which may provide an unbiased view of classification performance, show an improvement in classification performance using embedding augmentation, especially when extra label softening is added.
At best, Hutmacher et al. (US 20210319315) teaches in ¶20: For audio data, the perturbations may be superimposed onto a random part of an audio sequence in order to form a perturbed datum.
The prior art of record, alone or in combination, does not teach the specifics of the execution component and the specifics of the data augmentation technique, including applying the one or more n-best hypotheses of the ground truth label sequence in combination with one or more data augmentation techniques to assist with generalization of a DNN acoustic model.
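For orientation, the length perturbation limitation quoted above (randomly dropping or inserting frames of an utterance to alter the length of its speech feature sequence) might look roughly like the following Python sketch. This is an illustration only, not the applicant's implementation: the function name `length_perturb`, the per-frame probabilities `drop_prob` and `insert_prob`, and the choice to insert a copy of the current frame are all assumptions that the office action does not specify.

```python
import random
from typing import List, Optional, Sequence

Frame = List[float]  # one feature vector per time step (hypothetical layout)


def length_perturb(
    frames: Sequence[Frame],
    drop_prob: float = 0.1,
    insert_prob: float = 0.1,
    seed: Optional[int] = None,
) -> List[Frame]:
    """Randomly drop or insert frames so the output length differs from the input.

    Assumption: an inserted frame is a copy of the frame it follows. The
    office action does not say what content an inserted frame carries.
    """
    if drop_prob < 0 or insert_prob < 0 or drop_prob + insert_prob > 1:
        raise ValueError("probabilities must be non-negative and sum to at most 1")

    rng = random.Random(seed)
    out: List[Frame] = []
    for frame in frames:
        r = rng.random()  # uniform in [0, 1)
        if r < drop_prob:
            continue  # drop this frame
        out.append(list(frame))
        if r >= 1.0 - insert_prob:
            out.append(list(frame))  # insert an extra frame after this one
    return out


# Usage: a toy 10-frame utterance with 2 features per frame.
utterance = [[float(t), float(t) + 0.5] for t in range(10)]
perturbed = length_perturb(utterance, drop_prob=0.2, insert_prob=0.2, seed=0)
print(len(utterance), len(perturbed))
```

Drawing one random number per frame keeps the drop and insert events mutually exclusive for a given frame, which is a design choice of this sketch rather than something stated in the claims.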

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
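The reply periods described in the preceding paragraph can be worked through with simple date arithmetic. The Python sketch below assumes the mailing date is Mar 20, 2026 (the Final Rejection date shown in the Prosecution Timeline; the actual mailing date should be confirmed against the action itself). The helper names `add_months` and `shortened_period_expiry` are hypothetical, and the sketch ignores weekend or holiday adjustments and fee-based extensions of time.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(d: date, months: int) -> date:
    """Return the same day-of-month `months` later, clamped to the month's last day."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def shortened_period_expiry(
    mailing: date,
    first_reply: Optional[date] = None,
    advisory_mailed: Optional[date] = None,
) -> date:
    """Expiry of the shortened statutory period as described in the paragraph above."""
    two_months = add_months(mailing, 2)
    three_months = add_months(mailing, 3)
    six_months = add_months(mailing, 6)
    if (
        first_reply is not None
        and advisory_mailed is not None
        and first_reply <= two_months
        and advisory_mailed > three_months
    ):
        # Period runs to the advisory action mailing date, never past six months.
        return min(advisory_mailed, six_months)
    return three_months


mailing = date(2026, 3, 20)  # assumed mailing date, taken from the timeline
print("two-month window ends:", add_months(mailing, 2).isoformat())
print("shortened period ends:", shortened_period_expiry(mailing).isoformat())
print("six-month outer limit:", add_months(mailing, 6).isoformat())
```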
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KEVIN KY whose telephone number is (571)272-7648. The examiner can normally be reached Monday-Friday 9-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vincent Rudolph, can be reached at 571-272-8243. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/KEVIN KY/
Primary Examiner, Art Unit 2671


Prosecution Timeline

Nov 22, 2022

Application Filed

Aug 21, 2025

Non-Final Rejection — §112

Dec 18, 2025

Response Filed

Mar 20, 2026

Final Rejection — §112

Apr 09, 2026

Applicant Interview (Telephonic)

Apr 09, 2026

Examiner Interview Summary

Precedent Cases

Applications granted by this same examiner with similar technology

17/676,432 (Patent 12597158): POSE ESTIMATION. 2y 5m to grant; granted Apr 07, 2026.

18/814,687 (Patent 12597291): IMAGE ANALYSIS FOR PERSONAL INTERACTION. 2y 5m to grant; granted Apr 07, 2026.

18/222,090 (Patent 12586393): KNOWLEDGE-DRIVEN SCENE PRIORS FOR SEMANTIC AUDIO-VISUAL EMBODIED NAVIGATION. 2y 5m to grant; granted Mar 24, 2026.

18/570,168 (Patent 12586559): METHOD AND APPARATUS FOR GENERATING SPEECH OUTPUTS IN A VEHICLE. 2y 5m to grant; granted Mar 24, 2026.

19/080,452 (Patent 12579382): NATURAL LANGUAGE GENERATION USING KNOWLEDGE GRAPH INCORPORATING TEXTUAL SUMMARIES. 2y 5m to grant; granted Mar 17, 2026.

Based on the 5 most recent grants.


Prosecution Projections

3-4

Expected OA Rounds

76%

Grant Probability

99%

With Interview (+25.3%)

2y 6m

Median Time to Grant

Moderate

PTA Risk

Based on 549 resolved cases by this examiner. Grant probability derived from career allow rate.
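The career allow rate that the grant probability is derived from can be checked directly from the counts reported earlier on this page (420 granted out of 549 resolved). A minimal Python sketch follows; the function name `allow_rate` is hypothetical, and the page appears to display the figure truncated or rounded to a whole percent.

```python
def allow_rate(granted: int, resolved: int) -> float:
    """Fraction of resolved cases that ended in a grant."""
    if resolved <= 0:
        raise ValueError("resolved must be positive")
    return granted / resolved


rate = allow_rate(420, 549)  # counts taken from the Career Allow Rate panel
print(f"Career allow rate: {rate:.1%}")
```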