DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Objections
Claim 8 is objected to because of the following informalities: "performing repeatedly learning" is grammatically improper; replace with "performing repeated learning", for example. Appropriate correction is required.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-10 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kahn ("Self-Training for End-to-End Speech Recognition").
Claim 1: Kahn discloses a speech recognition method of automatically correcting a data label, the method comprising:
performing confidence-based filtering in order to find a location at which an incorrect label has occurred in time-series speech data in which an answer label and the incorrect label have been temporally mixed by using a transformer-based speech recognition model; (Section 3.1: Confidence filtering of data; Section 2.1: Label inference)
improving performance of the transformer-based speech recognition model by replacing a label in a decoder time step that has been determined as the incorrect label due to the location at which the incorrect label has occurred after the filtering, (Section 3.1: Exclude low-confidence samples)
wherein in performing the confidence-based filtering in order to find the location at which the incorrect label has occurred in the time-series speech data, the incorrect label is found and corrected by using confidence using a transition probability between labels every decoder time step. (Section 3.1: using likelihood ratio to find incorrect labels)
Claim 2: Elements of parent claim 1 are disclosed as discussed above. Kahn further teaches the method wherein performing the confidence-based filtering in order to find the location at which the incorrect label has occurred in the time-series speech data comprises:
calculating confidence by using a transition probability between labels that transition between decoder time steps; (Section 3.1: likelihood ratio for each label)
calculating confidence by using a self-attention probability that represents correlation between labels; and (Section 2: attention mechanism)
calculating confidence by using a source-attention probability in which a speech and correlation between labels have been considered. (Section 2, 2.1: attention ceiling)
Claim 3: Elements of parent claim 2 are disclosed as discussed above. Kahn further teaches the method wherein performing the confidence-based filtering in order to find the location at which the incorrect label has occurred in the time-series speech data further comprises:
generating merged confidence by combining the confidence using a transition probability, the confidence using a self-attention probability, and the confidence using a source-attention probability; and (Section 3.2: ensemble model combination)
finding the location of the incorrect label based on the merged confidence. (Section 3.2: get a new ensemble sample set from the model combination)
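For clarity of record regarding the claimed filtering of claims 1-3, the recited operation can be sketched as follows. This is an illustrative sketch only: the equal-weight merge, the threshold value, and all confidence numbers are the examiner's assumptions for illustration and are not taken from Kahn or from the claims as filed.

```python
def merged_confidence(trans_conf, self_attn_conf, src_attn_conf,
                      weights=(1/3, 1/3, 1/3)):
    """Combine the three per-decoder-time-step confidence signals
    (transition probability, self-attention, source-attention) into
    one merged score per time step. Equal weights are assumed."""
    w1, w2, w3 = weights
    return [w1 * a + w2 * b + w3 * c
            for a, b, c in zip(trans_conf, self_attn_conf, src_attn_conf)]

def find_incorrect_labels(merged, threshold=0.5):
    """Return the decoder time steps whose merged confidence falls
    below the (assumed) threshold, i.e. the incorrect-label locations."""
    return [t for t, c in enumerate(merged) if c < threshold]

# Hypothetical confidence values for three decoder time steps:
conf = merged_confidence([0.9, 0.2, 0.8],   # transition-probability confidence
                         [0.8, 0.3, 0.9],   # self-attention confidence
                         [0.7, 0.1, 0.85])  # source-attention confidence
print(find_incorrect_labels(conf))  # prints [1]
```

Under this reading, time step 1 is the "location at which the incorrect label has occurred" because all three signals agree it is low-confidence.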
Claim 4: Elements of parent claim 1 are disclosed as discussed above. Kahn further teaches the method wherein improving the performance of the transformer-based speech recognition model by replacing the label in the decoder time step that has been determined as the incorrect label comprises excluding a decoder time step corresponding to the incorrect label from learning with respect to the time-series speech data. (Section 3.1: exclude low confidence samples)
Claim 5: Elements of parent claim 1 are disclosed as discussed above. Kahn further teaches the method wherein improving the performance of the transformer-based speech recognition model by replacing the label in the decoder time step that has been determined as the incorrect label comprises
defining a (K+1)-th new type as a help label by adding the (K+1)-th new type to the number K of all of classification label types, and (Section 2.1: Hypothesis inference)
replacing the incorrect label with the help label. (Section 2.1)
Claim 6: Elements of parent claim 1 are disclosed as discussed above. Kahn further teaches the method wherein improving the performance of the transformer-based speech recognition model by replacing the label in the decoder time step that has been determined as the incorrect label comprises replacing the incorrect label with a new label sampled from the transition probability. (Section 3.1: ensemble sampling)
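For clarity of record regarding the two replacement alternatives of claims 5 and 6, the recited operations can be sketched as follows. The class count K, the transition table, and all label values are hypothetical illustrations, not taken from Kahn or from the claims as filed.

```python
import random

K = 4           # hypothetical number of original classification label types
HELP_LABEL = K  # index of the added (K+1)-th "help" class (claim 5)

def replace_with_help(labels, bad_steps):
    """Claim 5 style: overwrite each flagged decoder time step
    with the help label."""
    out = list(labels)
    for t in bad_steps:
        out[t] = HELP_LABEL
    return out

def replace_by_sampling(labels, bad_steps, trans_prob, rng=None):
    """Claim 6 style: resample each flagged label from the transition
    probability conditioned on the preceding label."""
    rng = rng or random.Random(0)
    out = list(labels)
    for t in bad_steps:
        prev = out[t - 1] if t > 0 else 0
        out[t] = rng.choices(range(K), weights=trans_prob[prev])[0]
    return out

print(replace_with_help([0, 1, 2], bad_steps=[1]))  # prints [0, 4, 2]
```

Both functions take the flagged time steps produced by the claimed filtering; they differ only in whether the substitute label is the fixed help class or a draw from the learned transition distribution.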
Claim 7: Elements of parent claim 1 are disclosed as discussed above. Kahn further teaches the method wherein the transformer-based speech recognition model is a model that maps two time series having different lengths by using an attention mechanism, and comprises an encoder that changes the time-series speech data into memory and a decoder that predicts a current label by using the memory and past labels. (Section 2: Sequence to sequence model)
Claim 8: Elements of parent claim 2 are disclosed as discussed above. Kahn further teaches the method wherein improving the performance of the transformer-based speech recognition model by replacing the label in the decoder time step that has been determined as the incorrect label comprises performing repeatedly learning by using a Q-shot learning method in order to obtain the transition probability, the source-attention probability, the self-attention probability, and a transition probability that is used in sampling upon replacement. (Section 2: Recurrent encoding and decoding)
Regarding claims 9 and 10, they are analogous to elements found in claims 1-8 and are thus rejected in a similar fashion.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALVIN ISKENDER whose telephone number is (703)756-4565. The examiner can normally be reached M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, HAI PHAN can be reached at (571) 272-6338. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ALVIN ISKENDER/Examiner, Art Unit 2654
/HAI PHAN/Supervisory Patent Examiner, Art Unit 2654