Prosecution Insights
Last updated: April 19, 2026
Application No. 18/129,030

INSERTION ERROR REDUCTION WITH CONFIDENCE SCORE-BASED WORD FILTERING

Non-Final OA §103
Filed: Mar 30, 2023
Examiner: SAINT CYR, LEONARD
Art Unit: 2658
Tech Center: 2600 — Communications
Assignee: International Business Machines Corporation
OA Round: 3 (Non-Final)
Grant Probability: 77% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 3y 1m
With Interview: 95%

Examiner Intelligence

Career Allow Rate: 77%, above average (882 granted / 1144 resolved; +15.1% vs TC avg)
Interview Lift: +18.2% for resolved cases with an interview (strong)
Typical Timeline: 3y 1m average prosecution; 32 applications currently pending
Career History: 1176 total applications across all art units

Statute-Specific Performance

§101: 17.8% (-22.2% vs TC avg)
§103: 39.1% (-0.9% vs TC avg)
§102: 28.0% (-12.0% vs TC avg)
§112: 2.2% (-37.8% vs TC avg)
Tech Center averages are estimates. Based on career data from 1144 resolved cases.

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 12/10/25 has been entered.

Response to Arguments

Applicant's arguments, see pages 6-9, filed 12/10/25, with respect to claims 1-20 have been fully considered and are persuasive. The rejection of claims 1-20 under 35 U.S.C. 101 has been withdrawn. Applicant argues that the independent claims (claims 1, 9, and 13) include the components or steps of the invention that provide, for example, improvements to the technological process of computerized speech recognition by calculating an average of confidence for each character in a corresponding word, including space characters that appear after the last character of the word, and removing uncertain words using a threshold process based on the calculated confidence score, as achieved by the steps of: calculating a word-level confidence score by computing an average of confidence levels for each character in a word and a trailing space character delineating an end of the word (Amendment, pages 6-9).

Applicant's arguments with respect to claims 1-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. Applicant argues that the prior art of record does not teach computing confidence levels for each character in a word of an alphabet-based language and a trailing space character delineating an end of the word (Amendment, pages 9, 10).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 4-10, 12-14, and 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over Harada (US PAP 2013/0096918) in view of MCCARTNEY, Jr. et al. (US PAP 2020/0404386).

As per claims 1, 9, and 13, Harada teaches a computer-implemented method, the method comprising: calculating, using a computerized automatic speech recognition system, a word-level confidence score ("calculates the sum of the corresponding similarity and the corresponding connection score for every combination of the acoustic models used when the similarity is calculated"; paragraph 99); and managing, using the computerized automatic speech recognition system, the word using a threshold process based on the calculated word-level confidence score ("determining unit 26c judges whether there is a sum among the plurality of calculated sums that exceeds a threshold value. If there is a sum that exceeds a threshold value, the character string corresponding to the largest sum among the sums that exceed a threshold value is determined as the character string corresponding to the voice signal."; paragraphs 99, 112-114); performing automatic speech recognition based on a result of the managing of the word; and presenting a result of the automatic speech recognition to a user ("The output unit 27 transmits the character string determined for each of the frames to the output unit 22 so as to display the character string on a screen as the recognition result of the voice."; paragraphs 41, 42, 100).

However, Harada does not specifically teach computing confidence levels for each character in a word of an alphabet-based language and a trailing space character delineating an end of the word. MCCARTNEY, Jr. et al. discloses that processing logic may subtract median duration timing values for each low confidence caption character string that precedes the anchored caption character string in the sentence fragment to determine the start time of the sentence fragment. For example, if the sentence fragment contains caption character strings "fried served with chutney" where confidence values are {fried=0.72, served=0.18, with=1.00, chutney=0.34}, then processing logic may identify the caption character string "with" as a high confidence caption character string and anchor the character string using the start and end time… The end time for the sentence fragment may similarly be calculated by adding median duration times corresponding to each of the trailing low confidence caption character strings until the end of the sentence fragment is reached. For example, the end time may be estimated by adding the median duration for the 7-character string "chutney" to the end time of the anchor caption character string "with" (paragraph 85).

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to calculate the average of the confidence levels as taught by MCCARTNEY, Jr. et al. in Harada, because that would help provide accurate alignment of translated audio portions generated from translated caption speech with durations and timings of original audio speech in the video (paragraph 28).

As per claims 2, 10, and 14, Harada in view of MCCARTNEY, Jr. et al. further disclose that managing the word comprises deleting the word based on the calculated word-level confidence score and a given threshold ("Processing logic may be configured to normalize the translated character strings in the translation language caption data in order to remove non-spoken text or special characters from the translated character strings… generated character strings are within a threshold"; MCCARTNEY, Jr. et al., paragraphs 54-58, 61-66, 79).

As per claims 4, 12, and 16, Harada in view of MCCARTNEY, Jr. et al. further disclose applying a first weight to the confidence level of the trailing space character and a second weight to the confidence level for each character in the word, the application occurring prior to the computing of the average of the confidence levels for each character in the word and the trailing space character ("If a sentence fragment contains a set of caption character strings near the middle of the sentence fragment with high confidence values and another set of caption character strings in the beginning and the end of the sentence that have lower confidence values, then processing logic may use the set of caption character strings with the high confidence values as an anchor for determining timing information for the sentence fragment."; MCCARTNEY, Jr. et al., paragraphs 83-86; Harada, paragraph 99).

As per claims 5 and 17, Harada in view of MCCARTNEY, Jr. et al. further disclose assigning the second weight for each character in the word independently on a letter-by-letter basis (MCCARTNEY, Jr. et al., paragraphs 83-86; Harada, paragraph 99).

As per claims 6 and 18, Harada in view of MCCARTNEY, Jr. et al. further disclose separately evaluating the confidence level of a first character of the word and basing the confidence level of the word on the separate evaluation ("the recognizing device 20 calculates the connection score so as to be higher as the plurality of words of the character string which is used to calculate the similarity is closer to each other. Therefore, the recognizing device 20 determines the character string corresponding to the input voice signal by adding not only the similarity but also the connection score."; Harada, paragraphs 41, 42).

As per claims 7 and 19, Harada in view of MCCARTNEY, Jr. et al. further disclose merging confidence levels on a letter-by-letter basis from each instance of a same word, maintaining a highest confidence level of a letter of a given position and discarding remaining confidence levels of the letter of the given position ("The verifying unit 26 determines a character string corresponding to a sum that exceeds a threshold value and has the largest value among a plurality of calculated sums as a character string corresponding to the voice signal"; Harada, paragraphs 42, 79).

As per claims 8 and 20, Harada in view of MCCARTNEY, Jr. et al. further disclose basing the confidence level for each character on a corresponding log-likelihood ("When the voice is recognized to calculate the similarity (probability value), the acoustic model is compared with the voice signal"; Harada, paragraph 76).

Allowable Subject Matter

Claims 3, 11, and 15 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

The following is a statement of reasons for the indication of allowable subject matter: as to claims 3, 11, and 15, the prior art made of record does not teach or suggest that the word is deleted in response to the confidence level of the entire word being less than a first threshold A and the confidence level of a first character of the word being greater than a second threshold B.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to LEONARD SAINT-CYR whose telephone number is (571) 272-4247. The examiner can normally be reached Monday-Friday. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Richemond Dorvil, can be reached at (571) 272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/LEONARD SAINT-CYR/
Primary Examiner, Art Unit 2658
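The amended independent claims recite a concrete procedure: average the per-character confidence levels of a word, including the trailing space character that delineates its end, and filter words against a threshold. The following Python sketch illustrates that logic; the function names, default weights, and the threshold values are illustrative assumptions, not taken from the application itself. The optional weights mirror claims 4/12/16, and the two-threshold deletion rule mirrors the objected-to claims 3/11/15.

```python
def word_confidence(char_confs, trailing_space_conf,
                    char_weight=1.0, space_weight=1.0):
    """Weighted average of per-character confidences, including the
    trailing space that delineates the end of the word (claims 1/9/13).
    Distinct weights for letters vs. the space reflect claims 4/12/16."""
    weighted = [c * char_weight for c in char_confs]
    weighted.append(trailing_space_conf * space_weight)
    total_weight = char_weight * len(char_confs) + space_weight
    return sum(weighted) / total_weight

def keep_word(char_confs, trailing_space_conf, threshold_a=0.5, threshold_b=0.9):
    """Two-threshold filter sketching claims 3/11/15: delete the word only
    if the whole-word score is below threshold A AND the first character's
    confidence is above threshold B (both values here are hypothetical)."""
    word_conf = word_confidence(char_confs, trailing_space_conf)
    if word_conf < threshold_a and char_confs[0] > threshold_b:
        return False  # delete the word
    return True       # keep the word

# A word whose letters average low but whose first letter is confident
# is dropped; a uniformly confident word is kept.
low = [0.95, 0.2, 0.1, 0.3]   # per-character confidences, e.g. "word"
high = [0.95, 0.9, 0.8, 0.9]
print(word_confidence(low, 0.4), keep_word(low, 0.4), keep_word(high, 0.95))
```

The mean over letters plus the trailing space matches the claim language ("an average of confidence levels for each character in a word and a trailing space character"); everything else, including the AND-condition's parameter values, is a placeholder.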

Prosecution Timeline

Mar 30, 2023: Application Filed
Mar 08, 2025: Non-Final Rejection — §103
Jul 10, 2025: Response Filed
Aug 20, 2025: Applicant Interview (Telephonic)
Sep 09, 2025: Final Rejection — §103
Nov 24, 2025: Applicant Interview (Telephonic)
Nov 29, 2025: Examiner Interview Summary
Dec 10, 2025: Request for Continued Examination
Dec 15, 2025: Response after Non-Final Action
Feb 02, 2026: Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12603100: SYSTEM AND METHOD FOR OPTIMIZED AUDIO MIXING (granted Apr 14, 2026; 2y 5m to grant)
Patent 12597415: VOICE RECOGNITION GRAMMAR SELECTION BASED ON CONTEXT (granted Apr 07, 2026; 2y 5m to grant)
Patent 12592227: DIALOG UNDERSTANDING DEVICE AND DIALOG UNDERSTANDING METHOD (granted Mar 31, 2026; 2y 5m to grant)
Patent 12591765: SYSTEMS AND METHODS FOR BUILDING A CUSTOMIZED GENERATIVE ARTIFICIAL INTELLIGENT PLATFORM (granted Mar 31, 2026; 2y 5m to grant)
Patent 12585884: DIALOGUE APPARATUS, DIALOGUE METHOD, AND PROGRAM (granted Mar 24, 2026; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 77%
With Interview: 95% (+18.2%)
Median Time to Grant: 3y 1m
PTA Risk: High
Based on 1144 resolved cases by this examiner. Grant probability derived from career allow rate.
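The projection figures are simple arithmetic on the examiner's career totals. A sketch of how they appear to be derived, assuming the report adds the interview lift directly to the base allow rate (an inferred method, not a documented formula):

```python
granted, resolved = 882, 1144             # examiner career totals from this report
base_rate = granted / resolved            # career allow rate, about 0.771 -> "77%"
interview_lift = 0.182                    # lift observed for cases with an interview
with_interview = min(base_rate + interview_lift, 1.0)   # capped at 100%
print(f"base {base_rate:.0%}, with interview {with_interview:.0%}")
# prints "base 77%, with interview 95%"
```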
