Prosecution Insights
Last updated: May 29, 2026
Application No. 18/816,659

AUTOMATIC UPDATING OF AUTOMATIC SPEECH RECOGNITION FOR NAMED ENTITIES

Non-Final OA §103
Filed: Aug 27, 2024
Priority: Nov 06, 2023 (provisional 63/596,574)
Examiner: DESIR, PIERRE LOUIS
Art Unit: 2659
Tech Center: 2600 (Communications)
Assignee: Samsung Electronics Co., Ltd.
OA Round: 1 (Non-Final)
Grant Probability: 61% (Moderate)
OA Rounds: 1-2
Est. Remaining: 2y 2m
With Interview: 93%

Examiner Intelligence

Grants 61% of resolved cases
Career Allowance Rate: 61% (176 granted / 288 resolved; -0.9% vs TC avg)
Interview Lift: +32.1% (strong; measured on resolved cases with interview)
Typical timeline: 3y 11m avg prosecution; 3 currently pending
Career history: 296 total applications across all art units

Statute-Specific Performance

§101: 4.5% (-35.5% vs TC avg)
§103: 75.0% (+35.0% vs TC avg)
§102: 11.5% (-28.5% vs TC avg)
§112: 4.3% (-35.7% vs TC avg)
Based on career data from 288 resolved cases.

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Khemka et al., US 2023/0409615 A1 (“Khemka”) in view of Evermann et al., US 2019/0179890 A1 (“Evermann”).

Regarding claims 1, 9 and 17, Khemka discloses a device, a non-transitory medium and method, “identifying, using an automated speech recognition (ASR) system, at least one named entity hypothesis from at least one audio input;” by receiving audio and performing ASR to generate one-best and n-best hypotheses, followed by NLU extracting slots/entities (Khemka Fig. 2; ¶¶ [0050]–[0056]; ¶¶ [0110]–[0116]); “providing, using the ASR system, the identified at least one named entity hypothesis to a large language model (LLM);” by passing ASR/NLU outputs to transformer-based models (e.g., T5) for dialogue state tracking and slot/value prediction using example-guided inputs (Khemka ¶¶ [0158]–[0176]; Fig. 15); “generating a prompt using an automated prompt generator;” by programmatically constructing question templates Q(s→q) and concatenating retrieved in-context examples (TransferQA formatting) that serve as prompts for the model (Khemka ¶¶ [0174]–[0176]; Fig. 15); “processing, using the LLM, the identified at least one named entity hypothesis and the prompt to generate updated named entity recognition data;” by feeding the template prompt plus examples and dialogue history into T5 to output slot values/updated entity recognition (Khemka ¶¶ [0169]–[0176]; Fig. 15). Khemka does not specifically disclose “providing the updated named entity recognition data back to the ASR system.” However, Evermann discloses “the natural language processor… re-processes the text string… to search for word matches… [and] can determine that the text actually refers to… [correct entity]… and feed corrected tokens back into the recognition pipeline” (¶0014, ¶0135; Fig. 5). Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Evermann’s feedback of corrected named-entity tokens into Khemka’s ASR/NLU pipeline to improve recognition accuracy. Both references address the same problem, errors in speech-to-text transcription, and Evermann’s technique predictably improves ASR performance by reintegrating corrected entity data.

Regarding claims 2, 10 and 18, Khemka discloses the device, non-transitory medium and method of claims 1, 9 and 17, further comprising updating the ASR system with the updated named entity recognition data to enhance named entity recognition accuracy (i.e., model training, on-device/federated updates, and adapting models with data (Khemka ¶0061; ¶0114–¶0115; ¶0179–¶0181)). It should also be noted that Evermann discloses a system which updates its speech recognition models with corrected named entities to improve future recognition accuracy (see paragraphs 15 and 136).
Although not combined here, it would have been obvious to integrate Evermann’s update mechanism into Khemka’s pipeline for predictable accuracy gains.

Regarding claims 3, 11 and 19, Khemka discloses the device, non-transitory medium and method of claims 2, 10 and 18, further comprising: accessing a set of audio samples of named entities collected from users of a voice assistance system, wherein each audio sample is annotated with a text transcript of the named entity and a corresponding category that the named entity belongs to of a plurality of categories (i.e., Khemka discloses user-collected named-entity audio samples, annotated with transcripts and categories (¶0110–¶0116)); for each audio sample in the set of audio samples: generating the prompt including the named entity based on the corresponding category (i.e., Khemka’s DST/TransferQA prompt generation with slot/category context (¶0174–¶0176)); and providing the prompt as input to the LLM (i.e., Khemka ¶0174–¶0176), wherein the updated named entity recognition data includes a plurality of possible commands including the named entity based on the corresponding category (i.e., Khemka’s LLM output slot values/dialogue state (¶0176)); and training, based on the plurality of possible commands generated by the LLM, at least one of a language model or a talk-to-speech (TTS) model of the ASR system (i.e., Khemka’s training/fine-tuning of language/TTS models with generated examples (¶0179–¶0181; ¶0102–¶0106)).

Regarding claims 4 and 12, Khemka discloses the device and method of claims 3 and 11, wherein the plurality of categories includes at least one of: an application name; a name of a person; a name of a television program; a name of a movie; a name of an electronic device; a name of a place; a name of a radio station; a name of a podcast; a name of a genre; a name of a business; a name of a sports team; or a name of a song (i.e., many domains/slots and lists of types (music, movies, contacts, places, etc.), and teaches domain/slot vocabularies that encompass these categories (Khemka ¶0046; ¶0116; Fig. 3C)).

Regarding claims 5 and 13, Khemka discloses the device and method of claims 3 and 11, further comprising: creating a base model trained using the set of audio samples of named entities collected from the users of the voice assistance system; and periodically updating the ASR system and/or the LLM based on the base model (i.e., building base models from user data, on-device and federated training, and periodically updating models including ASR/NLU components (Khemka ¶0061; ¶0114–¶0115; ¶0179–¶0181)).

Regarding claims 6, 14 and 20, Khemka discloses the device, medium and method of claims 1, 9 and 17, further comprising: providing the prompt generated using the automated prompt generator to a talk-to-speech (TTS) model and synthesizing, using the TTS model, an audio sample based on the prompt (i.e., Khemka explicitly describes a TTS component that generates synthesized speech from text and discusses generated examples for model fine-tuning (Khemka ¶0102–¶0106; ¶0179–¶0181)). Although Khemka discloses general model training, it does not explicitly teach training the ASR model using TTS-synthesized audio derived from LLM prompts. Evermann teaches generating augmented training data (including TTS-synthesized audio) for improving recognition of named entities and using such synthesized samples for ASR training/augmentation (Evermann ¶0015; ¶0136).
Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to apply Evermann’s data-augmentation technique using TTS samples to Khemka’s prompt/TTS flows in order to improve ASR training coverage for named entities recognized/produced by the LLM and to also improve the performance of the ASR.

Regarding claims 7 and 15, Khemka discloses the device and method of claims 1 and 9, wherein the ASR system and the LLM are executed on a same electronic device (i.e., on-device ASR/portions of NLU and acknowledges some processing (ASR/parts of NLU/LLM) may execute on device (Khemka ¶0047–¶0053; Fig. 2)).

Regarding claims 8 and 16, Khemka discloses the device and method of claims 7 and 15, further comprising: providing user information stored on the same electronic device to the LLM; and processing, using the LLM, the identified at least one named entity hypothesis to generate the updated named entity recognition data using the user information (i.e., using local user data, personalized lexicons/gazetteers and using such context in NLU/LLM processing (Khemka ¶0046; ¶0061; ¶0110–¶0116)).

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to PIERRE LOUIS DESIR whose telephone number is (571)272-7799. The examiner can normally be reached Monday-Friday 9AM-5:30PM. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users.
To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/PIERRE LOUIS DESIR/
Supervisory Patent Examiner, Art Unit 2659

Prosecution Timeline

Aug 27, 2024: Application Filed
Mar 02, 2026: Non-Final Rejection mailed (§103)
May 01, 2026: Interview Requested
May 13, 2026: Applicant Interview (Telephonic)
May 14, 2026: Examiner Interview Summary

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12632788
PROMPT AUGMENTED GENERATIVE REPLAY VIA SUPERVISED CONTRASTIVE TRAINING FOR LIFELONG INTENT DETECTION
2y 10m to grant; granted May 19, 2026
Patent 12609124
VOICE AGENT SYSTEM
2y 8m to grant; granted Apr 21, 2026
Patent 12585679
EXECUTING UNSUPERVISED PRE-TRAINING TASKS WITH A MACHINE LEARNING MODEL TO PREDICT DOCUMENT GRAPH ATTRIBUTES
2y 11m to grant; granted Mar 24, 2026
Patent 12562154
Scalable Model Specialization Framework for Speech Model Personalization
2y 11m to grant; granted Feb 24, 2026
Patent 12555594
SYSTEM AND METHOD FOR TRACKING EMOTIONAL STATE OF A CALLER USING ARTIFICIAL INTELLIGENCE
3y 3m to grant; granted Feb 17, 2026
Based on the 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 61%
With Interview: 93% (+32.1%)
Median Time to Grant: 3y 11m (~2y 2m remaining)
PTA Risk: Low
Based on 288 resolved cases by this examiner. Grant probability derived from career allowance rate.
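
The projection figures appear to follow from simple arithmetic on the numbers reported on this page. The sketch below is a minimal illustration, not the site's actual method. It assumes that the with-interview figure is the career allowance rate plus the interview lift in percentage points, and that the remaining time is the examiner's average prosecution length minus the whole months elapsed between the filing date and the "Last updated" date. All function names are hypothetical.

```python
from datetime import date


def allowance_rate(granted: int, resolved: int) -> float:
    """Career allowance rate as a percentage of resolved cases."""
    if resolved <= 0:
        raise ValueError("resolved must be positive")
    return 100.0 * granted / resolved


def with_interview(base_pct: float, lift_points: float) -> float:
    """Assumed model: add the lift in percentage points, capped at 100."""
    return min(100.0, base_pct + lift_points)


def whole_months_between(start: date, end: date) -> int:
    """Count complete calendar months from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the current month is not yet complete
    return months


def format_months(months: int) -> str:
    """Render a month count in the page's 'Xy Ym' style."""
    years, rest = divmod(months, 12)
    return f"{years}y {rest}m"


# Figures reported on this page.
base = allowance_rate(granted=176, resolved=288)
boosted = with_interview(base, lift_points=32.1)

avg_prosecution_months = 3 * 12 + 11  # "3y 11m"
elapsed = whole_months_between(date(2024, 8, 27), date(2026, 5, 29))
remaining = max(0, avg_prosecution_months - elapsed)

print(f"Grant probability: {base:.0f}%")
print(f"With interview: {boosted:.0f}%")
print(f"Est. remaining: {format_months(remaining)}")
```

Under these assumptions the script reproduces the page's 61%, 93%, and 2y 2m figures, which suggests the projections are descriptive statistics of the examiner's history and not a case-specific model.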
