Prosecution Insights
Last updated: April 19, 2026
Application No. 18/816,659

AUTOMATIC UPDATING OF AUTOMATIC SPEECH RECOGNITION FOR NAMED ENTITIES

Non-Final OA §103
Filed: Aug 27, 2024
Examiner: DESIR, PIERRE LOUIS
Art Unit: 2659
Tech Center: 2600 — Communications
Assignee: Samsung Electronics Co., Ltd.
OA Round: 1 (Non-Final)
Grant Probability: 61% (Moderate)
OA Rounds: 1-2
To Grant: 4y 4m
With Interview: 92%

Examiner Intelligence

Career Allow Rate: 61% (173 granted / 285 resolved; -1.3% vs TC avg)
Interview Lift: +31.5% for resolved cases with interview (strong, roughly +32%)
Typical Timeline: 4y 4m avg prosecution; 10 currently pending
Career History: 295 total applications across all art units

Statute-Specific Performance

§101: 14.4% (-25.6% vs TC avg)
§103: 48.4% (+8.4% vs TC avg)
§102: 18.8% (-21.2% vs TC avg)
§112: 11.8% (-28.2% vs TC avg)
Black line = Tech Center average estimate • Based on career data from 285 resolved cases
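The "vs TC avg" deltas above are internally consistent: subtracting each delta from the examiner's rate recovers the Tech Center baseline (the black line in the original chart), which lands at roughly 40% for every statute. A quick check, using only the figures shown above:

```python
# Examiner allowance rate per statute and delta vs Tech Center average,
# both in percent (figures taken from the chart above).
rate = {"101": 14.4, "103": 48.4, "102": 18.8, "112": 11.8}
delta = {"101": -25.6, "103": 8.4, "102": -21.2, "112": -28.2}

# Implied Tech Center baseline: examiner rate minus delta.
tc_avg = {s: round(rate[s] - delta[s], 1) for s in rate}
# Every statute implies the same ~40% baseline.
```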

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Khemka et al., US 2023/0409615 A1 (“Khemka”) in view of Evermann et al., US 2019/0179890 A1 (“Evermann”).

Regarding claims 1, 9 and 17, Khemka discloses a device, a non-transitory medium and a method, “identifying, using an automated speech recognition (ASR) system, at least one named entity hypothesis from at least one audio input,” by receiving audio and performing ASR to generate one-best and n-best hypotheses, followed by NLU extracting slots/entities (Khemka Fig. 2; ¶¶ [0050]–[0056]; ¶¶ [0110]–[0116]); “providing, using the ASR system, the identified at least one named entity hypothesis to a large language model (LLM),” by passing ASR/NLU outputs to transformer-based models (e.g., T5) for dialogue state tracking and slot/value prediction using example-guided inputs (Khemka ¶¶ [0158]–[0176]; Fig. 15); “generating a prompt using an automated prompt generator,” by programmatically constructing question templates Q(s→q) and concatenating retrieved in-context examples (TransferQA formatting) that serve as prompts for the model (Khemka ¶¶ [0174]–[0176]; Fig. 15); and “processing, using the LLM, the identified at least one named entity hypothesis and the prompt to generate updated named entity recognition data,” by feeding the template prompt plus examples and dialogue history into T5 to output slot values/updated entity recognition (Khemka ¶¶ [0169]–[0176]; Fig. 15).

Khemka does not specifically disclose “providing the updated named entity recognition data back to the ASR system.” However, Evermann discloses that “the natural language processor… re-processes the text string… to search for word matches… [and] can determine that the text actually refers to… [correct entity]… and feed corrected tokens back into the recognition pipeline” (¶0014; ¶0135; Fig. 5). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Evermann’s feedback of corrected named-entity tokens into Khemka’s ASR/NLU pipeline to improve recognition accuracy. Both references address the same problem (errors in speech-to-text transcription), and Evermann’s technique predictably improves ASR performance by reintegrating corrected entity data.

Regarding claims 2, 10 and 18, Khemka discloses the device, non-transitory medium and method of claims 1, 9 and 17, further comprising updating the ASR system with the updated named entity recognition data to enhance named entity recognition accuracy (i.e., model training, on-device/federated updates, and adapting models with data (Khemka ¶0061; ¶0114–¶0115; ¶0179–¶0181)). It should also be noted that Evermann discloses a system which updates its speech recognition models with corrected named entities to improve future recognition accuracy (see paragraphs 15 and 136).
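The loop the rejection maps onto Khemka and Evermann (ASR n-best entity hypotheses, an automatically generated prompt, LLM normalization, and corrected entity data fed back to the recognizer) can be sketched in miniature. Everything below is hypothetical scaffolding for illustration: the class and function names, the toy hypotheses, and the stub LLM are stand-ins, not code from either reference.

```python
from dataclasses import dataclass, field


@dataclass
class ToyASR:
    """Stand-in recognizer with a correctable entity lexicon."""
    lexicon: set = field(default_factory=set)

    def entity_hypotheses(self, audio: bytes) -> list[str]:
        # Pretend decoding produced an n-best list of entity guesses.
        return ["beet les", "the beatles"]

    def apply_update(self, corrected: list[str]) -> None:
        # "Providing the updated named entity recognition data back to
        # the ASR system" (the limitation Evermann is cited for).
        self.lexicon.update(corrected)


def generate_prompt(hypotheses: list[str]) -> str:
    # Automated prompt generator: wrap the hypotheses in a question
    # template, loosely mirroring Khemka's question templating.
    return ("Which single named entity do these ASR hypotheses refer to? "
            + " | ".join(hypotheses))


def toy_llm(prompt: str) -> list[str]:
    # Deterministic stand-in for the LLM's entity normalization.
    return ["The Beatles"]


asr = ToyASR()
hyps = asr.entity_hypotheses(b"<audio>")
corrected = toy_llm(generate_prompt(hyps))
asr.apply_update(corrected)  # the feedback step closes the loop
```

The point of the sketch is the last line: without the `apply_update` feedback, the pipeline matches what the examiner attributes to Khemka alone; with it, it matches the combination with Evermann.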
Although not combined here, it would have been obvious to integrate Evermann’s update mechanism into Khemka’s pipeline for predictable accuracy gains.

Regarding claims 3, 11 and 19, Khemka discloses the device, non-transitory medium and method of claims 2, 10 and 18, further comprising: accessing a set of audio samples of named entities collected from users of a voice assistance system, wherein each audio sample is annotated with a text transcript of the named entity and a corresponding category that the named entity belongs to of a plurality of categories (i.e., Khemka discloses user-collected named-entity audio samples annotated with transcripts and categories (¶0110–¶0116)); for each audio sample in the set of audio samples: generating the prompt including the named entity based on the corresponding category (i.e., Khemka’s DST/TransferQA prompt generation with slot/category context (¶0174–¶0176)); and providing the prompt as input to the LLM (¶0174–¶0176), wherein the updated named entity recognition data includes a plurality of possible commands including the named entity based on the corresponding category (i.e., Khemka’s LLM outputs slot values/dialogue state (¶0176)); and training, based on the plurality of possible commands generated by the LLM, at least one of a language model or a text-to-speech (TTS) model of the ASR system (i.e., Khemka’s training/fine-tuning of language/TTS models with generated examples (¶0179–¶0181; ¶0102–¶0106)).
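The claims 3/11/19 flow (annotated samples, a category-conditioned prompt, LLM-generated commands, then training data for the ASR) is concrete enough to sketch. The samples, templates, and stub LLM below are invented for illustration; only the shape of the flow comes from the claim language.

```python
# Annotated samples as the claim describes: transcript + category.
samples = [
    {"transcript": "Stranger Things", "category": "television program"},
    {"transcript": "KEXP", "category": "radio station"},
]

# Per-category command templates stand in for what the LLM would
# generate; '{e}' is the named-entity slot.
COMMAND_TEMPLATES = {
    "television program": ["play {e}", "record {e} tonight"],
    "radio station": ["tune to {e}", "play {e} in the kitchen"],
}


def build_prompt(entity: str, category: str) -> str:
    # "Generating the prompt including the named entity based on the
    # corresponding category."
    return (f"Generate voice-assistant commands that mention the "
            f"{category} '{entity}'.")


def toy_llm(prompt: str, entity: str, category: str) -> list[str]:
    # Deterministic stand-in for the LLM's command generation.
    return [t.format(e=entity) for t in COMMAND_TEMPLATES[category]]


# The generated commands become training text for the ASR language model.
training_corpus: list[str] = []
for s in samples:
    p = build_prompt(s["transcript"], s["category"])
    training_corpus += toy_llm(p, s["transcript"], s["category"])
```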
Regarding claims 4 and 12, Khemka discloses the device and method of claims 3 and 11, wherein the plurality of categories includes at least one of: an application name; a name of a person; a name of a television program; a name of a movie; a name of an electronic device; a name of a place; a name of a radio station; a name of a podcast; a name of a genre; a name of a business; a name of a sports team; or a name of a song (i.e., Khemka discloses many domains/slots and lists of types (music, movies, contacts, places, etc.), and teaches domain/slot vocabularies that encompass these categories (Khemka ¶0046; ¶0116; Fig. 3C)).

Regarding claims 5 and 13, Khemka discloses the device and method of claims 3 and 11, further comprising: creating a base model trained using the set of audio samples of named entities collected from the users of the voice assistance system; and periodically updating the ASR system and/or the LLM based on the base model (i.e., building base models from user data, on-device and federated training, and periodically updating models including ASR/NLU components (Khemka ¶0061; ¶0114–¶0115; ¶0179–¶0181)).

Regarding claims 6, 14 and 20, Khemka discloses the device, medium and method of claims 1, 9 and 17, further comprising: providing the prompt generated using the automated prompt generator to a text-to-speech (TTS) model and synthesizing, using the TTS model, an audio sample based on the prompt (i.e., Khemka explicitly describes a TTS component that generates synthesized speech from text and discusses generated examples for model fine-tuning (Khemka ¶0102–¶0106; ¶0179–¶0181)). Although Khemka discloses general model training, it does not explicitly teach training the ASR model using TTS-synthesized audio derived from LLM prompts. Evermann teaches generating augmented training data (including TTS-synthesized audio) for improving recognition of named entities and using such synthesized samples for ASR training/augmentation (Evermann ¶0015; ¶0136).
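The TTS data-augmentation step cited for claims 6, 14 and 20 amounts to: synthesize audio for each LLM-generated command, then pair the waveform with its transcript as an ASR training example. A minimal sketch, with a fake TTS stand-in (the function names and fake waveform encoding are illustrative, not from either reference):

```python
def toy_tts(text: str) -> bytes:
    # Stand-in for a TTS model: returns a fake "waveform" for the text.
    return f"<waveform:{text}>".encode()


def synthesize_training_pairs(commands: list[str]) -> list[tuple[bytes, str]]:
    # Pair each synthesized waveform with its transcript, yielding
    # (audio, text) examples for ASR training/augmentation.
    return [(toy_tts(c), c) for c in commands]


pairs = synthesize_training_pairs(["play The Beatles", "tune to KEXP"])
```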
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply Evermann’s data-augmentation technique using TTS samples to Khemka’s prompt/TTS flows in order to improve ASR training coverage for named entities recognized/produced by the LLM and to improve the performance of the ASR.

Regarding claims 7 and 15, Khemka discloses the device and method of claims 1 and 9, wherein the ASR system and the LLM are executed on a same electronic device (i.e., Khemka discloses on-device ASR/portions of NLU and acknowledges that some processing (ASR/parts of NLU/LLM) may execute on device (Khemka ¶0047–¶0053; Fig. 2)).

Regarding claims 8 and 16, Khemka discloses the device and method of claims 7 and 15, further comprising: providing user information stored on the same electronic device to the LLM; and processing, using the LLM, the identified at least one named entity hypothesis to generate the updated named entity recognition data using the user information (i.e., using local user data, personalized lexicons/gazetteers, and using such context in NLU/LLM processing (Khemka ¶0046; ¶0061; ¶0110–¶0116)).

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to PIERRE LOUIS DESIR whose telephone number is (571) 272-7799. The examiner can normally be reached Monday-Friday, 9 AM-5:30 PM.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users.
To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/PIERRE LOUIS DESIR/
Supervisory Patent Examiner, Art Unit 2659

Prosecution Timeline

Aug 27, 2024
Application Filed
Feb 26, 2026
Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12585679
EXECUTING UNSUPERVISED PRE-TRAINING TASKS WITH A MACHINE LEARNING MODEL TO PREDICT DOCUMENT GRAPH ATTRIBUTES
2y 5m to grant; granted Mar 24, 2026
Patent 12562154
Scalable Model Specialization Framework for Speech Model Personalization
2y 5m to grant; granted Feb 24, 2026
Patent 12555594
SYSTEM AND METHOD FOR TRACKING EMOTIONAL STATE OF A CALLER USING ARTIFICIAL INTELLIGENCE
2y 5m to grant; granted Feb 17, 2026
Patent 12542137
MULTI-PERSON LLM ASSISTANT INTERACTIONS
2y 5m to grant; granted Feb 03, 2026
Patent 12541672
ADDRESSING CATASTROPHIC FORGETTING AND OVER-GENERALIZATION WHILE TRAINING A NATURAL LANGUAGE TO A MEANING REPRESENTATION LANGUAGE SYSTEM
2y 5m to grant; granted Feb 03, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 61%
With Interview: 92% (+31.5%)
Median Time to Grant: 4y 4m
PTA Risk: Low
Based on 285 resolved cases by this examiner. Grant probability derived from career allow rate.
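The headline projections follow directly from the examiner's career numbers on this page; a quick sanity check of the arithmetic:

```python
# Figures from this page: 173 granted of 285 resolved cases,
# and a +31.5 percentage-point interview lift.
granted, resolved = 173, 285
career_allow_rate = granted / resolved               # ~0.607, shown as 61%
interview_lift = 0.315                               # +31.5 points
with_interview = career_allow_rate + interview_lift  # ~0.922, shown as 92%
```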
