Last updated: May 29, 2026

Application No. 18/428,208

ENHANCED DOMAIN-SPECIFIC LANGUAGE LEARNING MODELS

Non-Final OA §103

Filed

Jan 31, 2024

Examiner

GODBOLD, DOUGLAS

Art Unit

2655

Tech Center

2600 — Communications

Assignee

Genpact Usa Inc.

OA Round

3 (Non-Final)

Interview Optional

Interview lift (+10.6%) is below the 15.0% threshold. A written response is recommended.

Based on 1089 resolved cases, 2023–2026

Examiner Intelligence

GODBOLD, DOUGLAS

Grants 83% — above average

Career Allowance Rate

906 granted / 1089 resolved

+21.2% vs TC avg

Moderate +11% lift

+10.6%

Interview Lift (allowance with vs. without an interview)

Typical timeline

2y 9m

Avg Prosecution

18 currently pending

Career history

1106

Total Applications

across all art units

Statute-Specific Performance

§101

6.5%

-33.5% vs TC avg

§103

76.6%

+36.6% vs TC avg

§102

7.0%

-33.0% vs TC avg

§112

4.7%

-35.3% vs TC avg

Tech Center averages are estimates. Based on career data from 1089 resolved cases.

Office Action

§103

DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 28 April 2026 has been entered.

Response to Amendment
The amendment filed 28 April 2026 has been accepted and considered in this office action.  Claims 1, 3, 11, 13 and 20 have been amended.

Response to Arguments
Applicant’s arguments with respect to claim(s) 1, 11, and 20 regarding iteratively training until a training loss falls below a specific threshold have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Applicant's arguments filed 28 April 2026 regarding “wherein the domain-specific data used for training the domain language model is separate from an input corpus used for generating embeddings for one or more downstream tasks” have been fully considered but they are not persuasive.  Applicant argues (see pages 8-9) that Tai fails to teach these limitations.  However, the examiner notes that Tai pre-trains the domain extension model with the 17G-Bio corpus (see section 4.3) while using the MTL-Bioinformatics-2016 database for finetuning for downstream tasks (see section 4.1).  Thus the data is separate as claimed.

Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Claim(s) 1, 4-8, 10, 11, 14-18, and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Tai et al. (exBERT: Extending Pre-trained Models with Domain-specific Vocabulary Under Constrained Training Resources) in view of Lee et al. (US PAP 20180336183).

Consider claim 1, Tai teaches A method for creating an enhanced domain-specific language learning model (abstract), the method comprising:
training a domain language model using domain-specific data by iteratively supplying the domain specific data to the domain language model (section 3.1, 3.2, extension module, section 4.1, Adaptive pretraining extension module trained using medical data); 
receiving input corpus for one or more downstream tasks (section 3.1, input text); 
using the domain language model with the input corpus to generate a first set of embeddings (Section 3.1, and 3.2, extension module used to generate embeddings, also figure 1b) wherein the domain-specific data used for training the domain language model is separate from an input corpus used for generating embeddings for one or more downstream tasks (pre-training the domain extension model with 17G-Bio corpus (see section 4.3) while using MTL-Bioinformatics-2016 database for finetuning for downstream tasks (see section 4.1)); 
using a pre-trained large language model (LLM) with the input corpus to generate a second set of embeddings (Section 3.1, and 3.2, using off the shelf BERT model to generate embeddings, also figure 1b); 
combining the first and second sets of embeddings to form a combined set of embeddings (3.2 combining extension embeddings and off the shelf embeddings using weights); and 
performing the one or more downstream tasks using the combined set of embeddings (section 4.1, performing downstream tasks such as Named Entity Recognition).
Tai does not explicitly teach iteratively adjusting at least one activation function associated with the domain language model until a training loss falls below a specific threshold.
In the same field of machine learning, Lee teaches iteratively adjusting at least one activation function associated with the domain language model until a training loss falls below a specific threshold (figure 6, 0173, updating parameters and repeating training until loss is below a predetermined threshold).
It would have been obvious to one of ordinary skill in the art at the time of effective filing to use training losses and thresholds to adjust activations of models as taught by Lee in the system of Tai in order to make use of the most well-known and understood method of training neural network models. 
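The stopping rule at issue in this combination (repeat training until the loss is below a predetermined threshold) can be illustrated with a minimal sketch. This is not Lee's or Tai's implementation: the function name `train_until_threshold`, the toy quadratic loss, the learning rate, and the step cap are all assumptions chosen for illustration, and the sketch updates a single scalar parameter rather than an activation function.

```python
def train_until_threshold(loss_fn, grad_fn, w, lr=0.1, threshold=1e-4, max_steps=10_000):
    """Repeat gradient steps until loss_fn(w) < threshold or max_steps is reached.

    All names and defaults here are hypothetical, for illustration only.
    """
    steps = 0
    loss = loss_fn(w)
    while loss >= threshold and steps < max_steps:
        w = w - lr * grad_fn(w)   # one parameter update
        loss = loss_fn(w)         # re-evaluate the training loss
        steps += 1
    return w, loss, steps


# Toy problem (assumption): minimize (w - 3)^2 starting from w = 0.
loss_fn = lambda w: (w - 3.0) ** 2
grad_fn = lambda w: 2.0 * (w - 3.0)

w, loss, steps = train_until_threshold(loss_fn, grad_fn, w=0.0)
print(f"stopped after {steps} steps with loss {loss:.2e}")
```

The step cap guards against a threshold that is never reached; whether the cited references use such a cap is not stated in this action.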

Consider claim 4, Tai teaches The method of claim 1, wherein the domain language model is domain agnostic, and wherein prior to training the domain language model, the method further comprises: receiving the domain-specific data from domains in at least finance, insurance, medicine, or artificial intelligence (AI) services (section 4.1, using medical corpus).

Consider claim 5, Tai teaches the method of claim 1, wherein the pre-trained LLM is a generative pre-trained transformer (GPT) model or a bidirectional encoder representation from transformers (BERT) model (section 3.1, off the shelf BERT model).

Consider claim 6, Tai teaches The method of claim 1, wherein combining the first and second sets of embeddings comprises:
generating the combined set of embeddings to capture and integrate both general and domain-specific knowledge respectively learned using the pre-trained LLM and the domain language model (3.2 combining extension embeddings and off the shelf embeddings using weights); and 
using the combined set of embeddings as input to the one or more downstream tasks (section 4.1-4.3, using embeddings to perform named entity recognition (NER)).

Consider claim 7, Tai teaches the method of claim 6, wherein the downstream tasks include one or more of classification, clustering, named entity recognition (NER), and retrieval augmented generation (RAG) (section 4.1-4.3, using embeddings to perform named entity recognition (NER)).

Consider claim 8, Tai teaches the method of claim 6, wherein the combined set of embeddings are generated based on concatenation or weighted averaging (section 3.2, weighted summation, i.e., an average of the off-the-shelf and extension embeddings).
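The two combination options recited in claim 8 can be contrasted with a short sketch. It is only an illustration under stated assumptions: the helper names `concat` and `weighted_average`, the fixed weight of 0.5, and the two-element example vectors are hypothetical and are not taken from Tai or from the application.

```python
def concat(general, domain):
    """Concatenation: the combined vector has len(general) + len(domain) entries."""
    return list(general) + list(domain)


def weighted_average(general, domain, weight=0.5):
    """Weighted average: element-wise mix; both vectors must have the same length."""
    if len(general) != len(domain):
        raise ValueError("embeddings must have equal dimension to be averaged")
    return [weight * g + (1.0 - weight) * d for g, d in zip(general, domain)]


# Hypothetical embeddings of one token from the two models.
general_emb = [1.0, 2.0]   # e.g. from the pre-trained LLM
domain_emb = [3.0, 4.0]    # e.g. from the domain language model

print(concat(general_emb, domain_emb))            # [1.0, 2.0, 3.0, 4.0]
print(weighted_average(general_emb, domain_emb))  # [2.0, 3.0]
```

Concatenation keeps both representations intact at the cost of a larger vector, while weighted averaging keeps the original dimension but requires the two embeddings to share it.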

Consider claim 10, Tai teaches The method of claim 1, further comprising preprocessing the input corpus (section 3.1, removing overlapping words that also appear in pre-trained BERT corpus).

Consider claim 11, Tai teaches A system for creating an enhanced domain-specific language learning model, the system comprising: 
a processor (section 4.1, GPUs); and 
a memory in communication with the processor and comprising instructions (section 4.1, using GPUs, which include memories) which, when executed by the processor, program the processor to:
train a domain language model using domain-specific data by iteratively supplying the domain specific data to the domain language model (section 3.1, 3.2, extension module, section 4.1, Adaptive pretraining extension module trained using medical data); 
receive input corpus for one or more downstream tasks (section 3.1, input text); 
use the domain language model with the input corpus to generate a first set of embeddings (Section 3.1, and 3.2, extension module used to generate embeddings, also figure 1b) wherein the domain-specific data used for training the domain language model is separate from an input corpus used for generating embeddings for one or more downstream tasks (pre-training the domain extension model with 17G-Bio corpus (see section 4.3) while using MTL-Bioinformatics-2016 database for finetuning for downstream tasks (see section 4.1)); 
use a pre-trained large language model (LLM) with the input corpus to generate a second set of embeddings (Section 3.1, and 3.2, using off the shelf BERT model to generate embeddings, also figure 1b); 
combine the first and second sets of embeddings to form a combined set of embeddings (3.2 combining extension embeddings and off the shelf embeddings using weights); and 
perform the one or more downstream tasks using the combined set of embeddings (section 4.1, performing downstream tasks such as Named Entity Recognition).
Tai does not explicitly teach iteratively adjusting at least one activation function associated with the domain language model until a training loss falls below a specific threshold.
In the same field of machine learning, Lee teaches iteratively adjusting at least one activation function associated with the domain language model until a training loss falls below a specific threshold (figure 6, 0173, updating parameters and repeating training until loss is below a predetermined threshold).
It would have been obvious to one of ordinary skill in the art at the time of effective filing to use training losses and thresholds to adjust activations of models as taught by Lee in the system of Tai in order to make use of the most well-known and understood method of training neural network models. 
 
Claim 14 contains similar limitations as claim 4 and therefore is rejected for the same reasons.

Claim 15 contains similar limitations as claim 5 and therefore is rejected for the same reasons.

Claim 16 contains similar limitations as claim 6 and therefore is rejected for the same reasons.

Claim 17 contains similar limitations as claim 7 and therefore is rejected for the same reasons.

Claim 18 contains similar limitations as claim 8 and therefore is rejected for the same reasons.

Consider claim 20, Tai teaches A computer program product for creating an enhanced domain-specific language learning model, the computer program product comprising a non-transitory computer-readable medium having computer readable program code stored thereon (section 4.1, using GPUs, which include memories), the computer readable program code configured to: 
train a domain language model using domain-specific data by iteratively supplying the domain specific data to the domain language model (section 3.1, 3.2, extension module, section 4.1, Adaptive pretraining extension module trained using medical data); 
receive input corpus for one or more downstream tasks (section 3.1, input text); 
use the domain language model with the input corpus to generate a first set of embeddings (Section 3.1, and 3.2, extension module used to generate embeddings, also figure 1b) wherein the domain-specific data used for training the domain language model is separate from an input corpus used for generating embeddings for one or more downstream tasks (pre-training the domain extension model with 17G-Bio corpus (see section 4.3) while using MTL-Bioinformatics-2016 database for finetuning for downstream tasks (see section 4.1)); 
use a pre-trained large language model (LLM) with the input corpus to generate a second set of embeddings (Section 3.1, and 3.2, using off the shelf BERT model to generate embeddings, also figure 1b); 
combine the first and second sets of embeddings to form a combined set of embeddings (3.2 combining extension embeddings and off the shelf embeddings using weights); and 
perform the one or more downstream tasks using the combined set of embeddings (section 4.1, performing downstream tasks such as Named Entity Recognition).
Tai does not explicitly teach iteratively adjusting at least one activation function associated with the domain language model until a training loss falls below a specific threshold.
In the same field of machine learning, Lee teaches iteratively adjusting at least one activation function associated with the domain language model until a training loss falls below a specific threshold (figure 6, 0173, updating parameters and repeating training until loss is below a predetermined threshold).
It would have been obvious to one of ordinary skill in the art at the time of effective filing to use training losses and thresholds to adjust activations of models as taught by Lee in the system of Tai in order to make use of the most well-known and understood method of training neural network models. 

Claim(s) 2, 3, 12, 13 is/are rejected under 35 U.S.C. 103 as being unpatentable over Tai and Lee as applied to claims 1 and 11 above, and further in view of Shekhar et al. (US PAP 2025/0232134).

Consider claim 2, Tai and Lee teach the method of claim 1, wherein the domain language model is trained to recognize and capture linguistic patterns, structures, and semantics of one or more specialized domains based on the domain-specific data (Tai section 3.1 and 3.2, extension module captures meaning and structures of domain specific words and phrases), but do not specifically teach wherein the domain language model is a domain-specific causal language model (CLM).
In the same field of language modeling, Shekhar teaches wherein the domain language model is a domain-specific causal language model (CLM) (0056, LM may be Causal Language Model).
It would have been obvious to one of ordinary skill in the art at the time of effective filing to use a CLM as taught by Shekhar in the system of Tai and Lee in order to allow for learning context of the words and phrases (Shekhar 0056).

Consider claim 3, Shekhar teaches the method of claim 2, further comprising performing statistical evaluation of the CLM by examining at least one of perplexity scores, training loss, and validation loss (0026, using losses for training, i.e., training losses, to train the model).

Claim 12 contains similar limitations as claim 2 and therefore is rejected for the same reasons.

Claim 13 contains similar limitations as claim 3 and therefore is rejected for the same reasons.

Claim(s) 9 and 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Tai and Lee as applied to claims 6 and 16 above and further in view of Yan et al. (US PAP 2025/0094718).

Consider claim 9, Tai and Lee teach The method of claim 6, but do not specifically teach applying dimensionality reduction to the combined embeddings.
In the same field of combining text embeddings, Yan teaches applying dimensionality reduction to the combined embeddings (0081, performing dimensionality reduction on the combined embedding).
It would have been obvious to one of ordinary skill in the art at the time of effective filing to use dimensionality reduction as taught by Yan in the system of Tai and Lee in order to reduce processing requirements needed to process the embeddings.
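As a rough illustration of what applying dimensionality reduction to a combined embedding can look like, the sketch below uses a Gaussian random projection. This technique is swapped in for illustration only; the action does not say which reduction method Yan or the application uses, and the names `random_projection_matrix` and `project`, the dimensions, and the seed are all assumptions.

```python
import random


def random_projection_matrix(in_dim, out_dim, seed=0):
    """Build an out_dim x in_dim Gaussian matrix scaled by 1/sqrt(out_dim)."""
    rng = random.Random(seed)  # fixed seed so the projection is reproducible
    scale = 1.0 / out_dim ** 0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)]
            for _ in range(out_dim)]


def project(vec, matrix):
    """Map a vector to the lower-dimensional space via a matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


# Hypothetical 8-dimensional combined embedding reduced to 3 dimensions.
combined = [0.1 * i for i in range(8)]
P = random_projection_matrix(in_dim=8, out_dim=3)
reduced = project(combined, P)
print(len(combined), "->", len(reduced))
```

A smaller vector means fewer multiply-adds in every downstream layer, which is the processing saving the rationale above points to.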

Claim 19 contains similar limitations as claim 9 and therefore is rejected for the same reasons.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DOUGLAS C GODBOLD whose telephone number is (571)270-1451. The examiner can normally be reached 6:30am-5pm Monday-Thursday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders, can be reached at (571)272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

DOUGLAS GODBOLD
Examiner
Art Unit 2655

/DOUGLAS GODBOLD/Primary Examiner, Art Unit 2655

Prosecution Timeline

Jan 31, 2024

Application Filed

Oct 24, 2025

Non-Final Rejection mailed — §103

Jan 15, 2026

Response Filed

Jan 28, 2026

Final Rejection mailed — §103

Apr 28, 2026

Request for Continued Examination

Apr 30, 2026

Response after Non-Final Action

May 06, 2026

Non-Final Rejection mailed — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/384,009

Patent 12640138

REAL-TIME VOICE RECOGNITION METHOD, MODEL TRAINING METHOD, APPARATUSES, DEVICE, AND STORAGE MEDIUM

2y 7m to grant Granted May 26, 2026

18/449,237

Patent 12626690

SYSTEMS, METHODS, AND DEVICES FOR LOW-POWER AUDIO SIGNAL DETECTION

2y 9m to grant Granted May 12, 2026

18/365,765

Patent 12614553

METHOD, APPARATUS, ELECTRONIC DEVICE, AND MEDIUM FOR SPEECH PROCESSING

2y 8m to grant Granted Apr 28, 2026

18/429,150

Patent 12614037

LARGE LANGUAGE MODEL INTERFACE FOR COMPLEX DATABASES

2y 2m to grant Granted Apr 28, 2026

18/739,304

Patent 12613919

Error Correcting of Programming Code Generated Through Integration with Generative Artificial Intelligence

1y 10m to grant Granted Apr 28, 2026

Based on the 5 most recent grants.

Prosecution Projections

3-4

Expected OA Rounds

83%

Grant Probability

94%

With Interview (+10.6%)

2y 9m (~5m remaining)

Median Time to Grant

High

PTA Risk

Based on 1089 resolved cases by this examiner. Grant probability derived from career allowance rate.
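The projection figures can be reproduced from the counts shown on this page. The sketch below assumes the interview lift is added to the career allowance rate in percentage points; the page does not state the formula, so that additive step is an assumption that happens to match the displayed 83% and 94%.

```python
granted, resolved = 906, 1089          # counts shown under Career Allowance Rate
interview_lift = 10.6                  # percentage points, as shown on the page

allowance_rate = 100.0 * granted / resolved          # career allowance rate in percent
with_interview = allowance_rate + interview_lift     # assumption: lift is additive

print(round(allowance_rate), round(with_interview))  # prints "83 94"
```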