Prosecution Insights
Last updated: April 19, 2026
Application No. 18/748,865

METHOD AND SYSTEM FOR TRAINING A NEURAL LANGUAGE-BASED MODEL FOR DATA ANNOTATION

Non-Final OA — §101, §102
Filed
Jun 20, 2024
Examiner
KIM, ETHAN DANIEL
Art Unit
2658
Tech Center
2600 — Communications
Assignee
JPMorgan Chase Bank, N.A.
OA Round
1 (Non-Final)
78%
Grant Probability
Favorable
1-2
OA Rounds
2y 11m
To Grant
99%
With Interview

Examiner Intelligence

Grants 78% — above average
78%
Career Allow Rate
83 granted / 107 resolved
+15.6% vs TC avg
Strong +30% interview lift
+29.5%
Interview Lift
resolved cases with interview
Typical timeline
2y 11m
Avg Prosecution
13 currently pending
Career history
120
Total Applications
across all art units

Statute-Specific Performance

§101
7.4%
-32.6% vs TC avg
§103
48.0%
+8.0% vs TC avg
§102
38.1%
-1.9% vs TC avg
§112
1.7%
-38.3% vs TC avg
Black line = Tech Center average estimate • Based on career data from 107 resolved cases

Office Action

§101 §102
Notice of Pre-AIA or AIA Status

1. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101

2. 35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

3. Claims 1-6 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Independent claim 1 recites “A method for training a neural language-based model for data annotation, the method comprising: receiving, by at least one processor via a communication interface, a first set of data from a plurality of sources, each item of the first set of data being associated with a pre-defined data class; generating, by the at least one processor, at least one category vocabulary for the pre-defined data class; identifying, by the at least one processor, at least one token in the first set of data based on an analysis of the first set of data, wherein the at least one token corresponds to a category indicator of the pre-defined data class; masking, by the at least one processor, the at least one token; feeding, by the at least one processor, the masked at least one token together with a corresponding contextual vector to the neural language-based model; and predicting, by the at least one processor using the neural language-based model, a class of the masked at least one token using the corresponding contextual vector”.
The limitations “receiving, by at least one processor via a communication interface, a first set of data from a plurality of sources, each item of the first set of data being associated with a pre-defined data class; generating, by the at least one processor, at least one category vocabulary for the pre-defined data class; identifying, by the at least one processor, at least one token in the first set of data based on an analysis of the first set of data, wherein the at least one token corresponds to a category indicator of the pre-defined data class; masking, by the at least one processor, the at least one token; feeding, by the at least one processor, the masked at least one token together with a corresponding contextual vector to the neural language-based model; and predicting, by the at least one processor using the neural language-based model, a class of the masked at least one token using the corresponding contextual vector”, as drafted, cover a mental process, as these steps could be performed mentally or by hand with pen and paper. This judicial exception is not integrated into a practical application. Claim 1 recites “A method for training a neural language-based model for data annotation, the method comprising:…”; this limitation merely directs that a computer be used for the method and does not impose any meaningful limits on practicing the abstract idea. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The generic computer components recited above with regard to claim 1 do not amount to more than mere instructions to apply the exception using a generic computer component, and mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Claim 1 does not recite any additional limitations. The claim, as drafted, is not patent eligible.

Claim Rejections - 35 USC § 102

4.
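For orientation, the claimed pipeline (build a category vocabulary, mask category-indicator tokens, and predict the masked token's class from its context) can be sketched in miniature. All helper names below are invented for illustration, and a simple co-occurrence counter stands in for the neural language-based model; this is a sketch of the idea, not the applicant's implementation.

```python
# Toy sketch of the claimed flow: build a category vocabulary, mask the
# category-indicator tokens, and predict the masked token's class from its
# context. All names are hypothetical, and a count-based voter stands in
# for the neural language-based model.
from collections import defaultdict

MASK = "[MASK]"

def mask_tokens(text, indicators):
    """Replace every token that matches a known category indicator."""
    return [MASK if tok in indicators else tok for tok in text.split()]

def train(records, category_vocab):
    """'Train' by counting which context words co-occur with each class."""
    indicators = set().union(*category_vocab.values())
    votes = defaultdict(lambda: defaultdict(int))
    for text, data_class in records:
        for tok in mask_tokens(text, indicators):
            if tok != MASK:                      # only context words vote
                votes[tok][data_class] += 1
    return votes, indicators

def predict(text, votes, indicators):
    """Predict the class of the masked token(s) from surrounding context."""
    scores = defaultdict(int)
    for tok in mask_tokens(text, indicators):
        if tok != MASK:
            for data_class, n in votes[tok].items():
                scores[data_class] += n
    return max(scores, key=scores.get)

category_vocab = {
    "positive": {"delicious", "flavorful", "enjoyable"},
    "negative": {"bland", "disgusting", "terrible"},
}
records = [
    ("the soup was delicious and flavorful", "positive"),
    ("the bread was bland and disgusting", "negative"),
]
votes, indicators = train(records, category_vocab)
print(predict("the soup was delicious", votes, indicators))  # prints: positive
```

The point of the sketch is structural: the class of a masked indicator is recovered from its unmasked context, which is the behavior the §101 rejection characterizes as performable mentally.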
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

5. Claims 1-15 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Reza (U.S. Publication No. 20230237277).

Regarding claim 1, Reza discloses a method for training a neural language-based model for data annotation, the method comprising: receiving, by at least one processor via a communication interface, a first set of data from a plurality of sources, each item of the first set of data being associated with a pre-defined data class ([0036] - The labels may be provided by a user (e.g., a customer) and may be particular to a domain that the user intends to train the model within. For example, the text labels may be words such as terrible, bland, flavorful, delicious, disgusting, sour, sweet, poison, enjoyable, spicy, etc. that relate to various semantic classes (e.g., positive, negative, neutral, or the like) to be predicted for each text example within the domain of food); generating, by the at least one processor, at least one category vocabulary for the pre-defined data class ([0038] - a prompt template 115 is generated for each relevant feature (a subset of features) selected from the extracted relevant features 110. Each dynamic prompt template 115 comprises a prompt for a text example from the training data, a relevant feature 110 extracted from the text example, and a blank or open field. The prompts are in natural language and are composed of discrete tokens from the vocabulary); identifying, by the at least one processor, at least one token in the first set of data based on an analysis of the first set of data, wherein the at least one token corresponds to a category indicator of the pre-defined data class ([0038] - The prompts are in natural language and are composed of discrete tokens from the vocabulary); masking, by the at least one processor, the at least one token ([0041] - The labels within the prompting templates are then masked with a masking token); feeding, by the at least one processor, the masked at least one token together with a corresponding contextual vector to the neural language-based model ([0041] - The prompting functions 120 with the masked prompting templates are then input into the model; [0046] - The training techniques may depend on the type of model that is being trained. For example, there are different types of supervised learning models, such as different types of neural network models, support vector machine (SVM) models, and others); and predicting, by the at least one processor using the neural language-based model, a class of the masked at least one token using the corresponding contextual vector ([0064] - The model learns statistical properties of word sequences and linguistic patterns given the words in the prompting functions (i.e., the words in the text example and the prompting template) and uses those properties and patterns to predict text for the masked labels. The conditional probability of predicting the text for the mask labels provided the set of prompting functions is evaluated and combined together to predict a joint probability value for the solution such as a class (e.g., a sentiment class)).
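The Reza passages cited above ([0038], [0041]) describe composing natural-language prompt templates whose label slot is then masked before the prompting function is fed to the model. A minimal illustration of that pattern, with invented names and template wording not taken from the reference, might look like:

```python
# Hypothetical illustration of the masked prompt-template pattern described
# in the cited Reza passages ([0038], [0041]); the template wording and the
# function name are invented here, not taken from the reference.
MASK = "[MASK]"

def make_prompt(text_example, relevant_feature):
    """Compose a natural-language prompt whose label slot is left masked."""
    return f"{text_example} The {relevant_feature} was {MASK}."

prompt = make_prompt("I loved this restaurant.", "food")
print(prompt)  # prints: I loved this restaurant. The food was [MASK].
```

In a prompt-based setup of this kind, the model's prediction for the masked slot (e.g., "delicious" vs. "terrible") is what gets mapped back to a semantic class.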
Regarding claim 2, Reza discloses the method, wherein the identifying of the at least one token in the first set of data comprises: identifying, by the at least one processor, contextually similar words in the first set of data that represent the pre-defined data class ([0036] - The labels may be provided by a user (e.g., a customer) and may be particular to a domain that the user intends to train the model within. For example, the text labels may be words such as terrible, bland, flavorful, delicious, disgusting, sour, sweet, poison, enjoyable, spicy, etc. that relate to various semantic classes (e.g., positive, negative, neutral, or the like) to be predicted for each text example within the domain of food); retrieving, by the at least one processor from a word repository, at least one replacement word for the contextually similar words ([0057] - This paraphrasing can be done in a number of ways, including using round-trip translation of the prompt into another language then back, using replacement of phrases from a thesaurus, or using a neural prompt rewriter specifically optimized to improve accuracy of systems using the prompt); checking, by the at least one processor, an occurrence of the at least one replacement word in the at least one category vocabulary ([0056] - The prompt mining approach is a mining-based method to automatically find templates given a set of training inputs x and outputs y. This method scrapes a large text corpus (e.g., Wikipedia) for strings containing x and y, and finds either the middle words or dependency paths between the inputs and outputs. Frequent middle words or dependency paths can serve as a template as in “[X] middle words [Z].” The middle words may be searched based on the extracted features or a middle word may be replaced with an extracted feature); and tagging, by the at least one processor, the contextually similar words as the at least one token in an event that the occurrence of the at least one replacement word in the at least one category vocabulary exceeds a threshold number ([0076] - modifying the model parameters with the goal being to minimize the difference between the text labels and the text predicted for the mask tokens and minimize the difference between the specified solution label and the joint probability value predicted for the solution. The training may be performed iteratively (steps a-h) for each prompting function and/or until a specified condition is met, e.g., the model achieves an accuracy above a given threshold. The first cost function and the second cost function may be the same cost function or different cost functions).

Regarding claim 3, Reza discloses the method, further comprising implementing a self-training process for the neural language-based model on an unlabeled second set of data ([0033] - obtaining a set of training data comprising text examples and associated labels, where the labels comprise: (i) text labels that relate to possible solutions for a task to be learned by a machine learning language model, and (ii) specified solution labels for the task).

Regarding claim 4, Reza discloses the method, further comprising: updating, by the at least one processor, the at least one category vocabulary with the identified at least one token for the pre-defined data class ([0033] - where the training learns or updates model parameters of the machine learning language model for performing the task; and providing the machine learning language model with the learned or updated model parameters.
[0038] - a prompt template 115 is generated for each relevant feature (a subset of features) selected from the extracted relevant features 110. Each dynamic prompt template 115 comprises a prompt for a text example from the training data, a relevant feature 110 extracted from the text example, and a blank or open field. The prompts are in natural language and are composed of discrete tokens from the vocabulary).

Regarding claim 5, Reza discloses the method, wherein the first set of data comprises domain-specific data ([0036] - The labels may be provided by a user (e.g., a customer) and may be particular to a domain that the user intends to train the model within).

Regarding claim 6, Reza discloses a computing device configured to implement an execution of a method for training a neural language-based model for data annotation, the computing device comprising: a processor ([0070] - …executed by one or more processing units…); a memory ([0070] – The software may be stored in a memory…); and a communication interface coupled to each of the processor and the memory, wherein the processor is configured to: receive, via the communication interface, a first set of data from a plurality of sources, each item of the first set of data being associated with a pre-defined data class ([0036] - The labels may be provided by a user (e.g., a customer) and may be particular to a domain that the user intends to train the model within. For example, the text labels may be words such as terrible, bland, flavorful, delicious, disgusting, sour, sweet, poison, enjoyable, spicy, etc. that relate to various semantic classes (e.g., positive, negative, neutral, or the like) to be predicted for each text example within the domain of food); generate at least one category vocabulary for the pre-defined data class ([0038] - a prompt template 115 is generated for each relevant feature (a subset of features) selected from the extracted relevant features 110. Each dynamic prompt template 115 comprises a prompt for a text example from the training data, a relevant feature 110 extracted from the text example, and a blank or open field. The prompts are in natural language and are composed of discrete tokens from the vocabulary); identify at least one token in the first set of data based on an analysis of the first set of data, wherein the at least one token corresponds to a category indicator of the pre-defined data class ([0038] - The prompts are in natural language and are composed of discrete tokens from the vocabulary); mask the at least one token ([0041] - The labels within the prompting templates are then masked with a masking token); feed the masked at least one token together with a corresponding contextual vector to the neural language-based model ([0041] - The prompting functions 120 with the masked prompting templates are then input into the model; [0046] - The training techniques may depend on the type of model that is being trained. For example, there are different types of supervised learning models, such as different types of neural network models, support vector machine (SVM) models, and others); and predict, using the neural language-based model, a class of the masked at least one token using the corresponding contextual vector ([0064] - The model learns statistical properties of word sequences and linguistic patterns given the words in the prompting functions (i.e., the words in the text example and the prompting template) and uses those properties and patterns to predict text for the masked labels. The conditional probability of predicting the text for the mask labels provided the set of prompting functions is evaluated and combined together to predict a joint probability value for the solution such as a class (e.g., a sentiment class)).

Dependent claims 7-10 are analogous in scope to claims 2-5, and are rejected according to the same reasoning.
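The claim 2 tagging logic (retrieve replacement words for a candidate from a word repository, count how many appear in the category vocabulary, and tag the candidate as a token when the count exceeds a threshold) can be sketched as follows; the thesaurus, vocabulary, and threshold here are illustrative placeholders, not taken from the application or from Reza:

```python
# Hedged sketch of the claim-2 tagging step: a candidate word is tagged as a
# category token when enough of its replacement words (from a word repository,
# e.g. a thesaurus) already occur in the category vocabulary. The thesaurus,
# vocabulary, and threshold below are illustrative placeholders.
def tag_tokens(candidates, thesaurus, category_vocab, threshold=1):
    """Tag candidates whose replacement-word hits exceed the threshold."""
    tagged = []
    for word in candidates:
        replacements = thesaurus.get(word, [])
        hits = sum(rep in category_vocab for rep in replacements)
        if hits > threshold:
            tagged.append(word)
    return tagged

category_vocab = {"delicious", "flavorful", "enjoyable"}
thesaurus = {
    "tasty": ["delicious", "flavorful", "savory"],
    "stale": ["old", "musty"],
}
print(tag_tokens(["tasty", "stale"], thesaurus, category_vocab))  # prints: ['tasty']
```

A response distinguishing over Reza could emphasize that this vocabulary-membership threshold test is a distinct mechanism from the cost-function training threshold the examiner cites at [0076].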
Regarding claim 11, Reza discloses a non-transitory computer readable storage medium storing instructions for training a neural language-based model for data annotation, the storage medium comprising executable code which, when executed by a processor, causes the processor to: receive, via a communication interface, a first set of data from a plurality of sources, each item of the first set of data being associated with a pre-defined data class ([0036] - The labels may be provided by a user (e.g., a customer) and may be particular to a domain that the user intends to train the model within. For example, the text labels may be words such as terrible, bland, flavorful, delicious, disgusting, sour, sweet, poison, enjoyable, spicy, etc. that relate to various semantic classes (e.g., positive, negative, neutral, or the like) to be predicted for each text example within the domain of food); generate at least one category vocabulary for the pre-defined data class ([0038] - a prompt template 115 is generated for each relevant feature (a subset of features) selected from the extracted relevant features 110. Each dynamic prompt template 115 comprises a prompt for a text example from the training data, a relevant feature 110 extracted from the text example, and a blank or open field. The prompts are in natural language and are composed of discrete tokens from the vocabulary); identify at least one token in the first set of data based on an analysis of the first set of data, wherein the at least one token corresponds to a category indicator of the pre-defined data class ([0038] - The prompts are in natural language and are composed of discrete tokens from the vocabulary); mask the at least one token ([0041] - The labels within the prompting templates are then masked with a masking token); feed the masked at least one token together with a corresponding contextual vector to the neural language-based model ([0041] - The prompting functions 120 with the masked prompting templates are then input into the model; [0046] - The training techniques may depend on the type of model that is being trained. For example, there are different types of supervised learning models, such as different types of neural network models, support vector machine (SVM) models, and others); and predict, using the neural language-based model, a class of the masked at least one token using the corresponding contextual vector ([0064] - The model learns statistical properties of word sequences and linguistic patterns given the words in the prompting functions (i.e., the words in the text example and the prompting template) and uses those properties and patterns to predict text for the masked labels. The conditional probability of predicting the text for the mask labels provided the set of prompting functions is evaluated and combined together to predict a joint probability value for the solution such as a class (e.g., a sentiment class)).

Dependent claims 12-15 are analogous in scope to claims 2-5, and are rejected according to the same reasoning.

Conclusion

6. The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Girardi (U.S. Publication No. 20210286831) teaches query expansion in information retrieval systems. Nguyen (U.S. Publication No. 20230044266) teaches a machine learning method and named entity recognition apparatus.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ETHAN DANIEL KIM, whose telephone number is (571) 272-1405. The examiner can normally be reached Monday - Friday, 9:00 - 5:00.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Richemond Dorvil, can be reached at (571) 272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ETHAN DANIEL KIM/
Examiner, Art Unit 2658

/RICHEMOND DORVIL/
Supervisory Patent Examiner, Art Unit 2658

Prosecution Timeline

Jun 20, 2024
Application Filed
Jan 05, 2026
Non-Final Rejection — §101, §102 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12597414
GENERATION OF TRAINING EXAMPLES FOR TRAINING AUTOMATIC SPEECH RECOGNIZERS
2y 5m to grant Granted Apr 07, 2026
Patent 12596874
OPERATION ERROR DETECTION
2y 5m to grant Granted Apr 07, 2026
Patent 12573384
DEVICE CONTROL SYSTEM
2y 5m to grant Granted Mar 10, 2026
Patent 12566922
KNOWLEDGE ACCELERATOR PLATFORM WITH SEMANTIC LABELING ACROSS DIFFERENT ASSETS
2y 5m to grant Granted Mar 03, 2026
Patent 12562183
DEEP LEARNING FOR JOINT ACOUSTIC ECHO AND ACOUSTIC HOWLING SUPPRESSION IN HYBRID MEETINGS
2y 5m to grant Granted Feb 24, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.

Prosecution Projections

1-2
Expected OA Rounds
78%
Grant Probability
99%
With Interview (+29.5%)
2y 11m
Median Time to Grant
Low
PTA Risk
Based on 107 resolved cases by this examiner. Grant probability derived from career allow rate.
