Prosecution Insights
Last updated: April 19, 2026
Application No. 18/175,601

SELF-SUPERVISED TERM ENCODING WITH CONFIDENCE ESTIMATION

Non-Final OA: §101, §103
Filed: Feb 28, 2023
Examiner: SACKALOSKY, COREY MATTHEW
Art Unit: 2128
Tech Center: 2100 — Computer Architecture & Software
Assignee: International Business Machines Corporation
OA Round: 1 (Non-Final)

Grant Probability: 64% (Moderate)
Expected OA Rounds: 1-2
Time to Grant: 4y 2m
With Interview: 99%

Examiner Intelligence

Grants 64% of resolved cases.

Career Allow Rate: 64% (16 granted / 25 resolved; +9.0% vs TC avg)
Interview Lift: +49.4% (resolved cases with interview vs without)
Avg Prosecution: 4y 2m (39 currently pending)
Total Applications: 64 (across all art units)

Statute-Specific Performance

§101: 42.0% (+2.0% vs TC avg)
§103: 38.0% (-2.0% vs TC avg)
§102: 12.9% (-27.1% vs TC avg)
§112: 7.1% (-32.9% vs TC avg)

Tech Center averages are estimates. Based on career data from 25 resolved cases.
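The figures above are straightforward ratios of the counts shown on this page. As a quick recomputation (using only the numbers quoted here; the dictionary layout is just for illustration):

```python
# Recompute the headline figures from the raw counts on this page
# (16 granted of 25 resolved; per-statute rates and deltas vs TC avg).
granted, resolved = 16, 25
allow_rate = granted / resolved
print(f"Career allow rate: {allow_rate:.0%}")

# rate - delta recovers the Tech Center average estimate that each
# delta was computed against.
statute_rates = {"101": (42.0, 2.0), "103": (38.0, -2.0),
                 "102": (12.9, -27.1), "112": (7.1, -32.9)}
for statute, (rate, delta) in statute_rates.items():
    print(f"§{statute}: {rate}% (TC avg estimate {rate - delta:.1f}%)")
```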

Office Action

§101, §103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement

The information disclosure statement (IDS) submitted on 02/28/2023 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because they are directed toward an abstract idea without significantly more.

Step 1 analysis: Independent Claims 1 and 15 recite, in part, a computer-implemented method, therefore falling into the statutory category of process. Independent Claim 19 recites, in part, a computer program product, therefore falling into the statutory category of manufacture.

Regarding Claim 1:

Step 2A: Prong 1 analysis: Claim 1 recites in part: “generating, with the term encoder, second embeddings from numerical representations of word subunits of the training terms with an objective of minimizing distances between the first embeddings and the second embeddings”. As drafted and under its broadest reasonable interpretation, this limitation covers a mathematical relationship/concept (generating numerical values and minimizing distances). “predicting confidence scores based on the minimized distances”. As drafted and under its broadest reasonable interpretation, this limitation covers performance of the limitation in the mind (including an observation, evaluation, judgment, or opinion) or with the aid of pencil and paper. For example, this limitation encompasses predicting scores based on calculated values.
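For readers parsing the limitations just quoted, the pattern the examiner characterizes as a mathematical concept (fit subunit-derived "second" embeddings toward pre-trained "first" embeddings, then derive confidence from the residual distance) can be sketched numerically. This is a toy illustration only, not the applicant's implementation; the linear encoder, the squared-distance loss, and the exp(-distance) confidence mapping are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: "first embeddings" play the role of the pre-trained
# term embeddings; the term encoder is a single linear map W applied
# to fixed word-subunit features of each training term.
dim, n_terms, n_feats = 4, 8, 6
first_embeddings = rng.normal(size=(n_terms, dim))
subunit_feats = rng.normal(size=(n_terms, n_feats))
W = np.zeros((n_feats, dim))

# The claimed objective: minimize distances between the first and
# second embeddings (here, mean squared distance, by gradient descent).
for _ in range(500):
    second = subunit_feats @ W
    W -= 0.05 * (2 / n_terms) * subunit_feats.T @ (second - first_embeddings)

# "Predicting confidence scores based on the minimized distances":
# one simple monotone mapping from residual distance to a (0, 1] score.
dists = np.linalg.norm(subunit_feats @ W - first_embeddings, axis=1)
confidence = np.exp(-dists)
print(confidence.round(3))
```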
Accordingly, at Step 2A: Prong 1, the claim is directed to an abstract idea.

Step 2A: Prong 2 analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements of: “training the model on a training dataset that associates training terms with first embeddings of the training terms”. This additional element is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component (machine learning model) (see MPEP 2106.05(f)). “wherein the word subunits form part of a predetermined set of word subunits”. This limitation merely indicates a field of use or technological environment in which the judicial exception is performed (word/semantic data) and thus fails to add an inventive concept to the claims. See MPEP 2106.05(h). “deploying the model as part of an executable algorithm to allow a user to infer third embeddings and corresponding confidence scores from any input terms written based on word subunits of the predetermined set”. This additional element is recited at a high level of generality such that the claim recites only the idea of a solution or outcome (deploy a model), i.e., the claim fails to recite details of how a solution to a problem is accomplished. Accordingly, at Step 2A: Prong 2, the additional elements individually or in combination do not integrate the judicial exception into a practical application.

Step 2B analysis: In accordance with Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above, the additional element of “training the model on a training dataset that associates training terms with first embeddings of the training terms” is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using generic computer components (see MPEP 2106.05(f)). The additional element of “wherein the word subunits form part of a predetermined set of word subunits” is directed to a particular field of use (word/semantic data) (MPEP 2106.05(h)) and therefore does not provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible. As discussed above, the additional element of “deploying the model as part of an executable algorithm to allow a user to infer third embeddings and corresponding confidence scores from any input terms written based on word subunits of the predetermined set” is recited at a high level of generality such that the claim recites only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished (see MPEP 2106.05(f)). Accordingly, at Step 2B, the additional elements individually or in combination do not amount to significantly more than the judicial exception.

Regarding Claim 2:

Step 2A: Prong 2 analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements of: “wherein the word subunits of the training terms are characters”. This limitation merely indicates a field of use or technological environment in which the judicial exception is performed (word/semantic data) and thus fails to add an inventive concept to the claims. See MPEP 2106.05(h). “wherein the model is trained to generate the second embeddings based on numerical representations of characters of the training terms”.
This additional element is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component (machine learning model) (see MPEP 2106.05(f)). Accordingly, at Step 2A: Prong 2, the additional elements individually or in combination do not integrate the judicial exception into a practical application.

Step 2B analysis: In accordance with Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional element of “wherein the word subunits of the training terms are characters” is directed to a particular field of use (word/semantic data) (MPEP 2106.05(h)) and therefore does not provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible. As discussed above, the additional element of “wherein the model is trained to generate the second embeddings based on numerical representations of characters of the training terms” is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using generic computer components (see MPEP 2106.05(f)). Accordingly, at Step 2B, the additional elements individually or in combination do not amount to significantly more than the judicial exception.

Regarding Claim 3:

Step 2A: Prong 2 analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements of: “wherein the training terms are captured as tokens, and wherein at least some of the tokens capture respective sets of multiple words, and wherein each of the sets of multiple words are tokenized into a respective single token”. This limitation merely indicates a field of use or technological environment in which the judicial exception is performed (word/semantic data) and thus fails to add an inventive concept to the claims. See MPEP 2106.05(h). Accordingly, at Step 2A: Prong 2, the additional elements individually or in combination do not integrate the judicial exception into a practical application.

Step 2B analysis: In accordance with Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional element of “wherein the training terms are captured as tokens, and wherein at least some of the tokens capture respective sets of multiple words, and wherein each of the sets of multiple words are tokenized into a respective single token” is directed to a particular field of use (word/semantic data) (MPEP 2106.05(h)) and therefore does not provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible. Accordingly, at Step 2B, the additional elements individually or in combination do not amount to significantly more than the judicial exception.

Regarding Claim 4:

Step 2A: Prong 2 analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements of: “prior to training the model, obtaining the training dataset as an embedding matrix which maps the tokens to the first embeddings”. This additional element is recited at a high level of generality and amounts to extra-solution activity of gathering data, i.e., pre-solution activity of gathering data for use in the claimed process. Accordingly, at Step 2A: Prong 2, the additional elements individually or in combination do not integrate the judicial exception into a practical application.

Step 2B analysis: In accordance with Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
The additional element of “prior to training the model, obtaining the training dataset as an embedding matrix which maps the tokens to the first embeddings” is recited at a high level of generality and amounts to extra-solution activity of receiving data, i.e., pre-solution activity of gathering data for use in the claimed process. The courts have found limitations directed to obtaining information electronically, recited at a high level of generality, to be well-understood, routine, and conventional (see MPEP 2106.05(d)(II), “receiving or transmitting data over a network”, “electronic record keeping”, and “storing and retrieving information in memory”). Accordingly, at Step 2B, the additional elements individually or in combination do not amount to significantly more than the judicial exception.

Regarding Claim 5:

Step 2A: Prong 2 analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements of: “prior to obtaining the embedding matrix, running a natural language preprocessing pipeline on text data of one or more text corpora to tokenize the text data using a sequence tagging model designed to identify named entities, so as to tokenize multiple words corresponding to the named entities into single tokens”. This additional element is recited at a high level of generality and amounts to extra-solution activity of gathering data, i.e., pre-solution activity of gathering data for use in the claimed process. Accordingly, at Step 2A: Prong 2, the additional elements individually or in combination do not integrate the judicial exception into a practical application.

Step 2B analysis: In accordance with Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional element of “prior to obtaining the embedding matrix, running a natural language preprocessing pipeline on text data of one or more text corpora to tokenize the text data using a sequence tagging model designed to identify named entities, so as to tokenize multiple words corresponding to the named entities into single tokens” is recited at a high level of generality and amounts to extra-solution activity of receiving data, i.e., pre-solution activity of gathering data for use in the claimed process. The courts have found limitations directed to obtaining information electronically, recited at a high level of generality, to be well-understood, routine, and conventional (see MPEP 2106.05(d)(II), “receiving or transmitting data over a network”, “electronic record keeping”, and “storing and retrieving information in memory”). Accordingly, at Step 2B, the additional elements individually or in combination do not amount to significantly more than the judicial exception.

Regarding Claim 6:

Step 2A: Prong 1 analysis: Claim 6 recites in part: “identifying the word subunits of the training terms”. As drafted and under its broadest reasonable interpretation, this limitation covers performance of the limitation in the mind (including an observation, evaluation, judgment, or opinion) or with the aid of pencil and paper. For example, this limitation encompasses identifying characters of words in a training set. Accordingly, at Step 2A: Prong 1, the claim is directed to an abstract idea.

Step 2A: Prong 2 analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements of: “wherein the term encoder includes a word subunit decomposition layer, a word subunit embedding layer, and one or more trainable layers, and wherein the word subunit decomposition layer is connected to the word subunit embedding layer, itself connected to the one or more trainable layers”.
This additional element is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component (machine learning model layers) (see MPEP 2106.05(f)). “through the word subunit decomposition layer”. This additional element is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component (machine learning model layers) (see MPEP 2106.05(f)). “obtaining numerical representations of the identified word subunits through the word subunit embedding layer”. This additional element is recited at a high level of generality and amounts to extra-solution activity of gathering data, i.e., pre-solution activity of gathering data for use in the claimed process. “training the one or more trainable layers to generate the second embeddings from the obtained numerical representations in accordance with an objective function defining the objective”. This limitation merely indicates a field of use or technological environment in which the judicial exception is performed (objective functions) and thus fails to add an inventive concept to the claims. See MPEP 2106.05(h). Accordingly, at Step 2A: Prong 2, the additional elements individually or in combination do not integrate the judicial exception into a practical application.

Step 2B analysis: In accordance with Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
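For orientation, the layer structure quoted from Claim 6 (decomposition layer, then word subunit embedding layer, then trainable layers) can be sketched as below. This is a hypothetical minimal illustration: the decomposition is character-level, the subunit vocabulary and dimensions are invented, and the claimed trainable layers (an LSTM feeding a multilayer perceptron, per Claims 7-8) are stood in for by mean pooling plus a single linear layer.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4

# Word subunit decomposition layer: split a term into characters
# (one possible subunit scheme; the claims also cover other subunits).
def decompose(term: str) -> list[str]:
    return list(term)

# Word subunit embedding layer: a lookup table over a predetermined
# set of word subunits (here, just the lowercase letters).
subunit_set = "abcdefghijklmnopqrstuvwxyz"
subunit_table = rng.normal(size=(len(subunit_set), dim))

def embed_subunits(subunits: list[str]) -> np.ndarray:
    return np.stack([subunit_table[subunit_set.index(s)] for s in subunits])

# Trainable layer(s): the claims recite an LSTM feeding a multilayer
# perceptron; mean pooling plus one linear map W stands in here.
W = rng.normal(size=(dim, dim))

def term_encoder(term: str) -> np.ndarray:
    return embed_subunits(decompose(term)).mean(axis=0) @ W

print(term_encoder("embedding").shape)  # (4,)
```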
As discussed above, the additional elements of “wherein the term encoder includes a word subunit decomposition layer, a word subunit embedding layer, and one or more trainable layers, and wherein the word subunit decomposition layer is connected to the word subunit embedding layer, itself connected to the one or more trainable layers” and “through the word subunit decomposition layer” are recited at a high level of generality such that they amount to no more than mere instructions to apply the exception using generic computer components (see MPEP 2106.05(f)). The additional element of “obtaining numerical representations of the identified word subunits through the word subunit embedding layer” is recited at a high level of generality and amounts to extra-solution activity of receiving data, i.e., pre-solution activity of gathering data for use in the claimed process. The courts have found limitations directed to obtaining information electronically, recited at a high level of generality, to be well-understood, routine, and conventional (see MPEP 2106.05(d)(II), “receiving or transmitting data over a network”, “electronic record keeping”, and “storing and retrieving information in memory”). The additional element of “training the one or more trainable layers to generate the second embeddings from the obtained numerical representations in accordance with an objective function defining the objective” is directed to a particular field of use (objective functions) (MPEP 2106.05(h)) and therefore does not provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible. Accordingly, at Step 2B, the additional elements individually or in combination do not amount to significantly more than the judicial exception.

Regarding Claim 7:

Step 2A: Prong 2 analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements of: “wherein the one or more trainable layers include several layers that are configured as a multilayer perceptron”. This additional element is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component (multilayer perceptron) (see MPEP 2106.05(f)). Accordingly, at Step 2A: Prong 2, the additional elements individually or in combination do not integrate the judicial exception into a practical application.

Step 2B analysis: In accordance with Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above, the additional element of “wherein the one or more trainable layers include several layers that are configured as a multilayer perceptron” is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using generic computer components (see MPEP 2106.05(f)). Accordingly, at Step 2B, the additional elements individually or in combination do not amount to significantly more than the judicial exception.

Regarding Claim 8:

Step 2A: Prong 2 analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements of: “wherein the one or more trainable layers further include at least one long short-term memory layer interfacing the word subunit embedding layer with the multilayer perceptron”. This additional element is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component (LSTM) (see MPEP 2106.05(f)). Accordingly, at Step 2A: Prong 2, the additional elements individually or in combination do not integrate the judicial exception into a practical application.

Step 2B analysis: In accordance with Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above, the additional element of “wherein the one or more trainable layers further include at least one long short-term memory layer interfacing the word subunit embedding layer with the multilayer perceptron” is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using generic computer components (see MPEP 2106.05(f)). Accordingly, at Step 2B, the additional elements individually or in combination do not amount to significantly more than the judicial exception.

Regarding Claim 9:

Step 2A: Prong 2 analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements of: “wherein the model further includes an estimator connected by the one or more trainable layers, and wherein training the model further comprises training the estimator to predict the confidence scores”. This additional element is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component (estimator) (see MPEP 2106.05(f)). Accordingly, at Step 2A: Prong 2, the additional elements individually or in combination do not integrate the judicial exception into a practical application.

Step 2B analysis: In accordance with Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above, the additional element of “wherein the model further includes an estimator connected by the one or more trainable layers, and wherein training the model further comprises training the estimator to predict the confidence scores” is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using generic computer components (see MPEP 2106.05(f)). Accordingly, at Step 2B, the additional elements individually or in combination do not amount to significantly more than the judicial exception.

Regarding Claim 10:

Step 2A: Prong 2 analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements of: “wherein the objective function is a first objective function, and wherein the estimator is trained to predict the confidence scores in accordance with a second objective function, the second objective function defining an objective of minimizing a difference between the confidence scores predicted by the estimator and the first objective function as evaluated based on the minimized distances”. This limitation merely indicates a field of use or technological environment in which the judicial exception is performed (objective functions) and thus fails to add an inventive concept to the claims. See MPEP 2106.05(h). Accordingly, at Step 2A: Prong 2, the additional elements individually or in combination do not integrate the judicial exception into a practical application.

Step 2B analysis: In accordance with Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
The additional element of “wherein the objective function is a first objective function, and wherein the estimator is trained to predict the confidence scores in accordance with a second objective function, the second objective function defining an objective of minimizing a difference between the confidence scores predicted by the estimator and the first objective function as evaluated based on the minimized distances” is directed to a particular field of use (objective functions) (MPEP 2106.05(h)) and therefore does not provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible. Accordingly, at Step 2B, the additional elements individually or in combination do not amount to significantly more than the judicial exception.

Regarding Claim 11:

Step 2A: Prong 2 analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements of: “wherein the one or more trainable layers are trained to learn parameters of distributions, from which respective ones of the first embeddings are drawn and wherein the second embeddings and the corresponding confidence scores are obtained from the learned parameters of the distributions”. This additional element is recited at a high level of generality such that the claim recites only the idea of a solution or outcome (train a model), i.e., the claim fails to recite details of how a solution to a problem is accomplished. Accordingly, at Step 2A: Prong 2, the additional elements individually or in combination do not integrate the judicial exception into a practical application.

Step 2B analysis: In accordance with Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above, the additional element of “wherein the one or more trainable layers are trained to learn parameters of distributions, from which respective ones of the first embeddings are drawn and wherein the second embeddings and the corresponding confidence scores are obtained from the learned parameters of the distributions” is recited at a high level of generality such that the claim recites only the idea of a solution or outcome (train a model), i.e., the claim fails to recite details of how a solution to a problem is accomplished (see MPEP 2106.05(f)). Accordingly, at Step 2B, the additional elements individually or in combination do not amount to significantly more than the judicial exception.

Regarding Claim 12:

Step 2A: Prong 2 analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements of: “wherein the parameters learned for each distribution of the distributions include a mean and a variance, the mean corresponding to a respective one of the second embeddings, while a corresponding one of the confidence scores is obtained based on a negative log-likelihood of the each distribution”. This limitation merely indicates a field of use or technological environment in which the judicial exception is performed (probability distributions) and thus fails to add an inventive concept to the claims. See MPEP 2106.05(h). Accordingly, at Step 2A: Prong 2, the additional elements individually or in combination do not integrate the judicial exception into a practical application.

Step 2B analysis: In accordance with Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional element of “wherein the parameters learned for each distribution of the distributions include a mean and a variance, the mean corresponding to a respective one of the second embeddings, while a corresponding one of the confidence scores is obtained based on a negative log-likelihood of the each distribution” is directed to a particular field of use (probability distributions) (MPEP 2106.05(h)) and therefore does not provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible. Accordingly, at Step 2B, the additional elements individually or in combination do not amount to significantly more than the judicial exception.

Regarding Claim 13:

Step 2A: Prong 2 analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements of: “wherein the objective function is defined based on negative log-likelihoods of the distributions with respect to the first embeddings”. This limitation merely indicates a field of use or technological environment in which the judicial exception is performed (objective functions) and thus fails to add an inventive concept to the claims. See MPEP 2106.05(h). Accordingly, at Step 2A: Prong 2, the additional elements individually or in combination do not integrate the judicial exception into a practical application.

Step 2B analysis: In accordance with Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional element of “wherein the objective function is defined based on negative log-likelihoods of the distributions with respect to the first embeddings” is directed to a particular field of use (objective functions) (MPEP 2106.05(h)) and therefore does not provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.
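Claims 12-13, quoted above, attach a concrete statistical recipe to the confidence scores: learn a mean and a variance per embedding, then score via the negative log-likelihood of the resulting Gaussian. A minimal one-dimensional numeric sketch (all values invented; the claims do not specify these numbers):

```python
import math

# Negative log-likelihood of a scalar x under a Gaussian N(mean, var):
# lower NLL means the pre-trained ("first") embedding value is well
# explained by the learned distribution, i.e., higher confidence.
def gaussian_nll(x: float, mean: float, var: float) -> float:
    return 0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

first = 1.0                                          # target value
confident = gaussian_nll(first, mean=1.05, var=0.01)  # close, sharp
uncertain = gaussian_nll(first, mean=0.2, var=0.01)   # far, sharp
hedged = gaussian_nll(first, mean=0.2, var=1.0)       # far, but wide

print(confident, uncertain, hedged)
```

Note how a large learned variance softens the penalty for a distant mean: the variance acts as the model's self-reported uncertainty, which is exactly what makes the NLL usable as a confidence signal.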
Accordingly, at Step 2B, the additional elements individually or in combination do not amount to significantly more than the judicial exception.

Regarding Claim 14:

Step 2A: Prong 2 analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements of: “wherein the objective function is designed to define a further objective, in addition to the objective of minimizing said distances, the further objective causing to push the second embeddings towards embeddings of semantically related training terms upon training the model”. This limitation merely indicates a field of use or technological environment in which the judicial exception is performed (objective functions) and thus fails to add an inventive concept to the claims. See MPEP 2106.05(h). Accordingly, at Step 2A: Prong 2, the additional elements individually or in combination do not integrate the judicial exception into a practical application.

Step 2B analysis: In accordance with Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional element of “wherein the objective function is designed to define a further objective, in addition to the objective of minimizing said distances, the further objective causing to push the second embeddings towards embeddings of semantically related training terms upon training the model” is directed to a particular field of use (objective functions) (MPEP 2106.05(h)) and therefore does not provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible. Accordingly, at Step 2B, the additional elements individually or in combination do not amount to significantly more than the judicial exception.

Regarding Claim 15: Due to claim language similar to that of Claim 1, Claim 15 is rejected for the same reasons as presented above in the rejection of Claim 1.

Regarding Claim 16:

Step 2A: Prong 1 analysis: Claim 16 recites in part: “accepting or rejecting the third embeddings based on the corresponding confidence scores”. As drafted and under its broadest reasonable interpretation, this limitation covers performance of the limitation in the mind (including an observation, evaluation, judgment, or opinion) or with the aid of pencil and paper. For example, this limitation encompasses accepting or rejecting data based on a calculated score. Accordingly, at Step 2A: Prong 1, the claim is directed to an abstract idea.

Step 2A: Prong 2 analysis: The claim does not recite any additional elements that integrate the judicial exception into a practical application.

Step 2B analysis: In accordance with Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.

Regarding Claim 17:

Step 2A: Prong 1 analysis: Claim 17 recites in part: “performing a lossy compression of the pre-trained embedding matrix by pruning entries of the pre-trained embedding matrix in accordance with the confidence scores inferred for the set of terms”. As drafted and under its broadest reasonable interpretation, this limitation covers a mathematical relationship/concept. “whereby at least some of the entries are deleted”. As drafted and under its broadest reasonable interpretation, this limitation covers performance of the limitation in the mind (including an observation, evaluation, judgment, or opinion) or with the aid of pencil and paper. For example, this limitation encompasses removing data. Accordingly, at Step 2A: Prong 1, the claim is directed to an abstract idea.

Step 2A: Prong 2 analysis: The claim does not recite any additional elements that integrate the judicial exception into a practical application.
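Claim 17's lossy-compression limitation, quoted above, reduces at its simplest to row pruning keyed on the inferred confidence scores. A sketch under invented scores and an invented threshold rule (the claim itself does not fix either):

```python
import numpy as np

rng = np.random.default_rng(0)

# Lossy compression of a pre-trained embedding matrix by pruning
# entries whose inferred confidence is low (Claim 17 in miniature).
embedding_matrix = rng.normal(size=(6, 4))              # 6 terms x 4 dims
confidence = np.array([0.9, 0.2, 0.8, 0.1, 0.7, 0.3])  # invented scores

keep = confidence >= 0.5          # assumed thresholding rule
pruned = embedding_matrix[keep]   # rows for low-confidence terms deleted

print(f"kept {pruned.shape[0]} of {embedding_matrix.shape[0]} rows")
```

The intuition: terms whose embeddings the model can reconstruct confidently from their subunits need not be stored explicitly, so only low-reconstructability rows must survive in a real compression scheme; the simple keep-high-confidence rule above is illustrative only.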
Step 2B analysis: In accordance with Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.

Regarding Claim 18:

Step 2A: Prong 2 analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements of: “wherein the objective function is designed to define a further objective, in addition to the objective of minimizing said distances, the further objective causing to push the second embeddings towards embeddings of semantically related training terms upon training the model”. This additional element is recited at a high level of generality and amounts to extra-solution activity of gathering data, i.e., pre-solution activity of gathering data for use in the claimed process. “executing the model on the additional words to controllably update entries of an embedding matrix in accordance with confidence scores inferred for the embeddings generated for the additional words”. This additional element is recited at a high level of generality such that the claim recites only the idea of a solution or outcome (train a model), i.e., the claim fails to recite details of how a solution to a problem is accomplished. Accordingly, at Step 2A: Prong 2, the additional elements individually or in combination do not integrate the judicial exception into a practical application.

Step 2B analysis: In accordance with Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
The additional element(s) of “wherein the objective function is designed to define a further objective, in addition to the objective of minimizing said distances, the further objective causing to push the second embeddings towards embeddings of semantically related training terms upon training the model” is/are recited at a high level of generality and amount(s) to extra-solution activity of receiving data, i.e., pre-solution activity of gathering data for use in the claimed process. The courts have found limitations directed to obtaining information electronically, recited at a high level of generality, to be well-understood, routine, and conventional (see MPEP 2106.05(d)(II), “receiving or transmitting data over a network”, "electronic record keeping," and "storing and retrieving information in memory"). As discussed above, the additional element(s) of “executing the model on the additional words to controllably update entries of an embedding matrix in accordance with confidence scores inferred for the embeddings generated for the additional words” is/are recited at a high level of generality such that the claim recites only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished (See MPEP 2106.05(f)). Accordingly, at Step 2B, the additional elements individually or in combination do not amount to significantly more than the judicial exception.

Regarding Claim 19:

Due to claim language similar to that of Claims 1 and 15, Claim 19 is rejected for the same reasons as presented above in the rejections of Claims 1 and 15, with the exception of the limitation(s) covered below.

Step 2A: Prong 2 analysis: The judicial exception is not integrated into a practical application.
In particular, the claim recites the additional elements of: “one or more computer-readable tangible storage medium and program instructions stored on at least one of the one or more tangible storage medium, the program instructions executable by a processor capable of performing a method”. This additional element is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component (storage and processor) (See MPEP 2106.05(f)). Accordingly, at Step 2A: Prong 2, the additional elements individually or in combination do not integrate the judicial exception into a practical application.

Step 2B analysis: In accordance with Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above, the additional element(s) of “one or more computer-readable tangible storage medium and program instructions stored on at least one of the one or more tangible storage medium, the program instructions executable by a processor capable of performing a method” is/are recited at a high level of generality such that it/they amount(s) to no more than mere instructions to apply the exception using generic computer components (See MPEP 2106.05(f)). Accordingly, at Step 2B, the additional elements individually or in combination do not amount to significantly more than the judicial exception.

Regarding Claim 20:

Due to claim language similar to that of Claim 2, Claim 20 is rejected for the same reasons as presented above in the rejection of Claim 2.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C.
102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-3, 15, 16, 19, and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Meng et al (US 20230022845 A1, hereinafter Meng) in view of Soliman et al (US 12499879 B1, hereinafter Soliman).

Regarding Claim 1:

Meng teaches A computer-implemented method of generating a model including a term encoder, the method comprising: training the model on a training dataset that associates training terms with first embeddings of the training terms (Meng [0008]: “FIG.
4 is a block diagram of a modified BERT model or encoder, according to some embodiments”; [0026]: "In making these predictions, various embodiments intelligently convert the one or more numerical characters, the one or more natural language word characters, and/or the questions into a feature vector embedding in feature space based at least in part on training one or more machine learning models in order to learn the meaning of words and/or the numbers themselves"; [0057]: "A “feature vector” (also referred to as a “vector”) as described herein includes one or more real numbers, such as a series of floating values or integers (e.g., [0, 1, 0, 0]) that represent one or more other real numbers, a natural language (e.g., English) word and/or other character sequence (e.g., a symbol (e.g., @, !, #), a phrase, and/or sentence, etc.). Such natural language words and/or character sequences correspond to the set of features and are encoded or converted into corresponding feature vectors so that computers can process the corresponding extracted features") wherein the training comprises: generating, with the term encoder, second embeddings from numerical representations of word subunits of the training terms (Meng [0026]: "For example, some embodiments encode the number $13,500 into two tags—“currency” and the value “thirteen thousand, five hundred.” This indicates that the number refers to currency, as opposed to a date, for example. These encoded values are then converted in to a feature vector and embedded in feature space") wherein the word subunits form part of a predetermined set of word subunits (Meng [0057]: "A “feature vector” (also referred to as a “vector”) as described herein includes one or more real numbers, such as a series of floating values or integers (e.g., [0, 1, 0, 0]) that represent one or more other real numbers, a natural language (e.g., English) word and/or other character sequence (e.g., a symbol (e.g., @, !, #), a phrase, and/or sentence, etc.). 
Such natural language words and/or character sequences correspond to the set of features and are encoded or converted into corresponding feature vectors so that computers can process the corresponding extracted features"); predicting confidence scores (Meng [0046]: "The filtering module 106-3 is generally responsible for filtering out or removing each token generated by the object recognition component 104 that has a confidence score or prediction lower than a particular threshold") deploying the model as part of an executable algorithm to allow a user to infer third embeddings and corresponding confidence scores from any input terms written based on word subunits of the predetermined set (Meng [0102]: "FIG. 7 illustrates that oftentimes documents, such as invoices, do not inherently and clearly identify information, such as the “due date.” Accordingly, placing the prediction or answer in the “due date” field in the window pane 717, for example, assists the user experience so that the user can better analyze an invoice. FIG. 7 also illustrates that sometimes documents are mere static images and can therefore not be used to dynamic functionality. However, certain embodiments can perform more dynamic functionality by importing or extracting specific words and the like from the invoice 710 for further analysis. 
For example, all of the information or predictions within the window pane 717 can be imported or copied to another application page or instance in order to, for example, keep a history of all invoice total amounts (and dynamically add all the total amounts) in a single document.")

Meng does not distinctly disclose with an objective of minimizing distances between the first embeddings and the second embeddings based on the minimized distances;

However, Soliman teaches with an objective of minimizing distances between the first embeddings and the second embeddings (Soliman [Col 13 lines 45-49]: "The similarity determination component 330 may determine a cosine distance between the embedding vectors and is trained to minimize the distance between two embedding vectors (and the corresponding user input data) that are of the same functionality") based on the minimized distances (Soliman [Col 13 lines 45-49]: "The similarity determination component 330 may determine a cosine distance between the embedding vectors and is trained to minimize the distance between two embedding vectors (and the corresponding user input data) that are of the same functionality");

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Meng and Soliman before him or her, to modify the system for improved accuracy of NLP models of Meng to include the methods of identifying user experiences that are not supported by natural understanding models as shown in Soliman. The motivation for doing so would have been to use the objective function for minimizing distances between embeddings of Soliman in order to better predict confidence scores that a term belongs to a certain embedding (Soliman [Abstract]: “Some embodiments may involve identifying functionalities by transforming user inputs to functionality-based representations.
The functionality-based representations may be grouped into individual functionalities. The user inputs associated with an individual functionality may be evaluated using an NU component to determine whether the functionality is supported. These techniques may enable discovery at a functionality level, rather than at a user input level, an intent level, or an entity level. These techniques may also be used to group user inputs to determine trending functionalities”).

Regarding Claim 2:

Meng teaches The computer-implemented method according to claim 1, wherein the word subunits of the training terms are characters, and wherein the model is trained to generate the second embeddings based on numerical representations of characters of the training terms (Meng [0045]: "The coordinate module 106-2 is generally responsible for sorting each token in each block based on the coordinates of each token within a corresponding document. A “token” as described herein refers to an individual element of a document, such as a word, number, sign, symbol, and/or the like. For example, the coordinate module 106-2 can sort the tokens in each block based on the X (left/right) and Y (top/bottom) coordinates of each token (each token can be represented as [‘word,’ xmin, xmax, ymin, ymax]) to make sure the tokens in the same line in the block will appear together as the order in the document").

Regarding Claim 3:

Meng teaches The computer-implemented method according to claim 1, wherein the training terms are captured as tokens, and wherein at least some of the tokens capture respective sets of multiple words, and wherein each of the sets of multiple words are tokenized into a respective single token (Meng [0045]: "The coordinate module 106-2 is generally responsible for sorting each token in each block based on the coordinates of each token within a corresponding document.
A “token” as described herein refers to an individual element of a document, such as a word, number, sign, symbol, and/or the like. For example, the coordinate module 106-2 can sort the tokens in each block based on the X (left/right) and Y (top/bottom) coordinates of each token (each token can be represented as [‘word,’ xmin, xmax, ymin, ymax]) to make sure the tokens in the same line in the block will appear together as the order in the document").

Regarding Claim 15:

Due to claim language similar to that of Claim 1, Claim 15 is rejected for the same reasons as presented above in the rejection of Claim 1.

Regarding Claim 16:

Meng teaches The computer-implemented method according to claim 15, further comprising: accepting or rejecting the third embeddings based on the corresponding confidence scores (Meng [0046]: “The filtering module 106-3 is generally responsible for filtering out or removing each token generated by the object recognition component 104 that has a confidence score or prediction lower than a particular threshold (e.g., 0.8). For example, if a character sequence in a document is predicted to be a particular word with only a 60% confidence score, then the corresponding character sequences can be removed from the document altogether”).

Regarding Claim 19:

Due to claim language similar to that of Claims 1 and 15, Claim 19 is rejected for the same reasons as presented above in the rejections of Claims 1 and 15, with the exception of the limitation(s) covered below.
Meng teaches A computer program product for generating a model including a term encoder, the computer program product comprising: one or more computer-readable tangible storage medium and program instructions stored on at least one of the one or more tangible storage medium, the program instructions executable by a processor capable of performing a method (Meng [0113]: “The computer-implemented method, the system (that includes at least one computing device having at least one processor and at least one computer readable storage medium), and/or the computer readable medium as described herein may perform or be caused to perform the process 800 or any other functionality described herein”)

Regarding Claim 20:

Due to claim language similar to that of Claim 2, Claim 20 is rejected for the same reasons as presented above in the rejection of Claim 2.

Claim Rejections - 35 USC § 103

Claim(s) 6 and 9 is/are rejected under 35 U.S.C. 103 as being unpatentable over Meng and Soliman as applied to claims 1, 15, and 19 above, and further in view of Lee et al (US 20190370337 A1, hereinafter Lee).

Regarding Claim 6:

Meng teaches obtaining numerical representations of the identified word subunits through the word subunit embedding layer (Meng [0026]: "In making these predictions, various embodiments intelligently convert the one or more numerical characters, the one or more natural language word characters, and/or the questions into a feature vector embedding in feature space based at least in part on training one or more machine learning models in order to learn the meaning of words and/or the numbers themselves"; [0057]: "A “feature vector” (also referred to as a “vector”) as described herein includes one or more real numbers, such as a series of floating values or integers (e.g., [0, 1, 0, 0]) that represent one or more other real numbers, a natural language (e.g., English) word and/or other character sequence (e.g., a symbol (e.g., @, !, #), a phrase, and/or sentence, etc.).
Such natural language words and/or character sequences correspond to the set of features and are encoded or converted into corresponding feature vectors so that computers can process the corresponding extracted features"); training the one or more trainable layers to generate the second embeddings from the obtained numerical representations in accordance with an objective function defining the objective (Meng [0058]: "A standard training objective thus involves minimizing the cross-entropy between the model's predicted distribution and the one-hot empirical distribution of training labels. A model performing well on the training set will predict an output distribution with high probability on the correct class and with near-zero probabilities on other classes"). Meng does not distinctly disclose The computer-implemented method according to claim 1, wherein the term encoder includes a word subunit decomposition layer, a word subunit embedding layer, and one or more trainable layers and wherein the word subunit decomposition layer is connected to the word subunit embedding layer, itself connected to the one or more trainable layers identifying the word subunits of the training terms through the word subunit decomposition layer; However, Lee teaches The computer-implemented method according to claim 1, wherein the term encoder includes a word subunit decomposition layer, a word subunit embedding layer, and one or more trainable layers (Lee [0138]: "the identified major features are then used during the question decomposition stage 330 to decompose the question into one or more queries that are applied to the corpora of data/information 345 in order to generate one or more hypotheses"; [0158]: "The embedding logic 394 trains a neural network 397 based on the dependency data structure of an electronic document produced by the dependency annotation logic 142") and wherein the word subunit decomposition layer is connected to the word subunit embedding layer, itself 
connected to the one or more trainable layers (Lee [0015]: "FIG. 6 is a flowchart outlining a training operation for training an embedding neural network in accordance with one illustrative embodiment") identifying the word subunits of the training terms through the word subunit decomposition layer (Lee [0138]: "the identified major features are then used during the question decomposition stage 330 to decompose the question into one or more queries that are applied to the corpora of data/information 345 in order to generate one or more hypotheses.");

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Meng + Soliman and Lee before him or her, to modify the system for improved accuracy of NLP models of Meng + Soliman to include the methods for embedding the content of a natural language document as shown in Lee. The motivation for doing so would have been to use the various mechanisms of Lee to analyze a document and identify a structural relationship within said document (Lee [Abstract]: “The mechanisms receive a document data object of an electronic document and analyze a structure of the electronic document to identify one or more structural document elements that have a relationship with the document data object. A dependency data structure is generated, representing the electronic document, where edges define relationships between document elements and at least one edge represents at least one relationship between the one or more structural document elements and the document data object. The mechanisms embed the document data object based on the at least one relationship to thereby represent the document data object as a vector data structure. The mechanisms perform natural language processing on the portion of natural language content based on the vector data structure”).
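Claim 6's three-stage encoder (a word subunit decomposition layer feeding a word subunit embedding layer, which feeds one or more trainable layers) can be sketched in plain Python. Everything below is an assumption chosen for illustration: character trigrams stand in for word subunits, a dict lookup stands in for the embedding layer, and mean pooling stands in for the trainable layers, which in the claimed model would be learned:

```python
def decompose(term, n=3):
    # Word subunit decomposition layer: here, overlapping character
    # n-grams with boundary markers, one plausible choice of subunit;
    # the claim does not fix the subunit granularity.
    padded = f"<{term}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def embed_subunits(subunits, table, dim=4):
    # Word subunit embedding layer: look up a numerical representation
    # for each subunit; unknown subunits fall back to a zero vector.
    return [table.get(s, [0.0] * dim) for s in subunits]

def encode(term, table, dim=4):
    # The "one or more trainable layers", sketched as fixed mean pooling
    # over subunit embeddings; a real encoder would learn this mapping.
    vecs = embed_subunits(decompose(term), table, dim)
    return [sum(col) / len(vecs) for col in zip(*vecs)]
```

The point of the sketch is the layer wiring the claim recites: decomposition feeds the embedding lookup, whose outputs feed the pooling (trainable) stage that produces the term embedding.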
Regarding Claim 9:

Meng teaches The computer-implemented method according to claim 6, wherein the model further includes an estimator connected by the one or more trainable layers, and wherein training the model further comprises training the estimator to predict the confidence scores (Meng [0037]: "The object recognition component 104 is generally responsible for detecting one or more objects and/or characters within one or more documents. In some embodiments, the object recognition component 104 performs its functionality in response to the document conversion module 102 performing its functionality. In some embodiments, the object recognition component 104 includes an Object Character Recognition (OCR) component that is configured to detect natural language characters and covert such characters into a machine-readable format"; [0046]: "The filtering module 106-3 is generally responsible for filtering out or removing each token generated by the object recognition component 104 that has a confidence score or prediction lower than a particular threshold").

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US 12494200 B1 – Techniques for performing an action with respect to displayed content
US 12164877 B2 – Techniques for interacting with users in a discussion environment
US 11568143 B2 – System and methods for fine-tuning a pre-trained universal language model (encoder) based on transformer architecture
US 20220300711 A1 – A system and method for natural language processing for document sequences
US 20220245348 A1 – Self-supervised semantic shift detection and alignment
US 11393456 B1 – A system for a self-learning policy engine that can be used by various spoken language understanding (SLU) processing components
US 20220020355 A1 – A method and apparatus for generating speech through neural text-to-speech (TTS) synthesis
US 20200311115 A1 – Systems and methods for mapping of text phrases to a taxonomy
US 20200175360 A1 – Methods, systems and computer program products for updating a word embedding model

Any inquiry concerning this communication or earlier communications from the examiner should be directed to COREY M SACKALOSKY whose telephone number is (703)756-1590. The examiner can normally be reached M-F 7:30am-3:30pm EST.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Omar Fernandez Rivas, can be reached at (571) 272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov.
Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/COREY M SACKALOSKY/
Examiner, Art Unit 2128

/OMAR F FERNANDEZ RIVAS/
Supervisory Patent Examiner, Art Unit 2128

Prosecution Timeline

Feb 28, 2023
Application Filed
Jan 15, 2026
Non-Final Rejection — §101, §103
Apr 14, 2026
Interview Requested

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12596932
METHOD AND SYSTEM FOR DEPLOYMENT OF PREDICTION MODELS USING SKETCHES GENERATED THROUGH DISTRIBUTED DATA DISTILLATION
2y 5m to grant Granted Apr 07, 2026
Patent 12591759
PARALLEL AND DISTRIBUTED PROCESSING OF PROPOSITIONAL LOGICAL NEURAL NETWORKS
2y 5m to grant Granted Mar 31, 2026
Patent 12572441
FULLY UNSUPERVISED PIPELINE FOR CLUSTERING ANOMALIES DETECTED IN COMPUTERIZED SYSTEMS
2y 5m to grant Granted Mar 10, 2026
Patent 12518197
INCREMENTAL LEARNING WITHOUT FORGETTING FOR CLASSIFICATION AND DETECTION MODELS
2y 5m to grant Granted Jan 06, 2026
Patent 12487763
METHOD AND APPARATUS WITH MEMORY MANAGEMENT AND NEURAL NETWORK OPERATION
2y 5m to grant Granted Dec 02, 2025
Based on the 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
64%
Grant Probability
99%
With Interview (+49.4%)
4y 2m
Median Time to Grant
Low
PTA Risk
Based on 25 resolved cases by this examiner. Grant probability derived from career allow rate.
