DETAILED ACTION
Introduction
Applicant's submission filed on 02/09/2026 has been entered. Claims 1, 3-7, 9-13, and 15-19 are pending in the application and have been examined.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
The response filed on 02/09/2026 has been entered and considered in this Office Action. Claims 1, 3-7, 9-13, and 15-19 have been examined. Applicant's amendments to claims 9 and 15 overcome the claim objections previously set forth in the Non-Final Office Action mailed 11/07/2025.
Response to Arguments
Applicant's arguments filed 02/09/2026 have been fully considered as follows:
Applicant’s arguments with respect to claim 1 (also representative of claims 7 and 13) state that
“The cited references fail to teach or suggest is "subsequent to determining the part of speech tag for each given word, saving the part of speech tag only for the given words of the target sentence on a storage device storing the corpus data set, and discarding the part of speech tag for the set of context sentences." The Office Action contends that this feature is disclosed in Chao paragraph [0057]. (Office Action, page 7.) Here in [0057], however, Chao only discloses that the "repository 108" contains "the original document and the added annotations from language disambiguation." This contrasts from the selective "subsequent to determining the part of speech tag for each given word, saving the part of speech tag only for the given words of the target sentence on a storage device storing the corpus data set" that is recited in claim 1… Accordingly, for at least this reason, the combination of Chao, Akbik, Bellegarda, Straka, and Gorin fail to render claim 1.”
The examiner respectfully disagrees. Chao teaches processing partially processed documents based on previous analysis, or updated documents based on newly added or changed sentences, and further teaches processing only the newly added sentences and storing only the updated information (Chao, [0053]-[0054]). The broadest reasonable interpretation of "discarding the part of speech tag for the set of context sentences" includes not storing the previously processed part of speech tags of sentences that were already processed. As recited in the claim, the part of speech tags of the context sentences are given. Therefore, Chao teaches "subsequent to determining the part of speech tag for each given word, saving the part of speech tag only for the given words of the target sentence on a storage device storing the corpus data set, and discarding the part of speech tag for the set of context sentences," and the rejection of claim 1 under 35 U.S.C. 103 is sustained and further updated accordingly. The examiner additionally notes that it is unclear from the claim which part of speech tags of the context sentences are discarded: the Specification ([0041]-[0046], [0049]) refers to pre- and post-tagging, while there is no such mention in the claims, and the claim provides insufficient information regarding the context sentences with respect to the corpus and how it is decided whether to store or discard the part of speech tags of the context sentences.
In response to the rejections of the remaining dependent claims under 35 U.S.C. 103, to the extent those claims are traversed for reasons similar to those presented for independent claims 1, 7, and 13 in the Remarks filed 02/09/2026, Examiner respectfully directs Applicant to the corresponding reasons provided above with respect to claims 1, 7, and 13. For at least those same reasons, Examiner likewise respectfully disagrees; Applicant's arguments have been fully considered but are not persuasive.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “part of speech tagging model” in claims 1, 7, and 13.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 1, 3-5, 7, 9-11, 13, and 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over Chao, US PgPub 2005/0049852, in view of Akbik et al., "Contextual string embeddings for sequence labeling," Proceedings of the 27th International Conference on Computational Linguistics, 2018, further in view of Bellegarda, US PgPub 2009/0089058.
Regarding claim 1, Chao teaches a computer-implemented method (CIM) comprising: receiving a corpus data set corresponding to a piece of natural language text including a plurality of sentences and an indication of the beginning and end of each sentence of the plurality of sentences (see Chao, [0049], discussing processing of documents (101) and determining boundaries of sentences); receiving an indication of a target sentence from the plurality of sentences of the piece of natural language text from the corpus data set (see Chao, [0053], determining the modified sentences (target sentence)); determining a plurality of taggable words included in the target sentence (see Chao, [0049]-[0050], discussing tokenizing and assigning each word's part of speech); and subsequent to determining the part of speech tag for each given word, saving the part of speech tag only for the given words of the target sentence on a storage device storing the corpus data set, and discarding the part of speech tag for the set of context sentences (see Chao, [0057], the analyzed documents and the added annotations (part of speech tags) from language disambiguation are saved in a repository 108; Chao, [0054], discussing not processing or storing previously processed information (the part of speech tag for the target sentence is based on the part of speech tags of the context sentences, which are hence already processed and not stored; discarding the part of speech tag for the context sentences)).
However, Chao fails to teach determining that the plurality of taggable words include words that are difficult for a part of speech tagging model to parse and cause an increase in processing power; determining a set of context sentences for use by the part of speech tagging model to parse the target sentence based on the proximity of the context sentences to the target sentence in the piece of natural language text of the corpus data set;
However, Akbik teaches determining that the plurality of taggable words include words that are difficult for a part of speech tagging model to parse and cause an increase in processing power (see Akbik, sect. 3.4, Qualitative inspection (Table 4): "To illustrate the contextualized nature of our proposed embeddings, we present example embeddings of the polysemous word 'Washington' in different contexts" ("Washington" being a taggable word that is difficult for a part of speech tagging model to parse, as it has different contexts, hence an increase in processing power)); and determining a set of context sentences for use by the part of speech tagging model to parse the target sentence based on a proximity of the context sentences to the target sentence in the piece of natural language text of the corpus data set (see Akbik, sect. 3.4 and Table 4: "We compute contextual string embeddings for all words in the English CONLL03 corpus and compute nearest neighbors in the embedding space using the cosine distance. We then look up nearest neighbors for different mentions of the word 'Washington'. As Table 4 shows, the embeddings successfully pry apart person, place, legislative entity and team (a-d). For instance, 'Washington' used as last name in context (b) is closest to other last names, many of which are also place names ('Carla Sacramento'); 'Washington' used as a sport team name in context (d) is closest to other place names used in sports team contexts. We hypothesize that modeling semantics in context is a key feature that allows our proposed embeddings to better address downstream sequence labeling task").
Chao teaches Adaptive Language Processing to improve disambiguation accuracy while processing natural language by organizing the task into multiple iterations of analysis done in successive levels of depth. Akbik teaches contextual string embeddings to process syntactic-semantic word features and disambiguate words in context (see Akbik, sect. 1, Contextual string embeddings; Akbik, Conclusion). Using the known technique of contextualized character-level word embeddings for sequence labeling using neural language modeling, as taught by Akbik, to improve the tagging of the selected sentence in Chao, such as improved POS tagging, would have been obvious to one of ordinary skill in the art.
However, Chao in view of Akbik fails to teach, for each given word of the plurality of taggable words of the target sentence and of the set of context sentences, performing natural language processing to determine a part of speech tag for the given word, wherein determining the part of speech tag for the given words in the target sentence is based, at least in part, on the part of speech tags for the set of context sentences.
However, Bellegarda teaches determining that the plurality of taggable words include words that are difficult for a part of speech tagging model to parse and cause an increase in processing power (see Bellegarda, [0048], discussing POS disambiguation (difficult for POS) using latent semantic mapping, which takes less time and hence less processing power; see also Bellegarda, [0008]); for each given word of the plurality of taggable words of the target sentence and of the set of context sentences (see Bellegarda, [0074]: "Next, operation 1205 that involves forming a neighborhood of the sequence of input words in the vector space is formed based on the closeness. The neighborhood represents one or more training sequences" (context sentences), as described above), performing natural language processing to determine a part of speech tag for the given word, wherein determining the part of speech tag for the given words in the target sentence is based, at least in part, on the part of speech tags for the set of context sentences (see Bellegarda, [0076]: operation 1209 involves aligning the selected sub-sequences to obtain POS characteristics (e.g., POS values) of the words from the sub-sequences; next, assigning a POS tag to the input word is performed based on the obtained one or more POS characteristics (e.g., POS values) of the word from the sub-sequences that is substantially similar to the input word); and discarding the part of speech tag for the set of context sentences (see Bellegarda, [0068], discussing disregarding sequences of words not relevant to the input sequence of words based on the closeness measure of the vector representation).
Chao, Akbik, and Bellegarda are considered to be analogous to the claimed invention because they relate to methods for determining and tagging parts of speech in a text. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Chao in view of Akbik on processing the parts of speech in a text with the latent semantic space teachings of Bellegarda, so that a part-of-speech tag to assign to a word of the input sequence may be determined based on characteristics of the words of the training sequences from the neighborhood (see Bellegarda, [0010]).
Regarding claim 3, Chao in view of Akbik further in view of Bellegarda teaches the CIM of claim 1. Bellegarda further teaches wherein the performance of natural language processing to determine a part of speech tag for the given word based, at least in part, on the set of context sentences includes determining relevant words in the context sentences and limiting the determining of the part of speech tag to only those relevant words (see Bellegarda, [0062]: "At operation 1002 the received sequence is mapped into a vector space, as described above. In one embodiment, the vector space is an LSM space. In one embodiment, the vector space includes representations of a plurality of training sequences of words from a training corpus, as described above. Next, method 1000 continues with operation 1003 that involves forming a neighborhood associated with the received sequence of words in the vector space to obtain POS characteristics, for example, POS values. In one embodiment, the neighborhood is associated with one or more training sequences that are globally relevant" (relevant sentences in LSM/context sentences) "to the received sequence"). The same motivation to combine as claim 1 applies here.
Regarding claim 4, Chao in view of Akbik further in view of Bellegarda teaches the CIM of claim 1. Chao further teaches selecting a selected word of the target sentence for tagging (see Chao, [0053], determining the changed/target sentence; see Chao, [0050], assigning the words' part of speech in step 104). Bellegarda further teaches finding a similar word in the set of context sentences (see Bellegarda, [0075]: "If the training sequence contains the word that is substantially similar to the input word, the training sentence is retained, and a sub-sequence of such training sequence is selected that contains the word that is substantially similar to the input word at operation 1207"); and tagging the selected word based, at least in part, on a part of speech tag assigned to the similar word (see Bellegarda, [0076]: "Next, assigning a POS tag to the input word is performed based on the obtained one or more POS characteristics (e.g., POS values) of the word from the sub-sequences that is substantially similar to the input word"). The same motivation to combine as claim 1 applies here.
Regarding claim 5, Chao in view of Akbik further in view of Bellegarda teaches the CIM of claim 1. Akbik further teaches obtaining a document embedding from the corpus data set, with the document embedding being applicable to the group consisting of: tokenization, named entity recognition, and sentiment analysis (see Akbik, sect. 3.5: "First, character-level LM is independent of tokenization and a fixed vocabulary. Second, they produce stronger character-level features, which is particularly useful for downstream tasks such as NER" (named entity recognition); Akbik, abstract and sect. 1: "By learning to predict the next character on the basis of previous characters, such models have been shown to automatically internalize linguistic concepts such as words, sentences, subclauses and even sentiment"); and mathematically representing a potential extension as a continuous feature vector based on the document embedding (see Akbik, sect. 2.3, discussing the creation of the final word vectors given by equation (14)). The same motivation to combine as claim 1 applies here.
Claim 7 is directed to a method claim corresponding to the method presented in claim 1 and is rejected under the same grounds stated above regarding claim 1. Further, Akbik teaches determining a set of context sentences for use by the part of speech tagging model to parse the target sentence based on random selection of the context sentences from the piece of natural language text of the corpus data set (see Akbik, sect. 3.4 and Table 4: "We compute contextual string embeddings for all words in the English CONLL03 corpus and compute nearest neighbors in the embedding space using the cosine distance. We then look up nearest neighbors for different mentions of the word 'Washington'. As Table 4 shows, the embeddings successfully pry apart person, place, legislative entity and team (a-d). For instance, 'Washington' used as last name in context (b) is closest to other last names, many of which are also place names ('Carla Sacramento'); 'Washington' used as a sport team name in context (d) is closest to other place names used in sports team contexts. We include a negative example (e) in Table 4 in which the context is not sufficient to determine the type of mention" (random selection); see Akbik, sect. 3.2: "we utilize variational dropout, set the number of hidden states per-layer of the LSTM to 256, set the number of LSTM layers to 1, and perform model selection over the learning rate ∈ {0.01, 0.05, 0.1} and mini-batch size ∈ {8, 16, 32}, choosing the model with the best F-measure (for NER and chunking) or accuracy (for PoS) in the best epoch as judged by performance on the validation set").
Claim 11 is analogous to claim 5 and is subject to the same rejection as claim 5, although claim 11 is of slightly different scope from claim 5.
Claims 9-10 and 15-16 are analogous to claims 3-4 and are subject to the same rejections as claims 3-4.
Claim 13 is directed to a method claim corresponding to the method presented in claim 1 and is rejected under the same grounds stated above regarding claim 1. Further, Akbik teaches determining a set of context sentences for use by the part of speech tagging model to parse the target sentence based on performing Next Sentence Prediction processing using the target sentence and the piece of natural language text of the corpus data set (see Akbik, sect. 1, neural character-level language modeling: by learning to predict the next character on the basis of previous characters, such models learn internal representations that capture syntactic and semantic properties; even though trained without an explicit notion of word and sentence boundaries, they have been shown to generate grammatically correct text, including words, subclauses, quotes, and sentences. The new architecture teaches contextual string embeddings, a novel type of word embeddings based on character-level language modeling, and their use in a state-of-the-art sequence labeling architecture).
Claim 17 is analogous to claim 5 and is subject to the same rejection as claim 5.
Claims 6, 12, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Chao, US PgPub 2005/0049852, in view of Akbik et al., "Contextual string embeddings for sequence labeling," Proceedings of the 27th International Conference on Computational Linguistics, 2018, further in view of Bellegarda, US PgPub 2009/0089058, further in view of Straka et al., "Evaluating contextualized embeddings on 54 languages in POS tagging, lemmatization and dependency parsing," arXiv preprint arXiv:1908.07448 (2019).
Regarding claim 6, Chao in view of Akbik further in view of Bellegarda teaches the method of claim 1. However, Chao in view of Akbik further in view of Bellegarda fails to teach the data set includes Universal Dependencies; the part of speech tags are derived from XPOS language specific tags; and the natural language processing uses software that includes a BERTBASE layer, a linear layer and a softmax layer.
However, Straka teaches the data set includes Universal Dependencies (see Straka, sect. 1: "We publish a comparison and evaluation of three recently proposed contextualized word embedding methods: BERT (Devlin et al., 2018), Flair (Akbik et al., 2018) and ELMo (Peters et al., 2018), in 89 corpora which have a training set in 54 languages of the Universal Dependencies 2.3 in three tasks: POS tagging, lemmatization and dependency parsing"); the part of speech tags are derived from XPOS language specific tags (see Straka, Fig. 1, POS tags are derived from the XPOS tags); and the natural language processing uses software that includes a BERTBASE layer, a linear layer, and a softmax layer (see Straka, sect. 3, discussing using a BERT Base layer, a linear layer, and a softmax layer for the processing).
Chao in view of Akbik further in view of Bellegarda teaches contextual string embeddings, which were obtained from internal states of a character-level bidirectional language model. Straka teaches contextualized word embeddings of Universal Dependencies 2.3 in three tasks: POS tagging, lemmatization, and dependency parsing (see Straka, abstract, sect. 1). Using the known technique of Transformer-based models for contextual word tagging using BERT Base, as taught by Straka, to improve the tagging of the selected sentence in Chao in view of Akbik further in view of Bellegarda, such as improved POS tagging, would have been obvious to one of ordinary skill in the art.
Claims 12 and 18 are analogous to claim 6 and are subject to the same rejections as claim 6.
Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Chao, US PgPub 2005/0049852, in view of Akbik et al., "Contextual string embeddings for sequence labeling," Proceedings of the 27th International Conference on Computational Linguistics, 2018, further in view of Bellegarda, US PgPub 2009/0089058, further in view of Gorin et al., US PgPub 2003/0191625.
Regarding claim 19, Chao in view of Akbik further in view of Bellegarda teaches the CIM of claim 1. However, Chao in view of Akbik further in view of Bellegarda fails to teach training a machine learning model with the corpus data set and the saved part of speech tags. However, Gorin teaches training a machine learning model with the corpus data set and the saved part of speech tags (see Gorin, [0063], discussing aligning recognized output (saved POS tags) with creating a named entity language model as part of the training corpus).
Chao in view of Akbik further in view of Bellegarda teaches contextual string embeddings, which were obtained from internal states of a character-level bidirectional language model. Gorin teaches recognizing input communications from a training corpus, parsing the training corpus, tagging the parsed training corpus, and aligning the recognized training corpus with the tagged training corpus (see Gorin, abstract). Using the known technique of aligning the recognized output with the parsed training corpus, as taught by Gorin, to expand the training corpus to include additional untagged sentences in Chao in view of Akbik further in view of Bellegarda, such as improved POS tagging, would have been obvious to one of ordinary skill in the art.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Bernard et al., US PgPub 2007/0073745, teaches a method of determining similarity between portions of text comprising generating a semantic profile for at least two portions of text, each semantic profile comprising a vector of values, and computing a similarity metric representing a similarity between the at least two portions of text using the at least two generated semantic profiles. The semantic profile comprises a vector of information values (see Bernard, Abstract).
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NANDINI SUBRAMANI whose telephone number is (571)272-3916. The examiner can normally be reached Monday - Friday 12:00pm - 5:00 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh M Mehta can be reached at (571)272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/NANDINI SUBRAMANI/ Examiner, Art Unit 2656 /BHAVESH M MEHTA/Supervisory Patent Examiner, Art Unit 2656