Prosecution Insights
Last updated: April 19, 2026
Application No. 18/303,885

Named Entity Recognition System based on Enhanced Label Embedding and Curriculum Learning

Non-Final OA — §102, §103
Filed: Apr 20, 2023
Examiner: NGUYEN, CHAU T
Art Unit: 2145
Tech Center: 2100 — Computer Architecture & Software
Assignee: Robert Bosch GmbH
OA Round: 1 (Non-Final)
Grant Probability: 68% (Favorable)
OA Rounds: 1-2
To Grant: 4y 0m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 68% (372 granted / 549 resolved), +12.8% vs TC avg; above average
Interview Lift: +31.8% among resolved cases with an interview (vs without)
Typical timeline: 4y 0m avg prosecution; 31 applications currently pending
Career history: 580 total applications across all art units

Statute-Specific Performance

§101: 14.0% (-26.0% vs TC avg)
§103: 48.5% (+8.5% vs TC avg)
§102: 15.9% (-24.1% vs TC avg)
§112: 12.2% (-27.8% vs TC avg)
Black line = Tech Center average estimate • Based on career data from 549 resolved cases

Office Action

§102, §103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. Claims 1-20 are pending.

Information Disclosure Statement

The information disclosure statement (IDS) submitted on 04/20/2023 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1 and 20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Gao et al. (Gao), US Patent Application Publication No. US 2020/0065374 A1.
As to independent claim 1, Gao discloses a method for training a model configured to perform a named entity recognition task, the method comprising: receiving, with a processor, a sentence and ground truth labels as training inputs (paragraph [0025]: receiving one or more sets of training data, each set of training data may include a training text (sentence) and its corresponding named entity tags and relation labels, wherein the training text may be unstructured text (sentence); paragraphs [0026], [0058]: the ground truth can be named entity tags and relation labels for the training texts); determining, with the processor, a text embedding representing the sentence using the model based on the sentence (paragraph [0018]: for the named entity recognition (NER) task, various types of word representations of an unstructured text (sentence) are determined, wherein the word representations may include word embeddings (text embeddings)); determining, with the processor, an attention vector using the model based on the text embedding (paragraph [0037]: an attention-based subword encoder sublayer is designated to differentiate these subwords and to consider their importance, for example, the attention layer will generate an importance score for each subword based on the dot product of the attention weight vector and the encoding of the subword; paragraph [0043]: the received input may be fed to a convolutional layer, e.g., a CNN, to generate a set of vector representations (attention vectors) of phrases in the sentence); determining, with the processor, an attended text embedding using the model based on the text embedding and the attention vector (paragraph [0043]: an attention layer may be applied to the input layer to learn critical words for relation classification, for example, the attention layer may be a multihead attention layer, and the importance score for each word in the attention mechanism may be based on the dot product of the attention vector and the
vector representation of the word); determining, with the processor, named entity recognition labels for individual words of the sentence using the model based on the attended text embedding (paragraphs [0008]-[0009]: determining a plurality of word representations of an unstructured text and tagging entities in the unstructured text by performing a named entity recognition task on the plurality of word representations); determining, with the processor, a first training loss based on the named entity recognition labels and the ground truth label data (Figure 5 and paragraphs [0060]-[0061]: the model training device may calculate a loss associated with named entity recognition (NER) and a loss associated with relation extraction (RE), respectively; the losses may be calculated according to a loss function, which may be a weighted average of the losses of the NER part and the RE part); and refining, with the processor, the model using the first training loss (paragraphs [0062]-[0064]: the model training device may adjust sets of parameters for optimization).

As to dependent claim 20, Gao discloses determining a classification label for the sentence as a whole using the model (Figure 6 and paragraphs [0068]-[0071]); determining a second training loss based on the classification label and the ground truth label data (Figure 5 and paragraphs [0060]-[0061]); and refining the model jointly using the first training loss and the second training loss (Figure 5 and paragraphs [0060]-[0061]).

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. The following is a quotation of 35 U.S.C.
103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action. The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:

1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 2-15 and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Gao as discussed in claims 1 and 20 above, and further in view of Kirch et al.
(Kirch), US Patent Application Publication No. US 2024/0111955 A1.

As to dependent claim 2, Gao, however, does not disclose the determining the attention vector further comprising: determining the attention vector based on the text embedding and a label-word relation matrix representing relations between respective words in a vocabulary and respective labels in a plurality of labels. In the same field of endeavor, Kirch discloses that an input text can be a natural language text, where the words in the natural language text are converted into embeddings and inserted into an embedding matrix during pre-processing (paragraph [0009]). Kirch further discloses in paragraph [0033] that a Neural Capsule Entity Disambiguator receives the natural language text as input; for example, the text input "John Smith is an artist. John likes to paint" passed to the Neural Capsule Entity Disambiguator is converted to embeddings, where the nine embeddings in the text input have values corresponding to the words and comprise an embedding vector, wherein each word embedding in the embedding vector is itself a vector. Kirch further discloses in paragraph [0034] that features in the text are identified during pre-processing and fed into the Named Entity Disambiguation (NED) model, wherein the features are converted to numerical representations and included, as feature embeddings, with each word embedding that the feature is relevant to, and the embedding vector is converted to a two-dimensional input, where each sentence is inserted into a row, resulting in an embedding matrix (paragraph [0035]).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the system of Gao to incorporate determining the attention vector based on the text embedding and a label-word relation matrix representing relations between respective words in a vocabulary and respective labels in a plurality of labels, as taught by Kirch, for the purpose of identifying whether a word in the input is a named entity or not a named entity.

As to dependent claim 3, Gao and Kirch disclose generating, prior to training the model, the label-word relation matrix based on the vocabulary and the plurality of labels (Kirch, paragraphs [0028], [0046]).

As to dependent claim 4, Gao and Kirch disclose the generating the label-word relation matrix further comprising: determining a plurality of word embeddings representing all words in the vocabulary using the model (Kirch, paragraphs [0028], [0046]); determining a plurality of label embeddings representing all labels in the plurality of labels using the model (Kirch, paragraphs [0028], [0046]); and determining the label-word relation matrix based on the plurality of word embeddings and the plurality of label embeddings (Kirch, paragraphs [0028], [0046]).

As to dependent claim 5, Gao discloses the generating the label-word relation matrix further comprising: determining each element of the label-word relation matrix by determining a dot product of a respective word embedding from the plurality of word embeddings and a respective label embedding from the plurality of label embeddings (paragraphs [0037], [0043]).

As to dependent claim 6, Gao and Kirch disclose the generating the label-word relation matrix further comprising: normalizing each element of the label-word relation matrix using a normalization operation (Gao, paragraph [0043]; Kirch, paragraph [0047]).
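The relation-matrix mechanics recited in claims 2-6 (dot-product elements, normalization, and an attention vector drawn from the matrix) can be sketched in NumPy. This is a minimal illustration only: the array shapes, the softmax normalization, and the max-over-labels attention rule are editorial assumptions, not taken from the application or from Gao or Kirch.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vocab, n_labels, dim = 6, 3, 8

# Embeddings for every word in the vocabulary and every label (claim 4).
word_emb = rng.standard_normal((n_vocab, dim))
label_emb = rng.standard_normal((n_labels, dim))

# Claim 5: each element is the dot product of a word embedding and a
# label embedding -> label-word relation matrix, shape (n_vocab, n_labels).
R = word_emb @ label_emb.T

# Claim 6: normalize each element (softmax over labels is one assumed choice).
R = np.exp(R) / np.exp(R).sum(axis=1, keepdims=True)

# Claims 2 and 11 (as one possible reading): the attention value for each
# word of a sentence is its strongest relation to any label.
sentence = [0, 3, 5]                    # word indices into the vocabulary
attention = R[sentence].max(axis=1)     # one attention value per word

# Claim 17: attended text embedding = text embedding scaled by attention.
text_emb = word_emb[sentence]
attended = text_emb * attention[:, None]
print(attended.shape)                   # (3, 8)
```

Each row of the normalized matrix sums to 1, so every attention value lies in (0, 1]; any other normalization (e.g. min-max) would fit the claim language equally well.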
As to dependent claim 7, Gao and Kirch disclose wherein the plurality of labels include: compound labels each indicating both a named entity type and a label boundary; decomposed entity labels indicating a named entity type (Gao, paragraphs [0018], [0034], [0039]); and decomposed label boundaries indicating a label boundary (Gao, paragraphs [0018], [0034], [0039]).

As to dependent claim 8, Gao and Kirch disclose the generating the label-word relation matrix further comprising: receiving a label-label relation matrix representing relations between respective labels in the plurality of labels and respective other labels in the plurality of labels (Kirch, Figure 8, paragraph [0048]); receiving a word-word relation matrix representing relations between respective words in the vocabulary and respective other words in the vocabulary (Kirch, Figure 8, paragraph [0046]); and augmenting the label-word relation matrix by multiplying the label-label relation matrix and the word-word relation matrix with the label-word relation matrix (Gao, paragraph [0043]; Kirch, paragraphs [0046], [0048]).

As to dependent claim 9, Gao and Kirch disclose wherein: each element of the label-label relation matrix represents a similarity in meaning between a respective label from the plurality of labels and a respective other label from the plurality of labels (Kirch, paragraph [0042]); and each element of the word-word relation matrix represents a similarity in meaning between a respective word from the vocabulary and a respective other word from the vocabulary (Kirch, paragraph [0042]).
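Claim 8's augmentation step, multiplying a word-word relation matrix and a label-label relation matrix into the label-word matrix, reduces to two matrix products. A sketch follows; the cosine-similarity construction of the two side matrices is an assumption chosen to match claim 9's "similarity in meaning" language, not a disclosed implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
n_vocab, n_labels, dim = 6, 3, 8

word_emb = rng.standard_normal((n_vocab, dim))
label_emb = rng.standard_normal((n_labels, dim))
R = word_emb @ label_emb.T            # label-word relation matrix, (n_vocab, n_labels)

def cosine_sim(E):
    """Pairwise cosine similarity; claim 9 reads each element as a
    similarity in meaning between two embeddings (an assumed reading)."""
    U = E / np.linalg.norm(E, axis=1, keepdims=True)
    return U @ U.T

W_sim = cosine_sim(word_emb)          # word-word relations, (n_vocab, n_vocab)
L_sim = cosine_sim(label_emb)         # label-label relations, (n_labels, n_labels)

# Claim 8: augment by multiplying both relation matrices into R.
# The shapes force the order: (n_vocab, n_vocab) @ (n_vocab, n_labels) @ (n_labels, n_labels).
R_aug = W_sim @ R @ L_sim             # still (n_vocab, n_labels)
print(R_aug.shape)                    # (6, 3)
```

Left-multiplying by W_sim blends each word's row with rows of similar words, and right-multiplying by L_sim blends each label's column with similar labels, which is the natural reading of "augmenting" the label-word matrix with both side matrices.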
As to dependent claim 10, Gao and Kirch disclose the determining the attention vector further comprising: determining the attention vector as a sequence of attention values, each attention value in the attention vector corresponding to a respective word in the sentence and being determined based on a subset of elements in the label-word relation matrix representing relations between the respective word and the plurality of labels (Gao, paragraphs [0043], [0044]; Kirch, paragraph [0008]).

As to dependent claim 11, Gao and Kirch disclose the determining the attention vector further comprising: determining each attention value in the attention vector based on an element in the label-word relation matrix representing a label in the plurality of labels that has a strongest relation with the respective word (Gao, paragraph [0041]; Kirch, paragraph [0042]).

As to dependent claim 12, Gao and Kirch disclose the determining the attention vector further comprising: determining the attention vector as a sequence of attention values, each corresponding to a respective word in the sentence, each attention value in the attention vector being determined based on a subset of elements in the label-word relation matrix representing (i) relations between the respective word and the plurality of labels and (ii) relations between at least one word adjacent to the respective word and the plurality of labels (Kirch, paragraphs [0047]-[0048]).

As to dependent claim 13, Gao and Kirch disclose the determining the attention vector further comprising at least one of: weighting the subset of elements in the label-word relation matrix with a weight matrix (Kirch, paragraph [0042]); and offsetting the subset of elements in the label-word relation matrix with an offset matrix.
As to dependent claim 14, Gao and Kirch disclose the determining the attention vector further comprising: modifying the sequence of attention values in the attention vector depending on a pattern in the attention vector (Gao, Figure 5).

As to dependent claim 15, Gao and Kirch disclose wherein the pattern is a positional relationship between (i) at least one first word in the sentence corresponding to attention values in the attention vector with respect to a particular label of the plurality of labels that exceed a predetermined threshold and (ii) at least one second word in the sentence corresponding to an entity that is to be labeled with the particular label (Gao, paragraph [0044]).

As to dependent claim 17, Gao discloses the determining the attended text embedding further comprising: determining the attended text embedding by multiplying the text embedding with the attention vector (Gao, paragraphs [0037], [0043]).

As to dependent claim 18, Gao and Kirch disclose wherein the sentence is one of a plurality of sentences, the method further comprising: determining, with the processor, for each respective sentence in the plurality of sentences, a respective difficulty using the label-word relation matrix, each respective difficulty indicating a difficulty of performing a named entity recognition task with respect to the respective sentence (Kirch, paragraphs [0041], [0042]); scheduling, with the processor, the plurality of sentences into a sequence of sentences using a curriculum learning technique in which sentences having a relatively lower named entity recognition difficulty are sequenced earlier than sentences having a relatively higher named entity recognition difficulty (Kirch, paragraphs [0041], [0042]); and feeding, with the processor, during training, the plurality of sentences into the model according to the scheduled sequence of sentences (Kirch, paragraphs [0041], [0042]).
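Claim 18's curriculum step, scoring each sentence's NER difficulty and feeding easier sentences to the model first, amounts to a sort by a difficulty score. In the claim the difficulty comes from the label-word relation matrix; the word-count proxy below is purely hypothetical, used only to keep the sketch self-contained.

```python
def curriculum_schedule(sentences, difficulty_fn):
    """Order sentences easiest-first, as in claim 18's scheduling step."""
    return sorted(sentences, key=difficulty_fn)

# Hypothetical difficulty proxy: longer sentences are assumed harder.
sents = [
    "John Smith is an artist who likes to paint",
    "He paints",
    "Bosch ships automotive parts worldwide",
]
ordered = curriculum_schedule(sents, lambda s: len(s.split()))
print(ordered[0])   # "He paints" is fed to the model first
```

Claim 19's refinement would then sit on top of this ordering: each training batch mixes easy and hard sentences in a ratio adjusted according to the model's performance on the previous batch.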
As to dependent claim 19, Gao and Kirch disclose wherein the sequence of sentences is organized into training batches, each training batch having a respective ratio of (i) sentences having a relatively lower difficulty and (ii) sentences having a relatively higher difficulty (Kirch, paragraphs [0041], [0042]), the scheduling further comprising: setting the respective ratio for each training batch in the sequence of sentences depending on a performance of the model with respect to a previous training batch in the sequence of sentences (Kirch, paragraphs [0041], [0042]).

Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Gao and Kirch as discussed in claims 2-15 and 17-19 above, and further in view of Kirch et al. (Kirch_0193368), US Patent Application Publication No. US 2024/0193368 A1.

As to dependent claim 16, Gao and Kirch, however, do not disclose the modifying further comprising: reducing the attention values in the attention vector corresponding to the at least one first word; and increasing attention values in the attention vector corresponding to the at least one second word. In the same field of endeavor, Kirch_0193368 discloses in paragraphs [0032], [0035], [0036] and [0039] that each Neural Capsule Embedding Network is configured to identify and classify named entities having a specific span length. M is a defined maximum span length, which is 32 in the preferred embodiment, such that the 32 Neural Capsule Embedding Networks are each configured to an assigned span length of 1 through 32. A Neural Capsule Embedding Network configured to span length 1 205 is tasked with identifying 1-word span named entities. A Neural Capsule Embedding Network configured to span length 2 210 is tasked with identifying 2-word span named entities. A Neural Capsule Embedding Network configured to span length 3 215 is tasked with identifying 3-word span named entities.
A Neural Capsule Embedding Network configured to span length 4 220 is tasked with identifying 4-word span named entities, and so on. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the systems of Gao and Kirch to incorporate reducing the attention values in the attention vector corresponding to the at least one first word and increasing attention values in the attention vector corresponding to the at least one second word, as taught by Kirch_0193368, for the purpose of identifying named entities of longer span lengths.

Conclusion

Any inquiry concerning this communication should be directed to CHAU T NGUYEN at telephone number (571) 272-4092. The examiner can normally be reached M-F from 8am to 5pm (PT). Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) Form at https://www.uspto.gov/patents/uspto-automated-interview-request-air-form. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Cesar Paula, can be reached at telephone number (571) 272-4128. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from Patent Center and the Private Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from Patent Center or Private PAIR. Status information for unpublished applications is available through Patent Center and Private PAIR for authorized users only. Should you have questions about access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/CHAU T NGUYEN/
Primary Examiner, Art Unit 2145

Prosecution Timeline

Apr 20, 2023
Application Filed
Jan 24, 2026
Non-Final Rejection — §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12596765
GENERATION AND USE OF CONTENT BRIEFS FOR NETWORK CONTENT AUTHORING
2y 5m to grant Granted Apr 07, 2026
Patent 12591795
METHOD FOR PROVIDING EXPLAINABLE ARTIFICIAL INTELLIGENCE
2y 5m to grant Granted Mar 31, 2026
Patent 12585722
IMAGE GENERATION SYSTEM, COMMUNICATION APPARATUS, METHODS OF OPERATING IMAGE GENERATION SYSTEM AND COMMUNICATION APPARATUS, AND STORAGE MEDIUM
2y 5m to grant Granted Mar 24, 2026
Patent 12579356
MATHEMATICAL CALCULATIONS WITH NUMERICAL INDICATORS
2y 5m to grant Granted Mar 17, 2026
Patent 12547825
WHITELISTING REDACTION SYSTEMS AND METHODS
2y 5m to grant Granted Feb 10, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
68%
Grant Probability
99%
With Interview (+31.8%)
4y 0m
Median Time to Grant
Low
PTA Risk
Based on 549 resolved cases by this examiner. Grant probability derived from career allow rate.
