Prosecution Insights
Last updated: April 19, 2026
Application No. 18/935,376

ENTITY LINKING METHOD AND APPARATUS, COMPUTER DEVICE, AND STORAGE MEDIUM

Non-Final OA (§102, §103)
Filed
Nov 01, 2024
Examiner
ELLIS, MATTHEW J
Art Unit
2152
Tech Center
2100 — Computer Architecture & Software
Assignee
Tencent Technology (Shenzhen) Company Limited
OA Round
1 (Non-Final)
Grant Probability: 69% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 3y 3m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 69% (above average): 219 granted / 318 resolved (+13.9% vs TC avg)
Interview Lift: +30.9% (strong), based on resolved cases with interview
Typical Timeline: 3y 3m average prosecution; 17 applications currently pending
Career History: 335 total applications across all art units

Statute-Specific Performance

§101: 17.2% (-22.8% vs TC avg)
§103: 55.0% (+15.0% vs TC avg)
§102: 12.8% (-27.2% vs TC avg)
§112: 6.5% (-33.5% vs TC avg)
Tech Center averages are estimates • Based on career data from 318 resolved cases

Office Action

§102 §103
DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This action is in response to communications filed on 11/01/2024, in which claims 1-20 are presented for examination.

Priority

Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119(a)-(d). The certified copy has been filed in parent Application No. CN202211612479.5, filed on 12/14/2022.

Drawings

The drawings have been acknowledged and are acceptable for examination purposes.

Specification

The specification has been acknowledged and is acceptable for examination purposes.

Considerations under 35 USC § 101

Claims 1-20 are NOT directed to an abstract idea for at least the following reasons: the independent claims contain limitations that amount to an inventive concept, as found in paragraphs [0004]-[0007] of the specification and represented in the claim limitation, “Merged text content is generated based on the descriptive information, the at least one candidate entity content, and the second screening template content. Target entity content corresponding to the text characters is obtained based on the merged text content.” The dependent claims also recite significantly more than a judicial exception due at least to their dependency on their respective independent claims.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-2, 12-13, and 17-18 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Hudetz et al., US 20240370479 A1 (hereinafter referred to as “Hudetz”).
As per claim 1, Hudetz teaches: An entity linking method, the method comprising: obtaining, by processing circuitry, text content including text characters and descriptive information that explains the text characters (Hudetz, [0038] – The term “electronic record” may refer to a contract or other record created, generated, sent, communicated, received, or stored by an electronic mechanism. [0191] – An abstractive summary may be provided for a candidate document vector which serves as a written description); obtaining at least one candidate entity content corresponding to the text characters based on the descriptive information (Hudetz, [0053] – The search results may include a set of candidate document vectors that are semantically similar to the search vector); performing content filling on first screening template content based on the text characters to generate second screening template content (Burton, column 35, lines 33-42 – The user constructs a field of analysis units for this tool by means of a search box as implemented through, e.g. a dropdown menu UI control with a multiple selection paradigm, whose contents may be filled by having chosen whether the field of search may include tokens (and thus named entities), channel identifications, or tag identifications as by the use of, e.g. radio or checkbox UI elements, and the system suggests such tokens, channels, or tags as it finds available and appropriate within the document); generating merged text content based on the descriptive information, the at least one candidate entity content, and the second screening template content (Hudetz, [0070] – Text tags offer a flexible mechanism for setting up document templates that allow positioning signature and initial fields, collecting data from multiple parties within an agreement, defining validation rules for the collected data, and adding qualifying conditions, wherein the templates are interpreted as at least first and second screening template content. 
[0083] – A contextualized embedding may comprise a vector representation of a sequence of words in the search query 144 that includes contextual information for the sequence of words, wherein the vector is interpreted as a candidate entity content. [0081] – The client 134 may prepare the electronic document 142 as a brand new originally-written document, a modification of a previous electronic document, or from a document template with predefined information content, wherein the modification of a previous electronic document in relation to the templates, descriptive information within the templates, and the vectors); and obtaining target entity content corresponding to the text characters based on the merged text content (Hudetz, [0094] – The input data is also known as the features, and the output data is known as the target or label. The goal of a supervised algorithm is to learn the relationship between the input features and the target labels, so that it can make accurate predictions or classifications for new, unseen data). 
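The limitations mapped for claim 1 above amount to a fill, merge, and rank flow. The following minimal Python sketch illustrates that flow only; every name (`fill_template`, `merge`, `link_entity`), the `{mention}` placeholder, the `[SEP]` separator, and the toy scoring callable are hypothetical illustrations, not drawn from the application or from the cited references.

```python
# Hypothetical sketch of the claimed entity-linking flow.
# All names and tokens here are assumptions for illustration only.

def fill_template(template: str, mention: str) -> str:
    # "performing content filling on first screening template content
    # based on the text characters to generate second screening template content"
    return template.replace("{mention}", mention)

def merge(description: str, candidates: list[str], filled_template: str) -> str:
    # "generating merged text content based on the descriptive information,
    # the at least one candidate entity content, and the second screening
    # template content"
    return " [SEP] ".join([description, *candidates, filled_template])

def link_entity(mention: str, description: str, candidates: list[str],
                template: str, score) -> str:
    # "obtaining target entity content corresponding to the text characters
    # based on the merged text content" -- here, by ranking candidates
    merged = merge(description, candidates, fill_template(template, mention))
    return max(candidates, key=lambda c: score(merged, c))
```

In practice the `score` callable would be a trained disambiguation model; any callable comparing the merged text with a candidate slots in for the sketch.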
As per claim 2, Hudetz teaches: The method according to claim 1, wherein the obtaining the at least one candidate entity content comprises: obtaining coded text characters information corresponding to the text characters based on a text characters identifier (Hudetz, [0134] – The search manager 124 may encode or transform a set of electronic documents 706 to create a set of contextualized embeddings (e.g., sentence embeddings) representative of information or document content contained within each electronic document 706); obtaining coded descriptive information corresponding to the descriptive information based on a descriptive information identifier (Hudetz, [0122] – the search manager 124 may use the search model 704 to identify a defined entity or a document type for a defined entity); obtaining feature mining information of the coded text characters information based on the coded descriptive information (Hudetz, [0137] – Contextualized embeddings are represented as high-dimensional vectors of real numbers, where each dimension corresponds to a particular feature or aspect of the word's context); and obtaining the at least one candidate entity content corresponding to the text characters based on the feature mining information (Hudetz, [0148] – The search results 146 may include a set of P candidate document vectors).

Claims 12-13 are directed to an apparatus performing steps recited in claims 1-2 with substantially the same limitations. Therefore, the rejections made to claims 1-2 are applied to claims 12-13. Claims 17-18 are directed to a non-transitory computer-readable storage medium performing steps recited in claims 1-2 with substantially the same limitations. Therefore, the rejections made to claims 1-2 are applied to claims 17-18.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 3-5 are rejected under 35 U.S.C. 103 as being unpatentable over Hudetz in view of Burton et al., US 12210839 B1 (hereinafter referred to as “Burton”).

As per claim 3, Hudetz teaches: The method according to claim 1, wherein the generating the merged text content further comprises: generating an entity content identifier for each of the at least one candidate entity content (Hudetz, [0183] – This can be achieved through a technique called text summarization, which involves identifying the most important information in a document and condensing it into a shorter summary). Although Hudetz teaches tag markers in [0070] which are similar to spliced entities, Burton specifically teaches the terms “spliced” and “replacements” of the text in content, which map better to spliced entities and masked content: obtaining at least one spliced entity content based on splicing each of the at least one candidate entity content and the entity content identifier of the respective candidate entity content (Burton, column 27, lines 19-25 – Indices are used for tag splicing); obtaining masked screening template content based on masking the second screening template content (Burton, column 26, lines 62-67 – All matches may be replaced with an otherwise illegal sentinel character (e.g. “>”) the length of the match content. Then this parsing boundary sentinel character may be replaced with no character wherever it occurs in the concatenated deadened markup); and obtaining the merged text content based on splicing the descriptive information, the at least one spliced entity content, and the masked screening template content according to a preset splicing format (Burton, column 27, lines 1-5 – With the alternative purified, deadened markup-generated text computed for the parentally-bereft lists of union and selection tags, and the union and selection innerHTML easily made available).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Hudetz’s invention in view of Burton in order to include splicing and replacing text; this is advantageous because a key feature of the system as flexibly designed is that the quality of results as experienced within the interactive portion of the system can improve over time as the commodity and fine-tuned or prompt-engineered inference engine may be upgraded—the underlying specification of the data model and details of the interactive analysis system need not rapidly change or evolve (Burton, column 57, lines 55-62).
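The splice-and-mask limitations of claim 3 mapped above can be sketched as follows. The identifier scheme (`[E1]`, `[E2]`, …), the `{answer}` slot, and the `[MASK]` and `[SEP]` tokens are assumptions chosen for illustration, not taken from the application or the cited references.

```python
# Hypothetical sketch of claim 3's splice-and-mask step.
# Token choices ([E1], [MASK], [SEP], {answer}) are illustrative assumptions.

def splice_candidates(candidates: list[str]) -> list[str]:
    # generate an entity content identifier per candidate and splice it on
    return [f"[E{i}] {c}" for i, c in enumerate(candidates, start=1)]

def mask_template(filled_template: str, answer_slot: str = "{answer}") -> str:
    # obtain masked screening template content by masking the answer slot
    return filled_template.replace(answer_slot, "[MASK]")

def merged_text(description: str, candidates: list[str],
                filled_template: str) -> str:
    # preset splicing format: description, spliced entities, masked template
    parts = [description, *splice_candidates(candidates),
             mask_template(filled_template)]
    return " [SEP] ".join(parts)
```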
As per claim 4, Hudetz as modified teaches: The method according to claim 3, wherein the obtaining the target entity content further comprises: obtaining coded merged information corresponding to the merged text content based on encoding the merged text content (Burton, column 20, lines 55-60 – May classify the tag as requiring no remediation, recommending to be lysed and merged with a sibling tag, lysed and merged with a parent tag, reconsidered as visual, reconsidered as content, or reconsidered as front matter); obtaining feature mining information corresponding to the merged text content (Hudetz, [0094] – The goal of a supervised algorithm is to learn the relationship between the input features and the target labels, so that it can make accurate predictions or classifications for new, unseen data); and obtaining the target entity content based on performing prediction on masked information in the merged text content according to the feature mining information (Hudetz, [0094] – Examples of supervised learning algorithms include: (1) linear regression which is a regression algorithm used to predict continuous numeric values, such as stock prices or temperature; (2) logistic regression which is a classification algorithm used to predict binary outcomes, such as whether a customer will purchase or not purchase a product; (3) decision tree which is a classification algorithm used to predict categorical outcomes by creating a decision tree based on the input features, etc.).
As per claim 5, Hudetz as modified teaches: The method according to claim 1, wherein the obtaining the at least one candidate entity content includes obtaining the at least one candidate entity content corresponding to the text characters based on the descriptive information with a preset retrieval model (Hudetz, [0156] – To maintain the fast performance that users expect from search, semantic summarization and ranking are applied to a set number of results, such as the top 50 results, as scored by the default scoring algorithm, wherein a default scoring algorithm is interpreted as a preset retrieval model); and the obtaining the target entity content includes obtaining the target entity content corresponding to the text characters based on the merged text content with a preset disambiguation model (Burton, column 46, lines 39-50 – Clicking upon an artifact's (or artifact date's, as in the case of multiple artifacts, the selection of one of which can be settled with a disambiguation control offering multiple buttons captioned with an artifact identifier) square may cause a UI popup control to be issued in the vicinity of the square, bearing information including, but not limited to: the filing date or preparation date or the qualified compliance date of the artifact; the semantic channel counts of the artifact, broken down to positive and negative channel identifications and jointly tabulated over positive, negative, and neutral channel identifications; abbreviated tag and entity incidence data; and a LLM-generated summary of the report).

Claims 14-16 are directed to an apparatus performing steps recited in claims 3-5 with substantially the same limitations. Therefore, the rejections made to claims 3-5 are applied to claims 14-16. Claims 19-20 are directed to a non-transitory computer-readable storage medium performing steps recited in claims 3-5 with substantially the same limitations. Therefore, the rejections made to claims 3-5 are applied to claims 19-20.
Claims 6 and 10-11 are rejected under 35 U.S.C. 103 as being unpatentable over Hudetz in view of Burton and further in view of Wang et al., US 20230169270 A1 (hereinafter referred to as “Wang”).

As per claim 6, Hudetz as modified teaches: The method according to claim 5, wherein the obtaining the target entity content further comprises: … obtaining an enhanced entity content sample based on information enhancement of the at least one entity content sample (Burton, column 17, lines 46-57 – The base system which allows relatively static analysis of documents, document streams, and document stream originators, may be enhanced in, e.g., premium computation modes, by tightly integrated generative AI LLM-powered features such as synthetic advisories, reports, presentations, perspective-taking, subjective semantic distance judgments, and query language formation, and extended analysis features such as intelligent data export and a captive interpreter functionality which allows power user extension of the set of analysis capabilities designed for typical use); … obtaining the preset disambiguation model by training the initially trained disambiguation model with the text content sample (Burton, column 46, lines 40-50 – The selection of one of which can be settled with a disambiguation control offering multiple buttons captioned with an artifact identifier) square may cause a UI popup control to be issued in the vicinity of the square, bearing information including, but not limited to: the filing date or preparation date or the qualified compliance date of the artifact; the semantic channel counts of the artifact, broken down to positive and negative channel identifications and jointly tabulated over positive, negative, and neutral channel identifications; abbreviated tag and entity incidence data; and a LLM-generated summary of the report).
Hudetz as modified does not go into detail about a to-be-trained disambiguation model; however, Wang teaches: obtaining a to-be-trained disambiguation model, a text content sample, and at least one entity content sample (Wang, [0125] – The shared code of the target text is obtained by inputting the target text to an entity recognition model. [0128] – Disambiguation model, to-be-disambiguated sample entity); and obtaining an initially trained disambiguation model by training the to-be-trained disambiguation model with the enhanced entity content sample (Wang, [0128] – Wherein, the first entity disambiguation model may be a BiLSTM (Bi-directional Long Short-Term Memory) model which is pre-trained by means of training samples in the knowledge base).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Hudetz’s invention as modified in view of Wang in order to include a pre-trained disambiguation model; this is advantageous to improve the accuracy and recall rate of entity recognition (Wang, [0106]).

As per claim 10, Hudetz as modified with Wang teaches: The method according to claim 5, further comprising: obtaining a to-be-trained retrieval model, a first text content sample, and a second text content sample (Wang, [0128] – Wherein, the first entity disambiguation model may be a BiLSTM (Bi-directional Long Short-Term Memory) model which is pre-trained by means of training samples in the knowledge base. Specifically, one to-be-disambiguated sample entity is randomly selected from the knowledge base, and entities sharing the same name with the to-be-disambiguated sample entity are also selected from the knowledge base to form a candidate sample entity set); obtaining an initially trained retrieval model based on pretraining the to-be-trained retrieval model with the first text content sample (Wang, [0128] – The first entity disambiguation model may be a BiLSTM (Bi-directional Long Short-Term Memory) model which is pre-trained by means of training samples in the knowledge base); and obtaining the preset retrieval model based on training the initially trained retrieval model with the first text content sample and the second text content sample (Wang, [0128] – The first entity disambiguation model may be a BiLSTM (Bi-directional Long Short-Term Memory) model which is pre-trained by means of training samples in the knowledge base).

As per claim 11, Hudetz as modified with Wang teaches: The method according to claim 10, wherein the obtaining the initially trained retrieval model comprises: obtaining at least one text content unit from the first text content sample (Wang, [0127] – The co-occurrence features, and the shared code of the target text are determined); determining a target text content unit in the at least one text content unit (Wang, [0127] – The co-occurrence features, and the shared code of the target text are determined); obtaining a masked text content sample based on masking the target text content unit (Wang, [0069] – Generally, 15% of the characters in a sentence are randomly selected to be used for prediction, 80% of the erased characters are replaced with a feature symbol [MASK]); and obtaining the initially trained retrieval model based on training the to-be-trained retrieval model with the masked text content sample (Wang, [0128] – Then, the first splicing result is input to a first entity disambiguation model, and the first entity disambiguation model outputs a first encoding result).

Allowable Subject Matter

Claims 7-9 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

As per claim 7, The method according to claim 6, wherein the obtaining the enhanced entity content sample comprises: obtaining a first entity content sample and a second entity content sample by dividing a plurality of entity content samples into two types of entity content samples; obtaining a masked entity content sample based on performing masking on the first entity content sample; and obtaining the enhanced entity content sample based on splicing the masked entity content sample and the second entity content sample.

As per claim 8, The method according to claim 6, wherein the obtaining the preset disambiguation model further comprises: obtaining at least one candidate entity content sample corresponding to a sample of the text characters; generating a merged text content sample based on the sample of the text characters, a sample of the descriptive information, and the at least one candidate entity content sample; performing screening on the at least one candidate entity content sample with the initially trained disambiguation model based on the merged text content sample, to obtain a target entity content sample corresponding to the sample of the text characters; calculating model loss information based on the target entity content sample; and performing parameter adjustment on the initially trained disambiguation model based on the model loss information, to obtain the preset disambiguation model.
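Wang [0069], quoted in the claim 11 mapping above, describes the standard BERT-style masking recipe: roughly 15% of the characters in a sentence are selected for prediction, and 80% of those are replaced with a `[MASK]` symbol. A runnable sketch of just that selection-and-masking step follows; the function name and the `[MASK]` token spelling are illustrative assumptions, and the 10%/10% random-replace/keep branches of the full scheme are omitted.

```python
import random

# Sketch of the masking recipe Wang [0069] quotes: select ~15% of
# characters for prediction, replace 80% of the selected ones with [MASK].
# (The 10% random-replace / 10% keep-unchanged branches are omitted.)

def mask_sample(chars: list[str], rng: random.Random,
                select_rate: float = 0.15, mask_rate: float = 0.80):
    masked = list(chars)
    labels = [None] * len(chars)          # prediction target per position
    n_select = max(1, round(len(chars) * select_rate))
    for i in rng.sample(range(len(chars)), n_select):
        labels[i] = chars[i]              # the character to be predicted
        if rng.random() < mask_rate:
            masked[i] = "[MASK]"          # 80% of selections get masked
    return masked, labels
```

A retrieval or disambiguation model would then be trained to recover each non-`None` label from the masked sequence.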
As per claim 9, The method according to claim 8, wherein the calculating the model loss information comprises: determining a positive sample similarity between a positive text characters sample and the target entity content sample; obtaining a computed positive sample similarity; obtaining a statistical positive sample similarity based on the computed positive sample similarity; and obtaining the model loss information based on the statistical positive sample similarity.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:

Hunn et al., US 20240330605 A1, teaches AI and machine learning techniques within a framework of an electronic document management system to manage and mine a collection of electronic documents for certain types of information. The information may be analyzed and used to generate insights for a defined entity. The insights may comprise deviations, modifications, or changes made to an electronic document within the electronic document management system. Other embodiments are described and claimed (Abstract).

Li et al., 18 March 2020, “A Survey on Deep Learning for Named Entity Recognition”, pgs. 1-16.

Contact Information

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Matthew Ellis, whose telephone number is (571) 270-3443 and email is matthew.ellis@uspto.gov (normal business hours Monday-Friday, 8 AM-5 PM EST). If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Neveen Abel-Jalil, can be reached at (571) 270-0474. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

January 25, 2026
/MATTHEW J ELLIS/
Primary Examiner, Art Unit 2152

Prosecution Timeline

Nov 01, 2024
Application Filed
Jan 24, 2026
Non-Final Rejection — §102, §103
Feb 19, 2026
Applicant Interview (Telephonic)
Feb 23, 2026
Examiner Interview Summary

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602545
WIDE AND DEEP NETWORK FOR LANGUAGE DETECTION USING HASH EMBEDDINGS
Granted Apr 14, 2026 (2y 5m to grant)
Patent 12591551
GENERATION METHOD, SEARCH METHOD, AND GENERATION DEVICE
Granted Mar 31, 2026 (2y 5m to grant)
Patent 12579136
SEMANTIC PARSING USING EMBEDDING SPACE REPRESENTATIONS OF EXAMPLE NATURAL LANGUAGE QUERIES
Granted Mar 17, 2026 (2y 5m to grant)
Patent 12572571
LEARNING OPTIMIZED METALABEL EMBEDDED RANGE SEARCH STRUCTURES
Granted Mar 10, 2026 (2y 5m to grant)
Patent 12536135
TEMPLATE APPLICATION PROGRAM
Granted Jan 27, 2026 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 69%
With Interview: 99% (+30.9%)
Median Time to Grant: 3y 3m
PTA Risk: Low
Based on 318 resolved cases by this examiner. Grant probability derived from career allow rate.
