Prosecution Insights
Last updated: April 19, 2026
Application No. 18/544,485

METHOD AND SYSTEM OF EXTRACTING NON-SEMANTIC ENTITIES

Status: Non-Final OA (§103)
Filed: Dec 19, 2023
Examiner: LANTZ, KARSTEN FOSTER
Art Unit: 2664
Tech Center: 2600 (Communications)
Assignee: L&T Technology Services Limited
OA Round: 1 (Non-Final)
Grant Probability: Favorable
Estimated OA Rounds: 1-2
Estimated Time to Grant: 2y 9m

Examiner Intelligence

Career Allow Rate: 0% (0 granted / 0 resolved; -62.0% vs TC avg)
Interview Lift: +0.0% (minimal lift; based on resolved cases with interview)
Avg Prosecution: 2y 9m (typical timeline)
Currently Pending: 19
Total Applications: 19 (career history, across all art units)

Statute-Specific Performance

§103: 73.8% (+33.8% vs TC avg)
§102: 14.3% (-25.7% vs TC avg)
§112: 11.9% (-28.1% vs TC avg)
Deltas are measured against a Tech Center average estimate; based on career data from 0 resolved cases.

Office Action (§103)
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority

Receipt is acknowledged that the application claims priority to foreign application No. IN202341030420, dated 4/27/2023. Copies of certified papers required by 37 CFR 1.55 have been received. Priority is acknowledged under 35 USC 119(e) and 37 CFR 1.78.

Information Disclosure Statement

The IDS dated 12/19/2023 has been considered and placed in the application file.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1, 2, 4, 7, 8, 10, 11, 14, 15, 16, 17, 19 are rejected under 35 U.S.C. 103 as obvious over US Patent Publication 2023 0267273 A1, (Theriappan et al.) in view of US Patent 11227183 B1, (Connors et al.).

Claim 1

Regarding Claim 1, Theriappan et al. teach a method of extracting one or more non-semantic entities in a document image, the method comprising: receiving, by a processor, the document image ("the enterprise document in the format of a semi-structured document of a specific enterprise (e.g., the enterprise 102a) may be received as a portable document format (PDF), Tagged Image File Format (TIFF), JPEG format, and the like," par.
47) comprising a plurality of data entities ("the entities may be positioned at various places in the document that is in an unstructured format or a semi-structured format," par. 24); extracting, by the processor, one or more row entities from the plurality of data entities for each row of the document image ("The post-processing module 504 further divides the tabular area into a plurality of rows and a plurality of columns and assigns each of the candidate tabular entities in a corresponding column of the plurality of columns. Further, extracting the candidate tabular entities and the candidate non-tabular entities from the enterprise documents," par. 99) and a corresponding row location based on a text extraction technique from the document image, ("The layout analyzer 316a analyzes the spatial location (i.e. x and y-coordinates) of the identified tokens in the document. In other words, the layout analyzer 316a extracts the layout property of the individual tokens by finding bounding boxes for each of the tokens," par. 64) wherein the one or more row entities comprises the one or more non-semantic entities and/or one or more semantic entities, wherein the one or more non-semantic entities comprises a plurality of numeric characters or a combination of a plurality of numeric characters, a plurality of special characters, and a plurality of alphabetic characters; for each of the row of the document: extracting, by the processor, one or more semantic entities from the one or more alphabetic entities based on a semantic recognition technique; ("the server system 106 with access to the database 108 is configured to automatically extract the structured data (and/or the semantic and non-sematic information) from the enterprise documents . . . the server system 106 may be configured to process the identified tokens for determining the candidate entities based at least on a combination of mathematical techniques along with rules and standards associated with the extraction of the candidate entities from the enterprise documents," par. 32 and 34) extracting, by the processor, one or more non-semantic entities as the split-row entities other than the one or more semantic entities ("the server system 106 with access to the database 108 is configured to automatically extract the structured data (and/or the semantic and non-sematic information) from the enterprise documents," par. 32); determining, by the processor, a plurality of feature values corresponding to each of a plurality of feature types, for each of the one or more non-semantic entities; ("The server system 106 may use feature engineering to identify and extract the information from the enterprise documents. In general, feature engineering is a process of extracting document features (e.g., properties, attributes, characteristics, etc.) from raw data," par. 34) determining, by the processor, a first probability output for each of a plurality of labels for each of the one or more non-semantic entities based on the plurality of feature values using a first prediction technique ("the server system 106 may be configured to process the identified tokens for determining the candidate entities based at least on a combination of mathematical techniques along with rules and standards associated with the extraction of the candidate entities from the enterprise documents. Thereafter, the server system 106 may compute probability scores corresponding to the identified candidate entities," par. 
34), wherein the first prediction technique is trained based on first training data corresponding to a plurality of predefined non-semantic entities labeled based on the plurality of labels and corresponding plurality of feature values ("More specifically, the ML algorithms are trained on a large set of training data including cross-enterprise documents, for producing outcomes (i.e. entity/information extraction) with high accuracy. The training data may include, but are not limited to, invoices, purchase orders, resumes, restaurant menus, bills, receipts, and the like. The ML algorithms may be encoded with a list of custom document features associated with the enterprise documents. The document features typically encode structural, contextual, entity-specific, and token-specific properties for each word (i.e. tokens) in the enterprise documents," par. 32); determining, by the processor, a second probability output for each of the plurality of labels for each of the one or more semantic entities surrounding each of the one or more non-semantic entities using a second prediction technique ("the server system 106 extracts the structured data (i.e. the candidate entities) from the enterprise document (or the semi-structured enterprise document) based on the identified candidate entities and the probability scores associated with each of the tokens. Further, the server system 106 may compute confidence scores based on the probability scores associated with the candidate entities," par. 34), wherein the second prediction technique is trained based on second training data comprising a list of plurality of surrounding unigram semantic entities, bigrams semantic entities and trigram semantic entities corresponding to the plurality of pre-defined non-semantic entities ("More specifically, the ML algorithms are trained on a large set of training data including cross-enterprise documents, for producing outcomes (i.e. entity/information extraction) with high accuracy. The training data may include, but are not limited to, invoices, purchase orders, resumes, restaurant menus, bills, receipts, and the like. The ML algorithms may be encoded with a list of custom document features associated with the enterprise documents. The document features typically encode structural, contextual, entity-specific, and token-specific properties for each word (i.e. tokens) in the enterprise documents," par. 32); and labeling, by the processor, each of the one or more non-semantic entities based on determination of a highest probability value from a sum of the first probability output and the second probability output for each of the plurality of labels ("the word-shape analyzer 316b takes the actual text information from each of the tokens and generates its word-shape property. More specifically, the word-shape property encodes the exact nature of each character in the tokens by storing a continuous stream of “X” and “D”, where “X” represents an alphabet or a special character and “D” represents a digit," par. 64). Theriappan et al. do not explicitly teach all of for each of the row of the document: splitting, by the processor, the one or more row entities into one or more split-row entities based on a predefined splitting rule; determining, by the processor, one or more alphabetic entities and/or one or more numeric entities from the one or more split-row entities based on a detection of only alphabetic characters or only numeric characters respectively in each of the one or more row entities. However, Connors et al. 
teach for each of the row of the document: splitting, by the processor, the one or more row entities into one or more split-row entities based on a predefined splitting rule ("The document processor 202 analyzes the first document 124 and the second document 126 to split the first document 124 and the second document 126 into individual lines 222 wherein each row of text is considered as one line," col. 6, line 10); determining, by the processor, one or more alphabetic entities and/or one or more numeric entities from the one or more split-row entities based on a detection of only alphabetic characters or only numeric characters respectively in each of the one or more row entities ("The considered properties are i) the character is an alphabetical character and in upper case, ii) the character is an alphabetical character and in lower case, iii) the character is a digit," col. 6, line 66). Therefore, taking the teachings of Theriappan et al. and Connors et al. as a whole, it would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention of the instant application to modify entity extracting methods as taught by Theriappan et al. to use document and character analyzing methods as taught by Connors et al. The suggestion/motivation for doing so would have been that, “The extraction and selection of specific sections enable the data extraction and expansion system 100 to better identify the entities of a specific type thereby extracting entities at a more granular level and in finer detail” as noted by the Connors et al. disclosure in paragraph [22], which also motivates combination because the combination would predictably have a higher productivity as there is a reasonable expectation that a more specific and exact level of extraction is desired; and/or because doing so merely combines prior art elements according to known methods to yield predictable results. The rejection of method claim 1 above applies mutatis mutandis to the corresponding limitations of system claim 10 and non-transitory storage medium claim 16 while noting that the rejection above cites to both system and non-transitory storage medium disclosures. Claims 10 and 16 are mapped below for clarity of the record and to specify any new limitations not included in claim 1.
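Read as a pipeline, the claim 1 limitations mapped above amount to: split each row on delimiters, separate semantic (alphabetic) tokens from non-semantic ones, score each non-semantic token with a feature-based model and a surrounding-context model, and assign the label with the highest summed probability. The minimal Python sketch below illustrates that flow only; the label set, cue words, weights, and lookup-table "models" are hypothetical stand-ins for whatever trained first and second prediction techniques the claims cover, and nothing here is the applicant's implementation. The word_shape helper mirrors the "X"/"D" encoding quoted from Theriappan at par. 64.

    import re

    # Toy label set and lookup-table "models". The claims do not fix a model
    # type, so these are hypothetical stand-ins for the trained first and
    # second prediction techniques; every name and weight here is invented.
    LABELS = ["invoice_no", "date", "amount", "other"]

    def split_row(row_text, delimiters=r"[;,|\t]"):
        """Split a row entity into split-row entities on predefined delimiters."""
        return [t.strip() for t in re.split(delimiters, row_text) if t.strip()]

    def word_shape(token):
        """Word-shape stream per the Theriappan passage quoted above (par. 64):
        'D' for a digit, 'X' for an alphabet or special character."""
        return "".join("D" if c.isdigit() else "X" for c in token)

    def is_non_semantic(token):
        """Numeric-only tokens, or mixes of digits with letters/specials."""
        return any(c.isdigit() for c in token) and not token.isalpha()

    def feature_values(token):
        """Two illustrative feature values: numeric share and character pattern."""
        digits = sum(c.isdigit() for c in token)
        return {"pct_numeric": digits / len(token), "shape": word_shape(token)}

    def first_probabilities(feats):
        """First prediction technique: label probabilities from feature values."""
        p = dict.fromkeys(LABELS, 0.1)
        if feats["pct_numeric"] >= 0.8:
            p["amount"] += 0.5
        elif "D" in feats["shape"]:
            p["invoice_no"] += 0.5
        return p

    def second_probabilities(context):
        """Second prediction technique: label probabilities from surrounding
        semantic entities (unigram cues here; bigrams/trigrams extend the idea)."""
        cues = {"invoice": "invoice_no", "date": "date", "total": "amount"}
        p = dict.fromkeys(LABELS, 0.1)
        for word in " ".join(context).lower().split():
            if word in cues:
                p[cues[word]] += 0.3
        return p

    def label_non_semantic_entities(row_text):
        tokens = split_row(row_text)
        semantic = [t for t in tokens if t.replace(" ", "").isalpha()]
        labeled = {}
        for tok in tokens:
            if is_non_semantic(tok):
                p1 = first_probabilities(feature_values(tok))
                p2 = second_probabilities(semantic)
                # Final label = argmax over the summed probability outputs.
                labeled[tok] = max(LABELS, key=lambda l: p1[l] + p2[l])
        return labeled

    print(label_non_semantic_entities("Invoice No; INV-2023-0042; Total; 1499.00"))
    # {'INV-2023-0042': 'invoice_no', '1499.00': 'amount'}

In the example, "INV-2023-0042" lands on invoice_no because the feature model rewards its mixed digit/letter shape and the surrounding "Invoice" unigram reinforces it, while "1499.00" resolves to amount; summing the two outputs before taking the maximum is the highest-probability-value selection the claim recites.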
Claim 2

Regarding Claim 2, Theriappan et al. and Connors et al. teach the method of claim 1 as noted above. Connors et al. do not explicitly teach all of wherein each of the one or more non-semantic entities are determined based on determination of at least four or more characters in each of the one or more split-row entities, and wherein the predefined splitting rule is based on detection of one or more delimiter. However, Theriappan et al. teach wherein each of the one or more non-semantic entities are determined based on determination of at least four or more characters in each of the one or more split-row entities, ("the word-shape analyzer 316b takes the actual text information from each of the tokens and generates its word-shape property. More specifically, the word-shape property encodes the exact nature of each character in the tokens by storing a continuous stream of “X” and “D”, where “X” represents an alphabet or a special character and “D” represents a digit," par. 64) and wherein the predefined splitting rule is based on detection of one or more delimiter ("The unstructured or semi-structured document may also be a binary representation of dark and light areas of a scanned document. Further, the unstructured or semi-structured document may not contain format markers," par. 44). Theriappan et al. and Connors et al. are combined as per claim 1. The rejection of method claim 2 above applies mutatis mutandis to the corresponding limitations of non-transitory storage medium claim 17 while noting that the rejection above cites to non-transitory storage medium disclosures. Claim 17 is mapped below for clarity of the record and to specify any new limitations not included in claim 2.

Claim 4

Regarding Claim 4, Theriappan et al. and Connors et al. teach the method of claim 1 as noted above. Theriappan et al. do not explicitly teach all of wherein the plurality of feature types comprises: one or more numeric features, one or more percentage features, one or more positioning features, one or more and one or more pattern features. However, Connors et al. teach wherein the plurality of feature types comprises: one or more numeric features, one or more percentage features, one or more positioning features, one or more and one or more pattern features ("By way of illustration and not limitation, the features extracted from the body text can include: token: token itself isUpper: 1 if the token is in upper case else 0 isTitle: 1 if the token is a title else 0 isDigit: 1 if the token is a digit else 0 isAlphanum: 1 if the token is in alphanumeric form else 0 isAlpha: 1 if the token is an alphabetical character else 0 isHead: 1 if the token is part of a section heading else 0 sectionNo: the number assigned to the corresponding section of the token. Section number is assigned based on the sequential order of the section characterEncoding: encodes the properties of each character in the token. The considered properties are i) the character is an alphabetical character and in upper case, ii) the character is an alphabetical character and in lower case, iii) the character is a digit, iv) the character is a punctuation symbol: the name of the token if it's not a word. For example, ‘Comma’, ‘Semicolon’, etc. repeatedSymbolFeature: checks whether symbols are repeated tokenLength: length of token 1 if the token is a * else 0," col. 6, line 51). Theriappan et al. and Connors et al. are combined as per claim 1. The rejection of method claim 4 above applies mutatis mutandis to the corresponding limitations of system claim 11 and non-transitory storage medium claim 19 while noting that the rejection above cites to both system and non-transitory storage medium disclosures. Claims 11 and 19 are mapped below for clarity of the record and to specify any new limitations not included in claim 4.

Claim 7

Regarding Claim 7, Theriappan et al. and Connors et al. teach the method of claim 4 as noted above. Connors et al. do not explicitly teach all of determining, by the processor, a position of one or more special characters in each of the non-semantic entities with respect to surrounding characters to the one or more special characters in each of the non-semantic entities. However, Theriappan et al. teach determining, by the processor, a position of one or more special characters in each of the non-semantic entities with respect to surrounding characters to the one or more special characters in each of the non-semantic entities ("The layout analyzer 316a analyzes the spatial location (i.e.
x and y-coordinates) of the identified tokens in the document. In other words, the layout analyzer 316a extracts the layout property of the individual tokens by finding bounding boxes for each of the tokens. Further, the word-shape analyzer 316b takes the actual text information from each of the tokens and generates its word-shape property. More specifically, the word-shape property encodes the exact nature of each character in the tokens by storing a continuous stream of “X” and “D”, where “X” represents an alphabet or a special character and “D” represents a digit," par. 64). Theriappan et al. and Connors et al. are combined as per claim 1. The rejection of method claim 7 above applies mutatis mutandis to the corresponding limitations of system claim 14 while noting that the rejection above cites to system disclosures. Claim 14 is mapped below for clarity of the record and to specify any new limitations not included in claim 7. Claim 8 Regarding Claim 8, Theriappan et al. and Connors et al. teach the method of claim 4 as noted above. Theriappan et al. do not explicitly teach all of determining, by the processor, a pattern for each of the one or more non-semantic entities based on a presence of a numerical character, an alphabetical character, or a special character. However, Connors et al. teach determining, by the processor, a pattern for each of the one or more non-semantic entities based on a presence of a numerical character, an alphabetical character, or a special character ("The features listed above are input to the sequential learning model 262 in a successive manner (i.e., in the same order in which the tokens were presented in the first document 124 and the second document 126). Also, section boundaries can also be used as features by the sequential learning model 262. The sequential learning model 262 is configured to predict the class label of each token. The class labels of the tokens for a given section are recorded and the label which is predicted for the maximum number of tokens is predicted as the target label of that section e.g., such as the section label," col. 7, line 10). Theriappan et al. and Connors et al. are combined as per claim 1. The rejection of method claim 8 above applies mutatis mutandis to the corresponding limitations of system claim 15 while noting that the rejection above cites to system disclosures. Claim 15 is mapped below for clarity of the record and to specify any new limitations not included in claim 8. Claim 10 Regarding claim 10, Theriappan et al. teach a system for extracting one or more non-semantic entities in a document image, comprising: one or more processors ("The computer system 202 includes at least one processor," par. 38); a memory communicatively coupled to the processors, wherein the memory stores a plurality of processor-executable instructions, which, upon execution, cause the processors to: ("The computer system 202 includes at least one processor 206 for executing instructions, a memory 208, a communication interface 210, and a storage interface 214 that communicate with each other," par. 38) extract one or more row entities from a plurality of data entities for each row of the document image ("The post-processing module 504 further divides the tabular area into a plurality of rows and a plurality of columns and assigns each of the candidate tabular entities in a corresponding column of the plurality of columns. 
Further, extracting the candidate tabular entities and the candidate non-tabular entities from the enterprise documents," par. 99) and a corresponding row location based on a text extraction technique from the document image, wherein the one or more row entities comprises the one or more non-semantic entities and/or one or more semantic entities, ("The layout analyzer 316a analyzes the spatial location (i.e. x and y-coordinates) of the identified tokens in the document. In other words, the layout analyzer 316a extracts the layout property of the individual tokens by finding bounding boxes for each of the tokens," par. 64) wherein the one or more non-semantic entities comprises a plurality of numeric characters or a combination of a plurality of numeric characters, a plurality of special characters, and a plurality of alphabetic characters; for each of the row of the document, causing the processors to: extract one or more semantic entities from the one or more alphabetic entities based on a semantic recognition technique; ("the server system 106 with access to the database 108 is configured to automatically extract the structured data (and/or the semantic and non-sematic information) from the enterprise documents . . . the server system 106 may be configured to process the identified tokens for determining the candidate entities based at least on a combination of mathematical techniques along with rules and standards associated with the extraction of the candidate entities from the enterprise documents," par. 32 and 34) extract one or more non-semantic entities as the split-row entities other than the one or more semantic entities ("the server system 106 with access to the database 108 is configured to automatically extract the structured data (and/or the semantic and non-sematic information) from the enterprise documents," par. 32); determine a plurality of feature values corresponding to each of a plurality of feature types, for each of the one or more non-semantic entities ("The server system 106 may use feature engineering to identify and extract the information from the enterprise documents. In general, feature engineering is a process of extracting document features (e.g., properties, attributes, characteristics, etc.) from raw data," par. 34); determine a first probability output for each of a plurality of labels for each of the one or more non-semantic entities based on the plurality of feature values using a first prediction technique ("the server system 106 may be configured to process the identified tokens for determining the candidate entities based at least on a combination of mathematical techniques along with rules and standards associated with the extraction of the candidate entities from the enterprise documents. Thereafter, the server system 106 may compute probability scores corresponding to the identified candidate entities," par. 34), wherein the first prediction technique is trained based on first training data corresponding to a plurality of predefined non-semantic entities labeled based on the plurality of labels and corresponding plurality of feature values ("More specifically, the ML algorithms are trained on a large set of training data including cross-enterprise documents, for producing outcomes (i.e. entity/information extraction) with high accuracy. The training data may include, but are not limited to, invoices, purchase orders, resumes, restaurant menus, bills, receipts, and the like. 
The ML algorithms may be encoded with a list of custom document features associated with the enterprise documents. The document features typically encode structural, contextual, entity-specific, and token-specific properties for each word (i.e. tokens) in the enterprise documents," par. 32); determine a second probability output for each of the plurality of labels for each of the one or more semantic entities surrounding each of the one or more non-semantic entities using a second prediction technique ("the server system 106 extracts the structured data (i.e. the candidate entities) from the enterprise document (or the semi-structured enterprise document) based on the identified candidate entities and the probability scores associated with each of the tokens. Further, the server system 106 may compute confidence scores based on the probability scores associated with the candidate entities," par. 34), wherein the second prediction technique is trained based on second training data comprising a list of plurality of surrounding unigram semantic entities, bigrams semantic entities and trigram semantic entities corresponding to the plurality of pre-defined non-semantic entities ("More specifically, the ML algorithms are trained on a large set of training data including cross-enterprise documents, for producing outcomes (i.e. entity/information extraction) with high accuracy. The training data may include, but are not limited to, invoices, purchase orders, resumes, restaurant menus, bills, receipts, and the like. The ML algorithms may be encoded with a list of custom document features associated with the enterprise documents. The document features typically encode structural, contextual, entity-specific, and token-specific properties for each word (i.e. tokens) in the enterprise documents," par. 32); and label each of the one or more non-semantic entities based on determination of a highest probability value from a sum of the first probability output and the second probability output for each of the plurality of labels ("the word-shape analyzer 316b takes the actual text information from each of the tokens and generates its word-shape property. More specifically, the word-shape property encodes the exact nature of each character in the tokens by storing a continuous stream of “X” and “D”, where “X” represents an alphabet or a special character and “D” represents a digit," par. 64). Theriappan et al. do not explicitly teach all of for each of the row of the document, causing the processors to: split the one or more row entities into one or more split-row entities based on a predefined splitting rule; determine one or more alphabetic entities and/or one or more numeric entities from the one or more split-row entities based on a detection of only alphabetic characters or only numeric characters respectively in each of the one or more row entities. However, Connors et al. teach for each of the row of the document, causing the processors to: split the one or more row entities into one or more split-row entities based on a predefined splitting rule ("The document processor 202 analyzes the first document 124 and the second document 126 to split the first document 124 and the second document 126 into individual lines 222 wherein each row of text is considered as one line," col. 
6, line 10); determine one or more alphabetic entities and/or one or more numeric entities from the one or more split-row entities based on a detection of only alphabetic characters or only numeric characters respectively in each of the one or more row entities ("The considered properties are i) the character is an alphabetical character and in upper case, ii) the character is an alphabetical character and in lower case, iii) the character is a digit," col. 6, line 66). Theriappan et al. and Connors et al. are combined as per claim 1. Claim 11 Regarding claim 11, Theriappan et al. and Connors et al. teach the system of claim 10 as noted above. Theriappan et al. do not explicitly teach all of wherein the plurality of feature types comprises: one or more numeric features, one or more percentage features, one or more positioning features, and one or more pattern features. However, Connors et al. teach wherein the plurality of feature types comprises: one or more numeric features, one or more percentage features, one or more positioning features, and one or more pattern features ("By way of illustration and not limitation, the features extracted from the body text can include: token: token itself isUpper: 1 if the token is in upper case else 0 isTitle: 1 if the token is a title else 0 isDigit: 1 if the token is a digit else 0 isAlphanum: 1 if the token is in alphanumeric form else 0 isAlpha: 1 if the token is an alphabetical character else 0 isHead: 1 if the token is part of a section heading else 0 sectionNo: the number assigned to the corresponding section of the token. Section number is assigned based on the sequential order of the section characterEncoding: encodes the properties of each character in the token. The considered properties are i) the character is an alphabetical character and in upper case, ii) the character is an alphabetical character and in lower case, iii) the character is a digit, iv) the character is a punctuation symbol: the name of the token if it's not a word. For example, ‘Comma’, ‘Semicolon’, etc. repeatedSymbolFeature: checks whether symbols are repeated tokenLength: length of token 1 if the token is a * else 0," col. 6, line 51). Theriappan et al. and Connors et al. are combined as per claim 1. Claim 14 Regarding claim 14, Theriappan et al. and Connors et al. teach the system of claim 11 as noted above. Connors et al. do not explicitly teach all of determination of a position of one or more special characters in each of the non-semantic entities with respect to surrounding characters to the one or more special characters in each of the non-semantic entities. However, Theriappan et al. teach determination of a position of one or more special characters in each of the non-semantic entities with respect to surrounding characters to the one or more special characters in each of the non-semantic entities ("The layout analyzer 316a analyzes the spatial location (i.e. x and y-coordinates) of the identified tokens in the document. In other words, the layout analyzer 316a extracts the layout property of the individual tokens by finding bounding boxes for each of the tokens. Further, the word-shape analyzer 316b takes the actual text information from each of the tokens and generates its word-shape property. More specifically, the word-shape property encodes the exact nature of each character in the tokens by storing a continuous stream of “X” and “D”, where “X” represents an alphabet or a special character and “D” represents a digit," par. 64). Theriappan et al. 
and Connors et al. are combined as per claim 1. Claim 15 Regarding claim 15, Theriappan et al. and Connors et al. teach the system of claim 11 as noted above. Theriappan et al. do not explicitly teach all of determination of a pattern for each of the one or more non-semantic entities based on a presence of a numerical character, an alphabetical character, or a special character. However, Connors et al. teach determination of a pattern for each of the one or more non-semantic entities based on a presence of a numerical character, an alphabetical character, or a special character ("The features listed above are input to the sequential learning model 262 in a successive manner (i.e., in the same order in which the tokens were presented in the first document 124 and the second document 126). Also, section boundaries can also be used as features by the sequential learning model 262. The sequential learning model 262 is configured to predict the class label of each token. The class labels of the tokens for a given section are recorded and the label which is predicted for the maximum number of tokens is predicted as the target label of that section e.g., such as the section label," col. 7, line 10). Theriappan et al. and Connors et al. are combined as per claim 1. Claim 16 Regarding claim 16, Theriappan et al. teach a non-transitory computer-readable medium storing computer-executable instructions for extracting one or more non-semantic entities in a document image, the computer-executable instructions configured for: ("In addition, the server system 106 should be understood to be embodied in at least one computing device in communication with the network 110, which may be specifically configured, via executable instructions, to perform as described herein, and/or embodied in at least one non-transitory computer-readable media," par. 31) receiving the document image ("the enterprise document in the format of a semi-structured document of a specific enterprise (e.g., the enterprise 102a) may be received as a portable document format (PDF), Tagged Image File Format (TIFF), JPEG format, and the like," par. 47) comprising a plurality of data entities ("the entities may be positioned at various places in the document that is in an unstructured format or a semi-structured format," par. 24); extracting one or more row entities from the plurality of data entities for each row of the document image ("The post-processing module 504 further divides the tabular area into a plurality of rows and a plurality of columns and assigns each of the candidate tabular entities in a corresponding column of the plurality of columns. Further, extracting the candidate tabular entities and the candidate non-tabular entities from the enterprise documents," par. 99) and a corresponding row location based on a text extraction technique from the document image ("The layout analyzer 316a analyzes the spatial location (i.e. x and y-coordinates) of the identified tokens in the document. In other words, the layout analyzer 316a extracts the layout property of the individual tokens by finding bounding boxes for each of the tokens," par. 
64), wherein the one or more row entities comprises the one or more non-semantic entities and/or one or more semantic entities, wherein the one or more non-semantic entities comprises a plurality of numeric characters or a combination of a plurality of numeric characters, a plurality of special characters, and a plurality of alphabetic characters; for each of the row of the document: extracting one or more semantic entities from the one or more alphabetic entities based on a semantic recognition technique ("the server system 106 with access to the database 108 is configured to automatically extract the structured data (and/or the semantic and non-sematic information) from the enterprise documents . . . the server system 106 may be configured to process the identified tokens for determining the candidate entities based at least on a combination of mathematical techniques along with rules and standards associated with the extraction of the candidate entities from the enterprise documents," par. 32 and 34); extracting one or more non-semantic entities as the split-row entities other than the one or more semantic entities ("the server system 106 with access to the database 108 is configured to automatically extract the structured data (and/or the semantic and non-sematic information) from the enterprise documents," par. 32); determining a plurality of feature values corresponding to each of a plurality of feature types, for each of the one or more non-semantic entities ("The server system 106 may use feature engineering to identify and extract the information from the enterprise documents. In general, feature engineering is a process of extracting document features (e.g., properties, attributes, characteristics, etc.) from raw data," par. 34); determining a first probability output for each of a plurality of labels for each of the one or more non-semantic entities based on the plurality of feature values using a first prediction technique ("the server system 106 may be configured to process the identified tokens for determining the candidate entities based at least on a combination of mathematical techniques along with rules and standards associated with the extraction of the candidate entities from the enterprise documents. Thereafter, the server system 106 may compute probability scores corresponding to the identified candidate entities," par. 34), wherein the first prediction technique is trained based on first training data corresponding to a plurality of predefined non-semantic entities labeled based on the plurality of labels and corresponding plurality of feature values ("More specifically, the ML algorithms are trained on a large set of training data including cross-enterprise documents, for producing outcomes (i.e. entity/information extraction) with high accuracy. The training data may include, but are not limited to, invoices, purchase orders, resumes, restaurant menus, bills, receipts, and the like. The ML algorithms may be encoded with a list of custom document features associated with the enterprise documents. The document features typically encode structural, contextual, entity-specific, and token-specific properties for each word (i.e. tokens) in the enterprise documents," par. 32); determining a second probability output for each of the plurality of labels for each of the one or more semantic entities surrounding each of the one or more non-semantic entities using a second prediction technique ("the server system 106 extracts the structured data (i.e. 
the candidate entities) from the enterprise document (or the semi-structured enterprise document) based on the identified candidate entities and the probability scores associated with each of the tokens. Further, the server system 106 may compute confidence scores based on the probability scores associated with the candidate entities," par. 34), wherein the second prediction technique is trained based on second training data comprising a list of plurality of surrounding unigram semantic entities, bigrams semantic entities and trigram semantic entities corresponding to the plurality of pre-defined non-semantic entities ("More specifically, the ML algorithms are trained on a large set of training data including cross-enterprise documents, for producing outcomes (i.e. entity/information extraction) with high accuracy. The training data may include, but are not limited to, invoices, purchase orders, resumes, restaurant menus, bills, receipts, and the like. The ML algorithms may be encoded with a list of custom document features associated with the enterprise documents. The document features typically encode structural, contextual, entity-specific, and token-specific properties for each word (i.e. tokens) in the enterprise documents," par. 32); and labeling each of the one or more non-semantic entities based on determination of a highest probability value from a sum of the first probability output and the second probability output for each of the plurality of labels ("the word-shape analyzer 316b takes the actual text information from each of the tokens and generates its word-shape property. More specifically, the word-shape property encodes the exact nature of each character in the tokens by storing a continuous stream of “X” and “D”, where “X” represents an alphabet or a special character and “D” represents a digit," par. 64). Theriappan et al. do not explicitly teach all of for each of the row of the document: splitting the one or more row entities into one or more split-row entities based on a predefined splitting rule; determining one or more alphabetic entities and/or one or more numeric entities from the one or more split-row entities based on a detection of only alphabetic characters or only numeric characters respectively in each of the one or more row entities. However, Connors et al. teach for each of the row of the document: splitting the one or more row entities into one or more split-row entities based on a predefined splitting rule ("The document processor 202 analyzes the first document 124 and the second document 126 to split the first document 124 and the second document 126 into individual lines 222 wherein each row of text is considered as one line," col. 6, line 10); determining one or more alphabetic entities and/or one or more numeric entities from the one or more split-row entities based on a detection of only alphabetic characters or only numeric characters respectively in each of the one or more row entities ("The considered properties are i) the character is an alphabetical character and in upper case, ii) the character is an alphabetical character and in lower case, iii) the character is a digit," col. 6, line 66). Theriappan et al. and Connors et al. are combined as per claim 1.

Claim 17

Regarding claim 17, Theriappan et al. and Connors et al. teach the non-transitory computer-readable medium of claim 16. Connors et al. do not explicitly teach all of wherein each of the one or more non-semantic entities are determined based on determination of at least four or more characters in each of the one or more split-row entities, and wherein the predefined splitting rule is based on detection of one or more delimiter. However, Theriappan et al. teach wherein each of the one or more non-semantic entities are determined based on determination of at least four or more characters in each of the one or more split-row entities, ("the word-shape analyzer 316b takes the actual text information from each of the tokens and generates its word-shape property. More specifically, the word-shape property encodes the exact nature of each character in the tokens by storing a continuous stream of “X” and “D”, where “X” represents an alphabet or a special character and “D” represents a digit," par. 64) and wherein the predefined splitting rule is based on detection of one or more delimiter ("The unstructured or semi-structured document may also be a binary representation of dark and light areas of a scanned document. Further, the unstructured or semi-structured document may not contain format markers," par. 44). Theriappan et al. and Connors et al. are combined as per claim 1.

Claim 19

Regarding claim 19, Theriappan et al. and Connors et al. teach the non-transitory computer-readable medium of claim 16. Theriappan et al. do not explicitly teach all of wherein the plurality of feature types comprises: one or more numeric features, one or more percentage features, one or more positioning features, one or more and one or more pattern features. However, Connors et al. teach wherein the plurality of feature types comprises: one or more numeric features, one or more percentage features, one or more positioning features, one or more and one or more pattern features ("By way of illustration and not limitation, the features extracted from the body text can include: token: token itself isUpper: 1 if the token is in upper case else 0 isTitle: 1 if the token is a title else 0 isDigit: 1 if the token is a digit else 0 isAlphanum: 1 if the token is in alphanumeric form else 0 isAlpha: 1 if the token is an alphabetical character else 0 isHead: 1 if the token is part of a section heading else 0 sectionNo: the number assigned to the corresponding section of the token. Section number is assigned based on the sequential order of the section characterEncoding: encodes the properties of each character in the token. The considered properties are i) the character is an alphabetical character and in upper case, ii) the character is an alphabetical character and in lower case, iii) the character is a digit, iv) the character is a punctuation symbol: the name of the token if it's not a word. For example, ‘Comma’, ‘Semicolon’, etc. repeatedSymbolFeature: checks whether symbols are repeated tokenLength: length of token 1 if the token is a * else 0," col. 6, line 51). Theriappan et al. and Connors et al. are combined as per claim 1.

2nd Claim Rejections - 35 USC § 103

Claims 3 and 18 are rejected under 35 U.S.C. 103 as obvious over US Patent Publication 2023 0267273 A1, (Theriappan et al.) and US Patent 11227183 B1, (Connors et al.) in view of US Patent Publication 2021 0357766 A1, (Paul et al.).

Claim 3

Regarding Claim 3, Theriappan et al. and Connors et al. teach the method of claim 1 as noted above. Connors et al.
teach removing, by the processor, one or more stop words from the one or more row entities ("the document processor 202 may also function to further preprocess the first document 124 and the second document 126 such as by parsing, tokenizing, removing stop words," par. 25); and lemmatizing, by the processor, the one or more row entities ("The consecutive lines following each heading and before the occurrence of the next heading are combined to form the body text which is identified as a separate section e.g. section 1, section 2, . . . section n," par. 26). Theriappan et al. and Connors et al. do not explicitly teach all of trimming, by the processor, one or more white spaces between the one or more row entities; removing, by the processor, one or more punctuation characters in each of one or more row entities; converting, by the processor, each alphabetic character of the one or more row entities into a lower case alphabetic character. However, Paul et al. teach trimming, by the processor, one or more white spaces between the one or more row entities ("The text normalizer 310 includes an initialization component 312 that joins multiple text fields from the maintenance report, where present, into a single block of text, shifts all letters in the text to lower case, and removes all white space and punctuation from the text," par. 38); removing, by the processor, one or more punctuation characters in each of one or more row entities ("The text normalizer 310 includes an initialization component 312 that joins multiple text fields from the maintenance report, where present, into a single block of text, shifts all letters in the text to lower case, and removes all white space and punctuation from the text," par. 38); converting, by the processor, each alphabetic character of the one or more row entities into a lower case alphabetic character ("The text normalizer 310 includes an initialization component 312 that joins multiple text fields from the maintenance report, where present, into a single block of text, shifts all letters in the text to lower case, and removes all white space and punctuation from the text," par. 38). Therefore, taking the teachings of Theriappan et al., Connors et al., and Paul et al. as a whole, it would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention of the instant application to modify entity extracting methods as taught by Theriappan et al. to use document and character analyzing methods as taught by Connors et al. and the character editing methods as taught by Paul et al. The suggestion/motivation for doing so would have been that, “the network interface 302 can retrieve only the free text portions of a given maintenance report as opposed to the entire report” as noted by the Paul et al. disclosure in paragraph [37], which also motivates combination because the combination would predictably have additional capability as there is a reasonable expectation that systems can more easily process free text as opposed to entire documents; and/or because doing so merely combines prior art elements according to known methods to yield predictable results. The rejection of method claim 3 above applies mutatis mutandis to the corresponding limitations of non-transitory storage medium claim 18 while noting that the rejection above cites to non-transitory storage medium disclosures. Claim 18 is mapped below for clarity of the record and to specify any new limitations not included in claim 3. 
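The claim 3 preprocessing chain (trim white spaces, remove punctuation, lower-case, drop stop words, lemmatize) is a standard text-normalization pass. A minimal sketch follows, assuming a toy stop-word list and a deliberately crude suffix rule in place of a real lemmatizer; none of it is drawn from the cited references beyond the steps themselves.

    import re

    STOP_WORDS = {"the", "a", "an", "of", "and"}  # illustrative subset only

    def normalize_row(row_text):
        """Claim 3-style preprocessing: trim white spaces, remove punctuation,
        lower-case, drop stop words, then apply a crude suffix rule standing
        in for lemmatization."""
        text = re.sub(r"\s+", " ", row_text).strip()   # trim white spaces
        text = re.sub(r"[^\w\s]", "", text)            # remove punctuation
        text = text.lower()                            # lower-case
        tokens = [t for t in text.split() if t not in STOP_WORDS]
        return [t[:-1] if t.endswith("s") else t for t in tokens]

    print(normalize_row("The  Totals, of   Invoices:"))  # ['total', 'invoice']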
Claim 18

Regarding Claim 18, Theriappan et al. and Connors et al. teach the non-transitory computer-readable medium of claim 16 as noted above. Connors et al. teach removing one or more stop words from the one or more row entities ("the document processor 202 may also function to further preprocess the first document 124 and the second document 126 such as by parsing, tokenizing, removing stop words," par. 25); and lemmatizing the one or more row entities ("The consecutive lines following each heading and before the occurrence of the next heading are combined to form the body text which is identified as a separate section e.g. section 1, section 2, . . . section n," par. 26). Theriappan et al. and Connors et al. do not explicitly teach all of trimming one or more white spaces between the one or more row entities; removing one or more punctuation characters in each of one or more row entities; converting each alphabetic character of the one or more row entities into a lower case alphabetic character. However, Paul et al. teach trimming one or more white spaces between the one or more row entities ("The text normalizer 310 includes an initialization component 312 that joins multiple text fields from the maintenance report, where present, into a single block of text, shifts all letters in the text to lower case, and removes all white space and punctuation from the text," par. 38); removing one or more punctuation characters in each of one or more row entities ("The text normalizer 310 includes an initialization component 312 that joins multiple text fields from the maintenance report, where present, into a single block of text, shifts all letters in the text to lower case, and removes all white space and punctuation from the text," par. 38); converting each alphabetic character of the one or more row entities into a lower case alphabetic character ("The text normalizer 310 includes an initialization component 312 that joins multiple text fields from the maintenance report, where present, into a single block of text, shifts all letters in the text to lower case, and removes all white space and punctuation from the text," par. 38). Theriappan et al., Connors et al., and Paul et al. are combined as per claim 3.

3rd Claim Rejections - 35 USC § 103

Claims 5, 12, and 20 are rejected under 35 U.S.C. 103 as obvious over US Patent Publication 2023 0267273 A1, (Theriappan et al.) and US Patent 11227183 B1, (Connors et al.) in view of US Patent Publication 2017 0293687 A1, (Kolotienko et al.), US Patent Publication 2023 0245485 A1, (Rimchala et al.), and US Patent Publication 2024 0202214 A1, (Bordawekar).

Claim 5

Regarding Claim 5, Theriappan et al. and Connors et al. teach the method of claim 4 as noted above. Theriappan et al. and Connors et al. do not explicitly teach all of determining, by the processor, a custom weight for each of the one or more non-semantic entities based on a number of alphabetic characters, a number of numeric characters and a number of special characters; determining, by the processor, a plurality of consecutive numeric characters present in a first half or a second half of each of the one or more non semantic entities; and determining, by the processor, a logarithmic value of each of the numeric entities. However, Kolotienko et al.
teach determining, by the processor, a custom weight for each of the one or more non-semantic entities based on a number of alphabetic characters, a number of numeric characters and a number of special characters ("The hierarchy of the semantic classes may be reflected by associating certain attribute values (which may be thought of as weight coefficients reflecting the relationship of the particular semantic class to a certain text feature) with each semantic class along a certain line of ancestry," par. 28). Rimchala et al. teach determining, by the processor, a plurality of consecutive numeric characters present in a first half or a second half of each of the one or more non semantic entities ("Tokens that are adjacent or consecutive are aggregated based on one or more rules. The rules may specify a number, type or set of tokens that are grouped based on the corresponding token labels and relative locations to each other," par. 70). Bordawekar teaches determining, by the processor, a logarithmic value of each of the numeric entities ("the partitioned set of buckets are assigned cluster identifications in the set of semantic clusters 420 (FIG. 4). In an example process, the numbers are first assigned to buckets based on their base 10 log values," par. 49). Therefore, taking the teachings of Theriappan et al., Connors et al., Kolotienko et al., Rimchala et al., and Bordawekar as a whole, it would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention of the instant application to modify entity extracting methods as taught by Theriappan et al. to use document and character analyzing methods as taught by Connors et al., the feature weights as taught by Kolotienko et al., the token grouping determination as taught by Rimchala et al., and the logarithmic assigning as taught by Bordawekar. The suggestion/motivation for doing so would have been that, “The semantic classes may be organized into a hierarchical structure which is also referred to as a “semantic hierarchy” herein. In certain implementations, the feature extraction may produce more accurate results by taking into account the semantic hierarchy to effectively consider chains of semantic classes representing multiple levels of abstraction of a certain semantic class.” as noted by the Kolotienko et al. disclosure in paragraph [28], which also motivates combination because the combination would predictably have a higher extraction accuracy as there is a reasonable expectation that the semantic hierarchy would more accurately define certain text features; and/or because doing so merely combines prior art elements according to known methods to yield predictable results. The rejection of method claim 5 above applies mutatis mutandis to the corresponding limitations of system claim 12 and non-transitory storage medium claim 20 while noting that the rejection above cites to both system and non-transitory storage medium disclosures. Claims 12 and 20 are mapped below for clarity of the record and to specify any new limitations not included in claim 5.
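Each of the three claim 5 computations reduces to a small pass over a token's characters. The sketch below shows one plausible reading; the weight coefficients are invented for illustration (Kolotienko's quoted passage ties weights to semantic classes, not to these counts), and the base-10 logarithm follows the Bordawekar bucketing quote.

    import math

    def character_counts(entity):
        """Counts of alphabetic, numeric, and special characters."""
        alpha = sum(c.isalpha() for c in entity)
        num = sum(c.isdigit() for c in entity)
        return alpha, num, len(entity) - alpha - num

    def custom_weight(entity, w_alpha=0.2, w_num=0.5, w_special=0.3):
        """Weight a non-semantic entity by its character-class counts; the
        coefficients are hypothetical, not taken from any cited reference."""
        alpha, num, special = character_counts(entity)
        return (w_alpha * alpha + w_num * num + w_special * special) / len(entity)

    def longest_digit_run(s):
        """Length of the longest run of consecutive numeric characters."""
        run = best = 0
        for c in s:
            run = run + 1 if c.isdigit() else 0
            best = max(best, run)
        return best

    def consecutive_digits_by_half(entity):
        """Consecutive numeric characters in the first and second halves."""
        mid = len(entity) // 2
        return longest_digit_run(entity[:mid]), longest_digit_run(entity[mid:])

    def log_value(numeric_entity):
        """Base-10 logarithm of a numeric entity, as in the Bordawekar quote."""
        return math.log10(float(numeric_entity))

    entity = "INV-20230042"
    print(custom_weight(entity))               # 0.408...
    print(consecutive_digits_by_half(entity))  # (2, 6)
    print(log_value("1499.00"))                # 3.175...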
Claim 12

Regarding Claim 12, Theriappan et al. and Connors et al. teach the system of claim 11 as noted above. Theriappan et al. and Connors et al. do not explicitly teach all of determination of a custom weight for each of the one or more non-semantic entities based on several alphabetic characters, a number of numeric characters and a number of special characters; determination of a plurality of consecutive numeric characters present in a first half or a second half of each of the one or more non-semantic entities; and determining a logarithmic value of each of the numeric entities. However, Kolotienko et al. teach determination of a custom weight for each of the one or more non-semantic entities based on several alphabetic characters, a number of numeric characters and a number of special characters ("The hierarchy of the semantic classes may be reflected by associating certain attribute values (which may be thought of as weight coefficients reflecting the relationship of the particular semantic class to a certain text feature) with each semantic class along a certain line of ancestry," par. 28). Rimchala et al. teach determination of a plurality of consecutive numeric characters present in a first half or a second half of each of the one or more non-semantic entities ("Tokens that are adjacent or consecutive are aggregated based on one or more rules. The rules may specify a number, type or set of tokens that are grouped based on the corresponding token labels and relative locations to each other," par. 70). Bordawekar teaches determining a logarithmic value of each of the numeric entities ("the partitioned set of buckets are assigned cluster identifications in the set of semantic clusters 420 (FIG. 4). In an example process, the numbers are first assigned to buckets based on their base 10 log values," par. 49). Theriappan et al., Connors et al., Kolotienko et al., Rimchala et al., and Bordawekar are combined as per claim 5.

Claim 20

Regarding Claim 20, Theriappan et al. and Connors et al. teach the non-transitory computer-readable medium of claim 19 as noted above. Theriappan et al. and Connors et al. do not explicitly teach all of determining a custom weight for each of the one or more non-semantic entities based on a number of alphabetic characters, a number of numeric characters and a number of special characters; determining a plurality of consecutive numeric characters present in a first half or a second half of each of the one or more non-semantic entities; and determining a logarithmic value of each of the numeric entities. However, Kolotienko et al. teach determining a custom weight for each of the one or more non-semantic entities based on a number of alphabetic characters, a number of numeric characters and a number of special characters ("The hierarchy of the semantic classes may be reflected by associating certain attribute values (which may be thought of as weight coefficients reflecting the relationship of the particular semantic class to a certain text feature) with each semantic class along a certain line of ancestry," par. 28). Rimchala et al. teach determining a plurality of consecutive numeric characters present in a first half or a second half of each of the one or more non-semantic entities ("Tokens that are adjacent or consecutive are aggregated based on one or more rules. The rules may specify a number, type or set of tokens that are grouped based on the corresponding token labels and relative locations to each other," par. 70).
Claim Rejections - 35 USC § 103

Claims 6 and 13 are rejected under 35 U.S.C. 103 as obvious over US Patent Publication 2023/0267273 A1 (Theriappan et al.) and US Patent 11227183 B1 (Connors et al.) in view of US Patent Publication 2018/0181646 A1 (Balasa et al.).

Claim 6

Regarding Claim 6, Theriappan et al. and Connors et al. teach the method of claim 4 as noted above. Theriappan et al. teach the non-semantic data ("the server system 106 with access to the database 108 is configured to automatically extract the structured data (and/or the semantic and non-sematic information)," par. 32). Theriappan et al. and Connors et al. do not explicitly teach determining, by the processor, a percentage value of numeric characters, a percentage value of alphabetic characters, and a percentage value of special characters. However, Balasa et al. teach determining, by the processor, a percentage value of numeric characters, a percentage value of alphabetic characters, and a percentage value of special characters ("According to an embodiment of the invention, one of the soft matching technique is partial string matching technique 420. For instance, if one part of a string is exactly matched with another string or part of another string then two strings may be partially matched. The part of the string may be words. The words of a string may be split by a space or other special characters. For instance, in two strings viz. String 1 and String 2, String 1 words are compared one by one and with string 2 words. Based on how many percent each word of a string1 is exactly or likely same to another word of a string 2 or in string 2 itself, a weight to that word of the string 1 is assigned," par. 37).

Therefore, taking the teachings of Theriappan et al., Connors et al., and Balasa et al. as a whole, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify the entity extracting methods taught by Theriappan et al. to use the document and character analyzing methods taught by Connors et al. and the character percentage determination methods taught by Balasa et al. The suggestion/motivation for doing so would have been that "[t]he soft matching technique is used to generate one or more clusters of the grouped data and a plurality of relationship scores among the clusters. The soft match is a possibilistic match (rather than probabilistic match). … There are several soft match techniques available. Each one has a certain level of accuracy. Hence, to obtain exhaustive and accurate results one or more techniques are combined to determine the identity relationships," as noted by the Balasa et al. disclosure in paragraph [35]. The combination is further motivated because it would predictably yield higher accuracy, there being a reasonable expectation that the soft matching techniques would be used to pair any character or string with another, and/or because doing so merely combines prior art elements according to known methods to yield predictable results.
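The character-percentage limitation of claim 6 reduces to simple arithmetic over the same three character classes. Below is a minimal Python sketch, with hypothetical names, of one way to compute those percentages; it illustrates the claim language only and is not Balasa et al.'s soft-matching technique.

    # Illustrative sketch only; names are hypothetical.
    def char_percentages(token: str) -> dict[str, float]:
        """Percentages of numeric, alphabetic, and special characters."""
        n = len(token)
        if n == 0:                                # guard against empty tokens
            return {"numeric": 0.0, "alphabetic": 0.0, "special": 0.0}
        num = sum(c.isdigit() for c in token)
        alpha = sum(c.isalpha() for c in token)
        special = n - num - alpha                 # everything else
        return {
            "numeric": 100 * num / n,
            "alphabetic": 100 * alpha / n,
            "special": 100 * special / n,
        }

    print(char_percentages("AB-1234/56"))
    # {'numeric': 60.0, 'alphabetic': 20.0, 'special': 20.0}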
The rejection of method claim 6 above applies mutatis mutandis to the corresponding limitations of system claim 13, while noting that the rejection above cites system disclosures. Claim 13 is mapped below for clarity of the record and to specify any new limitations not included in claim 6.

Claim 13

Regarding Claim 13, Theriappan et al. and Connors et al. teach the system of claim 11 as noted above. Theriappan et al. teach the non-semantic data ("the server system 106 with access to the database 108 is configured to automatically extract the structured data (and/or the semantic and non-sematic information)," par. 32). Theriappan et al. and Connors et al. do not explicitly teach determination of a percentage value of numeric characters, a percentage value of alphabetic characters, and a percentage value of special characters. However, Balasa et al. teach determination of a percentage value of numeric characters, a percentage value of alphabetic characters, and a percentage value of special characters ("According to an embodiment of the invention, one of the soft matching technique is partial string matching technique 420. For instance, if one part of a string is exactly matched with another string or part of another string then two strings may be partially matched. The part of the string may be words. The words of a string may be split by a space or other special characters. For instance, in two strings viz. String 1 and String 2, String 1 words are compared one by one and with string 2 words. Based on how many percent each word of a string1 is exactly or likely same to another word of a string 2 or in string 2 itself, a weight to that word of the string 1 is assigned," par. 37). Theriappan et al., Connors et al., and Balasa et al. are combined as per claim 6.

Claim Rejections - 35 USC § 103

Claim 9 is rejected under 35 U.S.C. 103 as obvious over US Patent Publication 2023/0267273 A1 (Theriappan et al.) and US Patent 11227183 B1 (Connors et al.) in view of US Patent Publication 2022/0253871 A1 (Miller et al.).

Claim 9

Regarding Claim 9, Theriappan et al. and Connors et al. teach the method of claim 1 as noted above. Theriappan et al. and Connors et al. do not explicitly teach wherein the plurality of labels are determined based on the list of plurality of surrounding unigram semantic entities, bigram semantic entities and trigram semantic entities corresponding to the plurality of predefined non-semantic entities. However, Miller et al. teach wherein the plurality of labels are determined based on the list of plurality of surrounding unigram semantic entities, bigram semantic entities and trigram semantic entities corresponding to the plurality of predefined non-semantic entities ("Initiation of interpretation of data in Input or DSs generally can comprise, e.g., tokenization as discussed above, at various levels (e.g., a single term/delimiter level, a bigram level, a trigram level, or higher Ngram level, etc., often in combination with testing such different level(s) against expected syntax, common expressions, meaning/NER, etc.) (e.g., identifying "New York" as a particular state/city by bigram tokenization, versus evaluating the data as "New" (possibly disregarded) and "York" at an individual token interpretation level)," par. 277).

Therefore, taking the teachings of Theriappan et al., Connors et al., and Miller et al. as a whole, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify the entity extracting methods taught by Theriappan et al. to use the document and character analyzing methods taught by Connors et al. and the Ngram labeling methods taught by Miller et al. The suggestion/motivation for doing so would have been that "[t]he use of subword token method(s) in NLP can aid in detectably reducing out of vocabulary (OOV) interpretation problem(s). NLP processes also can be employed in Input processes, such as receipt of instructions, responses to questions, etc., made by an IM," as noted by the Miller et al. disclosure in paragraph [277]. The combination is further motivated because it would predictably yield higher accuracy, there being a reasonable expectation that the methods mentioned would reduce room for error in the processed document, and/or because doing so merely combines prior art elements according to known methods to yield predictable results.
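To make the n-gram context limitation of claim 9 concrete, the sketch below collects the unigram, bigram, and trigram windows that precede a non-semantic token, which is the kind of surrounding context the claim uses to pick a label. Names and the sample sentence are hypothetical; this is not Miller et al.'s implementation.

    # Illustrative sketch only; names and sample data are hypothetical.
    def surrounding_ngrams(tokens: list[str], idx: int) -> dict[str, str]:
        """The unigram, bigram, and trigram windows before tokens[idx]."""
        left = tokens[max(0, idx - 3):idx]
        return {
            "unigram": " ".join(left[-1:]),
            "bigram": " ".join(left[-2:]),
            "trigram": " ".join(left[-3:]),
        }

    tokens = ["Invoice", "Number", ":", "INV-20240517"]
    print(surrounding_ngrams(tokens, 3))
    # {'unigram': ':', 'bigram': 'Number :', 'trigram': 'Invoice Number :'}

A label such as "invoice_number" could then be assigned by matching these context n-grams against a predefined list of surrounding semantic entities.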
References Cited

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. US Patent Publication 2023/0351115 A1 to Zeng et al. discloses a semantic token processing and extraction method applying a trained model.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to KARSTEN F LANTZ, whose telephone number is (571) 272-4564. The examiner can normally be reached Monday-Friday, 8:00-4:00.

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Ms. Jennifer Mehmood, can be reached at 571-272-2976. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (in USA or Canada) or 571-272-1000.

/Karsten F. Lantz/
Examiner, Art Unit 2664
Date: 1/14/2026

/JENNIFER MEHMOOD/
Supervisory Patent Examiner, Art Unit 2664

Prosecution Timeline

Dec 19, 2023: Application Filed
Jan 15, 2026: Non-Final Rejection — §103 (current)

Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: Favorable
Median Time to Grant: 2y 9m
PTA Risk: Low

Based on 0 resolved cases by this examiner; grant probability derived from career allow rate.
