DETAILED ACTION
Status of Claims
This action is in reply to the amendment filed on 12/17/2025.
Claims 1-11, 13-20, 22 and 27 have been amended.
Claims 12 and 25 have been cancelled.
Claims 1-11, 13-24 and 26-28 are currently pending and have been examined.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 12/17/2025 has been entered.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-11, 13-24 and 26-28 are rejected under 35 U.S.C. §101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1:
Claims 1-11 and 13 are directed to a non-transitory computer readable medium (i.e., a manufacture), claims 14-24 and 26 are directed to a method (i.e., a process), and claims 27-28 are directed to a system (i.e., a machine). Accordingly, claims 1-11, 13-24 and 26-28 are all within at least one of the four statutory categories.
Step 2A - Prong One:
An “abstract idea” judicial exception is subject matter that falls within at least one of the following groupings: a) mathematical concepts, b) certain methods of organizing human activity, and/or c) mental processes.
Representative independent claim 1 includes limitations that recite an abstract idea. Note that independent claim 1 is the computer readable medium claim, while claim 14 is the method claim and claim 27 is the system claim.
Specifically, independent claim 1 recites:
A non-transitory computer readable medium comprising instructions which, when executed by one or more hardware processors, cause performance of operations comprising:
identifying, in one or more sets of proprietary codes, a plurality of unmapped proprietary codes for mapping to standard codes, each proprietary code in the one or more sets of proprietary codes comprising a code-specific dataset that does not include any patient-specific datasets;
generating a plurality of vector embeddings corresponding to the plurality of unmapped proprietary codes, wherein generating the plurality of vector embeddings comprises:
generating a first vector embedding for a first unmapped proprietary code of the plurality of unmapped proprietary codes at least by:
applying a vector embedding function to a first code-specific dataset corresponding to the first unmapped proprietary code;
generating a target vector embedding for a target standard code at least by:
applying the vector embedding function to a dataset of code-specific datasets corresponding to mapped proprietary codes mapped to the target standard code to generate a target set of one or more vector embeddings;
computing a similarity measure for the target vector embedding and each of the plurality of vector embeddings to generate a plurality of similarity values, the plurality of similarity values comprises:
a first similarity measure for the target vector embedding and the first vector embedding;
based at least on the first similarity measure, mapping the first unmapped proprietary code to the target standard code as a semantic match for the first unmapped proprietary code;
receiving a record associated with the first unmapped proprietary code; and
associating the target standard code with the record based on the mapping of the first unmapped proprietary code to the target standard code.
The Examiner submits that the foregoing underlined limitations constitute: (a) “mathematical concepts” because generating vector embeddings, applying a vector embedding function to a dataset and computing a similarity measure for a target vector embedding are all mathematical relationships. Furthermore, these limitations constitute (b) “certain methods of organizing human activity” because identifying codes used by organizations or vendors that have not been mapped to a standard code, for a concept that might not otherwise have an industry standardized code to belong to, and associating the target standard code with the record based on the mapping of the first unmapped proprietary code to the target standard code are ways of recommending candidate codes, which amount to managing human behavior/interactions between people. The foregoing underlined limitations relate to claim 1 and, similarly, to claims 14 and 27.
Accordingly, the claim describes at least one abstract idea.
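For orientation only, the sequence of operations recited in representative claim 1 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the Applicant's implementation: the bag-of-words `embed` function, the summed target embedding, the `threshold` value, and all code names and descriptions are hypothetical stand-ins, since the claim does not specify a particular embedding function or decision rule.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding function (assumption): lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def target_embedding(mapped_descriptions):
    """Combine embeddings of already-mapped codes into one target embedding."""
    total = Counter()
    for description in mapped_descriptions:
        total.update(embed(description))
    return total


def map_codes(unmapped, target_code, mapped_descriptions, threshold=0.5):
    """Map each unmapped code whose similarity to the target meets the threshold."""
    target = target_embedding(mapped_descriptions)
    return {
        code: target_code
        for code, description in unmapped.items()
        if cosine(embed(description), target) >= threshold  # hypothetical rule
    }


def associate(record, mapping):
    """Attach the mapped standard code (or None) to an incoming record."""
    return {**record, "standard_code": mapping.get(record["code"])}


# Hypothetical code-specific datasets; no patient-specific data is involved.
unmapped = {"LAB_001": "serum glucose level", "LAB_002": "chest x ray"}
mapping = map_codes(unmapped, "STD-GLUCOSE", ["glucose serum", "glucose level blood"])
record = associate({"code": "LAB_001", "value": 5.4}, mapping)
```

In this toy example only `LAB_001` shares vocabulary with the descriptions already mapped to the target standard code, so only that code is mapped and the incoming record is tagged accordingly.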
In relation to claims 2-11, 13, 15-24, 26 and 28, these claims merely recite determining steps such as:
claims 2 & 15 - generating the first vector embedding comprises: applying the vector embedding function to the dataset of the first unmapped proprietary code to generate a first set of vector embeddings; and generating the first vector embedding based on the first set of vector embeddings;
claims 3 & 16 - applying the vector embedding function to the dataset of mapped proprietary codes mapped to the target standard code to generate a target set of vector embeddings; and generating the target vector embedding based on the first target set of vector embeddings;
claims 4 & 17 - training the machine learning model based on training datasets to compute vector embeddings from mapped proprietary codes, wherein particular training data, of the training datasets, comprises: one or more historical mapped proprietary codes; a vector embedding corresponding to the historical mapped proprietary codes; wherein applying the vector embedding function to the dataset of the first unmapped proprietary code comprises applying the machine learning model to the dataset of the first unmapped proprietary code; receiving feedback based on an accuracy of results generated by applying the vector embedding function; and retraining the machine learning model based on the feedback;
claims 5 & 18 - generating a second vector embedding for a second unmapped proprietary code of the plurality of unmapped proprietary codes at least by: applying the vector embedding function to a dataset of the second unmapped proprietary code; wherein the plurality of similarity values further comprises: a second similarity measure for the target vector embedding and the second vector embedding; wherein the operations further comprise: based at least on the second similarity measure, refraining from presenting the second unmapped proprietary code as any candidate unmapped proprietary code for mapping to the target standard code;
claims 6 & 19 - aggregating the dataset of the first unmapped proprietary code into an aggregated data record; identifying a plurality of tokens based on the aggregated data record; and applying the machine learning model to each token of the plurality of tokens to generate the first set of vector embeddings;
claims 7 & 20 - applying the vector embedding function to the dataset of the first unmapped proprietary code comprises applying the vector embedding function to an aggregated set of text that is generated by aggregating text corresponding to each dataset of the first unmapped proprietary code;
claims 8 & 21 - the first similarity measure comprises a weighted cosine similarity measure for the target vector embedding and the first vector embedding;
claims 9 & 22 - prior to applying the vector embedding function to the first unmapped proprietary code, pre-processing the dataset of the first unmapped proprietary code at least by: (a) converting text data into lowercase, (b) retaining numeric tokens, (c) handling special characters, (d) removing unwanted text from event set hierarchy, and (e) custom reprocessing for synonyms, abbreviations, and short hands;
claims 10 & 23 - identifying “N” highest similarity values of the plurality of similarity values; and presenting unmapped proprietary codes, mapped to vector embeddings that correspond to the “N” highest similarity values, as candidate unmapped proprietary codes for mapping to the target standard code;
claims 11 & 24 - identifying a subset of similarity values, of the plurality of similarity values, that meet a threshold similarity value; and presenting unmapped proprietary codes, mapped to vector embeddings that correspond to the subset of similarity values, as candidate unmapped proprietary codes for mapping to the target standard code;
claims 13 & 26 - selecting the first unmapped proprietary code for mapping to the target standard code; and
claim 28 - presenting the first unmapped proprietary code as a candidate unmapped proprietary code for mapping to the target standard code.
Step 2A - Prong Two:
Regarding Prong Two of Step 2A, it must be determined whether the claim as a whole integrates the abstract idea into a practical application. As noted, it must be determined whether any additional elements in the claim beyond the abstract idea integrate the exception into a practical application in a manner that imposes a meaningful limit on the judicial exception. The courts have indicated that additional elements merely using a computer to implement an abstract idea, adding insignificant extra solution activity, or generally linking use of a judicial exception to a particular technological environment or field of use do not integrate a judicial exception into a “practical application.”
The limitations of claims 1, 14 and 27, as drafted, recite a process that, under its broadest reasonable interpretation, covers performance of the limitations as mathematical operations for mapping a candidate, but for the recitation of generic computer components. That is, other than reciting a non-transitory computer readable medium comprising instructions executed by one or more hardware processors to perform the limitations, nothing in the claim elements precludes the steps from practically being performed mathematically. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation within a biomedical environment, done mathematically for mapping a candidate, but for the recitation of generic computer components, then it falls within the “mathematical concepts” and “certain methods of organizing human activity” groupings of abstract ideas. Accordingly, the claims recite an abstract idea.
The judicial exception is not integrated into a practical application. In particular, the non-transitory computer readable medium comprising instructions executed by one or more hardware processors is recited at a high level of generality (i.e., as generic computer components performing generic computer functions of receiving data/inputs, determining and providing data) such that it amounts to no more than mere instructions to apply the exception using the generic computer components.
Regarding the additional limitations “a plurality of unmapped proprietary codes for mapping to standard codes”, “a plurality of vector embeddings”, “a first vector embedding”, “target standard code”, and “a machine learning model”, the Examiner submits that these additional limitations amount to merely using a computer to perform the at least one abstract idea (see MPEP § 2106.05(f)). Regarding the additional limitations “dataset” and “training datasets ……”, the Examiner submits that these additional limitations merely add insignificant pre-solution activity (data gathering; selecting data to be manipulated) to the at least one abstract idea (see MPEP § 2106.05(g)).
Thus, taken alone, the additional elements do not amount to significantly more than the above identified judicial exception (the abstract idea). Looking at the limitations as an ordered combination adds nothing that is not already present when looking at the elements taken individually. For instance, there is no indication that the additional elements, when considered as a whole, reflect an improvement in the functioning of a computer or an improvement to another technology or technical field, apply or use the above-noted judicial exception with a particular machine or manufacture that is integral to the claim, effect a transformation or reduction of a particular article to a different state or thing, or apply or use the judicial exception in some meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort designed to monopolize the exception (see MPEP §2106.05). Their collective functions merely provide conventional computer implementation.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional elements amount to no more than mere instructions to apply the exception using generic computer components. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claims are not patent eligible.
Step 2B:
Regarding Step 2B, in representative independent claim 1, regarding the additional limitations of the non-transitory computer readable medium comprising instructions executed by one or more hardware processors, the Examiner submits that these limitations amount to merely using a computer to perform the at least one abstract idea (see MPEP § 2106.05(f)).
Thus, representative independent claim 1 and analogous independent claims 14 and 27 do not include additional elements (considered both individually and as an ordered combination) that are sufficient to amount to significantly more than the judicial exception for the same reasons as those discussed above with respect to determining that the claim does not integrate the abstract idea into a practical application.
The dependent claims do not include additional elements (considered both individually and as an ordered combination) that are sufficient to amount to significantly more than the judicial exception for the same reasons discussed above with respect to determining that the dependent claims do not integrate the at least one abstract idea into a practical application.
Therefore, claims 1-11, 13-24 and 26-28 are ineligible under 35 USC §101.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-3, 5-11, 13-16, 18-24 and 26-28 are rejected under 35 U.S.C. 103 as being unpatentable over Velez (US 11,488,713 B2) in view of Wendell (US 2023/0039937 A1).
Claim 1:
Velez discloses a non-transitory computer readable medium comprising instructions which, when executed by one or more hardware processors, cause performance of operations (See column 15, lines 16-26 processor storage media and computer logic instructions executable by a programmed processor.) comprising:
identifying, in one or more sets of proprietary codes, a plurality of unmapped proprietary codes for mapping to standard codes, each proprietary code in the one or more sets of proprietary codes comprising a code-specific dataset that does not include any patient-specific datasets (With code-specific dataset not including any patient-specific datasets as industry/customer specific data, see conceptual clinician expertise (column 5, lines 6-20 and 38-48) and unmapped proprietary codes as codes used by specific organizations or vendors that have not been mapped to a standard code, see column 4, line 54 to column 5, line 5 and column 5, lines 49-63 where the Unified Medical Language System (UMLS) brings together health and biomedical vocabularies and standards including ICD, SNOMED, LOINC, RXNORM, and CPT with custom mapping. Also, see Fig. 1A Mapping Ontologies 126 mentioned in column 6, lines 19-33 and [column 13, lines 1-19] import large samples of aggregated raw structured and unstructured EMR patient and employs a data processing “pipeline” to normalize/harmonize raw EMR data to a standard (e.g. map a medication name or number used in a proprietary EMR to a concept in the RXNORM medication ontology.)),
generating a plurality of vector embeddings corresponding to the plurality of unmapped proprietary codes (See Fig. 9, column 13, lines 1-19 where raw unstructured EMR data serves as unmapped code.), wherein generating the plurality of vector embeddings comprises:
generating a first vector embedding for a first unmapped proprietary code of the plurality of unmapped proprietary codes (See [column 13, line 64 to column 14, line 50] Word and phrases contained in specific text notes are mapped to vectors (embeddings) using well-established unsupervised neural networks technologies (word2vec/Glove). The vectorized training data combined with automatically generated ontologically-derived labels is further used in a supervised Machine Learning setting to generate a document classification predictive model.) at least by:
applying a vector embedding function to a first code-specific dataset corresponding to the first unmapped proprietary code (Besides the EMR information model unique codes and value sets in column 9, lines 11-23, see column 14, lines 28-50 training dataset.);
generating a target vector embedding for a target standard code at least by:
applying the vector embedding function to a dataset of code-specific datasets corresponding to mapped proprietary codes mapped to the target standard code to generate a target set of one or more vector embeddings (See Fig. 9 where ‘Target feature text classifier, with historical data performance’ 918 serves as a target set of vector embeddings mentioned in [column 13, line 64 to column 14, line 50] The vectorized training data combined with automatically generated ontologically-derived labels is further used in a supervised Machine Learning setting to generate a document classification predictive model using semantic characterizations of text features derived from the disease specific ontology. Also, see column 10, lines 35-48.);
mapping as a semantic match for the first unmapped proprietary code;
receiving a record associated with the unmapped proprietary code (With semantic expressivity as indispensable, able to access, exchange, integrate and cooperatively use data or digital health records across diverse care settings and clinical software, see Fig. 2, Fig. 7 using raw EMR data to semantically identify characterize and interpret clinical features (column 3, lines 36-52, column 4, line 57 to column 5, line 5) and [column 16, line 58 to column 17, line 9] A modular design integrates with providers' existing EMR and local rule engine data storage solutions, facilitating interoperability across providers and making our system convenient and adaptable.); and
associating the target standard code with the record based on the mapping of the first unmapped proprietary code to the target standard code (See Fig. 1A using a standard ontology to map ontologies in column 7, lines 1-28, Vmaps in column 3, lines 28-52 and in [column 4, line 57 to column 5, line 5] a set of files and software that brings together many health and biomedical vocabularies and standards to enable interoperability between computer systems at the “raw data” level. The key UMLS knowledgebase used is the Metathesaurus (including concepts from many vocabularies including ICD, SNOMED, LOINC, RXNORM, and CPT), and the Semantic Network consisting of (1) a set of broad subject categories, or semantic types, that provide a consistent categorization of all concepts represented in the UMLS Metathesaurus, and (2) a set of useful and important relationships, or semantic relations, that exist between semantic types.).
Although Velez discloses generating a vector embedding for unmapped proprietary code by applying a vector embedding function to a dataset and generating a target vector embedding for a target standard code mentioned above, Velez does not explicitly teach computing a similarity measure for the target vector embedding and presenting the first unmapped proprietary code as a candidate unmapped proprietary code for mapping to the target standard code based at least on the first similarity measure. Wendell teaches:
computing a similarity measure for the target vector embedding and each of the plurality of vector embeddings to generate a plurality of similarity values, the plurality of similarity values (See Token Vector Similarity equation shown in P0058-P0059, P0062.) comprises:
a first similarity measure for the target vector embedding and the first vector embedding (See P0058-P0060 where ontology synonym vector B serves as the target vector.);
based at least on the first similarity measure the first unmapped proprietary code as the target standard code (With the candidate unmapped proprietary code as recommendations of candidate standard codes, see string match screen in P0059-P0061 and exemplary general rule replacement of term “tumor” with “neoplasm”.).
Therefore, it would have been obvious to one of ordinary skill in the technology of ontology concepts arts before the effective filing date of the claimed invention to modify the software and method of Velez to include computing a similarity measure for the target vector embedding and presenting the first unmapped proprietary code as a candidate unmapped proprietary code for mapping to the target standard code based at least on the first similarity measure as taught by Wendell for curating and indexing large scales of documentation infeasible for humans mentioned in Wendell’s P0003.
Regarding claims 2 and 15, Velez discloses the non-transitory computer readable medium of Claim 1 and the method of Claim 14, wherein generating the first vector embedding comprises:
applying the vector embedding function to the first code-specific dataset of the first unmapped proprietary code to generate a first set of vector embeddings; and generating the first vector embedding based on the first set of vector embeddings,
applying the vector embedding function to the first code-specific dataset corresponding to the first unmapped proprietary code to generate a first set of vector embeddings; and generating the first vector embedding based on the first set of vector embeddings (Taught in column 13, line 64 to column 14, line 50 as generated ontologically-derived labels and generated document classification predictive model.).
Regarding claims 3 and 16, Velez discloses the non-transitory computer readable medium of Claim 1 and the method of Claim 14, wherein generating the target vector embedding comprises:
applying the vector embedding function to the dataset of the code-specific datasets of the mapped proprietary codes mapped to the target standard code to generate a target set of vector embeddings; and generating the target vector embedding based on the target set of vector embeddings,
applying the vector embedding function to the dataset of the code-specific datasets corresponding to the mapped proprietary codes mapped to the target standard code to generate a target set of vector embeddings; and generating the target vector embedding based on the target set of vector embeddings (See Fig. 9 where ‘Target feature text classifier, with historical data performance’ 918 serves as a target set of vector embeddings mentioned in [column 13, line 64 to column 14, line 50] The vectorized training data combined with automatically generated ontologically-derived labels is further used in a supervised Machine Learning setting to generate a document classification predictive model using semantic characterizations of text features derived from the disease specific ontology.).
Regarding claims 5 and 18, although Velez discloses the non-transitory computer readable medium of claim 1, method of claim 14 and wherein generating the plurality of vector embeddings further comprises: generating a second vector embedding for a second unmapped proprietary code of the plurality of unmapped proprietary codes at least by: applying the vector embedding function to a second code-specific dataset corresponding to the second unmapped proprietary code mentioned above, Velez does not explicitly teach a similarity measure for the target vector embedding, presenting the first unmapped proprietary code as a candidate unmapped proprietary code for mapping to the target standard code based at least on the first similarity measure and refraining from presenting unmapped proprietary code as any candidate unmapped proprietary code for mapping to the target standard code. Wendell teaches:
wherein the plurality of similarity values further comprises: a second similarity measure for the target vector embedding and the second vector embedding (See Token Vector Similarity equation shown in P0058-P0059, P0062.);
wherein the operations further comprise: based at least on the second similarity measure (See P0058-P0060 where ontology synonym vector B serves as the target vector.), refraining from presenting the second unmapped proprietary code as any candidate unmapped proprietary code for mapping to the target standard code (See exemplary refraining from presenting in P0057, P0061 Stop words and stop word removals.).
Therefore, it would have been obvious to one of ordinary skill in the technology of ontology concepts arts before the effective filing date of the claimed invention to modify the software and method of Velez to include a similarity measure for the target vector embedding, presenting the first unmapped proprietary code as a candidate unmapped proprietary code for mapping to the target standard code based at least on the first similarity measure and refraining from presenting unmapped proprietary code as any candidate unmapped proprietary code for mapping to the target standard code as taught by Wendell for curating and indexing large scales of documentation infeasible for humans mentioned in Wendell’s P0003.
Regarding claims 6 and 19, although Velez discloses the non-transitory computer readable medium of claim 1 and method of claim 14 mentioned above, Velez does not explicitly teach aggregating the dataset of the first unmapped proprietary code into an aggregated data record, identifying tokens based on the aggregated data record and applying the machine learning model to each token of the plurality of tokens. Wendell teaches:
wherein applying the vector embedding function to the first unmapped proprietary code comprises: aggregating the first code-specific dataset of the first unmapped proprietary code into an aggregated data record; identifying a plurality of tokens based on the aggregated data record; and applying the vector embedding function to each token of the plurality of tokens to generate a first set of vector embeddings,
wherein applying the vector embedding function to the first code-specific dataset corresponding to the first unmapped proprietary code comprises: aggregating the first code-specific dataset corresponding to the first unmapped proprietary code into an aggregated data record; identifying a plurality of tokens based on the aggregated data record; and applying the vector embedding function to each token of the plurality of tokens to generate a first set of vector embeddings (Besides the Token Vector Similarity and token-wise substring of the ontology mentioned in P0058-P0059, P0091 see processing tokens in documents in P0061-P0062, P0177.).
Therefore, it would have been obvious to one of ordinary skill in the technology of ontology concepts arts before the effective filing date of the claimed invention to modify the software and method of Velez to include aggregating the dataset of the first unmapped proprietary code into an aggregated data record, identifying tokens based on the aggregated data record and applying the machine learning model to each token of the plurality of tokens as taught by Wendell for curating and indexing large scales of documentation infeasible for humans mentioned in Wendell’s P0003.
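As an illustration of the aggregate, tokenize and per-token embedding sequence recited in claims 6 and 19, the following Python sketch may help. It is a sketch under stated assumptions only: the hash-based `embed_token` function is a hypothetical stand-in for a trained model, the field names are invented, and the mean pooling in `embed_code` is one assumed way of deriving a single embedding from the set of token embeddings; neither the claims nor the cited references prescribe these choices.

```python
import hashlib

DIM = 8  # hypothetical embedding dimension


def embed_token(token):
    """Deterministic toy token embedding (assumption): bytes of a SHA-256 digest."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:DIM]]


def aggregate(dataset):
    """Aggregate the fields of a code-specific dataset into one text record."""
    return " ".join(str(value) for value in dataset.values())


def tokenize(text):
    """Identify tokens in the aggregated record by lowercasing and splitting."""
    return text.lower().split()


def embed_code(dataset):
    """Embed each token, then mean-pool the set into one vector (assumption)."""
    vectors = [embed_token(token) for token in tokenize(aggregate(dataset))]
    if not vectors:
        return [0.0] * DIM
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Hypothetical code-specific dataset for one unmapped proprietary code.
vector = embed_code({"name": "Serum Glucose", "unit": "mg/dL"})
```

The intermediate list of per-token vectors corresponds to the recited "first set of vector embeddings"; the pooled result is a single vector of length `DIM`.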
Regarding claims 7 and 20, although Velez discloses the non-transitory computer readable medium of claim 1 and method of claim 14 mentioned above, Velez does not explicitly teach an aggregated set of text generated by aggregating text corresponding to a dataset of unmapped proprietary code. Wendell teaches:
wherein applying the vector embedding function to the first code-specific dataset of the first unmapped proprietary code comprises applying the vector embedding function to an aggregated set of text that is generated by aggregating text corresponding to each dataset of the first code-specific dataset of the first unmapped proprietary code,
wherein applying the vector embedding function to the first code-specific dataset corresponding to the first unmapped proprietary code comprises applying the vector embedding function to an aggregated set of text that is generated by aggregating text corresponding to each dataset of the first code-specific dataset corresponding to the first unmapped proprietary code (See aggregate with synergistic effect in P0026 and aggregation of synonyms into a single concept within a preferred ontology in P0174-P0175, P0179 where characters in sentences serve as dataset.).
Therefore, it would have been obvious to one of ordinary skill in the technology of ontology concepts arts before the effective filing date of the claimed invention to modify the software and method of Velez to include an aggregated set of text generated by aggregating text corresponding to a dataset of unmapped proprietary code as taught by Wendell for curating and indexing large scales of documentation infeasible for humans mentioned in Wendell’s P0003.
Regarding claims 8 and 21, although Velez discloses the non-transitory computer readable medium of claim 1 and method of claim 14 mentioned above, Velez does not explicitly teach a weighted cosine similarity measure for the target vector embedding and the first vector embedding. Wendell teaches:
wherein the first similarity measure comprises a weighted cosine similarity measure for the target vector embedding and the first vector embedding (See [P0038] Note that the entity can be at least two, three, or more of an acronym entity, a hyphenated entity, an entity including a Greek letter, an entity that includes a Roman numeral, or an entity that has a high token vector cosine similarity to an ontology concept, where the ontology candidate identifier and the ontology concept are associated with an ontology. Also, see P0062, P0064.).
Therefore, it would have been obvious to one of ordinary skill in the technology of ontology concepts arts before the effective filing date of the claimed invention to modify the software and method of Velez to include a weighted cosine similarity measure for the target vector embedding and the first vector embedding as taught by Wendell for curating and indexing large scales of documentation infeasible for humans mentioned in Wendell’s P0003.
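For reference, one common formulation of a weighted cosine similarity between two equal-length vectors is sketched below in Python. This is an assumed definition for illustration: neither claims 8 and 21 nor the cited paragraphs of Wendell specify how the weights are chosen or applied, and the per-dimension, non-negative weight vector `w` is a hypothetical parameter.

```python
import math


def weighted_cosine(a, b, w):
    """Weighted cosine similarity: sum(w*a*b) / (sqrt(sum(w*a^2)) * sqrt(sum(w*b^2))).

    Assumes non-negative per-dimension weights; returns 0.0 for a zero vector.
    """
    if not (len(a) == len(b) == len(w)):
        raise ValueError("a, b and w must have the same length")
    dot = sum(wi * ai * bi for ai, bi, wi in zip(a, b, w))
    norm_a = math.sqrt(sum(wi * ai * ai for ai, wi in zip(a, w)))
    norm_b = math.sqrt(sum(wi * bi * bi for bi, wi in zip(b, w)))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

With uniform weights the function reduces to the ordinary cosine similarity; setting a weight to zero removes that dimension from the comparison entirely.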
Regarding claims 9 and 22, Velez discloses the non-transitory computer readable medium of claim 1 and the method of claim 14, wherein the operations further comprise: prior to applying the vector embedding function to the first unmapped proprietary code, pre-processing the first code-specific dataset corresponding to the first unmapped proprietary code at least by:
(a) converting text data into lowercase,
(b) retaining numeric tokens,
(c) handling special characters (See semantic characterization in column 3, lines 17-26 and lines 44-52 shown in Fig. 2.),
(d) removing unwanted text from event set hierarchy, and
(e) custom reprocessing for synonyms, abbreviations, and short hands.
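For illustration only, the pre-processing steps (a) through (e) listed above can be sketched as follows. The function name, the hierarchy phrases to remove, and the abbreviation table are hypothetical examples; they are not taken from the claims, the specification, or Velez.

```python
import re
from typing import Dict, Iterable

# Hypothetical examples of hierarchy text and shorthand expansions.
HIERARCHY_NOISE = ("all results section",)
EXPANSIONS = {"bp": "blood pressure", "hr": "heart rate"}


def preprocess(text: str,
               hierarchy_noise: Iterable[str] = HIERARCHY_NOISE,
               expansions: Dict[str, str] = EXPANSIONS) -> str:
    text = text.lower()                              # (a) lowercase
    for phrase in hierarchy_noise:                   # (d) drop unwanted hierarchy text
        text = text.replace(phrase, " ")
    # (c) special characters become spaces; digits are kept, so (b) numeric
    # tokens are retained.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    tokens = text.split()
    tokens = [expansions.get(t, t) for t in tokens]  # (e) expand abbreviations
    return " ".join(tokens)
```

Under these assumptions, `preprocess("All Results Section > BP, Systolic #2")` yields `"blood pressure systolic 2"`.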
Regarding claims 10 and 23, although Velez discloses the non-transitory computer readable medium of claim 1 and method of claim 14 mentioned above, Velez does not explicitly teach “N” highest similarity values, as candidate unmapped proprietary codes for mapping to the target standard code. Wendell teaches:
wherein the operations further comprise: identifying “N” highest similarity values of the plurality of similarity values; and presenting unmapped proprietary codes, mapped to vector embeddings that correspond to the “N” highest similarity values, as candidate unmapped proprietary codes for mapping to the target standard code (Taught in P0012-P0013, P0187-P0189 as nodes levels. Also, see an upper bound of an n-gram size in P0029.).
Therefore, it would have been obvious to one of ordinary skill in the ontology concepts arts before the effective filing date of the claimed invention to modify the software and method of Velez to include “N” highest similarity values, as candidate unmapped proprietary codes for mapping to the target standard code as taught by Wendell for curating and indexing large scales of documentation infeasible for humans, as mentioned in Wendell’s P0003.
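For illustration only, identifying the “N” highest similarity values and the corresponding candidate codes, as recited in claims 10 and 23, can be sketched as follows. The function name and the dictionary layout (code identifier mapped to similarity value) are assumptions made for this sketch.

```python
import heapq
from typing import Dict, List, Tuple


def top_n_candidates(similarities: Dict[str, float],
                     n: int) -> List[Tuple[str, float]]:
    """Return the n (code, similarity) pairs with the highest similarity,
    ordered from most to least similar."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return heapq.nlargest(n, similarities.items(), key=lambda item: item[1])
```

For example, with similarity values of 0.9, 0.2 and 0.7 for three hypothetical codes A, B and C, and N = 2, the candidates presented would be A and C.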
Regarding claims 11 and 24, although Velez discloses the non-transitory computer readable medium of claim 1 and method of claim 14 mentioned above, Velez does not explicitly teach a similarity measure for the target vector embedding and presenting the first unmapped proprietary code as a candidate unmapped proprietary code for mapping to the target standard code based at least on the first similarity measure. Wendell teaches:
wherein the operations further comprise: identifying a subset of similarity values, of the plurality of similarity values, that meet a threshold similarity values; and presenting unmapped proprietary codes, mapped to vector embeddings that correspond to the subset of similarity values, as candidate unmapped proprietary codes for mapping to the target standard code (See the threshold for an upper bound of an n-gram size in P0029 and Token Vector Similarity equation shown in P0058-P0059, P0062.).
Therefore, it would have been obvious to one of ordinary skill in the ontology concepts arts before the effective filing date of the claimed invention to modify the software and method of Velez to include a similarity measure for the target vector embedding and presenting the first unmapped proprietary code as a candidate unmapped proprietary code for mapping to the target standard code based at least on the first similarity measure as taught by Wendell for curating and indexing large scales of documentation infeasible for humans, as mentioned in Wendell’s P0003.
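For illustration only, identifying the subset of similarity values that meet a threshold, as recited in claims 11 and 24, can be sketched as follows. The function name is hypothetical, and reading “meet” as greater than or equal to the threshold is an assumption of this sketch.

```python
from typing import Dict, List, Tuple


def candidates_meeting_threshold(similarities: Dict[str, float],
                                 threshold: float) -> List[Tuple[str, float]]:
    """Return (code, similarity) pairs whose similarity is at least the
    threshold, ordered from most to least similar."""
    kept = [(code, s) for code, s in similarities.items() if s >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)
```

Unlike a fixed “N” cutoff, the number of candidates returned here depends on how many similarity values clear the threshold, and may be zero.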
Regarding claim 13, Velez discloses the non-transitory computer readable medium of Claim 1, wherein the operations further comprise: selecting the first unmapped proprietary code for mapping to the target standard code (See Fig. 9 where ‘Target feature text classifier, with historical data performance’ 918 serves as a target set of vector embeddings mentioned in [column 13, line 64 to column 14, line 50] The vectorized training data combined with automatically generated ontologically-derived labels is further used in a supervised Machine Learning setting to generate a document classification predictive model using semantic characterizations of text features derived from the disease specific ontology.).
Claim 14:
Velez discloses a method comprising:
identifying, in one or more sets of proprietary codes, a plurality of unmapped proprietary codes for mapping to standard codes, each proprietary code in the one or more sets of proprietary codes comprising a code-specific dataset that does not include any patient-specific datasets (With code-specific dataset not including any patient-specific datasets as industry/customer specific data, see conceptual clinician expertise (column 5, lines 6-20 and 38-48) and unmapped proprietary codes as codes used by specific organizations or vendors that have not been mapped to a standard code, see column 4, line 54 to column 5, line 5 and column 5, lines 49-63 where the Unified Medical Language System (UMLS) brings together health and biomedical vocabularies and standards including ICD, SNOMED, LOINC, RXNORM, and CPT with custom mapping. Also, see Fig. 1A Mapping Ontologies 126 mentioned in column 6, lines 19-33 and [column 13, lines 1-19] import large samples of aggregated raw structured and unstructured EMR patient and employs a data processing “pipeline” to normalize/harmonize raw EMR data to a standard (e.g. map a medication name or number used in a proprietary EMR to a concept in the RXNORM medication ontology).);
generating a plurality of vector embeddings corresponding to the plurality of unmapped proprietary codes (See Fig. 9, column 13, lines 1-19 where raw unstructured EMR data serves as unmapped code.), wherein generating the plurality of vector embeddings comprises:
generating a first vector embedding for a first unmapped proprietary code of the plurality of unmapped proprietary codes (See [column 13, line 64 to column 14, line 50] Word and phrases contained in specific text notes are mapped to vectors (embeddings) using well-established unsupervised neural networks technologies (word2vec/Glove). The vectorized training data combined with automatically generated ontologically-derived labels is further used in a supervised Machine Learning setting to generate a document classification predictive model.) at least by:
applying a vector embedding function to a first code-specific dataset corresponding to the first unmapped proprietary code (Besides the EMR information model unique codes and value sets in column 9, lines 11-23, see column 14, lines 28-50 training dataset.);
generating a target vector embedding for a target standard code at least by:
applying the vector embedding function to a dataset of code-specific datasets corresponding to mapped proprietary codes mapped to the target standard code to generate a target set of one or more vector embeddings (See Fig. 9 where ‘Target feature text classifier, with historical data performance’ 918 serves as a target set of vector embeddings mentioned in [column 13, line 64 to column 14, line 50] The vectorized training data combined with automatically generated ontologically-derived labels is further used in a supervised Machine Learning setting to generate a document classification predictive model using semantic characterizations of text features derived from the disease specific ontology. Also, see column 10, lines 35-48.);
wherein the method is performed by at least one device including a hardware processor (See processors in column 7, line 56 to column 8, line 6, column 9, lines 33-67 for ingesting data and evoking machine learning algorithms.).
receiving a record associated with the first unmapped proprietary code (With semantic expressivity as indispensable, able to access, exchange, integrate and cooperatively use data or digital health records across diverse care settings and clinical software, see Fig. 2, Fig. 7 using raw EMR data to semantically identify characterize and interpret clinical features (column 3, lines 36-52, column 4, line 57 to column 5, line 5) and [column 16, line 58 to column 17, line 9] A modular design integrates with providers' existing EMR and local rule engine data storage solutions, facilitating interoperability across providers and making our system convenient and adaptable.); and
associating the target standard code with the record based on the mapping of the first unmapped proprietary code to the target standard code (See Fig. 1A using a standard ontology to map ontologies in column 7, lines 1-28, Vmaps in column 3, lines 28-52 and in [column 4, line 57 to column 5, line 5] a set of files and software that brings together many health and biomedical vocabularies and standards to enable interoperability between computer systems at the “raw data” level. The key UMLS knowledgebase used is the Metathesaurus (including concepts from many vocabularies including ICD, SNOMED, LOINC, RXNORM, and CPT), and the Semantic Network consisting of (1) a set of broad subject categories, or semantic types, that provide a consistent categorization of all concepts represented in the UMLS Metathesaurus, and (2) a set of useful and important relationships, or semantic relations, that exist between semantic types.).
Although Velez discloses generating a vector embedding for unmapped proprietary code by applying a vector embedding function to a dataset and generating a target vector embedding for a target standard code mentioned above, Velez does not explicitly teach computing a similarity measure for the target vector embedding and presenting the first unmapped proprietary code as a candidate unmapped proprietary code for mapping to the target standard code based at least on the first similarity measure. Wendell teaches:
computing a similarity measure for the target vector embedding and each of the plurality of vector embeddings to generate a plurality of similarity values, the plurality of similarity values (See Token Vector Similarity equation shown in P0058-P0059, P0062.) comprises:
a first similarity measure for the target vector embedding and the first vector embedding (See P0058-P0060 where ontology synonym vector B serves as the target vector.); and
based at least on the first similarity measure, mapping the first unmapped proprietary code to the target standard code as a semantic match for the first unmapped proprietary code (With the candidate unmapped proprietary code as recommendations of candidate standard codes, see string match screen in P0059-P0061 and exemplary general rule replacement of term “tumor” with “neoplasm”.).
Therefore, it would have been obvious to one of ordinary skill in the ontology concepts arts before the effective filing date of the claimed invention to modify the software and method of Velez to include computing a similarity measure for the target vector embedding and presenting the first unmapped proprietary code as a candidate unmapped proprietary code for mapping to the target standard code based at least on the first similarity measure as taught by Wendell for curating and indexing large scales of documentation infeasible for humans, as mentioned in Wendell’s P0003.
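For illustration only, the sequence of limitations recited in claim 14 (embedding unmapped codes, embedding a target standard code from already-mapped codes, computing similarity, mapping, and associating a record) can be sketched as follows. This is not asserted to be Applicant's implementation or the method of Velez or Wendell. The bag-of-words embedding, the averaging of the target set into one target vector, the 0.5 threshold, and all code identifiers and texts are hypothetical stand-ins.

```python
import math
from collections import Counter
from typing import Dict, List


def embed(text: str, vocabulary: List[str]) -> List[float]:
    """Toy bag-of-words embedding (stand-in for a learned embedding function)."""
    counts = Counter(text.lower().split())
    return [float(counts[token]) for token in vocabulary]


def mean_vector(vectors: List[List[float]]) -> List[float]:
    """Average a non-empty set of vectors into one target vector (assumption)."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def map_unmapped_codes(unmapped: Dict[str, str], mapped_texts: List[str],
                       target_code: str, threshold: float = 0.5) -> Dict[str, str]:
    """Map each unmapped code whose similarity to the target meets the threshold."""
    all_texts = list(unmapped.values()) + mapped_texts
    vocabulary = sorted({t for text in all_texts for t in text.lower().split()})
    target = mean_vector([embed(t, vocabulary) for t in mapped_texts])
    return {code: target_code for code, text in unmapped.items()
            if cosine(embed(text, vocabulary), target) >= threshold}


def associate(record: dict, mapping: Dict[str, str]) -> dict:
    """Attach the standard code (or None) to a record carrying a proprietary code."""
    return {**record, "standard_code": mapping.get(record["code"])}


# Hypothetical code-specific text; no patient-specific data is involved.
unmapped = {"LAB123": "serum glucose level", "LAB999": "chest x ray"}
mapping = map_unmapped_codes(unmapped, ["glucose serum", "blood glucose level"], "GLU")
record = associate({"code": "LAB123", "value": 5.4}, mapping)
```

In this toy data only the hypothetical code LAB123 shares vocabulary with the texts already mapped to the hypothetical target code, so only that code is mapped and only its records receive the standard code.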
Claim 27:
Velez discloses a system comprising:
one or more hardware processors;
one or more non-transitory computer-readable media; and
program instructions stored on the one or more non-transitory computer-readable media which, when executed by the one or more hardware processors (See column 15, lines 16-26 processor storage media and computer logic instructions executable by a programmed processor.), cause performance of operations comprising:
identifying, in one or more sets of proprietary codes, a plurality of unmapped proprietary codes for mapping to standard codes (With unmapped proprietary codes as codes used by specific organizations or vendors that have not been mapped to a standard code, see column 4, line 54 to column 5, line 5 and column 5, lines 49-63 where the Unified Medical Language System (UMLS) brings together health and biomedical vocabularies and standards including ICD, SNOMED, LOINC, RXNORM, and CPT with custom mapping. Also, see Fig. 1A Mapping Ontologies 126 mentioned in column 6, lines 19-33 and [column 13, lines 1-19] import large samples of aggregated raw structured and unstructured EMR patient and employs a data processing “pipeline” to normalize/harmonize raw EMR data to a standard (e.g. map a medication name or number used in a proprietary EMR to a concept in the RXNORM medication ontology).), each proprietary code in the one or more sets of proprietary codes comprising a code-specific dataset that does not include any patient-specific datasets (With code-specific dataset not including any patient-specific datasets as industry/customer specific data, see conceptual clinician expertise (column 5, lines 6-20 and 38-48) and unmapped proprietary codes as codes used by specific organizations or vendors that have not been mapped to a standard code, see column 4, line 54 to column 5, line 5 and column 5, lines 49-63 where the Unified Medical Language System (UMLS) brings together health and biomedical vocabularies and standards including ICD, SNOMED, LOINC, RXNORM, and CPT with custom mapping. Also, see Fig. 1A Mapping Ontologies 126 mentioned in column 6, lines 19-33 and [column 13, lines 1-19] import large samples of aggregated raw structured and unstructured EMR patient and employs a data processing “pipeline” to normalize/harmonize raw EMR data to a standard (e.g. map a medication name or number used in a proprietary EMR to a concept in the RXNORM medication ontology).);
generating a plurality of vector embeddings corresponding to the plurality of unmapped proprietary codes (See Fig. 9, column 13, lines 1-19 where raw unstructured EMR data serves as unmapped code.), wherein generating the plurality of vector embeddings comprises:
generating a first vector embedding for a first unmapped proprietary code of the plurality of unmapped proprietary codes (See [column 13, line 64 to column 14, line 50] Word and phrases contained in specific text notes are mapped to vectors (embeddings) using well-established unsupervised neural networks technologies (word2vec/Glove). The vectorized training data combined with automatically generated ontologically-derived labels is further used in a supervised Machine Learning setting to generate a document classification predictive model.) at least by:
applying a vector embedding function to a first code-specific dataset corresponding to the first unmapped proprietary code (Besides the EMR information model unique codes and value sets in column 9, lines 11-23, see column 14, lines 28-50 training dataset.);
generating a target vector embedding for a target standard code at least by:
applying the vector embedding function to a dataset of code-specific datasets corresponding to mapped proprietary codes mapped to the target standard code to generate a target set of one or more vector embeddings (See Fig. 9 where ‘Target feature text classifier, with historical data performance’ 918 serves as a target set of vector embeddings mentioned in [column 13, line 64 to column 14, line 50] The vectorized training data combined with automatically generated ontologically-derived labels is further used in a supervised Machine Learning setting to generate a document classification predictive model using semantic characterizations of text features derived from the disease specific ontology. Also, see column 10, lines 35-48.);
receiving a record associated with the first unmapped proprietary code (With semantic expressivity as indispensable, able to access, exchange, integrate and cooperatively use data or digital health records across diverse care settings and clinical software, see Fig. 2, Fig. 7 using raw EMR data to semantically identify characterize and interpret clinical features (column 3, lines 36-52, column 4, line 57 to column 5, line 5) and [column 16, line 58 to column 17, line 9] A modular design integrates with providers' existing EMR and local rule engine data storage solutions, facilitating interoperability across providers and making our system convenient and adaptable.); and
associating the target standard code with the record based on the mapping of the first unmapped proprietary code to the target standard code (See Fig. 1A using a standard ontology to map ontologies in column 7, lines 1-28, Vmaps in column 3, lines 28-52 and in [column 4, line 57 to column 5, line 5] a set of files and software that brings together many health and biomedical vocabularies and standards to enable interoperability between computer systems at the “raw data” level. The key UMLS knowledgebase used is the Metathesaurus (including concepts from many vocabularies including ICD, SNOMED, LOINC, RXNORM, and CPT), and the Semantic Network consisting of (1) a set of broad subject categories, or semantic types, that provide a consistent categorization of all concepts represented in the UMLS Metathesaurus, and (2) a set of useful and important relationships, or semantic relations, that exist between semantic types.).
Although Velez discloses generating a vector embedding for unmapped proprietary code by applying a vector embedding function to a dataset and generating a target vector embedding for a target standard code mentioned above, Velez does not explicitly teach computing a similarity measure for the target vector embedding and presenting the first unmapped proprietary code as a candidate unmapped proprietary code for mapping to the target standard code based at least on the first similarity measure. Wendell teaches:
computing a similarity measure for the target vector embedding and each of the plurality of vector embeddings to generate a plurality of similarity values, the plurality of similarity values (See Token Vector Similarity equation shown in P0058-P0059, P0062.) comprises:
a first similarity measure for the target vector embedding and the first vector embedding (See P0058-P0060 where ontology synonym vector B serves as the target vector.); and
based at least on the first similarity measure, mapping the first unmapped proprietary code to the target standard code as a semantic match for the first unmapped proprietary code (With the candidate unmapped proprietary code as recommendations of candidate standard codes, see string match screen in P0059-P0061 and exemplary general rule replacement of term “tumor” with “neoplasm”.).
Therefore, it would have been obvious to one of ordinary skill in the ontology concepts arts before the effective filing date of the claimed invention to modify the software and method of Velez to include computing a similarity measure for the target vector embedding and presenting the first unmapped proprietary code as a candidate unmapped proprietary code for mapping to the target standard code based at least on the first similarity measure as taught by Wendell for curating and indexing large scales of documentation infeasible for humans, as mentioned in Wendell’s P0003.
Regarding claim 28, although Velez and Wendell teach the non-transitory computer readable medium of Claim 1 mentioned above, Velez does not explicitly teach presenting unmapped proprietary code as a candidate unmapped proprietary code for mapping to the target standard code. Wendell teaches wherein the operations further comprise: presenting the first unmapped proprietary code as a candidate unmapped proprietary code for mapping to the target standard code (See performing an ontology-specified screen shown in Fig. 1, Steps 106, 108 and 110, see [P0048] the processor selects the ontology candidate identifier based on the first words, symbols, and/or characters, the first relative weights, the letter, symbol, and/or character groups, and the second relative weights. Also, see Fig. 3A, P0070-P0073).
Therefore, it would have been obvious to one of ordinary skill in the ontology concepts arts before the effective filing date of the claimed invention to modify the software and method of Velez to include presenting unmapped proprietary code as a candidate unmapped proprietary code for mapping to the target standard code as taught by Wendell for curating and indexing large scales of documentation infeasible for humans, as mentioned in Wendell’s P0003.
Claims 4 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Velez (US 11,488,713 B2) in view of Wendell (US 2023/0039937 A1) further in view of McCallie (US 11,526,508 B1).
Regarding claims 4 and 17, although Velez and Wendell teach the medium of Claim 1 and the method of Claim 14 and wherein the vector embedding function comprises a machine learning model mentioned above, Velez and Wendell do not explicitly teach training the machine learning model based on training datasets to compute vector embeddings from mapped proprietary codes comprising historical mapped proprietary codes, applying the vector embedding function to the dataset, receiving feedback based on an accuracy of results generated, and retraining the machine learning model based on the feedback. McCallie teaches wherein the operations further comprise:
training the machine learning model based on training datasets to compute vector embeddings from mapped proprietary codes (See machine learning model in column 2, line 59 to column 3, line 19 and column 10, lines 41-49.), wherein particular training data, of the training datasets, comprises: one or more historical mapped proprietary codes (See history of previous words in column 19, lines 8-15.);
a vector embedding corresponding to the historical mapped proprietary codes (See [column 19, lines 8-15] the word embedding vector model is trained using the maximum likelihood (ML) principle to maximize probability of the next word w.sub.t (i.e., “target”) given the previous words h (i.e., “history”) in terms.);
wherein applying the vector embedding function to the dataset of the first unmapped proprietary code comprises applying the machine learning model to the dataset of the first unmapped proprietary code (See data sets in column 18, line 50 to column 19, line 22.);
receiving feedback based on an accuracy of results generated by applying the vector embedding function; and retraining the machine learning model based on the feedback (See [column 18, line 60 to column 19, line 7] A retrained or trained word embedding model is an embedding that receives training feedback after it has received initial training session(s) and is optimized or generated for a specific data set.).
Therefore, it would have been obvious to one of ordinary skill in the vector embedded medical concepts arts before the effective filing date of the claimed invention to modify the software and method of Velez and Wendell to include training the machine learning model based on training datasets to compute vector embeddings from mapped proprietary codes comprising historical mapped proprietary codes, applying the vector embedding function to the dataset, receiving feedback based on an accuracy of results generated, and retraining the machine learning model based on the feedback as taught by McCallie for word embeddings that are indicative of introducing bias or removing bias associated with the vectors, as mentioned in McCallie’s column 1, lines 41-60.
Response to Arguments
Applicant argues that claim 1 does not recite a mathematical concept (See, e.g., Thales Visionix, Inc. v. United States and Ex Parte Desjardins) by generating vector embeddings, applying a vector embedding function to a dataset, and computing a similarity measure for a target vector embedding. See pgs. 12-13 of Remarks – Examiner disagrees.
Unlike Thales Visionix, whose claims were recited with enough specificity to ensure the step one inquiry is meaningful, and unlike Desjardins’ functioning computer improvements, the mapping of a first unmapped proprietary code to a target standard code in the instant claims is done by applying a vector embedding function. An embedding vector is a mathematical representation of an object (like a word, image or sound) as a numerical array, called a vector, in a multi-dimensional space, as similarly described in paragraph 34 of Applicant’s own specification. Also, presenting unmapped proprietary code as a candidate unmapped proprietary code for mapping to the target standard code is a way of recommending candidate code, which is done among individuals within organizations, regions and global populations when the individuals are accessing, exchanging, integrating and cooperatively using healthcare data, as described in paragraph 3 of Applicant’s own specification.
Applicant argues that claim 1 recites features that reflect improvements to the functioning of a computer and/or to another technology. See pgs. 13-14 of Remarks – Examiner disagrees.
No technological improvements to mapping unmapped proprietary code to standard code or to the functioning of the computer itself have been genuinely set forth; the alleged improvements are directed towards improving the abstract idea and not the computer itself. That is, the recited invention may improve facilitating entry (i.e. the abstract idea), but there is no evidence to show that it improves the structural or functional properties of the computer itself, outside of improving the computer specifically for implementing the abstract idea. In fact, identifying, generating a target vector, computing a similarity measure, applying a vector embedding function, mapping unmapped proprietary code and associating target standard code are operations that merely use the computer as a tool to implement the abstract idea (saying “apply it”) and merely use the computer in the manner in which it was designed to be used, i.e., performing generic computer functions.
Furthermore, the means of generating vector embeddings, applying a vector embedding function to a dataset, and computing a similarity measure for a target vector embedding in the claims are well-understood, routine, and conventional. That is, use of vector embedding corresponding to unmapped and mapped healthcare data is well-known in the art, as evidenced by at least McCallie et al. (US 11,526,508 B1) column 2, line 60 to column 3, line 8, Fig. 5A-Fig. 6; Hane et al. (US 10,891,352 B1) particularly linked to medical code in column 1, lines 33-62, column 14, line 65 to column 15, line 25, Figs. 4A-4B; and Godbole et al. (US 2024/0221949 A1) P0064-P0065, Fig. 8-Fig. 9. Further, Velez’s Fig. 2, Fig. 7 use of raw EMR data to semantically identify, characterize and interpret clinical features is taught in column 3, lines 36-52, column 4, line 57 to column 5, line 5 and column 16, line 58 to column 17, line 9. Accordingly, the processes to perform analysis in the invention do not provide “significantly more” than the abstract idea.
Applicant further argues that Velez fails to disclose that patient-specific datasets are specifically excluded from the datasets of the proprietary codes. See pgs. 16-17 of Remarks – Examiner disagrees.
According to Applicant’s specification, industry/customer specific data are the code-specific dataset that does not include any patient-specific datasets. This does not preclude Velez’s normalized codes shown in Fig. 2, column 9, lines 11-23 where Systematized Nomenclature of Medicine (SNOMED) is unique to the medical industry for identifying diagnoses, symptoms and procedures applied to all patients, not to any specific one. Also, see Velez’s Fig. 2, Fig. 7 mapping to coded outcome data such as ICD-9, frequently errored or missing (column 13, lines 1-38), using Vmap mentioned in column 6, lines 5-33.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TERESA S WILLIAMS whose telephone number is (571) 270-5509. The examiner can normally be reached Mon-Fri, 8:30 am - 6:30 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mamon Obeid, can be reached at (571) 270-1813. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/T.S.W./Examiner, Art Unit 3687 03/28/2026
/ALAAELDIN M. ELSHAER/Primary Examiner, Art Unit 3687