DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 01/27/2026 has been entered.
Response to Arguments
Remarks pages 10-11, Applicant contends:
Applicant traverses 103 rejections at least in view of the amendments.
Response:
The amendment to indicate codes in independent claim 1 is not seen as traversing the 103 rejections. Chimmad is used to teach the use of procedure codes and diagnosis codes, and Chimmad also notes comparing such codes; the specific quotations and motivations appear in the previous Office action's 103 rejection of claim 3. Wu, as indicated in the previous Office action's 103 rejection of claim 1, teaches many of the limitations of claim 1. However, as the rejection of claim 3 indicates, Wu does not specify data such as procedure codes or diagnosis codes. The combination of Wu and Chimmad addresses this issue by indicating the use of such data. The codes being a form of data, and Wu already teaching knowledge graphs of data, a combination in which the graphs include a particular kind of data is not seen as unconventional. The argument that Chimmad does not indicate anything about extracting a subgraph (Remarks, page 11) is not seen as convincing, for Wu is the reference relied upon to teach such limitations (as indicated in the previous rejection of claim 1).
As noted in the previous 103 rejection of claim 1, Wu is seen as teaching extracting a subgraph of historical vectors that comprise historical features, such as the elements of [Wu Figure 9 906]. The amendments indicating the features as codes point to the elements of claim 3, such as the diagnosis and procedure codes, being an example of the types of data that can be referenced. Because the elements relied upon in the rejections of the previously recited claims still teach the claim limitations, applicant's arguments regarding 103 are not seen as convincing.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-2, 4-8, 10, and 13-20 are rejected under 35 U.S.C. 103 as being unpatentable over Wu et al. (US 20220108188 A1), referred to as Wu in this document, in view of Li et al. (US 10606846 B2), referred to as Li in this document, further in view of Yang et al. ("An Efficient Parallel Keyword Search Engine on Knowledge Graphs"), referred to as Yang in this document, and further in view of Chimmad ("Assessment of Healthcare Claims Rejection Risk Using Machine Learning"), referred to as Chimmad in this document.
Regarding Claim 1:
Wu teaches:
[media_image1.png: Greyscale image]
A computer-implemented method comprising: receiving, by one or more processors of a recommendation system, a plurality of source vectors via a plurality of first Application Programming Interface (API) calls from a plurality of client computing entities, wherein a source vector of the plurality of source vectors comprises a plurality of source features comprising at least a first source code and a second source code describing an input case of a plurality of input cases respectively corresponding to the plurality of source vectors
Figure 9 notes receiving a natural language question from a system attached to a processor [Wu Figure 9 902]. [A computer-implemented method comprising: receiving, by one or more processors of a recommendation system, a plurality of source vectors… wherein a source vector of the plurality of source vectors comprises a plurality of source features comprising at least a first source code and a second source code describing an input case of a plurality of input cases respectively corresponding to the plurality of source vectors]
Support for multiple questions, and thus multiple source features, is shown by [Wu 0038]: “The one or more KGs 108 can be one or more collections of data that can be queried by the KG-QA component 110 to identify an answer to one or more natural language questions presented via the one or more input devices 106.”
generating, by the one or more processors and in parallel for each of the plurality of input cases, a plurality of recommendation data objects from a single graph-based data structure, wherein (i) the single graph-based data structure is generated before the plurality of first API calls
[Wu 0002]: “Semantic parsing methods translate a natural language question to a KG query, which can subsequently be used to query the KG directly [wherein (i) the single graph-based data structure is generated before the plurality of first API calls as the query is noted to be for a graph that exists meaning the graph existed before the query. The teachings of API calls are taught later in Li] in order to derive the answer to the natural language question. Retrieve-then-extract methods first retrieve the KG coarsely to obtain a set of subgraphs containing multiple answer candidates. Subsequently, answers [generating, by the one or more processors and in parallel for each of the plurality of input cases, a plurality of recommendation data objects from a single graph-based data structure] to the natural language question are extracted from the top-K matching subgraphs.”
and comprises a plurality of historical vectors, each comprising a plurality of historical features comprising at least a first historical code and a second historical code,
[Wu Figure 9 904] shows encoding the information into graphs, such as the knowledge graph subgraph. [and comprises a plurality of historical vectors, each comprising a plurality of historical features comprising at least a first historical code and a second historical code as historical data is shown in the graphs in figure 3]
and (ii) the generating comprises generating a source feature vector based at least in part on the first source code and the second source code
[Wu Figure 9 904] notes encoding the input into a question graph. [and (ii) the generating comprises generating a source feature vector based at least in part on the first source code and the second source code]
providing a second API call to traverse the single graph-based data structure
[Wu Figure 9 904] shows encoding the information into graphs, such as the knowledge graph subgraph. [providing a second API call to traverse the single graph-based data structure that comprises a plurality of historical vectors comprising at least a first historical feature and a second historical feature as the traversing a graph-based structure is shown by the data being looked at is in a knowledge graph]
extracting, based on the traversal of the single graph-based data structure and the first source code and the second source code, a first subgraph-based data structure from the single graph-based data structure, wherein the first subgraph-based data structure identifies a subset of historical vectors of the plurality of historical vectors that comprise the first historical code that matches the first source code and the second historical code that matches the second source code
[Wu Figure 9 906] notes finding a matching subgraph. [extracting, based on the traversal of the single graph-based data structure and the first source code and the second source code, a first subgraph-based data structure from the single graph-based data structure, wherein the first subgraph-based data structure identifies a subset of historical vectors of the plurality of historical vectors that comprise the first historical code that matches the first source code and the second historical code that matches the second source code] The idea of multiple parts matching is supported by the quote from [Wu 0038] indicating accepting multiple questions and answers.
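For illustration of the mapped limitation only, the extraction step — identifying the subset of historical vectors whose first and second codes match the source codes — can be sketched as follows (hypothetical code and data, not taken from any cited reference):

```python
# Hypothetical historical vectors, each carrying two codes plus feature values.
historical = [
    {"code1": "A10", "code2": "P20", "features": [0.1, 0.9]},
    {"code1": "A10", "code2": "P99", "features": [0.4, 0.6]},
    {"code1": "B55", "code2": "P20", "features": [0.7, 0.3]},
]

def extract_subgraph(vectors, source_code1, source_code2):
    """Keep only vectors whose first AND second codes both match the source codes."""
    return [v for v in vectors
            if v["code1"] == source_code1 and v["code2"] == source_code2]

# Only the first vector matches both codes of the hypothetical source vector.
subset = extract_subgraph(historical, "A10", "P20")
```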
[media_image2.png: Greyscale image]
temporarily storing the first subgraph-based data structure separately from the single graph-based data structure and a second subgraph-based data structure to enable parallel processing with respect to each of the plurality of input cases
[Wu Figure 4] shows that the subgraphs (202) and the larger graph (108) are kept separate [temporarily storing the first subgraph-based data structure separately from the single graph-based data structure and a second subgraph-based data structure to enable parallel processing with respect to each of the plurality of input cases]
While Wu does not explicitly note that the storage is temporary, a limitation stating that something is temporary, without a limitation performing the removal, is interpreted as requiring only that the storage of the graph allow for deletion of the graph at a later time; the graphs being stored separately enables deleting one without deleting the others. Wu's disclosure of memory also supports the storage being temporary, as paragraph 98 of Wu notes RAM and erasable programmable read-only memory, which are forms of memory in which storage can be temporary, for the data can be deleted.
performing, in parallel, a first graph traversal operation on the first subgraph-based data structure and a second graph traversal operation on the second subgraph-based data structure
([Wu 0010]: “In some examples, the computer program instructions can further cause the processor to determine, by the processor, an amount of similarity between the question graph and the knowledge graph subgraph [performing, in parallel, a first graph traversal operation on the first subgraph-based data structure and a second graph traversal operation on the second subgraph-based data structure as would have to traverse a subgraph to compare aspects of the graph to something else, thus traversing subgraphs is taught. There being multiple subgraphs is supported by other quotes and again in this quote by noting the subgraphs can be ranked, thus indicating multiple of them.] based on a first neural network embedding from the neural network embeddings and a second neural network embedding from the neural network embeddings. The program instructions can also cause the processor to match, by the processor, the knowledge graph subgraph to the question graph based on the amount of similarity being greater than or equal to a defined threshold. Moreover, the program instructions can cause the processor to rank, by the processor, the knowledge graph subgraph amongst a plurality of knowledge graph subgraphs based on the amount of similarity.”)
[media_image3.png: Greyscale image]
generating based on the first graph traversal operation on the first subgraph-based data structure, a historical feature vector for a historical vector of the subset of historical vectors, comprising a historical feature of the historical vector
[Wu Figure 3] shows the graph being used to answer a query where it traversed nodes, as indicated by the arrows. [generating based on the first graph traversal operation on the first subgraph-based data structure, a historical feature vector for a historical vector of the subset of historical vectors, comprising a historical feature of the historical vector]
selecting a historical case cohort identifying the historical vector of the subset of historical vectors based at least in part on comparing the historical feature vector to the source vector
[Wu Figure 9 906] shows matching subgraphs. [selecting a historical case cohort identifying the historical vector of the subset of historical vectors based at least in part on comparing the historical feature vector to the source vector]
and generating a recommendation data object for the input case based at least in part on the historical case cohort
[Wu 0053]: “Moreover, the KG-QA component 110 can utilize the match scores to rank the KG subgraphs according to their similarity to the question graph. Thereby, the KG-QA component 110 can identify those KG subgraphs of the one or more KGs 108 that most closely match the question graph and thereby have the highest expectation of answering [and generating a recommendation data object for the input case based at least in part on the historical case cohort] the natural language question correctly.”
sending, by the one or more processors and in parallel for each of the plurality of input cases, a plurality of notifications, based on the plurality of recommendation data objects, to the plurality of client computing entities, wherein a notification, of the plurality of notifications, for the recommendation data object is sent to a client computing entity, of the plurality of client computing entities, that is associated with the source vector
[Wu 0037]: “Additionally, the one or more input devices 106 can be employed to display one or more outputs [sending, by the one or more processors and in parallel for each of the plurality of input cases, a plurality of notifications, based on the plurality of recommendation data objects, to the plurality of client computing entities, wherein a notification, of the plurality of notifications, for the recommendation data object is sent to a client computing entity, of the plurality of client computing entities, that is associated with the source vector] (e.g., displays, data, visualizations, and/or the like) generated by the server 102 and/or associate components.”
Wu does not explicitly teach:
and in parallel for each of the plurality of input cases
to enable parallel processing with respect to each of the plurality of input cases
performing, in parallel, a first graph traversal operation on the first subgraph-based data structure and a second graph traversal operation on the second subgraph-based data structure
sending, by the one or more processors and in parallel for each of the plurality of input cases, a plurality of notifications
via a plurality of first Application Programming Interface (API) calls from a plurality of client computing entities
a second API call to
code (referring to first source code, second source code, first historical code, second historical code or other instances of the data being code)
Wu teaches data or features, but the use of “codes” is not explicitly taught. Another reference is utilized to show the teaching of using codes as data.
Li teaches:
via a plurality of first Application Programming Interface (API) calls from a plurality of client computing entities
[Li Column 6 line 54]: “After the model is trained, a question is fed [via a plurality of first Application Programming Interface (API) calls from a plurality of client computing entities] in to get the probability of each token being part of the subject chunk. In embodiments, based on the probability, a threshold is set and all tokens whose probability is higher than the threshold is concatenated as the predicted subject string. In embodiments of the system, a relative measurement rather than the absolute threshold may be used. In embodiments, firstly, the token with the highest probability is selected, and then expand the selection to both sides until the probability decreases more than a certain percentage relative to the adjacent inner one. Empirically, this method is slightly better.”
The question having come from client entities is made apparent by the field of use of Li being to answer questions from an interface ([Li Column 1 Line 18]: “The present disclosure relates generally to computing technologies, and more specifically to systems and methods for automating the answering of questions raised in natural language and improving human computer interfacing”).
a second API call to
[Li Column 6 line 66]: “Based on the chosen subject chunk, the candidate subjects may be obtained by querying [a second API call to] the KG for entities whose name or alias has the same surface form (i.e., same spelling). However, in embodiments, if no matched entity is founded (5%), the Freebase Suggest API is simply utilized to suggest entities using the chosen chunk. After this, there may be either one or multiple entities as candidate subject(s). For easier reference, the case with only one entity is termed as the single-subject case, and the other case with multiple entities is termed as the multi-subject case.”
One of ordinary skill in the art, prior to the effective filing date, would have been motivated to combine Li with Wu to utilize APIs. Li and Wu are of the same field of endeavor, as they are both in the field of machine learning. One of ordinary skill would have been motivated to combine Li with Wu in order to utilize structured data to answer questions, which helps alleviate the difficulty with simple questions ([Li Column 3 Line 29]: “Open-domain Question Answering (QA) targets providing exact answer(s) to questions expressed as natural language, without restriction of domain. Recently, the maturity of large-scale Knowledge Graph (KG), such as Freebase, which stores extracted facts from all domains as unified triplets, offers QA systems the opportunity to infer the answer(s) using structured data. Under such circumstances, the core task of a QA system can be formulated as matching the question in natural language with informative triple(s) in KG, and reasoning about the answer(s) based on these triples. Among all sorts of questions, there is a type of question requiring only one fact (triple) in KG as evidence to answer, which we refer as Simple Questions in this document. A typical example can be “Where was Fran Drescher born?” Though simple enough, answering such questions remains an unsolved problem. Quite the contrary, Simple Questions are the most common type of question observed in community QA sites.”).
Yang teaches:
and in parallel for each of the plurality of input cases
sending, by the one or more processors and in parallel for each of the plurality of input cases, a plurality of notifications
temporarily storing the first subgraph-based data structure separately… to enable parallel processing with respect to each of the plurality of input cases
performing, in parallel, a first graph traversal operation on the first subgraph-based data structure and a second graph traversal operation on the second subgraph-based data structure
For background and note of parallelism for graphs [Yang Section 1 Introduction Contributions page 2]: "Second, we develop a two-stage parallel algorithm framework that can work on not only multi-core CPUs, but also GPUs. Our algorithm works in a lock-free way during traversal, which is critical for efficiency. In the first stage, we find a set of potential Central Graphs in a bottom-up manner starting from nodes containing keywords (keyword nodes). In the second stage, we extract, prune and select the top-ranked Central Graphs derived from the first stage in a top-down manner starting from Central Nodes, which are centers of respective Central Graphs."
[Yang V Two-Stage Parallel Algorithm page 8]: “First, for the extraction step, it is a standard BFS traversal from the respective Central Node, and for each node we check through the hitting levels of q keywords. It can also be thought of as q independent standard BFSes, one for each keyword. Therefore the time complexity is O(q|V|+|E|). Second, for level-cover strategy, to classify all keyword nodes, we need to scan all hitting levels of these nodes, which is bounded by O(q|V|). To do pruning, need to scan from top level to the lowest level, in the worst case no keyword nodes are pruned. In this case, we scan everything again and the time complexity is also O(q|V|). Therefore, the total time complexity of level-cover strategy is O(q|V|). Third, to insert results to Tk which is a heap. Thus, the complexity of insertion is O(log2k) for maintain top-k answers. All together, suppose we have |C| top-(k,d) Central Graphs, the time complexity is then O(|C|(q(|V|+|E|)+log2k)) in sequential execution and O(|C|(q(|V|+|E|)+log2k) / T) in parallel [performing, in parallel... as this notes that the operations on the graphs can be performed in parallel rather than sequential][and in parallel for each of the plurality of input cases as Yang is demonstrating being able to run in parallel on multiple parts, thus demonstrating the teaching of accelerating multiple inputs using parallelism][sending, by the one or more processors and in parallel for each of the plurality of input cases, a plurality of notifications as Yang, as already noted, is displaying the teaching of using parallelism to accelerate running over multiple inputs], where T is the number of threads. 
For space complexity, the major cost arises from storing Central Graphs while extraction [temporarily storing the first subgraph-based data structure separately… to enable parallel processing with respect to each of the plurality of input cases, as this limitation is interpreted as setting up the elements for the performing in parallel (thus the parallel processing here is seen as being for parallelism on separately stored graphs and not for there being multiple possible parallel inputs), and the temporary part is supported by Yang noting the storing during extraction (which adds a timeframe for the storage, thus making the storage temporary)], besides node-keyword matrix and graph storage cost.”
One of ordinary skill in the art, prior to the effective filing date, would have been motivated to combine Wu and Yang. Wu and Yang are in the same field of endeavor of knowledge graphs. One of ordinary skill in the art would have been motivated to combine Wu and Yang to incorporate aspects of parallelism to speed up operations, such as on graph structures ([Yang Introduction page 1]: "Having witnessed great advances in computer hardwares (e.g. multi-core CPUs and GPUs), we are inspired to think whether we can make use of parallel computational power of modern hardwares to address the efficiency issues.").
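For illustration of the mapped arrangement only, performing independent traversals on separately stored subgraphs in parallel can be sketched as follows (hypothetical code and data, not taken from Wu or Yang):

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical subgraphs stored separately as adjacency dicts.
subgraph_a = {"q1": ["h1", "h2"], "h1": [], "h2": []}
subgraph_b = {"q2": ["h3"], "h3": []}

def traverse(graph, start):
    """Standard BFS traversal; returns visited nodes in visitation order."""
    visited, frontier, seen = [], [start], {start}
    while frontier:
        node = frontier.pop(0)
        visited.append(node)
        for nbr in graph.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append(nbr)
    return visited

# The two traversals run in parallel, one worker per separately stored subgraph.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(traverse, [subgraph_a, subgraph_b], ["q1", "q2"]))
```

Because the subgraphs are stored separately, neither traversal needs locks on the other's data, mirroring the lock-free parallelism Yang describes.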
Chimmad teaches:
code (referring to first source code, second source code, first historical code, second historical code or other instances of the data being code)
([Chimmad Page 1 Analysis of Rejected and Denied Claims]: “Assessment of claim rejection risk is essential for increasing accuracy and reducing errors in processing the claims, which has a major impact on the revenue cycle. Reasons for the denial of a claim include incorrect patient identifier information, claim billed to wrong insurance company, lack of technical experience of medical biller in charge, filing a claim after the deadline, medical biller filling the claim with incorrect CPT and ICD [code (referring to first source code, second source code, first historical code, second historical code or other instances of the data being code)] codes [2] or leaving out codes altogether, patient not covered by insurance policy for the service, missing referrals from primary service provider for certain procedures etc.” )
One of ordinary skill in the art, prior to the effective filing date, would have been motivated to combine Chimmad with modified Wu to utilize diagnosis and procedure codes. Chimmad and Wu are of the same field of endeavor, as they are both in the field of machine learning. One of ordinary skill would have been motivated to combine Chimmad with Wu to utilize diagnosis and procedure codes in order to improve accuracy while reducing errors in elements such as claim rejections ([Chimmad Page 1 Analysis of Rejected and Denied Claims]: “Assessment of claim rejection risk is essential for increasing accuracy and reducing errors in processing the claims, which has a major impact on the revenue cycle.”).
Regarding Claim 2:
The method of claim 1 is taught by Wu, Li, Yang, and Chimmad.
Wu teaches:
extracting the second subgraph-based data structure responsive to receiving an additional source vector comprising one or more additional source features describing an additional input case, wherein the second subgraph-based data structure corresponds to the additional source vector and the additional input case.
[Wu Figure 9 906] extracting the second subgraph-based data structure responsive to,
[Wu Figure 9 902] receiving an additional source vector comprising one or more additional source features describing an additional input case
[Wu Figure 9 904] wherein the second subgraph-based data structure corresponds to the additional source vector and the additional input case.
Regarding Claim 3:
The method of claim 1 is taught by Wu, Li, Yang, and Chimmad.
Modified Wu does not explicitly teach:
wherein the first source code comprises a diagnosis code and the second source code comprises a procedure code, wherein the first historical code comprises a diagnosis feature dimension, and the first historical code comprises at least one diagnosis code identical to the diagnosis code of the first source code; and wherein the second historical code comprises a procedure feature dimension, and the second historical code comprises at least one procedure code identical to the procedure code of the second source code
Wu already teaches finding a match in [Wu Figure 9 906]. The use of similarity for matching is taught by Wu as noted in the rejection of claim 4 and other claims relevant to similarity comparison. The reference below, which teaches the comparing of codes, is utilized to show that the combination would be obvious: Wu notes the comparing of data, and another reference shows the comparing of a particular kind of data. In combination, modified Wu is therefore able to compare that particular data as well, as that data can be compared.
Chimmad teaches:
the first historical code comprises a diagnosis feature dimension,
([Chimmad Page 1 Introduction]: “During the treatment, type of diagnosis done, treatments and medication prescribed are all recorded to maintain a patient history [the first historical code comprises a diagnosis feature dimension]. A trained coding specialist converts these into alphanumeric codes, using several standards accepted universally, like ICD codes to record diagnosis and CPT codes for treatments, and the resulting claims are sent to a third party adjudication system (clearinghouse or insurance company) for further processing.”)
wherein the first source code comprises a diagnosis code and the second source code comprises a procedure code… and the first historical code comprises at least one diagnosis code identical to the diagnosis code of the first source code; and wherein the second historical code comprises a procedure feature dimension, and the second historical code comprises at least one procedure code identical to the procedure code of the second source code
([Chimmad Page 1 Analysis of Rejected and Denied Claims]: “Assessment of claim rejection risk is essential for increasing accuracy and reducing errors in processing the claims, which has a major impact on the revenue cycle. Reasons for the denial of a claim include incorrect patient identifier information, claim billed to wrong insurance company, lack of technical experience of medical biller in charge, filing a claim after the deadline, medical biller filling the claim with incorrect CPT [and wherein the second historical code comprises a procedure feature dimension, and the second historical code comprises at least one procedure code identical to the procedure code of the second source code as checking if it is incorrect is checking if it is identical] and ICD [wherein the first source code comprises a diagnosis code… and the first historical code comprises at least one diagnosis code identical to the diagnosis code of the first source code] codes [2] or leaving out codes altogether, patient not covered by insurance policy for the service, missing referrals from primary service provider for certain procedures etc.” )
The motivation to combine with Chimmad is the same as the motivation to combine with Chimmad in claim 1.
Regarding Claim 4:
The method of claim 1 is taught by Wu, Li, Yang, and Chimmad.
Wu teaches:
wherein selecting the historical case cohort comprises: generating a similarity value corresponding to the historical vector and is based at least in part on a comparison between a corresponding historical feature vector and the source vector; assigning the similarity value to the historical vector of the subset of historical vectors; and generating the historical case cohort identifying one or more historical vectors based at least in part on the similarity value assigned to the historical vector
([Wu 0010]: “In some examples, the computer program instructions can further cause the processor to determine, by the processor, an amount of similarity [wherein selecting the historical case cohort comprises: generating a similarity value corresponding to the historical vector and is based at least in part on a comparison between a corresponding historical feature vector and the source vector; assigning the similarity value to the historical vector of the subset of historical vectors;] between the question graph and the knowledge graph subgraph based on a first neural network embedding from the neural network embeddings and a second neural network embedding from the neural network embeddings. The program instructions can also cause the processor to match, by the processor, the knowledge graph subgraph to the question graph based on the amount of similarity being greater than or equal to a defined threshold [generating the historical case cohort identifying one or more historical vectors based at least in part on the similarity value assigned to the historical vector]. Moreover, the program instructions can cause the processor to rank, by the processor, the knowledge graph subgraph amongst a plurality of knowledge graph subgraphs based on the amount of similarity.”)
Regarding Claim 5:
The method of claim 4 is taught by Wu, Li, Yang, and Chimmad.
Wu teaches:
wherein the similarity value for the historical vector is determined based at least in part on a cosine similarity between the corresponding historical feature vector and the source vector
([Wu 0051]: “For example, the match component 602 can calculate a matching score between the question graph embedding “r.sub.Q” and the KG subgraph embedding “r.sub.i”. For instance, each KG subgraph can be characterized by one or more respective KG subgraph embeddings, and each KG subgraph embedding can be compared with the question graph embedding; thereby, the match component 602 can generate match scores associated with each KG subgraph. The match scores can characterize an amount of similarity [wherein the similarity value for the historical vector is determined] between the respective KG subgraph and the question graph. In various embodiments, the match component 602 can execute a similarity algorithm to computer the match scores. For example, the match component 602 can execute a cosine similarity algorithm [based at least in part on a cosine similarity between the corresponding historical feature vector and the source vector] to compare the neural network embeddings in accordance with Equation 9 below.”)
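For illustration of the cited cosine similarity algorithm only (hypothetical code and vectors, not taken from Wu), the similarity value between a historical feature vector and a source vector can be computed as:

```python
import math

def cosine_similarity(u, v):
    """cos(theta) = (u . v) / (|u| * |v|): 1.0 for parallel vectors, 0.0 for orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical historical feature vector vs. source vector:
# dot product = 1, both norms = sqrt(2), so similarity = 0.5.
score = cosine_similarity([1.0, 0.0, 1.0], [1.0, 1.0, 0.0])
```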
Regarding Claim 6:
The method of claim 4 is taught by Wu, Li, Yang, and Chimmad.
Wu teaches:
wherein the historical case cohort identifies the one or more historical vectors comprising respective similarity value greater than or equal to a threshold similarity score.
([Wu 0010]: “The program instructions can also cause the processor to match, by the processor, the knowledge graph subgraph to the question graph [wherein the historical case cohort identifies the one or more historical vectors] based on the amount of similarity being greater than or equal to a defined threshold [comprising a respective similarity value greater than or equal to a threshold similarity score].”)
Regarding Claim 7:
The method of claim 4 is taught by Wu, Li, Yang, and Chimmad.
Wu teaches:
wherein the historical case cohort identifies a configurable number of historical vectors based at least in part on a ranking of the plurality of historical vectors with respect to an assigned similarity score.
([Wu 0027]: “Additionally, various embodiments described herein can encode structure information of the one or more [a configurable number of] question graphs and KG subgraphs via a plurality of bidirectional GNNs. The embeddings generated by the bidirectional GNNs can be utilized to determine matching scores characterizing an amount of similarity between the question graph and the KG subgraphs [identifies… historical vectors]. Further, the matching scores can be utilized to match and/or rank the KG subgraphs [based at least in part on a ranking of the plurality of historical vectors with respect to an assigned similarity score].”)
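For clarity of the mappings for claims 6 and 7 only, the threshold-and-ranking selection described in [Wu 0010] and [Wu 0027] can be sketched as below. This sketch is not part of any reference of record; the identifiers, scores, threshold, and count are hypothetical and serve only to illustrate filtering by a threshold similarity score and selecting a configurable number of top-ranked entries.

```python
def select_cohort(scored, threshold, k):
    """Keep entries meeting the threshold, rank by score, return the top k.

    `scored` is a list of (identifier, similarity_score) pairs.
    """
    qualifying = [(vid, s) for vid, s in scored if s >= threshold]
    qualifying.sort(key=lambda pair: pair[1], reverse=True)
    return qualifying[:k]

scores = [("v1", 0.92), ("v2", 0.40), ("v3", 0.75), ("v4", 0.81)]
# v2 falls below the 0.5 threshold; the two highest remaining scores are kept.
print(select_cohort(scores, threshold=0.5, k=2))  # [('v1', 0.92), ('v4', 0.81)]
```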
Regarding Claim 8:
The method of claim 1 is taught by Wu, Li, Yang, and Chimmad.
Wu teaches:
wherein the recommendation data object comprises at least one of (i) a decision recommendation data object, (ii) a confidence value, or (iii) a consequence prediction data object.
([Wu 0039]: “Given a natural language question “Q” (e.g., provided via the one or more input devices 106) and a set “S” of subgraphs “g” (e.g., characterized by S={g.sub.1, g.sub.2, . . . g.sub.n}) from a KG 108, the KG-QA component 110 can automatically determine whether the natural language question matches a KG subgraph [wherein the recommendation data object comprises] and then rank all the KG subgraphs in the set based on one or more matching relevance scores [at least one of (i) a decision recommendation data object, (ii) a confidence value, or (iii) a consequence prediction data object wherein this case is a confidence value] associated with the KG subgraphs.”)
Regarding Claim 10:
The method of claim 8 is taught by Wu, Li, Yang, and Chimmad.
Wu teaches:
wherein the confidence value is determined based at least in part on one or more similarity values of the historical case cohort.
([Wu 0051]: “For example, the match component 602 can calculate a matching score between the question graph embedding “r.sub.Q” and the KG subgraph embedding “r.sub.i”. For instance, each KG subgraph can be characterized by one or more respective KG subgraph embeddings, and each KG subgraph embedding can be compared with the question graph embedding; thereby, the match component 602 can generate match scores [wherein the confidence value is determined] associated with each KG subgraph. The match scores can characterize an amount of similarity [based at least in part on one or more similarity values of the historical case cohort] between the respective KG subgraph and the question graph.”)
Regarding Claim 13:
The method of claim 1 is taught by Wu, Li, Yang, and Chimmad.
Wu teaches:
wherein generating the source vector comprises selecting one or more source features corresponding to one or more pre-determined or configured case feature dimensions
[Wu Figure 9 904] shows the input from 902 being encoded into graphs, which are then compared [wherein generating the source vector comprises selecting one or more source features corresponding to one or more pre-determined or configured case feature dimensions] in 906.
Regarding Claim 14:
Wu teaches:
A recommendation system comprising: one or more memories storing processor-executable instructions that, when executed by the one or more processors, cause the one or more processor to perform operations comprising
([Wu 0009]: “According to an embodiment, a computer program product for identifying candidate knowledge graph subgraphs in a question answering over knowledge graph task is provided. The computer program product can comprise a computer readable storage medium having program instructions [one or more memories storing processor-executable instructions that] embodied therewith. The program instructions can be executable by a processor [when executed by the one or more processors, cause the one or more processor to perform operations comprising] to cause the processor to encode, by the processor, graph structure information of a knowledge graph subgraph and a question graph into neural network embeddings. An advantage of such a computer program product can be increased accuracy and/or precision in identifying a knowledge graph subgraph for answering a natural language question, as compared to traditional approaches to executing KG-QA tasks.”)
The rest of the claim is analogous to claim 1.
Regarding Claim 15:
The apparatus of claim 14 is taught by Wu, Li, Yang, and Chimmad.
The claim is analogous to claim 2.
Regarding Claim 16:
The apparatus of claim 14 is taught by Wu, Li, Yang, and Chimmad.
The claim is analogous to claim 4.
Regarding Claim 17:
The apparatus of claim 16 is taught by Wu, Li, Yang, and Chimmad.
The claim is analogous to claim 5.
Regarding Claim 18:
The apparatus of claim 14 is taught by Wu, Li, Yang, and Chimmad.
The claim is analogous to claim 8.
Regarding Claim 19:
The apparatus of claim 14 is taught by Wu, Li, Yang, and Chimmad.
The claim is analogous to claim 7.
Regarding Claim 20:
The claim is analogous to claim 14.
Claims 9, 11, and 12 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wu et al (US 20220108188 A1), referred to as Wu in this document, in view of Li et al (US 10606846 B2), referred to as Li in this document, and further in view of Yang et al ("An Efficient Parallel Keyword Search Engine on Knowledge Graphs"), referred to as Yang in this document, and further in view of Chimmad (“Assessment of Healthcare Claims Rejection Risk Using Machine Learning”), referred to as Chimmad in this document, and in further view of Edgar (US 20150317337), referred to as Edgar in this document.
Regarding Claim 9:
The method of claim 8 is taught by Wu, Li, Yang, and Chimmad.
Modified Wu does not explicitly teach:
wherein the decision recommendation data object is generated based at least in part on one or more decision case features in the historical case cohort
Edgar teaches:
wherein the decision recommendation data object is generated based at least in part on one or more decision case features in the historical case cohort.
([Edgar 0024]: “Aspects disclosed and described herein enable identification of unique patterns in healthcare data [wherein the decision recommendation data object is generated], using healthcare payment denials as an example [based at least in part on one or more decision case features in the historical case cohort]. The patterns identify different problems or defects in processing of claims. The patterns point to or are closely associated with root causes of the denials. Once identified, automated methods are used to fix the denials and prevent them from occurring in the future.”)
One of ordinary skill in the art, prior to the effective filing date, would have been motivated to combine Edgar with Wu to utilize recommendation data. Edgar and Wu are of the same field of endeavor, as both are in the field of machine learning. One of ordinary skill would have been motivated to combine Edgar with Wu in order to identify unique patterns in historical data and fix denials ([Edgar 0024]: “Aspects disclosed and described herein enable identification of unique patterns in healthcare data, using healthcare payment denials as an example. The patterns identify different problems or defects in processing of claims. The patterns point to or are closely associated with root causes of the denials. Once identified, automated methods are used to fix the denials and prevent them from occurring in the future.”).
Regarding Claim 11:
The method of claim 8 is taught by Wu, Li, Yang, and Chimmad.
Modified Wu does not explicitly teach:
wherein the consequence prediction data object is generated based at least in part on: selecting the historical vector from the plurality of historical vectors comprising a negative decision feature; generating an average alternative cost value for the selected historical vector; and generating the consequence prediction data object based at least in part on the average alternative cost value.
Edgar teaches:
wherein the consequence prediction data object is generated based at least in part on: selecting the historical vector from the plurality of historical vectors comprising a negative decision feature; generating an average alternative cost value for the selected historical vector; and generating the consequence prediction data object based at least in part on the average alternative cost value.
([Edgar 0086]: “An analysis of denials [wherein the consequence prediction data object is generated based at least in part on: selecting a historical vector from the plurality of historical vectors comprising a negative decision feature] for a medium size provider network can provide an opportunity benchmark of dollars per claim and an identification of payer and provider attribute combinations that have unexpectedly high rates of denials. An opportunity benchmark measures an amount of value [generating an average alternative cost value, where the average alternative cost definition is quoted below, for the selected historical vector] to an enterprise if a problem can be addressed. An opportunity benchmark equals an opportunity cost [and generating a consequence prediction data object based at least in part on the average alternative cost value], for example. For a denial, an opportunity benchmark equals a denied cost plus a cost of labor to fix.”)
The current application notes that the average alternative cost is essentially an opportunity cost ([0117]: “In various embodiments, an average alternative cost value is configured to describe costs or consequences resulting from a negative decision being made for a corresponding historical case.”).
The current application notes that the consequence prediction data object is an object that holds such a value ([0026]: “The consequence prediction data object, for example, may be and/or may comprise an average cost of step therapy in denied historical prior authorization cases.”).
One of ordinary skill in the art, prior to the effective filing date, would have been motivated to combine Edgar with Wu to utilize consequence prediction or opportunity cost. Edgar and Wu are of the same field of endeavor, as both are in the field of machine learning. One of ordinary skill would have been motivated to combine Edgar with Wu in order to better understand the financial cost of decisions ([Edgar 0086]: “An opportunity benchmark measures an amount of value to an enterprise if a problem can be addressed. An opportunity benchmark equals an opportunity cost, for example. For a denial, an opportunity benchmark equals a denied cost plus a cost of labor to fix.”).
Regarding Claim 12:
The method of claim 11 is taught by Wu, Li, Yang, Chimmad, and Edgar.
Edgar teaches:
wherein the average alternative cost value for a selected historical vector is generated by performing a graph traversal operation on the single graph-based data structure.
([Edgar 0136]: “relationships as well as modeled denial reason and remark code definitions, knowledge can be applied to the data to infer those relationships (e.g., infer root causes for denials, predict potential reason/remark codes for a claim, provide a knowledge graph, etc.) [wherein the average alternative cost value for a selected historical vector is generated by performing a graph traversal operation on the single graph-based data structure as the quote notes the denial reason can be found utilizing a graph, such as a knowledge graph, thus it is traversing a graph structure]. In some examples, a payer/provider system description, action description, and the semantic model description combine to provide a problem description and resolution through recommended next action(s).”)
One of ordinary skill in the art, prior to the effective filing date, would have been motivated to combine Edgar with Wu to utilize graphs and to perform graph traversals. Edgar and Wu are of the same field of endeavor, as both are in the field of machine learning. One of ordinary skill would have been motivated to combine Edgar with Wu in order to take advantage of a graph data representation that allows relationships to be inferred ([Edgar 0136]: “relationships as well as modeled denial reason and remark code definitions, knowledge can be applied to the data to infer those relationships (e.g., infer root causes for denials, predict potential reason/remark codes for a claim, provide a knowledge graph, etc.)”).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Lecue et al (US 11442963 B1) is relevant art, as the reference notes ranking subgraphs to provide an explanation for a classification, which is similar in concept to the current application's ranking of subgraphs to find a recommendation.
Japa et al (US 20220292262 A1) is relevant art, as the reference notes the utilization of a knowledge graph, retrieving nodes from the knowledge graph (in a manner akin to extracting a subgraph), and using those nodes to answer a question.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHRISTOPHER D DEVORE whose telephone number is (703)756-1234. The examiner can normally be reached Monday-Friday 7:30 am - 5 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael J Huntley can be reached at (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/C.D.D./Examiner, Art Unit 2129
/MICHAEL J HUNTLEY/Supervisory Patent Examiner, Art Unit 2129