Prosecution Insights
Last updated: April 19, 2026
Application No. 18/663,807

MACHINE LEARNING TECHNIQUES FOR GUIDELINE-BASED EXTRACTION OF RELEVANT INFORMATION FROM UNSTRUCTURED DATA

Non-Final Office Action (§103, §112)

Filed: May 14, 2024
Examiner: DWIVEDI, MAHESH H
Art Unit: 2168
Tech Center: 2100 — Computer Architecture & Software
Assignee: Optum Inc.
OA Round: 1 (Non-Final)

Predictions:
Grant Probability: 69% (Favorable)
Expected OA Rounds: 1-2
Estimated Time to Grant: 3y 6m
Grant Probability with Interview: 74%

Examiner Intelligence

Career Allow Rate: 69% (above average): 521 granted / 751 resolved, +14.4% vs Tech Center average
Interview Lift: +4.3% (minimal), based on resolved cases with vs. without an interview
Typical Timeline: 3y 6m average prosecution; 21 applications currently pending
Career History: 772 total applications across all art units

Statute-Specific Performance

§101: 16.5% (-23.5% vs TC avg)
§103: 40.2% (+0.2% vs TC avg)
§102: 17.2% (-22.8% vs TC avg)
§112: 19.5% (-20.5% vs TC avg)

Deltas are relative to an estimated Tech Center average; based on career data from 751 resolved cases.
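As a cross-check on the figures above, the headline rate and delta can be recomputed from the raw counts. The Tech Center baseline used here (55.0%) is back-derived from the displayed +14.4% delta, not taken from any primary source.

```python
# Recompute the dashboard's examiner statistics from the underlying counts.
granted = 521
resolved = 751

allow_rate = round(granted / resolved * 100, 1)  # career allow rate, percent
tc_average = 55.0                                # assumed TC 2100 baseline
delta_vs_tc = round(allow_rate - tc_average, 1)

print(allow_rate)    # displayed as 69%
print(delta_vs_tc)   # displayed as +14.4% vs TC avg
```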

Office Action

Rejections: §103, §112
DETAILED ACTION

1. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement

2. The information disclosure statements (IDS) submitted on 09/26/2024, 07/15/2025, 10/13/2025, and 02/18/2026 have been received and entered into the record. The submissions are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Specification

3. The abstract of the disclosure is objected to because the term “embodiments” is legal phraseology that should be removed or replaced. A corrected abstract of the disclosure is required and must be presented on a separate sheet, apart from any other text. See MPEP § 608.01(b).

Claim Rejections - 35 USC § 112

4. The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

5. The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

6. Claims 1, 9, and 15 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

Specifically, it is unclear whether the later claimed label “(i)” in the limitation “(i) generating a guideline-specific cross-reference data object for the guideline data object by: (a) extracting one or more questions from the guideline data object, (b) assigning a plurality of scores to a plurality of passages from one or more training unstructured data objects that correspond to the one or more questions, (c) identifying, for a question of the one or more questions, one or more top ranking passages from the plurality of passages based on the plurality of scores, and (d) generating the guideline-specific cross-reference data object based on the one or more top ranking passages” refers to the earlier claimed label “(i)” in the limitation “receiving, by one or more processors, a constrained query that comprises (i) an input unstructured data object that is associated with an entity and (ii) a reference to a guideline data object; generating, by the one or more processors”. Dependent claims 2-8, 10-14, and 16-20 are rejected for incorporating the deficiencies of independent claims 1, 9, and 15, respectively.

Claims 1, 9, and 15 are likewise rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite. Specifically, it is unclear whether the later claimed label “(ii)” in the limitation “(ii) generating one or more cross-reference embeddings based on the guideline-specific cross-reference data object” refers to the earlier claimed label “(ii)” in the limitation “receiving, by one or more processors, a constrained query that comprises (i) an input unstructured data object that is associated with an entity and (ii) a reference to a guideline data object; generating, by the one or more processors”. Dependent claims 2-8, 10-14, and 16-20 are rejected for incorporating the deficiencies of independent claims 1, 9, and 15, respectively.

Claims 1, 9, and 15 recite the limitation “initiating, by the one or more processors, the performance of one or more prediction-based actions based on the one or more prediction outputs” at pages 1, 3, and 5, respectively. There is insufficient antecedent basis for this limitation, as no “performance of one or more prediction-based actions” is recited earlier in the claims. Dependent claims 2-8, 10-14, and 16-20 are rejected for incorporating the deficiencies of independent claims 1, 9, and 15, respectively.

Claim 6 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite. Specifically, it is unclear whether the labels “(i)” and “(ii)” refer to the earlier labels “(i)” and “(ii)” in parent independent claim 1, respectively. Claim 13 is rejected on the same basis with respect to the labels “(i)” and “(ii)” in parent independent claim 9, and claim 20 is rejected on the same basis with respect to the labels “(i)” and “(ii)” in parent independent claim 16.

Claim Rejections - 35 USC § 103

7. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

8. This application currently names joint inventors. In considering patentability of the claims, the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

9. Claims 1-2, 7-10, 13-17, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Ghose et al. (U.S. PGPUB 2025/0095642), in view of Mass et al. (U.S. PGPUB 2023/0029829), and further in view of Mulligan et al. (U.S. PGPUB 2021/0057098).

10.
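The training pipeline recited in claim 1 and discussed in the §112 rejections above, (a) extract questions from a guideline, (b) score candidate passages, (c) keep the top-ranking passages per question, (d) assemble a guideline-specific cross-reference data object, can be sketched as follows. All names, the overlap-based scorer, and the sample data are hypothetical illustrations, not the applicant's implementation.

```python
# Hypothetical sketch of the claimed (a)-(d) pipeline; illustrative only.
from dataclasses import dataclass, field

@dataclass
class CrossReference:
    guideline_id: str
    entries: dict = field(default_factory=dict)  # question -> top passages

def extract_questions(guideline_text):
    # (a) naive question extraction: split on "?" and restore the mark
    return [s.strip() + "?" for s in guideline_text.split("?")[:-1]]

def score_passage(question, passage):
    # (b) toy relevance score: word overlap between question and passage
    q, p = set(question.lower().split()), set(passage.lower().split())
    return len(q & p)

def build_cross_reference(guideline_id, guideline_text, passages, top_k=2):
    xref = CrossReference(guideline_id)
    for question in extract_questions(guideline_text):
        ranked = sorted(passages, key=lambda p: score_passage(question, p),
                        reverse=True)
        xref.entries[question] = ranked[:top_k]  # (c) top-ranking passages
    return xref                                  # (d) cross-reference object

xref = build_cross_reference(
    "g1",
    "Is the tumor margin clear? Was chemotherapy given?",
    ["The tumor margin is clear.", "No chemotherapy was given.", "Unrelated."],
)
```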
Regarding claims 1, 9, and 16, Ghose teaches a method, computing system, and one or more non-transitory computer-readable storage media comprising:

A) receiving, by one or more processors, a constrained query that comprises (i) an input unstructured data object that is associated with an entity and (ii) a reference to a guideline data object (Paragraphs 89-91);

B) generating, by the one or more processors, one or more prediction outputs for the constrained query using a predictive machine learning model that comprises one or more learned parameters previously trained by: (i) generating a guideline-specific cross-reference data object for the guideline data object by: (a) extracting one or more questions from the guideline data object (Paragraphs 25, 33, 36, 57, and 91); and

H) initiating, by the one or more processors, the performance of one or more prediction-based actions based on the one or more prediction outputs (Paragraphs 17, 25, 33, 36, 57, and 91).

The examiner notes that Ghose teaches “receiving, by one or more processors, a constrained query that comprises (i) an input unstructured data object that is associated with an entity and (ii) a reference to a guideline data object” as “Method 700 begins at 702, where method 700 includes receiving a query from a care provider. The query may be a question posed by the care provider regarding a treatment for a patient being treated by the care provider. For example, the care provider may not be sure of a possible treatment, or may have two or more options for treating the patient, and may wish to consult with the clinical recommendation system to determine which of the two or more options to select. The care provider may submit the query via a UI of the clinical recommendation system, such as UI 404 of FIG. 4. In some embodiments, the care provider may submit the query in written form, for example via a keyboard. In other embodiments, the care provider may submit a query by voice by speaking into a microphone.
The query may be posed in natural language, for example, as a question or instruction. As an example, a query submitted by a care provider may be, “Given stage 4 lung cancer with TNM status [xxx] and surgical margin [yyy], what should be the next steps?”” (Paragraph 89), “At 704, method 700 includes translating the query into a prompt to be submitted to the LLM. Translating the query into the prompt may include converting an audio clip or audio file received by the clinical recommendation system via a microphone into a written form. Additionally, a format of the query may be adjusted to an input format of the LLM. For example, the query may be posed by the care provider as a question, and the question may be converted into an instruction for the LLM” (Paragraph 90), and “At 706, method 700 includes submitting the prompt to the LLM. The prompt may be inputted into the LLM, and the LLM may output a response to the prompt. The response may include clinically explainable information, such as why the patient is at a certain node or stage in the recommendation model (and guidelines). The response may further indicate advised or suggested next steps in accordance with clinical guidelines, based on the patient's position and existing clinical/pathological variables” (Paragraph 91). The examiner further notes that a query regarding a patient (i.e. an entity) can include text, audio, video etc. (i.e. examples of the claimed input unstructured data object). Moreover, such a query can reference that patient’s stage of a guideline (i.e. the claimed undefined guideline data object in the broadest reasonable interpretation). 
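The three steps of Ghose's method 700 excerpted above (702 receive query, 704 translate it into a prompt, 706 submit the prompt to the LLM) can be sketched as below. The function names and the `fake_llm` stand-in are illustrative assumptions, not Ghose's actual code.

```python
# Hedged sketch of Ghose's method 700 flow; all names are illustrative.
def receive_query(raw):
    # step 702: accept the care provider's natural-language query
    return raw.strip()

def translate_to_prompt(query):
    # step 704: convert the question into an instruction for the LLM
    return f"Answer per clinical guidelines: {query}"

def submit_to_llm(prompt, llm):
    # step 706: submit the prompt and return the model's response
    return llm(prompt)

def method_700(raw_query, llm):
    return submit_to_llm(translate_to_prompt(receive_query(raw_query)), llm)

fake_llm = lambda prompt: f"[response to: {prompt}]"  # stand-in model
out = method_700("  What should be the next steps?  ", fake_llm)
```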
The examiner notes that Ghose teaches “generating, by the one or more processors, one or more prediction outputs for the constrained query using a predictive machine learning model that comprises one or more learned parameters previously trained by: (i) generating a guideline-specific cross-reference data object for the guideline data object by: (a) extracting one or more questions from the guideline data object” as “Non-transitory memory 106 may store an AI model module 108, a training module 110, an inference module 111, and a database 114. AI model module 108 may include various AI models, such as ML and/or DL models. In particular, AI model module may include an LLM 120 that may be trained to respond to natural language queries regarding treatment alternatives posed by health care providers, as described in greater detail below in reference to FIG. 5. The AI module may include a reward model 126, which may be used to train the LLM using reinforcement learning, as described in greater detail below in reference to FIG. 6. The AI module may also include a recommendation model 122, which may be used to recommend a treatment option based on clinical guidelines 150 as described in greater detail below in reference to FIG. 7. The AI module may also include a prediction model 124, which may be used to verify an output of the LLM” (Paragraph 25), “First workflow 200 starts with the creation of a recommendation model 204 for recommending one or more treatment options for a patient to a care provider using the clinical recommendation system. Recommendation model 204 may be generated from and/or based on one or more sets of clinical guidelines 202 (e.g., clinical guidelines 150 of FIG. 1). Clinical guidelines 202 may be digital guidelines available online and/or on a computer system or network of a healthcare system using the clinical recommendation system. 
Clinical guidelines 202 may include different reference guidelines for different types of patients and/or pathologies. For example, a first set of clinical guidelines may relate to patients who have or are suspected of having cancer; a second set of clinical guidelines may relate to patients who have or are suspected of having an auto-immune disease; a third set of clinical guidelines may relate to patients who are suffering from traumatic wounds; and so on. In one example, clinical guidelines 202 includes the NCCN Guidelines®” (Paragraph 33), “Recommendation model 204 may be used for an LLM training step 212, and a prediction model training step 213. LLM training step 212 may rely on a first training data set, and prediction model training step 213 may rely on a second training data set, where the first training data set and the second training data set may both be generated with the aid of recommendation model 204” (Paragraph 36), “Method 500 begins at 502, where the method includes training a recommendation model based on one or more sets of clinical guidelines. Training the recommendation model may include mapping elements of the one or more sets of clinical guidelines to a hierarchical structure for modeling decisions, such as a decision tree. The hierarchical structure may then be traversed based on patient data, until reaching a decision point at which one or more courses of action may be recommended, based on the options available at the decision point. The decision tree/knowledge graph/recommendation model may be created in a rule-based framework from existing clinical guidelines, literature, domain knowledge of a disease and/or using routinely collected clinical training data to create a graph or tree based machine learning models in a supervised, semi-supervised, self-supervised or un-supervised machine learning framework. An example recommendation model implemented as a decision tree is shown in FIG. 
8” (Paragraph 57), and “At 706, method 700 includes submitting the prompt to the LLM. The prompt may be inputted into the LLM, and the LLM may output a response to the prompt. The response may include clinically explainable information, such as why the patient is at a certain node or stage in the recommendation model (and guidelines). The response may further indicate advised or suggested next steps in accordance with clinical guidelines, based on the patient's position and existing clinical/pathological variables” (Paragraph 91). The examiner further notes that recommendations that are output from a trained LLM from a received constrained query are based off of a trained recommendation model (which trains the LLM) that is based off of clinical guidelines (which include questions). Such questions are extracted (See mapping) to generate a hierarchical structure (i.e. an example of the claimed undefined guideline-specific cross-reference data object in the broadest reasonable interpretation). The examiner further notes that Ghose teaches “initiating, by the one or more processors, the performance of one or more prediction-based actions based on the one or more prediction outputs” as “The clinical recommendation system may provide a clinically explainable disease state of a patient and recommend a next course of action (e.g., a treatment) based on clinical guidelines and population statistics, in a manner that reduces a current burden of clinicians in consulting digital clinical manuals via a series of time-consuming and cumbersome interactions with a graphical user interface (GUI) of the digital clinical manuals. A natural language-based interaction would reduce the digital device interactions of clinicians, thereby allowing the clinicians to devote more time to interacting with patients. 
Additionally, the proposed clinical guidance system would offer transparency to patients on a care pathway of a disease and provide patients with better context of a clinical procedure” (Paragraph 17), “Non-transitory memory 106 may store an AI model module 108, a training module 110, an inference module 111, and a database 114. AI model module 108 may include various AI models, such as ML and/or DL models. In particular, AI model module may include an LLM 120 that may be trained to respond to natural language queries regarding treatment alternatives posed by health care providers, as described in greater detail below in reference to FIG. 5. The AI module may include a reward model 126, which may be used to train the LLM using reinforcement learning, as described in greater detail below in reference to FIG. 6. The AI module may also include a recommendation model 122, which may be used to recommend a treatment option based on clinical guidelines 150 as described in greater detail below in reference to FIG. 7. The AI module may also include a prediction model 124, which may be used to verify an output of the LLM” (Paragraph 25), “First workflow 200 starts with the creation of a recommendation model 204 for recommending one or more treatment options for a patient to a care provider using the clinical recommendation system. Recommendation model 204 may be generated from and/or based on one or more sets of clinical guidelines 202 (e.g., clinical guidelines 150 of FIG. 1). Clinical guidelines 202 may be digital guidelines available online and/or on a computer system or network of a healthcare system using the clinical recommendation system. Clinical guidelines 202 may include different reference guidelines for different types of patients and/or pathologies. 
For example, a first set of clinical guidelines may relate to patients who have or are suspected of having cancer; a second set of clinical guidelines may relate to patients who have or are suspected of having an auto-immune disease; a third set of clinical guidelines may relate to patients who are suffering from traumatic wounds; and so on. In one example, clinical guidelines 202 includes the NCCN Guidelines®” (Paragraph 33), “Recommendation model 204 may be used for an LLM training step 212, and a prediction model training step 213. LLM training step 212 may rely on a first training data set, and prediction model training step 213 may rely on a second training data set, where the first training data set and the second training data set may both be generated with the aid of recommendation model 204” (Paragraph 36), “Method 500 begins at 502, where the method includes training a recommendation model based on one or more sets of clinical guidelines. Training the recommendation model may include mapping elements of the one or more sets of clinical guidelines to a hierarchical structure for modeling decisions, such as a decision tree. The hierarchical structure may then be traversed based on patient data, until reaching a decision point at which one or more courses of action may be recommended, based on the options available at the decision point. The decision tree/knowledge graph/recommendation model may be created in a rule-based framework from existing clinical guidelines, literature, domain knowledge of a disease and/or using routinely collected clinical training data to create a graph or tree based machine learning models in a supervised, semi-supervised, self-supervised or un-supervised machine learning framework. An example recommendation model implemented as a decision tree is shown in FIG. 8” (Paragraph 57), and “At 706, method 700 includes submitting the prompt to the LLM. 
The prompt may be inputted into the LLM, and the LLM may output a response to the prompt. The response may include clinically explainable information, such as why the patient is at a certain node or stage in the recommendation model (and guidelines). The response may further indicate advised or suggested next steps in accordance with clinical guidelines, based on the patient's position and existing clinical/pathological variables” (Paragraph 91). The examiner further notes that recommendations that are output from a trained LLM regarding patient treatment can be “initiated” by medical professionals.

Ghose does not explicitly teach:

C) (b) assigning a plurality of scores to a plurality of passages from one or more training unstructured data objects that correspond to the one or more questions;

D) (c) identifying, for a question of the one or more questions, one or more top ranking passages from the plurality of passages based on the plurality of scores; and

E) (d) generating the guideline-specific cross-reference data object based on the one or more top ranking passages.

Mass, however, teaches “(b) assigning a plurality of scores to a plurality of passages from one or more training unstructured data objects that correspond to the one or more questions” as “Each such H2H conversation typically includes multiple utterances (also “messages,” “turns,” or “rounds”) exchanged between the user and the agent, often starting with an initial query by the user, continuing with a series of clarification questions asked by the agent and answered by the user, and ending with the agent providing a resolution to the user's problem, commonly in the form of a hyperlink to a relevant solution document (stored in a solution documents database)” (Paragraph 38), “text passages that are relevant to each of the H2H conversations may be obtained from solution documents stored in the solution documents database.
For each H2H conversation, text passages are obtained from the solution document(s) that is/are associated with (namely, linked from or referred by) the respective conversation. This is because that solution document is likely a good source of relevant texts, per the decision of the human agent to select that document as the solution to the user's problem” (Paragraph 41), “Following the extraction of the candidate text passages, each extracted (retrieved) text passage p may be assigned a score based on the coverage of words in C^k by p. Coverage may be defined as the sum over all words in each utterance, using words' global IDF (Inverse Document Frequency) and their (scaled) TF (Term Frequency). In case there are multiple solution documents associated with C^k, that score may be combined (e.g., using linear combination, or any other type of combination) with the relevancy score of the solution document it is extracted from. Finally, top-r text passages may be selected, for each H2H conversation, as the output of step 204, namely—a predefined number (r, an integer) of text passages having the highest scores. r may be, for example, a value between 1-50, such as in a range between 1-10, 5-10, 1-20, 5-20, 1-30, 10-30, 1-40, 20-40, and so on and so forth” (Paragraphs 45-46), “Steps 202 through 206 provide, essentially, training sets for the first and second models. Specifically, for the first model, a training set 208 may include the H2H conversations (each with a labeled clarification question), as well as a negative example for each H2H conversation—a randomly-selected clarification question from the clarification questions database, which is not the labeled clarification question in that H2H conversation.
For the second model, a training set 210 may include the H2H conversations (each with a labeled clarification question), the text passages retrieved for each H2H conversation, and a negative example similar to that described above” (Paragraph 49).

Mass further teaches “(c) identifying, for a question of the one or more questions, one or more top ranking passages from the plurality of passages based on the plurality of scores” as “Each such H2H conversation typically includes multiple utterances (also “messages,” “turns,” or “rounds”) exchanged between the user and the agent, often starting with an initial query by the user, continuing with a series of clarification questions asked by the agent and answered by the user, and ending with the agent providing a resolution to the user's problem, commonly in the form of a hyperlink to a relevant solution document (stored in a solution documents database)” (Paragraph 38), “text passages that are relevant to each of the H2H conversations may be obtained from solution documents stored in the solution documents database. For each H2H conversation, text passages are obtained from the solution document(s) that is/are associated with (namely, linked from or referred by) the respective conversation. This is because that solution document is likely a good source of relevant texts, per the decision of the human agent to select that document as the solution to the user's problem” (Paragraph 41), and “Following the extraction of the candidate text passages, each extracted (retrieved) text passage p may be assigned a score based on the coverage of words in C^k by p. Coverage may be defined as the sum over all words in each utterance, using words' global IDF (Inverse Document Frequency) and their (scaled) TF (Term Frequency). In case there are multiple solution documents associated with C^k, that score may be combined (e.g., using linear combination, or any other type of combination) with the relevancy score of the solution document it is extracted from. Finally, top-r text passages may be selected, for each H2H conversation, as the output of step 204, namely—a predefined number (r, an integer) of text passages having the highest scores. r may be, for example, a value between 1-50, such as in a range between 1-10, 5-10, 1-20, 5-20, 1-30, 10-30, 1-40, 20-40, and so on and so forth” (Paragraphs 45-46).

Mass further teaches “(d) generating the guideline-specific cross-reference data object based on the one or more top ranking passages” as “Each such H2H conversation typically includes multiple utterances (also “messages,” “turns,” or “rounds”) exchanged between the user and the agent, often starting with an initial query by the user, continuing with a series of clarification questions asked by the agent and answered by the user, and ending with the agent providing a resolution to the user's problem, commonly in the form of a hyperlink to a relevant solution document (stored in a solution documents database)” (Paragraph 38), “text passages that are relevant to each of the H2H conversations may be obtained from solution documents stored in the solution documents database. For each H2H conversation, text passages are obtained from the solution document(s) that is/are associated with (namely, linked from or referred by) the respective conversation. This is because that solution document is likely a good source of relevant texts, per the decision of the human agent to select that document as the solution to the user's problem” (Paragraph 41), “Following the extraction of the candidate text passages, each extracted (retrieved) text passage p may be assigned a score based on the coverage of words in C^k by p. Coverage may be defined as the sum over all words in each utterance, using words' global IDF (Inverse Document Frequency) and their (scaled) TF (Term Frequency). In case there are multiple solution documents associated with C^k, that score may be combined (e.g., using linear combination, or any other type of combination) with the relevancy score of the solution document it is extracted from. Finally, top-r text passages may be selected, for each H2H conversation, as the output of step 204, namely—a predefined number (r, an integer) of text passages having the highest scores. r may be, for example, a value between 1-50, such as in a range between 1-10, 5-10, 1-20, 5-20, 1-30, 10-30, 1-40, 20-40, and so on and so forth” (Paragraphs 45-46), and “In a step 206, for each of the top-r text passages selected in the previous step, candidate clarification questions, that are relevant to both the respective text passage and the H2H conversation for which that text passage was retrieved, may be retrieved from a pool of clarification questions, stored in a clarification questions database. This may be performed, for each text passage p, by concatenating the text of all utterances in C^k to the content of p, and using these joint texts as a query to the clarification questions database (which may be operated by a conventional search engine). Step 206 thus results in a list of candidate clarification questions for each of the text passages. This list may be shortened, to include only the top-s candidate clarification questions, namely—a predefined number (s, an integer) of candidate clarification questions having the highest relevancy scores, as provided by the search engine. s may be, for example, a value between 1-50, such as in a range between 1-10, 5-10, 1-20, 5-20, 1-30, 10-30, 1-40, 20-40, and so on and so forth” (Paragraphs 47-48).
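The scoring scheme in the Mass excerpts above, coverage as a sum over utterance words of global IDF times scaled TF, followed by top-r selection, can be sketched as follows. The corpus, the smoothing in the IDF, and the sample data are illustrative assumptions rather than Mass's exact formulation.

```python
# Minimal sketch of IDF*TF "coverage" scoring with top-r passage selection.
import math
from collections import Counter

def idf(word, corpus):
    # global IDF over a document corpus (smoothed to avoid division by zero)
    df = sum(1 for doc in corpus if word in doc.lower().split())
    return math.log((1 + len(corpus)) / (1 + df))

def coverage_score(utterances, passage, corpus):
    # sum over all words in each utterance of IDF * scaled TF in the passage
    counts = Counter(passage.lower().split())
    total = sum(counts.values())
    score = 0.0
    for utterance in utterances:
        for word in utterance.lower().split():
            tf = counts[word] / total if total else 0.0  # scaled TF
            score += idf(word, corpus) * tf
    return score

def top_r_passages(utterances, passages, corpus, r=2):
    # keep the r passages with the highest coverage scores
    ranked = sorted(passages,
                    key=lambda p: coverage_score(utterances, p, corpus),
                    reverse=True)
    return ranked[:r]

passages = [
    "restart the print spooler to fix the printer offline error",
    "reset your password from the account page",
    "contact billing for invoice questions",
]
best = top_r_passages(["printer offline error"], passages, passages, r=1)
```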
The examiner further notes that the secondary reference of Mass teaches the concept of ascertaining top-ranked passages from unstructured documents (which are associated with H2H conversations that are part of training data). Such top-ranked passages are then used to determine clarification questions (i.e., an example of the undefined claimed guideline-specific cross-reference data object in the broadest reasonable interpretation). The combination would result in the guideline-specific cross-reference data object of Ghose being determined based on top-ranked passages ascertained from its question(s). It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant invention to combine the teachings of the cited references because Mass's teachings would have allowed Ghose to provide a method for using multiple models to generate questions, as noted by Mass (Paragraph 25).

Ghose and Mass do not explicitly teach:

F) (ii) generating one or more cross-reference embeddings based on the guideline-specific cross-reference data object; and

G) (iii) training the one or more learned parameters using the one or more cross-reference embeddings.

Mulligan, however, teaches “(ii) generating one or more cross-reference embeddings based on the guideline-specific cross-reference data object” as “Starting with block 602, during the training phase, a patient profile may be accessed for a patient having historical data associated therewith. That is, block 602 may include a database with a selected number of historical patient profiles data. The patient profile may be sent to a patient pathways features learning module 612 and used as input. The patient pathways features learning module 612 may learn (and produce as output) one or more models of patient pathways (e.g., patient pathway model “M1”), as in block 614.
That is, the patient pathways features learning module 612 may initialize a machine learning operation and implement one or more machine learning operations to learn features of one or more patient pathways. In one aspect, the patient pathways are extracted as an ordered sequence of events from the historical patient profiles data. A neural network may be trained using labeled data that may include 1) a vector representation of patient pathways, and/or 2) labels characterizing the type of pathways. One or more learned patient pathway models may be sent to the matching module 624, for commencing the runtime phase 620. Additionally, during the training phase 610, a collection of clinical guidelines 604 and/or feedback data 630 (if available from the runtime phase 620) may be sent to the clinical guidelines features learning module 616 and used as input. The clinical guidelines features learning module 616 may learn (and produce as output) one or more clinical guidelines models (e.g., CPG models or clinical guideline model “M2”), as in block 618. In one aspect, the CPGs may be available as documents (e.g., textual data). The clinical guidelines features learning module 616 may build a quantitative representation of each of the features of guidelines. In one embodiment, clinical guidelines features learning module 616 may use artificial intelligence, natural language processing (“NLP”), and/or one or more word embedding operations, which yield a vector-based model of the text. Additionally, the clinical guidelines features learning module 616 may use feedback data 630 from one or more domain experts 632 to improve the quality of the learned CPG model. The feedback data 630 may be used as labeled data at training time. 
The clinical guidelines features learning module 616 may send the one or more models of clinical guidelines (e.g., CPG models or clinical guideline model “M2”) to the matching module 624” (Paragraphs 113-114) and “(iii) training the one or more learned parameters using the one or more cross-reference embeddings” as “Starting with block 602, during the training phase, a patient profile may be accessed for a patient having historical data associated therewith. That is, block 602 may include a database with a selected number of historical patient profiles data. The patient profile may be sent to a patient pathways features learning module 612 and used as input. The patient pathways features learning module 612 may learn (and produce as output) one or more models of patient pathways (e.g., patient pathway model “M1”), as in block 614. That is, the patient pathways features learning module 612 may initialize a machine learning operation and implement one or more machine learning operations to learn features of one or more patient pathways. In one aspect, the patient pathways are extracted as an ordered sequence of events from the historical patient profiles data. A neural network may be trained using labeled data that may include 1) a vector representation of patient pathways, and/or 2) labels characterizing the type of pathways. One or more learned patient pathway models may be sent to the matching module 624, for commencing the runtime phase 620. Additionally, during the training phase 610, a collection of clinical guidelines 604 and/or feedback data 630 (if available from the runtime phase 620) may be sent to the clinical guidelines features learning module 616 and used as input. The clinical guidelines features learning module 616 may learn (and produce as output) one or more clinical guidelines models (e.g., CPG models or clinical guideline model “M2”), as in block 618. In one aspect, the CPGs may be available as documents (e.g., textual data). 
The clinical guidelines features learning module 616 may build a quantitative representation of each of the features of guidelines. In one embodiment, clinical guidelines features learning module 616 may use artificial intelligence, natural language processing (“NLP”), and/or one or more word embedding operations, which yield a vector-based model of the text. Additionally, the clinical guidelines features learning module 616 may use feedback data 630 from one or more domain experts 632 to improve the quality of the learned CPG model. The feedback data 630 may be used as labeled data at training time. The clinical guidelines features learning module 616 may send the one or more models of clinical guidelines (e.g., CPG models or clinical guideline model “M2”) to the matching module 624” (Paragraphs 113-114). The examiner further notes that the secondary reference of Mulligan teaches the concept of generating vectors (i.e. embeddings) of pathways and guidelines (i.e. examples of the undefined claimed guideline-specific cross-reference data object in the broadest reasonable interpretation) for subsequent training. The combination would result in vectorizing the guideline-specific cross-reference data objects of Ghose and Mass for subsequent training. It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant invention to combine the teachings of the cited references because the teachings of Mulligan would have allowed Ghose and Mass to provide a method for improving the generation of medical recommendations, as noted by Mulligan (Paragraph 22). Regarding claims 2, 10, and 17, Ghose further teaches a method, computing system, and one or more non-transitory computer-readable storage media comprising: A) wherein the one or more training unstructured data objects are associated with one or more training entities and the guideline data object (Paragraphs 33 and 36). 
The examiner notes that Ghose teaches “wherein the one or more training unstructured data objects are associated with one or more training entities and the guideline data object” as “First workflow 200 starts with the creation of a recommendation model 204 for recommending one or more treatment options for a patient to a care provider using the clinical recommendation system. Recommendation model 204 may be generated from and/or based on one or more sets of clinical guidelines 202 (e.g., clinical guidelines 150 of FIG. 1). Clinical guidelines 202 may be digital guidelines available online and/or on a computer system or network of a healthcare system using the clinical recommendation system. Clinical guidelines 202 may include different reference guidelines for different types of patients and/or pathologies. For example, a first set of clinical guidelines may relate to patients who have or are suspected of having cancer; a second set of clinical guidelines may relate to patients who have or are suspected of having an auto-immune disease; a third set of clinical guidelines may relate to patients who are suffering from traumatic wounds; and so on. In one example, clinical guidelines 202 includes the NCCN Guidelines®” (Paragraph 33) and “Recommendation model 204 may be used for an LLM training step 212, and a prediction model training step 213. LLM training step 212 may rely on a first training data set, and prediction model training step 213 may rely on a second training data set, where the first training data set and the second training data set may both be generated with the aid of recommendation model 204. For LLM training step 212, a natural language training data set 210 may be generated manually during a prompt generation stage 208 by human experts, who generate realistic natural language prompts and associated responses recommending treatments using recommendation model 204 as a guide. 
The LLM may be trained on data pairs including the realistic natural language prompts, where the associated responses are included as ground truth data” (Paragraph 36). The examiner further notes that the training data set 210 (which is unstructured) is “associated” with human experts (i.e. an example of the undefined claimed one or more training entities) and clinical guidelines (i.e. the claimed guideline data object). Regarding claims 13 and 20, Ghose further teaches a computing system, and one or more non-transitory computer-readable storage media comprising: A) wherein the predictive machine learning model comprises a supervised machine learning model that is trained using a labeled training dataset that comprises (i) one or more ground truth labels (Paragraph 20). The examiner notes that Ghose teaches “wherein the predictive machine learning model comprises a supervised machine learning model that is trained using a labeled training dataset that comprises (i) one or more ground truth labels” as “The LLM may be trained using a combination of supervised learning, with ground truth data generated manually based on the recommendation model, and reinforcement learning, where the LLM is optimized using a reward model” (Paragraph 20). The examiner further notes that a supervised ML model utilizes ground truth data as part of its training. Ghose and Mass do not explicitly teach: A) (i) the one or more cross-reference embeddings or one or more structured data embeddings. Mulligan, however, teaches “(i) the one or more cross-reference embeddings or one or more structured data embeddings” as “Starting with block 602, during the training phase, a patient profile may be accessed for a patient having historical data associated therewith. That is, block 602 may include a database with a selected number of historical patient profiles data. The patient profile may be sent to a patient pathways features learning module 612 and used as input. 
The patient pathways features learning module 612 may learn (and produce as output) one or more models of patient pathways (e.g., patient pathway model “M1”), as in block 614. That is, the patient pathways features learning module 612 may initialize a machine learning operation and implement one or more machine learning operations to learn features of one or more patient pathways. In one aspect, the patient pathways are extracted as an ordered sequence of events from the historical patient profiles data. A neural network may be trained using labeled data that may include 1) a vector representation of patient pathways, and/or 2) labels characterizing the type of pathways. One or more learned patient pathway models may be sent to the matching module 624, for commencing the runtime phase 620. Additionally, during the training phase 610, a collection of clinical guidelines 604 and/or feedback data 630 (if available from the runtime phase 620) may be sent to the clinical guidelines features learning module 616 and used as input. The clinical guidelines features learning module 616 may learn (and produce as output) one or more clinical guidelines models (e.g., CPG models or clinical guideline model “M2”), as in block 618. In one aspect, the CPGs may be available as documents (e.g., textual data). The clinical guidelines features learning module 616 may build a quantitative representation of each of the features of guidelines. In one embodiment, clinical guidelines features learning module 616 may use artificial intelligence, natural language processing (“NLP”), and/or one or more word embedding operations, which yield a vector-based model of the text. Additionally, the clinical guidelines features learning module 616 may use feedback data 630 from one or more domain experts 632 to improve the quality of the learned CPG model. The feedback data 630 may be used as labeled data at training time. 
The clinical guidelines features learning module 616 may send the one or more models of clinical guidelines (e.g., CPG models or clinical guideline model “M2”) to the matching module 624” (Paragraphs 113-114). The examiner further notes that the secondary reference of Mulligan teaches the concept of generating vectors (i.e. embeddings) of pathways and guidelines (i.e. examples of the undefined claimed guideline-specific cross-reference data object in the broadest reasonable interpretation) for subsequent training. The combination would result in vectorizing the guideline-specific cross-reference data objects of Ghose and Mass as a basis for the training in Ghose. It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant invention to combine the teachings of the cited references because the teachings of Mulligan would have allowed Ghose and Mass to provide a method for improving the generation of medical recommendations, as noted by Mulligan (Paragraph 22). Regarding claims 7 and 14, Ghose and Mass do not explicitly teach a method and computing system comprising: A) generating the one or more cross-reference embeddings based on term frequency-inverse document frequency, one-hot encoding, or character embeddings. Mulligan, however, teaches “generating the one or more cross-reference embeddings based on term frequency-inverse document frequency, one-hot encoding, or character embeddings” as “the optimal medical actions 420 and the one or more sections of existing CPGs 422 may be text based and text fragments may be converted to term frequency-inverse document frequency (“TF-IDF”) vectors where a cosine similarity between the optimal medical actions 420 and the one or more sections of existing CPGs 422 is determined” (Paragraph 92) and “Starting with block 602, during the training phase, a patient profile may be accessed for a patient having historical data associated therewith. 
That is, block 602 may include a database with a selected number of historical patient profiles data. The patient profile may be sent to a patient pathways features learning module 612 and used as input. The patient pathways features learning module 612 may learn (and produce as output) one or more models of patient pathways (e.g., patient pathway model “M1”), as in block 614. That is, the patient pathways features learning module 612 may initialize a machine learning operation and implement one or more machine learning operations to learn features of one or more patient pathways. In one aspect, the patient pathways are extracted as an ordered sequence of events from the historical patient profiles data. A neural network may be trained using labeled data that may include 1) a vector representation of patient pathways, and/or 2) labels characterizing the type of pathways. One or more learned patient pathway models may be sent to the matching module 624, for commencing the runtime phase 620. Additionally, during the training phase 610, a collection of clinical guidelines 604 and/or feedback data 630 (if available from the runtime phase 620) may be sent to the clinical guidelines features learning module 616 and used as input. The clinical guidelines features learning module 616 may learn (and produce as output) one or more clinical guidelines models (e.g., CPG models or clinical guideline model “M2”), as in block 618. In one aspect, the CPGs may be available as documents (e.g., textual data). The clinical guidelines features learning module 616 may build a quantitative representation of each of the features of guidelines. In one embodiment, clinical guidelines features learning module 616 may use artificial intelligence, natural language processing (“NLP”), and/or one or more word embedding operations, which yield a vector-based model of the text. 
Additionally, the clinical guidelines features learning module 616 may use feedback data 630 from one or more domain experts 632 to improve the quality of the learned CPG model. The feedback data 630 may be used as labeled data at training time. The clinical guidelines features learning module 616 may send the one or more models of clinical guidelines (e.g., CPG models or clinical guideline model “M2”) to the matching module 624” (Paragraphs 113-114). The examiner further notes that the secondary reference of Mulligan teaches the concept of generating vectors (i.e. embeddings) of pathways and guidelines (i.e. examples of the undefined claimed guideline-specific cross-reference data object in the broadest reasonable interpretation) for subsequent training via the use of TF-IDF. The combination would result in vectorizing the guideline-specific cross-reference data objects of Ghose and Mass for subsequent training. It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant invention to combine the teachings of the cited references because the teachings of Mulligan would have allowed Ghose and Mass to provide a method for improving the generation of medical recommendations, as noted by Mulligan (Paragraph 22). Regarding claims 8 and 15, Ghose further teaches a method and computing system comprising: A) wherein the guideline data object comprises a decision tree that is associated with one or more series of questions (Paragraph 19). The examiner notes that Ghose teaches “wherein the guideline data object comprises a decision tree that is associated with one or more series of questions” as “The recommendation model may be based on clinical guidelines used by the health care system, depending on a type of patient and/or pathology presented. 
For example, the National Comprehensive Cancer Network (NCCN) Clinical Practice Guidelines in Oncology (NCCN Guidelines®) may be a first set of clinical guidelines used for cancer patients. Additional sets of clinical guidelines may be used for specific types of cancer. For example, for patients with prostate cancer, the Prostate Imaging Reporting & Data System (PI-RADS®) may be used as a second set of clinical guidelines. In various embodiments, the recommendation model may be implemented as a decision tree or knowledge graph that may be traversed to determine a suggested treatment based on patient data” (Paragraph 19). The examiner further notes that a recommendation model is based on clinical guidelines implemented as decision trees that are associated with one or more series of questions. 11. Claims 3-4, 11-12, and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Ghose et al. (U.S. PGPUB 2025/0095642), in view of Mass et al. (U.S. PGPUB 2023/0029829), and further in view of Mulligan et al. (U.S. PGPUB 2021/0057098) as applied to claims 1-2, 7-10, 13-17, and 20 above, and further in view of Berglund et al. (U.S. PGPUB 2024/0403341). 12. Regarding claims 3, 11, and 18, Ghose, Mass, and Mulligan do not explicitly teach a method, computing system, and one or more non-transitory computer-readable storage media comprising: A) wherein generating the guideline-specific cross-reference data object further comprises: generating, using a retrieval machine learning model, one or more answers to the question based on the one or more top ranking passages; and B) combining the one or more answers to the question with a plurality of answers to other questions of the one or more questions from the guideline data object. 
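For illustration only (not part of the record), the retrieval flow recited in the limitation above can be sketched as follows: a prompt is assembled from a guideline question, the top-ranked passages, and the other guideline questions, and the resulting answer is combined with the answers to those other questions. All names here are hypothetical.

```python
# Hypothetical sketch of the claimed retrieval flow. The prompt wording
# and data shapes are illustrative assumptions, not the claimed method.
def build_prompt(question, top_passages, other_questions):
    """Assemble a retrieval prompt from a question, passages, and related questions."""
    parts = [f"Question: {question}", "Context passages:"]
    parts += [f"- {p}" for p in top_passages]
    if other_questions:
        parts.append("Also consider these related guideline questions:")
        parts += [f"- {q}" for q in other_questions]
    parts.append("Answer the question using only the passages above.")
    return "\n".join(parts)

def combine_answers(answer, other_answers):
    """Combine the new answer with answers to the other guideline questions."""
    return {"primary": answer, "related": list(other_answers)}
```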
Berglund, however, teaches “wherein generating the guideline-specific cross-reference data object further comprises: generating, using a retrieval machine learning model, one or more answers to the question based on the one or more top ranking passages” as “the query service 112 sends the LLM 140 a prompt that includes the search query and the text chunks matched to the search query, instructing the LLM 140 to generate an answer to the search query based on the matching text chunks. The prompt can further include the additional questions generated based on the query to improve the query answer generated by the LLM. When the prompt includes the additional questions, the prompt can instruct the LLM to generate a combined answer based on both the original user query and the additional questions, where the combined answer can be returned as the answer to the user query” (Paragraph 46) and “combining the one or more answers to the question with a plurality of answers to other questions of the one or more questions from the guideline data object” as “the query service 112 sends the LLM 140 a prompt that includes the search query and the text chunks matched to the search query, instructing the LLM 140 to generate an answer to the search query based on the matching text chunks. The prompt can further include the additional questions generated based on the query to improve the query answer generated by the LLM. When the prompt includes the additional questions, the prompt can instruct the LLM to generate a combined answer based on both the original user query and the additional questions, where the combined answer can be returned as the answer to the user query” (Paragraph 46). The examiner further notes that the secondary reference of Berglund teaches the concept of an LLM combining an answer (i.e. the claimed one or more answers) from a query and text chunks (i.e. passages in the broadest reasonable interpretation) with other answers to other questions. 
The combination would result in the other questions being questions from the guideline data object of Ghose in order to generate a combined answer via the top-ranked passages of Mass. It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant invention to combine the teachings of the cited references because the teachings of Berglund would have allowed Ghose, Mass, and Mulligan to provide a method for avoiding non-relevant answers to user queries, as noted by Berglund (Paragraph 12). Regarding claims 4, 12, and 19, Ghose, Mass, and Mulligan do not explicitly teach a method, computing system, and one or more non-transitory computer-readable storage media comprising: A) generating a model prompt comprising the question and the one or more top ranking passages; and B) providing the model prompt to the retrieval machine learning model to generate the one or more answers to the question based on the one or more top ranking passages. Berglund, however, teaches “generating a model prompt comprising the question and the one or more top ranking passages” as “the query service 112 sends the LLM 140 a prompt that includes the search query and the text chunks matched to the search query, instructing the LLM 140 to generate an answer to the search query based on the matching text chunks. The prompt can further include the additional questions generated based on the query to improve the query answer generated by the LLM. 
When the prompt includes the additional questions, the prompt can instruct the LLM to generate a combined answer based on both the original user query and the additional questions, where the combined answer can be returned as the answer to the user query” (Paragraph 46) and “providing the model prompt to the retrieval machine learning model to generate the one or more answers to the question based on the one or more top ranking passages” as “the query service 112 sends the LLM 140 a prompt that includes the search query and the text chunks matched to the search query, instructing the LLM 140 to generate an answer to the search query based on the matching text chunks. The prompt can further include the additional questions generated based on the query to improve the query answer generated by the LLM. When the prompt includes the additional questions, the prompt can instruct the LLM to generate a combined answer based on both the original user query and the additional questions, where the combined answer can be returned as the answer to the user query” (Paragraph 46). The examiner further notes that the secondary reference of Berglund teaches the concept of generating a prompt that is sent to an LLM that includes a query and text chunks (i.e. passages in the broadest reasonable interpretation) in order to obtain an answer. The combination would result in the question of Ghose and the top-ranked passages of Mass being included in such a prompt for subsequent answer generation. It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant invention to combine the teachings of the cited references because the teachings of Berglund would have allowed Ghose, Mass, and Mulligan to provide a method for avoiding non-relevant answers to user queries, as noted by Berglund (Paragraph 12). Allowable Subject Matter 13. Claim 5 would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 
112 (pre-AIA), second paragraph, set forth in this Office action and to include all of the limitations of the base claim and any intervening claims. Specifically, although the prior art (see Mulligan) generates embeddings that are used for subsequent training, the detailed claim language directed towards the defined mapping of the defined cross-reference embeddings and the generated structured embeddings to a common feature vector space that is then used for training the defined learned parameters is not found in the prior art, in conjunction with the rest of the limitations of the parent independent claim. Dependent claim 6 is deemed allowable for depending on the deemed allowable subject matter of dependent claim 5. Conclusion 14. The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. U.S. PGPUB 2020/0226164 (Eifert et al.), published 16 July 2020. The subject matter disclosed therein is pertinent to that of claims 1-20 (e.g., methods to determine medical actions based on guidelines). U.S. PGPUB 2025/0285729 (Pan), published 11 September 2025. The subject matter disclosed therein is pertinent to that of claims 1-20 (e.g., methods to determine medical actions based on guidelines). Contact Information 15. Any inquiry concerning this communication or earlier communications from the examiner should be directed to Mahesh Dwivedi whose telephone number is (571) 272-2731. The examiner can normally be reached on Monday to Friday 8:20 am – 4:40 pm. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Charles Rones, can be reached at (571) 272-4085. The fax number for the organization where this application or proceeding is assigned is (571) 273-8300. Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

Mahesh Dwivedi
Primary Examiner
Art Unit 2168
February 27, 2026

/MAHESH H DWIVEDI/
Primary Examiner, Art Unit 2168

Prosecution Timeline

May 14, 2024
Application Filed
Feb 27, 2026
Non-Final Rejection — §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12591818
FORECASTING AND MITIGATING CONCEPT DRIFT USING NATURAL LANGUAGE PROCESSING
2y 5m to grant Granted Mar 31, 2026
Patent 12585690
COMPUTER-READABLE RECORDING MEDIUM STORING INFORMATION VERIFICATION PROGRAM, INFORMATION PROCESSING APPARATUS, AND INFORMATION PROCESSING SYSTEM
2y 5m to grant Granted Mar 24, 2026
Patent 12561366
Real-Time Micro-Profile Generation Using a Dynamic Tree Structure
2y 5m to grant Granted Feb 24, 2026
Patent 12561469
INFERRING SCHEMA STRUCTURE OF FLAT FILE
2y 5m to grant Granted Feb 24, 2026
Patent 12554730
HYBRID DATABASE IMPLEMENTATIONS
2y 5m to grant Granted Feb 17, 2026
Based on 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
69%
Grant Probability
74%
With Interview (+4.3%)
3y 6m
Median Time to Grant
Low
PTA Risk
Based on 751 resolved cases by this examiner. Grant probability derived from career allow rate.
