Prosecution Insights
Last updated: April 19, 2026
Application No. 18/735,886

SYSTEMS AND METHOD FOR ENHANCED CONVERSATIONAL PERFORMANCE OF LARGE LANGUAGE MODELS USING ADAPTIVE RETRIEVAL-AUGMENTED GENERATION

Non-Final OA: §§ 101, 102, 103
Filed: Jun 06, 2024
Examiner: HASSAN, ALI MOHAMAD
Art Unit: 2653
Tech Center: 2600 (Communications)
Assignee: JPMorgan Chase Bank, N.A.
OA Round: 1 (Non-Final)
Grant Probability: 70% (Favorable)
OA Rounds (predicted): 1-2
Time to Grant (predicted): 2y 9m
Grant Probability with Interview: 99%

Examiner Intelligence

Career Allow Rate: 70% (7 granted / 10 resolved; +8.0% vs TC avg; above average)
Interview Lift: +33.3% among resolved cases with an interview (strong)
Avg Prosecution (typical timeline): 2y 9m; 19 applications currently pending
Total Applications (career history): 29 across all art units

Statute-Specific Performance

§101: 30.8% (-9.2% vs TC avg)
§103: 40.3% (+0.3% vs TC avg)
§102: 22.0% (-18.0% vs TC avg)
§112: 4.4% (-35.6% vs TC avg)
Comparisons use an estimated Tech Center average; based on career data from 10 resolved cases.

Office Action

Rejections under 35 U.S.C. §§ 101, 102, and 103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Claims 1, 9, and 17: Claim 1 recites:

A method, comprising: receiving, at a computer program, a query from a user; retrieving, by the computer program, a plurality of summaries of historical conversations from a database of historical conversation summaries similar to the query; generating, by the computer program, a first prompt comprising the query and the plurality of summaries; submitting, by the computer program, the first prompt to a first large language model (LLM); receiving, by the computer program and from the first LLM, a first response; presenting, by the computer program, the first response to the user; generating, by the computer program, a second prompt for a summary of the query and the first response; submitting, by the computer program, the second prompt to a second LLM; and saving, by the computer program, a second response to the second prompt from the second LLM to the database of historical conversation summaries, wherein the second response comprises the summary.
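Read as software, the claim-1 steps form a retrieval-augmented conversation loop. The following is only an illustrative sketch of that flow; the function names, the in-memory list standing in for the database, and the stubbed `call_llm` are hypothetical and do not come from the application or the cited art:

```python
summary_db = []  # stands in for the database of historical conversation summaries

def call_llm(prompt):
    # Hypothetical stand-in for any LLM API call (first or second LLM).
    return f"LLM output for: {prompt[:40]}"

def retrieve_similar(query, db, k=3):
    # Naive word-overlap similarity stands in for embedding-based search.
    words = set(query.lower().split())
    ranked = sorted(db, key=lambda s: -len(words & set(s.lower().split())))
    return ranked[:k]

def handle_query(query):
    summaries = retrieve_similar(query, summary_db)         # retrieving
    first_prompt = f"Context: {summaries}\nQuery: {query}"  # generating first prompt
    first_response = call_llm(first_prompt)                 # submitting / receiving
    print(first_response)                                   # presenting to the user
    second_prompt = f"Summarize: Q: {query} A: {first_response}"
    summary = call_llm(second_prompt)                       # second LLM summarizes
    summary_db.append(summary)                              # saving the summary
    return first_response
```

Each comment maps one line back to a recited limitation; the two LLM calls may hit the same or different models, which is the distinction claims 7 and 15 address.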
Claim 17 further recites: A system, comprising: a user electronic device; a database comprising historical conversation summaries.

The limitations of "receiving…", "retrieving…", "generating…", "submitting…", "receiving…", "presenting…", "generating…", "submitting…", and "saving…", as drafted, describe a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. For example, a person receiving a query from a user could answer the query based on past conversations for which he created a summary of each query. After answering the question, he then creates a summary of the question and answer pair that he just answered, and afterwards keeps a bullet-point list of summaries of the queries/topics discussed and answered. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the "Mental Processes" grouping of abstract ideas. Accordingly, the claims recite an abstract idea.

This judicial exception is not integrated into a practical application. In particular, the claims only recite additional elements that are computer components, namely a "device" (paragraph 31), an "LLM" (paragraph 33), and a "database" (paragraph 73), recited at a high level of generality, such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims are directed to an abstract idea. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception.
As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using the computer components amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claims are not patent eligible.

Claims 2 and 10: Claim 2 recites the method of claim 1, further comprising: generating, by the computer program, an embedding vector for the query; wherein the computer program retrieves the plurality of summaries of historical conversations from the database of historical conversation summaries using the embedding vector for the query. However, this limitation does not prevent a human from performing the steps mentally as described above: the person associates a numerical value with the query, and based on this numerical value he is able to look through his bullet-point list to see if there is any summary associated with that particular query. Thus, these claims are directed towards a mental process. As above, no additional limitations are provided that supply a practical application or amount to significantly more than the abstract idea. Therefore, the claims are not patent eligible.

Claims 3 and 11: Claim 3 recites the method of claim 2, wherein the computer program compares the embedding vector for the query to embedding vectors for each of the plurality of summaries. However, these limitations encompass a person answering the query by looking at his bullet-point list to see if he has one like it to use to answer the query. As above, no additional limitations are provided that supply a practical application or amount to significantly more than the abstract idea. Therefore, the claims are not patent eligible.
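The embedding-and-compare behavior recited in claims 2 and 3 can be sketched with a toy bag-of-words "embedding" in place of a learned model; everything here (the `embed`, `cosine`, and `top_k` names, and the use of word counts as vectors) is an illustrative assumption, not the application's actual implementation:

```python
import math
from collections import Counter

def embed(text):
    # Toy embedding: a word-count vector. A real system would use a
    # learned embedding model; this is illustration only.
    return Counter(text.lower().split())

def cosine(u, v):
    # Cosine similarity between two sparse count vectors.
    dot = sum(u[w] * v[w] for w in u)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def top_k(query, summaries, k=2):
    # Claim 3/4 behavior: compare the query's embedding against each
    # summary's embedding and keep the k closest.
    q = embed(query)
    ranked = sorted(summaries, key=lambda s: cosine(q, embed(s)), reverse=True)
    return ranked[:k]
```

For instance, `top_k("current mortgage rates", [...])` would rank summaries mentioning "mortgage" and "rates" ahead of unrelated small talk.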
Claims 4 and 12: Claim 4 recites the method of claim 3, wherein a certain number of summaries having embedding vectors with values closest to a value for the embedding vector for the query are retrieved. However, these limitations encompass a person answering the query by looking at his bullet-point list and using the summary that is closest to the query. Thus, the claim is directed towards a mental process. As above, no additional limitations are provided that supply a practical application or amount to significantly more than the abstract idea. Therefore, the claims are not patent eligible.

Claims 5, 13, and 19: Claim 5 recites the method of claim 1, wherein the summary is saved to the database of historical conversation summaries in response to an embedding vector for the summary being distinct from embedding vectors for the plurality of summaries in the database. However, these limitations encompass a person answering a query and generating a summary with a title/subject to add to the bullet-point list, and only adding it if it is distinct from the other summaries. Thus, the claim is directed towards a mental process. As above, no additional limitations are provided that supply a practical application or amount to significantly more than the abstract idea. Therefore, the claims are not patent eligible.

Claims 6, 14, and 20: Claim 6 recites the method of claim 1, further comprising: removing, by the computer program, one of the plurality of summaries in the database in response to an embedding vector for the summary having a value that is similar to an embedding vector for the one of the plurality of summaries. However, these limitations encompass a person looking through the bullet-point list and removing duplicates. Thus, the claim is directed towards a mental process.
As above, no additional limitations are provided that supply a practical application or amount to significantly more than the abstract idea. Therefore, the claim is not patent eligible.

Claims 7 and 15: Claim 7 recites the method of claim 1, wherein the first LLM and the second LLM are the same LLM. However, these limitations encompass a person receiving a query from a user and answering it based on past conversations he had with them, afterwards keeping a bullet-point list of summaries of the queries/topics discussed and answered. In particular, the claim only recites an additional element that is a computer component, an "LLM" (paragraph 33), recited at a high level of generality, such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claims are directed to an abstract idea.

Claims 8 and 16: Claim 8 recites the method of claim 1, wherein the computer program comprises a user interface computer program and a prompt generator computer program. However, these limitations encompass a person receiving and answering a query and generating a query/prompt. Thus, the claim is directed towards a mental process. In particular, the claim only recites an additional element that is a "user interface", which is insignificant pre-solution activity (receiving something from the user). Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claims are directed to an abstract idea.

Claim 18 contains limitations similar to those found in claims 2, 3, and 4, and is therefore not patent eligible for the same reasons.
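The distinctness and deduplication behavior of claims 5 and 6 (save only when the new summary's embedding is distinct; remove near-duplicate stored summaries) can be sketched as follows. The toy word-count embedding, the function names, and the 0.9 similarity threshold are all illustrative assumptions, not values from the application:

```python
import math
from collections import Counter

def embed(text):
    # Toy embedding: word-count vector (illustration only).
    return Counter(text.lower().split())

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def save_if_distinct(summary, db, threshold=0.9):
    # Claim-5 behavior: save only if the new summary's embedding is
    # distinct from every stored summary's embedding.
    if all(cosine(embed(summary), embed(s)) < threshold for s in db):
        db.append(summary)

def prune_duplicates(db, threshold=0.9):
    # Claim-6 behavior: drop a stored summary whose embedding is too
    # similar to one already kept (first occurrence wins).
    kept = []
    for s in db:
        if all(cosine(embed(s), embed(k)) < threshold for k in kept):
            kept.append(s)
    return kept
```

Saving the same summary twice leaves one copy in the database, and pruning collapses near-identical entries while preserving distinct ones.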
Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention. (a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 8, 9, 16, and 17 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by US 2024/0414108 A1 (Sun, Haowen).

Claim 1

Regarding Claim 1, Sun teaches a method, comprising:

receiving, at a computer program, a query from a user (see figures 4 (502), 4A (402), and 6A (644), which show the prompt, i.e., the query; paragraph 77: "In operation 402, the chatbot system 300 receives, via a user system 306 (of FIG. 3), a prompt 302 of a user 334 during an interactive session. For example, the user 334 uses user system 306 to access an interactive platform that hosts the chatbot system 300. The user 334 enters a prompt 302 into the user system 306 and the user system 306 communicates the prompt 302 to the chatbot system 300.
In some examples, the prompt 302 may include other types of data as well as text such as, but not limited to, image data, video data, audio data, electronic documents, links to data stored on the Internet or the user system 306, and the like. In addition, the prompt 302 may include media such as, but not limited to, audio media, image media, video media, textual media, and the like. Regardless of the data type of the prompt 302, keyword attribution and expansion may be used to automatically generate a cluster of keywords or attributes that are associated with the received prompt 302. For example, image recognition may be deployed to identify objects and location associated with visual media and image data and to generate a keyword cluster or cloud that is then associated with the image-based prompt.")

retrieving, by the computer program, a plurality of summaries of historical conversations from a database of historical conversation summaries similar to the query (see figures 5 and 6A: Fig. 5 "chat history input 508" teaches the summaries of historical conversations; Fig. 6A shows the "conversation history database 620" being subjected to the "summarizing component 616", which uses the "summarizing model 648" to generate "summaries 632"; paragraph 109: "In operation 604, the chatbot system 300 generates summarized memories 632 using the conversation history 628.
For example, the chatbot system 300 uses a summarizing component 616 to generate summarized memories 632 from the conversation history 628 using artificial intelligence methodologies and a summarizing model 648 that was previously generated using machine learning methodologies…." The summarized memories 632 are further processed and stored in a "memories datastore 622"; paragraph 110: "In operation 606, the chatbot system 300 generates moderated memories 634 using the summarized memories 632 and stores the memories 634 into a memories datastore 622…."; paragraph 122: "In some examples, the retrieving component 624 retrieves relevant information from all stored user knowledge, and only sends relevant contexts to the generative AI model 332. Instead of instructing the generative AI model 332 to do the filtering. In some examples, the retrieving component 624 determines contextual important memories in the memories datastore 622 using a semantic search process.")

generating, by the computer program, a first prompt comprising the query and the plurality of summaries (see figure 6A: the "augmented prompt model 652" generates an "augmented prompt", the first prompt of the claim, from the input prompt, the query of the claim; see paragraphs 118-119, which explain how the initial input query/prompt is augmented with the "current conversation context 638" to arrive at the "augmented prompt 640", the first prompt of the claim; paragraph 120: "In operation 612, the chatbot system 300 generates an augmented prompt 640 using the context-appropriate memories 636 and the user prompt 644 and communicates the augmented prompt 640 to the generative AI model 332. For example, the chatbot system 300 uses the retrieving component 624 to generate the augmented prompt 640 using the user prompt 644, the context-appropriate memories 636, and an augmented prompt model 652 that was previously generated using machine learning methodologies.
In some examples, the augmented prompt model 652 includes, but is not limited to, a generative AI model, a neural network, a learning vector quantization network, a logistic regression model, a support vector machine, a random decision forest, a naïve Bayes model, a linear discriminant analysis model, and a K-nearest neighbor model. In some examples, machine learning methodologies used to generate the augmented prompt model 652 may include, but are not limited to, supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, dimensionality reduction, self-learning, feature learning, sparse dictionary learning, and anomaly detection.")

submitting, by the computer program, the first prompt to a first large language model (LLM) (see figures 4B and 6A: the system communicates the augmented prompt 640 to the generative AI model 332; the "generative AI model 332" teaches the first LLM of the claim; paragraph 120: "In operation 612, the chatbot system 300 generates an augmented prompt 640 using the context-appropriate memories 636 and the user prompt 644 and communicates the augmented prompt 640 to the generative AI model 332. For example, the chatbot system 300 uses the retrieving component 624 to generate the augmented prompt 640 using the user prompt 644, the context-appropriate memories 636, and an augmented prompt model 652 that was previously generated using machine learning methodologies. In some examples, the augmented prompt model 652 includes, but is not limited to, a generative AI model, a neural network, a learning vector quantization network, a logistic regression model, a support vector machine, a random decision forest, a naïve Bayes model, a linear discriminant analysis model, and a K-nearest neighbor model.
In some examples, machine learning methodologies used to generate the augmented prompt model 652 may include, but are not limited to, supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, dimensionality reduction, self-learning, feature learning, sparse dictionary learning, and anomaly detection.")

receiving, by the computer program and from the first LLM, a first response (see figures 4B and 6A: the "chatbot response 308" output by the "generative AI model 332" teaches the first response; paragraph 96: "In operation 418, the chatbot system 300 receives a chatbot response 308 from the generative AI model 332.")

presenting, by the computer program, the first response to the user (paragraph 97: "In operation 420, the chatbot system 300 provides the chatbot response 308 to the user 334 via the user system 306.")

generating, by the computer program, a second prompt for a summary of the query and the first response (see figure 6A, which shows that the "chatbot system core 646" is in communication with the "conversation history 620" to provide the questions and answers back to this database for storage; paragraph 106: "During one or more conversations between a user 334 and the chatbot system 300, the chatbot system 300 stores conversation histories 630 into a conversation history datastore 620. For example, the conversation histories 630 include sequences of user prompts 644, augmented prompts 640 and personalized responses 642 by generative AI model 332 during the one or more conversations." The storing of the summaries by the chatbot system 300 teaches the generation of the second prompt of the claim by the system; paragraph 109: "In operation 604, the chatbot system 300 generates summarized memories 632 using the conversation history 628.
For example, the chatbot system 300 uses a summarizing component 616 to generate summarized memories 632 from the conversation history 628 using artificial intelligence methodologies and a summarizing model 648 that was previously generated using machine learning methodologies…." Abstract: "… The chatbot system retrieves a conversation history of one or more conversations between a user and a chatbot from a conversation history datastore and generates one or more summarized memories using the conversation history. …" To retrieve the histories, they must have been stored previously, as shown in figure 6A.)

submitting, by the computer program, the second prompt to a second LLM (see figure 6A, which shows that the "chatbot system core 646" is in communication with the "conversation history 620" to provide the questions and answers back to this database for storage; the sending of the "conversation history 630" to the "datastore 620" teaches the submitting of this claim; the second LLM is taught by the "summarizing component 616" and its "summarizing model 648", which are AI models; paragraph 109: "In operation 604, the chatbot system 300 generates summarized memories 632 using the conversation history 628. For example, the chatbot system 300 uses a summarizing component 616 to generate summarized memories 632 from the conversation history 628 using artificial intelligence methodologies and a summarizing model 648 that was previously generated using machine learning methodologies….")

and saving, by the computer program, a second response to the second prompt from the second LLM to the database of historical conversation summaries, wherein the second response comprises the summary (see figure 6A, saving the summaries, i.e., the second response, in the "memories database 622"; paragraph 110: "In operation 606, the chatbot system 300 generates moderated memories 634 using the summarized memories 632 and stores the memories 634 into a memories datastore 622.
For example, the chatbot system 300 uses the moderating component 618 to eliminate specified content from the summarized memories 632, for instance obscene words or concepts, or content that some may consider harmful from the summarized memories 632 using artificial intelligence methodologies and a moderating model 650 that was previously generated using machine learning methodologies. In some examples, the moderating model 650 includes, but is not limited to, a generative AI model, a neural network, a learning vector quantization network, a logistic regression model, a support vector machine, a random decision forest, a naïve Bayes model, a linear discriminant analysis model, and a K-nearest neighbor model. In some examples, machine learning methodologies used to generate the moderating model 650 may include, but are not limited to, supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, dimensionality reduction, self-learning, feature learning, sparse dictionary learning, and anomaly detection.")

Claim 9

Regarding Claim 9, Sun teaches a user electronic device (paragraph 77: "In operation 402, the chatbot system 300 receives, via a user system 306 (of FIG. 3), a prompt 302 of a user 334 during an interactive session. For example, the user 334 uses user system 306 to access an interactive platform that hosts the chatbot system 300. The user 334 enters a prompt 302 into the user system 306 and the user system 306 communicates the prompt 302 to the chatbot system 300. In some examples, the prompt 302 may include other types of data as well as text such as, but not limited to, image data, video data, audio data, electronic documents, links to data stored on the Internet or the user system 306, and the like. In addition, the prompt 302 may include media such as, but not limited to, audio media, image media, video media, textual media, and the like.
Regardless of the data type of the prompt 302, keyword attribution and expansion may be used to automatically generate a cluster of keywords or attributes that are associated with the received prompt 302. For example, image recognition may be deployed to identify objects and location associated with visual media and image data and to generate a keyword cluster or cloud that is then associated with the image-based prompt.")

a database comprising historical conversation summaries (paragraph 109: "In operation 604, the chatbot system 300 generates summarized memories 632 using the conversation history 628. For example, the chatbot system 300 uses a summarizing component 616 to generate summarized memories 632 from the conversation history 628 using artificial intelligence methodologies and a summarizing model 648 that was previously generated using machine learning methodologies. In some examples, the summarizing model 648 includes, but is not limited to, a generative AI model, a neural network, a learning vector quantization network, a logistic regression model, a support vector machine, a random decision forest, a naïve Bayes model, a linear discriminant analysis model, and a K-nearest neighbor model. In some examples, machine learning methodologies used to generate the summarizing model 648 may include, but are not limited to, supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, dimensionality reduction, self-learning, feature learning, sparse dictionary learning, and anomaly detection."
Paragraph 34: "In some examples, the chatbot system generates the augmented prompt by generating a current conversation context from a current conversation between the user and the chatbot, retrieving, from the memories datastore, one or more stored memories using the current conversation context, and generating the augmented prompt using the one or more stored memories.")

Claim 9 is otherwise similar to claim 1 and is therefore rejected for similar reasons as claim 1.

Claim 17

Regarding Claim 17, Sun teaches a non-transitory computer readable storage medium, including instructions stored thereon, which when read and executed by one or more computer processors, cause the one or more computer processors to perform steps comprising: (paragraph 38: "In some examples, the methodologies described herein relate to a machine-readable medium storing machine-executable instructions that, when executed by a machine, cause the machine to perform operations including: retrieving, by one or more processors, from a conversation history datastore, a conversation history of one or more conversations between a user and a chatbot; generating one or more summarized memories using the chat history; generating one or more moderated memories using the summarized memories; storing the one or more moderated memories into a memories datastore storing one or more memories; receiving a user prompt from the user; generating a current conversation context from a current conversation between the user and the chatbot; retrieving, by one or more processors, from the memories datastore, one or more memories using the current conversation context; generating an augmented prompt using the user prompt and the one or more memories; communicating the augmented prompt to a generative AI model; receiving a response from the generative AI model to the augmented prompt; and providing the response to the user.")

Claim 17 is otherwise similar to claim 1 and is therefore rejected for similar reasons as claim 1.
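Sun's summarize-then-moderate-then-store pipeline (operations 604 and 606, quoted above) can be paraphrased as a short sketch. The function bodies are hypothetical stand-ins for Sun's summarizing model 648 and moderating model 650, which are machine-learned models, not string filters; only the sequencing mirrors the reference:

```python
def summarizing_model(history):
    # Stand-in for Sun's summarizing model 648: one summary per turn.
    return [f"summary of: {turn}" for turn in history]

def moderating_model(memories, banned=("obscene",)):
    # Stand-in for Sun's moderating model 650: drop flagged content.
    return [m for m in memories if not any(b in m for b in banned)]

def build_memories(conversation_history):
    summarized = summarizing_model(conversation_history)  # operation 604
    moderated = moderating_model(summarized)              # operation 606
    memories_datastore = list(moderated)                  # stored per operation 606
    return memories_datastore
```

A later prompt would then be augmented with entries retrieved from `memories_datastore` (operations 608-612 in Sun).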
Claims 8 and 16

Regarding Claims 8 and 16, Sun teaches the method of claim 1, wherein the computer program comprises a user interface computer program and a prompt generator computer program (paragraph 77: "In operation 402, the chatbot system 300 receives, via a user system 306 (of FIG. 3), a prompt 302 of a user 334 during an interactive session. For example, the user 334 uses user system 306 to access an interactive platform that hosts the chatbot system 300. The user 334 enters a prompt 302 into the user system 306 and the user system 306 communicates the prompt 302 to the chatbot system 300. In some examples, the prompt 302 may include other types of data as well as text such as, but not limited to, image data, video data, audio data, electronic documents, links to data stored on the Internet or the user system 306, and the like. In addition, the prompt 302 may include media such as, but not limited to, audio media, image media, video media, textual media, and the like. Regardless of the data type of the prompt 302, keyword attribution and expansion may be used to automatically generate a cluster of keywords or attributes that are associated with the received prompt 302. For example, image recognition may be deployed to identify objects and location associated with visual media and image data and to generate a keyword cluster or cloud that is then associated with the image-based prompt." Paragraph 120: "In operation 612, the chatbot system 300 generates an augmented prompt 640 using the context-appropriate memories 636 and the user prompt 644 and communicates the augmented prompt 640 to the generative AI model 332. For example, the chatbot system 300 uses the retrieving component 624 to generate the augmented prompt 640 using the user prompt 644, the context-appropriate memories 636, and an augmented prompt model 652 that was previously generated using machine learning methodologies.
In some examples, the augmented prompt model 652 includes, but is not limited to, a generative AI model, a neural network, a learning vector quantization network, a logistic regression model, a support vector machine, a random decision forest, a naïve Bayes model, a linear discriminant analysis model, and a K-nearest neighbor model. In some examples, machine learning methodologies used to generate the augmented prompt model 652 may include, but are not limited to, supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, dimensionality reduction, self-learning, feature learning, sparse dictionary learning, and anomaly detection." The interactive session would teach the user interface computer program; the augmented prompt would teach the prompt generator.)

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:

1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 2, 3, 4, 10, 11, 12, and 18 are rejected under 35 U.S.C. 103 as obvious over US 2024/0414108 A1 (Sun, Haowen) in view of US 2025/0310280 A1 (D'Agostino, Dino Paul).

Claims 2 and 10

Regarding Claims 2 and 10, Sun does not explicitly teach all of the method of claim 1, further comprising: generating, by the computer program, an embedding vector for the query; wherein the computer program retrieves the plurality of summaries of historical conversations from the database of historical conversation summaries using the embedding vector for the query.

However, D'Agostino teaches generating, by the computer program, an embedding vector for the query (paragraph 138: "Referring to FIG. 6B, the attributes/search criteria may be identified as text values. The text values may be converted into a vector via execution of an LLM 650 or the like which embeds the text values into a vector 652. For example, the LLM 650 may be a transformer neural network with an encoder/decoder framework which can embed a block of text into a single vector. Here, the LLM 650 may convert a block of text, such as a sentence, phrase, combination of words, word, or the like, into a multi-dimensional vector. In this example, the vector 652 is pointing in a direction 654 within a vector space 660. The output of the LLM 650 may be transferred to the retriever 642.
Next, the retriever 642 may compare the direction 654 of the vector 652 to the direction of other vectors from the vector storage 644 which are mapped into the vector space 660. For example, the retriever 642 may perform a cosine similarity analysis and identify a vector 662, a vector 664, and a vector 666 which are pointing in the same direction as the vector 652 (within the predetermined amount of degrees). As a result, the retriever 642 determines that the vector 662, the vector 664, and the vector 666 are each a match to the contextual attributes included in the search criteria.")

D'Agostino also teaches wherein the computer program retrieves the plurality of summaries of historical conversations from the database of historical conversation summaries using the embedding vector for the query. (Paragraph 136: "According to various embodiments, the contextual attributes that are identified by the LLM 634 and the LLM 635 may be fed to the retriever 642 as search criteria. In this example, the retriever 642 may identify one or more vectors within the vector storage 644 that are similar to the search criteria (e.g., the item of interest, the sentiment, etc.) and retrieve these vectors from the vector storage 644. In the example of FIG. 6A, the retriever 642 identifies a subset 648 of vectors which are identified by comparing the search criteria to the vectors stored within the vector storage 644. For example, the search criteria may be converted into a vector and then compared to the vectors within the vector storage 644 to identify whether any matches exist." Paragraph 139: "Referring again to FIG. 6A, the retriever 642 may retrieve the vector 662, the vector 664, and the vector 666 from the vector storage 644 and forward the vectors to the LLM 646 for additional processing. Here, the LLM 646 may generate a response for the conversation between the source device 610 and the service provider device 620 and provide the response to the software application 632.
Here, the software application 632 may output the response from the LLM 646 during a communication session between the source device 610 and the service provider device 620.")

It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Sun to incorporate the teachings of D'Agostino to provide generating an embedding vector for the query and retrieving the plurality of summaries using that embedding vector. Doing so would create a response that is more narrowly tailored toward a particular context, as recognized by D'Agostino (Paragraph 131).

Regarding Claims 3 and 11: Sun does not explicitly teach the method of claim 2, wherein the computer program compares the embedding vector for the query to embedding vectors for each of the plurality of summaries.

However, D'Agostino teaches the method of claim 2, wherein the computer program compares the embedding vector for the query to embedding vectors for each of the plurality of summaries. (Paragraph 146: "In one embodiment, an apparatus comprising a memory and a processor coupled to the memory is configured to convert search criteria into a search criteria vector and identify the subset of vectors based on comparing the search criteria vector and the plurality of vectors in vector space. Upon receiving interaction content from a communication session between a source device and a service provider device, the processor identifies specific search criteria from the content. The search criteria serve as the basis for selecting relevant vectors from a vector database, which comprises a plurality of vectors representing various aspects of previous interactions between users and the service provider.
To convert the identified search criteria into a search criteria vector, the processor employs mathematical techniques to represent the search criteria in a vector format suitable for comparison with the vectors stored in the vector database. The conversion process ensures that the search criteria are represented in a manner consistent with the vectors in the database, facilitating effective comparison and retrieval of the subset of vectors. Once the search criteria vector is generated, the processor compares it with the plurality of vectors in vector space. The comparison identifies vectors in the database that closely match the search criteria vector, selecting a subset of vectors relevant to the current communication session. The processor retrieves the subset of vectors from the vector database, comprising previous interaction content between the user and the service provider. Following the retrieval of the vector subset, the processor executes a LLM on these vectors to generate a response for the ongoing communication session. The generated response is outputted to at least one of the source and service provider devices.") See Claims 2 and 10 for the rationale.

Regarding Claims 4 and 12: Sun does not explicitly teach the method of claim 3, wherein a certain number of summaries having embedding vectors with values closest to a value for the embedding vector for the query are retrieved.

However, D'Agostino teaches the method of claim 3, wherein a certain number of summaries having embedding vectors with values closest to a value for the embedding vector for the query are retrieved. (Paragraph 136: "According to various embodiments, the contextual attributes that are identified by the LLM 634 and the LLM 635 may be fed to the retriever 642 as search criteria. In this example, the retriever 642 may identify one or more vectors within the vector storage 644 that are similar to the search criteria (e.g., the item of interest, the sentiment, etc.)
and retrieve these vectors from the vector storage 644. In the example of FIG. 6A, the retriever 642 identifies a subset 648 of vectors which are identified by comparing the search criteria to the vectors stored within the vector storage 644. For example, the search criteria may be converted into a vector and then compared to the vectors within the vector storage 644 to identify whether any matches exist."

Paragraph 137: "As an example, the comparison may be performed based on a cosine similarity, or the like, within vector space. The cosine similarity may identify vectors that are pointed in the same direction (or roughly in the same direction). For example, referring to FIG. 6B, if a vector within the vector space 660 is pointing in a direction that is within a predetermined amount of degrees (e.g., +/−1 degrees) of the search criteria vector, it may be considered a match to the search criteria. FIG. 6B illustrates an example of a process 600B of the retriever 642 performing a comparison of the search criteria to the vectors in the vector space 660."

Paragraph 138: "Referring to FIG. 6B, the attributes/search criteria may be identified as text values. The text values may be converted into a vector via execution of an LLM 650 or the like which embeds the text values into a vector 652. For example, the LLM 650 may be a transformer neural network with an encoder/decoder framework which can embed a block of text into a single vector. Here, the LLM 650 may convert a block of text, such as a sentence, phrase, combination of words, word, or the like, into a multi-dimensional vector. In this example, the vector 652 is pointing in a direction 654 within a vector space 660. The output of the LLM 650 may be transferred to the retriever 642. Next, the retriever 642 may compare the direction 654 of the vector 652 to the direction of other vectors from the vector storage 644 which are mapped into the vector space 660.
For example, the retriever 642 may perform a cosine similarity analysis and identify a vector 662, a vector 664, and a vector 666 which are pointing in the same direction as the vector 652 (within the predetermined amount of degrees). As a result, the retriever 642 determines that the vector 662, the vector 664, and the vector 666 are each a match to the contextual attributes included in the search criteria." Paragraph 139: "Referring again to FIG. 6A, the retriever 642 may retrieve the vector 662, the vector 664, and the vector 666 from the vector storage 644 and forward the vectors to the LLM 646 for additional processing. Here, the LLM 646 may generate a response for the conversation between the source device 610 and the service provider device 620 and provide the response to the software application 632. Here, the software application 632 may output the response from the LLM 646 during a communication session between the source device 610 and the service provider device 620.") See Claims 2 and 10 for the rationale.

Claim 18 contains limitations similar to those found in claims 2, 3, and 4 and is therefore rejected for the same reasons.

Claims 5, 13, and 19 are rejected under 35 U.S.C. 103 as obvious over US 20240414108 A1 (Sun; Haowen) in view of US 20250310303 A1 (Xu; Tao).

Regarding Claims 5, 13, and 19: Sun does not explicitly teach the method of claim 1, wherein the summary is saved to the database of historical conversation summaries in response to an embedding vector for the summary being distinct from embedding vectors for the plurality of summaries in the database.

However, Xu teaches the method of claim 1, wherein the summary is saved to the database of historical conversation summaries in response to an embedding vector for the summary being distinct from embedding vectors for the plurality of summaries in the database.
(Paragraph 70: "Here, the proxy server 100 receives a new query 410n from a client device 40. The proxy server 100 processes the new query 410n, using the text encoder 110, to generate an embedding vector 420n of the new query 410n. Proxy server 100 then performs a vector search on the vector database 120 with respect to the new query vector 420n. In this case, the proxy server 100 does not identify an entry in the vector database 120 that includes a query vector 420 that is similar to the new query vector 420n, as measured by a query similarity metric. The proxy server 100 then forwards the new query 410n to the web server 200. The proxy server 100 receives a true response 412t to the new query 410n, from the web server 200, that was generated by the neural network 210. The proxy server 100 then initializes a new entry 122n in the vector database 120 with the new query vector 420n and the true response 412t, and subsequently transmits the true response 412t to the client device 40.")

It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Sun to incorporate the teachings of Xu to provide saving the summary to the database in response to its embedding vector being distinct from the embedding vectors for the plurality of summaries in the database. Doing so would increase the speed of vector searches, as recognized by Xu (Paragraph 131).

Claims 6, 14, and 20 are rejected under 35 U.S.C. 103 as obvious over US 20240414108 A1 (Sun; Haowen) in view of US 20250131040 A1 (Wang; Zijia).
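The mechanism described in the D'Agostino and Xu citations above (embed the query, cosine-match it against stored vectors, and add a new entry only when no sufficiently similar entry already exists) can be sketched roughly as follows. This is an illustrative reconstruction, not code from any cited application; the class name, method names, and the 0.95 similarity threshold are assumptions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

class SummaryStore:
    """Toy vector store: retrieve nearest summaries; save only distinct ones."""

    def __init__(self, threshold=0.95):
        self.entries = []           # list of (embedding, summary_text)
        self.threshold = threshold  # similarity at/above which two entries "match"

    def retrieve(self, query_vec, k=3):
        """Return up to k summary texts whose embeddings are closest to the query."""
        ranked = sorted(self.entries,
                        key=lambda e: cosine_similarity(query_vec, e[0]),
                        reverse=True)
        return [text for _, text in ranked[:k]]

    def save_if_distinct(self, vec, summary):
        """Store the summary only if no existing embedding is too similar,
        analogous to Xu's insert-on-cache-miss behavior."""
        if any(cosine_similarity(vec, e) >= self.threshold for e, _ in self.entries):
            return False  # near-duplicate already stored; skip
        self.entries.append((vec, summary))
        return True
```

A near-duplicate summary (e.g., an embedding within about 1 degree of an existing one, per D'Agostino's angular-match example) is rejected by `save_if_distinct`, while a sufficiently distinct one is stored and becomes retrievable for later queries.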
Regarding Claims 6, 14, and 20: Sun does not explicitly teach the method of claim 1, further comprising: removing, by the computer program, one of the plurality of summaries in the database in response to an embedding vector for the summary having a value that is similar to an embedding vector for the one of the plurality of summaries.

However, Wang teaches removing, by the computer program, one of the plurality of summaries in the database in response to an embedding vector for the summary having a value that is similar to an embedding vector for the one of the plurality of summaries. (Paragraph 35: "Returning to FIG. 2 and FIG. 3, as shown in FIG. 2, at a block 202, feature extraction is performed on the input data using the selected target pre-trained model to determine text descriptors for the input data. As shown in FIG. 3, after processing by the model selection module, various types of pre-trained models suitable for various types of data can be selected. As shown in FIG. 3, the computing device further includes a data compression module 303, which processes (e.g., compresses) the input data using the selected target pre-trained model to obtain the corresponding feature F. The feature F can be a vector, which may be a text descriptor or an identifier. The text descriptor requires very little storage space relative to original data." Paragraph 38: "A process 500 for data deduplication will be described in detail below with reference to FIG. 5A and FIG. 5B. As shown in FIG. 5A, at a block 501, a cosine similarity matrix of a plurality of text descriptors is determined." Paragraph 39: "For a feature vector F∈R^(1×512) of given input data, the feature vector F∈R^(1×512) may correspond to 512 text descriptors in the query table (each text descriptor may be a vector). As illustrated in FIG. 5B, only 1 to 7 text descriptors 510 are shown as examples.
First, the cosine similarity matrix CS=F×F^T of the feature vector F of the input data is calculated. That is, the cosine similarity matrix CS between 512 entries or text descriptors in the lookup table is calculated. Each element of the cosine similarity matrix CS may measure the similarity between two vectors in the feature vectors. F and F^T are multiplied to obtain a matrix of 512 rows and 512 columns, and the matrix is the cosine similarity matrix CS of F. Each element CS[i][j] of the cosine similarity matrix CS represents the cosine similarity between the i-th vector and the j-th vector in the feature vector F∈R^(1×512), that is, the cosine similarity between the i-th text descriptor and the j-th text descriptor in the 512 text descriptors." Paragraph 47: "As shown in FIG. 5A, at a block 504, duplicate text descriptors in a plurality of text descriptors are removed based on the compressed Laplacian graph, for removing duplicate data. As can be seen from FIG. 5B, by taking the 1st to 7th text descriptors 510 in the lookup table as inputs and through deduplication processing, it may be found that the 4th text descriptor and the 1st text descriptor are duplicated, and the 3rd text descriptor and the 6th text descriptor are duplicated. Therefore, the 4th and 6th text descriptors may be removed.
Therefore, duplicate text descriptors in the query table may be removed based on the compressed or dimensionally reduced Laplacian graph to obtain a deduplicated query table 550, thereby removing the data corresponding to the text descriptors.")

It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Sun to incorporate the teachings of Wang to provide removing one of the plurality of summaries in the database in response to an embedding vector for the summary having a value that is similar to an embedding vector for the one of the plurality of summaries. Doing so would preserve information as much as possible, as recognized by Wang (Paragraph 131).

Claims 7 and 15 are rejected under 35 U.S.C. 103 as obvious over US 20240414108 A1 (Sun; Haowen) in view of US 20250021768 A1 (MADAN; Umesh).

Regarding Claims 7 and 15: Sun does not explicitly teach the method of claim 1, wherein the first LLM and the second LLM are the same LLM.

However, MADAN teaches the method of claim 1, wherein the first LLM and the second LLM are the same LLM. (Paragraph 16: "In general, the processing circuitry 14 may be configured to receive, via the prompt interface 48 (in some implementations, the prompt interface API), an instruction 52, which is incorporated into a prompt 50. The trained generative model 56 receives the prompt 50, which includes the instruction 52, and produces a response 58. It will be understood that the instruction 52 may also be generated by and received from a software program, rather than directly from a human user. The prompt 50 may be inputted into the trained generative model 56 by an API call from a client to a server hosting the trained generative model 56, and the response 58 may be received in an API response from the server.
Alternatively, the input of the prompt 50 into the trained generative model 56 and the reception of the response 58 from the trained generative model 56 may be performed at one computing device." Paragraph 21: "As illustrated in the subsequent examples, the extraction of synthetic memories 34 by the memory-extracting trained model 30 is not the mere recording or filtering of raw data, but the summary or encapsulation of the essence of the interactions in the persistent user interaction history 32 in accordance with instructions in a prompt 28. As such, the synthetic memories 34 offer an intelligent, context-aware reflection of the interactions in the persistent user interaction history 32." Paragraph 18: "Responsive to receiving the persistent user interaction history 32, the prompt generator 26 generates one or more memory-extracting prompts 28 to be inputted into the memory-extracting trained model 30, which may be identical to the trained generative model 56 or separate from the trained generative model 56. Both the trained model 30 and the trained generative model 56 are generative models that have been configured through machine learning to receive input that includes natural language text and generate output that includes natural language text in response to the input. It will be appreciated that the memory-extracting trained model 30 and the trained generative model 56 can be large language models (LLMs) having tens of millions to billions of parameters, non-limiting examples of which include GPT-3 and BLOOM, or alternatively configured as other architectures of generative models, including various forms of diffusion models, generative adversarial networks, and multi-modal models.
Either or both of the memory-extracting trained model 30 and the trained generative model 56 can be multi-modal generative language models configured to receive multi-modal input including natural language text input as a first mode of input and image, video, or audio as a second mode of input, and generate output including natural language text based on the multi-modal input. The output of the multi-modal model may additionally include a second mode of output such as image, video, or audio output. Non-limiting examples of multi-modal generative models include Kosmos-1, GPT-4, and LLaMA. Further, either or both of the memory-extracting trained model 30 and the trained generative model 56 can be configured to have a generative pre-trained transformer architecture, examples of which are used in the GPT-3 and GPT-4 models.")

It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Sun to incorporate the teachings of MADAN to provide the method of claim 1, wherein the first LLM and the second LLM are the same LLM. Doing so would improve the functional capability of the model, as recognized by MADAN (Paragraph 18).

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALI M HASSAN whose telephone number is (571)272-5331. The examiner can normally be reached Monday - Friday, 8:00am - 4:00pm.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Paras Shah, can be reached at (571)270-1650. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ALI M HASSAN/
Examiner, Art Unit 2653

/Paras D Shah/
Supervisory Patent Examiner, Art Unit 2653

03/12/2026
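Stepping back from the OA text: the Wang citations quoted above describe deduplicating stored text descriptors by computing a pairwise cosine-similarity matrix (CS = F×F^T) and removing entries that duplicate others. A minimal sketch of that idea follows; it uses a greedy first-kept scan rather than Wang's Laplacian-graph compression, and the 0.99 cutoff is an assumption, not a value from the reference.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def dedup_indices(vectors, threshold=0.99):
    """Return indices of vectors to keep, dropping later near-duplicates.

    Equivalent in spirit to scanning the pairwise cosine-similarity matrix
    and removing any vector whose similarity to an earlier kept vector
    exceeds the threshold.
    """
    kept = []
    for i, v in enumerate(vectors):
        if all(cosine_similarity(v, vectors[j]) < threshold for j in kept):
            kept.append(i)
    return kept
```

Applied to the claim 6 limitation, the same comparison would flag a stored conversation summary whose embedding nearly coincides with a newly saved one, so the redundant entry can be removed.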

Prosecution Timeline

Jun 06, 2024
Application Filed
Mar 12, 2026
Non-Final Rejection — §101, §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12598014
CONTENT DRIVEN INTEGRATED BROADCAST SYSTEM WITH ONE OR MORE SELECTABLE AUTOMATED BROADCAST PERSONALITY AND METHOD FOR ITS USE
2y 5m to grant Granted Apr 07, 2026
Patent 12572852
LEXICAL DROPOUT FOR NATURAL LANGUAGE PROCESSING
2y 5m to grant Granted Mar 10, 2026
Patent 12541540
INFORMATION PROCESSING DEVICE, TERMINAL DEVICE, INFORMATION PROCESSING SYSTEM, INFORMATION PROCESSING METHOD, AND RECORDING MEDIUM
2y 5m to grant Granted Feb 03, 2026
Study what changed to get past this examiner. Based on 3 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
70%
Grant Probability
99%
With Interview (+33.3%)
2y 9m
Median Time to Grant
Low
PTA Risk
Based on 10 resolved cases by this examiner. Grant probability derived from career allow rate.
