Last updated: May 29, 2026

Application No. 18/486,075

SELECTIVE MEMORY RETRIEVAL FOR THE GENERATION OF PROMPTS FOR A GENERATIVE MODEL

Final Rejection §103

Filed

Oct 12, 2023

Priority

Jul 14, 2023 — provisional 63/513,696 +1 more

Examiner

SERROU, ABDELALI

Art Unit

2659

Tech Center

2600 — Communications

Assignee

Microsoft Technology Licensing, LLC

OA Round

2 (Final)

Interview Optional

Examiner has a relatively high allowance rate (74%) and a +30.5% interview lift. A written response may suffice.

Based on 589 resolved cases, 2023–2026

Examiner Intelligence

SERROU, ABDELALI

Grants 74% — above average

Career Allowance Rate

437 granted / 589 resolved

+12.2% vs TC avg

Strong +30% interview lift

+30.5%

Interview Lift

resolved cases with interview

Typical timeline

3y 5m

Avg Prosecution

17 currently pending

Career history

610

Total Applications

across all art units

Statute-Specific Performance

§101: 5.0% (-35.0% vs TC avg)
§103: 80.8% (+40.8% vs TC avg)
§102: 8.9% (-31.1% vs TC avg)
§112: 1.2% (-38.8% vs TC avg)

Based on career data from 589 resolved cases

Office Action

§103

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The filed information disclosure statement (IDS) is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over the prior art of record Zhong et al. ("Memory Bank: Enhancing Large Language Models with Long-Term Memory", arxiv.org, Cornell University Library, 201 Olin Library Cornell University Ithaca, NY 14853, 17 May 2023) in view of Moon (US 2020/410012).
As per claim 1, Zhong teaches processing circuitry (Fig. 1) configured to: provide access to a plurality of memory banks, each storing a plurality of memories (Abstract; page 19725, col. 2, lines 6-12: the warehouse of MemoryBank is a robust data repository holding a meticulous array of information. As shown in Fig. 1, it stores daily conversation records, summaries of past events, and evolving assessments of user personalities, thereby constructing a dynamic and multi-layered memory landscape. The warehouse of MemoryBank is accessed and used for the creation of an LLM-based chatbot named SiliconFriend);
cause an interaction interface for a trained generative model to be presented (Fig. 1, page 19726, col. 2, line 22 – page 19727, line 14, and Figs. 1, 2, wherein trained generative models such as ChatGPT and ChatGLM are presented); 
receive, via the interaction interface, an instruction from a user for the trained generative model to generate an output (page 19728, Fig. 2, wherein a user’s request is received, i.e. “Hi, I’m Zephyr. I recently broke up with my girlfriend”); 
extract a context of the instruction (Abstract, generating long term dialogue contexts. Page 19724, line 23-26, Memory-Bank enables LLMs to recall historical interactions, continually evolve their understanding of context, and adapt to a user’s personality based on past interactions); 
generate a memory request including the context and the instruction (Page 19728, col. 1, line 14-25, wherein memory probing questions are integrated into the dialogues. These questions are designed to prompt SiliconFriend to retrieve specific details from the chat history. See Fig. 3, wherein the user and SiliconFriend engaged in a discussion about programming learning suggestions); 
input the memory request into a plurality of memory banks to retrieve a plurality of relevant memories among the plurality of memories (Figs. 1-3, inputting a memory request by the SiliconFriend during a discussion with different users with different personalities to retrieve different information and answer questions about different topics. See also, page 19728, col. 2, line 7-23, Conversations are synthesized by users acted by ChatGPT based on different predefined topics and user personalities); 
generate a prompt based on the retrieved relevant memories and the instruction from the user (page 19727, col. 2, lines 21-26, which states that, upon memory retrieval, a series of information is organized into the conversation prompt, including relevant memory, global user portrait, and global event summary. Consequently, SiliconFriend can generate responses that refer to past memories and deliver interactions tailored to the user’s portrait);
provide the prompt to the trained generative model (page 19728, line 14-25, prompting the SiliconFriend to retrieve specific details from the chat history); 
receive, in response to the prompt, a response from the trained generative model (Fig. 3 and page 19728, col. 1, line 1-13 receiving and delivering responses to the user); and
output the response to the user (Fig. 3, delivering responses to the user).
Zhong may not explicitly disclose processing circuitry and a plurality of memory retrieval agents respectively coupled to the plurality of memory banks to retrieve a plurality of relevant memories. Moon, in the same field of endeavor, teaches the processing circuitry ([0234] and Fig. 13) and a plurality of memory retrieval agents respectively coupled to the plurality of memory banks to retrieve a plurality of relevant memories ([0048]). Therefore, it would have been obvious at the time the application was filed to use the above feature of Moon with the system of Zhong, in order to input the memory request into a plurality of memory retrieval agents respectively coupled to the plurality of memory banks to retrieve a plurality of relevant memories among the plurality of memories, as claimed. This would effectively handle various question types and show significant improvement in accuracy in terms of answering questions ([0007]).
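The sequence of steps mapped for claim 1 can be sketched roughly as follows. This is only an illustrative sketch and not the applicant's, Zhong's, or Moon's implementation: the names `MemoryBank`, `MemoryRequest`, `RetrievalAgent`, `extract_context`, `build_prompt`, and `echo_model`, the word-overlap relevance test, and the stand-in model are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MemoryBank:
    """A named store of memories (plain strings in this sketch)."""
    name: str
    memories: List[str]


@dataclass
class MemoryRequest:
    """Bundles the extracted context with the user's instruction."""
    context: str
    instruction: str


class RetrievalAgent:
    """One agent per bank; relevance is naive word overlap (an assumption)."""

    def __init__(self, bank: MemoryBank):
        self.bank = bank

    def retrieve(self, request: MemoryRequest) -> List[str]:
        query = set((request.context + " " + request.instruction).lower().split())
        return [m for m in self.bank.memories if query & set(m.lower().split())]


def extract_context(instruction: str) -> str:
    # Hypothetical context extraction: keep words longer than three characters.
    return " ".join(w for w in instruction.lower().split() if len(w) > 3)


def build_prompt(memories: List[str], instruction: str) -> str:
    lines = ["Relevant memories:"] + [f"- {m}" for m in memories]
    return "\n".join(lines + ["Instruction: " + instruction])


def echo_model(prompt: str) -> str:
    # Stand-in for a trained generative model; it only reports the prompt size.
    return f"(model response to a {len(prompt.splitlines())}-line prompt)"


def answer(instruction: str, agents: List[RetrievalAgent],
           model: Callable[[str], str]) -> str:
    request = MemoryRequest(extract_context(instruction), instruction)
    retrieved = [m for agent in agents for m in agent.retrieve(request)]
    return model(build_prompt(retrieved, instruction))


banks = [
    MemoryBank("conversations", ["User is learning Python programming"]),
    MemoryBank("events", ["User moved to Seattle last spring"]),
]
agents = [RetrievalAgent(b) for b in banks]
response = answer("Suggest a programming book", agents, echo_model)
print(response)
```

In this toy run only the first bank contributes a memory, so the prompt has three lines: the header, one memory, and the instruction.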
As per claim 2, Zhong teaches wherein the trained generative model is a trained generative language model (page 19726, col. 2, line 21 – page 19727, col. 1, line 25, wherein the trained generative model integrates three powerful LLMs).
As per claim 3, Zhong teaches wherein the trained generative language model is a generative pre-trained transformer model (page 19724, Abstract, ChatGPT; the GPT architecture, which stands for Generative Pre-trained Transformer, is built upon the transformer architecture).
As per claim 4, Zhong teaches wherein the instruction is divided into a plurality of instructions (necessarily disclosed in the process of generating long-term dialog contexts covering a wide array of topics, page 19724, Abstract). Zhong may not explicitly disclose that the plurality of instructions are incorporated into a plurality of memory requests, respectively, and inputted into the plurality of memory retrieval agents. Moon, in the same field of endeavor, teaches that the plurality of instructions are incorporated into a plurality of memory requests, respectively, and inputted into the plurality of memory retrieval agents ([0048]). Therefore, it would have been obvious at the time the application was filed to use the above feature of Moon with the system of Zhong, in order to input the memory request into a plurality of memory retrieval agents respectively coupled to the plurality of memory banks to retrieve a plurality of relevant memories among the plurality of memories, as claimed. This would effectively handle various question types and show significant improvement in accuracy in terms of answering questions ([0007]).
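The claim 4 limitation (one instruction divided into several, each placed in its own memory request) might look like the sketch below. The splitting rule (semicolons and the word "and"), the dictionary-shaped request, and the function names `split_instruction` and `make_requests` are hypothetical; neither the office action nor the cited references specify how the division is performed.

```python
import re
from typing import Dict, List


def split_instruction(instruction: str) -> List[str]:
    # Hypothetical splitter: break on semicolons and the standalone word "and".
    parts = re.split(r";|\band\b", instruction)
    return [p.strip() for p in parts if p.strip()]


def make_requests(instruction: str, context: str) -> List[Dict[str, str]]:
    # One memory request per sub-instruction, each carrying the shared context.
    return [{"context": context, "instruction": part}
            for part in split_instruction(instruction)]


requests = make_requests("Summarize my trip and recommend a book", "travel, reading")
for i, req in enumerate(requests):
    # Each request would be handed to a different retrieval agent.
    print(f"agent {i} <- {req['instruction']}")
```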
As per claim 5, Zhong teaches wherein the plurality of memories are converted into vector representations (page 19726, col. 1, line 1-11, consequently, the entire memory storage is pre-encoded into $M = \{h_m^0, h_m^1, \ldots, h_m^{|M|}\}$, where each $h_m$ is a vector representation of a memory piece).
As per claim 6, Zhong may not explicitly disclose wherein a given memory retrieval agent among the plurality of memory retrieval agents computes distances between the vector representations and a vector representation of the context to retrieve the plurality of relevant memories among memories of a respective memory bank of the given memory retrieval agent.  Moon in the same field of endeavor teaches wherein a given memory retrieval agent among the plurality of memory retrieval agents computes distances between the vector representations and a vector representation of the context to retrieve the plurality of relevant memories among memories of a respective memory bank of the given memory retrieval agent ([0206], wherein the social-networking system 160 may calculate a similarity metric of vectors in vector space 1100. A similarity metric may be a cosine similarity, a Minkowski distance, a Mahalanobis distance, a Euclidean distance). Therefore, it would have been obvious at the time the application was filed to use the distance metric of Moon with the system of Zhong, in order to enable the retrieval agents to return contextually relevant results even if they don't contain the exact search terms, leading to more accurate and useful outcomes.
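For reference, three of the similarity metrics the examiner cites from Moon [0206] can be written out as below, together with a nearest-neighbour lookup over memory vectors. This is a minimal sketch with made-up two-dimensional vectors; the function names are assumptions, and the Mahalanobis distance is omitted because it additionally needs a covariance matrix.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def minkowski_distance(a: Sequence[float], b: Sequence[float], p: float = 2.0) -> float:
    # p = 1 gives Manhattan distance, p = 2 gives Euclidean distance.
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return minkowski_distance(a, b, p=2.0)


def nearest_memories(context_vec: Sequence[float],
                     memory_vecs: List[Sequence[float]], k: int) -> List[int]:
    # Return the indices of the k memory vectors closest to the context vector.
    order = sorted(range(len(memory_vecs)),
                   key=lambda i: euclidean_distance(context_vec, memory_vecs[i]))
    return order[:k]


context_vec = [1.0, 0.0]
memory_vecs = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
print(nearest_memories(context_vec, memory_vecs, k=2))
```

With these toy vectors the first and third memories are the two closest to the context, so the lookup returns their indices.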
As per claim 7, Zhong teaches searching for and recalling the most relevant memories (Abstract and page 19726, col. 1, line 1-11). Zhong may not explicitly disclose the use of a threshold. However, the use of a predetermined threshold to filter out candidates is well known in the art, as evidenced by Moon. Moon, in the same field of endeavor, teaches using a threshold to rank candidates ([0114], [0223]). Therefore, it would have been obvious at the time the application was filed to use Moon’s threshold feature with the system of Zhong, in order to select relevant memories. This would improve analysis accuracy and provide reliable results.
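A predetermined-threshold filter of the kind discussed for claim 7 could be sketched as follows. The scores, the 0.5 cutoff, and the name `filter_by_threshold` are illustrative assumptions, not values or identifiers taken from Zhong, Moon, or the application.

```python
from typing import List, Tuple


def filter_by_threshold(scored: List[Tuple[str, float]],
                        threshold: float) -> List[Tuple[str, float]]:
    # Keep candidates whose similarity meets the threshold, best first.
    kept = [(memory, score) for memory, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


scored = [("trip to Kyoto", 0.82), ("favorite tea", 0.35), ("book list", 0.61)]
relevant = filter_by_threshold(scored, threshold=0.5)
print(relevant)
```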
As per claim 8, Zhong teaches wherein the vector representations are stored in a database supporting vector search (page 19726, col. 1, line 1-11, wherein the vector representations are stored and indexed for efficient retrieval).
As per claim 9, Zhong teaches wherein the plurality of relevant memories are inputted into a relevance evaluator to determine a relative relevance for each of the plurality of relevant memories; and the plurality of relevant memories are selectively filtered based on the relative relevance determined for each of the plurality of relevant memories (page 19725, col. 2, line 50 - page 19726, col. 1, line 11, wherein the reference states: our memory retrieval mechanism operates akin to a knowledge retrieval task. In this context, we adopt a dual-tower dense retrieval model ...; and wherein the current context of conversation c is encoded by E(·) into hc, which serves as the query to search M for the most relevant memory).
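The retrieval step quoted above (encode the context with E(·), then use it as a query against the stored memory vectors) can be illustrated with a toy scorer that also normalizes the scores into a relative relevance before filtering. This is a sketch under stated assumptions: the four-word vocabulary, the count-based `encode` stand-in for a learned encoder, and the names `relative_relevance` and `select_top` are all hypothetical.

```python
from typing import List, Tuple

VOCAB = ["python", "book", "trip", "tea"]  # hypothetical toy vocabulary


def encode(text: str) -> List[float]:
    # Toy stand-in for the encoder E(.): counts of vocabulary words.
    words = text.lower().split()
    return [float(words.count(v)) for v in VOCAB]


def relative_relevance(context: str, memories: List[str]) -> List[Tuple[str, float]]:
    h_c = encode(context)
    raw = [sum(x * y for x, y in zip(h_c, encode(m))) for m in memories]
    total = sum(raw) or 1.0  # avoid dividing by zero when nothing matches
    return [(m, r / total) for m, r in zip(memories, raw)]


def select_top(scored: List[Tuple[str, float]], k: int) -> List[str]:
    # Selective filtering: keep the k memories with the highest relative relevance.
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [memory for memory, _ in ranked[:k]]


memories = ["bought a python book", "drank tea on the trip", "python meetup notes"]
scored = relative_relevance("recommend a python book", memories)
print(select_top(scored, k=2))
```

A real dual-tower model would use learned dense encoders in place of the word counts used here.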
As per claim 10, Zhong teaches wherein the relevance evaluator is a generative model or a classifier (page 19724, Abstract, ChatGPT. ChatGPT is a generative model).
As per claims 11-19, method claims 11-19 and apparatus claims 1-10 are related as method and apparatus of using same, with each claimed element's function corresponding to the claimed method step. Accordingly, claims 11-19 are similarly rejected under the same rationale as applied above with respect to apparatus claims 1-10.
As per claim 20, Zhong teaches invoking an application programming interface (API) call to transmit the prompt to the trained generative model that receives input of the prompt including natural language text input (necessarily disclosed with the process of exchanging information during the conversations of Figs. 2, 3, 4) and, in response, generates a response that includes natural language text output (see conversations of Figs. 2-4). The rest is rejected for the same reason as set forth with regard to claim 1.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. See PTO-892.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ABDELALI SERROU whose telephone number is (571)272-7638. The examiner can normally be reached M-F 9 AM - 5 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir, can be reached at 571-272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ABDELALI SERROU/Primary Examiner, Art Unit 2659

Prosecution Timeline

Oct 12, 2023

Application Filed

Dec 04, 2025

Non-Final Rejection mailed — §103

Mar 04, 2026

Response Filed

May 27, 2026

Final Rejection mailed — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/738,974

Patent 12632665

CONTEXT-BASED NATURAL LANGUAGE PROCESSING

1y 11m to grant Granted May 19, 2026

18/438,923

Patent 12602544

INFORMATION PROCESSING APPARATUS, OPERATION METHOD, AND RECORDING MEDIUM

2y 2m to grant Granted Apr 14, 2026

18/371,344

Patent 12596875

TECHNIQUES FOR ADAPTIVE LARGE LANGUAGE MODEL USAGE

2y 6m to grant Granted Apr 07, 2026

18/494,763

Patent 12597417

EXPORTING MODULAR ENCODER FEATURES FOR STREAMING AND DELIBERATION ASR

2y 5m to grant Granted Apr 07, 2026

18/675,840

Patent 12596889

GENERATION OF NATURAL LANGUAGE (NL) BASED SUMMARIES USING A LARGE LANGUAGE MODEL (LLM) AND SUBSEQUENT MODIFICATION THEREOF FOR ATTRIBUTION

1y 10m to grant Granted Apr 07, 2026

Based on 5 most recent grants.

Prosecution Projections

3-4

Expected OA Rounds

74%

Grant Probability

99%

With Interview (+30.5%)

3y 5m (~9m remaining)

Median Time to Grant

Moderate

PTA Risk

Based on 589 resolved cases by this examiner. Grant probability derived from career allowance rate.
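The projection figures can be checked with a few lines of arithmetic. This is a sketch of one plausible derivation, not the dashboard's documented method: treating the interview lift as additive percentage points and capping the result at 99% (`DISPLAY_CAP`) are assumptions that happen to reproduce the displayed numbers.

```python
granted, resolved = 437, 589          # from "437 granted / 589 resolved"
allowance_rate = 100 * granted / resolved
interview_lift = 30.5                 # percentage points, as displayed

DISPLAY_CAP = 99.0                    # assumption: projection appears to be capped
with_interview = min(allowance_rate + interview_lift, DISPLAY_CAP)

print(f"Career allowance rate: {allowance_rate:.1f}%")
print(f"With interview: {with_interview:.0f}%")
```

The career allowance rate rounds to the 74% shown, and the uncapped sum would exceed 100%, which is consistent with the 99% figure being a ceiling.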