Prosecution Insights
Last updated: April 19, 2026
Application No. 18/733,022

METHOD AND SYSTEM OF CONTEXT WINDOW ENGINEERING FOR LARGE LANGUAGE MODELS FINE-TUNED FOR CONVERSATIONS

Non-Final OA §102
Filed: Jun 04, 2024
Examiner: VILLENA, MARK
Art Unit: 2658
Tech Center: 2600 — Communications
Assignee: CDK Global LLC
OA Round: 1 (Non-Final)
Grant Probability: 70% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 3y 10m
With Interview: 85%

Examiner Intelligence

Career Allow Rate: 70% (334 granted / 478 resolved; +7.9% vs TC avg, above average)
Interview Lift: +15.5% in resolved cases with interview (strong)
Typical Timeline: 3y 10m average prosecution; 22 applications currently pending
Career History: 500 total applications across all art units

Statute-Specific Performance

§101: 13.7% (-26.3% vs TC avg)
§103: 51.5% (+11.5% vs TC avg)
§102: 20.4% (-19.6% vs TC avg)
§112: 5.0% (-35.0% vs TC avg)
Deltas are relative to the Tech Center average estimate. Based on career data from 478 resolved cases.
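The statute-specific deltas are each consistent with a single Tech Center average estimate of about 40%. A minimal Python sketch (illustrative only, not the tool's actual code; the 40% constant is inferred from the figures shown) reproduces them:

```python
# Each "vs TC avg" delta is the examiner's statute-specific allow rate
# minus the Tech Center average estimate (inferred to be ~40% here,
# e.g. 13.7% - (-26.3 pts) = 40%).
TC_AVG_ESTIMATE = 0.40

examiner_rates = {"101": 0.137, "103": 0.515, "102": 0.204, "112": 0.050}

deltas = {s: round(r - TC_AVG_ESTIMATE, 3) for s, r in examiner_rates.items()}
print(deltas)  # {'101': -0.263, '103': 0.115, '102': -0.196, '112': -0.35}
```

That every delta resolves to the same 40% baseline suggests the dashboard compares against one pooled TC average rather than per-statute averages.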

Office Action

§102
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement

The information disclosure statement (IDS) was submitted on 06/04/2024. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Drawings

The drawings were submitted on 06/04/2024. These drawings are reviewed and accepted by the examiner.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless – (a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-28 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Cohen et al. (US 12254279 B1).

Regarding claims 1, 11, and 20, Cohen teaches:

"receiving a message in a user language" (col. 6, lines 58-65; 'At 210, the system receives user voice input which is provided by the user via a user interface.');

"preprocessing the message" (col. 7, lines 1-5; 'At 214, the audio signal is transcribed to text for processing.');

"setting a static context based on a topic of the message for a text sequence including the message, if the topic is new or different from a topic of an immediately preceding message" (col. 2, lines 37-48; 'The defined condition may be, for example, the start of a particular topic of discussion within the conversation.'; col. 8, lines 65-67; 'Topic Management: For off-topic inquiries and discussions, provide concise responses and gently steer the conversation back to the subject matter.');

"attaching the static context to a context window" (col. 11, lines 49-59; 'In some embodiments, LLM 350 utilizes buffer memory 352 and context window 354 for a given conversation. Buffer memory 352 stores recent history of the conversation.'; col. 12, lines 61-67; 'Parallel Context Windows: Managing multiple context windows in parallel, allowing LLM 350 to handle complex, multi-threaded conversations more efficiently.');

"attaching a dynamic context of long term to the context window based on the message and one or more previous messages, if any, in the text sequence" (col. 12, lines 5-8; 'One practical function of system directives 302 is to instruct LLM 350 to summarize long conversations or prune irrelevant content from the context window, ensuring that the conversation stays focused and relevant.'; col. 12, lines 65-67; 'Memory Enhanced Architectures: Utilizing architectures that mirror human short-term and long-term memory (e.g., Sparse Priming Representation) to maintain context and continuity in conversations, like the RAISE framework, which enhances the ability of LLM 350 to handle extended dialogues.'; col. 22, lines 30-32; 'In Example 9, the subject matter of Examples 4-8 includes, wherein each LLM engine instance is operative to dynamically adjust the size of the context window.' Dynamically adjusting the size of a context window suggests long-term context.);

"attaching a dynamic context of short term to the context window based on the message" (col. 12, lines 65-67; 'Memory Enhanced Architectures: Utilizing architectures that mirror human short-term and long-term memory (e.g., Sparse Priming Representation) to maintain context and continuity in conversations, like the RAISE framework, which enhances the ability of LLM 350 to handle extended dialogues.'; col. 22, lines 30-32; 'In Example 9, the subject matter of Examples 4-8 includes, wherein each LLM engine instance is operative to dynamically adjust the size of the context window.' Dynamically adjusting the size of a context window suggests short-term context.);

"providing the context window to a language model server" (col. 12, lines 9-16; 'In a related embodiment, operation of context window 354 is optimized to improve its efficiency and effectiveness. In some implementations, internal and external optimization algorithms are employed for such optimization.'; col. 13, lines 4-18; 'In related embodiments, certain operations are performed by LLM 350 to manage context windows for greater efficiency.');

"receiving a database access command based on the context window from the language model server" (col. 12, lines 9-16; 'In some implementations, internal and external optimization algorithms are employed for such optimization. In the embodiment depicted, LLM 350 performs the internal optimization of the context window 354, whereas context-window (CW) optimization engine 360 applies the external optimization algorithms.' Employing external optimization inherently requires a control signal/command.; col. 12, lines 53-56; 'Dynamic Modules for Personalization: Injecting user profiles or selected scenarios into the context window of a conversation to guide LLM 350 to maintain a relevant and personalized context;' The injection of user profiles or scenarios requires a database access command.);

"providing the database access command to a database" (col. 12, lines 53-56; 'Dynamic Modules for Personalization: Injecting user profiles or selected scenarios into the context window of a conversation to guide LLM 350 to maintain a relevant and personalized context;' The injection of user profiles or scenarios requires a database access command.; col. 13, lines 15-18; 'Dynamic Directives: Using independent modules containing system directives, instructions, personas, and profiles. These independent modules can be loaded dynamically based on predefined criteria.'); and

"receiving a result response to the database access command from the database" (col. 6, lines 43-47; 'LLM engine 112 provides access to a LLM service such as ChatGPT by OpenAI through its application programming interface (API). Using the LLM service, LLM engine 112 processes and generates responses based on the input and conversation context.').

Regarding claims 2 (dep. on claim 1), 12 (dep. on claim 11), and 21 (dep. on claim 20), Cohen further teaches:

"providing the result to the language model server" (col. 6, lines 43-47; 'LLM engine 112 provides access to a LLM service such as ChatGPT by OpenAI through its application programming interface (API). Using the LLM service, LLM engine 112 processes and generates responses based on the input and conversation context.');

"receiving the result in the user language from the language model server" (col. 6, lines 43-47; 'LLM engine 112 provides access to a LLM service such as ChatGPT by OpenAI through its application programming interface (API). Using the LLM service, LLM engine 112 processes and generates responses based on the input and conversation context.'); and

"providing the response in the user language" (col. 6, lines 43-47; 'LLM engine 112 provides access to a LLM service such as ChatGPT by OpenAI through its application programming interface (API). Using the LLM service, LLM engine 112 processes and generates responses based on the input and conversation context.').

Regarding claim 3 (dep. on claim 1), Cohen further teaches:

"splitting the context window into: the static context; the dynamic context of long term; and the dynamic context of short term" (col. 12, lines 61-67; 'Parallel Context Windows: Managing multiple context windows in parallel, allowing LLM 350 to handle complex, multi-threaded conversations more efficiently; Memory Enhanced Architectures: Utilizing architectures that mirror human short-term and long-term memory (e.g., Sparse Priming Representation) to maintain context and continuity in conversations, like the RAISE framework, which enhances the ability of LLM 350 to handle extended dialogues.').

Regarding claims 4 (dep. on claim 1), 13 (dep. on claim 1), and 22 (dep. on claim 20), Cohen further teaches:

"wherein said attaching the dynamic context comprises providing relevant information to the message" (col. 13, lines 4-18; 'These include: Predictive Prioritization: Anticipating which parts of the context are likely to be most relevant for future responses…').

Regarding claims 5 (dep. on claim 1), 14 (dep. on claim 11), and 23 (dep. on claim 20), Cohen further teaches:

"storing the one or more contexts in a queue; and processing a next message using the message associated with the one or more contexts" (col. 21, lines 62-67 through col. 22, lines 1-3; '… and a context window that stores a selected subset of information from the buffer memory which represents context of a defined recent portion of the current conversation.').

Regarding claims 6 (dep. on claim 1), 15 (dep. on claim 11), and 24 (dep. on claim 20), Cohen further teaches:

"loading a static context in memory based on one or more topics of the message" (col. 21, lines 62-67 through col. 22, lines 1-3; '… and a context window that stores a selected subset of information from the buffer memory which represents context of a defined recent portion of the current conversation.').

Regarding claims 7 (dep. on claim 6), 16 (dep. on claim 15), and 25 (dep. on claim 24), Cohen further teaches:

"wherein the static context is based on at least one of multiple topics including an order, sales or appointment" (col. 1, lines 18-29; 'However, the task of training an LLM to effectively steer and manage complex conversations, particularly in scenarios such as presentations and interactive sales pitches that involve multiple threads within specific topics, presents a multitude of challenges.').

Regarding claims 8 (dep. on claim 6), 17 (dep. on claim 15), and 26 (dep. on claim 24), Cohen further teaches:

"wherein said attaching the dynamic context of long term comprises providing the dynamic context of long term over a period of the text sequence including the message on the one or more topics, based on one or more contents of the message and one or more previous messages" (col. 22, lines 30-32; 'In Example 9, the subject matter of Examples 4-8 includes, wherein each LLM engine instance is operative to dynamically adjust the size of the context window.'), and "wherein the message is a follow-up message of the previous message" (col. 12, lines 47-49; 'System Directives and User Prompts: Guiding LLM 350 to focus on specific aspects of the conversation to manage the context window effectively;').

Regarding claims 9 (dep. on claim 1), 18 (dep. on claim 11), and 27 (dep. on claim 20), Cohen further teaches:

"wherein the dynamic context of short term is specific to the message, wherein the dynamic context of short term expires after receiving the result response" (col. 1, lines 42-51; 'Another challenge lies in the model's ability to dynamically switch between topics and manage multiple threads within a conversation.').

Regarding claims 10 (dep. on claim 1), 19 (dep. on claim 11), and 28 (dep. on claim 20), Cohen further teaches:

"deleting one or more messages in the dynamic context long term on a first-in-first-out (FIFO) basis, when a token count in the context window becomes equal to a maximum number of tokens for the context window" (col. 1, lines 63-67; 'In Example 4, the subject matter of Examples 1-3 includes, wherein each LLM engine instance comprises: a buffer memory that is operative to temporarily store recent history of a current conversation in which the LLM engine instance is engaged; and a context window that stores a selected subset of information from the buffer memory which represents context of a defined recent portion of the current conversation.' Buffer memory reads on FIFO.).

Conclusion

Other pertinent prior art is cited in the PTO-892 for the applicant's consideration.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARK VILLENA, whose telephone number is (571) 270-3191. The examiner can normally be reached 10 am - 6 pm EST, Monday through Friday.

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Richemond Dorvil, can be reached at (571) 272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (in USA or Canada) or 571-272-1000.

MARK VILLENA
Examiner, Art Unit 2658

/MARK VILLENA/
Examiner, Art Unit 2658
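The claim-10 limitation rejected above (deleting long-term context messages first-in-first-out once the token count reaches the window maximum) can be sketched in Python. This is a hypothetical illustration of the claimed behavior only, not code from the application or the Cohen reference; `ContextWindow`, `count_tokens` (a whitespace stand-in for a real tokenizer), and the sample messages are all invented for the example:

```python
from collections import deque

def count_tokens(text: str) -> int:
    # Stand-in tokenizer: a real system would use the model's tokenizer.
    return len(text.split())

class ContextWindow:
    def __init__(self, max_tokens: int, static_context: str):
        self.max_tokens = max_tokens
        self.static = static_context   # per-topic static context
        self.long_term = deque()       # long-term dynamic context (FIFO queue)
        self.short_term = ""           # short-term dynamic context (current message)

    def total_tokens(self) -> int:
        parts = [self.static, self.short_term, *self.long_term]
        return sum(count_tokens(p) for p in parts)

    def attach(self, message: str) -> None:
        # The previous message rolls from short-term into long-term context.
        if self.short_term:
            self.long_term.append(self.short_term)
        self.short_term = message
        # Claim 10: delete the oldest long-term messages first-in-first-out
        # when the token count exceeds the window's maximum.
        while self.total_tokens() > self.max_tokens and self.long_term:
            self.long_term.popleft()

cw = ContextWindow(max_tokens=12, static_context="topic: service appointment")
for msg in ["book an oil change", "tomorrow at nine am", "use my saved vehicle"]:
    cw.attach(msg)
# The oldest long-term message is evicted to stay within the 12-token budget.
```

Note the distinction the claims draw: Cohen's buffer memory stores "recent history" generically, whereas the claim ties FIFO eviction to a token-count trigger, a difference an applicant might argue is not anticipated.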

Prosecution Timeline

Jun 04, 2024
Application Filed
Jan 09, 2026
Non-Final Rejection — §102 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12591407
ROBUST VOICE ACTIVITY DETECTOR SYSTEM FOR USE WITH AN EARPHONE
Granted Mar 31, 2026 • 2y 5m to grant

Patent 12592232
SYSTEMS, METHODS, AND APPARATUSES FOR DETECTING AI MASKING USING PERSISTENT RESPONSE TESTING IN AN ELECTRONIC ENVIRONMENT
Granted Mar 31, 2026 • 2y 5m to grant

Patent 12586581
ELECTRONIC DEVICE CONTROL METHOD AND APPARATUS
Granted Mar 24, 2026 • 2y 5m to grant

Patent 12578922
Natural Language Processing Platform For Automated Event Analysis, Translation, and Transcription Verification
Granted Mar 17, 2026 • 2y 5m to grant

Patent 12573394
ESTIMATION METHOD, RECORDING MEDIUM, AND ESTIMATION DEVICE
Granted Mar 10, 2026 • 2y 5m to grant
Study what changed to get past this examiner, based on the 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 70%
With Interview (+15.5%): 85%
Median Time to Grant: 3y 10m
PTA Risk: Low
Based on 478 resolved cases by this examiner. Grant probability derived from career allow rate.
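The projection figures above follow directly from the examiner's career numbers. This sketch simply redoes the arithmetic (the rounding convention is an assumption; the dashboard's actual model may differ):

```python
# Career allow rate from the examiner's resolved cases.
granted, resolved = 334, 478
allow_rate = granted / resolved     # ≈ 0.699, displayed as 70%

# Interview lift reported for resolved cases with an interview.
interview_lift = 0.155

with_interview = allow_rate + interview_lift
print(f"{allow_rate:.0%} -> {with_interview:.0%}")  # prints "70% -> 85%"
```

The with-interview figure is just the career allow rate plus the lift, which is why 70% and +15.5% display as 85%.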
