Prosecution Insights
Last updated: April 19, 2026
Application No. 19/028,561

SYSTEM AND METHOD FOR COMPRESSING PROMPTS TO LANGUAGE MODELS FOR DOCUMENT PROCESSING

Final Rejection — §103, §DP

Filed: Jan 17, 2025
Examiner: ALLEN, BRITTANY N
Art Unit: 2169
Tech Center: 2100 — Computer Architecture & Software
Assignee: SAS Institute Inc.
OA Round: 4 (Final)

Grant Probability: 42% (Moderate)
Expected OA Rounds: 5-6
Time to Grant: 4y 8m
Grant Probability with Interview: 79%

Examiner Intelligence

Career Allow Rate: 42% (grants 42% of resolved cases; 163 granted / 391 resolved; -13.3% vs TC avg)
Interview Lift: +37.7% (strong), comparing resolved cases with an interview to those without
Typical Timeline: 4y 8m average prosecution; 31 applications currently pending
Career History: 422 total applications across all art units

Statute-Specific Performance

§101: 17.5% (-22.5% vs TC avg)
§103: 52.8% (+12.8% vs TC avg)
§102: 12.3% (-27.7% vs TC avg)
§112: 13.6% (-26.4% vs TC avg)

Tech Center average shown as estimate for comparison • Based on career data from 391 resolved cases

Office Action

§103 §DP
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Remarks

This action is in response to the amendments received on 1/20/26. Claims 1-30 are pending in the application. Applicants' arguments have been carefully and respectfully considered.

Claims 1-8, 16-19, and 26-29 are provisionally rejected on the ground of nonstatutory double patenting.

Claim(s) 1-3, 7, 10, 13, 14, 16, 17, 21, 23, 24, 26, and 27 are rejected under 35 U.S.C. 103 as being unpatentable over Pathak et al. (US 2024/0394479), and further in view of Wan et al. (US 2025/0094538).

Claim(s) 4-6, 8, 18, 19, 28, and 29 are rejected under 35 U.S.C. 103 as being unpatentable over Pathak in view of Wan, and further in view of Reza et al. (US 2023/0237277).

Claims 9, 20, and 30 are rejected under 35 U.S.C. 103 as being unpatentable over Pathak in view of Wan, and further in view of McAnallen (US 2024/0104055).

Claim(s) 11, 12, and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Pathak in view of Wan, and further in view of Thompson et al. (US 2024/0362197).

Claim(s) 15 and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Pathak in view of Wan, and further in view of Blum et al. (US 2025/0068667).

Double Patenting

The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the "right to exclude" granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).

A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).

The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a).
For a reply to a final Office action, see 37 CFR 1.113(c). A request for reconsideration, while not provided for in 37 CFR 1.113(c), may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.

The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.

This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.

Copending Application No. 19/027199, claim 1:
A non-transitory computer-readable medium comprising computer-readable instructions stored thereon that when executed by a processor cause the processor to:
receive a set of documents for generating a prompt to be input into a language model having a context window with a token limit, from which to generate a topic label and a topic description for a topic, wherein the topic label comprises a name for the topic and the topic description comprises a description of the topic in a human-understandable format;
input the set of documents into an unsupervised machine learning model;
execute the unsupervised machine learning model to output a plurality of topics for the set of the documents, each of the plurality of topics comprising a plurality of topic terms and each of the plurality of topic terms associated with a first weight value;
select a first subset of topic terms for each topic of the plurality of topics, wherein the first subset of topic terms for each topic are selected from the plurality of topic terms of that topic based on the first weight value assigned to each of the plurality of topic terms of that topic;
compute an inverse document frequency weight value for each topic term in the first subset of topic terms of each topic;
compute a second weight value for each topic term in the first subset of topic terms based on the first weight value and the inverse document frequency weight value for that topic term;
select a second subset of topic terms for each topic from the first subset of topic terms, wherein the second subset of topic terms are selected based on the second weight value of each topic term in the first subset of topic terms;
generate a compressed representation of the set of documents from the second subset of topic terms of each topic to include in a prompt for each topic, wherein the compressed representation having a first number of tokens to be stored in a computer memory that is less than a second number of tokens in the plurality of topic terms;
the compressed representation is generated based on (i) text segments extracted by an information extraction model and (ii) topic terms and associated weight values output by the unsupervised machine learning model;
input the machine-consumable prompt of each topic into the language model that is distinct from the unsupervised machine learning model;
generate the topic label and topic description for each topic of the plurality of topics by executing the language model based on the input of the machine-consumable prompt, the compressed representation being generated by concatenating the selected subset of topic terms and excluding unselected topic terms of the second number of tokens in the plurality of topic terms.

Instant Application No. 19/028561, claim 1:
A non-transitory computer-readable medium comprising computer-readable instructions stored thereon that when executed by a processor cause the processor to:
receive a set of documents for generating a prompt to be input into a language model having a context window with a token limit, from which to generate a topic label and a topic description for a topic;
input the set of documents into an unsupervised machine learning model;
execute the unsupervised machine learning model to output the topic for the set of the documents, the topic comprising a plurality of topic terms;
select a first subset of topic documents from the set of documents that belong to the topic;
select a second subset of topic documents from the first subset of topic documents based on the plurality of topic terms;
identify a title from each of the second subset of topic documents to obtain a plurality of titles;
generate a compressed representation of the set of documents based on the plurality of titles to include in a prompt, wherein the compressed representation having a first number of tokens stored in a computer memory that is less than a second number of tokens in the plurality of topic terms, the compressed representation being generated by concatenating the plurality of titles identified from the second subset of topic documents and excluding unselected titles from the set of documents that belong to the topic;
the compressed representation is generated based on (i) text segments extracted by an information extraction model and (ii) topic terms and associated weight values output by the unsupervised machine learning model;
input the machine-consumable prompt of each topic into [[a]] the language model that is distinct from the unsupervised machine learning model;
generate the topic label and topic description for the topic by executing execute the language model based on the input of the machine-consumable prompt.

Claims 1-8, 16-19, and 26-29 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-9, 12-20, and 23-29 of copending Application No. 19/027199 in view of McAnallen (US 2024/0104055). The copending application does not claim "identify a title from each of the second subset of topic documents to obtain a plurality of titles." McAnallen teaches identifying a title from each of the second subset of topic documents to obtain a plurality of titles (McAnallen, pa 0043, The trained title generating model 210 receives and processes each document in a document cluster of the document data 260 to generate a title for each document). It would have been obvious at the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have included the teachings of McAnallen because it allows users to identify groups of documents (McAnallen, pa 0001).
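For orientation, here is a minimal sketch of the term-weighting flow recited in the copending '199 claim: take the topic model's per-term weights, fold in an inverse document frequency weight, and keep only the top-ranked terms for the prompt. All names, subset sizes, and the exact weight-combination formula are illustrative assumptions, not taken from the application.

    import math

    def compress_topic_terms(topics, documents, first_k=50, second_k=10):
        """topics: list of {term: first_weight} dicts from a topic model (e.g., LDA).
        documents: list of document strings, used to compute IDF weights.
        Returns one compressed, comma-separated term string per topic."""
        n_docs = len(documents)
        lowered = [doc.lower() for doc in documents]
        compressed = []
        for term_weights in topics:
            # First subset: top terms by the topic model's own (first) weight.
            first_subset = sorted(term_weights, key=term_weights.get, reverse=True)[:first_k]
            second_weights = {}
            for term in first_subset:
                # Inverse document frequency weight for each term in the first subset.
                df = sum(1 for doc in lowered if term.lower() in doc)
                idf = math.log(n_docs / (1 + df))
                # Second weight combines the first weight and the IDF weight;
                # a product is one plausible combination (the claim does not fix one).
                second_weights[term] = term_weights[term] * idf
            # Second subset: re-rank by the second weight, then concatenate the
            # selected terms (excluding unselected terms) into the compressed form.
            second_subset = sorted(second_weights, key=second_weights.get, reverse=True)[:second_k]
            compressed.append(", ".join(second_subset))
        return compressed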
Copending Application No. 19/027469, claim 1:
A non-transitory computer-readable medium comprising computer-readable instructions stored thereon that when executed by a processor cause the processor to:
receive a set of documents for generating a prompt to be input into a language model having a context window with a token limit from which to generate a topic label and a topic description for a topic, wherein the topic label comprises a name for the topic and the topic description comprises a description of the topic in a human-understandable format;
input the set of documents into an unsupervised machine learning model;
execute the unsupervised machine learning model to output the topic for the set of the documents, the topic comprising a plurality of topic terms;
select a subset of topic documents from the set of documents, wherein the subset of topic documents belong to the topic and are selected based on the plurality of topic terms;
input the subset of topic documents into an information extraction model;
execute the information extraction model to generate a plurality of snippets from the subset of topic documents for the topic;
generate a compressed representation of the set of documents based on the plurality of snippets to include in a prompt, wherein the compressed representation having a first number of tokens to be stored in a computer memory that is less than a second number of tokens in the plurality of topic terms, wherein:
the compressed representation reduces a token count of the prompt to fit within the context window of the language model;
the compressed representation is generated based on (i) text segments extracted by an information extraction model and (ii) topic terms and associated weight values output by the unsupervised machine learning model;
generating the compressed representation comprises selecting, ordering, and concatenating topic terms based on the weight values to form a machine-consumable prompt;
the machine-consumable prompt has a token count that fits within a context window token limit of the language model;
input the machine-consumable prompt of each topic into the language model that is distinct from the unsupervised machine learning model; and
generate the topic label and topic description for the topic by executing the language model based on the input of the machine-consumable prompt.

Instant Application No. 19/028561, claim 1:
A non-transitory computer-readable medium comprising computer-readable instructions stored thereon that when executed by a processor cause the processor to:
receive a set of documents for generating a prompt to be input into a language model having a context window with a token limit from which to generate a topic label and a topic description for a topic;
input the set of documents into an unsupervised machine learning model;
execute the unsupervised machine learning model to output the topic for the set of the documents, the topic comprising a plurality of topic terms;
select a first subset of topic documents from the set of documents that belong to the topic;
select a second subset of topic documents from the first subset of topic documents based on the plurality of topic terms;
identify a title from each of the second subset of topic documents to obtain a plurality of titles;
generate a compressed representation of the set of documents based on the plurality of titles to include in a prompt, wherein the compressed representation having a first number of tokens stored in a computer memory that is less than a second number of tokens in the plurality of topic terms, wherein:
the compressed representation reduces a token count of the prompt to fit within the context window of the language model;
the compressed representation is generated based on (i) text segments extracted by an information extraction model and (ii) topic terms and associated weight values output by the unsupervised machine learning model;
generating the compressed representation comprises selecting, ordering, and concatenating topic terms based on the weight values to form a machine-consumable prompt;
the machine-consumable prompt has a token count that fits within a context window token limit of the language model;
input the machine-consumable prompt of each topic into the language model that is distinct from the unsupervised machine learning model; and
generate the topic label and topic description for the topic by executing execute the language model based on the input of the machine-consumable prompt.

Claims 1-8, 16-19, and 26-29 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-9, 12-20, and 23-29 of copending Application No. 19/027469 in view of McAnallen (US 2024/0104055). The copending application does not claim "select a second subset of topic documents from the first subset of topic documents based on the plurality of topic terms; identify a title from each of the second subset of topic documents to obtain a plurality of titles." McAnallen teaches selecting a second subset of topic documents from the first subset of topic documents based on the plurality of topic terms (McAnallen, pa 0039, The clustering system 132 utilizes known clustering techniques to categorize a number of documents into one or more clusters) and identifying a title from each of the second subset of topic documents to obtain a plurality of titles (McAnallen, pa 0043, The trained title generating model 210 receives and processes each document in a document cluster of the document data 260 to generate a title for each document). It would have been obvious at the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have included the teachings of McAnallen because it allows users to identify groups of documents (McAnallen, pa 0001).
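Unlike the term-reweighting flow of the '199 claims, the instant claims compress by document selection and title concatenation. A minimal sketch of that flow under stated assumptions (hypothetical field names, whitespace splitting standing in for a model tokenizer, and term-hit counting as one plausible reading of the unspecified selection criterion):

    def title_compressed_prompt(topic_documents, topic_terms, doc_limit=5, token_limit=512):
        """topic_documents: list of {"title": str, "body": str} records already
        assigned to the topic (the first subset). topic_terms: terms output by
        the unsupervised model for that topic. Returns the compressed representation."""
        terms = [t.lower() for t in topic_terms]

        # Second subset: the topic documents whose bodies mention the topic terms most.
        def term_hits(doc):
            body = doc["body"].lower()
            return sum(body.count(t) for t in terms)

        second_subset = sorted(topic_documents, key=term_hits, reverse=True)[:doc_limit]

        # Identify a title from each selected document, then concatenate the titles;
        # titles of unselected documents are excluded.
        titles = [doc["title"] for doc in second_subset]
        compressed = "; ".join(titles)

        # Drop the lowest-ranked titles until the result fits the token budget,
        # so the prompt stays within the language model's context window.
        while titles and len(compressed.split()) > token_limit:
            titles.pop()
            compressed = "; ".join(titles)
        return compressed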
Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-3, 7, 10, 13, 14, 16, 17, 21, 23, 24, 26, and 27 are rejected under 35 U.S.C. 103 as being unpatentable over Pathak et al. (US 2024/0394479) and further in view of Wan et al. (US 2025/0094538).

With respect to claim 1, Pathak teaches a non-transitory computer-readable medium comprising computer-readable instructions stored thereon that when executed by a processor cause the processor to:

receive a set of documents for generating a prompt to be input into a language model (Pathak, pa 0069, The application system 108 includes functionality that enables a user to interact with an online resource that links related information items in a graph. In a particular dialogue turn, the user submits an input query that incorporates information pulled from the graph. Or the knowledge-supplementing component 136 extracts information from the graph) having a context window with a token limit (Pathak, pa 0053, the dialogue system 104 can adapt the way it constructs the prompt information 124 … the execution platform that runs the language model 106. & pa 0078, the resource availability-assessing component 606 receives an input signal that indicates the current processing capacity of the application system 108 that uses the dialogue system 104. The resource availability-assessing component 606 uses a rules-based system and/or a machine-trained model and/or other functionality to map these factors into a complexity level.);

input the set of documents into an unsupervised machine learning model (Pathak, pa 0090, the compression component 138 compresses the content in the candidate context information 902, including the dialogue history and/or the knowledge information retrieved by the knowledge-supplementing component 136 & pa 0092, once the compression component 138 is invoked, the compression-managing component 914 invokes all of the individual compression components (906, 908, 910, 912), which can then operate in parallel. & pa 0097, The topic-modeling component 910 can likewise use various rules-based logic and/or machine-trained models to extract topics associated with the source information 904, including Latent Dirichlet Allocation (LDA));

execute the unsupervised machine learning model to output a plurality of topics …, each of the plurality of topics comprising a plurality of topic terms and each of the plurality of topic terms associated with a first weight value (Pathak, pa 0097, The topic-modeling component 910 can likewise use various rules-based logic and/or machine-trained models to extract topics associated with the source information 904, including Latent Dirichlet Allocation (LDA));

identify a title from each of the … subset of topic documents to obtain a plurality of titles (Pathak, pa 0098, the compression component 138 also weights the relevance of selected terms (keywords, named entities, topics, etc.) based on one or more weighting factors, and uses those weight factors in determining which terms are to be included in the prompt information 124…. By favorably weighting a selected term, the compression component 138 promotes this term over other terms that are not similarly weighted, and increases the likelihood that the selected term will be included in the top K information items);

generate a compressed representation of the set of documents based on the plurality of titles (Pathak, Fig. 1 & pa 0045, The dynamic prompt-generating component 128 also assembles information provided by the separate analysis components 130 into the prompt information 124. & pa 0098, By favorably weighting a selected term, the compression component 138 promotes this term over other terms that are not similarly weighted, and increases the likelihood that the selected term will be included in the top K information items), wherein the compressed representation having a first number of tokens to be stored in a computer memory that is less than a second number of tokens in the plurality of topic terms (Pathak, pa 0054, the dialogue system 104 compresses source information from which the prompt information 124 is constructed, e.g., by picking salient terms from the source information and/or removing redundant information from the source information. & pa 0059, the compression component 138 uses topic analysis to identify one or more topics that are pertinent to the source information. The compression component 138 has the effect of compressing the source information by using selected terms to describe it. & pa 0091, The compression component 138 uses different components and associated techniques to perform different types of compression. Generally, each of the techniques provides a reduced-sized representation of the source information that preserves at least some semantic content of the source information in an original form. The reduced-sized representation of the source information is included in the prompt information 124 in lieu of the source information in its original form.), wherein:

the compressed representation reduces a token count of the prompt to fit within the context window of the language model (Pathak, pa 0073, the content unit amount-assessing component 608 determines, based on the assessed complexity level, a maximum number of content units to include in the prompt information 124 for the current dialogue turn & pa 0091, Generally, each of the techniques provides a reduced-sized representation of the source information that preserves at least some semantic content of the source information in an original form.);

the compressed representation is generated based on (i) text segments extracted by an information extraction model (Pathak, pa 0094, The keyword-extracting component 906 uses any rules-based logic (e.g., any algorithm) or machine-trained model to detect prominent keywords or named entities associated with the source information 904.) and (ii) topic terms and associated weight values output by the unsupervised machine learning model (Pathak, pa 0098, the compression component 138 also weights the relevance of selected terms (keywords, named entities, topics, etc.) based on one or more weighting factors, and uses those weight factors in determining which terms are to be included in the prompt information 124.);

generating the compressed representation comprises selecting, ordering, and concatenating topic terms based on the weight values to form a machine-consumable prompt (Pathak, pa 0098, the compression component 138 selects the K top-ranked terms. By favorably weighting a selected term, the compression component 138 promotes this term over other terms that are not similarly weighted, and increases the likelihood that the selected term will be included in the top K information items.);

the machine-consumable prompt has a token count that fits within a context window token limit of the language model (Pathak, pa 0002, an application typically limits the size of the prompt that can be input to the language model. & pa 0073, the content unit amount-assessing component 608 determines, based on the assessed complexity level, a maximum number of content units to include in the prompt information 124 for the current dialogue turn);

input the machine-consumable prompt of each topic into the language model that is distinct from the unsupervised machine learning model (Pathak, pa 0116, The language model 1402 commences with the receipt of the model-input information, e.g., corresponding to the prompt information 124. The model-input information is expressed as a series of linguistic tokens 1406 & Fig. 16, pa 0129, In block 1608, the dialogue system submits the prompt information to the machine-trained language model, and receives a response (e.g., the response 126) from the machine-trained language model based on the prompt information. & Fig. 1, compression component (having topic model as shown in Fig. 9) and separate language model 106 & Fig. 9, compression-managing component 914 (compression component 138) including topic-modeling component 910 creating compressed source information); and

generate the topic label … the topic by executing the language model based on the input of the machine-consumable prompt (Pathak, Fig. 16, pa 0129, In block 1610, the dialogue system 104 generates output information (e.g., the output information 120) based on the response).

Pathak doesn't expressly discuss select a first subset of topic documents from the set of documents that belong to the topic; select a second subset of topic documents from the first subset of topic documents based on the plurality of topic terms; input the prompt of each topic into a language model, and execute the language model based on the prompt to generate the topic label and the topic description for the topic.

Wan teaches select a first subset of topic documents from the set of documents that belong to the topic (Wan, pa 0039, The clustering system 132 utilizes known clustering techniques to categorize a number of documents into one or more clusters. The clustering techniques may include analyzing the content of the documents to identify documents that have similar content (e.g., topics, keywords, and the like).); select a second subset of topic documents from the first subset of topic documents based on the plurality of topic terms (Wan, pa 0104, Per block 1304, some embodiments sample a first dataset belonging to a first cluster and a second dataset belonging to a second cluster. To sample means to select a subset or a representative sample of data points (e.g., a chat conversation message) from a larger dataset (e.g., a chat conversation thread) that has been grouped or clustered together based on some similarity or grouping criterion); generate a compressed representation of the set of documents … to include in a prompt (Wan, pa 0054, The input 304 includes multiple batches of dataset summaries (i.e., natural language summaries), an instruction of cluster descriptions, an instruction of cluster labels, and one or more constraint instructions), wherein the compressed representation having a first number of tokens stored in a computer memory that is less than a second number of tokens in the plurality of topic terms (Wan, pa 0038, A "natural language summary" as described herein refers to text summarization. Text summarization (or automatic summarization or NLP text summarization) is the process of breaking down text (e.g., several paragraphs) into smaller text (e.g., one sentence or paragraph).); input the prompt of each topic into a language model (Wan, pa 0053, The input 304 is fed to the language model encoder(s) and/or decoder(s) 306 (which may be the same model as 206 of FIG. 2), which then produces the output 308); and generate the topic label and the topic description for the topic by executing the language model based on the prompt (Wan, pa 0056, Continuing with FIG. 3, the output 308 includes the generated cluster description(s) and label(s) for each batch.).

It would have been obvious at the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Pathak with the teachings of Wan because humans can more easily interpret and make sense of the results because the outputs of the model are in natural language (e.g., cluster descriptions, and cluster labels) (Wan, pa 0030).

With respect to claim 2, Pathak in view of Wan teaches the non-transitory computer-readable medium of claim 1, wherein the unsupervised machine learning model is a topic model (Pathak, pa 0097, The topic-modeling component 910 can likewise use various rules-based logic and/or machine-trained models to extract topics associated with the source information 904, including Latent Dirichlet Allocation (LDA)), and wherein the language model is a Large Language Model (LLM) (Pathak, pa 0005, language model refers to a machine-trained model that is capable of processing language-based input information and, optionally, any other kind of input information (including video information, image information, audio information, etc.). As such, a language model can correspond to a multi-modal machine-trained model.).

With respect to claim 3, Pathak in view of Wan teaches the non-transitory computer-readable medium of claim 2, wherein the topic model is a Latent Dirichlet Allocation (LDA) clustering model (Pathak, pa 0097, The topic-modeling component 910 can likewise use various rules-based logic and/or machine-trained models to extract topics associated with the source information 904, including Latent Dirichlet Allocation (LDA)).
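The claim 1 mapping above leans on Pathak's "top K" term weighting and prompt-size ceiling. A rough sketch of that behavior (names invented for illustration; whitespace splitting stands in for a model tokenizer):

    def fit_prompt(weighted_terms, instruction, window_limit):
        """weighted_terms: list of (term, weight) pairs; instruction: fixed prompt text.
        Keeps the top-ranked terms whose concatenation still fits the window."""
        ranked = sorted(weighted_terms, key=lambda tw: tw[1], reverse=True)
        kept = []
        for term, _weight in ranked:
            candidate = instruction + " " + ", ".join(kept + [term])
            if len(candidate.split()) > window_limit:
                break  # adding this term would exceed the context window budget
            kept.append(term)
        return instruction + " " + ", ".join(kept)

For example, fit_prompt([("forecasting", 0.9), ("revenue", 0.6), ("churn", 0.4)], "Label this topic:", 5) keeps only the two highest-weighted terms, since adding the third would push the prompt past the five-token budget.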
With respect to claim 7, Pathak in view of Wan teaches the non-transitory computer-readable medium of claim 1, wherein to select the second subset of topic documents from the first subset of topic documents, the computer-readable instructions further cause the processor to at least one of identify documents from the first subset of topic documents that have a greatest number of the plurality of topic terms therein or identify the documents from the first subset of topic documents that have most frequently occurring topic terms of the plurality of topic terms (Wan, pa 0105, Per block 1306, some embodiments receive an instance ranking prompt. An instance ranking prompt includes a natural language instruction to select which of the first dataset or the second dataset is more similar to the reference document based on the instructed use case. For example, using the illustration above, the instance ranking prompt can be, "select the chat conversation message that is most similar to the reference dataset").

With respect to claim 10, Pathak in view of Wan teaches the non-transitory computer-readable medium of claim 1, wherein the title is identified from each of the second subset of topic documents from a body of each of the second subset of topic documents (Wan, pa 0068, The encoder/decoder block(s) 506 takes in a sentence, paragraph).

With respect to claim 13, Pathak in view of Wan teaches the non-transitory computer-readable medium of claim 10, wherein to generate the title from the body of a topic document, the computer-readable instructions further cause the processor to: extract a first paragraph from the body of the topic document (Wan, pa 0038, A "natural language summary" as described herein refers to text summarization. Text summarization (or automatic summarization or NLP text summarization) is the process of breaking down text (e.g., several paragraphs) into smaller text (e.g., one sentence or paragraph).); input the first paragraph into a Large Language Model (LLM) (Wan, pa 0053, The input 304 is fed to the language model encoder(s) and/or decoder(s) 306 (which may be the same model as 206 of FIG. 2), which then produces the output 308 & pa 0068, The encoder/decoder block(s) 506 takes in a … paragraph); and execute the LLM to generate the title (Wan, pa 0053, The input 304 is fed to the language model encoder(s) and/or decoder(s) 306 (which may be the same model as 206 of FIG. 2), which then produces the output 308).

With respect to claim 14, Pathak in view of Wan teaches the non-transitory computer-readable medium of claim 10, wherein to generate the title from the body of a topic document, the computer-readable instructions further cause the processor to: extract a first line from the body of the topic document (Wan, pa 0038, A "natural language summary" as described herein refers to text summarization. Text summarization (or automatic summarization or NLP text summarization) is the process of breaking down text (e.g., several paragraphs) into smaller text (e.g., one sentence or paragraph).); input the first line into a Large Language Model (LLM) (Wan, pa 0053, The input 304 is fed to the language model encoder(s) and/or decoder(s) 306 (which may be the same model as 206 of FIG. 2), which then produces the output 308 & pa 0054, The input 304 includes multiple batches of dataset summaries (i.e., natural language summaries), an instruction of cluster descriptions, an instruction of cluster labels, and one or more constraint instructions); and execute the first LLM to generate the title (Wan, pa 0053, The input 304 is fed to the language model encoder(s) and/or decoder(s) 306 (which may be the same model as 206 of FIG. 2), which then produces the output 308).

With respect to claims 16, 17, 20, 21, 23, and 24, the limitations are essentially the same as claims 1, 2, 9, 10, 13, and 14, and are rejected for the same reasons. With respect to claims 26, 27, and 30, the limitations are essentially the same as claims 1, 2, 3, and 9, and are rejected for the same reasons.

Claim(s) 4-6, 8, 18, 19, 28, and 29 are rejected under 35 U.S.C. 103 as being unpatentable over Pathak in view of Wan, and further in view of Reza et al. (US 2023/0237277).

With respect to claim 4, Pathak in view of Wan teaches the non-transitory computer-readable medium of claim 1, as discussed above. Reza teaches wherein to generate the compressed representation of the set of documents based on the plurality of titles, the computer-readable instructions further cause the processor to concatenate the plurality of titles to generate a string for the topic (Reza, pa 0059, A domain adaptation algorithm may be used to train T5 to generate unique domain relevant features (DRFs; a set of keywords that characterize domain information) for each input. Then those DRFs can be concatenated with the input to form a template). It would have been obvious at the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Pathak in view of Wan with the teachings of Reza because it provides dynamic prompting which can be highly beneficial to develop a pre-trained model by appending the prompts to each set of input with an opinion and aspect. This will provide a better in-context learning and capture the opinion context information, which can lead to effective semantic information modelling (Reza, pa 0034).

With respect to claim 5, Pathak in view of Wan and Reza teaches the non-transitory computer-readable medium of claim 4, wherein the machine-consumable prompt for the topic comprises the string for the topic, an output definition defining a format for the topic label and the topic description for the topic, and one or more constraints (Reza, Fig. 1 & pa 0039, The original input 105 is then concatenated with the respective generated prompting template 115 to create a prompting function 120. For example, the original input 105 and the prompting template 115 may be used as input for a concatenate function configured to join the two text strings into a single text string: the original input 105; the prompting template 115, such that the two text strings are now linked or associated with one another.).

With respect to claim 6, Pathak in view of Wan and Reza teaches the non-transitory computer-readable medium of claim 5, wherein the one or more constraints include a system role and a user role to provide a framework for how to generate the topic label and topic description for the topic (Reza, pa 0048, The training data may be acquired from the public domain or private domain. For example, a user such as a customer in the food and service industry may provide training data for fine-tuning a model to analyze sentiment in online food blog posts. & pa 0036, FIG. 1 is a block diagram illustrating the overall concept 100 of dynamic aspect based prompting and its influence on improving the confidence in a downstream task. As shown, original input 105 is obtained from a set of training data. The original input 105 is a text example such as "I like the food but not the service" from the set of training data. …The set of training data includes labels. The labels comprise: (i) text that relate to possible solutions for the given task to be learned by the model, and (ii) the specified solutions (e.g., a class identifier or ground truth for the text example). The labels may be provided by a user (e.g., a customer) and may be particular to a domain that the user intends to train the model within.), and wherein the one or more constraints further include a summary of what to include in the topic description (Reza, pa 0036, The labels may be provided by a user (e.g., a customer) and may be particular to a domain that the user intends to train the model within. For example, the text labels may be words such as terrible, bland, flavorful, delicious, disgusting, sour, sweet, poison, enjoyable, spicy, etc. that relate to various semantic classes (e.g., positive, negative, neutral, or the like) to be predicted for each text example within the domain of food. In other words, the original input 105 may include text that relates to possible solutions for the given task (e.g., the food was good but the service was bad—with good and bad being text that relate to possible sentiment solutions or classes); & Wan, pa 0054, An instruction of cluster description is a natural language instruction for the model to summarize, group, and/or cluster the dataset summaries in a certain way according to a particular use case and generate corresponding descriptions.).

With respect to claim 8, Pathak in view of Wan and Reza teaches the non-transitory computer-readable medium of claim 5, wherein the format comprises: <topic number>:<topic label>:<topic description> (Examiner note: the format of the topic label is an obvious variant of design choice that could be specified by the programmer.).

With respect to claims 18, 19, 28, and 29, the limitations are essentially the same as claims 4-6 and 8, and are rejected for the same reasons.

Claims 9, 20, and 30 are rejected under 35 U.S.C. 103 as being unpatentable over Pathak in view of Wan, and further in view of McAnallen (US 2024/0104055).

With respect to claim 9, Pathak in view of Wan teaches the non-transitory computer-readable medium of claim 1, as discussed above. McAnallen teaches wherein the title is identified from each of the second subset of topic documents from metadata of each of the second subset of topic documents (McAnallen, pa 0040, Document data 260 is transmitted to the title generation system 112 for processing. The document data 260 may include information about one or more clusters of documents for which a title should be generated. The document data 260 may include the documents in one or more document groups (e.g., a plurality of individual user feedbacks categorized into multiple different user feedback groups). Document data 260 may also include metadata). It would have been obvious at the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Pathak in view of Wan with the teachings of McAnallen because it provides information about the document to create clusters (McAnallen, pa 0040).
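Claims 5, 6, and 8 above recite a prompt built from the topic string, system/user role constraints, and an output definition of the form <topic number>:<topic label>:<topic description>. A minimal sketch of such a prompt, assuming a generic chat-style message format; the instruction wording is invented for illustration:

    def build_labeling_prompt(topic_number, topic_string):
        """Assemble a machine-consumable prompt for one topic."""
        system_role = (
            "You label topics extracted from document collections. "
            "Reply in exactly the format <topic number>:<topic label>:<topic description>. "
            "The topic description should summarize what the documents have in common."
        )
        user_role = f"Topic {topic_number} compressed representation: {topic_string}"
        # Chat-style role messages, as accepted by typical LLM chat APIs.
        return [
            {"role": "system", "content": system_role},
            {"role": "user", "content": user_role},
        ]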
Claim(s) 11, 12, and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Pathak in view of Wan, and further in view of Thompson et al. (US 2024/0362197).

With respect to claim 11, Pathak in view of Wan teaches the non-transitory computer-readable medium of claim 10, as discussed above. Thompson teaches wherein the title is identified from a first sentence of a first number of paragraphs in the body (Thompson, pa 0089, "Document Titles" as described herein refer broadly to a phrase or sentence generally describing the information presented in the document as a whole. Document Titles typically appear at the "beginning" of a document, which is determined according to the natural reading order of the language(s) in which information in the document is presented…. Document Titles for documents presented in a modern Western language are typically found at the upper-most portion of the first page of the document, with the title possibly including some unique formatting, such as being center-aligned, presented in a larger, emphasized font (e.g., bold, italicized, underlined, uniquely colored, etc.) compared to text of the body of the document, etc. Of course, the foregoing example is just one possibility, and depending on the nature of the document, the Document Title may be found elsewhere within the document). It would have been obvious at the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Pathak in view of Wan with the teachings of Thompson because it provides a description of the information presented in the document as a whole (Thompson, pa 0089).

With respect to claim 12, Pathak in view of Wan teaches the non-transitory computer-readable medium of claim 10, as discussed above. Thompson teaches wherein the title is a first sentence of a topic document (Thompson, pa 0089, "Document Titles" as described herein refer broadly to a phrase or sentence generally describing the information presented in the document as a whole. Document Titles typically appear at the "beginning" of a document, which is determined according to the natural reading order of the language(s) in which information in the document is presented…. Document Titles for documents presented in a modern Western language are typically found at the upper-most portion of the first page of the document, with the title possibly including some unique formatting, such as being center-aligned, presented in a larger, emphasized font (e.g., bold, italicized, underlined, uniquely colored, etc.) compared to text of the body of the document, etc. Of course, the foregoing example is just one possibility, and depending on the nature of the document, the Document Title may be found elsewhere within the document). It would have been obvious at the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Pathak in view of Wan with the teachings of Thompson because it provides a description of the information presented in the document as a whole (Thompson, pa 0089).

With respect to claim 22, the limitations are essentially the same as claim 12 and are rejected for the same reasons.

Claims 15 and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Pathak in view of Wan, and further in view of Blum et al. (US 2025/0068667).

With respect to claim 15, Pathak in view of Wan teaches the non-transitory computer-readable medium of claim 10, wherein to generate the title from the body of a topic document, the computer-readable instructions further cause the processor to: input the second subset of topic documents into an information extraction model to generate a plurality of snippets (Wan, pa 0052, the input 204 includes the raw natural language text of multiple datasets (e.g., multiple conversations in a chat thread) and a summarization instruction, where each dataset is represented by (Xi)); execute the information extraction model to generate the plurality of snippets (Wan, pa 0052, the language model encoder(s)/decoder(s) 206 generates a text summary—i.e., the "summary of each dataset" as illustrated in the output 208, which is represented by (fi)); input the plurality of snippets into a Large Language Model (LLM) (Wan, pa 0050, the language model encoder(s)/decoder(s) represent any suitable language model or component thereof, such as a LLM (e.g., a GPT or BERT), & pa 0053, The input 304 is fed to the language model encoder(s) and/or decoder(s) 306 (which may be the same model as 206 of FIG. 2), which then produces the output 308); and execute the LLM to generate a title for each of the second subset of topic documents to obtain a plurality of titles (Wan, pa 0056, Continuing with FIG. 3, the output 308 includes the generated cluster description(s) and label(s) for each batch.). Blum teaches concatenate the plurality of titles to generate a title string (Blum, pa 0038, The process of constructing (or generate) the prompt 300 may include…. converting data extracted from the storage 120 (e.g. application data 121) into strings. The resulting strings can then be concatenated or otherwise combined to form the prompt 300.); and generate the machine-consumable prompt from the title string (Blum, pa 0038, The resulting strings can then be concatenated or otherwise combined to form the prompt 300.). It would have been obvious at the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Pathak in view of Wan with the teachings of Blum because it creates a summarization of the data.

With respect to claim 25, the limitations are essentially the same as claim 15 and are rejected for the same reasons.

Response to Arguments

35 U.S.C. 101: With regard to claims 1-30, the amendments to the claims have overcome the 35 U.S.C. 101 rejection. The Examiner withdraws the 35 U.S.C. 101 rejection to claims 1-30.

35 U.S.C. 103: Applicant seems to argue a newly amended limitation. Applicant's amendment has rendered the previous rejection moot. Upon further consideration of the amendment, a new grounds of rejection is made in view of Pathak et al. (US 2024/0394479).

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Kelly et al. (US 2024/0289560) teaches generating a contextual classification for the one or more request text fields and identifying a refined document subset based on the contextual classification and generating, using a large language model, one or more generative text fields using a generative model prompt based on the prompt document subset and the one or more request text fields. Ailem et al. (US 2026/0004135) teaches using one or more large language models and the plurality of instructive, generate a plurality of annotated clusters, wherein an annotated cluster comprises a category annotation and a summary annotation.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRITTANY N ALLEN whose telephone number is (571) 270-3566. The examiner can normally be reached M-F 9 am - 5:00 pm EST. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Sherief Badawi, can be reached at 571-272-9782. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/BRITTANY N ALLEN/
Primary Examiner, Art Unit 2169

Prosecution Timeline

Jan 17, 2025
Application Filed
Apr 09, 2025
Non-Final Rejection — §103, §DP
Jun 05, 2025
Examiner Interview Summary
Jun 09, 2025
Response Filed
Jun 23, 2025
Final Rejection — §103, §DP
Aug 21, 2025
Examiner Interview Summary
Aug 25, 2025
Response after Non-Final Action
Sep 12, 2025
Request for Continued Examination
Sep 24, 2025
Response after Non-Final Action
Oct 15, 2025
Non-Final Rejection — §103, §DP
Jan 15, 2026
Applicant Interview (Telephonic)
Jan 15, 2026
Examiner Interview Summary
Jan 20, 2026
Response Filed
Mar 16, 2026
Final Rejection — §103, §DP (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12585707
SYSTEMS AND METHODS FOR DOCUMENT ANALYSIS TO PRODUCE, CONSUME AND ANALYZE CONTENT-BY-EXAMPLE LOGS FOR DOCUMENTS
2y 5m to grant • Granted Mar 24, 2026
Patent 12561342
MULTI-REGION DATABASE SYSTEMS AND METHODS
2y 5m to grant • Granted Feb 24, 2026
Patent 12530391
Digital Duplicate
2y 5m to grant • Granted Jan 20, 2026
Patent 12524389
ENTERPRISE ENGINEERING AND CONFIGURATION FRAMEWORK FOR ADVANCED PROCESS CONTROL AND MONITORING SYSTEMS
2y 5m to grant • Granted Jan 13, 2026
Patent 12524475
CONCEPTUAL CALCULATOR SYSTEM AND METHOD
2y 5m to grant • Granted Jan 13, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 42%
With Interview: 79% (+37.7%)
Median Time to Grant: 4y 8m
PTA Risk: High

Based on 391 resolved cases by this examiner. Grant probability derived from career allow rate.
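The headline numbers compose straightforwardly if, as the note above states, grant probability is read off the career allow rate and the interview lift is applied in percentage points (the additive-lift reading is our assumption; the figures are this page's):

    granted, resolved = 163, 391
    base = granted / resolved               # 0.417, shown as 42%
    interview_lift = 0.377                  # +37.7 percentage points
    with_interview = base + interview_lift  # 0.794, shown as 79%
    print(f"{base:.0%} base, {with_interview:.0%} with interview")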
