Prosecution Insights
Last updated: April 19, 2026
Application No. 19/036,646

CACHING LARGE LANGUAGE MODEL (LLM) RESPONSES USING HYBRID RETRIEVAL AND RECIPROCAL RANK FUSION

Non-Final OA: §103, Double Patenting
Filed: Jan 24, 2025
Examiner: TOUGHIRY, ARYAN D
Art Unit: 2165
Tech Center: 2100 — Computer Architecture & Software
Assignee: Inventus Holdings LLC
OA Round: 1 (Non-Final)
Grant Probability: 68% (Favorable)
Predicted OA Rounds: 1-2
Predicted Time to Grant: 3y 1m
Grant Probability With Interview: 88%

Examiner Intelligence

Career Allow Rate: 68% (128 granted / 189 resolved; +12.7% vs TC avg, above average)
Interview Lift: +19.9% (strong) across resolved cases with an interview
Typical Timeline: 3y 1m average prosecution; 17 applications currently pending
Career History: 206 total applications across all art units

Statute-Specific Performance

§101: 7.0% (-33.0% vs TC avg)
§103: 64.4% (+24.4% vs TC avg)
§102: 14.9% (-25.1% vs TC avg)
§112: 7.0% (-33.0% vs TC avg)
Deltas are vs. the Tech Center average estimate; based on career data from 189 resolved cases.

Office Action

§103 · Double Patenting
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Double Patenting

Claims 1-19 are rejected on the ground of obviousness-type nonstatutory double patenting as being unpatentable over claims 1-20 of Application No. 17749546. Although the claims at issue are not identical, they are not patentably distinct from each other because the claims of Application No. 17749546 disclose the function and structure of the claims of the instant application to those having ordinary skill in the art.

Regarding claim 1 of the instant application:

Instant application: 1/24/2025 – 19036646 – claim 1

A method for improving computer functionality by retrieving outputs to inputs from a cache, the method comprising:
using a hardware processor communicatively coupled to memory to perform accessing an input, stored in primary storage communicatively coupled to a cache, in a text format;
accessing metadata associated with the input in the text format;
vectorizing the input in the text format into a vector using a text embedding algorithm;
performing a semantic search using the vector to search an input portion of the cache and performing query filtering with the metadata associated with the input to provide a semantic layer set of semantic outputs in the text format with associated semantic output metadata and semantic relevance values;
performing a lexical search using the input in the text format to search an output portion of the cache and performing query filtering with the metadata associated with the input to provide a lexical layer set of lexical outputs in the text format with associated lexical output metadata and lexical relevance values; and
computing a combined ranking set for the semantic outputs in the text format and the lexical outputs in the text format, using the semantic layer set in order of the semantic relevance values from highest to lowest and the lexical layer set in order of the lexical relevance values from highest to lowest, to provide an identified output.

Reference application: 9/3/2024 – 18441863 – claim 1

A method for improving computer functionality by retrieving answers to questions from a cache, the method comprising:
using a hardware processor communicatively coupled to memory to perform accessing a question stored in primary storage communicatively coupled to a cache, in a text format;
accessing metadata associated with the question in the text format;
vectorizing the question in the text format into a high dimensional vector using a text embedding algorithm, wherein the high dimensional vector is greater than or equal to 1024 dimensions;
using the high dimensional vector to search a question portion of the cache using a plurality of retriever types to create a hybrid search, in which the hybrid search combines one or more text format queries using the metadata with one or more high dimensional vector queries in a single search request;
performing query filtering with the metadata associated with the question to provide a semantic layer set of semantic answers in a text format with metadata associated with an answer and semantic relevance values;
using the question in the text format to search an answer portion of the cache and performing query filtering with the metadata associated with the question to provide a lexical layer set of lexical answers in the text format with the metadata associated with the answer and lexical relevance values;
using the semantic layer set in order of the semantic relevance values from highest to lowest and the lexical layer set from highest to lowest; and
applying a reciprocal rank fusion algorithm to compute a combined ranking set for the semantic answers in the text format and the lexical answers in the text format to provide an identified answer.
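For readers outside patent practice, the retrieval pipeline both claims recite — embedding the question for a metadata-filtered semantic search of the question portion of the cache, running a lexical search over the answer portion, and fusing the two ranked lists with reciprocal rank fusion (RRF) — can be sketched as follows. This is an illustrative reconstruction, not code from either application: the cache contents and function names are hypothetical, the bag-of-words "embedding" is a toy stand-in for a real high dimensional (1024+) embedding model, and the RRF constant k=60 is the conventional default rather than anything recited in the claims.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: bag-of-words counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical cache entries: question text, answer text, and metadata for query filtering.
CACHE = [
    {"question": "how do I reset my password", "answer": "Use the account settings page.", "meta": {"product": "portal"}},
    {"question": "how do I change my password", "answer": "Open settings and choose security.", "meta": {"product": "portal"}},
    {"question": "what is the refund policy", "answer": "Refunds are issued within 30 days.", "meta": {"product": "store"}},
]

def semantic_layer(query, meta):
    """Vector search over the question portion, filtered by metadata, ranked by relevance."""
    qv = embed(query)
    hits = [e for e in CACHE if e["meta"] == meta]
    return sorted(hits, key=lambda e: cosine(qv, embed(e["question"])), reverse=True)

def lexical_layer(query, meta):
    """Keyword-overlap search over the answer portion, filtered by the same metadata."""
    terms = set(query.lower().split())
    hits = [e for e in CACHE if e["meta"] == meta]
    return sorted(hits, key=lambda e: len(terms & set(e["answer"].lower().split())), reverse=True)

def rrf(rankings, k=60):
    """Reciprocal rank fusion: score(d) = sum over result lists of 1 / (k + rank(d))."""
    scores = {}
    for ranking in rankings:
        for rank, entry in enumerate(ranking, start=1):
            scores[entry["answer"]] = scores.get(entry["answer"], 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

def retrieve(query, meta):
    """Fuse the semantic and lexical layers to produce the identified answer."""
    fused = rrf([semantic_layer(query, meta), lexical_layer(query, meta)])
    return fused[0] if fused else None
```

RRF needs only ranks, not commensurable scores, which is why it is a common way to merge semantic (cosine-scored) and lexical (overlap-scored) result lists into one combined ranking.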
Corresponding system claim 12 is rejected similarly as claim 1 above.

Regarding claim 2 of the instant application:

Instant application: 1/24/2025 – 19036646 – claim 2

The method of claim 1, wherein the vectorizing the input in the text format into the vector using the text embedding algorithm includes vectorizing the input in the text format into a high dimensional vector, wherein the high dimensional vector is greater than or equal to 1024 dimensions and wherein the performing the semantic search includes performing the semantic search using the high dimensional vector.

Reference application: 9/3/2024 – 18441863 – claim 1

(Claim 1 of Application No. 18441863, reproduced in full in the chart for claim 1 above.)

Corresponding system claim 13 is rejected similarly as claim 2 above.

Regarding claim 3 of the instant application:

Instant application: 1/24/2025 – 19036646 – claim 3

The method of claim 1, wherein the computing a combined ranking includes computing the combined ranking using a reciprocal rank fusion algorithm.

Reference application: 9/3/2024 – 18441863 – claim 1

(Claim 1 of Application No. 18441863, reproduced in full in the chart for claim 1 above.)

Corresponding system claim 14 is rejected similarly as claim 3 above.

Regarding claim 4 of the instant application:

Instant application: 1/24/2025 – 19036646 – claim 4

The method of claim 1, further comprising: in response to the semantic relevance values being above a settable value, returning the semantic outputs in the text format with the highest semantic relevance values and, otherwise, sending the input in the text format to create a prompt.

Reference application: 9/3/2024 – 18441863 – claim 2

The method of claim 1, further comprising: in response to the semantic relevance values being above a settable value, returning the semantic answers in the text format with the highest semantic relevance values and, otherwise, sending the question in the text format to create a prompt.

Corresponding system claim 15 is rejected similarly as claim 4 above.

Regarding claim 5 of the instant application:

Instant application: 1/24/2025 – 19036646 – claim 5

The method of claim 1, further comprising: in response to the combined ranking set being above a settable value, returning the identified output and, otherwise, sending the input in the text format to create a prompt.

Reference application: 9/3/2024 – 18441863 – claim 4

The method of claim 3, further comprising: in response to the combined ranking set being above a settable value, returning the identified answer and, otherwise, sending the question in the text format to create a prompt.
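Claims 4 and 5 add a settable relevance threshold that decides between returning a cached result and falling through to prompt creation (a cache miss sent on to the LLM). A minimal sketch of that gating logic follows; the function names, threshold value, and score scale are assumptions for illustration, with only the threshold-then-fallback control flow taken from the claims.

```python
def answer_or_prompt(candidates, threshold=0.8):
    """Return the top cached answer if its relevance clears the settable
    threshold; otherwise signal that the input should be sent on to
    create a prompt for the LLM (a cache miss).

    `candidates` is a list of (answer_text, relevance) pairs, as produced
    by the semantic/lexical layers or the combined (fused) ranking.
    """
    if candidates:
        best_answer, best_score = max(candidates, key=lambda c: c[1])
        if best_score >= threshold:
            return {"source": "cache", "answer": best_answer}
    # Below threshold, or nothing in the cache: fall through to prompt creation.
    return {"source": "llm", "answer": None}

def build_prompt(question):
    """Hypothetical prompt constructor used on a cache miss."""
    return f"Answer the following question:\n{question}"
```

Because the threshold is a single settable value, the same gate works whether it is applied to the semantic relevance values (claim 4) or to the combined ranking set (claim 5).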
Corresponding system claim 16 is rejected similarly as claim 5 above.

Regarding claim 6 of the instant application:

Instant application: 1/24/2025 – 19036646 – claim 6

The method of claim 1, wherein the performing query filtering with the metadata associated with the input provides an exact match result, wherein the vector to search the input portion of the cache provides an approximate match ranked by the semantic relevance values.

Reference application: 9/3/2024 – 18441863 – claim 6

The method of claim 1, wherein the performing query filtering with the metadata associated with the question provides an exact match result, wherein the high dimensional vector to search the question portion of the cache provides an approximate match ranked by the semantic relevance values.

Corresponding system claim 17 is rejected similarly as claim 6 above.

Regarding claim 7 of the instant application:

Instant application: 1/24/2025 – 19036646 – claim 7

The method of claim 1, wherein the performing query filtering with metadata associated with the input provides an exact match result, wherein the input in the text format to search the input portion of the cache provides an approximate match ranked by the lexical relevance values.

Reference application: 9/3/2024 – 18441863 – claim 7

The method of claim 3, wherein the performing query filtering with Q-metadata provides an exact match result, wherein the question in the text format to search a question portion of the cache provides an approximate match ranked by the lexical relevance values.

Corresponding system claim 18 is rejected similarly as claim 7 above.

Regarding claim 8 of the instant application:

Instant application: 1/24/2025 – 19036646 – claim 8

The method of claim 1, further comprising: in response to a subsequent input being received, the cache is first checked to see if a similar request has already been made and, in response, retrieving the output from the cache.

Reference application: 9/3/2024 – 18441863 – claim 8

The method of claim 1, further comprising: in response to a subsequent input being received, the cache is first checked to see if a similar request has already been made and, in response, retrieving the output from the cache.

Corresponding system claim 19 is rejected similarly as claim 8 above.

Regarding claim 9 of the instant application:

Instant application: 1/24/2025 – 19036646 – claim 9

The method of claim 1, wherein the accessing the input in the text format includes accessing input that originated from a human user or from a computer process.

Reference application: 9/3/2024 – 18441863 – claim 9

The method of claim 1, wherein the accessing a question in a text format includes accessing a question that originated from a human user or from a computer process.

Regarding claim 10 of the instant application:

Instant application: 1/24/2025 – 19036646 – claim 10

The method of claim 2, wherein the performing query filtering with the metadata associated with the input provides an exact match result, wherein the high dimensional vector to search the input portion of the cache provides an approximate match ranked by the semantic relevance values.

Reference application: 9/3/2024 – 18441863 – claim 6

The method of claim 1, wherein the performing query filtering with the metadata associated with the question provides an exact match result, wherein the high dimensional vector to search the question portion of the cache provides an approximate match ranked by the semantic relevance values.

Regarding claim 11 of the instant application:

Instant application: 1/24/2025 – 19036646 – claim 11

The method of claim 2, wherein the performing query filtering with the metadata associated with the input provides an exact match result, wherein the high dimensional vector to search the input portion of the cache provides an approximate match ranked by the lexical relevance values.
Reference application: 9/3/2024 – 18441863 – claim 7

The method of claim 2, wherein the performing query filtering with the metadata associated with the input provides an exact match result, wherein the high dimensional vector to search the input portion of the cache provides an approximate match ranked by the lexical relevance values.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-3, 6-14, and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over US 20210382923 A1 (Gragnani, Louis Rudolph et al.; hereinafter Gragnani) in view of US 20220156298 A1 (Mahmoud, Mohamed Gamal Mohamed et al.; hereinafter Mahmoud) and US 20150081279 A1 (SULEMAN, Kaheer et al.; hereinafter Suleman).

Regarding claim 1, Gragnani teaches A method for improving computer functionality by retrieving outputs to inputs from a cache, the method comprising:

using a hardware processor communicatively coupled to memory to perform accessing an input, stored in primary storage communicatively coupled to a cache, in a text format; (Gragnani [FIG. 1] shows retrieving answers to questions from cache and using a hardware processor communicatively coupled to memory to perform accessing a question in a text format. [0015] Tools and techniques for answering questions by retrieving data from a system of record utilizing natural language interpretation are described. More specifically, the invention pertains to the use of machine learning to retrieve answers from a system of record. [0036] "Computer storage media" can comprise volatile and non-volatile, removable, and non-removable media implemented in any methods or technology for storage of information such as computer readable instructions, data structures, program modules, or other data. [FIG. 10] shows corresponding flow of steps taking in question with text, creating embeddings/vectors along with metadata and then performing a corresponding search/query using said embeddings/vectors.)

accessing metadata associated with the input in the text format; (Gragnani [0040] question in the form of text data can also include metadata objects. In a further aspect, the metadata objects can include previously completed phrase metadata.
[0047] templates may be specified by the user 14 within the user question 15 or specified by an administrator 16 in the configuration of the metadata term describing the result value as understood in the given interpretation. [0054] question structure by treating names/instance IDs as variables in the expression. [0065] At step 706, another classifier predicts the intended metadata object ID. The prediction of the metadata object ID can be determined by implementing a data abstraction layer (metadata model). The metadata model can comprise a plurality of metadata objects... [0066] An example implementation of the metadata model can comprise responding to the question...)

vectorizing the input in the text format into a vector using a text embedding algorithm; (Gragnani [FIG. 1] shows vectorizing the question in the text format into a high dimensional vector using a text embedding algorithm. [0064] a user's question/request can be vectorized wherein each word in the question is converted. [0099] processes the question texts ... by passing question texts at input X 1006 to a word embedding model. This word embedding model may be pre-trained. The calculation of X1 1006a may occur in the same processor as executes method ... occur in the processor of a vectorization. [FIG. 10] shows corresponding flow of steps taking in question with text, creating embeddings/vectors along with metadata and then performing a corresponding search/query using said embeddings/vectors.)

performing a semantic search using the vector to search an input portion of the cache and performing query filtering with the metadata associated with the input to provide a semantic layer set of semantic outputs in the text format with associated semantic output metadata and semantic relevance values; (Gragnani [FIG. 10] shows corresponding flow of steps taking in question with text, creating embeddings/vectors along with metadata and then performing a corresponding search/query using said embeddings/vectors and using different types/models to retrieve the data and create a mixed/hybrid search. [0043] processes the user question ... query results. [0061] a classifier predicts the semantic type of the request ... [0050] execute the search across the various storage locations and associated protocols and maintain stability in the system 100. [0092] query model ... outputs of column model 1004 can be initialized per the acceptable roles/aggregation functions/sort directions expected from the semantic type of "c", including copies of the column model 1004 each responsible for learning about a different data element "c" in the configuration of method 900. In Training step 912, the training module 164 can assign the sampled question texts to query model input "X" 1006. During the training step 912, the training module 164 also assigns the associated sampled Abstract Query IDs to the query model. [FIG. 1] shows using the high dimensional vector to search a question portion of a cache and performing query filtering with the metadata associated with the question to provide a semantic layer set of semantic answers in a text format with metadata associated with an answer and semantic relevance values. [0079-82] elaborate on the semantics involved in the answers. [40, 50, 1108] further show using different types/models to retrieve the data and create a mixed/hybrid search.)

Gragnani lacks an explicit teaching of performing a lexical search using the input in the text format to search an output portion of the cache and performing query filtering with the metadata associated with the input to provide a lexical layer set of lexical outputs in the text format with associated lexical output metadata and lexical relevance values.

However, Mahmoud teaches performing a lexical search using the input in the text format to search an output portion of the cache and performing query filtering with the metadata associated with the input to provide a lexical layer set of lexical outputs in the text format with associated lexical output metadata and lexical relevance values; (Mahmoud [0054] The query filter 216 may filter out queries. The query filter 216 may use approaches for query filtering including detecting whether a query is small talk (a binary classifier using an algorithm like Naive Bayes), and detecting whether a query is a question (whether it's interrogative, declarative, or imperative) such as by using lexical sequential and syntactic structural patterns and semantic representations.
Additionally, the query filter 216 may use approaches such as detecting whether a query is relevant to given corpora (a knowledge base profile), such as by using word-based relevance (e.g., frequency, Log-Likelihood, TF-IDF, clustering-based relevance to words in given corpora) and sentence similarity threshold (e.g., using cosine similarity of sentence embeddings) with pre-computed cluster centroids (e.g., using k-means) of sentences in given corpora. Additionally, the query filter 216 may use approaches such as an ensemble (combinations using plurality voting, weighted average, etc.) of the approaches above. [0084] Generally, sentence-embedding methods capture semantic similarity between non-exact matches (different phrases and/or words whose meanings are similar) while IoU methods typically consider exact lexical matches (e.g., the words "NYC" and "New York City" don't match) unless a multi-term synonyms graph is used to test for exact and expanded list of terms for each input.)

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to take all prior methods and make the addition of Mahmoud's data processing in order to improve the quality of the system and generate higher quality data (Mahmoud [AB.] Techniques for agent-assist systems to provide context-aware, subdocument-granularity recommended answers to agents that are attempting to answer queries of users. The agent-assist system may obtain collections of documents that include information for responding to queries, and analyze those documents to identify subdocuments that are associated with different semantics or meanings. Subsequently, any queries received can be analyzed to identify their semantics, and relevant subdocuments can be identified as having similar semantics. When the agent-assist system presents the agent with the relevant documents, it may highlight or otherwise indicate the relevant subdocument within the document for quick identification by the agent. Further, the agent-assist system may collect feedback from the agent and/or user to determine a relevancy of the recommended answers. The agent-assist system can use the feedback to improve the quality of the recommended answers provided to the agents. [0002] The present disclosure relates generally to an agent-assist system that provides context-aware recommendations to agents that are attempting to answer queries of users, and improves the quality of the recommended answers provided to agents when responding to the queries of users. [0042] The agent-assist system 118 may include or support the components and operations of the agent-assist pipeline 202. The agent-assist system 118 may be configured to help agents 112 while handling user 104 interactions by recommending resources that are relevant to a user's 104 issue or question. Goals of the agent-assist system 118 may include decreasing the average handling time (AHT), increasing the first-contact resolution (FCR) rate, minimizing agent 112 training time, and providing a fast and accurate source of information.)

The combination lacks an explicit teaching of computing a combined ranking set for the semantic outputs in the text format and the lexical outputs in the text format, using the semantic layer set in order of the semantic relevance values from highest to lowest and the lexical layer set in order of the lexical relevance values from highest to lowest, to provide an identified output.
However, Suleman teaches computing a combined ranking set for the semantic outputs in the text format and the lexical outputs in the text format, using the semantic layer set in order of the semantic relevance values from highest to lowest and the lexical layer set in order of the lexical relevance values from highest to lowest, to provide an identified output. (Suleman [0034] NLP engine 214 may include a statistical classifier configured to apply one or more statistical models 408, 410 in classifying the text representation of the audio input 152 into a domain, and in some embodiments, into a sub-domain as well. A domain is a general class of action or functionality such as the CALENDAR domain, the WEATHER domain, and so forth. In one embodiment, each domain of functionality offered by application 101 has a particular statistical model 408 that is trained to recognize queries that lie within the domain of functionality offered by the domain. The natural language processor may include one type of classifier or a combination of classifiers, such as support vector machines (SVMs), random forest trees, Naive Bayes classifiers, and so forth. In one embodiment, more than one type of classifier is used to classify the text representation of the audio input 152, the classifiers being aggregated using a neural network and/or reciprocal rank fusion. Classifying a given text representation of the audio input 152 into a domain may be performed by a primary classifier which has access to the statistical models 408 for each domain, as well as a variety of aggregators as employed in a particular implementation.)

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to take all prior methods and make the addition of Suleman in order to improve the quality of the system output via specialized algorithms and classifiers (Suleman [0034], quoted above. [0052] It should be noted that the rules 434 applied by the post-processor above are merely exemplary and will typically be tailored to a particular application 101 and may be modified over time. The rules 434 supplement the classifier 404 and allow the adaptive nature of statistical classifiers to be fine-tuned as text representations are received. The rules 434 also provide a mechanism for improving the performance of a statistical classifier without having to train and redeploy statistical models, which often reside in a cloud-computing environment. Accordingly, new cases can be added to a rules text file and this file deployed without having to retrain any models.)

Corresponding system claim 12 is rejected similarly as claim 1 above. Additional limitations: device with processor(s) and memory (Gragnani [FIG. 1] shows corresponding device with processor(s) and memory.)

Regarding claim 2, Gragnani, Mahmoud and Suleman teach The method of claim 1, wherein the vectorizing the input in the text format into the vector using the text embedding algorithm includes vectorizing the input in the text format into a high dimensional vector, wherein the high dimensional vector is greater than or equal to 1024 dimensions and wherein the performing the semantic search includes performing the semantic search using the high dimensional vector. (Mahmoud [0046] Generally, ingestion breaks down a document 206 into subdocuments using signals from the source document; for example, an HTML document contains boundary markup for sections, headers, paragraphs, lists, etc. Other signals include topic-drift measures, length of buffer of text being read so far, etc.
The document processor 210 performs techniques to break down (using topic drift among other signals), transform, and index a collection contact-center knowledge base documents 206 into subdocuments represented by embeddings (points in a high-dimensional vector space) to enable semantic approximate nearest neighbor (ANN) search for a given query (a turn/utterance of a contact-center conversation). [0057] After the query filter 216 filters out irrelevant queries or input from the user 104, a retriever component 218 may send queries at “6” to the knowledge-base storage 212 to identify relevant documents. That is, for each query that passes through the query filter 216, the retriever component 218 may embed the query using an embedding model (that preserves the semantic relationships detailed earlier) into a vector q, and searches the vector space index that corresponds to the agent's 112 knowledge base profile. An agent 112 handling a user conversation can be assigned to a single profile at a time, Each subdocument in the knowledge base profile, with references to the document to which it belongs, is represented as a point in the high-dimensional vector space (e.g., 1024 dimensions). The agent-assist system 118 finds the k nearest points, embedded using the same embedding model, to the input q. The cosine similarity measure is used to rank and score the top [FIG. 3] shows corresponding flow of steps) Corresponding system claim 13 is rejected similarly as claim 2 above. Regarding claim 3, Gragnani, Mahmoud and Suleman teach The method of claim 1, wherein the computing a combined ranking includes computing the combined ranking using a reciprocal rank fusion algorithm. (Suleman [0034] NLP engine 214 may include a statistical classifier configured to apply one or more models statistical models 408, 410 in classifying the text representation of the audio input 152 into a domain, and in some embodiments, into a sub-domain as well. 
A domain is a general class of action or functionality such as the CALENDAR domain, the WEATHER domain, and so forth. In one embodiment, each domain of functionality offered by application 101 has a particular statistical model 408 that is trained to recognize queries that lie within the domain of functionality offered by the domain. The natural language processor may include one type of classifier or a combination of classifiers, such as support vector machines (SVMs), random forest trees, Naive Bayes classifiers, and so forth. In one embodiment, more than one type of classifier is used to classify the text representation of the audio input 152, the classifiers being aggregated using a neural network and/or reciprocal rank fusion. Classifying a given text representation of the audio input 152 into a domain may be performed by a primary classifier which has access to the statistical models 408 for each domain, as well as a variety of aggregators as employed in a particular implementation.)

Corresponding system claim 14 is rejected similarly as claim 3 above.

Regarding claim 6, Gragnani, Mahmoud and Suleman teach The method of claim 1, wherein the performing query filtering with the metadata associated with the input provides an exact match result, wherein the vector to search the input portion of the cache provides an approximate match ranked by the semantic relevance values. (Mahmoud [0046] to enable semantic approximate nearest neighbor (ANN) search for a given query [0047] Embeddings, their respective text and metadata, and other data can be indexed using an indexing service (e.g., Elasticsearch), Hierarchical Navigable Small World graphs, any system that supports Approximate Nearest Neighbor (ANN) search [0049] The document processor 210 then relies on the rest of the pipeline 202 to perform a more refined search given the list of retrieved matches.
[0054] The query filter 216 may filter out queries. The query filter 216 may use approaches for query filtering including detecting whether a query is small talk (a binary classifier using an algorithm like Naive Bayes), and detecting whether a query is a question (whether it's interrogative, declarative, or imperative) such as by using lexical sequential and syntactic structural patterns and semantic representations. [0057] The cosine similarity measure is used to rank and score the top-k candidate matches. [0081] matching set of criteria to determine similarity—i.e., both exact and fuzzy match techniques are used when determining query similarity and corresponding answers ...a given input query.)

Corresponding system claim 17 is rejected similarly as claim 6 above.

Regarding claim 7, Gragnani, Mahmoud and Suleman teach The method of claim 1, wherein the performing query filtering with metadata associated with the input provides an exact match result, wherein the input in the text format to search the input portion of the cache provides an approximate match ranked by the lexical relevance values. (Mahmoud [0046] to enable semantic approximate nearest neighbor (ANN) search for a given query [0047] Embeddings, their respective text and metadata, and other data can be indexed using an indexing service (e.g., Elasticsearch), Hierarchical Navigable Small World graphs, any system that supports Approximate Nearest Neighbor (ANN) search [0049] The document processor 210 then relies on the rest of the pipeline 202 to perform a more refined search given the list of retrieved matches.
[0054] The query filter 216 may filter out queries. The query filter 216 may use approaches for query filtering including detecting whether a query is small talk (a binary classifier using an algorithm like Naive Bayes), and detecting whether a query is a question (whether it's interrogative, declarative, or imperative) such as by using lexical sequential and syntactic structural patterns and semantic representations. [0057] The cosine similarity measure is used to rank and score the top-k candidate matches. [0084] Generally, sentence-embedding methods capture semantic similarity between non-exact matches (different phrases and/or words whose meanings are similar) while IoU methods typically consider exact lexical matches (e.g., the words “NYC” and “New York City” don't match) unless a multi-term synonyms graph is used to test for exact and expanded list of terms for each input. [0081] matching set of criteria to determine similarity—i.e., both exact and fuzzy match techniques are used when determining query similarity and corresponding answers ...a given input query.)

Corresponding system claim 18 is rejected similarly as claim 7 above.

Regarding claim 8, Gragnani, Mahmoud and Suleman teach The method of claim 1, further comprising: in response to a subsequent input being received, the cache is first checked to see if a similar request has already been made and, in response, retrieving the output from the cache. (Gragnani [0003] The query models can learn from previous data or data samples. [0041] user rankings can be inferred by a user's history of questions... [0043] search parameters from previous searches. In the event that query results involve tabular or chart data, question application 108 can provide statistical analysis tools including calculations and trendline [FIG. 1] shows the system capable of using past data.)

Corresponding system claim 19 is rejected similarly as claim 8 above.
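The hybrid retrieval of claim 1 and the reciprocal rank fusion of claim 3 can be sketched as follows. This is an illustrative sketch of how such a combiner conventionally works, not the applicant's actual implementation: the function names, the sample outputs, and the k = 60 smoothing constant are assumptions, and cosine similarity stands in for the semantic-layer ranking the Mahmoud reference describes.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (used to order the semantic layer)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists: score(d) = sum over lists of 1 / (k + rank(d)).

    A document ranked well in *both* layers accumulates a higher fused score
    than one ranked well in only a single layer.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)  # best fused score first

# Semantic layer: cached outputs ordered by semantic relevance (e.g., cosine)
semantic = ["out_A", "out_B", "out_C"]
# Lexical layer: cached outputs ordered by lexical relevance (e.g., keyword match)
lexical = ["out_B", "out_C", "out_A"]

combined = reciprocal_rank_fusion([semantic, lexical])
# out_B places 2nd and 1st, so it edges out out_A (1st and 3rd) after fusion
```

Note the design point RRF is valued for: it needs only ranks, so the semantic and lexical relevance values never have to be calibrated against each other.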
Regarding claim 9, Gragnani, Mahmoud and Suleman teach The method of claim 1, wherein the accessing the input in the text format includes accessing input that originated from a human user or from a computer process. (Gragnani [0017] The question text is generated from the user's initial query [0042] Referring back to FIG. 1, user question 15 can comprise a text representation received from a keyboard device or a text conversion of an audio signal from the user…)

Regarding claim 10, Gragnani, Mahmoud and Suleman teach The method of claim 2, wherein the performing query filtering with the metadata associated with the input provides an exact match result, wherein the high dimensional vector to search the input portion of the cache provides an approximate match ranked by the semantic relevance values. (Mahmoud [0046] to enable semantic approximate nearest neighbor (ANN) search for a given query [0047] Embeddings, their respective text and metadata, and other data can be indexed using an indexing service (e.g., Elasticsearch), Hierarchical Navigable Small World graphs, any system that supports Approximate Nearest Neighbor (ANN) search [0049] The document processor 210 then relies on the rest of the pipeline 202 to perform a more refined search given the list of retrieved matches. [0054] The query filter 216 may filter out queries. The query filter 216 may use approaches for query filtering including detecting whether a query is small talk (a binary classifier using an algorithm like Naive Bayes), and detecting whether a query is a question (whether it's interrogative, declarative, or imperative) such as by using lexical sequential and syntactic structural patterns and semantic representations. [0057] The cosine similarity measure is used to rank and score the top-k candidate matches.
[0081] matching set of criteria to determine similarity—i.e., both exact and fuzzy match techniques are used when determining query similarity and corresponding answers ...a given input query.)

Regarding claim 11, Gragnani, Mahmoud and Suleman teach The method of claim 2, wherein the performing query filtering with the metadata associated with the input provides an exact match result, wherein the high dimensional vector to search the input portion of the cache provides an approximate match ranked by the lexical relevance values. (Mahmoud [0046] to enable semantic approximate nearest neighbor (ANN) search for a given query [0047] Embeddings, their respective text and metadata, and other data can be indexed using an indexing service (e.g., Elasticsearch), Hierarchical Navigable Small World graphs, any system that supports Approximate Nearest Neighbor (ANN) search [0049] The document processor 210 then relies on the rest of the pipeline 202 to perform a more refined search given the list of retrieved matches. [0054] The query filter 216 may filter out queries. The query filter 216 may use approaches for query filtering including detecting whether a query is small talk (a binary classifier using an algorithm like Naive Bayes), and detecting whether a query is a question (whether it's interrogative, declarative, or imperative) such as by using lexical sequential and syntactic structural patterns and semantic representations. [0057] The cosine similarity measure is used to rank and score the top-k candidate matches. [0084] Generally, sentence-embedding methods capture semantic similarity between non-exact matches (different phrases and/or words whose meanings are similar) while IoU methods typically consider exact lexical matches (e.g., the words “NYC” and “New York City” don't match) unless a multi-term synonyms graph is used to test for exact and expanded list of terms for each input.
[0081] matching set of criteria to determine similarity—i.e., both exact and fuzzy match techniques are used when determining query similarity and corresponding answers ...a given input query.)

Claims 5, 6, 16 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Gragnani in view of Mahmoud, Suleman and US 20220366333 A1; Lollo; Niklas et al. (hereinafter Lollo).

Regarding claim 4, Gragnani, Suleman and Mahmoud teach The method of claim 1, but the combination lacks an explicit and orderly teaching of further comprising: in response to the combined ranking set being above a settable value, returning the identified output and, otherwise, sending the input in the text format to create a prompt. However, Lollo teaches further comprising: in response to the combined ranking set being above a settable value, returning the identified output and, otherwise, sending the input in the text format to create a prompt. (Lollo [0066] The assessment system 102 may generate output that indicates what metrics (e.g., any of the scores, values, or metrics described above) are above a threshold. The assessment system 102 may generate output that indicates a sufficient number of questions for an impact category but that also indicates that a question quality score for the impact category is below a threshold. The assessment system 102 may generate a recommendation to add, remove, or modify one or more questions to improve question quality for an assessment.
In some embodiments, a lack of data quality may lead to users supplementing their survey with better questions, further analysis of survey responses, or engagement with assessment managers to adopt better question-practices. [0067] The assessment system 102 may determine these gaps and display one or more indications of the gaps in a user interface (e.g., a dashboard). In some embodiments, the assessment system 102 may determine whether adding one or more questions to an assessment will decrease the reliability score of the assessment (e.g., decrease the reliability score more than a threshold amount). The assessment system 102 may balance between adding more questions to an assessment and tiring the survey respondent, leading to poor quality responses. In some embodiments, coherence across questions (including content, style, and jargon) may be considered when surfacing questions. [0074] improve (e.g., more than a threshold) a score for an impact category. For example, after adding one or more questions the assessment system 102 may recalculate a score for an impact category. If the score is less than a threshold amount higher than the score before the one or more questions were added, the assessment system 102 may determine that no more questions need to be added to the assessment. In response, the assessment system 102 may output an indication to the user device.)

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to take all prior methods and make the addition of Lollo in order to improve the natural language processing of the system via thresholding methods. (Lollo [0002] To improve value chain sustainability, many assessments have been developed. The assessments may include a variety of questions to help an organization measure the sustainability of its supply chain.
For example, an assessment used to address workplace safety may include a question about how often workers are given workplace safety training. [0006] The computing system can compare the scores with other scores to determine how the current assessment can be improved. For example, by comparing the scores with other scores, the computing system can determine a question to add to the assessment. Mapping different assessments to the same impact categories allows the computing system to compare assessments and may eliminate the disparity problems ... through the use of machine learning or natural language processing. [0045] a data quality score for an assessment, question, or response. The data quality score may be output to the user device 104 to inform a user or may be used to generate recommendations for improving the data quality [0074] improve (e.g., more than a threshold) a score for an impact category. For example, after adding one or more questions the assessment system 102 may recalculate a score for an impact category. If the score is less than a threshold ...)

Corresponding system claim 16 is rejected similarly as claim 5 above.

Regarding claim 6, Gragnani, Suleman, Mahmoud, and Lollo teach The method of claim 1, wherein the performing query filtering with the metadata associated with the input provides an exact match result, wherein the vector to search the input portion of the cache provides an approximate match ranked by the semantic relevance values. (Mahmoud [0046] to enable semantic approximate nearest neighbor (ANN) search for a given query [0047] Embeddings, their respective text and metadata, and other data can be indexed using an indexing service (e.g., Elasticsearch), Hierarchical Navigable Small World graphs, any system that supports Approximate Nearest Neighbor (ANN) search [0049] The document processor 210 then relies on the rest of the pipeline 202 to perform a more refined search given the list of retrieved matches.
[0054] The query filter 216 may filter out queries. The query filter 216 may use approaches for query filtering including detecting whether a query is small talk (a binary classifier using an algorithm like Naive Bayes), and detecting whether a query is a question (whether it's interrogative, declarative, or imperative) such as by using lexical sequential and syntactic structural patterns and semantic representations. [0057] The cosine similarity measure is used to rank and score the top-k candidate matches. [0081] matching set of criteria to determine similarity—i.e., both exact and fuzzy match techniques are used when determining query similarity and corresponding answers ...a given input query.)

Corresponding system claim 17 is rejected similarly as claim 6 above.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ARYAN D TOUGHIRY whose telephone number is (571) 272-5212. The examiner can normally be reached Monday - Friday, 9 am - 5 pm. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Aleksandr Kerzhner, can be reached at (571) 270-1760. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov.
Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ARYAN D TOUGHIRY/
Examiner, Art Unit 2165

/ALEKSANDR KERZHNER/
Supervisory Patent Examiner, Art Unit 2165
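The behavior the examiner reaches Lollo for in claims 4/5, returning the cached output when the combined ranking clears a settable value and otherwise building a fresh prompt, can be sketched as a simple gate. The function name, the result shape, and the 0.8 threshold are illustrative assumptions, not taken from the application or Lollo:

```python
def resolve(input_text, fused_results, threshold=0.8):
    """Gate a cache lookup on a settable combined-ranking threshold.

    fused_results: list of (cached_output, combined_score) pairs, best first,
    e.g. the output of a reciprocal-rank-fusion step.
    """
    if fused_results and fused_results[0][1] >= threshold:
        # Cache hit: the top fused score clears the settable value,
        # so return the stored output without invoking the LLM.
        return ("cache", fused_results[0][0])
    # Cache miss: fall through and turn the text input into a prompt.
    return ("prompt", f"Answer the following: {input_text}")

hit = resolve("what is RRF?", [("RRF combines ranked lists.", 0.91)])
miss = resolve("what is RRF?", [("unrelated answer", 0.42)])
```

Making the threshold settable is what lets an operator trade cache-hit rate against answer quality, which is the knob the claim language ("a settable value") appears to recite.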

Prosecution Timeline

Jan 24, 2025
Application Filed
Jan 26, 2026
Non-Final Rejection — §103, §DP
Apr 14, 2026
Interview Requested

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602374
DATA ACQUISITION METHOD AND APPARATUS, COMPUTER DEVICE AND STORAGE MEDIUM
2y 5m to grant Granted Apr 14, 2026
Patent 12596596
USER-SPACE PARALLEL ACCESS CHANNEL FOR TRADITIONAL FILESYSTEM USING CAPI TECHNOLOGY
2y 5m to grant Granted Apr 07, 2026
Patent 12579141
GENERATING QUERY ANSWERS FROM A USER'S HISTORY
2y 5m to grant Granted Mar 17, 2026
Patent 12572390
SYSTEMS AND METHODS FOR ADAPTIVE WEIGHTING OF MACHINE LEARNING MODELS
2y 5m to grant Granted Mar 10, 2026
Patent 12573292
VEHICLE IDENTIFICATION USING ADVANCED DRIVER ASSISTANCE SYSTEMS (ADAS)
2y 5m to grant Granted Mar 10, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
68%
Grant Probability
88%
With Interview (+19.9%)
3y 1m
Median Time to Grant
Low
PTA Risk
Based on 189 resolved cases by this examiner. Grant probability derived from career allow rate.
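The projection figures above follow from simple arithmetic on the examiner's career counts. As a sanity check (assuming, as an illustration, that the dashboard derives the with-interview figure by adding the stated lift in percentage points to the base rate, which is not documented here):

```python
# Examiner's career counts as shown in the dashboard
granted, resolved = 128, 189

allow_rate = granted / resolved      # career allow rate: about 0.677, shown as 68%
interview_lift = 0.199               # stated +19.9% lift among interviewed cases
with_interview = allow_rate + interview_lift  # about 0.876, shown as 88%
```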
