Prosecution Insights
Last updated: April 19, 2026
Application No. 18/380,505

NATURAL LANGUAGE QUESTION ANSWERING WITH EMBEDDING VECTORS

Final Rejection §103
Filed: Oct 16, 2023
Examiner: ZHU, RICHARD Z
Art Unit: 2654
Tech Center: 2600 — Communications
Assignee: Permanence AI Inc.
OA Round: 2 (Final)
Grant Probability: 69% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 3y 2m
Grant Probability with Interview: 85%

Examiner Intelligence

Career Allow Rate: 69%, above average (498 granted / 718 resolved; +7.4% vs TC avg)
Interview Lift: +15.4% (strong; based on resolved cases with interview)
Avg Prosecution: 3y 2m (32 applications currently pending)
Total Applications: 750 (across all art units)

Statute-Specific Performance

§101: 16.0% (-24.0% vs TC avg)
§103: 54.5% (+14.5% vs TC avg)
§102: 19.7% (-20.3% vs TC avg)
§112: 4.2% (-35.8% vs TC avg)
Based on career data from 718 resolved cases; Tech Center averages are estimates.

Office Action

§103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Acknowledgement

Acknowledgement is made of applicant’s amendment made on 10/14/2025. Applicant’s submission filed has been entered and made of record.

Status of the Claims

Claims 1-22 are pending.

Response to Applicant’s Arguments

In response to “However, the provisional application does not describe nor make obvious all the elements cited and relied by the Office for the rejections of claim 1. Many of the elements and features relied upon by the office are first introduced in the application filed on October 30, 2023 (after the filing date of the present application) and, therefore, do not qualify as prior art” and “For example, on pages 4-5 of the Office Action, the Office cited paragraphs 35 and 37-38 as allegedly describing a score that reflects the accuracy of the first response in relation to an expected output format. The paragraphs refer to Figs. 2A-2C and relate to prompt features. However, as best understood by the Applicant, Liu's provisional application fails to provide any such descriptions and/or figures. Applicant submits that other cited paragraphs and features of Liu are likewise unsupported by the provisional application”:

The provisional specification of 63/510074 at ¶48 states: “This application is further described with respect to the attached document in Appendix I, titled Investigating the Integration of Retriever for Enhancing Answer Generation with LLMs, 13 pages, which is considered part of this disclosure and the entirety of which is incorporated by reference”. According to Appendix I on p. 1, “1 Introduction”, “To address this issue, a promising approach has been the integration of retrieval-based techniques…, where relevant documents or text passages are fetched to aid the answer generation process”. 
[Figure: media_image1.png (greyscale)]

Further according to Appendix I on p. 2 of “1 Introduction”, “We initially focus on two CoT strategies incorporating in-context learning, demonstrating a pruning strategy and a selection strategy on retrieved passages to guide the LLM’s answer generation. Subsequently, we investigate two feedback methods, beginning with presenting the retrieved passages to the LLM, gathering its responses, and then modifying our LLM interaction based on this feedback”.

Specifically, according to Appendix I on p. 2, “3 Method”: “Our initial focus is on several single-round strategies, wherein the retrieved passages are directly forwarded to the LLM. Subsequently, we delve into several multi-round approaches, which entails initially supplying the retrieved passages to the LLM, obtaining feedback, and then modifying the process by which we engage with the LLM based on that feedback”.

In particular, according to Appendix I on p. 2, “3.0.1 One Shot”, “Prune Prompt”, “This necessitates the LLM to accurately identify answerable passages through a process of elimination. Consequently, the demonstration involves discerning irrelevant passages to the question at hand, pinpointing the appropriate passage that can provide an answer, and consequently delivering the final response” and “Select Prompt Summarization represents a strategy that extracts the central information from the Top-k passages…Based on this synthesized summary, the LLM can produce the final answer”.

Further still, according to Appendix I on pp. 2-3, “3.2 First Concatenation Second Separate”, “Initially, we employ the concatenation method to obtain the predicted answer from LLMs. If the LLM determines that the input passages are unable to provide a response to the question, we then proceed to the second round where we utilize a separate approach to predict the answer pool. Finally, we employ a majority vote system to select the final answer”. 
These disclosures directly support Liu et al. (US 2024/0428044 A1) at ¶35 and ¶¶37-38 cited by the examiner regarding “a prompt provides a desired output format 204 such as instructing LLM 120 to effectively identify answerable passages and an answering format example 222, a reasoning and output format 224, and a demonstrative example 226 that initially identifies the relevant information and then summarize the relevant information like chain of thought and generate the final answer. i.e., select the most accurate final answer / response according to the rating / ranking of the equation in ¶¶32-33”. In particular, Answer Format Example 222 in Fig. 2C is the result of majority vote system (i.e., rating / ranking of responses) described in Appendix I, Reasoning and Output Format 224 corresponds to CoT (i.e., chain of thought, “Let’s think step by step” in Fig. 2C, 224) described in Appendix I, and Summary Reasoning Demo 226 corresponds to summarization strategy described in Appendix I. Therefore, Appendix I provides adequate support for prior art Liu’s priority date of 06/23/2023, which is before the effective filing date of the instant application.

Claim Rejections - 35 USC § 103

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 103 that form the basis for the rejections under this section made in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1 and 3-21 are rejected under 35 USC 103(a) as being unpatentable over Xu et al. 
(US 2024/0095460 A1) in view of Liu et al. (US 2024/0428044 A1). Regarding Claims 1, 11, and 18, Xu discloses a system (¶32 and Fig. 1, dialogue systems), comprising: at least one server computer (¶179 and Fig. 9D, server 978) comprising at least one processor (¶179, CPU 980) and at least one memory (¶186, memory 1004), the at least one server computer configured to: receive a natural language question (¶¶33-34, processing audio data 104 to generate text data 106 (per ¶35, audio data 202 / text data 204 and transcript 206) representing a task / question being requested by the user); determine that the natural language question relates to a first reference document (¶¶47-48, a retrieval component 114 retrieves contextual data 116 associated with text data 106 from information database 112 (which stores manual, document, webpages per ¶47) based on contextual information being related to the same topic, component, feature as the question represented by text data 106); compute an embedding vector for the natural language question, wherein the embedding vector represents the natural language question in a vector space (¶41, generate embeddings 204 based on text data 204 representing transcript 206); select one or more question-and-answer pairs from a set of available question-and-answer pairs using the embedding vector (¶39, retrieve one or more question / answer pairs that are associated with the text data 106 from information database 112 (per ¶36, database 112 stores a number of question / answer pairs) based on a question / answer pair is associated with the text data 106 as being related to the same topic, component, feature; ¶42, retrieval component 108 uses embedding 404 associated with text data 204 and embeddings 402 associated with question / answer pairs to retrieve a threshold number of question / answer pairs); create a prompt for a language model (Fig. 
1, prompt component 118 and language model 124), the prompt comprising: a representation of the natural language question, a representation of the one or more question-and-answer pairs, and an expected output format, wherein the expected output format requests a quotation of the first reference document or a citation to the first reference document (¶58, prompt component 118 uses text data 106, question / answer data 110, and contextual data 116 to generate prompt data 122; per ¶61, if audio data 104 includes a question about a component of the vehicle, then the output data 126 represent information about the component of the vehicle; in view of ¶49, contextual information is associated with an OEM manual of a vehicle; i.e., the prompt data to the language model requires output format to include at least portion of an OEM manual (“citation”) corresponding to the component of the vehicle); submit the prompt to the language model (¶61, input the prompt data into a language model); receive a plurality of responses from the language model, the plurality of responses including a first response (¶71, the language model 124 processes prompt data 122 to generate output data 126 associated with the question). Xu does not teach compute response scores for the plurality of responses. Liu teaches a question answering framework that generates a plurality of responses / answers to an input question based on retrieving a plurality of supporting documents in parallel and selects one or more relevant answers as final response (¶21). 
Based on relevant passages / documents in response to an input question, the framework uses a language model / LLM to generate a respective answer / response using each selected passage and compute response scores for the plurality of responses (¶21, a language model may then rate or rank the answers in the pool to generate the final response to the input question; see equation of ¶33), wherein the response scores include a first response score and the first response score reflects an accuracy of the first response in relation to an expected output format (¶35 and ¶¶37-38, a prompt provides a desired output format 204 such as instructing LLM 120 to effectively identify answerable passages and an answering format example 222, a reasoning and output format 224, and a demonstrative example 226 that initially identifies the relevant information and then summarize the relevant information like chain of thought and generate the final answer. i.e., select the most accurate final answer / response according to the rating / ranking of the equation in ¶¶32-33); select the first response using the response scores (¶¶32-33, select a most accurate final answer using a majority voting mechanism according to the equation of ¶33); and determine an answer to the natural language question using the first response (¶39, select the final answer).

It would have been obvious to one ordinarily skilled in the art before the effective filing date of the invention to compute response scores reflecting accuracy of respective response to an expected output format in order to generate the most accurate final answer (Liu, ¶32).

Further regarding claim 18, Xu discloses one or more non-transitory, computer-readable media comprising computer-executable instructions that, when executed, cause at least one processor to perform actions of claims 1 and 11 (¶190, memory 1004 storing computer readable instructions for CPU 1006 of ¶188). 
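The pipeline mapped to claim 1 above combines embedding-based retrieval of question-and-answer pairs (Xu, ¶¶41-43) with selection among multiple model responses by a majority-vote mechanism (Liu, ¶¶32-33). A minimal sketch of that combination, where the cosine-similarity metric, the function names, and all vectors are illustrative assumptions rather than details from either reference:

```python
# Illustrative sketch (not the actual Xu/Liu implementations):
# 1) rank stored question/answer pairs by embedding similarity to the
#    question, 2) pick a final answer from candidate responses by
#    majority vote, analogous to Liu's argmax voting over an answer pool.
from collections import Counter
import numpy as np

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def select_qa_pairs(question_vec, qa_pairs, k=2):
    # Return the k stored (embedding, answer) pairs most similar to the question.
    ranked = sorted(qa_pairs, key=lambda p: cosine(question_vec, p[0]), reverse=True)
    return ranked[:k]

def majority_vote(responses):
    # Select the most frequent candidate response.
    return Counter(responses).most_common(1)[0][0]

# Toy 3-dimensional embeddings, purely for illustration.
q = np.array([1.0, 0.0, 0.0])
pairs = [
    (np.array([0.9, 0.1, 0.0]), "check the OEM manual, section 3"),
    (np.array([0.0, 1.0, 0.0]), "unrelated answer"),
    (np.array([0.8, 0.0, 0.2]), "see the owner's guide"),
]
top = select_qa_pairs(q, pairs, k=2)
final = majority_vote(["answer A", "answer B", "answer A"])
```

In this toy run the first stored pair ranks highest and "answer A" wins the vote; a real system would obtain the embeddings from a trained encoder rather than hand-written vectors.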
Regarding Claim 3, Liu discloses wherein computing the first response score comprises determining an inclusion of a quotation or a citation from the first reference document (Liu, ¶37, instruct LLM 120 to effectively identify answerable passages through a process of selective elimination).

Regarding Claim 4, Liu discloses wherein computing the first response score comprises verifying content of the quotation in the first reference document (Liu, ¶39 and Fig. 3, LLM 120 determines that input passages 112a-n are unable to provide an answer to question 102 (i.e., if don’t know the answer, just say unknown per Liu, ¶30), produce an answer pool by independently generating an answer for each passage 112a-n and apply an LLM prediction to select the final answer from answer pool).

Regarding Claim 5, Xu discloses wherein selecting the first response comprises creating a second prompt for a second language model, wherein the second prompt includes the representation of the natural language question and the first response (¶62, if user continues to ask questions associated with the vehicle, the prompt component 118 uses the contextual data 120 to continue generating the prompt data 122 for the questions, where the contextual data 120 represents a context associated with outputs to previous questions; alternatively, ¶64, process 100 to use a question represented by text data 106 to prompt the language model 124 to generate output data for comparison with answer from testing question / answer pair to determine system accuracy).

Regarding Claim 6, Xu discloses wherein the second prompt asks the second language model to determine a validity of the first response to the natural language question (¶¶64-65, perform process 100 using question represented by text data 106 to generate output data 126 associated with the question and compare the answer from a testing question / answer pair to the answer represented by output data 126 to determine if the dialogue system is accurate). 
Regarding Claims 7 and 19, Xu discloses wherein: the prompt comprises at least a portion of a reference document or a link to the reference document (¶58, the prompt data 122 includes contextual data 116; ¶¶47-48, contextual data 116 associated with text data 106 corresponds to manual, document, and webpage in information database 112); and selecting the first response comprises creating a second prompt for a second language model (¶¶63-64, when determine whether the dialogue system is accurate, using a question represented by text data 106 to prompt language model(s) 124 (i.e., more than one LM or second LM; compare to LM 126 in Fig. 1B of Liu) to generate output data 126 associated with the question), wherein the second prompt includes the representation of the natural language question and the first response (¶64, the question is from a testing question / answer pair so as to compare the text answer with answer represented by output data 126; i.e., the prompt to language model(s) 124 or a second LM 124 comprises the question represented by text data 106 and a corresponding test answer / response, and the response represented by output data 126 to perform a validation process).

Regarding Claim 8, Xu discloses wherein the second prompt asks the second language model to verify that the first response is consistent with the reference document (¶64, compare the answer from the testing question / answer pair to the answer represented by the output data 126; see also Liu, Fig. 1B, ¶33, LLM 120 generates respective answers including the first answer / first response as prompt to LLM 126 to select the most accurate answer as the final answer). 
Regarding Claim 9, Xu discloses determining pair scores for the set of available question-and-answer pairs (¶43, determine scores for the question / answer pairs using embedding 404 and embeddings 402); and selecting the one or more question-and-answer pairs using the pair scores (¶46, determine a first question / answer pair with the highest score that is most relevant to the text data 106).

Regarding Claim 10, Xu discloses wherein determining the pair scores comprises at least one of: determining a similarity of a question-and-answer pair to the natural language question (¶¶41-42, use embedding 404 associated with text data and embeddings 402 associated with question / answer pairs to identify the question / answer pairs that are the most similar to the transcript 206 represented by the text data 204; ¶43, determine scores 406(1)-(6) for the question / answer pairs to identify the question / answer pairs that are most similar to the transcript 206); determining a number of hallucinations generated by the language model when a question-and-answer pair was used in a previous prompt; determining a number of citations generated by the language model when a question-and-answer pair was used in a previous prompt; or determining a number of times a question-and-answer pair was used in a previous prompt.

Regarding Claim 12, Xu discloses wherein the at least one server computer is further configured to create the prompt for the language model by including the representation of the natural language question and the representation of the one or more question-and-answer pairs (¶58, prompt component 118 uses text data 106, question / answer data 110, and contextual data 116 to generate prompt data 122) as a dialogue history with the language model (¶62, prompt data 122 includes contextual data representing a context associated with outputs to previous questions). 
Regarding Claim 13, Xu discloses wherein the at least one server computer is further configured to: submit a second prompt to a second language model (¶64, process 100 to use a question represented by text data 106 to prompt the language models 124 (i.e., at least a second language model 124) to generate output data for comparison with answer from testing question / answer pair to determine system accuracy; compare Liu, Fig. 1B, this is parallel to LLM 126 / second LM to select the most accurate answer). As modified by Liu, the combination teaches receive a second plurality of responses from the second language model (Liu, ¶33, Fig. 1B, LLM 126 -> ak = LLM (q, pk)); compute second response scores for the second plurality of responses (Liu, ¶33, LLM 126 performs rating and ranking of answers in the answer pool according to majority voting mechanism equation argmax ai); and select the first response from the plurality of responses and the second plurality of responses (Liu, ¶32, select a most accurate final answer).

Regarding Claim 14, Xu as modified by Liu discloses wherein the at least one server computer is further configured to: create a second prompt for the language model (Liu, Fig. 1B, answer pool 122 -> LLM 126); receive a second plurality of responses from the language model using the second prompt (Liu, ¶33, Fig. 1B, LLM 126 -> ak = LLM (q, pk)); compute second response scores for the second plurality of responses (Liu, ¶33, LLM 126 performs rating and ranking of answers in the answer pool according to majority voting mechanism equation argmax ai); and select the first response from the plurality of responses and the second plurality of responses (Liu, ¶32, select a most accurate final answer). 
Regarding Claim 15, Xu discloses wherein the at least one server computer is further configured to present, at a user interface, the answer to the natural language question (¶71 and ¶75, vehicle interface to provide the information by displaying content associated with output data 126).

Regarding Claim 16, Xu discloses wherein computing the response scores comprises obtaining a plurality of response embedding vectors for the plurality of responses (¶49 and ¶53, Fig. 5, extracting contextual information by transforming words from contextual portions 506 to dense vectors / embeddings 508; ¶62, LM 124 outputs contextual data 120 (i.e., embedding vectors) as part of output data 126; compare Liu, Fig. 1B, output data 126 comprises embeddings 508(1)-(6) to constitute the answer pool 122).

Regarding Claim 17, Xu discloses wherein obtaining the plurality of response embedding vectors comprises querying a third-party service (¶216, extract contextual information from private or public cloud storages).

Regarding Claim 20, Xu discloses wherein at least one of the one or more question-and-answer pairs relates to a second reference document (¶47, information may include text from multiple OEM manuals associated with multiple models of vehicles).

Regarding Claim 21, Xu discloses wherein determining the answer comprises removing a quotation or a citation from the first response (¶48, retrieving a threshold amount of contextual information for output as a portion of output data 126 means removing the remaining amount of contextual information from the manuals as possible response for output data 126).

Claims 2 and 22 are rejected under 35 USC 103(a) as being unpatentable over Xu et al. (US 2024/0095460 A1) in view of Liu et al. (US 2024/0428044 A1) as applied to claims 1 and 18, in further view of Bolcer et al. (US 2025/0086211 A1). 
Regarding Claims 2 and 22, Xu does not disclose wherein computing the first response score comprises determining a number or severity of hallucinations in the first response. Bolcer discloses a system receiving user query comprising a question (¶27) to prompt a LLM to generate an answer / response (¶29 and ¶45). The system computes response scores comprising determining a number or severity of hallucinations in a response (¶51, leverage an LLM to generate responses based on user’s original query; ¶53, verify the accuracy of system generated answers by evaluating their similarity to a cluster of semantically related documents; ¶54, vectorized each record’s clusters to generate a respective answer and compare to the generated answer to determine a respective degree of similarity or hallucination score) and causing the language model to regenerate a response when the response scores are below a threshold value (Fig. 4 and claim 5, determine if hallucination is acceptable as compared to a threshold; e.g., ¶78, determine whether or not to rewrite prompt at 403A (or at 403B per ¶80) into something that has a lower hallucination prediction).

It would have been obvious to one ordinarily skilled in the art before the effective filing date of the invention to at least determine a severity of hallucinations in the first response to compute at least the first response score and causing the language model to regenerate a response when the response scores are below a threshold value in order to determine when the language model will fail and become hallucinatory according to similarity-based validation (Bolcer, ¶28).

Conclusion

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. 
In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to examiner Richard Z. Zhu, whose telephone number is 571-270-1587, or examiner’s supervisor Hai Phan, whose telephone number is 571-272-6338. Examiner Richard Zhu can normally be reached M-Th, 0730-1700.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/RICHARD Z ZHU/
Primary Examiner, Art Unit 2654
11/15/2025
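The Bolcer reference, as characterized in the rejection of claims 2 and 22 above, scores a generated answer by its similarity to a cluster of semantically related documents and triggers regeneration when the score falls below a threshold. A minimal sketch of that idea; the cosine metric, the 0.8 threshold, the function names, and all vectors here are illustrative assumptions, not values taken from Bolcer:

```python
# Illustrative similarity-based hallucination check: an answer embedding
# that is close to the supporting-document cluster is accepted; one that
# drifts away is flagged for regeneration. Threshold and vectors are
# made up for this sketch.
import numpy as np

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def grounding_score(answer_vec, cluster_vecs):
    # Best similarity to any document in the cluster; higher means
    # the answer is better supported (lower hallucination risk).
    return max(cosine(answer_vec, v) for v in cluster_vecs)

def accept_or_regenerate(answer_vec, cluster_vecs, threshold=0.8):
    # Accept the answer when it is sufficiently grounded; otherwise
    # signal that the model should regenerate a response.
    score = grounding_score(answer_vec, cluster_vecs)
    return "accept" if score >= threshold else "regenerate"

# Toy 2-dimensional embeddings for the document cluster and two answers.
cluster = [np.array([1.0, 0.0]), np.array([0.7, 0.7])]
grounded = np.array([0.9, 0.1])   # close to the cluster
drifted = np.array([0.0, 1.0])    # far from the cluster
```

With these toy vectors the grounded answer passes the threshold and the drifted one is sent back for regeneration; a real system would also need to decide whether to rewrite the prompt before retrying, which is a separate design choice.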

Prosecution Timeline

Oct 16, 2023
Application Filed
Jul 24, 2025
Non-Final Rejection — §103
Oct 14, 2025
Response Filed
Nov 15, 2025
Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12592228
SPEECH INTERACTION METHOD AND APPARATUS, COMPUTER READABLE STORAGE MEDIUM, AND ELECTRONIC DEVICE
2y 5m to grant Granted Mar 31, 2026
Patent 12592222
APPARATUSES, COMPUTER PROGRAM PRODUCTS, AND COMPUTER-IMPLEMENTED METHODS FOR ADAPTING SPEECH RECOGNITION CONFIDENCE SCORES BASED ON EXPECTED RESPONSE
2y 5m to grant Granted Mar 31, 2026
Patent 12586574
ELECTRONIC DEVICE FOR PROCESSING UTTERANCE, OPERATING METHOD THEREOF, AND STORAGE MEDIUM
2y 5m to grant Granted Mar 24, 2026
Patent 12579978
NETWORKED DEVICES, SYSTEMS, & METHODS FOR INTELLIGENTLY DEACTIVATING WAKE-WORD ENGINES
2y 5m to grant Granted Mar 17, 2026
Patent 12572739
GENERATING MACHINE INTERPRETABLE DECOMPOSABLE MODELS FROM REQUIREMENTS TEXT
2y 5m to grant Granted Mar 10, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 69%
With Interview: 85% (+15.4%)
Median Time to Grant: 3y 2m
PTA Risk: Moderate
Based on 718 resolved cases by this examiner. Grant probability derived from career allow rate.
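The headline figures above are consistent with a simple derivation from the career counts. The arithmetic below is an assumption about how this dashboard computes them, not a documented formula:

```python
# Assumed derivation: allow rate = grants / resolved cases, and the
# with-interview figure adds the stated +15.4% interview lift.
granted, resolved = 498, 718
allow_rate = 100 * granted / resolved      # career allow rate, in percent
with_interview = allow_rate + 15.4         # allow rate plus interview lift
```

This reproduces the displayed rounded values (69% and 85%), which suggests the grant probability is simply the career allow rate with the interview lift added, rather than a case-specific model.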
