DETAILED ACTION
This communication is in response to the Application filed on 4/29/2024. Claims 1-20 are pending and have been examined.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 4/29/2024 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Claim 1 recites A method comprising: generating a first input sequence comprising first task instructions to generate query-response pairs based on context provided by text extracted from unstructured data, wherein each query-response pair comprises a query for natural language content in the unstructured data and a corresponding response to the query; prompting a [first language model] with the first input sequence to obtain a plurality of query-response pairs; and based on receiving a query from a user for content in the unstructured data, augmenting a second input sequence instructing a [second language model] on how to respond to the query from the user, wherein the second input sequence comprises second task instructions to respond to the query based, at least in part, on context provided by one or more of the plurality of query-response pairs.
The limitations in these claims, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. An example of how this could be performed by the human mind is a boss asking an employee to create a Frequently Asked Questions (FAQ) sheet for their new product. First, the employee could be asked to create as many FAQs and answers for the product as they can think of. These FAQs could be based on the product itself or on a guide for the product, either of which would be considered unstructured data. The prompt in this instance would be the boss's natural language request to create the FAQ sheet. Then, the boss could ask the employee questions they have about the product. This would function as a second input sequence, and the employee could respond using the FAQ sheet they created. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. This claim lists the additional components of a first and second language model. The language models are said to be LLMs, which are described in paragraph 24 of the specification as taking various general-purpose forms such as GPT-4. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim 2 recites further comprising: retrieving, from a database of query-response pairs indexed by embeddings of the queries in the query-response pairs, a set of one or more query-response pairs that satisfy a semantic similarity threshold with respect to the user query, wherein the set of one or more query-response pairs that satisfy the semantic similarity threshold comprises the one or more of the plurality of query-response pairs; generating the second input sequence using the one or more of the plurality of query-response pairs; and prompting a second language model with the second input sequence to obtain a response to the user query.
The limitations in these claims, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. Continuing with the example from claim 1, the employee could embed their FAQ sheet if they wanted to by converting it to some form of numerical representation. The employee would be able to identify how close their boss's question was to the questions they wrote on the FAQ sheet. They could use a threshold value, such as "at least 5 matching words in the questions," to decide if the same answer can be used. In this case, the creation of the second input sequence to prompt for a response is the employee taking the boss's question, the similar question from the FAQ sheet, and context from the product in order to create a response to the boss's question. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. There are no additional components that were not present in the independent claim. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim 3 recites wherein the first and second language models comprise large language models.
The limitations in these claims, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. In this instance the large language model is performing tasks that the human mind is capable of. Furthermore, the human mind is capable of making the design decision to use a large language model for these tasks. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. There are no additional components that were not present in the independent claim. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim 4 recites wherein the second language model comprises a [lightweight large language] model.
The limitations in these claims, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. In this instance the large language model is performing tasks that the human mind is capable of. Furthermore, the human mind is capable of making the design decision to use a large language model for these tasks. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. This claim lists the additional component of a lightweight large language model. The lightweight large language model is described in paragraph 37 of the specification as taking various general-purpose forms such as Mistral 7B. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim 5 recites further comprising: storing the plurality of query-response pairs as stored query-response pairs; and at least one of updating and replacing stored query-response pairs based, at least in part, on feedback from the user that the response is incorrect.
The limitations in these claims, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. Continuing with the example from claim 1, the employee can store the FAQs by writing them down on paper that will be stored in a filing cabinet or similar storage. The employee could then edit the FAQ sheet with new or replacement questions if their boss requested. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. There are no additional components that were not present in the independent claim. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim 6 recites wherein at least one of updating and replacing the stored query-response pairs comprises: based on determining that a corrected query-response pair based on the user feedback is within a second semantic similarity threshold to one or more stored query-response pairs in the stored query-response pairs, replacing a most semantically similar of the one or more stored query-response pairs with the corrected query-response pair; and based on determining that the corrected query-response pair is outside the second semantic similarity threshold to the stored query-response pairs, updating the stored query-response pairs by adding the corrected query-response pair in storage.
The limitations in these claims, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. Continuing with the example from claim 1, if the boss did not like an answer they received from the FAQ sheet, the employee could take their feedback and replace the FAQ question that was most similar to the question the boss was trying to get an answer to. If the two questions are not similar enough for a replacement, the employee could simply add the new question-and-answer pair to the FAQ sheet. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. There are no additional components that were not present in the independent claim. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim 7 recites wherein storing the plurality of query-response pairs comprises storing the plurality of query-response pairs indexed by corresponding embeddings, wherein semantic similarity between the query of the user and the plurality of query-response pairs comprises semantic similarity between an embedding of the query of the user and embeddings of queries in the plurality of query-response pairs.
The limitations in these claims, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. Continuing with the example from claim 1, the employee could embed the FAQs by numbering them or otherwise converting them to some form of numerical representation. The employee could then organize them numerically based on that representation. The boss's question could also be represented in a numerical form, and a semantic similarity could be found in this form. For example, the employee could convert the boss's question and an FAQ question to an ASCII representation, compare them using a mathematical formula, and determine a similarity score from the result. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. There are no additional components that were not present in the independent claim. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim 8 recites wherein the first task instructions comprise example topics for responses to user queries.
The limitations in these claims, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. Continuing with the example from claim 1, the first task instructions in this example would be the boss's request to create an FAQ sheet and the materials on which the employee bases their questions and answers. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. There are no additional components that were not present in the independent claim. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim 9 recites wherein generating the first input sequence comprises, extracting the text from the unstructured data according to sections indicated by the unstructured data; and inserting the text into the first task instructions with indications of each section.
The limitations in these claims, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. Continuing with the example from claim 1, the employee would be capable of looking at the guide for their product, identifying sections of it, and associating questions and answers with those sections. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. There are no additional components that were not present in the independent claim. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim 10 recites A [non-transitory machine-readable medium] having program code stored thereon, the program code comprising instructions to: based on receiving a query from a user for natural language content in unstructured data, retrieve one or more query-response pairs from a plurality of query-response pairs corresponding to queries for natural language content in the unstructured data, wherein the one or more query- response pairs comprise those of the plurality of query-response pairs with corresponding queries having highest semantic similarity to the query from the user; generate a first input sequence comprising first task instructions to generate a response to the user query based on context provided by the one or more query-response pairs; and prompt a [first language model] with the first input sequence to obtain a response to the user query as output.
The limitations in these claims, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. An example of how this could be performed by the human mind is a boss asking an employee to create a Frequently Asked Questions (FAQ) sheet for their new product. The boss could then ask the employee their own question about the product. The employee could look in the FAQ sheet they created to find the most similarly worded question to their boss’s question. The first task instruction and prompt in this instance would be the employee taking the boss’s question, comparing it to FAQ entries, and giving them an answer based on the FAQ sheet. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. This claim lists the additional components of a first language model and a non-transitory machine-readable medium. The language model is said to be an LLM, which is described in paragraph 24 of the specification as taking various general-purpose forms such as GPT-4. The non-transitory machine-readable medium is described in paragraph 75 of the specification as being general-purpose computer memory. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim 11 recites wherein the program code further comprises instructions to: generate a second input sequence comprising second task instructions to generate query-response pairs based on context provided by text extracted from the unstructured data, wherein each query-response pair comprises a query for natural language content in the unstructured data and a corresponding response to the query; and prompting a [second language model] with the second input sequence to obtain the plurality of query-response pairs as output.
The limitations in these claims, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. Continuing with the example from claim 10, the boss could ask the employee to make as many FAQs as they can think of. The FAQs would be represented in natural language and based on the product and the materials for the product. The prompt in this instance is the boss's request to create the FAQ sheet, and the employee would produce it on paper as “output.” If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. This claim lists the additional component of a second language model. The language model is said to be an LLM, which is described in paragraph 24 of the specification as taking various general-purpose forms such as GPT-4. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim 12 recites wherein the first and second language models comprise large language models.
The limitations in these claims, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. In this instance the large language model is performing tasks that the human mind is capable of. Furthermore, the human mind is capable of making the design decision to use a large language model for these tasks. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. There are no additional components that were not present in the independent claim. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim 13 recites wherein the second language model comprises a [lightweight large language] model.
The limitations in these claims, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. In this instance the large language model is performing tasks that the human mind is capable of. Furthermore, the human mind is capable of making the design decision to use a large language model for these tasks. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. This claim lists the additional component of a lightweight large language model. The lightweight large language model is described in paragraph 37 of the specification as taking various general-purpose forms such as Mistral 7B. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim 14 recites further comprising: storing the plurality of query-response pairs as stored query-response pairs; and at least one of updating and replacing stored query-response pairs based, at least in part, on feedback from the user that the response is incorrect.
The limitations in these claims, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. Continuing with the example from claim 10, the employee can store the FAQs by writing them down on paper that will be stored in a filing cabinet or similar storage. The employee could then edit the FAQ sheet with new or replacement questions if their boss requested. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. There are no additional components that were not present in the independent claim. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim 15 recites wherein at least one of updating and replacing the stored query-response pairs comprises: based on determining that a corrected query-response pair based on the user feedback is within a second semantic similarity threshold to one or more stored query-response pairs in the stored query-response pairs, replacing a most semantically similar of the one or more stored query-response pairs with the corrected query-response pair; and based on determining that the corrected query-response pair is outside the second semantic similarity threshold to the stored query-response pairs, updating the stored query-response pairs by adding the corrected query-response pair in storage.
The limitations in these claims, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. Continuing with the example from claim 10, if the boss did not like an answer they received from the FAQ sheet, the employee could take their feedback and replace the FAQ question that was most similar to the question the boss was trying to get an answer to. If the two questions are not similar enough for a replacement, the employee could simply add the new question-and-answer pair to the FAQ sheet. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. There are no additional components that were not present in the independent claim. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim 16 recites wherein storing the plurality of query-response pairs comprises storing the plurality of query-response pairs indexed by corresponding embeddings, wherein semantic similarity between the query of the user and the plurality of query-response pairs comprises semantic similarity between an embedding of the query of the user and embeddings of queries in the plurality of query-response pairs.
The limitations in these claims, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. Continuing with the example from claim 10, the employee could embed the FAQs by numbering them or otherwise converting them to some form of numerical representation. The employee could then organize them numerically based on that representation. The boss's question could also be represented in a numerical form, and a semantic similarity could be found in this form. For example, the employee could convert the boss's question and an FAQ question to an ASCII representation, compare them using a mathematical formula, and determine a similarity score from the result. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. There are no additional components that were not present in the independent claim. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim 17 recites An apparatus comprising: a [processor]; and a [machine-readable medium] having instructions stored thereon that are executable by the processor to cause the apparatus to, generate a first input sequence comprising first task instructions to generate query-response pairs based on context provided by text extracted from unstructured data, wherein each query-response pair comprises a query for natural language content in the unstructured data and a corresponding response to the query; prompt a [first language model] with the first input sequence to obtain a plurality of query-response pairs as output; and store the plurality of query-response pairs in memory indexed by corresponding embeddings, wherein the plurality of query-response pairs is stored in memory for augmentation of prompts comprising task instruction to generate responses to queries for natural language content in the unstructured data.
The limitations in these claims, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. An example of how this could be performed by the human mind is a boss asking an employee to create a Frequently Asked Questions (FAQ) sheet for their new product. The employee could create all the question-and-answer pairs they can think of for the product based on the product and materials related to the product. In this instance the prompt would be the boss’s request and the output would be the FAQ sheet. The employee could then store the FAQ sheet by writing it down and storing it in a folder or filing cabinet and access it later to help answer future questions. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. This claim lists the additional components of a first language model, a processor, and a machine-readable medium. The language model is said to be an LLM, which is described in paragraph 24 of the specification as taking various general-purpose forms such as GPT-4. The processor is described in paragraph 79 of the specification with a generic description of the component. The machine-readable medium is described in paragraph 75 of the specification as being general-purpose computer memory. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim 18 recites based on receiving a query from a user for natural language content of the unstructured data, retrieve one or more of the plurality of query-response pairs as those of the plurality of query-response pairs having queries that are within a first semantic similarity threshold to the user query; generate a second input sequence comprising second task instructions to generate a response to the user query based, at least in part, on context provided by the one or more of the plurality of query-response pairs; and prompt a [second language model] with the second input sequence to obtain a response to the user query as output.
The limitations in these claims, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. Continuing with the example from claim 17, the boss could ask the employee their own question about the product. The employee could then find question-and-answer pairs from the FAQ sheet that are similar to the question the boss asked. If an FAQ pair is similar enough (only minor differences in wording), the employee could provide the corresponding answer as the answer to the boss's question. In this instance, the boss's question would represent the prompt, and the answer found from the FAQ sheet by the employee would represent the output. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. This claim lists the additional component of a second language model. The language model is said to be an LLM, which is described in paragraph 24 of the specification as taking various general-purpose forms such as GPT-4. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim 19 recites wherein the first and second language models comprise large language models.
The limitations in these claims, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. In this instance, the large language model performs tasks that the human mind is capable of performing. Furthermore, the human mind is capable of making the design decision to use a large language model for these tasks. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. There are no additional components that were not present in the independent claim. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim 20 recites wherein the second language model comprises a [lightweight large language] model.
The limitations in these claims, as drafted, are a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. In this instance, the lightweight large language model performs tasks that the human mind is capable of performing. Furthermore, the human mind is capable of making the design decision to use a lightweight large language model for these tasks. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. This claim lists the additional component of a lightweight large language model. The lightweight large language model is described in paragraph 37 of the specification as taking various general-purpose forms such as Mistral 7B. Accordingly, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claims are not patent eligible.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-3, 7-12, and 16-19 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by US Patent Application Publication US 20240311407 A1 (Barron et al.).
Regarding Claim 1, Barron et al. teaches A method comprising:
(The system 100 can also comprise one or more non-transitory computer-readable media having stored therein computer-executable instructions that, when executed by the computing system, cause the computing system to perform any of the methods described herein.) (Paragraph 32).
generating a first input sequence comprising first task instructions to generate query-response pairs based on context provided by text extracted from unstructured data,
(In the example, at 402, data is extracted from a digital file. As described herein, the digital file can be a PDF file, or a digital file with another format.) (Paragraph 51).
(At 406, the extracted data is processed to generate question-answer pairs. This can include the chatbot system applying another LLM (e.g., another one of LLMs 170) to generate question-answer pairs from the text of the extracted data. The LLM used in this context can be a chat-tailored LLM, for example. The chatbot system can submit a prompt to the chat-tailored LLM which asks the LLM to summarize a single one of the semantically coherent text segments (e.g., the semantically coherent text segments produced at 404)) (Paragraph 56)
The chatbot system of Barron et al. submits a prompt to an LLM (first input sequence comprising first task instructions). The prompt instructs the LLM to generate question-answer pairs for text segments. The text segments are extracted from data that can be unstructured such as a PDF.
wherein each query-response pair comprises a query for natural language content in the unstructured data and a corresponding response to the query;
(The chatbot system can submit a prompt to the chat-tailored LLM which asks the LLM to summarize a single one of the semantically coherent text segments (e.g., the semantically coherent text segments produced at 404) by generating a set of questions and their answers based on that text segment. The questions and answers regarding the original text that are generated by the LLM will produce a text block which might more closely resemble (semantically) a future user's hypothetical query.) (Paragraph 56).
Questions and corresponding answers are generated for the extracted sections of the text.
prompting a first language model with the first input sequence to obtain a plurality of query-response pairs;
(The chatbot system can submit a prompt to the chat-tailored LLM which asks the LLM to summarize a single one of the semantically coherent text segments (e.g., the semantically coherent text segments produced at 404) by generating a set of questions and their answers based on that text segment. The questions and answers regarding the original text that are generated by the LLM will produce a text block which might more closely resemble (semantically) a future user's hypothetical query.) (Paragraph 56).
A prompt is submitted to a language model to generate question-and-answer pairs.
and based on receiving a query from a user for content in the unstructured data, augmenting a second input sequence instructing a second language model on how to respond to the query from the user,
(In the example, at 302, a text query is received, e.g., from a user via a user interface. The text query can be an unstructured query containing a question that references one or more elements.) (Paragraph 43).
(At 306, one or more elements referenced in the query are identified. For example, the chatbot system can input the query to an LLM (e.g., one of LLMs 170 of FIG. 1), and the LLM can identify any elements referenced in the query.) (Paragraph 45).
(At 310, the prompt is submitted to the LLM. The LLM receiving the prompt can be an AI or machine learning model that is designed to understand and generate human language.) (Paragraph 47).
User queries are used to generate another prompt for an LLM (second input sequence instructing second language model).
wherein the second input sequence comprises second task instructions to respond to the query based, at least in part, on context provided by one or more of the plurality of query-response pairs.
(At 310, the prompt is submitted to the LLM. The LLM receiving the prompt can be an AI or machine learning model that is designed to understand and generate human language.) (Paragraph 47).
(At 606, one or more entries that are semantically similar to the query are retrieved from the vector database. For example, the chatbot system can initiate a retrieval process which searches the vector database for entries which have similar semantic information to the query. The vector entries searched can include both the raw text and the summarized question text (e.g., question-answer pairs). When a text segment corresponding to a summarized question-answer pair is selected, the actual content returned from the search will be the original text from the digital file from which the question was generated.) (Paragraph 69).
User queries are used to generate another prompt for an LLM (second input sequence instructing second language model). The questions submitted by the user concern the content from which the question-answer pairs were generated. The system searches for a question-answer pair similar to the query, and that information is used to help provide an answer.
Regarding Claim 2, Barron et al. teaches the method of claim 1, further comprising: retrieving, from a database of query-response pairs indexed by embeddings of the queries in the query-response pairs,
(At 408, vector embeddings are generated for the text segments and question-answer pairs. For example, the chatbot system can generate a vector embedding for a semantically coherent text segment by encoding semantic information associated with the semantically coherent text segment into a fixed-length numeric vector.) (Paragraph 57)
(At 518, the vector embeddings are ingested into a vector database. The vector database can then be subsequently accessed by the chatbot system during formulation of an LLM prompt for a user query.) (Paragraph 66).
The question-answer pairs are converted to vector embeddings and stored in a database.
a set of one or more query-response pairs that satisfy a semantic similarity threshold with respect to the user query,
(At 606, one or more entries that are semantically similar to the query are retrieved from the vector database. For example, the chatbot system can initiate a retrieval process which searches the vector database for entries which have similar semantic information to the query. The vector entries searched can include both the raw text and the summarized question text (e.g., question-answer pairs). When a text segment corresponding to a summarized question-answer pair is selected, the actual content returned from the search will be the original text from the digital file from which the question was generated.) (Paragraph 69).
(At 610, the most relevant entries are selected from among the remaining entries. For example, after the chatbot system retrieves the initial results from the database, it can apply another specialized LLM (e.g., another one of LLMs 170) to find the entries most relevant to answering the user's question. This LLM can be a cross-encoder model which is fine-tuned to compare two text entries and provide a score that rates their similarity.) (Paragraph 71).
The question-answer pairs are chosen based on a semantic similarity threshold, with those most semantically similar to the query being selected from the set.
wherein the set of one or more query-response pairs that satisfy the semantic similarity threshold comprises the one or more of the plurality of query-response pairs;
(At 610, the most relevant entries are selected from among the remaining entries. For example, after the chatbot system retrieves the initial results from the database, it can apply another specialized LLM (e.g., another one of LLMs 170) to find the entries most relevant to answering the user's question. This LLM can be a cross-encoder model which is fine-tuned to compare two text entries and provide a score that rates their similarity.) (Paragraph 71).
Multiple entries are selected from within the set of question-answer pairs.
generating the second input sequence using the one or more of the plurality of query-response pairs;
(At 612, a prompt for an LLM is formulated, the prompt including data from the selected entry. The data from the selected entry can include the text and tabular elements originally extracted from the digital file which were selected, via the procedure described above, to provide the most relevant content for answering the query. The prompt can include structured examples of questions and responses to guide the LLM to produce a response which best matches a desired output in format, tone, and content. An example structure of the prompt is set forth below.) (Paragraph 72).
An LLM is prompted to provide a reply to the query based on the selected question-answer pair.
and prompting a second language model with the second input sequence to obtain a response to the user query.
(At 612, a prompt for an LLM is formulated, the prompt including data from the selected entry. The data from the selected entry can include the text and tabular elements originally extracted from the digital file which were selected, via the procedure described above, to provide the most relevant content for answering the query. The prompt can include structured examples of questions and responses to guide the LLM to produce a response which best matches a desired output in format, tone, and content. An example structure of the prompt is set forth below.) (Paragraph 72).
The query is answered by providing a prompt to the LLM with the necessary information.
Regarding Claim 3, Barron et al. teaches the method of claim 2, wherein the first and second language models comprise large language models.
(Chatbot system 130 further includes one or more LLMs 170. While the LLMs 170 are depicted as being part of (e.g., internal to) the chatbot system 130, one or more of the LLMs can alternatively be hosted by an entity external to the chatbot system 130. As described herein, the LLMs 170 can include a first LLM configured to receive a prompt formulated by the chatbot system 130 and generate an answer based on the prompt, a second LLM configured to perform a database retrieval process, a third LLM configured to identify database entries relevant to a query, and a fourth LLM configured to generate question-answer pairs from text. In some examples, the first, second, third, and fourth LLMs are different LLMs (e.g., different types of LLMs). In other examples, the same LLM serves as two or more of the first, second, third, and fourth LLMs. While four LLMs are described herein, a smaller or larger number of LLMs 170 can be employed by the chatbot system 130.) (Paragraph 30).
LLMs are used for the tasks of creating question-answer pairs and responding to user queries.
Regarding Claim 7, Barron et al. teaches the method of claim 5, wherein storing the plurality of query-response pairs comprises storing the plurality of query-response pairs indexed by corresponding embeddings,
(At 408, vector embeddings are generated for the text segments and question-answer pairs. For example, the chatbot system can generate a vector embedding for a semantically coherent text segment by encoding semantic information associated with the semantically coherent text segment into a fixed-length numeric vector.) (Paragraph 57)
(At 410, the vector embeddings and extracted data are ingested into a database. For example, the vector embeddings as well as the extracted data (e.g., the unmodified extracted data) can be ingested into a vector database as entries. The database can be part of the knowledge corpus of the chatbot system, for example.) (Paragraph 58)
The question-answer pairs are vectorized and stored in a database.
wherein semantic similarity between the query of the user and the plurality of query-response pairs comprises semantic similarity between an embedding of the query of the user and embeddings of queries in the plurality of query-response pairs.
(Upon receipt of the user input 702 (e.g., a user query), the chatbot system 700 can utilize context model(s) 714 to incorporate contextual data into the query as shown at 716. Custom embeddings 718 (e.g., vector embeddings) relevant to the query can be obtained from the knowledge corpus 708.) (Paragraph 79).
(At 606, one or more entries that are semantically similar to the query are retrieved from the vector database. For example, the chatbot system can initiate a retrieval process which searches the vector database for entries which have similar semantic information to the query. The vector entries searched can include both the raw text and the summarized question text (e.g., question-answer pairs). When a text segment corresponding to a summarized question-answer pair is selected, the actual content returned from the search will be the original text from the digital file from which the question was generated.) (Paragraph 69).
The semantic similarity searching is done by comparing the vectorized query against the stored, vectorized question-answer pairs in the knowledge corpus. Multiple entries can be chosen based on their similarity.
Regarding Claim 8, Barron et al. teaches the method of claim 1, wherein the first task instructions comprise example topics for responses to user queries.
(The technology can leverage an LLM chatbot architecture and customize it for agriculture by integrating a unique corpus of agricultural data, a custom context extraction process, as well as domain-specific prompt-based learning inputs and fine-tunings.) (Paragraph 1).
(As an example, the digital file can include label data for an agricultural product (e.g., a product that contains one or more agricultural chemicals).) (Paragraph 51).
(At 404, the extracted data is processed to generate semantically coherent text segments. This can include the chatbot system applying a natural language processing algorithm to group text of the extracted data into text segments (e.g., “chunks” of text), where each text segment contains relevant context for a specific subsection of the digital file.) (Paragraph 52).
There is a designated topic and subsections for the data that forms the question-answer pairs and is used to answer the user query.
Regarding Claim 9, Barron et al. teaches the method of claim 1, wherein generating the first input sequence comprises, extracting the text from the unstructured data according to sections indicated by the unstructured data;
(At 404, the extracted data is processed to generate semantically coherent text segments. This can include the chatbot system applying a natural language processing algorithm to group text of the extracted data into text segments (e.g., “chunks” of text), where each text segment contains relevant context for a specific subsection of the digital file.) (Paragraph 52).
(Applying the natural language processing algorithm to group the text of the extracted data into text segments can include detecting formatting data and/or metadata tags in the extracted data. The formatting data and/or metadata tags can include section titles, table headings, etc. The text of the extracted data can then be grouped, using the natural language processing algorithm, into a set of preliminary text segments based on the detected formatting and/or metadata tags.) (Paragraph 53).
Sections from the unstructured data are used to form the question-answer pairs.
and inserting the text into the first task instructions with indications of each section.
(At 406, the extracted data is processed to generate question-answer pairs. This can include the chatbot system applying another LLM (e.g., another one of LLMs 170) to generate question-answer pairs from the text of the extracted data. … The questions and answers regarding the original text that are generated by the LLM will produce a text block which might more closely resemble (semantically) a future user's hypothetical query.) (Paragraph 56)
The extracted text, which is based on the sections of the unstructured data, is used as part of the instructions to the LLM to form question-answer pairs for the text.
Regarding Claim 10, Barron et al. teaches A non-transitory machine-readable medium having program code stored thereon, the program code comprising instructions to:
(The system 100 can also comprise one or more non-transitory computer-readable media having stored therein computer-executable instructions that, when executed by the computing system, cause the computing system to perform any of the methods described herein.) (Paragraph 32).
based on receiving a query from a user for natural language content in unstructured data,
(In the example, at 302, a text query is received, e.g., from a user via a user interface. The text query can be an unstructured query containing a question that references one or more elements.) (Paragraph 43).
(In the example, at 402, data is extracted from a digital file. As described herein, the digital file can be a PDF file, or a digital file with another format.) (Paragraph 51).
The user can submit a query to ask a question about the unstructured data.
retrieve one or more query-response pairs from a plurality of query-response pairs corresponding to queries for natural language content in the unstructured data,
(At 606, one or more entries that are semantically similar to the query are retrieved from the vector database. For example, the chatbot system can initiate a retrieval process which searches the vector database for entries which have similar semantic information to the query. The vector entries searched can include both the raw text and the summarized question text (e.g., question-answer pairs). When a text segment corresponding to a summarized question-answer pair is selected, the actual content returned from the search will be the original text from the digital file from which the question was generated.) (Paragraph 69).
The system searches for question-answer pairs that match the user’s question.
wherein the one or more query- response pairs comprise those of the plurality of query-response pairs with corresponding queries having highest semantic similarity to the query from the user;
(At 606, one or more entries that are semantically similar to the query are retrieved from the vector database. For example, the chatbot system can initiate a retrieval process which searches the vector database for entries which have similar semantic information to the query. The vector entries searched can include both the raw text and the summarized question text (e.g., question-answer pairs). When a text segment corresponding to a summarized question-answer pair is selected, the actual content returned from the search will be the original text from the digital file from which the question was generated.) (Paragraph 69).
(At 610, the most relevant entries are selected from among the remaining entries. For example, after the chatbot system retrieves the initial results from the database, it can apply another specialized LLM (e.g., another one of LLMs 170) to find the entries most relevant to answering the user's question. This LLM can be a cross-encoder model which is fine-tuned to compare two text entries and provide a score that rates their similarity.) (Paragraph 71).
The question-answer pairs are chosen based on a semantic similarity threshold, with those most semantically similar to the query being selected from the set.
generate a first input sequence comprising first task instructions to generate a response to the user query based on context provided by the one or more query-response pairs;
(At 612, a prompt for an LLM is formulated, the prompt including data from the selected entry. The data from the selected entry can include the text and tabular elements originally extracted from the digital file which were selected, via the procedure described above, to provide the most relevant content for answering the query. The prompt can include structured examples of questions and responses to guide the LLM to produce a response which best matches a desired output in format, tone, and content. An example structure of the prompt is set forth below.) (Paragraph 72).
An LLM is prompted to provide a reply to the query based on the selected question-answer pair.
and prompt a first language model with the first input sequence to obtain a response to the user query as output.
(At 612, a prompt for an LLM is formulated, the prompt including data from the selected entry. The data from the selected entry can include the text and tabular elements originally extracted from the digital file which were selected, via the procedure described above, to provide the most relevant content for answering the query. The prompt can include structured examples of questions and responses to guide the LLM to produce a response which best matches a desired output in format, tone, and content. An example structure of the prompt is set forth below.) (Paragraph 72).
The query is answered by providing a prompt to the LLM with the necessary information.
Regarding Claim 11, Barron et al. teaches the method of claim 10, wherein the program code further comprises instructions to: generate a second input sequence comprising second task instructions to generate query-response pairs based on context provided by text extracted from the unstructured data,
(In the example, at 402, data is extracted from a digital file. As described herein, the digital file can be a PDF file, or a digital file with another format.) (Paragraph 51).
(At 406, the extracted data is processed to generate question-answer pairs. This can include the chatbot system applying another LLM (e.g., another one of LLMs 170) to generate question-answer pairs from the text of the extracted data. The LLM used in this context can be a chat-tailored LLM, for example. The chatbot system can submit a prompt to the chat-tailored LLM which asks the LLM to summarize a single one of the semantically coherent text segments (e.g., the semantically coherent text segments produced at 404)) (Paragraph 56)
The chatbot system of Barron et al. submits a prompt to an LLM (second input sequence comprising second task instructions). The prompt instructs the LLM to generate question-answer pairs for text segments. The text segments are extracted from data that can be unstructured, such as a PDF.
wherein each query-response pair comprises a query for natural language content in the unstructured data and a corresponding response to the query;
(The chatbot system can submit a prompt to the chat-tailored LLM which asks the LLM to summarize a single one of the semantically coherent text segments (e.g., the semantically coherent text segments produced at 404) by generating a set of questions and their answers based on that text segment. The questions and answers regarding the original text that are generated by the LLM will produce a text block which might more closely resemble (semantically) a future user's hypothetical query.) (Paragraph 56).
Questions and corresponding answers are generated for the extracted sections of the text.
and prompting a second language model with the second input sequence to obtain the plurality of query-response pairs as output.
(The chatbot system can submit a prompt to the chat-tailored LLM which asks the LLM to summarize a single one of the semantically coherent text segments (e.g., the semantically coherent text segments produced at 404) by generating a set of questions and their answers based on that text segment. The questions and answers regarding the original text that are generated by the LLM will produce a text block which might more closely resemble (semantically) a future user's hypothetical query.) (Paragraph 56).
A prompt is submitted to a language model to generate question-and-answer pairs.
Regarding Claim 12, Barron et al. teaches the method of claim 11, wherein the first and second language models comprise large language models.
(Chatbot system 130 further includes one or more LLMs 170. While the LLMs 170 are depicted as being part of (e.g., internal to) the chatbot system 130, one or more of the LLMs can alternatively be hosted by an entity external to the chatbot system 130. As described herein, the LLMs 170 can include a first LLM configured to receive a prompt formulated by the chatbot system 130 and generate an answer based on the prompt, a second LLM configured to perform a database retrieval process, a third LLM configured to identify database entries relevant to a query, and a fourth LLM configured to generate question-answer pairs from text. In some examples, the first, second, third, and fourth LLMs are different LLMs (e.g., different types of LLMs). In other examples, the same LLM serves as two or more of the first, second, third, and fourth LLMs. While four LLMs are described herein, a smaller or larger number of LLMs 170 can be employed by the chatbot system 130.) (Paragraph 30).
LLMs are used for the tasks of creating question-answer pairs and responding to user queries.
Regarding Claim 17, Barron et al. teaches: An apparatus comprising: a processor; and a machine-readable medium having instructions stored thereon that are executable by the processor to cause the apparatus to,
(Any of the systems herein, including the system 100, can comprise at least one hardware processor and at least one memory coupled to the at least one hardware processor.) (Paragraph 31).
generate a first input sequence comprising first task instructions to generate query-response pairs based on context provided by text extracted from unstructured data,
(In the example, at 402, data is extracted from a digital file. As described herein, the digital file can be a PDF file, or a digital file with another format.) (Paragraph 51).
(At 406, the extracted data is processed to generate question-answer pairs. This can include the chatbot system applying another LLM (e.g., another one of LLMs 170) to generate question-answer pairs from the text of the extracted data. The LLM used in this context can be a chat-tailored LLM, for example. The chatbot system can submit a prompt to the chat-tailored LLM which asks the LLM to summarize a single one of the semantically coherent text segments (e.g., the semantically coherent text segments produced at 404)) (Paragraph 56)
The chatbot system of Barron et al. submits a prompt to an LLM (first input sequence comprising first task instructions). The prompt instructs the LLM to generate question-answer pairs for text segments. The text segments are extracted from data that can be unstructured such as a PDF.
wherein each query-response pair comprises a query for natural language content in the unstructured data and a corresponding response to the query;
(The chatbot system can submit a prompt to the chat-tailored LLM which asks the LLM to summarize a single one of the semantically coherent text segments (e.g., the semantically coherent text segments produced at 404) by generating a set of questions and their answers based on that text segment. The questions and answers regarding the original text that are generated by the LLM will produce a text block which might more closely resemble (semantically) a future user's hypothetical query.) (Paragraph 56).
Questions and corresponding answers are generated for the extracted sections of the text.
prompt a first language model with the first input sequence to obtain a plurality of query-response pairs as output;
(The chatbot system can submit a prompt to the chat-tailored LLM which asks the LLM to summarize a single one of the semantically coherent text segments (e.g., the semantically coherent text segments produced at 404) by generating a set of questions and their answers based on that text segment. The questions and answers regarding the original text that are generated by the LLM will produce a text block which might more closely resemble (semantically) a future user's hypothetical query.) (Paragraph 56).
The prompt is submitted to a language model to generate question-and-answer pairs.
and store the plurality of query-response pairs in memory indexed by corresponding embeddings,
(At 408, vector embeddings are generated for the text segments and question-answer pairs. For example, the chatbot system can generate a vector embedding for a semantically coherent text segment by encoding semantic information associated with the semantically coherent text segment into a fixed-length numeric vector.) (Paragraph 57)
(At 518, the vector embeddings are ingested into a vector database. The vector database can then be subsequently accessed by the chatbot system during formulation of an LLM prompt for a user query.) (Paragraph 66).
The question-answer pairs are converted to vector embeddings and stored in the database.
wherein the plurality of query-response pairs is stored in memory for augmentation of prompts comprising task instruction to generate responses to queries for natural language content in the unstructured data.
(In the example, at 302, a text query is received, e.g., from a user via a user interface. The text query can be an unstructured query containing a question that references one or more elements.) (Paragraph 43).
(At 306, one or more elements referenced in the query are identified. For example, the chatbot system can input the query to an LLM (e.g., one of LLMs 170 of FIG. 1), and the LLM can identify any elements referenced in the query.) (Paragraph 45).
(At 310, the prompt is submitted to the LLM. The LLM receiving the prompt can be an AI or machine learning model that is designed to understand and generate human language.) (Paragraph 47).
User queries for content in the unstructured data are submitted, and answers are provided using the question-answer pairs.
Regarding Claim 18, Barron et al. teaches the method of claim 17, based on receiving a query from a user for natural language content of the unstructured data,
(In the example, at 302, a text query is received, e.g., from a user via a user interface. The text query can be an unstructured query containing a question that references one or more elements.) (Paragraph 43).
A text query is received from the user based on the unstructured data.
retrieve one or more of the plurality of query-response pairs as those of the plurality of query-response pairs having queries that are within a first semantic similarity threshold to the user query;
(At 606, one or more entries that are semantically similar to the query are retrieved from the vector database. For example, the chatbot system can initiate a retrieval process which searches the vector database for entries which have similar semantic information to the query. The vector entries searched can include both the raw text and the summarized question text (e.g., question-answer pairs). When a text segment corresponding to a summarized question-answer pair is selected, the actual content returned from the search will be the original text from the digital file from which the question was generated.) (Paragraph 69).
(At 610, the most relevant entries are selected from among the remaining entries. For example, after the chatbot system retrieves the initial results from the database, it can apply another specialized LLM (e.g., another one of LLMs 170) to find the entries most relevant to answering the user's question. This LLM can be a cross-encoder model which is fine-tuned to compare two text entries and provide a score that rates their similarity.) (Paragraph 71).
The question-answer pairs are chosen based on a semantic similarity threshold, with the entries most semantically similar to the query within the set being selected.
generate a second input sequence comprising second task instructions to generate a response to the user query based, at least in part, on context provided by the one or more of the plurality of query-response pairs;
(At 612, a prompt for an LLM is formulated, the prompt including data from the selected entry. The data from the selected entry can include the text and tabular elements originally extracted from the digital file which were selected, via the procedure described above, to provide the most relevant content for answering the query. The prompt can include structured examples of questions and responses to guide the LLM to produce a response which best matches a desired output in format, tone, and content. An example structure of the prompt is set forth below.) (Paragraph 72).
An LLM is prompted to provide a reply to the query based on the selected question-answer pair.
and prompt a second language model with the second input sequence to obtain a response to the user query as output.
(At 612, a prompt for an LLM is formulated, the prompt including data from the selected entry. The data from the selected entry can include the text and tabular elements originally extracted from the digital file which were selected, via the procedure described above, to provide the most relevant content for answering the query. The prompt can include structured examples of questions and responses to guide the LLM to produce a response which best matches a desired output in format, tone, and content. An example structure of the prompt is set forth below.) (Paragraph 72).
The query is answered by providing a prompt to the LLM with the necessary information.
Regarding Claim 19, Barron et al. teaches the method of claim 18, wherein the first and second language models comprise large language models.
(Chatbot system 130 further includes one or more LLMs 170. While the LLMs 170 are depicted as being part of (e.g., internal to) the chatbot system 130, one or more of the LLMs can alternatively be hosted by an entity external to the chatbot system 130. As described herein, the LLMs 170 can include a first LLM configured to receive a prompt formulated by the chatbot system 130 and generate an answer based on the prompt, a second LLM configured to perform a database retrieval process, a third LLM configured to identify database entries relevant to a query, and a fourth LLM configured to generate question-answer pairs from text. In some examples, the first, second, third, and fourth LLMs are different LLMs (e.g., different types of LLMs). In other examples, the same LLM serves as two or more of the first, second, third, and fourth LLMs. While four LLMs are described herein, a smaller or larger number of LLMs 170 can be employed by the chatbot system 130.) (Paragraph 30).
LLMs are used for the tasks of creating question-answer pairs and responding to user queries.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 4, 13, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over US Patent Application Publication US 20240311407 A1 (Barron et al.) in view of US Patent Application Publication US 20250131121 A1 (Teng et al.).
Regarding Claims 4, 13, and 20, Barron et al. teaches the system of claims 3, 12, and 19.
Barron et al. does not explicitly teach: wherein the second language model comprises a lightweight large language model.
However, Teng et al. teaches wherein the second language model comprises a lightweight large language model.
(The term “local context database” is used herein to refer to a specialized data repository stored on the user's device or in a localized network that includes private, personal and/or specific user information, … the content of the local context database may be selectively used or shortened to align with Q/A operations based on the analysis results generated by a local lightweight language model. Such local queries may improve efficiency and help maintain user privacy by reducing or eliminating the need to send sensitive or personal data to external servers for processing. … Some embodiments may use a “relevance model” or “relevance determination model” to retain only the most pertinent parts of the knowledge base and ensure that the knowledge base aligns better with the edge device's processing capabilities, which may allow for faster and more efficient queries and responses.) (Paragraphs 43-44).
While Teng et al. does not involve query-response pairs, it does use a lightweight LLM specifically for the response component of its method and separate large language models for other parts of the method. The lightweight LLM also has access to a local context database. In this context, a lightweight LLM is being used in the same way: after larger amounts of work have been done and stored in a database (like the query-response pairs), it uses information from that database to reply to the user.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify the question answering method involving generated question-answer pairs as taught by Barron et al. to use a lightweight LLM for the output operation as taught by Teng et al. This would have been an obvious improvement, as lightweight LLMs are more efficient and provide additional privacy (Teng et al. Paragraphs 43-44). Furthermore, since the lightweight LLM performs a task similar to the output component of Barron et al., the modification would be a simple substitution of the model being used.
Claims 5 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over US Patent Application Publication US 20240311407 A1 (Barron et al.) in view of US Patent Application Publication US 20240070150 A1 (Nahamoo et al.).
Regarding Claims 5 and 14, Barron et al. teaches the system of claims 2 and 11.
further comprising: storing the plurality of query-response pairs as stored query-response pairs;
(At 408, vector embeddings are generated for the text segments and question-answer pairs.) (Paragraph 57).
(At 410, the vector embeddings and extracted data are ingested into a database. For example, the vector embeddings as well as the extracted data (e.g., the unmodified extracted data) can be ingested into a vector database as entries.) (Paragraph 58).
The question-answer pairs are stored in a database.
Barron et al. does not explicitly teach: and at least one of updating and replacing stored query-response pairs based, at least in part, on feedback from the user that the response is incorrect.
However, Nahamoo et al. teaches and at least one of updating and replacing stored query-response pairs based, at least in part, on feedback from the user that the response is incorrect.
(User feedback to the query results delivered to a user (responsive to a query submitted by that user) may be used to cache a Q-A pair that the user has indicated to be a good answer to the query he/she has submitted.) (Paragraph 125).
(In some examples, when a determination is made that an answer delivered in response to a query (e.g., whether the answer was obtained as a result of searching the cache 135 or the DOM repository 140) is incorrect or unsatisfactory (i.e., it is a “bad” answer), a correct answer can be provided (e.g., by a user of the customer network) and stored in the cache to override the incorrect answer. Thus, the procedure 500 may further include obtaining, at the local device, replacement answer data to replace an initial answer data determined at the local device in response to a particular question, with the initial answer determined by a user to be an unsatisfactory response to the particular question, and storing in the question-answer cache a data item representative of a pairing of the particular question and the replacement answer.) (Paragraph 126).
In Nahamoo et al., user feedback is collected regarding answers provided from the cached Q-A pairs. If an answer is determined to be incorrect, a replacement answer is stored in the cache, thus updating the Q-A pair.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify the question answering method involving generated question-answer pairs as taught by Barron et al. to implement the user feedback for question-answer pair replacement as taught by Nahamoo et al. This would have been an obvious improvement, as it would allow the system to update itself to provide more satisfactory responses (Nahamoo et al. Paragraph 126).
Claims 6 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over US Patent Application Publication US 20240311407 A1 (Barron et al.) in view of US Patent Application Publication US 20240070150 A1 (Nahamoo et al.) and further in view of US Patent Publication US 12001801 B2 (Wang et al.).
Regarding Claims 6 and 15, Barron et al. in view of Nahamoo et al. teaches the method of claims 5 and 14.
Furthermore, Nahamoo et al. teaches wherein at least one of updating and replacing the stored query-response pairs comprises: based on determining that a corrected query-response pair based on the user feedback is within a second semantic similarity threshold to one or more stored query-response pairs in the stored query-response pairs,
(The search of the question-answer cache (also referred to as a “question store) may be based on meaning representation and using question similarity (paraphrase) processing. Thus, in some examples, determining whether the one or more pre-determined questions stored in a question-answer cache maintained at the local device matches the query data according to one or more matching criteria may include determining linguistic meaning of the query data, determining a level of meaning similarity between the query data and respective meaning of at least some of the pre-determined questions in the question-answer cache, and determining that at least one of the one or more pre-determined questions is a match to the query data when the level of meaning similarity between the query data and the respective meaning for the at least one of the one or more pre-determined questions exceeds a pre-determined similarity threshold.) (Paragraph 123).
(User feedback to the query results delivered to a user (responsive to a query submitted by that user) may be used to cache a Q-A pair that the user has indicated to be a good answer to the query he/she has submitted.) (Paragraph 125).
Nahamoo et al. determines whether the user-submitted query is semantically similar enough to a question-answer pair in the cache. Also, user feedback is collected to determine whether the corresponding answer is considered a good or bad one.
replacing a most semantically similar of the one or more stored query-response pairs with the corrected query-response pair;
(In some examples, when a determination is made that an answer delivered in response to a query (e.g., whether the answer was obtained as a result of searching the cache 135 or the DOM repository 140) is incorrect or unsatisfactory (i.e., it is a “bad” answer), a correct answer can be provided (e.g., by a user of the customer network) and stored in the cache to override the incorrect answer. Thus, the procedure 500 may further include obtaining, at the local device, replacement answer data to replace an initial answer data determined at the local device in response to a particular question, with the initial answer determined by a user to be an unsatisfactory response to the particular question, and storing in the question-answer cache a data item representative of a pairing of the particular question and the replacement answer.) (Paragraph 126).
Nahamoo et al. replaces the question-answer pair if the answer is considered a bad one.
Barron et al. in view of Nahamoo et al. does not explicitly teach: and based on determining that the corrected query-response pair is outside the second semantic similarity threshold to the stored query-response pairs, updating the stored query-response pairs by adding the corrected query-response pair in storage.
However, Wang et al. teaches: and based on determining that the corrected query-response pair is outside the second semantic similarity threshold to the stored query-response pairs,
(Specifically, a user-submitted question is compared with questions within a question-answer repository such that a plurality of similarity scores are generated, where each of the similarity scores represents a similarity between the user-submitted question and a corresponding one of the stored questions.) (Col. 3, Lines 46-51).
Wang et al. finds a semantic similarity score between a user-submitted query and a stored question-answer pair.
updating the stored query-response pairs by adding the corrected query-response pair in storage.
(In some implementations, the question answer system makes a determination as to whether to update the question-answer repository to include the user-submitted question using a similarity function. Specifically, a similarity function may be applied to determine whether the user-submitted question is semantically distinct from questions in the question-answer repository. In the event that the user-submitted question is determined to be semantically distinct from questions in the question-answer repository, the question-answer repository may be updated to include the user-submitted question.) (Col. 4, Lines 5-15).
Wang et al. adds the new query to the question-answer repository if it is below the similarity threshold (i.e., is semantically distinct).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify the question answering method involving generated question-answer pairs as taught by Barron et al. in view of Nahamoo et al. to implement adding new question-answer pairs to storage when a user question does not match the current pairs as taught by Wang et al. This would have been an obvious improvement, as dynamically expanding the question-answer repository makes retrieved answers more likely to be accurate and responsive to user-submitted questions (Wang et al. Col. 4, Lines 33-36).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NICHOLAS DANIEL LOWEN whose telephone number is (571)272-5828. The examiner can normally be reached Mon-Fri 8:00am - 4:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Paras D Shah can be reached at (571) 270-1650. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/NICHOLAS D LOWEN/Examiner, Art Unit 2653
/DANIEL C WASHBURN/Supervisory Patent Examiner, Art Unit 2657