DETAILED ACTION
This Office action is in response to Applicant’s submission filed on 12/29/2025. Claims 1, 4, 9-11, and 16 were amended. Claims 1-20 are pending in the application, of which Claims 1, 10, and 16 are independent, and all pending claims have been examined.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
Claim 1, and therefore claims 2-9 which depend therefrom, is rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. Withholding the initial prompt from the LLM, as recited in claim 1, does not have support or description in the Applicant’s Specification as filed.
Response to Arguments
Applicant’s arguments filed in the Amendment filed 12/29/2025 (herein “Amendment”) with respect to the claim objections raised in the previous Office action have been fully considered, and they are persuasive. Therefore, the objections to the various claims are withdrawn.
Applicant’s arguments and amendments in the Amendment with respect to the 35 U.S.C. 112(b) rejection have been fully considered and are persuasive. Consequently, the 35 U.S.C. 112(b) claim rejection is withdrawn.
Applicant’s arguments and amendments in the Amendment with respect to the 35 U.S.C. § 103 rejection raised in the previous Office action have been fully considered but are moot in view of the new grounds of rejection, which were necessitated by Applicant’s amendment. Therefore, the previous rejection has been withdrawn. However, upon further consideration, a new ground of rejection is introduced: for independent claim 1, Jones et al. (US 20250111220 A1) and Rahmani et al. (US 20230176829 A1) are added to Papayiannis; for independent claim 10, Jones et al. (US 20250111220 A1) is added to the combination of Papayiannis and Mandlekar.
Please see the prior art section below for more detail, including updated citations and the obviousness rationale.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 4, 7, and 8 are rejected under 35 U.S.C. 103 as being unpatentable over Papayiannis (US20250104693A1) in view of Jones et al. (US 20250111220 A1) (herein “Jones”) and Rahmani et al. (US20230176829A1) (herein “Rahmani”).
Papayiannis was applied in the previous Office Action.
Regarding claim 1, Papayiannis teaches A system comprising: at least one computer processor; and computer storage media storing computer-useable instructions that, when used by the at least one computer processor, cause the system to perform operations comprising: (Papayiannis, Par. 0121:”… include a memory storage which may store various information associated with the processing performed (e.g., user input data 102, the prompt data 515, the context data 105, the personalized context data 467, the model output data 525, prompt data 535, the task data 437, the relevant API data 635, the prompt data 615, the action plan data 442, the action response data 458a-n, the potential response data 443a-n, etc.) during one or more previous iterations of processing by the LLM orchestrator component 430 for the user input data 102.”, and Par. 0206:” Computer instructions for operating each device … executed by the respective device's controller(s)/processor(s) ... A device's computer instructions may be stored in a non-transitory manner in … Alternatively, some or all of the executable instructions may be embedded in hardware or firmware on the respective device in addition to or instead of software.”, and Par. 0215:” Aspects of the disclosed system may be implemented as a computer method or as an article of manufacture such as a memory device or non-transitory computer readable storage medium. The computer readable storage medium may be readable by a computer and may comprise instructions for causing a computer or other device to perform processes described ... “).
receiving, from a user device, an input comprising an initial prompt indicative of at least one task; (Papayiannis, Par. 0017:” A system may receive a user input as speech. For example, a user may speak an input to a device. The device may send audio data, representing the spoken input, to the system.”, and Par. 0038:”… user input of “How fast can they move,” the task selection prompt generation component 530 may generate example prompt data 115b:”, and Par. 0082:” … the user input data 102 is received at the plan prompt generation component 510. The plan prompt generation component 510 processes the user input data 102 to generate prompt data 515 representing a prompt for input to the plan generation language model 520. In some embodiments, the plan prompt generation component 510 may further receive an indication of one or more remaining tasks to be completed with respect to the user input data 102.”)
based on the input, performing a semantic search to determine a plurality of candidate language model (LM) skills; (Papayiannis, Par. 0103:” … For example, the API shortlister component 620 may compare one or more APIs included in the index storage 630 to the user input or the current task to determine one or more APIs (top-k) that corresponds to the user input or the current task (e.g., APIs that are semantically similar to the user input or the current task, APIs that are capable of performing the current task, etc.). … representation of an API description for the API to determine whether the API is semantically similar to the user input or the current task. “, and Par. 0115:” For further example, the API provider component 650 may include a search component, which may be configured to query a storage (e.g., a database, repository, knowledge base, etc.) for information usable for generating a response to a user input. For example, if the action data 647a-n represents a request for information of “Who won the game between [Team 1 Name] and [Team 2 Name],” then the search component may query the storage (or other sources, such as the Internet), to retrieve the information “[Team 1 Name] won the game between [Team 1 Name] and [Team 2 Name].”, and Par.0117:” In some embodiments, the API provider component 650 may include a domain service [skill] component, which may be configured for interacting with one or more services [skills] defined by particular users, such as developers, specialists, or the like (e.g., to receive information, such as responses or annotations, to cause an action.”)
in response to performing the semantic search, receiving the plurality of candidate LM skills, each candidate LM skill comprising a corresponding Application Programming Interface (API) and a corresponding API description; (Papayiannis, Par. 0103:” … the one or more relevant APIs from the index storage 630, which may store various information associated with multiple APIs such as API descriptions, API arguments (e.g., parameter inputs/outputs), … For example, the API shortlister component 620 may compare one or more APIs included in the index storage 630 to the user input or the current task to determine one or more APIs (top-k) that corresponds to the user input or the current task (e.g., APIs that are semantically similar to the user input or the current task, APIs that are capable of performing the current task, etc.). … representation of an API description for the API to determine whether the API is semantically similar to the user input or the current task. An API description may correspond to a description of the one or more function that the API is configured to perform and/or other information associated with the API (e.g., an API call formatting structure (e.g., including input parameters), … the API description may further include one or more exemplars associated with use of the API (e.g., an example user input, corresponding API call, and example API output). If the value of semantic similarity meets or exceeds a threshold, the API (and, optionally, the API description) may be included in the relevant API data 635. …”, and Par. 0118:” The API provider component 650, the LLM agent component 652, the skill component 654, and/or the TTS component 380 may send action response data 458a-n representing one or more potential responses generated by the one or more APIs corresponding to the action data 647a-n (e.g., the potential response(s) and/or potential action(s)) to the action plan execution component 445. 
For example, in response to an API call to the skill component 654 associated with a user input for turning on a light, the action response data 458a may correspond to a potential action of “turn on the light,” “turn_on_device (“light”, [device ID])”, or the like. For further example, in response to an API call to the skill component 654 associated with a user input for ordering a pizza from a particular restaurant, the action response data 458b may correspond to a potential action of “order medium pizza from [restaurant name]”, “order_pizza (“medium”, “pizza”, “[restaurant name]”)”, or the like.”)
generating an API call associated with the at least one target LM skill and comprising an API parameter input into an API of the at least one target LM skill based on the input; (Papayiannis, Par. 0108:” The shortlister language model 640 may generate the one or more APIs calls (including the required input parameters) by applying in-context learning for cold-starting APIs (e.g., one-shot/few-shot learning). For example, in embodiments where the relevant API data 635 includes the API descriptions, the shortlister language model 640 may use the one or more exemplars included in the API descriptions (included in the prompt data 615) to determine the one or more input parameters for the API call. In some embodiments, the shortlister language model 640 may be finetuned on such exemplars (e.g., during offline or runtime processing), such that the shortlister language model 640 is capable of determining the one or more input parameters for the given API call.”, and Par. 0118:” The API provider component 650, the LLM agent component 652, the skill component 654, and/or the TTS component 380 may send action response data 458a-n representing one or more potential responses generated by the one or more APIs corresponding to the action data 647a-n (e.g., the potential response(s) and/or potential action(s)) to the action plan execution component 445. For example, in response to an API call to the skill component 654 associated with a user input for turning on a light, the action response data 458a may correspond to a potential action of “turn on the light,” “turn_on_device (“light”, [device ID])”, or the like. For further example, in response to an API call to the skill component 654 associated with a user input for ordering a pizza …”).
transmitting the API call, wherein transmitting the API call causes execution of the API call against the API of the at least one target LM skill. (Papayiannis, Par. 0019:” … The LLM(s) use the prompt to generate a natural language response to the user input and … output the natural language response to the user.”, and Par. 0078:” … The action plan execution component 445 may identify the request(s) in the action plan data 442, generate executable API calls corresponding to the request(s), and cause the corresponding components (e.g., the API provider component 650, the LLM agent component 652, the skill component 654, …) to generate action response data 458a-n representing the requested potential response(s), where individual action response data 458a may be provided by/correspond to a particular responding component—one of the API provider component 650, the LLM agent component 652, the skill component 654, ....”, and Par. 0109:” … Action data 647a may represent, for example, an instruction (e.g., an executable API call determined from/generated based on the action plan data 442) for a particular API to process with respect to the user input and/or the current task.”, and Par. 0118:” … For example, in response to an API call to the skill component 654 associated with a user input for turning on a light, the action response data 458a may correspond to a potential action of “turn on the light,” “turn_on_device (“light”, [device ID])”, or the like.”)
Papayiannis does not teach, however, Jones teaches executing an orchestration loop that includes: prompting a large language model (LLM), based on the at least one task associated with the input, to select among the plurality of candidate LM skills; and (Jones, Par. 0116:” … at block 704, selecting a large language model service. For instance, the service 110 can choose the lowest-cost LLM service from a set of available options, or it can opt for the lowest-cost LLM service from the available options with a predicted probability of generating a functionally/semantically equivalent code specification that surpasses a certain threshold or minimum probability.”, and Par. 0118:”… For every sample, the large language model analyzes the code generation prompt and endeavors to produce code that meets the requirements or intentions expressed in the code specification of the code generation prompt. In pursuit of this goal, the large language model utilizes its acquired understanding of programming languages, syntax, and prevalent coding patterns to generate coherent and pertinent code.”)
determining at least one target LM skill from the plurality of candidate LM skills based at least in part on an output of the LLM in response to the prompting; (Jones, Par. 0116:” … at block 704, selecting a large language model service. For instance, the service 110 can choose the lowest-cost LLM service from a set of available options, or it can opt for the lowest-cost LLM service from the available options with a predicted probability of generating a functionally/semantically equivalent code specification that surpasses a certain threshold or minimum probability.”, and Par. 0118:”… For every sample, the large language model analyzes the code generation prompt and endeavors to produce code that meets the requirements or intentions expressed in the code specification of the code generation prompt. In pursuit of this goal, the large language model utilizes its acquired understanding of programming languages, syntax, and prevalent coding patterns to generate coherent and pertinent code.”) Note: producing code that meets the requirements reads on an output of the LLM in response to the prompting.
Jones is considered to be analogous to the claimed invention because it is in the same field of endeavor. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Papayiannis, further in view of Jones, to execute an orchestration loop that includes: prompting a large language model (LLM), based on the at least one task associated with the input, to select among the plurality of candidate LM skills; and determining at least one target LM skill from the plurality of candidate LM skills based at least in part on an output of the LLM in response to the prompting. Motivation to do so would be to provide accurate responses to a specific prompt (Jones, Par. 0015).
Papayiannis, as modified above, does not teach, however Rahmani teaches withholding the initial prompt from the LLM. (Rahmani, Par. 0326:” … prepare a prompt for later submission to a language model …”) Note: later submission reads on withholding.
Rahmani is considered to be analogous to the claimed invention because it is in the same field of endeavor. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Papayiannis, as modified above, further in view of Rahmani, to withhold the initial prompt from the LLM. Motivation to do so would be to allow the system to focus on the current task.
Regarding claim 4, Papayiannis, as modified above, teaches wherein performing the semantic search comprises: extracting, from the initial prompt, an intent; (Papayiannis, Par. 0090:” As an example of a user input [intent] that is associated with more than one task, the system 100 may receive a user input [intent] of “please order some pizza for dinner” and may determine a task list of “identify user pizza preference” and “find application that enables ordering of pizza.” Thereafter, the system 100 may process as described herein below to select and complete the task of “identify user pizza preference.” The plan prompt generation component 510 may process the user input, corresponding context data, the remaining task list, and the potential responses (e.g., the users pizza preference, determined, for example, by the personalized context component 465) to generate example prompt data 515a:”) Note: Prompt is formulated based on the user input (intent).
determining, from the intent, a task; (Papayiannis, Par. 0077:” The user input [intent] data 102 may be received at the LLM orchestrator component 430 of the system component(s) 420, which may be configured to generate a list (e.g., one or more) of tasks (e.g., steps/actions) that are to be completed in order to perform an action responsive to the user input and select a task of the list of the tasks that is to be completed …”).
transmitting an indication of the task to the LLM as a first command; and (Papayiannis, Par. 0077:” … The plan generation component 435 may generate and send task data 437 representing the selected task to be completed and various other information needed to perform further processing with respect to the task (e.g., the user input data 102, an indication of the selected task, potential responses associated with previous tasks, the remaining task(s), and context data associated with the user input data 102, as described in detail herein below with respect to FIG. 5) to the LLM shortlister component 440.”)
receiving a first LM response to the first command, wherein the semantic search is performed against an external database using the first LM response, (Papayiannis, Par. 0115:” … may include a search component, which may be configured to query a storage (e.g., a database, repository, knowledge base, etc.) for information usable for generating a response to a user input. For example, if the action data 647a-n represents a request for information of “Who won the game between [Team 1 Name] and [Team 2 Name],” then the search component may query the storage (or other sources, such as the Internet), to retrieve the information “[Team 1 Name] won the game between [Team 1 Name] and [Team 2 Name].”, and Par. 0035:” … generate information potentially responsive to the user input data 102 (e.g., search-query results), then the prompt generation component 110 may further receive the potential response data 107.”, and Par. 0103:” … the user input or the current task to determine one or more APIs (top-k) that corresponds to the user input or the current task (e.g., APIs that are semantically similar to the user input or the current task, …”).
wherein an updated prompt is transmitted as a second command to the LLM, (Papayiannis, Par. 0085:” In some embodiments, plan prompt generation component 510 (or another component of the system 100) may process the personalized context data 467, the user input data 102, and/or the potential responses associated with the user input data 102 to generate a natural language representation of the user input (represented by the user input data 102) that is updated to include the contextual information of the personalized context data 467 (e.g., a contextual rewrite of the user input). Thereafter, the plan prompt generation component 510 may process to generate the prompt data 515 using the updated user input data.”)
wherein the second command is communicated after the first command. (Papayiannis, Par. 0085:” … Thereafter [subsequently, after], the plan prompt generation component 510 may process to generate the prompt data 515 using the updated user input data.”)
Regarding claim 7, Papayiannis, as modified above, teaches wherein generating the API call comprises applying a portion of the input as the API parameter input that is applied into the API of the at least one target LM skill based on the input. (Papayiannis, Par. 0078:” The LLM shortlister component 440 may be configured to determine one or more components (e.g., APIs, skill component(s) 654, LLM agent component(s) 652, TTS component 380, etc.) configured to perform an action related to the user input or the current task. The LLM shortlister component 440 may further be configured to generate and cause the execution of a request(s) (e.g., an API call(s), … to provide a potential responses(s) to the user input or current task … generate executable API calls corresponding to the request(s), and cause the corresponding components (e.g., the API provider component 650, the LLM agent component 652, the skill component 654, ...”, and Par. 0102:” The relevant API data 635 may be generated by the API shortlister component 620, which may be configured to retrieve one or more (e.g., top-k) relevant APIs associated with the user input data 102 or the current task.”, and Par. 0104:” … determine one or more APIs that are to process with respect to the user input or the current task (e.g., determine one or more API calls to cause the APIs to process) given the information (e.g., the user input data 102, the personalized context data 467, the current task, and the relevant API data 635).”, and Par. 0103:”… multiple APIs such as API descriptions, API arguments (e.g., parameter inputs/outputs), identifiers for components (e.g., such as personalized context component 465, skill component(s) 654, LLM agent component(s) 652, … An API description may correspond to a description of the one or more function that the API is configured to perform and/or other information associated with the API (e.g., an API call formatting structure (e.g., including input parameters), historical accuracy/defect rate, historical latency value, etc.). In some embodiments, the API description may further include one or more exemplars associated with use of the API (e.g., an example user input, corresponding API call, and example API output). If the value of semantic similarity meets or exceeds a threshold, the API (and, optionally, the API description) may be included in the relevant API data 635. In some embodiments, the API shortlister component 620 may determine the relevant API data 635 further using contextual information, including the personalized context data 467, an accuracy/defect rate value associated with the APIs, and/or a historical latency value associated with the APIs (e.g., which may be included in the description of the API). “)
Regarding claim 8, Papayiannis, as modified above, teaches wherein the initial prompt is indicative of a user request to a large language model (LLM), and (Papayiannis, Par. 0082:” … the user input data 102 is received at the plan prompt generation component 510. The plan prompt generation component 510 processes the user input data 102 to generate prompt data 515 representing a prompt for input to the plan generation language model 520. … Such prompt data 515 may be generated based on combining the user input data 102 and …”) Note: prompt is generated based on user input/request.
wherein the input comprises contextual information associated with at least one of the initial prompt, the user request, the user device, or a user profile associated with a user. (Papayiannis, Par. 0082:” … The plan prompt generation component 510 may further receive the context data 105 representing the various contextual signals associated with the user input data 102, such as weather information, time of day, device information associated with the device that sent the user input data 102 (e.g., device ID, device states, historical device interaction data, etc.). Such prompt data 515 may be generated based on combining the user input data 102 and the context data 105 …. In some embodiments, the prompt data 515 may be generated further based on the personalized context data 467.”, and Par. 0126:” The personalized context data 467 may represent one or more contextual signals associated with the user 405, such as information associated with a user profile of the user 405 …”).
Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Papayiannis, Jones, and Rahmani, and further in view of Suwandy (US20210327413A1).
Suwandy was applied in the previous Office Action.
Regarding claim 2, Papayiannis teaches the system of claim 1.
Papayiannis, as modified above, does not teach, however, Suwandy teaches wherein the orchestration loop is run until at least one of: a threshold quantity of loops is reached or until the at least one target LM skill has a threshold level of relatedness to the input. (Suwandy, Par. 0054:” In this example, intent type A 218 is associated with a plurality of skills (skill A 230, skill B 232, skill C 234). A bot developer for the corresponding conversational bot may have provided exemplary natural language inputs for targeting each of those skills. Those exemplary natural language inputs may have been embedded and added to the embedding library. Thus, a similarity score [relatedness] may also be calculated for each string embedding from natural language input 202 and each of those skills. The scores are illustrated as skill A score 236, skill B score 238, and skill C score 240. If a similarity score for an embedding for any of the embedded strings exceeds a threshold value for any of skill A 230, skill B 232, and/or skill C 234, the corresponding skill may be executed and/or a response corresponding to the skill may be generated. “)
Suwandy is considered to be analogous to the claimed invention because it is in the same field of endeavor. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Papayiannis, as modified above, further in view of Suwandy, such that the orchestration loop is run until the at least one target LM skill has a threshold level of relatedness to the input. Motivation to do so would be to improve the service and/or to improve intent type and/or skill type identification (Suwandy, Par. 0028).
Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Papayiannis, Jones, and Rahmani, and further in view of Suwandy and Cha (US20250005901A1).
Cha was applied in the previous Office Action.
Regarding claim 3, Papayiannis teaches the system of claim 1.
Papayiannis, as modified above, does not teach, however Suwandy teaches communicating, for each LM skill of the plurality of candidate LM skills, at least one command in domain-specific language (DSL) to generate a respective output, (Suwandy, Par. 0037:” … A manifest may comprise an interface definition language (IDL) that includes instructions for sending, receiving, and processing commands associated with skills that a conversational bot may perform. A skill may comprise one or more activities that may be performed by a conversational bot.”, and Par. 0067:” … NLR 508 in fine-tuning service 508 receives domain-specific language from domain data 502 and encodes embeddings from that language.”)
Suwandy is considered to be analogous to the claimed invention because it is in the same field of endeavor. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Papayiannis, as modified above, further in view of Suwandy, to communicate at least one command in domain-specific language (DSL) to generate a respective output. Motivation to do so would be to improve the service and/or to improve intent type and/or skill type identification (Suwandy, Par. 0028).
Papayiannis, as modified above, does not teach, however Cha teaches wherein the at least one target LM skill of the plurality of candidate LM skills is selected based on a level of relatedness determined based on a proximity in semantic vector space between the respective output and the input. (Cha, Par. 0065:” … retrieved by the search engine in response to the user prompt; converting, via an image captioning AI generator, the first image to a first text description; converting, via the image captioning AI generator, the second image to a second text description; determining the first text description is more similar to the user prompt than the second text description; and selecting the first image as the reference digital image in response to the first text description being more similar to the user prompt than the second text description.”)
Cha is considered to be analogous to the claimed invention because it is in the same field of endeavor. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Papayiannis, as modified above, further in view of Cha, to select the at least one target LM skill based on a level of relatedness determined based on a proximity in semantic vector space between the respective output and the input. Motivation to do so would be to improve the predicted accuracy of the output (Cha, Par. 0051).
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Papayiannis, Jones, and Rahmani, and in further view of Corlatescu (US20240427631A1) and Sukhija (US20250036887A1).
Corlatescu and Sukhija were applied in the previous Office Action.
Regarding claim 5, Papayiannis teaches the system of claim 1.
Papayiannis, as modified above, does not teach, however, Corlatescu teaches receiving an API response to the API call; and (Corlatescu, Par. 0020:” In some embodiments, the agent responses include Application Programming Interface (API) calls. The approach organizes the API calls into an execution stack, and executes the API calls to the services in an order based on the execution stack. In some embodiments, the second LLMs executing on the service agents execute the API calls to their corresponding services to produce an API responses. The approach receives the API responses from the service agents, and generates the query response based on the API responses.”)
Corlatescu is considered to be analogous to the claimed invention because it is in the same field of endeavor. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Papayiannis, as modified above, further in view of Corlatescu to receive an API response to the API call. Motivation to do so would be to improve the operation of a computer system by using LLMs to increase the speed at which user queries are processed (Corlatescu, Par. 0024).
Papayiannis, as modified above, does not teach, however, Sukhija teaches transmitting the API response to the LLM without directly communicating the initial prompt to the LLM. (Sukhija, Par. 0058:” … The API Chain enables using LLMs to interact with APIs to retrieve relevant information.”) Note: when the LLM interacts with an API, this implies communication of the API response back to the LLM as well.
Sukhija is considered to be analogous to the claimed invention because it is in the same field of endeavor. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Papayiannis, as modified above, further in view of Sukhija to transmit the API response to the LLM without directly communicating the initial prompt to the LLM. Motivation to do so would be to provide accurate answers to user queries (Sukhija, Par. 0005).
Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Papayiannis, Jones, and Rahmani, and in further view of Jackson (US 20250103715 A1).
Jackson was applied in the previous Office Action.
Regarding claim 6, Papayiannis teaches the system of claim 1.
Papayiannis, as modified above, does not teach, however Jackson teaches wherein the initial prompt from the user device is not communicated to the LLM. (Jackson, Par. 0034:” The various embodiments disclosed herein seek to prevent prompt injection attacks in LLM systems, such as chatbots or virtual assistants. The various embodiments disclosed herein provide systems and methods for preventing prompt injection attacks in LLM systems (e.g., chatbots/virtual assistant applications) by applying a pre-filtering evaluation to incoming prompts, and a post-filter evaluation to responses generated by the model. … In some embodiments, if the prompt is evaluated as safe by the pre-filter, the prompt may be processed by the LLM to generate an output.”) Note: prefiltered prompt is no longer initial prompt.
Jackson is considered to be analogous to the claimed invention because it is in the same field of endeavor. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Papayiannis, as modified above, further in view of Jackson such that the initial prompt from the user device is not communicated to the LLM. Motivation to do so would be to improve the quality and relevance of the response generated by the LLM (Jackson, Par. 0004).
Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Papayiannis, Jones, and Rahmani, and in further view of Takeoka (US 20210383255 A1).
Takeoka was applied in the previous Office Action.
Regarding claim 9, Papayiannis teaches the system of claim 1.
Papayiannis, as modified above, teaches semantic search (Papayiannis, Par. 0103:” … For example, the API shortlister component 620 may compare one or more APIs included in the index storage 630 to the user input or the current task to determine one or more APIs (top-k) that corresponds to the user input or the current task (e.g., APIs that are semantically similar to the user input or the current task, APIs that are capable of performing the current task, etc.). … representation of an API description for the API to determine whether the API is semantically similar to the user input or the current task.”, and Par. 0115:” For further example, the API provider component 650 may include a search component, which may be configured to query a storage (e.g., a database, repository, knowledge base, etc.) for information usable for generating a response to a user input.”)
Papayiannis, as modified above, does not teach, however, Takeoka teaches determining, in semantic vector space, skills that are near the at least one task, wherein the plurality of candidate LM skills are semantically similar to and near in the semantic vector space to the at least one task. (Takeoka, Par. 0043:” The answer integration unit 40 may calculate, for example, an inner product of a feature vector representing the feature of the task and a skill vector representing the skill of the annotator and calculate a value (likelihood) indicating how well each annotator fits for each task to use the calculated likelihood as a weight. It can be said that this value is an index indicating how appropriately an annotator responds to the suitability of a label. In addition, the more the skill of the annotator and the feature of the task match, the larger the calculated inner product of the above-mentioned feature vector and skill vector will be.”) Note: a larger inner product between two vectors indicates greater directional similarity. Consequently, the larger the inner product, the closer/nearer the vectors are to each other (the smaller the angle between them), which implies they are semantically similar.
Takeoka is considered to be analogous to the claimed invention because it is in the same field of endeavor. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Papayiannis, as modified above, with Takeoka. As implied in Takeoka [Par. 0066], one of ordinary skill would have been motivated to combine the teachings because it would identify skills that are relevant to a task even when the task description and skill title do not share specific keywords.
Claims 10, 13, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Papayiannis (US20250104693A1), and in further view of Mandlekar (US20250068966A1), and Jones (US 20250111220 A1).
Papayiannis, and Mandlekar were applied in the previous Office Action.
Regarding claim 10, Papayiannis teaches A computer-implemented method comprising: receiving, from a user device, an initial prompt indicative of at least one task; (Papayiannis, Par. 0017:” A system may receive a user input as speech. For example, a user may speak an input to a device. The device may send audio data, representing the spoken input, to the system.”, and Par. 0038:”… user input of “How fast can they move,” the task selection prompt generation component 530 may generate example prompt data 115b:”, and Par. 0082:” … the user input data 102 is received at the plan prompt generation component 510. The plan prompt generation component 510 processes the user input data 102 to generate prompt data 515 representing a prompt for input to the plan generation language model 520. In some embodiments, the plan prompt generation component 510 may further receive an indication of one or more remaining tasks to be completed with respect to the user input data 102.”)
based on the initial prompt, determining a first task and a second task associated with the initial prompt; (Papayiannis, Par. 0082:” … The plan prompt generation component 510 processes the user input data 102 to generate prompt data 515 representing a prompt for input to the plan generation language model 520.”, and Par. 0092:” The plan generation language model 520 processes the prompt data 515 to generate model output data 525 representing one or more predicted tasks to be completed in order to perform the action responsive to the user input. … In some embodiments, the threshold for determining the one or more tasks may be such that the plan generation language model 520 is encouraged to generate multiple predicted tasks for a given user input, where the system 100 may parse and filter the list of tasks during downstream processing (e.g., during the processing of the task selection language model 540). For example, based on processing the first example prompt data provided above, the plan generation language model 520 may output model output data 525d: {“turn on all of the lights except the garage light,” “turn on all lights,” “identify which garage light,” “turn on all lights then turn off garage light,” “turn on all lights where user is located,” “turn on kitchen lights, living room lights, dining room lights, hallways lights” “turn on all lights on first floor,” } or the like.”)
based on the first task and the second task, performing a search for a plurality of candidate LM skills; (Papayiannis, Par. 0103:” … For example, the API shortlister component 620 may compare one or more APIs included in the index storage 630 to the user input or the current task to determine one or more APIs (top-k) that corresponds to the user input or the current task (e.g., APIs that are semantically similar to the user input or the current task, APIs that are capable of performing the current task, etc.). … representation of an API description for the API to determine whether the API is semantically similar to the user input or the current task. “, and Par. 0115:” For further example, the API provider component 650 may include a search component, which may be configured to query a storage (e.g., a database, repository, knowledge base, etc.) for information usable for generating a response to a user input. For example, if the action data 647a-n represents a request for information of “Who won the game between [Team 1 Name] and [Team 2 Name],” then the search component may query the storage (or other sources, such as the Internet), to retrieve the information “[Team 1 Name] won the game between [Team 1 Name] and [Team 2 Name].”, and Par.0117:” In some embodiments, the API provider component 650 may include a domain service [skill] component, which may be configured for interacting with one or more services [skills] defined by particular users, such as developers, specialists, or the like (e.g., to receive information, such as responses or annotations, to cause an action.”)
in response to performing the search, receiving a plurality of candidate LM skills, each candidate LM skill comprising a corresponding API description and a corresponding API; (Papayiannis, Par. 0103:” … the one or more relevant APIs from the index storage 630, which may store various information associated with multiple APIs such as API descriptions, API arguments (e.g., parameter inputs/outputs), … For example, the API shortlister component 620 may compare one or more APIs included in the index storage 630 to the user input or the current task to determine one or more APIs (top-k) that corresponds to the user input or the current task (e.g., APIs that are semantically similar to the user input or the current task, APIs that are capable of performing the current task, etc.). … representation of an API description for the API to determine whether the API is semantically similar to the user input or the current task. An API description may correspond to a description of the one or more function that the API is configured to perform and/or other information associated with the API (e.g., an API call formatting structure (e.g., including input parameters), … the API description may further include one or more exemplars associated with use of the API (e.g., an example user input, corresponding API call, and example API output). If the value of semantic similarity meets or exceeds a threshold, the API (and, optionally, the API description) may be included in the relevant API data 635. …”, and Par. 0118:” The API provider component 650, the LLM agent component 652, the skill component 654, and/or the TTS component 380 may send action response data 458a-n representing one or more potential responses generated by the one or more APIs corresponding to the action data 647a-n (e.g., the potential response(s) and/or potential action(s)) to the action plan execution component 445. 
For example, in response to an API call to the skill component 654 associated with a user input for turning on a light, the action response data 458a may correspond to a potential action of “turn on the light,” “turn_on_device (“light”, [device ID])”, or the like. For further example, in response to an API call to the skill component 654 associated with a user input for ordering a pizza from a particular restaurant, the action response data 458b may correspond to a potential action of “order medium pizza from [restaurant name]”, “order_pizza (“medium”, “pizza”, “[restaurant name]”)”, or the like.”)
[[selecting at least one target LM skill of the plurality of candidate LM skills]] by executing an orchestration loop, the orchestration loop including, for at least the first task or the second task; (Papayiannis, Par. 0174:” … after completion of natural language understanding processing, after selection of a skill component to process with respect to the user input and prior to initiation of processing by the skill component, …”, and Par. 0076:” … various components, such as a large language model (LLM) orchestrator component 430, a personalized context component 465, and an action plan execution component 445. The LLM orchestrator component 430 may include a plan generation component 435, an LLM shortlister component 440, and a response arbitration component 460.”, and Par. 0077:”The user input data 102 may be received at the LLM orchestrator component 430 of the system component(s) 420, which may be configured to generate a list (e.g., one or more) of tasks (e.g., steps/actions) that are to be completed in order to perform an action responsive to the user input and select a task of the list of the tasks …”, and Par. 0137:” As such, the response language model 720 may select between the one or more potential responses from one or more different components (e.g., for the first example prompt data, the potential responses from the skill component A and the skill component B and, for the second example prompt data, the potential responses from Component A, API A, and API B) to determine that a subset of the potential responses are responsive to the user input. …”).
generating an API call associated with the at least one target LM skill and comprising an API parameter input into an API of the at least one target LM skill based on the first task and the second task; (Papayiannis, Par. 0108:” The shortlister language model 640 may generate the one or more APIs calls (including the required input parameters) by applying in-context learning for cold-starting APIs (e.g., one-shot/few-shot learning). For example, in embodiments where the relevant API data 635 includes the API descriptions, the shortlister language model 640 may use the one or more exemplars included in the API descriptions (included in the prompt data 615) to determine the one or more input parameters for the API call. In some embodiments, the shortlister language model 640 may be finetuned on such exemplars (e.g., during offline or runtime processing), such that the shortlister language model 640 is capable of determining the one or more input parameters for the given API call.”, and Par. 0118:” The API provider component 650, the LLM agent component 652, the skill component 654, and/or the TTS component 380 may send action response data 458a-n representing one or more potential responses generated by the one or more APIs corresponding to the action data 647a-n (e.g., the potential response(s) and/or potential action(s)) to the action plan execution component 445. For example, in response to an API call to the skill component 654 associated with a user input for turning on a light, the action response data 458a may correspond to a potential action of “turn on the light,” “turn_on_device (“light”, [device ID])”, or the like. For further example, in response to an API call to the skill component 654 associated with a user input for ordering a pizza …”).
transmitting the API call to cause execution of an API call against the API of the at least one target LM skill. (Papayiannis, Par. 0019:” … The LLM(s) use the prompt to generate a natural language response to the user input and … output the natural language response to the user.”, and Par. 0078:” … The action plan execution component 445 may identify the request(s) in the action plan data 442, generate executable API calls corresponding to the request(s), and cause the corresponding components (e.g., the API provider component 650, the LLM agent component 652, the skill component 654, …) to generate action response data 458a-n representing the requested potential response(s), where individual action response data 458a may be provided by/correspond to a particular responding component—one of the API provider component 650, the LLM agent component 652, the skill component 4P4, ....”, and Par. 0109:” … Action data 647a may represent, for example, an instruction (e.g., an executable API call determined from/generated based on the action plan data 442) for a particular API to process with respect to the user input and/or the current task.”, and Par. 0118:” … For example, in response to an API call to the skill component 654 associated with a user input for turning on a light, the action response data 458a may correspond to a potential action of “turn on the light,” “turn_on_device (“light”, [device ID])”, or the like.”)
Papayiannis does not teach, however Mandlekar teaches selecting at least one target LM skill of the plurality of candidate LM skills (Mandlekar, Par. 0019:” … systems implementing one or more language models-such as one or more large language models (LLMs), systems … “, and Par. 0033:” … system 220 may select the first skill 242 and the third skill 246 from a plurality of skills or sets of instructions. …”).
Mandlekar is considered to be analogous to the claimed invention because it is in the same field of endeavor. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Papayiannis with Mandlekar. As implied in Mandlekar [Par. 0003], one of ordinary skill would have been motivated to combine the teachings because it would create more accurate, robust, and specialized systems.
Papayiannis, as modified above, does not teach, however, Jones teaches prompting a large language model (LLM), using a task description and contextual information derived from the initial prompt, to select among the plurality of candidate LM skills; and (Jones, Par. 0116:” … at block 704, selecting a large language model service. For instance, the service 110 can choose the lowest-cost LLM service from a set of available options, or it can opt for the lowest-cost LLM service from the available options with a predicted probability of generating a functionally/semantically equivalent code specification that surpasses a certain threshold or minimum probability.”, and Par. 0118:”… For every sample, the large language model analyzes the code generation prompt and endeavors to produce code that meets the requirements or intentions expressed in the code specification of the code generation prompt. In pursuit of this goal, the large language model utilizes its acquired understanding of programming languages, syntax, and prevalent coding patterns to generate coherent and pertinent code.”)
selecting the at least one target LM skill based at least in part on an output of the LLM response to the prompting; (Jones, Par. 0116:” … at block 704, selecting a large language model service. For instance, the service 110 can choose the lowest-cost LLM service from a set of available options, or it can opt for the lowest-cost LLM service from the available options with a predicted probability of generating a functionally/semantically equivalent code specification that surpasses a certain threshold or minimum probability.”, and Par. 0118:”… For every sample, the large language model analyzes the code generation prompt and endeavors to produce code that meets the requirements or intentions expressed in the code specification of the code generation prompt. In pursuit of this goal, the large language model utilizes its acquired understanding of programming languages, syntax, and prevalent coding patterns to generate coherent and pertinent code.”) Note: producing code that meets the requirements reads on output of the LLM response to the prompting.
Jones is considered to be analogous to the claimed invention because it is in the same field of endeavor. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Papayiannis, further in view of Jones, to prompt a large language model (LLM), using a task description and contextual information derived from the initial prompt, to select among the plurality of candidate LM skills, and to select the at least one target LM skill based at least in part on an output of the LLM response to the prompting. Motivation to do so would be to provide accurate responses to a specific prompt (Jones, Par. 0015).
Regarding claim 13, Papayiannis, as modified above, teaches wherein determining the first task and the second task comprises: determining an intent based on contextual information associated with at least one of the initial prompt, a user request, the user device, or a user profile associated with a user; and (Papayiannis, Par. 0082:” … The plan prompt generation component 510 may further receive the context data 105 representing the various contextual signals associated with the user input [intent] data 102, such as weather information, time of day, device information associated with the device that sent the user input data 102 (e.g., device ID, device states, historical device interaction data, etc.). Such prompt data 515 may be generated based on combining the user input data 102 and the context data 105 …. In some embodiments, the prompt data 515 may be generated further based on the personalized context data 467.”, and Par. 0126:” The personalized context data 467 may represent one or more contextual signals associated with the user 405, such as information associated with a user profile of the user 405 …”).
generating a first command indicative of a first prompt executed against an LLM to determine at least one task based on the intent, (Papayiannis, Par. 0029:”… A prompt may be a natural language input, for example, an instruction, for the LLM to generate an output [task] according to the prompt.”, and Par. 0077:” … The plan generation component 435 may generate and send task data 437 representing the selected task to be completed and various other information needed to perform further processing with respect to the task (e.g., the user input [intent] data 102, an indication of the selected task, potential responses associated with previous tasks, the remaining task(s), and context data associated with the user input data 102, …“).
wherein the first task and the second task are determined based on the first command. (Papayiannis, Par. 0039:” … the prompt generation component 110 may also include in the prompt data an instruction to output a response that satisfies certain conditions.”, and Par. 0086:” … the prompt data 515 may be an instruction for the plan generation language model 520 to determine one or more tasks (e.g., steps/actions) that are to be completed in order to perform an action responsive to the user input given the other information”, and Par. 0145:” … may generate and send the corresponding instruction (or API call) to perform the one or more potential actions [tasks] responsive to the user input.”)
Regarding claim 15, Papayiannis, as modified above, teaches wherein the initial prompt comprises a user request to an LLM, wherein the initial prompt further comprises contextual information associated with at least one of the initial prompt, the user request, the user device, or a user profile associated with a user. (Papayiannis, Par. 0082:” … the user input [request] data 102 is received at the plan prompt generation component 510. The plan prompt generation component 510 processes the user input data 102 to generate prompt data 515 representing a prompt for input to the plan generation language model 520. … The plan prompt generation component 510 may further receive the context data 105 representing the various contextual signals associated with the user input data 102, such as weather information, time of day, device information associated with the device that sent the user input data 102 (e.g., device ID, device states, historical device interaction data, etc.). Such prompt data 515 may be generated based on combining the user input data 102 and the context data 105 … In some embodiments, the prompt data 515 may be generated further based on the personalized context data 467.”, and Par. 0126:” The personalized context data 467 may represent one or more contextual signals associated with the user 405, such as information associated with a user profile of the user 405 …”).
Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Papayiannis, Mandlekar, Jones, and Pang et al. (US 20230419027A1) (herein “Pang”).
Regarding claim 11, Papayiannis teaches the method of claim 10.
Papayiannis, as modified above, teaches wherein executing the orchestration loop comprises communicating at least one command in domain-specific language (DSL) to generate a respective output, (Papayiannis, Par. 0174:” … after selection of a skill component to process with respect to the user input and prior to initiation of processing by the skill component, …”, and Par. 0076:” … The LLM orchestrator component 430 may include a plan generation component 435, an LLM shortlister component 440, and a response arbitration component 460.”, and Par. 0077:”The user input data 102 may be received at the LLM orchestrator component 430 of the system component(s) 420, which may be configured to generate a list (e.g., one or more) of tasks (e.g., steps/actions) that are to be completed in order to perform an action responsive to the user input and select a task of the list of the tasks …”, and Par. 0135:” If the response language model 720 determines that one or more of the potential responses [implies proper skill selection] are responsive to the user input, the response language model 720 … For example, based on processing the first example prompt data above, the response language model 720 may select one of the potential responses (e.g., the potential responses from skill component A (e.g., a weather skill component)) determined to be responsive to the user input to generate the model output natural language data 125 and the model output prosody data 135a: {“It is currently 70 degrees, with a high of 75 and a low of 68,” } or the like. 
For further example, based on processing the first example prompt data provided above, the response language model 720 may select more than one of the potential responses (e.g., the potential responses from both the skill component A and skill component B) determined to be responsive to the user input and generate a summary of the selected responses to generate the model output natural language data 125b: {“It is expected to be mostly sunny today, with a high of 75 and a low of 68, but with a chance of rain in the late afternoon,” } or the like.”) Note: Prompt/command is the request to check the weather condition, and appropriate DSL (weather) is selected since the response is appropriately outputted. The command is to go get the weather report which generated a respective output.
Papayiannis, as modified above, does not teach, however, Pang teaches wherein the at least one target LM skill is selected based on a level of relatedness determined based on a proximity in semantic vector space between the respective output and the initial prompt. (Pang, Par. 0067:” For task relations, it is investigated if the latent space captures source and target task relations to allow knowledge transfer. Each instance queries the latent space and selects one latent skill. This selection is converted to a one-hot vector and treat it as an instance encoding. A task representation is the average of instance encodings in the task. The cosine similarity between two task representations is computed as their relation. The relations between source and target tasks are visualized in FIG. 9. It seems that more complicated source tasks such as QA and NLI tasks transfer more knowledge to target tasks via the skill latent space.”)
Pang is considered to be analogous to the claimed invention because it is in the same field of endeavor. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Papayiannis, as modified above, further in view of Pang such that the at least one target LM skill is selected based on a level of relatedness determined based on a proximity in semantic vector space between the respective output and the initial prompt. Motivation to do so would be to provide a prompt-based transfer learning method that employs shared latent space prompt tuning (Pang, Par. 0015).
Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Papayiannis, Mandlekar, Jones, and in further view of Suwandy.
Regarding claim 12, Papayiannis teaches the method of claim 10.
Papayiannis, as modified above, does not teach, however, Suwandy teaches wherein the orchestration loop is run until at least one of: a threshold quantity of loops is reached or until the at least one target LM skill has a threshold level of relatedness to the initial prompt. (Suwandy, Par. 0054:” In this example, intent type A 218 is associated with a plurality of skills (skill A 230, skill B 232, skill C 234). A bot developer for the corresponding conversational bot may have provided exemplary natural language inputs for targeting each of those skills. Those exemplary natural language inputs may have been embedded and added to the embedding library. Thus, a similarity score [relatedness] may also be calculated for each string embedding from natural language input 202 and each of those skills. The scores are illustrated as skill A score 236, skill B score 238, and skill C score 240. If a similarity score for an embedding for any of the embedded strings exceeds a threshold value for any of skill A 230, skill B 232, and/or skill C 234, the corresponding skill may be executed and/or a response corresponding to the skill may be generated. “)
Suwandy is considered to be analogous to the claimed invention because it is in the same field of endeavor. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Papayiannis, as modified above, further in view of Suwandy such that the orchestration loop is run until a threshold quantity of loops is reached or until the at least one target LM skill has a threshold level of relatedness to the initial prompt. Motivation to do so would be to improve the service and/or to improve intent type and/or skill type identification (Suwandy, Par. 0028).
Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Papayiannis, Mandlekar, Jones, and Corlatescu (US20240427631A1).
Corlatescu was applied in the previous Office Action.
Regarding claim 14, Papayiannis teaches the method of claim 10.
Papayiannis, as modified above, does not teach; however, Mandlekar further teaches wherein selecting the at least one target LM skill comprises [[generating a second command indicative of a second prompt executed against an LLM]] to determine the at least one target LM skill from the plurality of candidate LM skills. (Mandlekar, Par. 0019: “… systems implementing one or more language models-such as one or more large language models (LLMs), systems …”, and Par. 0033: “… system 220 may select the first skill 242 and the third skill 246 from a plurality of skills or sets of instructions. …”).
Papayiannis, as modified above, does not teach; however, Corlatescu teaches generating a second command indicative of a second prompt executed against an LLM. (Corlatescu, Par. 0021: “… In some embodiments, in response to determining that the first agent response does not complete the first task, the approach constructs a second prompt based on a user query and the first agent response. The approach inputs the second prompt into the first LLM and the first LLM produces a new processing plan that includes new tasks. The approach then sends new messages to the service agents based on the new tasks.”)
Corlatescu is considered to be analogous to the claimed invention because it is in the same field of endeavor. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Papayiannis, as modified above, further in view of Corlatescu to generate a second command indicative of a second prompt executed against an LLM. Motivation to do so would be to improve the operation of a computer system by using LLMs to increase the speed at which user queries are processed (Corlatescu, Par. 0024).
Claims 16-18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Papayiannis (US20250104693A1) in view of Mandlekar (US20250068966A1).
Papayiannis, and Mandlekar were applied in the previous Office Action.
Regarding claim 16, Papayiannis teaches One or more computer storage media having computer-executable instructions embodied thereon that, when executed, by one or more processors, cause the one or more processors to perform operations comprising: (Papayiannis, Par. 0121:”… include a memory storage which may store various information associated with the processing performed (e.g., user input data 102, the prompt data 515, the context data 105, the personalized context data 467, the model output data 525, prompt data 535, the task data 437, the relevant API data 635, the prompt data 615, the action plan data 442, the action response data 458a-n, the potential response data 443a-n, etc.) during one or more previous iterations of processing by the LLM orchestrator component 430 for the user input data 102.”, and Par. 0206:” Computer instructions for operating each device … executed by the respective device's controller(s)/processor(s) ... A device's computer instructions may be stored in a non-transitory manner in … Alternatively, some or all of the executable instructions may be embedded in hardware or firmware on the respective device in addition to or instead of software.”, and Par. 0215:” Aspects of the disclosed system may be implemented as a computer method or as an article of manufacture such as a memory device or non-transitory computer readable storage medium. The computer readable storage medium may be readable by a computer and may comprise instructions for causing a computer or other device to perform processes described ... “).
receiving, from a user device, an input comprising an initial prompt indicative of at least one task; (Papayiannis, Par. 0017:” A system may receive a user input as speech. For example, a user may speak an input to a device. The device may send audio data, representing the spoken input, to the system.”, and Par. 0038:”… user input of “How fast can they move,” the task selection prompt generation component 530 may generate example prompt data 115b:”, and Par. 0082:” … the user input data 102 is received at the plan prompt generation component 510. The plan prompt generation component 510 processes the user input data 102 to generate prompt data 515 representing a prompt for input to the plan generation language model 520. In some embodiments, the plan prompt generation component 510 may further receive an indication of one or more remaining tasks to be completed with respect to the user input data 102.”)
in lieu of communicating the initial prompt to a language model (LM): (a) determining a task from the input; (Papayiannis, Par. 0082:” … The plan prompt generation component 510 processes the user input data 102 to generate prompt data 515 representing a prompt for input to the plan generation language model 520.”, and Par. 0092:” The plan generation language model 520 processes the prompt data 515 to generate model output data 525 representing one or more predicted tasks to be completed in order to perform the action responsive to the user input. … In some embodiments, the threshold for determining the one or more tasks may be such that the plan generation language model 520 is encouraged to generate multiple predicted tasks for a given user input, where the system 100 may parse and filter the list of tasks during downstream processing (e.g., during the processing of the task selection language model 540). For example, based on processing the first example prompt data provided above, the plan generation language model 520 may output model output data 525d: {“turn on all of the lights except the garage light,” “turn on all lights,” “identify which garage light,” “turn on all lights then turn off garage light,” “turn on all lights where user is located,” “turn on kitchen lights, living room lights, dining room lights, hallways lights” “turn on all lights on first floor,” } or the like.”)
(b) based on the task, performing a semantic search for a plurality of candidate LM skills; (Papayiannis, Par. 0103:” … For example, the API shortlister component 620 may compare one or more APIs included in the index storage 630 to the user input or the current task to determine one or more APIs (top-k) that corresponds to the user input or the current task (e.g., APIs that are semantically similar to the user input or the current task, APIs that are capable of performing the current task, etc.). … representation of an API description for the API to determine whether the API is semantically similar to the user input or the current task. “, and Par. 0115:” For further example, the API provider component 650 may include a search component, which may be configured to query a storage (e.g., a database, repository, knowledge base, etc.) for information usable for generating a response to a user input. For example, if the action data 647a-n represents a request for information of “Who won the game between [Team 1 Name] and [Team 2 Name],” then the search component may query the storage (or other sources, such as the Internet), to retrieve the information “[Team 1 Name] won the game between [Team 1 Name] and [Team 2 Name].”, and Par.0117:” In some embodiments, the API provider component 650 may include a domain service [skill] component, which may be configured for interacting with one or more services [skills] defined by particular users, such as developers, specialists, or the like (e.g., to receive information, such as responses or annotations, to cause an action.”)
(c) in response to performing the semantic search, receiving the plurality of candidate LM skills, each candidate LM skill comprising a corresponding API description and a corresponding API; (Papayiannis, Par. 0103:” … the one or more relevant APIs from the index storage 630, which may store various information associated with multiple APIs such as API descriptions, API arguments (e.g., parameter inputs/outputs), … For example, the API shortlister component 620 may compare one or more APIs included in the index storage 630 to the user input or the current task to determine one or more APIs (top-k) that corresponds to the user input or the current task (e.g., APIs that are semantically similar to the user input or the current task, APIs that are capable of performing the current task, etc.). … representation of an API description for the API to determine whether the API is semantically similar to the user input or the current task. An API description may correspond to a description of the one or more function that the API is configured to perform and/or other information associated with the API (e.g., an API call formatting structure (e.g., including input parameters), … the API description may further include one or more exemplars associated with use of the API (e.g., an example user input, corresponding API call, and example API output). If the value of semantic similarity meets or exceeds a threshold, the API (and, optionally, the API description) may be included in the relevant API data 635. …”, and Par. 0118:” The API provider component 650, the LLM agent component 652, the skill component 654, and/or the TTS component 380 may send action response data 458a-n representing one or more potential responses generated by the one or more APIs corresponding to the action data 647a-n (e.g., the potential response(s) and/or potential action(s)) to the action plan execution component 445. 
For example, in response to an API call to the skill component 654 associated with a user input for turning on a light, the action response data 458a may correspond to a potential action of “turn on the light,” “turn_on_device (“light”, [device ID])”, or the like. For further example, in response to an API call to the skill component 654 associated with a user input for ordering a pizza from a particular restaurant, the action response data 458b may correspond to a potential action of “order medium pizza from [restaurant name]”, “order_pizza (“medium”, “pizza”, “[restaurant name]”)”, or the like.”)
(d) [[determining at least one target LM skill of the plurality of candidate LM skills]] based on an orchestration loop, the plurality of candidate LM skills, and the task; and (Papayiannis, Par. 0174:” … after completion of natural language understanding processing, after selection of a skill component to process with respect to the user input and prior to initiation of processing by the skill component, …”, and Par. 0076:” … various components, such as a large language model (LLM) orchestrator component 430, a personalized context component 465, and an action plan execution component 445. The LLM orchestrator component 430 may include a plan generation component 435, an LLM shortlister component 440, and a response arbitration component 460.”, and Par. 0077:”The user input data 102 may be received at the LLM orchestrator component 430 of the system component(s) 420, which may be configured to generate a list (e.g., one or more) of tasks (e.g., steps/actions) that are to be completed in order to perform an action responsive to the user input and select a task of the list of the tasks …”, and Par. 0137:” As such, the response language model 720 may select between the one or more potential responses from one or more different components (e.g., for the first example prompt data, the potential responses from the skill component A and the skill component B and, for the second example prompt data, the potential responses from Component A, API A, and API B) to determine that a subset of the potential responses are responsive to the user input. …”).
(e) generating an API call comprising the at least one target LM skill and an API parameter input into an API of the at least one target LM skill based on the task; and (Papayiannis, Par. 0108:” The shortlister language model 640 may generate the one or more APIs calls (including the required input parameters) by applying in-context learning for cold-starting APIs (e.g., one-shot/few-shot learning). For example, in embodiments where the relevant API data 635 includes the API descriptions, the shortlister language model 640 may use the one or more exemplars included in the API descriptions (included in the prompt data 615) to determine the one or more input parameters for the API call. In some embodiments, the shortlister language model 640 may be finetuned on such exemplars (e.g., during offline or runtime processing), such that the shortlister language model 640 is capable of determining the one or more input parameters for the given API call.”, and Par. 0118:” The API provider component 650, the LLM agent component 652, the skill component 654, and/or the TTS component 380 may send action response data 458a-n representing one or more potential responses generated by the one or more APIs corresponding to the action data 647a-n (e.g., the potential response(s) and/or potential action(s)) to the action plan execution component 445. For example, in response to an API call to the skill component 654 associated with a user input for turning on a light, the action response data 458a may correspond to a potential action of “turn on the light,” “turn_on_device (“light”, [device ID])”, or the like. For further example, in response to an API call to the skill component 654 associated with a user input for ordering a pizza …”).
executing the API call to generate at least a portion of a response to the initial prompt. (Papayiannis, Par. 0019: “… The LLM(s) use the prompt to generate a natural language response to the user input and … output the natural language response to the user.”, and Par. 0078: “… The action plan execution component 445 may identify the request(s) in the action plan data 442, generate executable API calls corresponding to the request(s), and cause the corresponding components (e.g., the API provider component 650, the LLM agent component 652, the skill component 654, …) to generate action response data 458a-n representing the requested potential response(s), where individual action response data 458a may be provided by/correspond to a particular responding component—one of the API provider component 650, the LLM agent component 652, the skill component 654, ....”, and Par. 0109: “… Action data 647a may represent, for example, an instruction (e.g., an executable API call determined from/generated based on the action plan data 442) for a particular API to process with respect to the user input and/or the current task.”, and Par. 0118: “… For example, in response to an API call to the skill component 654 associated with a user input for turning on a light, the action response data 458a may correspond to a potential action of “turn on the light,” “turn_on_device (“light”, [device ID])”, or the like.”)
Papayiannis does not teach; however, Mandlekar teaches determining at least one target LM skill of the plurality of candidate LM skills (Mandlekar, Par. 0019: “… systems implementing one or more language models-such as one or more large language models (LLMs), systems …”, and Par. 0033: “… system 220 may select the first skill 242 and the third skill 246 from a plurality of skills or sets of instructions. …”).
Mandlekar is considered to be analogous to the claimed invention because it is in the same field of endeavor. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Papayiannis with Mandlekar. As implied in Mandlekar (Par. 0003), one of ordinary skill would have been motivated to combine the teachings because doing so would create more accurate, robust, and specialized systems.
Regarding claim 17, Papayiannis, as modified above, teaches wherein the operations further comprise determining a second task from the input, (Papayiannis, Par. 0079:” … For example, the potential response data 443a-n may include a first potential response from a first component configured to perform a first task determined by the plan generation component 435, a second potential response from a second component configured to perform a second task determined by the plan generation component 435, etc.”, and Par. 0151:” … the system 100 may begin processing with respect to a second task associated with the user input.”)
wherein at least (b), (c), (d), and (e) are further performed based on the second task.
(b) based on the task, performing a semantic search for a plurality of candidate LM skills; (Papayiannis, Par. 0103:” … For example, the API shortlister component 620 may compare one or more APIs included in the index storage 630 to the user input or the current task to determine one or more APIs (top-k) that corresponds to the user input or the current task (e.g., APIs that are semantically similar to the user input or the current task, APIs that are capable of performing the current task, etc.). … representation of an API description for the API to determine whether the API is semantically similar to the user input or the current task. “, and Par. 0115:” For further example, the API provider component 650 may include a search component, which may be configured to query a storage (e.g., a database, repository, knowledge base, etc.) for information usable for generating a response to a user input. For example, if the action data 647a-n represents a request for information of “Who won the game between [Team 1 Name] and [Team 2 Name],” then the search component may query the storage (or other sources, such as the Internet), to retrieve the information “[Team 1 Name] won the game between [Team 1 Name] and [Team 2 Name].”, and Par.0117:” In some embodiments, the API provider component 650 may include a domain service [skill] component, which may be configured for interacting with one or more services [skills] defined by particular users, such as developers, specialists, or the like (e.g., to receive information, such as responses or annotations, to cause an action.”)
(c) in response to performing the semantic search, receiving the plurality of candidate LM skills, each candidate LM skill comprising a corresponding API description and a corresponding API; (Papayiannis, Par. 0103:” … the one or more relevant APIs from the index storage 630, which may store various information associated with multiple APIs such as API descriptions, API arguments (e.g., parameter inputs/outputs), … For example, the API shortlister component 620 may compare one or more APIs included in the index storage 630 to the user input or the current task to determine one or more APIs (top-k) that corresponds to the user input or the current task (e.g., APIs that are semantically similar to the user input or the current task, APIs that are capable of performing the current task, etc.). … representation of an API description for the API to determine whether the API is semantically similar to the user input or the current task. An API description may correspond to a description of the one or more function that the API is configured to perform and/or other information associated with the API (e.g., an API call formatting structure (e.g., including input parameters), … the API description may further include one or more exemplars associated with use of the API (e.g., an example user input, corresponding API call, and example API output). If the value of semantic similarity meets or exceeds a threshold, the API (and, optionally, the API description) may be included in the relevant API data 635. …”, and Par. 0118:” The API provider component 650, the LLM agent component 652, the skill component 654, and/or the TTS component 380 may send action response data 458a-n representing one or more potential responses generated by the one or more APIs corresponding to the action data 647a-n (e.g., the potential response(s) and/or potential action(s)) to the action plan execution component 445. 
For example, in response to an API call to the skill component 654 associated with a user input for turning on a light, the action response data 458a may correspond to a potential action of “turn on the light,” “turn_on_device (“light”, [device ID])”, or the like. For further example, in response to an API call to the skill component 654 associated with a user input for ordering a pizza from a particular restaurant, the action response data 458b may correspond to a potential action of “order medium pizza from [restaurant name]”, “order_pizza (“medium”, “pizza”, “[restaurant name]”)”, or the like.”)
(d) [[determining at least one target LM skill of the plurality of candidate LM skills]] based on an orchestration loop, the plurality of candidate LM skills, and the task; and (Papayiannis, Par. 0174:” … after completion of natural language understanding processing, after selection of a skill component to process with respect to the user input and prior to initiation of processing by the skill component, …”, and Par. 0076:” … various components, such as a large language model (LLM) orchestrator component 430, a personalized context component 465, and an action plan execution component 445. The LLM orchestrator component 430 may include a plan generation component 435, an LLM shortlister component 440, and a response arbitration component 460.”, and Par. 0077:”The user input data 102 may be received at the LLM orchestrator component 430 of the system component(s) 420, which may be configured to generate a list (e.g., one or more) of tasks (e.g., steps/actions) that are to be completed in order to perform an action responsive to the user input and select a task of the list of the tasks …”, and Par. 0137:” As such, the response language model 720 may select between the one or more potential responses from one or more different components (e.g., for the first example prompt data, the potential responses from the skill component A and the skill component B and, for the second example prompt data, the potential responses from Component A, API A, and API B) to determine that a subset of the potential responses are responsive to the user input. …”).
(e) generating an API call comprising the at least one target LM skill and an API parameter input into an API of the at least one target LM skill based on the task. (Papayiannis, Par. 0108:” The shortlister language model 640 may generate the one or more APIs calls (including the required input parameters) by applying in-context learning for cold-starting APIs (e.g., one-shot/few-shot learning). For example, in embodiments where the relevant API data 635 includes the API descriptions, the shortlister language model 640 may use the one or more exemplars included in the API descriptions (included in the prompt data 615) to determine the one or more input parameters for the API call. In some embodiments, the shortlister language model 640 may be finetuned on such exemplars (e.g., during offline or runtime processing), such that the shortlister language model 640 is capable of determining the one or more input parameters for the given API call.”, and Par. 0118:” The API provider component 650, the LLM agent component 652, the skill component 654, and/or the TTS component 380 may send action response data 458a-n representing one or more potential responses generated by the one or more APIs corresponding to the action data 647a-n (e.g., the potential response(s) and/or potential action(s)) to the action plan execution component 445. For example, in response to an API call to the skill component 654 associated with a user input for turning on a light, the action response data 458a may correspond to a potential action of “turn on the light,” “turn_on_device (“light”, [device ID])”, or the like. For further example, in response to an API call to the skill component 654 associated with a user input for ordering a pizza …”).
Papayiannis, as modified above, does not teach; however, Mandlekar further teaches determining at least one target LM skill of the plurality of candidate LM skills (Mandlekar, Par. 0019: “… systems implementing one or more language models-such as one or more large language models (LLMs), systems …”, and Par. 0033: “… system 220 may select the first skill 242 and the third skill 246 from a plurality of skills or sets of instructions. …”).
Regarding claim 18, Papayiannis, as modified above, teaches wherein the orchestration loop comprises computations for prompting a large language model (LLM) to select the at least one target LM skill based on the at least one task associated with the input. (Papayiannis, Par. 0019:” … The LLM(s) receive a prompt including a user input, …”, and Par. 0174:” … after selection of a skill component to process with respect to the user input and prior to initiation of processing by the skill component, …”, and Par. 0076:” … The LLM orchestrator component 430 may include a plan generation component 435, an LLM shortlister component 440, and a response arbitration component 460.”, and Par. 0077:”The user input data 102 may be received at the LLM orchestrator component 430 of the system component(s) 420, which may be configured to generate a list (e.g., one or more) of tasks (e.g., steps/actions) that are to be completed in order to perform an action responsive to the user input and select a task of the list of the tasks …”, and Par. 0135:” If the response language model 720 determines that one or more of the potential responses [implies proper skill selection] are responsive to the user input, the response language model 720 … For example, based on processing the first example prompt data above, the response language model 720 may select one of the potential responses (e.g., the potential responses from skill component A (e.g., a weather skill component)) determined to be responsive to the user input to generate the model output natural language data 125 and the model output prosody data 135a: {“It is currently 70 degrees, with a high of 75 and a low of 68,” } or the like. 
For further example, based on processing the first example prompt data provided above, the response language model 720 may select more than one of the potential responses (e.g., the potential responses from both the skill component A and skill component B) determined to be responsive to the user input and generate a summary of the selected responses to generate the model output natural language data 125b: {“It is expected to be mostly sunny today, with a high of 75 and a low of 68, but with a chance of rain in the late afternoon,” } or the like.”) Note: The prompt is the request to check the weather condition, and the appropriate (weather) skill is selected, as evidenced by the responsive output. The task of retrieving the weather report is implied by the prompt.
Regarding claim 20, Papayiannis, as modified above, teaches wherein the initial prompt comprises a user request to an LLM, wherein the initial prompt further comprises contextual information associated with at least one of the initial prompt, the user request, the user device, or a user profile associated with a user. (Papayiannis, Par. 0082:” … the user input [request] data 102 is received at the plan prompt generation component 510. The plan prompt generation component 510 processes the user input data 102 to generate prompt data 515 representing a prompt for input to the plan generation language model 520. … The plan prompt generation component 510 may further receive the context data 105 representing the various contextual signals associated with the user input data 102, such as weather information, time of day, device information associated with the device that sent the user input data 102 (e.g., device ID, device states, historical device interaction data, etc.). Such prompt data 515 may be generated based on combining the user input data 102 and the context data 105 … In some embodiments, the prompt data 515 may be generated further based on the personalized context data 467.”, and Par. 0126:” The personalized context data 467 may represent one or more contextual signals associated with the user 405, such as information associated with a user profile of the user 405 …”).
Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Papayiannis and Mandlekar, and further in view of Suwandy.
Regarding claim 19, Papayiannis teaches the one or more computer storage media of claim 16.
Papayiannis, as modified above, does not teach; however, Suwandy teaches wherein the orchestration loop is run until at least one of: a threshold quantity of loops is reached or until the at least one target LM skill has a threshold level of relatedness to the input. (Suwandy, Par. 0054: “In this example, intent type A 218 is associated with a plurality of skills (skill A 230, skill B 232, skill C 234). A bot developer for the corresponding conversational bot may have provided exemplary natural language inputs for targeting each of those skills. Those exemplary natural language inputs may have been embedded and added to the embedding library. Thus, a similarity score [relatedness] may also be calculated for each string embedding from natural language input 202 and each of those skills. The scores are illustrated as skill A score 236, skill B score 238, and skill C score 240. If a similarity score for an embedding for any of the embedded strings exceeds a threshold value for any of skill A 230, skill B 232, and/or skill C 234, the corresponding skill may be executed and/or a response corresponding to the skill may be generated.”)
Suwandy is considered to be analogous to the claimed invention because it is in the same field of endeavor. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Papayiannis, as modified above, further in view of Suwandy such that the orchestration loop is run until the at least one target LM skill has a threshold level of relatedness to the input. Motivation to do so would be to improve the service and/or to improve intent type and/or skill type identification (Suwandy, Par. 0028).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure. El Hattami et al. (US20230409956A1) teaches in Par. 0026:” … text-to-API converter 112 converts a text output of text-to-text model 110 to an API format. In various embodiments, text-to-API converter 112 uses a defined one-to-one mapping from step descriptions to API calls. In various embodiments, all available steps are enumerated and known by text-to-API converter 112. Stated alternatively, in various embodiments, a list of step IDs corresponding to a list of available steps that can be outputted by text-to-text model 110 is kept by text-to-API converter 112 and mapped one-to-one to a list of API calls. In the example illustrated, additional steps unit 100 does not create the final flows, but rather outputs API calls to flow builder application 114 to complete the conversion of the model output to actual steps of flows. ...”
Examiner's Note: Examiner has cited particular columns and line numbers and/or paragraph numbers in the references applied to the claims above for the convenience of the applicant. Although the specified citations are representative of the teachings of the art and are applied to specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested from the applicant in preparing responses, to fully consider the references in entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner.
In the case of amending the Claimed invention, Applicant is respectfully requested to indicate the portion(s) of the specification which dictate(s) the structure relied on for proper interpretation and also to verify and ascertain the metes and bounds of the claimed invention.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DARIOUSH AGAHI, P.E. whose telephone number is (408)918-7689. The examiner can normally be reached Monday - Thursday and alternate Fridays, 7:30-4:30 PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
DARIOUSH AGAHI, P.E.
Primary Examiner
/DARIOUSH AGAHI/Primary Examiner, Art Unit 2656