DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
The Amendment filed on 12/19/2025 has been entered. Claims 8 and 19 have been cancelled. Claims 1-7, 9-18, and 20-21 remain pending in the application.
Response to Arguments
Applicant’s arguments filed 12/19/2025 have been fully considered but they are not persuasive.
With respect to the 35 U.S.C. 101 rejection, on pages 11-12, the Applicant asserts that the claims, as amended, are not directed to a mental process or an abstract idea. The Applicant asserts that the claims, as amended, integrate any alleged judicial exceptions into a practical application. The Applicant provides an example of a practical application represented by amended claim 1, which includes a conversation conducted using a chatbot. The Applicant also asserts that the claims, as amended, include additional elements that improve a technical field and that apply or use any alleged judicial exception in a meaningful way such that amended claim 1 as a whole is more than a drafting effort designed to monopolize the exception. The Applicant states that the amended claims improve the technical field of chatbots by receiving the utterance and using generative artificial intelligence.
The Examiner respectfully disagrees. The Applicant merely restates the claim language without specifically identifying which elements, or explaining how each limitation, amount to significantly more. The amended claim, taken as a whole, is simply the formulation of a plan to solve a problem, which can easily be performed by a human with pen and paper, save for the recitation of generic computer components. Receiving an utterance and choosing to use generative artificial intelligence does not improve the technical field of chatbots in a practical manner. The Applicant has not provided any reasoning or evidence as to why the noted individual limitations are not mental activities. The Examiner has considered all of the limitations noted by the Applicant to be part of the abstract idea, i.e., mental activities. The Examiner also notes, in the rejection set forth below, that the claims recite only the additional limitations of “a first generative artificial intelligence model” and “a second generative artificial intelligence model”. These elements, as stated below, are general purpose computing elements, and mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Hence, the Applicant’s arguments are not persuasive.
With respect to the 35 U.S.C. 103 rejection of claims 1-20, pages 13-14, under Liang et al. ("TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs", 03/29/2023), hereinafter referred to as Liang, in view of Lu et al. ("Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models", 05/24/2023), hereinafter referred to as Lu, the Applicant asserts that the cited art fails to teach or suggest the amended claims. The Applicant asserts that Liang does not disclose “generating, by a first generative artificial intelligence model using the input prompt, an execution plan for executing one or more requests represented by the natural language utterance, wherein generating the execution plan comprises: determining, based on [[ the ]] one or more potential agents and associated actions, one or more agents and one or more actions associated with the one or more agents that can service the one or more requests, and generating a structured output for the execution plan by creating an ordered list that comprises the one or more actions for executing the one or more requests” with respect to claim 1.
In response to the argument that Liang, in view of Lu, does not disclose or suggest “generating, by a first generative artificial intelligence model using the input prompt, an execution plan for executing one or more requests represented by the natural language utterance, wherein generating the execution plan comprises: determining, based on [[ the ]] one or more potential agents and associated actions, one or more agents and one or more actions associated with the one or more agents that can service the one or more requests, and generating a structured output for the execution plan by creating an ordered list that comprises the one or more actions for executing the one or more requests”, the Examiner notes that Liang Figure 1 pg. 3 shows a user instruction stemming from a conversational context being input into the Multimodal Conversational Foundation Model (MCFM). This model outputs a solution outline, i.e., an execution plan, to the API selector, which then communicates back with the MCFM to finalize the solution outline, which is then output as the action sequence. The API selector searches for specific agents that are able to perform the user instructions and aligns these agents with the MCFM and thereby the user. This is the action sequence, i.e., an ordered list, that is then executed to perform the original actions requested by the user.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-7, 9-18, and 20-21 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Independent claims 1, 10, and 16 recite “receiving a natural language utterance”, “constructing an input prompt comprising the natural language utterance received from a user”, “generating, by a first generative artificial intelligence model using the input prompt, an execution plan”, “determining one or more agents and one or more actions associated with the one or more agents that can service the one or more requests”, “generating a structured output for the execution plan”, “executing the execution plan”, “triggering performance of the one or more actions”, “receiving one or more outputs”, “generating, by a second generative artificial intelligence model using the one or more outputs, a response”, and “providing the response”. These limitations, as drafted, are a process that, under a broadest reasonable interpretation, covers the abstract idea of “mental processes” because they cover concepts performed in the human mind, including observation, evaluation, judgment, and opinion. See MPEP 2106.04(a)(2). That is, other than reciting “a first generative artificial intelligence model” and “a second generative artificial intelligence model”, nothing in the claimed elements precludes the steps from being practically performed by a person taking in an input from another person, writing out an execution plan, figuring out what other parties can execute said plan, executing the written out plan, and receiving and then outputting the final response to the original input.
This judicial exception is not integrated into a practical application because the additional elements “a first generative artificial intelligence model” and “a second generative artificial intelligence model” are generic computer components and are recited at a high level of generality. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Thus, the claims as a whole are directed to an abstract idea (Step 2A, prong two).
Claims 1, 10, and 16 do not include any additional elements that are sufficient to amount to significantly more than the judicial exception because, as discussed above with respect to integration of the abstract idea into a practical application, the additional elements of “a first generative artificial intelligence model” and “a second generative artificial intelligence model” amount to no more than mere instructions to apply the exception using generic computer components. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept (Step 2B).
Dependent claims 2-7, 9, 11-15, 17, 18, 20, and 21 are directed to further abstract details relating to the input prompt and natural language utterance, the agents, and the execution plan. These limitations are also related to the abstract idea of “mental processes”. That is, nothing in the claimed elements precludes the steps from being practically performed by a person taking in an input from another person, writing out an execution plan, figuring out what other parties can execute said plan, executing the written out plan, and receiving and then outputting the final response to the original input.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-7, 9-18, and 20-21 are rejected under 35 U.S.C. 103 as being unpatentable over Liang et al. ("TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs", 03/29/2023), hereinafter referred to as Liang, in view of Lu et al. ("Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models", 05/24/2023), hereinafter referred to as Lu.
Regarding claim 1, Liang discloses a computer-implemented method comprising: receiving a natural language utterance (Liang Fig. 1 shows a user instruction from a conversational context being input into the multimodal conversational foundation model (MCFM));
constructing an input prompt comprising [[ a ]] the natural language utterance (Liang Figure 1 pg. 3, user instruction is received by the multimodal conversational foundation model (MCFM) and Liang Figure 2 pg. 6, this instruction is dialogue based, i.e. a natural language instruction);
generating, by a first generative artificial intelligence model using the input prompt, an execution plan for executing one or more requests represented by the natural language utterance (Liang Figure 1 pg. 3, the MCFM outputs a solution outline),
wherein generating the execution plan comprises: determining, based on [[ the ]] one or more potential agents and associated actions, one or more agents and one or more actions associated with the one or more agents that can service the one or more requests (Liang Figure 1 pg. 3, the API selector chooses the most relevant APIs according to the solution outline steps where each API is associated with an action in the solution outline),
and generating a structured output for the execution plan by creating an ordered list that comprises the one or more actions for executing the one or more requests (Liang Figure 1 pg. 3, the MCFM outputs an action sequence in the form of a list using the most relevant APIs);
executing the execution plan to perform the one or more actions using the one or more agents (Liang Figure 1 pg. 3, the most relevant APIs are executed by calling APIs), wherein executing the execution plan comprises: triggering performance of the one or more actions by the one or more agents (Liang Figure 1 pg. 3, the most relevant APIs are executed by calling APIs), and receiving one or more outputs from performance of the one or more actions by the one or more agents ("After the execution, the action executor will return the results to users," Liang 2.5 pg. 4);
and providing the response ("After the execution, the action executor will return the results to users," Liang 2.5 pg. 4).
However, Liang fails to disclose generating, by a second generative artificial intelligence model using the one or more outputs, a response to the natural language utterance. Lu teaches a plug-and-play compositional reasoning framework that uses external tools to address a broad range of tasks.
Lu teaches generating, by a second generative artificial intelligence model using the one or more outputs, a response to the natural language utterance (Lu Figure 1 pg. 1 shows that it is known within the art to use a generative AI model to generate final answers in a plan-based task execution system).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Liang’s method of completing tasks by connecting the main AI model to a multitude of APIs by incorporating Lu’s method of using multiple models to complete various steps in a plan-based execution system. Using multiple models, each specialized for a specific task, in this case using a specific model as the final answer generator for the requests, would improve the efficiency and accuracy of the process at each step, and thereby the efficiency and accuracy of the system as a whole.
Regarding claim 2, Liang, in view of Lu, discloses all of the limitations of claim 1. Liang further discloses wherein constructing the input prompt comprises: executing, using the natural language utterance, a semantic search on descriptions associated with [[ the ]] available agents and actions in a data store ("Since the API platform may have millions of APIs, the API selector needs the search capability to retrieve semantically relevant APIs," Liang 2.4 pg. 4 and "Each package corresponds to a specific domain," Liang 2.4 pg. 4);
identifying, based on a semantic search, the one or more candidate agents and associated actions ("Since the API platform may have millions of APIs, the API selector needs the search capability to retrieve semantically relevant APIs," Liang 2.4 pg. 4);
and constructing a natural language representation for the input prompt by appending the one or more candidate agents and associated actions to the natural language utterance (Liang Figure 1 pg. 3, the API selector chooses the most relevant APIs, which is fed back into the MCFM, which is then output in the form of an action sequence and "Next, MCFM generates action codes using the recommended APIs," Liang Figure 1 pg. 3).
Regarding claim 3, Liang, in view of Lu, discloses all of the limitations of claim 2. Liang further discloses wherein: the natural language utterance is a continuation or subsequent utterance within a conversation (Liang Figure 2 pg. 6, the user is having a dialogue (conversation) with the system),
the input prompt further comprises: (iii) conversation history and actions executed prior to the natural language utterance (Liang Figure 2 pg. 6, shows the user altering previous actions executed by the system and Liang 2.1 Formula (1) uses "the conversational context, denoted as C"), and
constructing the input prompt comprises accessing the conversation history and the actions executed prior to the natural language utterance, and constructing the natural language representation for the input prompt by appending the one or more candidate agents, the associated actions, and the conversation history and the actions executed prior to the natural language utterance to the natural language utterance (Liang Figure 2 pg. 6 shows the user having a dialogue and altering previous actions executed by the system and Liang Figure 1 pg. 3 shows the feedback being fed back into the MCFM, being then used to create the solution outline and thereby the action sequence. This implies that the conversation history and actions executed prior are used in creating future solution outlines and action sequences).
Regarding claim 4, Liang, in view of Lu, discloses all of the limitations of claim 3. Liang further discloses wherein: the one or more agents are a plurality of agents and the one or more actions are a plurality of actions (Liang Figure 1 pg. 3, the API selector chooses the most relevant APIs according to the solution outline steps where each API is associated with an action in the solution outline),
a first subset of the plurality of agents and the plurality of actions are in a first state and a second subset of the plurality of agents and the plurality of actions are in a second state (Liang Figure 1 pg. 3, the API selector chooses the most relevant APIs according to the solution outline steps, where each API is associated with an action in the solution outline; this is dependent upon the specific APIs being called, and it is well known in the art that some APIs require a login/registration before use),
the first state is a ready-for-execution state, and the second state is a not-ready-for-execution state where additional information is required prior to execution of one or more actions within the second subset of the plurality of agents and the plurality of actions (Liang Figure 1 pg. 3, the API selector chooses the most relevant APIs according to the solution outline steps, where each API is associated with an action in the solution outline; this is dependent upon the specific APIs being called, and it is well known in the art that some APIs require a login/registration before use).
Regarding claim 5, Liang, in view of Lu, discloses all of the limitations of claim 1. Liang further discloses wherein: executing the execution plan further comprises accessing contextual information that is needed by at least one of the one or more agents for performing at least one of the one or more actions (Liang Figure 1 pg. 3 shows conversational context being input to the MCFM, using that to generate the solution outline and thereby the action sequence);
triggering the performance of the one or more actions comprises forwarding one or more requests for performance of the one or more actions to the one or more agents (Liang Figure 1 pg. 3, the API selector chooses the most relevant APIs according to the solution outline steps where each API is associated with an action in the solution outline, which implies calling upon the use of the selected APIs);
and a request of the one or more requests being forwarded for performance of the at least one of the one or more actions includes the contextual information (Liang Figure 1 pg. 3 shows conversational context being input to the MCFM, using that to generate the solution outline, which feeds into the API selector, and thereby generate the action sequence).
Regarding claim 6, Liang, in view of Lu, discloses all of the limitations of claim 1. Liang further discloses wherein: the one or more agents are a plurality of agents, the one or more actions are a plurality of actions, and the one or more requests are a plurality of requests (Liang Figure 1 pg. 3, the API selector chooses the most relevant APIs according to the solution outline steps where each API is associated with an action in the solution outline),
generating the execution plan further comprises determining whether one or more dependencies exist between the plurality of actions, and when the one or more dependencies exist, the ordered list is created to comprise the plurality of agents, the plurality of actions for executing the one or more requests, and an indication of the one or more dependencies ("Developers who offer a package of APIs could provide composition instructions. This can serve as guidance to the model on how to combine multiple APIs to accomplish complex user instructions," Liang 2.3 pg. 4),
when the execution plan comprises the indication of the one or more dependencies, the performance of the one or more actions by the one or more agents is triggered via serial processing ("Since this is a complex instruction, TaskMatrix.AI must break it down into roughly 25 API calls to complete the task," Liang 4.2 pg. 20 and Figure 11 pg. 18-19; these actions must inherently be ordered sequentially),
when the execution plan does not comprise the indication of the one or more dependencies, the performance of the one or more actions by the one or more agents is triggered via parallel processing ("TaskMatrix.AI uses an action executor to run various APIs, ranging from simple HTTP requests to complex algorithms or AI models that need multiple input parameters," Liang 2.5 pg. 4, it would be obvious to run APIs in parallel, as it is well-known in the art and would save time and computational power),
and the response is an aggregate response comprising a plurality of responses to the plurality of requests within the natural language utterance ("After the execution, the action executor will return the results to users," Liang 2.5 pg. 4; it would have been obvious to return results responsive to the plurality of requests).
Regarding claim 7, Liang, in view of Lu, discloses all of the limitations of claim 1. Liang further discloses wherein: the natural language utterance is received in a conversation with a chatbot (Liang Figure 2 pg. 6, the user is having a dialogue (conversation) with the system and Liang 2.1 Formula (1) uses "the conversational context, denoted as C" and Liang Fig. 1 shows a user instruction from a conversational context being input into the multimodal conversational foundation model (MCFM));
the natural language utterance is a continuation or subsequent utterance within [[ a ]] the conversation [[ and ]] (Liang Figure 2 pg. 6, the user is having a dialogue (conversation) with the system and Liang 2.1 Formula (1) uses "the conversational context, denoted as C");
the response to the natural language utterance is generated by the generative artificial intelligence model using the one or more outputs, the natural language utterance, and a conversation history for the conversation (Liang Figure 2 pg. 6, the user is having a dialogue (conversation) with the system and Liang 2.1 Formula (1) uses "the conversational context, denoted as C");
and the response is transmitted as a response from the chatbot ("After the execution, the action executor will return the results to users," Liang 2.5 pg. 4).
However, Liang does not disclose the second generative artificial intelligence model.
Lu teaches the second generative artificial intelligence model (Lu Figure 1 pg. 1 shows that it is known within the art to use a generative AI model to generate final answers in a plan-based task execution system).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Liang’s method of completing tasks by connecting the main AI model to a multitude of APIs by incorporating Lu’s method of using multiple models to complete various steps in a plan-based execution system. Using multiple models, each specialized for a specific task, in this case using a specific model as the final answer generator for the requests, would improve the efficiency and accuracy of the process at each step, and thereby the efficiency and accuracy of the system as a whole.
Regarding claim 9, Liang, in view of Lu, discloses all of the limitations of claim 1. Liang further discloses wherein the first generative artificial intelligence model is the same model as the second generative artificial intelligence model (Liang Figure 1 pg. 3 shows the same model outputting a response).
As to claim 10, system claim 10 and method claim 1 are related as method and system of using same, with each claimed element’s function corresponding to the method step. Accordingly, claim 10 is similarly rejected under the same rationale as applied above with respect to the method claim.
As to claim 11, system claim 11 and method claim 2 are related as method and system of using same, with each claimed element’s function corresponding to the method step. Accordingly, claim 11 is similarly rejected under the same rationale as applied above with respect to the method claim.
As to claim 12, system claim 12 and method claim 3 are related as method and system of using same, with each claimed element’s function corresponding to the method step. Accordingly, claim 12 is similarly rejected under the same rationale as applied above with respect to the method claim.
As to claim 13, system claim 13 and method claim 4 are related as method and system of using same, with each claimed element’s function corresponding to the method step. Accordingly, claim 13 is similarly rejected under the same rationale as applied above with respect to the method claim.
As to claim 14, system claim 14 and method claim 5 are related as method and system of using same, with each claimed element’s function corresponding to the method step. Accordingly, claim 14 is similarly rejected under the same rationale as applied above with respect to the method claim.
As to claim 15, system claim 15 and method claim 6 are related as method and system of using same, with each claimed element’s function corresponding to the method step. Accordingly, claim 15 is similarly rejected under the same rationale as applied above with respect to the method claim.
As to claim 16, computer-readable medium (CRM) claim 16 and method claim 1 are related as method and CRM of using same, with each claimed element’s function corresponding to the method step. Accordingly, claim 16 is similarly rejected under the same rationale as applied above with respect to the method claim.
As to claim 17, CRM claim 17 and method claim 2 are related as method and CRM of using same, with each claimed element’s function corresponding to the method step. Accordingly, claim 17 is similarly rejected under the same rationale as applied above with respect to the method claim.
As to claim 18, CRM claim 18 and method claim 7 are related as method and CRM of using same, with each claimed element’s function corresponding to the method step. Accordingly, claim 18 is similarly rejected under the same rationale as applied above with respect to the method claim.
As to claim 20, CRM claim 20 and method claim 10 are related as method and CRM of using same, with each claimed element’s function corresponding to the method step. Accordingly, claim 20 is similarly rejected under the same rationale as applied above with respect to the method claim.
Regarding claim 21, Liang, in view of Lu, discloses all of the limitations of claim 1. Liang further discloses wherein the execution plan comprises computer-executable instructions (“It should be able to take multimodal inputs and contexts (such as text, image, video, audio, and code) and generate executable codes based on APIs that can complete specific tasks,” Liang pg. 3),
wherein executing the execution plan comprises executing the computer-executable instructions to: transmit data to the one or more agents to facilitate performance of the one or more actions to generate data to provide to the generative artificial intelligence model (Liang Fig. 8 shows a conversational chatbot requesting extra data to perform the user’s requests);
and receive the one or more outputs from the performance of the one or more actions by the one or more agents ("After the execution, the action executor will return the results to users," Liang 2.5 pg. 4).
However, Liang does not disclose the second generative artificial intelligence model.
Lu teaches the second generative artificial intelligence model (Lu Figure 1 pg. 1 shows that it is known within the art to use a generative AI model to generate final answers in a plan-based task execution system).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Liang’s method of completing tasks by connecting the main AI model to a multitude of APIs by incorporating Lu’s method of using multiple models to complete various steps in a plan-based execution system. Using multiple models, each specialized for a specific task, in this case using a specific model as the final answer generator for the requests, would improve the efficiency and accuracy of the process at each step, and thereby the efficiency and accuracy of the system as a whole.
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ADAM MICHAEL WEAVER whose telephone number is (571)272-7062. The examiner can normally be reached Monday-Friday, 8AM-5PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil can be reached at (571) 272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ADAM MICHAEL WEAVER/Examiner, Art Unit 2658
/RICHEMOND DORVIL/Supervisory Patent Examiner, Art Unit 2658