Prosecution Insights
Last updated: April 19, 2026
Application No. 19/070,120

INTERMEDIARY ROUTING AND MODERATION PLATFORM FOR GENERATIVE ARTIFICIAL INTELLIGENCE SYSTEM INTERFACING

Final Rejection (§101, §103)
Filed: Mar 04, 2025
Examiner: MAHMOOD, REZWANUL
Art Unit: 2159
Tech Center: 2100 — Computer Architecture & Software
Assignee: Target Brands Inc.
OA Round: 2 (Final)

Grant Probability: 46% (Moderate)
OA Rounds: 3-4
To Grant: 4y 5m
With Interview: 81%

Examiner Intelligence

Career Allow Rate: 46% of resolved cases (186 granted / 402 resolved; -8.7% vs TC avg)
Interview Lift: +34.7% among resolved cases with interview (a strong lift)
Typical Timeline: 4y 5m average prosecution; 31 applications currently pending
Career History: 433 total applications across all art units

Statute-Specific Performance

§101: 18.9% (-21.1% vs TC avg)
§103: 54.8% (+14.8% vs TC avg)
§102: 9.0% (-31.0% vs TC avg)
§112: 12.1% (-27.9% vs TC avg)
Black line = Tech Center average estimate • Based on career data from 402 resolved cases
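For readers checking the dashboard arithmetic, the headline allow rate follows directly from the reported counts. The sketch below is illustrative only; the raw counts (186 granted, 402 resolved) and the -8.7% delta come from this page, while the implied Tech Center average is back-computed from them.

```python
# Derive the dashboard's allow-rate figures from the raw counts shown above.
granted, resolved = 186, 402

allow_rate = granted / resolved * 100   # career allow rate, percent
print(f"{allow_rate:.1f}%")             # 46.3%, displayed rounded as 46%

# The page reports -8.7% vs the Tech Center average, which implies:
tc_avg = allow_rate + 8.7
print(f"{tc_avg:.1f}%")                 # implied TC average estimate
```

The 46% headline figure is therefore the rounded value of 186/402, and the Tech Center comparison is an additive delta on that rate.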

Office Action

Grounds of rejection: §101, §103
DETAILED ACTION

This Office action is in response to the communication filed on March 16, 2026. Claims 1-20 are currently pending.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments

Applicant's arguments filed on March 16, 2026 have been fully considered but they are not persuasive for the following reasons. Applicant in pages 8-11 of the Remarks argues that claims 1-20 are not directed to an abstract idea under §101 because (1) the claim limitations are computer-centric improvements to AI system operation, not mental processes or abstract data manipulation; (2) even assuming that the claim involves an abstract idea, the claim as a whole is integrated into a practical application that improves the functioning of generative AI computing systems; and (3) even if the claim is directed to an abstract idea, the claim nonetheless recites an inventive concept rooted in technical improvements to AI system operation, not the abstract idea itself. Examiner respectfully disagrees. It is important to note that the judicial exception alone cannot provide the improvement; the improvement can be provided by one or more additional elements (MPEP 2106.05(a)). Independent claims 1, 10, and 16 cover several steps, such as the identify and generate steps in claim 1, the identifying, generating, and determining steps in claim 10, and the determining and generating steps in claim 16, that recite an abstract idea within the “Mental Processes” grouping of abstract ideas, because a person can mentally, or using a pen and paper, perform the limitations recited in said steps, which are discussed in detail in the §101 rejection below.
The claims do not provide any limitations that are directed to a specific improvement in computer technology, because the limitations argued by the Applicant as being directed to a specific improvement are all recited in the claims as limitations that have been identified as abstract ideas. The remaining steps in the claims that are identified as reciting additional elements only add insignificant extra-solution activity to the judicial exception, and are recognized as well-understood, routine, and conventional activity within the field of computer functions, which is not sufficient to amount to significantly more than the judicial exception and is not directed to any specific improvement in computer technology. Accordingly, the additional elements, individually or in combination, do not integrate the abstract idea into a practical application, even viewing the claims as a whole, because they do not impose any meaningful limits on practicing the abstract idea.

Applicant in pages 11-13 of the Remarks argues that the cited prior art references Hintz and Paul do not teach or even suggest the features "identify, based at least in part on historical response quality score generated for prior responses to prior input queries, an LLM-based generative AI system from a plurality of LLM-based generative AI systems to invoke to respond to the input query", as recited in amended independent claim 1 and similarly recited in amended independent claims 10 and 16. Applicant's arguments with respect to amended independent claims 1, 10, and 16 have been considered but are moot in view of new grounds of rejection. For the above reasons, Examiner states that the rejection of the current Office action is proper.

Claim Rejections - 35 USC § 101

35 U.S.C.
101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

At step 1: Independent claims 1, 10, and 16 respectively recite a routing and moderation platform, a routing and moderation platform, and a method, which are directed to a statutory category such as a process, machine, or article of manufacture.

At step 2A, prong one: Independent claim 1 recites the limitations:

“identify, based at least in part on historical response quality score generated for prior responses to prior input queries, an LLM-based generative AI system from a plurality of LLM-based generative AI systems to invoke to respond to the input query”; A person can mentally, or using a pen and paper, identify, based at least in part on a historical response quality score generated for prior responses to prior input queries, an LLM-based generative AI system from a plurality of LLM-based generative AI systems to invoke to respond to the input query.

“generate instructions for constructing a response to the input query”; A person can mentally, or using a pen and paper, generate instructions for constructing a response to an input query.

“generate a prompt based on the input query, the contextual information and the instructions”; A person can mentally, or using a pen and paper, generate a prompt based on an input query, contextual information, and instructions.

“generate a tuned prompt by compressing the number of tokens included within the prompt”; A person can mentally, or using a pen and paper, generate a tuned prompt by compressing the number of tokens included within a prompt.
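For orientation, the routing step the claims recite (choosing among LLM-based systems based on historical response quality scores for prior responses) reduces to a simple selection over stored scores. The Python sketch below is purely illustrative of that claim language; the application does not disclose this implementation, and every name (select_llm, historical_scores) is hypothetical.

```python
# Hypothetical sketch of the claimed routing step: pick the LLM system
# whose prior responses carry the highest average historical quality score.
def select_llm(historical_scores: dict) -> str:
    """Return the identifier of the LLM with the best mean historical score."""
    def mean(scores):
        return sum(scores) / len(scores)
    return max(historical_scores, key=lambda llm: mean(historical_scores[llm]))

# Example: scores recorded for prior responses from three candidate systems.
historical_scores = {
    "llm-a": [0.72, 0.80, 0.76],
    "llm-b": [0.91, 0.88, 0.90],
    "llm-c": [0.65, 0.70],
}
print(select_llm(historical_scores))  # llm-b (highest average, ~0.90)
```

That such a selection can be computed by hand over a small score table is, in substance, the examiner's mental-process rationale for this limitation.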
The limitations, as recited above, are processes that, under their broadest reasonable interpretation, cover steps that can be performed in the human mind or by a human using a pen and paper, but for the recitation of generic computer components. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claims recite an abstract idea.

Independent claim 10 recites the limitations:

“identifying, based at least in part on historical response quality score generated for prior responses to prior input queries, an LLM-based generative AI system from a plurality of LLM-based generative AI systems to invoke to respond to the input query”; A person can mentally, or using a pen and paper, identify, based at least in part on a historical response quality score generated for prior responses to prior input queries, an LLM-based generative AI system from a plurality of LLM-based generative AI systems to invoke to respond to the input query.

“generating instructions for constructing a response to the input query”; A person can mentally, or using a pen and paper, generate instructions for constructing a response to an input query.

“generating a prompt based on the input query, the contextual information and the instructions”; A person can mentally, or using a pen and paper, generate a prompt based on an input query, contextual information, and instructions.

“generating a tuned prompt by compressing the number of tokens included within the prompt”; A person can mentally, or using a pen and paper, generate a tuned prompt by compressing the number of tokens included within a prompt.

“generating an average quality score for the response”; A person can mentally, or using a pen and paper, generate an average quality score for a response.
“upon determining that the average quality score meets a threshold quality value…”. A person can mentally, or using a pen and paper, determine that an average quality score meets a threshold quality value.

The limitations, as recited above, are processes that, under their broadest reasonable interpretation, cover steps that can be performed in the human mind or by a human using a pen and paper, but for the recitation of generic computer components. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claims recite an abstract idea.

Independent claim 16 recites the limitations:

“determining, based at least in part on historical response quality score generated for prior responses to prior input queries, an LLM-based generative AI system from a plurality of LLM-based generative AI systems to invoke to respond to the input query”; A person can mentally, or using a pen and paper, determine, based at least in part on a historical response quality score generated for prior responses to prior input queries, an LLM-based generative AI system from a plurality of LLM-based generative AI systems to invoke to respond to the input query.

“generating instructions for constructing a response to the input query”; A person can mentally, or using a pen and paper, generate instructions for constructing a response to an input query.

“generating a prompt based on the input query, the contextual information and the instructions”; A person can mentally, or using a pen and paper, generate a prompt based on an input query, contextual information, and instructions.

“generating a tuned prompt by compressing the number of tokens included within the prompt”; A person can mentally, or using a pen and paper, generate a tuned prompt by compressing the number of tokens included within a prompt.
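The “tuned prompt” limitation recited across all three independent claims can likewise be pictured as any procedure that shortens a token sequence. The sketch below uses a naive whitespace-and-stopword trim solely as an example; the application does not disclose its compression method, and tune_prompt and STOPWORDS are hypothetical names.

```python
# Illustrative only: "compressing the number of tokens included within the
# prompt" shown as a naive trim that drops common stopwords.
STOPWORDS = {"the", "a", "an", "of", "to", "and"}

def tune_prompt(prompt: str) -> str:
    """Return a shorter prompt by collapsing whitespace and dropping stopwords."""
    tokens = prompt.split()
    kept = [t for t in tokens if t.lower() not in STOPWORDS]
    return " ".join(kept)

prompt = "Please summarize  the policy of the  store regarding a return"
print(tune_prompt(prompt))  # fewer tokens than the original prompt
```

Any real system would use a far more careful method (for example, semantic compression that preserves meaning), but the token-count reduction is the property the claim language names.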
The limitations, as recited above, are processes that, under their broadest reasonable interpretation, cover steps that can be performed in the human mind or by a human using a pen and paper, but for the recitation of generic computer components. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claims recite an abstract idea.

At step 2A, prong two: This judicial exception is not integrated into a practical application. Independent claim 1 recites the limitations:

“receive an input query submitted from a tenant device of the plurality of tenant devices”, which is a step of receiving data. The step is recited at a high level of generality, and amounts to mere data gathering, which is a form of insignificant extra-solution activity (MPEP 2106.05(g)).

“obtain contextual information that is relevant to the input query from one or more enterprise systems”, which is a step of obtaining or retrieving data. The step is recited at a high level of generality, and amounts to mere data gathering, which is a form of insignificant extra-solution activity (MPEP 2106.05(g)).

“wherein the routing application programming interface is further configured to submit the tuned prompt to the LLM-based generative AI system and receive the response to the input query”, which is a step of submitting or transmitting data and receiving data. The step is recited at a high level of generality, and amounts to mere data gathering, which is a form of insignificant extra-solution activity (MPEP 2106.05(g)).
The additional elements “a routing and moderation platform comprising:”, “a routing application programming interface communicatively interfaced to a plurality of tenant devices to:”, “from a tenant device of the plurality of tenant devices”, “an LLM-based generative AI system from a plurality of LLM-based generative AI systems”, “a prompt templating service executable to:”, “from one or more enterprise systems”, “wherein the routing application programming interface is further configured to”, and “to the LLM-based generative AI system” in the steps in claim 1 are recited at a high level of generality, such that they amount to no more than mere instructions to apply the exception using generic computer components. Accordingly, the additional elements, individually or in combination, do not integrate the abstract idea into a practical application, even viewing the claims as a whole, because they do not impose any meaningful limits on practicing the abstract idea.

Independent claim 10 recites the limitations:

“receiving an input query submitted from a tenant device of a plurality of tenant devices”, which is a step of receiving data. The step is recited at a high level of generality, and amounts to mere data gathering, which is a form of insignificant extra-solution activity (MPEP 2106.05(g)).

“obtaining contextual information that is relevant to the input query from one or more enterprise systems”, which is a step of obtaining or retrieving data. The step is recited at a high level of generality, and amounts to mere data gathering, which is a form of insignificant extra-solution activity (MPEP 2106.05(g)).

“submitting the tuned prompt to the LLM-based generative AI system”, which is a step of submitting or transmitting data. The step is recited at a high level of generality, and amounts to mere data gathering, which is a form of insignificant extra-solution activity (MPEP 2106.05(g)).

“receiving the response to the input query”, which is a step of receiving data.
The step is recited at a high level of generality, and amounts to mere data gathering, which is a form of insignificant extra-solution activity (MPEP 2106.05(g)).

“upon determining that the average quality score meets a threshold quality value, sending the response to the tenant device”, which is a step of sending or transmitting data. The step is recited at a high level of generality, and amounts to mere data gathering, which is a form of insignificant extra-solution activity (MPEP 2106.05(g)).

The additional elements “a routing and moderation platform comprising: a computing system comprising a processor and a memory, the computing system including instructions which, when executed, cause the routing and moderation platform to perform:”, “from a tenant device of the plurality of tenant devices”, “an LLM-based generative AI system from a plurality of LLM-based generative AI systems”, “from one or more enterprise systems”, “to the LLM-based generative AI system”, and “to the tenant device” in the steps in claim 10 are recited at a high level of generality, such that they amount to no more than mere instructions to apply the exception using generic computer components. Accordingly, the additional elements, individually or in combination, do not integrate the abstract idea into a practical application, even viewing the claims as a whole, because they do not impose any meaningful limits on practicing the abstract idea.

Independent claim 16 recites the limitations:

“receiving an input query submitted from a tenant device of the plurality of tenant devices”, which is a step of receiving data. The step is recited at a high level of generality, and amounts to mere data gathering, which is a form of insignificant extra-solution activity (MPEP 2106.05(g)).

“obtaining contextual information that is relevant to the input query from one or more enterprise systems”, which is a step of obtaining or retrieving data.
The step is recited at a high level of generality, and amounts to mere data gathering, which is a form of insignificant extra-solution activity (MPEP 2106.05(g)).

“submitting the tuned prompt to the LLM-based generative AI system”, which is a step of submitting or transmitting data. The step is recited at a high level of generality, and amounts to mere data gathering, which is a form of insignificant extra-solution activity (MPEP 2106.05(g)).

The additional elements “a method for routing and moderation of questions received from tenants, the method comprising:”, “from a tenant device of the plurality of tenant devices”, “an LLM-based generative AI system from a plurality of LLM-based generative AI systems”, “from one or more enterprise systems”, and “to the LLM-based generative AI system” in the steps in claim 16 are recited at a high level of generality, such that they amount to no more than mere instructions to apply the exception using generic computer components. Accordingly, the additional elements, individually or in combination, do not integrate the abstract idea into a practical application, even viewing the claims as a whole, because they do not impose any meaningful limits on practicing the abstract idea.

At step 2B: Independent claims 1, 10, and 16 recite the same additional elements as identified in step 2A, prong two, above. These additional elements are not sufficient to amount to significantly more than the judicial exception. Independent claim 1 recites the limitations:

“receive an input query submitted from a tenant device of the plurality of tenant devices”, which is a step of receiving data, and is recognized as a well-understood, routine, and conventional activity within the field of computer functions as an element of receiving or transmitting data over a network (MPEP 2106.05(d)(II)(i)).
“obtain contextual information that is relevant to the input query from one or more enterprise systems”, which is a step of obtaining or retrieving data, and is recognized as a well-understood, routine, and conventional activity within the field of computer functions as an element of storing and retrieving information in memory (MPEP 2106.05(d)(II)(iv)).

“wherein the routing application programming interface is further configured to submit the tuned prompt to the LLM-based generative AI system and receive the response to the input query”, which is a step of submitting or transmitting data and receiving data, and is recognized as a well-understood, routine, and conventional activity within the field of computer functions as an element of receiving or transmitting data over a network (MPEP 2106.05(d)(II)(i)).

Accordingly, the additional limitations are not sufficient to amount to significantly more than the judicial exception. Therefore, the claims are directed to an abstract idea and are not patent eligible.

Independent claim 10 recites the limitations:

“receiving an input query submitted from a tenant device of a plurality of tenant devices”, which is a step of receiving data, and is recognized as a well-understood, routine, and conventional activity within the field of computer functions as an element of receiving or transmitting data over a network (MPEP 2106.05(d)(II)(i)).

“obtaining contextual information that is relevant to the input query from one or more enterprise systems”, which is a step of obtaining or retrieving data, and is recognized as a well-understood, routine, and conventional activity within the field of computer functions as an element of storing and retrieving information in memory (MPEP 2106.05(d)(II)(iv)).
“submitting the tuned prompt to the LLM-based generative AI system”, which is a step of submitting or transmitting data, and is recognized as a well-understood, routine, and conventional activity within the field of computer functions as an element of receiving or transmitting data over a network (MPEP 2106.05(d)(II)(i)).

“receiving the response to the input query”, which is a step of receiving data, and is recognized as a well-understood, routine, and conventional activity within the field of computer functions as an element of receiving or transmitting data over a network (MPEP 2106.05(d)(II)(i)).

“upon determining that the average quality score meets a threshold quality value, sending the response to the tenant device”, which is a step of sending or transmitting data, and is recognized as a well-understood, routine, and conventional activity within the field of computer functions as an element of receiving or transmitting data over a network (MPEP 2106.05(d)(II)(i)).

Accordingly, the additional limitations are not sufficient to amount to significantly more than the judicial exception. Therefore, the claims are directed to an abstract idea and are not patent eligible.

Independent claim 16 recites the limitations:

“receiving an input query submitted from a tenant device of the plurality of tenant devices”, which is a step of receiving data, and is recognized as a well-understood, routine, and conventional activity within the field of computer functions as an element of receiving or transmitting data over a network (MPEP 2106.05(d)(II)(i)).

“obtaining contextual information that is relevant to the input query from one or more enterprise systems”, which is a step of obtaining or retrieving data, and is recognized as a well-understood, routine, and conventional activity within the field of computer functions as an element of storing and retrieving information in memory (MPEP 2106.05(d)(II)(iv)).
“submitting the tuned prompt to the LLM-based generative AI system”, which is a step of submitting or transmitting data, and is recognized as a well-understood, routine, and conventional activity within the field of computer functions as an element of receiving or transmitting data over a network (MPEP 2106.05(d)(II)(i)). Accordingly, the additional limitations are not sufficient to amount to significantly more than the judicial exception. Therefore, the claims are directed to an abstract idea and are not patent eligible.

Dependent claim 2 recites additional limitations, such as:

“determine a quality level of the response for a plurality of evaluation criteria”; These limitations are directed to the same abstract idea under the mental processes grouping as independent claim 1, because a person can mentally, or using a pen and paper, determine a quality level of a response for a plurality of evaluation criteria, and because the limitations do not recite any additional elements that are sufficient to amount to significantly more.

“generate quality score for each of the plurality of evaluation criteria based on the quality level for each of the plurality of evaluation criteria”; These limitations are directed to the same abstract idea under the mental processes grouping as independent claim 1, because a person can mentally, or using a pen and paper, generate a quality score for each of a plurality of evaluation criteria based on a quality level for each of the plurality of evaluation criteria, and because the limitations do not recite any additional elements that are sufficient to amount to significantly more.
“calculate an average quality score based on an average of the quality score for each of the plurality of evaluation criteria”; These limitations are directed to the same abstract idea under the mental processes grouping as independent claim 1, because a person can mentally, or using a pen and paper, calculate an average quality score based on an average of a quality score for each of the plurality of evaluation criteria, and because the limitations do not recite any additional elements that are sufficient to amount to significantly more.

“determine whether average quality score is above a threshold quality value”; These limitations are directed to the same abstract idea under the mental processes grouping as independent claim 1, because a person can mentally, or using a pen and paper, determine whether an average quality score is above a threshold quality value, and because the limitations do not recite any additional elements that are sufficient to amount to significantly more.

“upon determining that the average quality score meets a threshold quality value, send the response to the tenant device”, which is a step of sending or transmitting data. At step 2A, prong two, the step is recited at a high level of generality, and amounts to mere data gathering, which is a form of insignificant extra-solution activity. At step 2B, the step is recognized as a well-understood, routine, and conventional activity within the field of computer functions as an element of receiving or transmitting data over a network (MPEP 2106.05(d)(II)(i)).

“upon determining that the average quality score does not meet the threshold quality value, generate a modified prompt with at least one of: modified contextual information and modified instructions…”.
These limitations are directed to the same abstract idea under the mental processes grouping as independent claim 1, because a person can mentally, or using a pen and paper, determine that an average quality score does not meet a threshold quality value and can generate a modified prompt with at least one of: modified contextual information and modified instructions based on the determination, and because the limitations do not recite any additional elements that are sufficient to amount to significantly more.

“…and submit the modified prompt to one of: the LLM-based generative AI system or a different LLM-based generative AI system of the plurality of LLM-based generative AI systems”, which is a step of submitting or transmitting data. At step 2A, prong two, the step is recited at a high level of generality, and amounts to mere data gathering, which is a form of insignificant extra-solution activity. At step 2B, the step is recognized as a well-understood, routine, and conventional activity within the field of computer functions as an element of receiving or transmitting data over a network (MPEP 2106.05(d)(II)(i)).

The additional elements “a moderation service executable to:” and “the LLM-based generative AI system or a different LLM-based generative AI system of the plurality of LLM-based generative AI systems” in the steps are recited at a high level of generality, such that they amount to no more than mere instructions to apply the exception using generic computer components. Accordingly, the additional elements, individually or in combination, do not integrate the abstract idea into a practical application, even viewing the claims as a whole, because they do not impose any meaningful limits on practicing the abstract idea.
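The moderation flow recited in dependent claim 2 (score a response on several evaluation criteria, average the scores, then either send the response or retry with a modified prompt) can be sketched as a short decision function. This is illustrative only; the function and criterion names are hypothetical and not drawn from the application.

```python
# Hypothetical sketch of claim 2's moderation logic: average per-criterion
# quality scores, then compare against a threshold quality value.
def moderate(criterion_scores: dict, threshold: float):
    """Return ("send", avg) if the average score meets the threshold,
    else ("retry", avg) to signal a modified-prompt resubmission."""
    avg = sum(criterion_scores.values()) / len(criterion_scores)
    return ("send" if avg >= threshold else "retry", avg)

# Example criteria drawn from the kinds of scores claim 8 enumerates.
scores = {"relevancy": 0.9, "coherence": 0.8, "fluency": 0.7}
action, avg = moderate(scores, threshold=0.75)
print(action, round(avg, 2))  # send 0.8
```

On a "retry" result, the claimed system would regenerate the prompt with modified contextual information or instructions and resubmit it to the same or a different LLM-based system.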
Dependent claim 3 recites additional limitations, such as: “wherein the instructions for constructing the response to the input query includes instructions for interpreting the input query and instructions for constructing tone and content of the response”. These limitations are directed to the same abstract idea under the mental processes grouping as independent claim 1, because a person can mentally, or using a pen and paper, generate instructions for constructing a response to an input query which include instructions for interpreting the input query and instructions for constructing a tone and content of the response, and because the limitations do not recite any additional elements that are sufficient to amount to significantly more. Accordingly, the additional elements, individually or in combination, do not integrate the abstract idea into a practical application, even viewing the claims as a whole, because they do not impose any meaningful limits on practicing the abstract idea.

Dependent claim 4 recites additional limitations, such as: “wherein the input query comprises textual questions”, which is a step of receiving data. At step 2A, prong two, the step is recited at a high level of generality, and amounts to mere data gathering, which is a form of insignificant extra-solution activity. At step 2B, the step is recognized as a well-understood, routine, and conventional activity within the field of computer functions as an element of receiving or transmitting data over a network (MPEP 2106.05(d)(II)(i)). Accordingly, the additional elements, individually or in combination, do not integrate the abstract idea into a practical application, even viewing the claims as a whole, because they do not impose any meaningful limits on practicing the abstract idea.
Dependent claim 5 recites additional limitations, such as: “wherein the tenant devices are associated with a plurality of different types of tenants having different access rights to enterprise data”, which are additional elements recited at a high level of generality, such that they amount to no more than mere instructions to apply the exception using generic computer components. Accordingly, the additional elements, individually or in combination, do not integrate the abstract idea into a practical application, even viewing the claims as a whole, because they do not impose any meaningful limits on practicing the abstract idea.

Dependent claim 6 recites additional limitations, such as: “wherein the tenant devices include customer tenant devices and employee tenant devices”, which are additional elements recited at a high level of generality, such that they amount to no more than mere instructions to apply the exception using generic computer components. Accordingly, the additional elements, individually or in combination, do not integrate the abstract idea into a practical application, even viewing the claims as a whole, because they do not impose any meaningful limits on practicing the abstract idea.

Dependent claim 7 recites additional limitations, such as: “wherein the plurality of different LLM-based generative AI systems include at least one enterprise-hosted LLM model and at least one external LLM model”, which are additional elements recited at a high level of generality, such that they amount to no more than mere instructions to apply the exception using generic computer components. Accordingly, the additional elements, individually or in combination, do not integrate the abstract idea into a practical application, even viewing the claims as a whole, because they do not impose any meaningful limits on practicing the abstract idea.
Dependent claim 8 recites additional limitations, such as: “wherein the quality scores include one or more of: a relevancy score, a toxicity score, a consistency score, a fluency score, a bias score, a diversity score, a hallucination score, a coherence score, a context awareness score and an understanding ambiguity score”. These limitations are directed to the same abstract idea under the mental processes grouping as independent claim 1 and dependent claim 2, because a person can mentally, or using a pen and paper, generate quality scores including one or more of: a relevancy score, a toxicity score, a consistency score, a fluency score, a bias score, a diversity score, a hallucination score, a coherence score, a context awareness score and an understanding ambiguity score, and because the limitations do not recite any additional elements that are sufficient to amount to significantly more. Accordingly, the additional elements, individually or in combination, do not integrate the abstract idea into a practical application, even viewing the claims as a whole, because they do not impose any meaningful limits on practicing the abstract idea.

Dependent claim 9 recites additional limitations, such as: “wherein the routing application programming interface is configured to submit the prompt to a plurality of the different LLM-based generative AI systems”, which is a step of submitting or transmitting data. At step 2A, prong two, the step is recited at a high level of generality, and amounts to mere data gathering, which is a form of insignificant extra-solution activity. At step 2B, the step is recognized as a well-understood, routine, and conventional activity within the field of computer functions as an element of receiving or transmitting data over a network (MPEP 2106.05(d)(II)(i)).
Accordingly, the additional elements, individually or in combination, do not integrate the abstract idea into a practical application, even viewing the claims as a whole, because they do not impose any meaningful limits on practicing the abstract idea.

Dependent claim 11 recites additional limitations, such as: “wherein generating the average quality score for the response includes: determine a quality level of the response for a plurality of evaluation criteria”; These limitations are directed to the same abstract idea under the mental processes grouping as independent claim 10, because a person can mentally or using a pen and paper generate an average quality score for a response by mentally or using a pen and paper determining a quality level of the response for a plurality of evaluation criteria, and because the limitations do not recite any additional elements that are sufficient to amount to significantly more. “generate quality scores for each of the plurality of evaluation criteria based on the quality level for each of the plurality of evaluation criteria”; These limitations are directed to the same abstract idea under the mental processes grouping as independent claim 10, because a person can mentally or using a pen and paper generate quality scores for each of a plurality of evaluation criteria based on a quality level for each of the plurality of evaluation criteria, and because the limitations do not recite any additional elements that are sufficient to amount to significantly more.
“calculate the average quality score based on an average of the quality score for each of the plurality of evaluation criteria”; These limitations are directed to the same abstract idea under the mental processes grouping as independent claim 10, because a person can mentally or using a pen and paper calculate an average quality score based on an average of a quality score for each of the plurality of evaluation criteria, and because the limitations do not recite any additional elements that are sufficient to amount to significantly more. Accordingly, the additional elements, individually or in combination, do not integrate the abstract idea into a practical application, even viewing the claims as a whole, because they do not impose any meaningful limits on practicing the abstract idea.

Dependent claim 12 recites additional limitations, such as: “upon determining that the average quality score does not meet the threshold quality value, generate a modified prompt with at least one of: modified contextual information and modified instructions…”. These limitations are directed to the same abstract idea under the mental processes grouping as independent claim 10, because a person can mentally or using a pen and paper determine that an average quality score does not meet a threshold quality value and the person can generate a modified prompt with at least one of: modified contextual information and modified instructions based on the determination, and because the limitations do not recite any additional elements that are sufficient to amount to significantly more. “…and submit the modified prompt to one of: the LLM-based generative AI system or a different LLM-based generative AI system of the plurality of LLM-based generative AI systems”, which is a step of submitting or transmitting data. At step 2A prong two, the step is recited at a high level of generality, and amounts to mere data gathering, which is a form of insignificant extra-solution activity.
At step 2B, the step is recognized as a well-understood, routine, and conventional activity within the field of computer functions as an element of receiving or transmitting data over a network (MPEP 2106.05(d)(II)(i)). The additional elements “the LLM-based generative AI system or a different LLM-based generative AI system of the plurality of LLM-based generative AI systems” in the steps are recited at a high level of generality, such that they amount to no more than mere instructions to apply the exception using generic computer components. Accordingly, the additional elements, individually or in combination, do not integrate the abstract idea into a practical application, even viewing the claims as a whole, because they do not impose any meaningful limits on practicing the abstract idea.

Dependent claim 13 recites additional limitations, such as: “wherein the contextual information comprises enterprise confidential information”, which is a step of obtaining or retrieving data. At step 2A prong two, the step is recited at a high level of generality, and amounts to mere data gathering, which is a form of insignificant extra-solution activity. At step 2B, the step is recognized as a well-understood, routine, and conventional activity within the field of computer functions as an element of storing and retrieving information in memory (MPEP 2106.05(d)(II)(iv)). Accordingly, the additional elements, individually or in combination, do not integrate the abstract idea into a practical application, even viewing the claims as a whole, because they do not impose any meaningful limits on practicing the abstract idea.

Dependent claim 14 recites additional limitations, such as: “wherein the LLM-based generative AI system is identified based at least in part on historical response quality scores of each of the plurality of LLM-based generative AI systems”.
These limitations are directed to the same abstract idea under the mental processes grouping as independent claim 10, because a person can mentally or using a pen and paper identify an LLM-based generative AI system based at least in part on historical response quality scores of each of a plurality of LLM-based generative AI systems, and because the limitations do not recite any additional elements that are sufficient to amount to significantly more. Accordingly, the additional elements, individually or in combination, do not integrate the abstract idea into a practical application, even viewing the claims as a whole, because they do not impose any meaningful limits on practicing the abstract idea.

Dependent claim 15 recites additional limitations, such as: “wherein the LLM-based generative AI system is identified based at least in part on a cost of submitting the tuned prompt to each of the plurality of LLM-based generative AI systems”. These limitations are directed to the same abstract idea under the mental processes grouping as independent claim 10, because a person can mentally or using a pen and paper identify an LLM-based generative AI system based at least in part on a cost of submitting a tuned prompt to each of a plurality of LLM-based generative AI systems, and because the limitations do not recite any additional elements that are sufficient to amount to significantly more. Accordingly, the additional elements, individually or in combination, do not integrate the abstract idea into a practical application, even viewing the claims as a whole, because they do not impose any meaningful limits on practicing the abstract idea.

Dependent claim 17 recites additional limitations, such as: “receiving the response to the input query”, which is a step of receiving data. At step 2A prong two, the step is recited at a high level of generality, and amounts to mere data gathering, which is a form of insignificant extra-solution activity.
At step 2B, the step is recognized as a well-understood, routine, and conventional activity within the field of computer functions as an element of receiving or transmitting data over a network (MPEP 2106.05(d)(II)(i)). “determine a quality level of the response for a plurality of evaluation criteria”; These limitations are directed to the same abstract idea under the mental processes grouping as independent claim 16, because a person can mentally or using a pen and paper determine a quality level of a response for a plurality of evaluation criteria, and because the limitations do not recite any additional elements that are sufficient to amount to significantly more. “generate quality scores for each of the plurality of evaluation criteria based on the quality level for each of the plurality of evaluation criteria”; These limitations are directed to the same abstract idea under the mental processes grouping as independent claim 16, because a person can mentally or using a pen and paper generate quality scores for each of a plurality of evaluation criteria based on a quality level for each of the plurality of evaluation criteria, and because the limitations do not recite any additional elements that are sufficient to amount to significantly more. “calculate an average quality score based on an average of the quality score for each of the plurality of evaluation criteria”; These limitations are directed to the same abstract idea under the mental processes grouping as independent claim 16, because a person can mentally or using a pen and paper calculate an average quality score based on an average of a quality score for each of the plurality of evaluation criteria, and because the limitations do not recite any additional elements that are sufficient to amount to significantly more.
Accordingly, the additional elements, individually or in combination, do not integrate the abstract idea into a practical application, even viewing the claims as a whole, because they do not impose any meaningful limits on practicing the abstract idea.

Dependent claim 18 recites additional limitations, such as: “determine whether average quality score is above a threshold quality value”; These limitations are directed to the same abstract idea under the mental processes grouping as independent claim 16, because a person can mentally or using a pen and paper determine whether an average quality score is above a threshold quality value, and because the limitations do not recite any additional elements that are sufficient to amount to significantly more. “upon determining that the average quality score meets a threshold quality value, send the response to the tenant device”, which is a step of sending or transmitting data. At step 2A prong two, the step is recited at a high level of generality, and amounts to mere data gathering, which is a form of insignificant extra-solution activity. At step 2B, the step is recognized as a well-understood, routine, and conventional activity within the field of computer functions as an element of receiving or transmitting data over a network (MPEP 2106.05(d)(II)(i)). Accordingly, the additional elements, individually or in combination, do not integrate the abstract idea into a practical application, even viewing the claims as a whole, because they do not impose any meaningful limits on practicing the abstract idea.
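The scoring limitations quoted for claims 11, 17, and 18 (a quality score per evaluation criterion, an arithmetic mean of those scores, and a comparison against a threshold) reduce to simple arithmetic; a minimal illustrative sketch follows. All function names, criteria, and values here are hypothetical and appear in neither the claims nor the cited references.

```python
# Illustrative sketch of the averaging and threshold steps recited in
# claims 11, 17, and 18. Criterion names and values are hypothetical.

def average_quality_score(quality_levels: dict) -> float:
    """Calculate the average quality score as the mean of the quality
    score generated for each evaluation criterion."""
    scores = list(quality_levels.values())
    return sum(scores) / len(scores)

def meets_threshold(average_score: float, threshold: float) -> bool:
    """Determine whether the average quality score is above the
    threshold quality value."""
    return average_score > threshold

levels = {"relevancy": 0.9, "coherence": 0.8, "fluency": 0.7}
avg = average_quality_score(levels)  # mean of the three per-criterion scores
```

As the rejection notes, nothing in this arithmetic requires more than pen and paper; the dispute is over whether the surrounding platform elements add significantly more.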
Dependent claim 19 recites additional limitations, such as: “determine whether average quality score is above a threshold quality value”; These limitations are directed to the same abstract idea under the mental processes grouping as independent claim 16, because a person can mentally or using a pen and paper determine whether an average quality score is above a threshold quality value, and because the limitations do not recite any additional elements that are sufficient to amount to significantly more. “upon determining that the average quality score does not meet the threshold quality value, generate a modified prompt with at least one of: modified contextual information and modified instructions…”. These limitations are directed to the same abstract idea under the mental processes grouping as independent claim 16, because a person can mentally or using a pen and paper determine that an average quality score does not meet a threshold quality value and the person can generate a modified prompt with at least one of: modified contextual information and modified instructions based on the determination, and because the limitations do not recite any additional elements that are sufficient to amount to significantly more. “…and submit the modified prompt to one of: the LLM-based generative AI system or a different LLM-based generative AI system of the plurality of LLM-based generative AI systems”, which is a step of submitting or transmitting data. At step 2A prong two, the step is recited at a high level of generality, and amounts to mere data gathering, which is a form of insignificant extra-solution activity. At step 2B, the step is recognized as a well-understood, routine, and conventional activity within the field of computer functions as an element of receiving or transmitting data over a network (MPEP 2106.05(d)(II)(i)).
Accordingly, the additional elements, individually or in combination, do not integrate the abstract idea into a practical application, even viewing the claims as a whole, because they do not impose any meaningful limits on practicing the abstract idea.

Dependent claim 20 recites additional limitations, such as: “wherein the quality scores include one or more of: a relevancy score, a toxicity score, a consistency score, a fluency score, a bias score, a diversity score, a hallucination score, a coherence score, a context awareness score and an understanding ambiguity score”. These limitations are directed to the same abstract idea under the mental processes grouping as independent claim 10 and dependent claim 11, because a person can mentally or using a pen and paper generate quality scores including one or more of: a relevancy score, a toxicity score, a consistency score, a fluency score, a bias score, a diversity score, a hallucination score, a coherence score, a context awareness score and an understanding ambiguity score, and because the limitations do not recite any additional elements that are sufficient to amount to significantly more. Accordingly, the additional elements, individually or in combination, do not integrate the abstract idea into a practical application, even viewing the claims as a whole, because they do not impose any meaningful limits on practicing the abstract idea.

Accordingly, dependent claims 2-9, 11-15, and 17-20 are also directed to an abstract idea without significantly more and are not patent eligible.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C.
102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Hintz (US Pub 2025/0200100) in view of Paul (US Pub 2025/0086439) and in further view of Cook (US Pub 2025/0272313).
With respect to claim 1, Hintz discloses a routing and moderation platform comprising: a routing application programming interface communicatively interfaced to a plurality of tenant devices (Hintz in [0024] and [0087] discloses applications, systems, or modules operate on a computing device or across a plurality of computing devices, input entered on a user device or client device and information processed on or accessed from other devices in a network, such as one or more remote cloud devices or web server devices, user interface presented on display of a computing device to provide an input query that is transmitted to a remote server or a cloud service to process and retrieve information) to: receive an input query submitted from a tenant device of the plurality of tenant devices (Hintz in [0005] and [0006] discloses receiving an input query, a generative AI model processing the query as part of an AI prompt and producing a search query that is executed against a database of documents to produce a list of relevant documents, the returned relevant documents each having a relevancy score or indication that indicates the relevancy of the documents to the input query, reviewing the input query to determine a depth of documents that should be used to respond to the input query, data from relevant documents and the input query are incorporated into another AI prompt as grounding data and used in responding to the initial input query, the relevancy scores of the document and the depth score of the input query are used to identify a minimal subset of documents and content within this minimal subset of documents needed to determine a response to a query; Hintz in [0024] and [0087] discloses user interface presented on display of a computing device to provide an input query that is transmitted to a remote server or a cloud service to process and retrieve information); and identify an LLM-based generative AI system from a plurality of…generative AI systems to invoke to
respond to the input query (Hintz in [0021] and [0025] discloses synthesizing information response, based on content provided as input, using a generative AI model, such as a large language model (LLM), multimodal model, or other type of generative AI model; here Hintz does not explicitly disclose identify an LLM-based generative AI system from a plurality of LLM-based generative AI systems, but the Paul reference discloses the feature, as discussed below); a prompt templating service executable (Hintz in [0086] and in Figure 7 discloses a computing system comprising a processor and memory storing instructions executed by the processor to perform operations) to: obtain contextual information that is relevant to the input query from one or more enterprise systems (Hintz in [0004] and [0039] discloses retrieving information from an enterprise database using a language model, data in database is analyzed for use in an AI prompt, identify content of relevant documents in the database and include the content in an AI prompt that a language model processes to generate an output, LLM based dialog system integrating enterprise knowledge by allowing generative AI model access to search data, providing relevant document content to fill a prompt with context for generative AI to respond to input query; Hintz in [0029], [0055] and [0056] discloses allowing generative AI model to focus on specific parts of input text and generate context-aware outputs, receive relevant documents for an input query, forming a prompt that includes the input query along with the content of the search results used as grounding data, grounding data providing context for generative AI model to respond to input query); generate instructions for constructing a response to the input query (Hintz in [0005] and [0006] discloses receiving an input query, a generative AI model processing the query as part of an AI prompt and producing a search query that is executed against a database of documents to
produce a list of relevant documents); generate a prompt based on the input query, the contextual information and the instructions (Hintz in [0005] and [0006] discloses receiving an input query, a generative AI model processing the query as part of an AI prompt and producing a search query that is executed against a database of documents to produce a list of relevant documents; Hintz in [0029], [0055] and [0056] discloses allowing generative AI model to focus on specific parts of input text and generate context-aware outputs, receive relevant documents for an input query, forming a prompt that includes the input query along with the content of the search results used as grounding data, grounding data providing context for generative AI model to respond to input query); generate a tuned prompt by compressing the number of tokens included within the prompt (Hintz in [0006] discloses ensuring that only the most relevant and useful documents are used to produce an ultimate response to a user while minimizing the number of tokens required for processing and responding to the input query; Hintz in [0028] discloses generative AI model trained to understand and generate sequences of tokens in the form of natural language or human-like text, understand complex intent, cause and effect, perform language translation, semantic search classification, complex classification, text sentiment, summarization, and/or other natural language capabilities; Hintz in [0031]-[0033] discloses initial processing of a prompt includes tokenizing the prompt into tokens that are mapped to a unique integer or mathematical representation, receiving token embeddings, weight the importance of each token in relation to every other token in the input, compute a score for each token pair signifying how much attention should be given to other tokens when encoding a particular token, the token with the highest probability is selected); wherein the routing application programming interface is further configured
to submit the tuned prompt to the LLM-based generative AI system and receive the response to the input query (Hintz in [0005] and [0006] discloses receiving an input query, a generative AI model processing the query as part of an AI prompt and producing a search query that is executed against a database of documents to produce a list of relevant documents; Hintz in [0061] discloses review answer synthesis prompt to determine its sufficiency to generate response, look for minimal sufficiency of information to effectively respond to input query, upon finding answer synthesis prompt to be insufficient sending a request to regenerate an updated answer synthesis prompt until answer synthesis prompt deemed sufficient, regenerate an updated search query and updated answer synthesis prompt based on search results produced by updated search query, iteratively generate new search queries and answer synthesis prompt until a sufficiency confirmation to prepare response for input query). Hintz discloses identifying an LLM-based generative AI system from a plurality of generative AI systems, however, Hintz does not explicitly disclose: identify an LLM-based generative AI system from a plurality of LLM-based generative AI systems; The Paul reference discloses identifying an LLM-based generative AI system from a plurality of LLM-based generative AI systems (Paul in [0051] discloses a GAI integration platform enables interactions between one or more projects and one or more GAI services, such as LLM services, each project represents at least one application executed by an enterprise, each application includes a unique application identifier that uniquely identifies the application with the GAI integration platform, providing a multi-tenant environment to enable multiple enterprises to concurrently use the platform, separating retrieval, prompt generation, GAI system querying, and other functionalities between enterprises based on application identifiers; Paul in [0060] discloses
transmitting a request to the platform, indexing the request by application identifier, process the request to generate a prompt for submission to a GAI system, request includes data that can be processed to generate the prompt, request processed to determine a GAI system that the prompt is to be sent to, request can indicate a particular LLM system that is to be accessed). Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, having the teachings of Hintz and Paul, to have combined Hintz and Paul. The motivation to combine Hintz and Paul would be to generate a prompt that is specific to an LLM that is to be queried by provisioning the prompt based on enterprise data such that the LLM response is specific to a context of the enterprise data (Paul: [0033] and [0040]). Hintz discloses identifying an LLM-based generative AI system from a plurality of generative AI systems to invoke to respond to an input query, however, Hintz and Paul do not explicitly disclose: identify, based at least in part on historical response quality scores generated for prior responses to prior input queries, an LLM-based generative AI system from a plurality of LLM-based generative AI systems…; The Cook reference discloses identify, based at least in part on historical response quality scores generated for prior responses to prior input queries, an LLM-based generative AI system from a plurality of LLM-based generative AI systems (Cook in [0003] and [0014] discloses using a complexity score of an input query to select a particular AI model from a set of two or more available AI models for generating a response to the query, selecting an AI model, such as a large language model (LLM), multimodal model, or other type of generative AI model, based on the query complexity, high-complexity AI models typically capable of producing higher-quality responses over a wider range of queries relative to lower-complexity
AI models; Cook in [0036] and [0038] discloses analyzing a received input query along with additional relevant context, such as prior input queries and prior responses generated in response to prior input queries, to determine a complexity score associated with the query and select based on the score an AI model for generating a response to the input query, providing training queries, which may be prior queries, to multiple AI models of differing complexity and providing the queries and resulting responses to an evaluation AI model, training queries extracted from logs of prior queries, evaluating the absolute or relative quality of responses based on various quality metrics to compare response quality across AI models, quality metrics include relevance of response, coherence of response, groundedness of response, such as lack of hallucinations, perceived intelligence of response and/or other response quality metrics, determine a quality score of each response based on the quality metrics, quality score used to assign a response complexity score, input query assigned to AI model with relatively high response complexity score). Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, having the teachings of Hintz, Paul, and Cook, to have combined Hintz, Paul, and Cook. The motivation to combine Hintz, Paul, and Cook would be to select an AI model from a set of two or more AI models available for generating a response to a query based on query complexity (Cook: [0003]). 
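As mapped above, the identify step selects one system from the plurality using historical response quality scores (claims 1 and 14), and claim 15 adds submission cost as a selection input. A minimal sketch of that selection logic follows; the system names, scores, costs, and the specific quality-floor-then-cheapest rule are all invented for illustration and are not taken from Hintz, Paul, or Cook.

```python
# Minimal sketch of routing among multiple LLM-based generative AI
# systems using historical response quality scores (claims 1 and 14)
# and per-system cost (claim 15). All names and values are hypothetical.

historical_scores = {            # mean quality of prior responses
    "enterprise-hosted-llm": 0.82,
    "external-llm-a": 0.91,
    "external-llm-b": 0.88,
}
cost_per_prompt = {              # cost of submitting the tuned prompt
    "enterprise-hosted-llm": 0.001,
    "external-llm-a": 0.010,
    "external-llm-b": 0.004,
}

def identify_system(min_quality: float) -> str:
    """Pick the cheapest system whose historical quality meets the floor."""
    eligible = [s for s, q in historical_scores.items() if q >= min_quality]
    return min(eligible, key=lambda s: cost_per_prompt[s])

# With a floor of 0.85, only the two external systems are eligible, and
# the cheaper of them is selected.
```

Cook's actual disclosure keys the selection to query complexity rather than a fixed quality floor; the sketch only illustrates the ranking-over-historical-scores idea the examiner relies on.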
With respect to claim 2, Hintz in view of Paul and in further view of Cook discloses the routing and moderation platform of claim 1 further comprising: a moderation service executable (Hintz in [0086] and in Figure 7 discloses a computing system comprising a processor and memory storing instructions executed by the processor to perform operations) to: determine a quality level of the response for a plurality of evaluation criteria (Hintz in [0004], [0007], and [0044] discloses selection of most relevant documents improving quality of processed information and producing accurate high quality results, ensuring that the response provided is accurate, relevant and based on the most relevant subset of documents available; Hintz in [0005] and [0023] discloses identify a depth score indicating how many documents are likely needed to accurately respond to the input query, relevant documents to the input query are identified and a relevancy score is generated for each of the identified relevant documents, based on the depth score of the input query and the relevancy scores of the documents, a minimal set of documents are identified; Paul in [0033], [0055], and [0058] discloses assessing for quality and accuracy, quality scoring module provides a readability score and can selectively inhibit sending in response to a readability score being below a threshold score, inhibit sending based on monitored profanity, bias, slurs, and the like, code injection by a malicious user, concealed malicious questions and/or surpassed protection boundaries, potential leakage, analyzing quality for readability); generate quality scores for each of the plurality of evaluation criteria based on the quality level for each of the plurality of evaluation criteria (Hintz in [0005] and [0023] discloses generative AI model produces a search query that is executed against the database of documents to produce a list of relevant documents, the returned relevant documents may each have a relevancy score or
indication that indicates the relevancy of the documents to the input query, identify a depth score indicating how many documents are likely needed to accurately respond to the input query, based on the depth score of the input query and the relevancy scores of the documents, a minimal set of documents are identified; Paul in [0033], [0055], and [0058] discloses assessing for quality and accuracy, quality scoring module provides a readability score and can selectively inhibit sending in response to a readability score being below a threshold score, inhibit sending based on monitored profanity, bias, slurs, and the like, code injection by a malicious user, concealed malicious questions and/or surpassed protection boundaries, potential leakage, analyzing quality for readability); calculate an average quality score based on an average of the quality score for each of the plurality of evaluation criteria (Hintz in [0128] discloses using a weighted average of relevancy score of a document of the identified relevant documents, pre-summarizing each of the identified relevant documents by determining relevancy score of the summary of the document and pre-summarizing until the document summary’s relevancy score is at least the document’s relevancy score; Paul in [0033], [0055], and [0058] discloses assessing for quality and accuracy, quality scoring module provides a readability score and can selectively inhibit sending in response to a readability score being below a threshold score, inhibit sending based on monitored profanity, bias, slurs, and the like, code injection by a malicious user, concealed malicious questions and/or surpassed protection boundaries, potential leakage, analyzing quality for readability); determine whether average quality score is above a threshold quality value (Hintz in [0061] discloses review answer synthesis prompt to determine its sufficiency to generate response, look for minimal sufficiency of information to effectively respond to input 
query, upon finding answer synthesis prompt to be insufficient sending a request to regenerate an updated answer synthesis prompt until answer synthesis prompt deemed sufficient, regenerate an updated search query and updated answer synthesis prompt based on search results produced by updated search query, iteratively generate new search queries and answer synthesis prompt until a sufficiency confirmation to prepare response for input query; Hintz in [0128] discloses using a weighted average of relevancy score of a document of the identified relevant documents, pre-summarizing each of the identified relevant documents by determining relevancy score of the summary of the document and pre-summarizing until the document summary’s relevancy score is at least the document’s relevancy score; Paul in [0033], [0055], and [0058] discloses assessing for quality and accuracy, quality scoring module provides a readability score and can selectively inhibit sending in response to a readability score being below a threshold score); upon determining that the average quality score meets a threshold quality value, send the response to the tenant device (Hintz in [0061] discloses review answer synthesis prompt to determine its sufficiency to generate response, look for minimal sufficiency of information to effectively respond to input query, upon finding answer synthesis prompt to be insufficient sending a request to regenerate an updated answer synthesis prompt until answer synthesis prompt deemed sufficient, regenerate an updated search query and updated answer synthesis prompt based on search results produced by updated search query, iteratively generate new search queries and answer synthesis prompt until a sufficiency confirmation to prepare response for input query; Hintz in [0128] discloses using a weighted average of relevancy score of a document of the identified relevant documents, pre-summarizing each of the identified relevant documents by determining relevancy score of 
the summary of the document and pre-summarizing until the document summary’s relevancy score is at least the document’s relevancy score; Paul in [0033], [0055], and [0058] discloses assessing for quality and accuracy, quality scoring module provides a readability score and can selectively inhibit sending in response to a readability score being below a threshold score); and upon determining that the average quality score does not meet the threshold quality value, generate a modified prompt with at least one of: modified contextual information and modified instructions and submit the modified prompt to one of: the LLM-based generative AI system or a different LLM-based generative AI system of the plurality of LLM-based generative AI systems (Hintz in [0061] discloses review answer synthesis prompt to determine its sufficiency to generate response, look for minimal sufficiency of information to effectively respond to input query, upon finding answer synthesis prompt to be insufficient sending a request to regenerate an updated answer synthesis prompt until answer synthesis prompt deemed sufficient, regenerate an updated search query and updated answer synthesis prompt based on search results produced by updated search query, iteratively generate new search queries and answer synthesis prompt until a sufficiency confirmation to prepare response for input query; Hintz in [0128] discloses using a weighted average of relevancy score of a document of the identified relevant documents, pre-summarizing each of the identified relevant documents by determining relevancy score of the summary of the document and pre-summarizing until the document summary’s relevancy score is at least the document’s relevancy score; Paul in [0033], [0055], and [0058] discloses assessing for quality and accuracy, quality scoring module provides a readability score and can selectively inhibit sending in response to a readability score being below a threshold score). 
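To orient a non-specialist reader, the quality-gating sequence recited in claim 1 and mapped above (score the response against several evaluation criteria, average the scores, then either release the response to the tenant or regenerate a modified prompt) can be reduced to a minimal sketch. This is an illustrative sketch only; the function and criterion names are hypothetical and are not drawn from the claims, Hintz, Paul, or Cook.

```python
# Hypothetical sketch of the claimed quality-gating loop (illustrative
# only; function and criterion names are invented, not taken from the
# claims or the cited references).

def average_quality_score(scores):
    """Average the per-criterion quality scores (e.g., relevancy, fluency)."""
    return sum(scores.values()) / len(scores)

def gate_response(response, scores, threshold):
    """Release the response if its average quality meets the threshold;
    otherwise signal that a modified prompt should be resubmitted."""
    if average_quality_score(scores) >= threshold:
        return ("send_to_tenant", response)
    return ("regenerate_prompt", None)

# Example: three evaluation criteria scored on a 0-to-1 scale.
decision, _ = gate_response(
    "draft answer",
    {"relevancy": 0.9, "coherence": 0.8, "fluency": 0.7},
    threshold=0.75,
)
# (0.9 + 0.8 + 0.7) / 3 is about 0.8, which meets the 0.75 threshold,
# so the decision is "send_to_tenant".
```

The same loop, with the "regenerate_prompt" branch feeding a modified prompt back to the same or a different system, corresponds to the final limitation addressed above.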
With respect to claim 3, Hintz in view of Paul and in further view of Cook discloses the routing and moderation platform of claim 1, wherein the instructions for constructing the response to the input query includes instructions for interpreting the input query and instructions for constructing the tone and content of the response (Hintz in [0047] and [0099] discloses language model detecting user intent present in the input query, language model detecting primary topics of the input query, styles of the input query, and/or mood or tone of the input query, generative AI model detecting primary topics, style, and tone of the text in the input query and in the content of relevant documents; Paul in [0040] discloses providing prompts that represent appropriate queries in an appropriate sequence to a LLM model, providing context and other details to a LLM to enable the LLM to correctly interpret and answer the prompt). With respect to claim 4, Hintz in view of Paul and in further view of Cook discloses the routing and moderation platform of claim 1, wherein the input query comprises textual questions (Hintz in [0050] discloses receiving an input query and processing the query as part of an AI prompt; Paul in [0021] and [0036] discloses LLMs can generate text and perform various natural language processing tasks such as question answering, a customer-facing chatbot answering questions; Paul in [0030] and [0058] discloses provisioning prompts used to query a LLM, generating prompts based on questions). 
With respect to claim 5, Hintz in view of Paul and in further view of Cook discloses the routing and moderation platform of claim 1, wherein the tenant devices are associated with a plurality of different types of tenants having different access rights to enterprise data (Paul in [0026], [0028], and [0051] discloses GAI integration platform defining and documenting access controls, providing secure access to data across multiple, disparate datastores, providing a multi-tenant environment to enable multiple enterprises to concurrently use the platform, separating data storage/retrieval, prompt generation, querying, and other functionalities between enterprises based on unique application identifiers uniquely identifying enterprise applications). With respect to claim 6, Hintz in view of Paul and in further view of Cook discloses the routing and moderation platform of claim 5, wherein the tenant devices include customer tenant devices and employee tenant devices (Paul in [0036] and [0051] discloses an application can include a customer-facing chatbot that answers questions, an application can include a marketing editor enabling a content creator to generate product descriptions, providing a multi-tenant environment to enable multiple enterprises to concurrently use the GAI integration platform). 
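The multi-tenant access feature mapped to Paul in claims 5 and 6 (different tenant types, such as customers and employees, holding different access rights to enterprise data) can be illustrated with a minimal sketch. The tenant types and data-source names below are hypothetical and appear in neither the claims nor the references.

```python
# Minimal sketch of per-tenant-type access rights (illustrative only;
# the tenant types and source names are invented, not taken from Paul).

ACCESS_RIGHTS = {
    "employee": {"public_docs", "internal_docs"},
    "customer": {"public_docs"},
}

def allowed_sources(tenant_type):
    """Return the enterprise data sources a given tenant type may query."""
    return ACCESS_RIGHTS.get(tenant_type, set())

# An employee tenant can ground prompts on internal documents, while a
# customer tenant is restricted to public ones.
```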
With respect to claim 7, Hintz in view of Paul and in further view of Cook discloses the routing and moderation platform of claim 1, wherein the plurality of different LLM-based generative AI systems include at least one enterprise-hosted LLM model and at least one external LLM model (Hintz in [0024] and [0026] discloses information processed on or accessed from other devices in a network, such as one or more remote cloud devices or web servers, information retriever may be a local application, some operations performed locally and other operations performed at a server; Hintz in [0034] discloses computing device communicating with generative AI model using one or a combination of networks, such as private area network, a local area network, and a wide area network, generative AI model operates on a device located remotely from the computing device, generative AI model implemented in a cloud-based environment or a server-based environment using one or more cloud resources, server devices, or personal computers; Hintz in [0039] and [0043] discloses a LLM based dialog system that integrates enterprise knowledge by allowing generative AI model to retrieve relevant document using enterprise database; Paul in [0029] and [0030] discloses enable an application of an enterprise to interact with one or more LLMs; Paul in [0032] and [0043] discloses interactions with third-party LLM systems, providing enterprise-specific responses from LLMs, access LLMs that are pre-trained and offered as managed services by multiple third-parties or vendors; Paul in [0045] discloses leveraging a LLM and that is powered by knowledge and context of an enterprise, enable access to a knowledge based on the enterprise, enterprise data residing on a central data platform; Paul in [0060] and [0065] accessing a particular LLM system based on processing a request and generating a prompt for submission to the particular LLM, transmitting the prompt to one of a OpenAI service and a third-party LLM 
system). With respect to claim 8, Hintz in view of Paul and in further view of Cook discloses the routing and moderation platform of claim 1, wherein the quality scores include one or more of: a relevancy score, a toxicity score, a consistency score, a fluency score, a bias score, a diversity score, a hallucination score, a coherence score, a context awareness score and an understanding ambiguity score (Hintz in [0004], [0007], and [0044] discloses selection of most relevant documents improving quality of processed information and producing accurate high quality results, ensuring that the response provided is accurate, relevant and based on the most relevant subset of documents available; Hintz in [0005] and [0023] discloses generative AI model produces a search query that is executed against the database of documents to produce a list of relevant documents, the returned relevant documents may each have a relevancy score or indication that indicates the relevancy of the documents to the input query, identify a depth score indicating how many documents are likely needed to accurately respond to the input query, based on the depth score of the input query and the relevancy scores of the documents, a minimal set of documents are identified; Paul in [0033], [0055], and [0058] discloses assessing for quality and accuracy, quality scoring module provides a readability score and can selectively inhibit sending in response to a readability score being below a threshold score, inhibit sending based on monitored profanity, bias, slurs, and the like, code injection by a malicious user, concealed malicious questions and/or surpassed protection boundaries, potential leakage, analyzing quality for readability). 
With respect to claim 9, Hintz in view of Paul and in further view of Cook discloses the routing and moderation platform of claim 1, wherein the routing application programming interface is configured to submit the prompt to a plurality of the different LLM-based generative AI systems (Hintz in [0021] and [0025] discloses synthesizing information response, based on content provided as input, using a generative AI model, such as a large language model (LLM), multimodal model, or other type of generative AI model; Paul in [0051] and [0060] discloses a GAI integration platform enables interactions with one or more GAI services, such as LLM services, transmitting a request to the platform, request can indicate a particular LLM system that is to be accessed). With respect to claim 10, Hintz discloses a routing and moderation platform comprising: a computing system comprising a processor and a memory, the computing system including instructions which, when executed, cause the routing and moderation platform to perform (Hintz in [0086] and in Figure 7 discloses a computing system comprising a processor and memory storing instructions executed by the processor to perform operations): receiving an input query submitted from a tenant device of the plurality of tenant devices (Hintz in [0005] and [0006] discloses receiving an input query, a generative AI model processing the query as part of an AI prompt and producing a search query that is executed against a database of documents to produce a list of relevant documents, the returned relevant documents each having a relevancy score or indication that indicates the relevancy of the documents to the input query, reviewing the input query to determine a depth of documents that should be used to respond to the input query, data from relevant documents and the input query are incorporated into another AI prompt as grounding data and used in responding to the initial input query, the relevancy scores of the document and the depth 
score of the input query are used to identify a minimal subset of documents and content within this minimal subset of documents needed to determine a response to a query; Hintz in [0024] and [0087] discloses user interface presented on display of a computing device to provide an input query that is transmitted to a remote server or a cloud service to process and retrieve information); and identifying an LLM-based generative AI system from a plurality of… generative AI systems to invoke to respond to the input query (Hintz in [0021] and [0025] discloses synthesizing information response, based on content provided as input, using a generative AI model, such as a large language model (LLM), multimodal model, or other type of generative AI model; here Hintz does not explicitly disclose identify an LLM-based generative AI system from a plurality of LLM-based generative AI systems, but the Paul reference discloses the feature, as discussed below); obtaining contextual information that is relevant to the input query from one or more enterprise systems (Hintz in [0004] and [0039] discloses retrieving information from an enterprise database using a language model, data in database is analyzed for use in an AI prompt, identify content of relevant documents in the database and include the content in an AI prompt that a language model processes to generate an output, LLM based dialog system integrating enterprise knowledge by allowing generative AI model access to search data, providing relevant document content to fill a prompt with context for generative AI to respond to input query; Hintz in [0029], [0055] and [0056] discloses allowing generative AI model to focus on specific parts of input text and generate context-aware outputs, receive relevant documents for an input query, forming a prompt that includes the input query along with the content of the search results used as grounding data, grounding data providing context for generative AI model to respond to input query); 
generating instructions for constructing a response to the input query (Hintz in [0005] and [0006] discloses receiving an input query, a generative AI model processing the query as part of an AI prompt and producing a search query that is executed against a database of documents to produce a list of relevant documents); generating a prompt based on the input query, the contextual information and the instructions (Hintz in [0005] and [0006] discloses receiving an input query, a generative AI model processing the query as part of an AI prompt and producing a search query that is executed against a database of documents to produce a list of relevant documents; Hintz in [0029], [0055] and [0056] discloses allowing generative AI model to focus on specific parts of input text and generate context-aware outputs, receive relevant documents for an input query, forming a prompt that includes the input query along with the content of the search results used as grounding data, grounding data providing context for generative AI model to respond to input query); generating a tuned prompt by compressing the number of tokens included within the prompt (Hintz in [0006] discloses ensuring that only the most relevant and useful documents are used to produce an ultimate response to a user while minimizing the number of tokens required for processing and responding to the input query; Hintz in [0028] discloses generative AI model trained to understand and generate sequences of tokens in the form of natural language or human-like text, understand complex intent, cause and effect, perform language translation, semantic search classification, complex classification, text sentiment, summarization, and/or other natural language capabilities; Hintz in [0031]-[0033] discloses initial processing of a prompt includes tokenizing the prompt into tokens that are mapped to a unique integer or mathematical representation, receiving token embeddings, weight the importance of each token in relation to 
every other token in the input, compute a score for each token pair signifying how much attention should be given to other tokens when encoding a particular token, the token with the highest probability is selected); and submitting the tuned prompt to the LLM-based generative AI system (Hintz in [0005] and [0006] discloses receiving an input query, a generative AI model processing the query as part of an AI prompt and producing a search query that is executed against a database of documents to produce a list of relevant documents; Hintz in [0061] discloses review answer synthesis prompt to determine its sufficiency to generate response, look for minimal sufficiency of information to effectively respond to input query, upon finding answer synthesis prompt to be insufficient sending a request to regenerate an updated answer synthesis prompt until answer synthesis prompt deemed sufficient, regenerate an updated search query and updated answer synthesis prompt based on search results produced by updated search query, iteratively generate new search queries and answer synthesis prompt until a sufficiency confirmation to prepare response for input query); receiving the response to the input query (Hintz in [0005] and [0006] discloses receiving an input query, a generative AI model processing the query as part of an AI prompt and producing a search query that is executed against a database of documents to produce a list of relevant documents; Hintz in [0061] discloses review answer synthesis prompt to determine its sufficiency to generate response, look for minimal sufficiency of information to effectively respond to input query, upon finding answer synthesis prompt to be insufficient sending a request to regenerate an updated answer synthesis prompt until answer synthesis prompt deemed sufficient, regenerate an updated search query and updated answer synthesis prompt based on search results produced by updated search query, iteratively generate new search queries and answer 
synthesis prompt until a sufficiency confirmation to prepare response for input query); generating an average quality score for the response (Hintz in [0004], [0007], and [0044] discloses selection of most relevant documents improving quality of processed information and producing accurate high quality results, ensuring that the response provided is accurate, relevant and based on the most relevant subset of documents available; Hintz in [0005] and [0023] discloses generative AI model produces a search query that is executed against the database of documents to produce a list of relevant documents, the returned relevant documents may each have a relevancy score or indication that indicates the relevancy of the documents to the input query, identify a depth score indicating how many documents are likely needed to accurately respond to the input query, based on the depth score of the input query and the relevancy scores of the documents, a minimal set of documents are identified; Hintz in [0128] discloses using a weighted average of relevancy score of a document of the identified relevant documents, pre-summarizing each of the identified relevant documents by determining relevancy score of the summary of the document and pre-summarizing until the document summary’s relevancy score is at least the document’s relevancy score); and upon determining that the average quality score meets a threshold quality value, sending the response to the tenant device (Hintz in [0005] and [0006] discloses receiving an input query, a generative AI model processing the query as part of an AI prompt and producing a search query that is executed against a database of documents to produce a list of relevant documents; Hintz in [0024] and [0087] discloses user interface presented on display of a computing device to provide an input query that is transmitted to a remote server or a cloud service to process and retrieve information; Hintz in [0061] discloses review answer synthesis prompt to 
determine its sufficiency to generate response, look for minimal sufficiency of information to effectively respond to input query, upon finding answer synthesis prompt to be insufficient sending a request to regenerate an updated answer synthesis prompt until answer synthesis prompt deemed sufficient, send request for updated answer synthesis prompt, regenerate an updated search query and updated answer synthesis prompt based on search results produced by updated search query, iteratively generate new search queries and answer synthesis prompt until a sufficiency confirmation to prepare response for input query is received; Hintz in [0128] discloses using a weighted average of relevancy score of a document of the identified relevant documents, pre-summarizing each of the identified relevant documents by determining relevancy score of the summary of the document and pre-summarizing until the document summary’s relevancy score is at least the document’s relevancy score). Hintz discloses identifying an LLM-based generative AI system from a plurality of generative AI systems; however, Hintz does not explicitly disclose: identifying an LLM-based generative AI system from a plurality of LLM-based generative AI systems; The Paul reference discloses identifying an LLM-based generative AI system from a plurality of LLM-based generative AI systems (Paul in [0051] discloses a GAI integration platform enables interactions between one or more projects and one or more GAI services, such as LLM services, each project represents at least one application executed by an enterprise, each application includes a unique application identifier that uniquely identifies the application with the GAI integration platform, providing a multi-tenant environment to enable multiple enterprises to concurrently use the platform, separating retrieval, prompt generation, GAI system querying, and other functionalities between enterprises based on application identifiers; Paul in [0060] discloses 
transmitting a request to the platform, indexing the request by application identifier, process the request to generate a prompt for submission to a GAI system, request includes data that can be processed to generate the prompt, request processed to determine a GAI system that the prompt is to be sent to, request can indicate a particular LLM system that is to be accessed). Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, having the teachings of Hintz and Paul, to have combined Hintz and Paul. The motivation to combine Hintz and Paul would be to generate a prompt that is specific to a LLM that is to be queried by provisioning the prompt based on enterprise data such that the LLM response is specific to a context of the enterprise data (Paul: [0033] and [0040]). Hintz discloses identifying an LLM-based generative AI system from a plurality of generative AI systems to invoke to respond to an input query; however, Hintz and Paul do not explicitly disclose: identify, based at least in part on historical response quality scores generated for prior responses to prior input queries, an LLM-based generative AI system from a plurality of LLM-based generative AI systems…; The Cook reference discloses identify, based at least in part on historical response quality scores generated for prior responses to prior input queries, an LLM-based generative AI system from a plurality of LLM-based generative AI systems (Cook in [0003] and [0014] discloses using a complexity score of an input query to select a particular AI model from a set of two or more available AI models for generating a response to the query, selecting an AI model, such as a large language model (LLM), multimodal model, or other type of generative AI model, based on the query complexity, high-complexity AI models typically capable of producing higher-quality responses over a wider range of queries relative to lower complexity 
AI models; Cook in [0036] and [0038] discloses analyzing a received input query along with additional relevant context, such as prior input queries and prior responses generated in response to prior input queries, to determine a complexity score associated with the query and select based on the score an AI model for generating a response to the input query, providing training queries, which may be prior queries, to multiple AI models of differing complexity and providing the queries and resulting responses to an evaluation AI model, training queries extracted from logs of prior queries, evaluating the absolute or relative quality of responses based on various quality metrics to compare response quality across AI models, quality metrics include relevance of response, coherence of response, groundedness of response, such as lack of hallucinations, perceived intelligence of response and/or other response quality metrics, determine a quality score of each response based on the quality metrics, quality score used to assign a response complexity score, input query assigned to AI model with relatively high response complexity score). Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, having the teachings of Hintz, Paul, and Cook, to have combined Hintz, Paul, and Cook. The motivation to combine Hintz, Paul, and Cook would be to select an AI model from a set of two or more AI models available for generating a response to a query based on query complexity (Cook: [0003]). 
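The feature this rejection maps to Cook (selecting among multiple LLM-based systems using quality scores generated for prior responses) can be illustrated with a short sketch. The model identifiers and score history below are invented for illustration and are not drawn from Cook or the application.

```python
# Hypothetical sketch of history-based model routing (illustrative only;
# the model names and quality scores are invented, not taken from Cook).

def select_model(history):
    """Pick the LLM-based system whose prior responses earned the
    highest mean historical quality score."""
    return max(history, key=lambda m: sum(history[m]) / len(history[m]))

history = {
    "model_a": [0.72, 0.80, 0.76],  # mean 0.76
    "model_b": [0.90, 0.85],        # mean 0.875
}
chosen = select_model(history)  # "model_b", the higher historical mean
```

Cook additionally conditions selection on a complexity score for the incoming query; the sketch keeps only the historical-quality element at issue in the claim.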
With respect to claim 11, Hintz in view of Paul and in further view of Cook discloses the routing and moderation platform of claim 10, wherein generating the average quality score for the response includes: determining a quality level of the response for a plurality of evaluation criteria (Hintz in [0004], [0007], and [0044] discloses selection of most relevant documents improving quality of processed information and producing accurate high quality results, ensuring that the response provided is accurate, relevant and based on the most relevant subset of documents available; Hintz in [0005] and [0023] discloses identify a depth score indicating how many documents are likely needed to accurately respond to the input query, relevant documents to the input query are identified and a relevancy score is generated for each of the identified relevant documents, based on the depth score of the input query and the relevancy scores of the documents, a minimal set of documents are identified; Paul in [0033], [0055], and [0058] discloses assessing for quality and accuracy, quality scoring module provides a readability score and can selectively inhibit sending in response to a readability score being below a threshold score, inhibit sending based on monitored profanity, bias, slurs, and the like, code injection by a malicious user, concealed malicious questions and/or surpassed protection boundaries, potential leakage, analyzing quality for readability); generating quality scores for each of the plurality of evaluation criteria based on the quality level for each of the plurality of evaluation criteria (Hintz in [0005] and [0023] discloses generative AI model produces a search query that is executed against the database of documents to produce a list of relevant documents, the returned relevant documents may each have a relevancy score or indication that indicates the relevancy of the documents to the input query, identify a depth score indicating how many documents are likely 
needed to accurately respond to the input query, based on the depth score of the input query and the relevancy scores of the documents, a minimal set of documents are identified; Paul in [0033], [0055], and [0058] discloses assessing for quality and accuracy, quality scoring module provides a readability score and can selectively inhibit sending in response to a readability score being below a threshold score, inhibit sending based on monitored profanity, bias, slurs, and the like, code injection by a malicious user, concealed malicious questions and/or surpassed protection boundaries, potential leakage, analyzing quality for readability); and calculating the average quality score based on an average of the quality score for each of the plurality of evaluation criteria (Hintz in [0128] discloses using a weighted average of relevancy score of a document of the identified relevant documents, pre-summarizing each of the identified relevant documents by determining relevancy score of the summary of the document and pre-summarizing until the document summary’s relevancy score is at least the document’s relevancy score; Paul in [0033], [0055], and [0058] discloses assessing for quality and accuracy, quality scoring module provides a readability score and can selectively inhibit sending in response to a readability score being below a threshold score, inhibit sending based on monitored profanity, bias, slurs, and the like, code injection by a malicious user, concealed malicious questions and/or surpassed protection boundaries, potential leakage, analyzing quality for readability). 
With respect to claim 12, Hintz in view of Paul and in further view of Cook discloses the routing and moderation platform of claim 10, wherein the instructions which, when executed, cause the routing and moderation platform to further perform: upon determining that the average quality score does not meet the threshold quality value, generating a modified prompt with at least one of modified contextual information and modified instructions and submit the modified prompt to one of: the LLM-based generative AI system or a different LLM-based generative AI system of the plurality of LLM-based generative AI systems (Hintz in [0061] discloses review answer synthesis prompt to determine its sufficiency to generate response, look for minimal sufficiency of information to effectively respond to input query, upon finding answer synthesis prompt to be insufficient sending a request to regenerate an updated answer synthesis prompt until answer synthesis prompt deemed sufficient, regenerate an updated search query and updated answer synthesis prompt based on search results produced by updated search query, iteratively generate new search queries and answer synthesis prompt until a sufficiency confirmation to prepare response for input query; Hintz in [0128] discloses using a weighted average of relevancy score of a document of the identified relevant documents, pre-summarizing each of the identified relevant documents by determining relevancy score of the summary of the document and pre-summarizing until the document summary’s relevancy score is at least the document’s relevancy score; Paul in [0033], [0055], and [0058] discloses assessing for quality and accuracy, quality scoring module provides a readability score and can selectively inhibit sending in response to a readability score being below a threshold score). 
With respect to claim 13, Hintz in view of Paul and in further view of Cook discloses the routing and moderation platform of claim 10, wherein the contextual information comprises enterprise confidential information (Paul in [0006], [0030], and [0032] discloses generating a prompt responsive to a request includes determining context data representative of one or more enterprises and an enterprise operation, provisioning context for prompts to query a LLM, context used to provide enterprise specific responses from the LLMs, prompt processed by GAI integration platform to mitigate presence of one or more of PII and profanity before transmitting the prompt to the GAI system; Paul in [0028] discloses API layer includes API access and controls to enable interactions with the GAI systems deployed in a secure, managed, and governed manner, data access layer provides storage of and secure access to data across multiple, disparate datastores; Paul in [0047] discloses security and monitoring component includes enterprise security, data and model privacy, threat management, and monitoring, addresses threats and security concerns regarding the applications and their use of LLMs, and how the LLMs themselves are storing and using the data; Paul in [0055] and [0064] discloses masking or replacing personally identifiable information within a prompt before transmitting the prompt to the GAI system). 
With respect to claim 14, Hintz in view of Paul and in further view of Cook discloses the routing and moderation platform of claim 10, wherein the LLM-based generative AI system is identified based at least in part on historical response quality scores of each of the plurality of LLM-based generative AI systems (Paul in [0033] discloses prompt engineering tier includes a prompt generation module and a cognitive interaction module, includes prompt templates, prompt assessment, prompt registration, and prompt reusability, prompt generation module enables a prompt to be generated using a prompt template that is specific to a GAI model, such as LLM, that is to be queried, prompt can be assessed for quality and accuracy, before being used to query the GAI model, and can be registered and stored for reuse). With respect to claim 15, Hintz in view of Paul and in further view of Cook discloses the routing and moderation platform of claim 10, wherein the LLM-based generative AI system is identified based at least in part on a cost of submitting the tuned prompt to each of the plurality of LLM-based generative AI systems (Hintz in [0075] discloses summary generator saves the summarization results of the pre-summarization stage, indicating if a document requires summarization in order to be included in the final prompt, if it does not need summarization then it may be included in the final prompt without incurring extra cost; Paul in [0026] discloses documenting reusable archetypes and patterns for application implementations, keeping continuous track of third-party GAI providers, their GAI model offerings, and assessment of the GAI model in terms of deployment patterns, cost, efficiency, speed, quality, ability to fine-tune, ability to customize, ease of ownership, and the like). 
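Claims 14 and 15 together describe identifying a system using both historical quality and the cost of submitting the tuned prompt. Purely as an illustration, one plausible combination (choose the cheapest system whose historical quality clears a floor) might look like the following; the system names, quality scores, and costs are hypothetical and are not drawn from the application or the cited references.

```python
# Hypothetical sketch of cost-aware model selection (illustrative only;
# the names, quality scores, and costs below are invented).

def select_by_cost(systems, min_quality):
    """Among systems whose historical quality meets the floor, return the
    name of the one with the lowest prompt-submission cost."""
    eligible = [s for s in systems if s["quality"] >= min_quality]
    return min(eligible, key=lambda s: s["cost_per_1k_tokens"])["name"]

systems = [
    {"name": "hosted_llm", "quality": 0.82, "cost_per_1k_tokens": 0.2},
    {"name": "external_llm", "quality": 0.91, "cost_per_1k_tokens": 1.5},
]
# With a 0.8 quality floor both systems qualify and the cheaper hosted
# model is chosen; raising the floor to 0.9 leaves only the external one.
```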
With respect to claim 16, Hintz discloses a method for routing and moderation of questions received from tenants, the method comprising: receiving an input query submitted from a tenant device of the plurality of tenant devices (Hintz in [0005] and [0006] discloses receiving an input query, a generative AI model processing the query as part of an AI prompt and producing a search query that is executed against a database of documents to produce a list of relevant documents, the returned relevant documents each having a relevancy score or indication that indicates the relevancy of the documents to the input query, reviewing the input query to determine a depth of documents that should be used to respond to the input query, data from relevant documents and the input query are incorporated into another AI prompt as grounding data and used in responding to the initial input query, the relevancy scores of the document and the depth score of the input query are used to identify a minimal subset of documents and content within this minimal subset of documents needed to determine a response to a query; Hintz in [0024] and [0087] discloses user interface presented on display of a computing device to provide an input query that is transmitted to a remote server or a cloud service to process and retrieve information); and determining an LLM-based generative AI system from a plurality of…generative AI systems to invoke to respond to the input query (Hintz in [0021] and [0025] discloses synthesizing information response, based on content provided as input, using a generative AI model, such as a large language model (LLM), multimodal model, or other type of generative AI model; here Hintz does not explicitly disclose determining an LLM-based generative AI system from a plurality of LLM-based generative AI systems, but the Paul reference discloses the feature, as discussed below); obtaining contextual information that is relevant to the input query from one or more enterprise
systems (Hintz in [0004] and [0039] discloses retrieving information from an enterprise database using a language model, data in database is analyzed for use in an AI prompt, identify content of relevant documents in the database and include the content in an AI prompt that a language model processes to generate an output, LLM based dialog system integrating enterprise knowledge by allowing generative AI model access to search data, providing relevant document content to fill a prompt with context for generative AI to respond to input query; Hintz in [0029], [0055] and [0056] discloses allowing generative AI model to focus on specific parts of input text and generate context-aware outputs, receive relevant documents for an input query, forming a prompt that includes the input query along with the content of the search results used as grounding data, grounding data providing context for generative AI model to respond to input query); generating instructions for constructing a response to the input query (Hintz in [0005] and [0006] discloses receiving an input query, a generative AI model processing the query as part of an AI prompt and producing a search query that is executed against a database of documents to produce a list of relevant documents); generating a prompt based on the input query, the contextual information and the instructions (Hintz in [0005] and [0006] discloses receiving an input query, a generative AI model processing the query as part of an AI prompt and producing a search query that is executed against a database of documents to produce a list of relevant documents; Hintz in [0029], [0055] and [0056] discloses allowing generative AI model to focus on specific parts of input text and generate context-aware outputs, receive relevant documents for an input query, forming a prompt that includes the input query along with the content of the search results used as grounding data, grounding data providing context for generative AI model to respond to 
input query); generating a tuned prompt by compressing the number of tokens included within the prompt (Hintz in [0006] discloses ensuring that only the most relevant and useful documents are used to produce an ultimate response to a user while minimizing the number of tokens required for processing and responding to the input query; Hintz in [0028] discloses generative AI model trained to understand and generate sequences of tokens in the form of natural language or human-like text, understand complex intent, cause and effect, perform language translation, semantic search classification, complex classification, text sentiment, summarization, and/or other natural language capabilities; Hintz in [0031]-[0033] discloses initial processing of a prompt includes tokenizing the prompt into tokens that are mapped to a unique integer or mathematical representation, receiving token embeddings, weight the importance of each token in relation to every other token in the input, compute a score for each token pair signifying how much attention should be given to other tokens when encoding a particular token, the token with the highest probability is selected); and submitting the tuned prompt to the LLM-based generative AI system (Hintz in [0005] and [0006] discloses receiving an input query, a generative AI model processing the query as part of an AI prompt and producing a search query that is executed against a database of documents to produce a list of relevant documents; Hintz in [0061] discloses review answer synthesis prompt to determine its sufficiency to generate response, look for minimal sufficiency of information to effectively respond to input query, upon finding answer synthesis prompt to be insufficient sending a request to regenerate an updated answer synthesis prompt until answer synthesis prompt deemed sufficient, regenerate an updated search query and updated answer synthesis prompt based on search results produced by updated search query, iteratively generate new
search queries and answer synthesis prompt until a sufficiency confirmation to prepare response for input query). Hintz discloses identifying an LLM-based generative AI system from a plurality of generative AI systems, however, Hintz does not explicitly disclose: determining an LLM-based generative AI system from a plurality of LLM-based generative AI systems; The Paul reference discloses determining an LLM-based generative AI system from a plurality of LLM-based generative AI systems (Paul in [0051] discloses a GAI integration platform enables interactions between one or more projects and one or more GAI services, such as LLM services, each project represents at least one application executed by an enterprise, each application includes a unique application identifier that uniquely identifies the application with the GAI integration platform, providing a multi-tenant environment to enable multiple enterprises to concurrently use the platform, separating retrieval, prompt generation, GAI system querying, and other functionalities between enterprises based on application identifiers; Paul in [0060] discloses transmitting a request to the platform, indexing the request by application identifier, process the request to generate a prompt for submission to a GAI system, request includes data that can be processed to generate the prompt, request processed to determine a GAI system that the prompt is to be sent to, request can indicate a particular LLM system that is to be accessed). Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, having the teachings of Hintz and Paul, to have combined Hintz and Paul. The motivation to combine Hintz and Paul would be to generate a prompt that is specific to an LLM that is to be queried by provisioning the prompt based on enterprise data such that the LLM response is specific to a context of the enterprise data (Paul: [0033] and [0040]).
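Paul's [0051] and [0060] describe multi-tenant dispatch: requests are indexed by a unique application identifier, and either the request names a particular LLM system or the tenant's configuration supplies one. A hedged sketch of that resolution step, with hypothetical names (`AppConfig`, `REGISTRY`, `route_request`) standing in for the platform's components:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AppConfig:
    app_id: str
    default_system: str  # GAI system used when the request names none

# Hypothetical tenant registry keyed by unique application identifier.
REGISTRY = {
    "app-retail-01": AppConfig("app-retail-01", "llm-enterprise"),
    "app-supply-02": AppConfig("app-supply-02", "llm-general"),
}

def route_request(app_id: str, requested_system: Optional[str] = None) -> str:
    """Resolve the target GAI system for a request: an explicitly named
    system wins, otherwise fall back to the tenant's default."""
    config = REGISTRY.get(app_id)
    if config is None:
        raise KeyError(f"unknown application identifier: {app_id}")
    return requested_system or config.default_system
```

Keying every lookup on the application identifier is what keeps retrieval, prompt generation, and GAI querying separated between enterprises.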
Hintz discloses identifying an LLM-based generative AI system from a plurality of generative AI systems to invoke to respond to an input query, however, Hintz and Paul do not explicitly disclose: identify, based at least in part on historical response quality scores generated for prior responses to prior input queries, an LLM-based generative AI system from a plurality of LLM-based generative AI systems…; The Cook reference discloses identify, based at least in part on historical response quality scores generated for prior responses to prior input queries, an LLM-based generative AI system from a plurality of LLM-based generative AI systems (Cook in [0003] and [0014] discloses using a complexity score of an input query to select a particular AI model from a set of two or more available AI models for generating a response to the query, selecting an AI model, such as a large language model (LLM), multimodal model, or other type of generative AI model, based on the query complexity, high-complexity AI models typically capable of producing higher-quality responses over a wider range of queries relative to lower complexity AI models; Cook in [0036] and [0038] discloses analyzing a received input query along with additional relevant context, such as prior input queries and prior responses generated in response to prior input queries, to determine a complexity score associated with the query and select based on the score an AI model for generating a response to the input query, providing training queries, which may be prior queries, to multiple AI models of differing complexity and providing the queries and resulting responses to an evaluation AI model, training queries extracted from logs of prior queries, evaluating the absolute or relative quality of responses based on various quality metrics to compare response quality across AI models, quality metrics include relevance of response, coherence of response, groundedness of response, such as lack of hallucinations,
perceived intelligence of response and/or other response quality metrics, determine a quality score of each response based on the quality metrics, quality score used to assign a response complexity score, input query assigned to AI model with relatively high response complexity score). Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, having the teachings of Hintz, Paul, and Cook, to have combined Hintz, Paul, and Cook. The motivation to combine Hintz, Paul, and Cook would be to select an AI model from a set of two or more AI models available for generating a response to a query based on query complexity (Cook: [0003]). With respect to claim 17, Hintz in view of Paul and in further view of Cook discloses the method of claim 16, further comprising: receiving the response to the input query (Hintz in [0005] and [0006] discloses receiving an input query, a generative AI model processing the query as part of an AI prompt and producing a search query that is executed against a database of documents to produce a list of relevant documents; Hintz in [0061] discloses review answer synthesis prompt to determine its sufficiency to generate response, look for minimal sufficiency of information to effectively respond to input query, upon finding answer synthesis prompt to be insufficient sending a request to regenerate an updated answer synthesis prompt until answer synthesis prompt deemed sufficient, regenerate an updated search query and updated answer synthesis prompt based on search results produced by updated search query, iteratively generate new search queries and answer synthesis prompt until a sufficiency confirmation to prepare response for input query); determining a quality level of the response for a plurality of evaluation criteria (Hintz in [0004], [0007], and [0044] discloses selection of most relevant documents improving quality of processed information and producing 
accurate high quality results, ensuring that the response provided is accurate, relevant and based on the most relevant subset of documents available; Hintz in [0005] and [0023] discloses identify a depth score indicating how many documents are likely needed to accurately respond to the input query, relevant documents to the input query are identified and a relevancy score is generated for each of the identified relevant documents, based on the depth score of the input query and the relevancy scores of the documents, a minimal set of documents are identified; Paul in [0033], [0055], and [0058] discloses assessing for quality and accuracy, quality scoring module provides a readability score and can selectively inhibit sending in response to a readability score being below a threshold score, inhibit sending based on monitored profanity, bias, slurs, and the like, code injection by a malicious user, concealed malicious questions and/or surpassed protection boundaries, potential leakage, analyzing quality for readability); generating quality scores for each of the plurality of evaluation criteria based on the quality level for each of the plurality of evaluation criteria (Hintz in [0005] and [0023] discloses generative AI model produces a search query that is executed against the database of documents to produce a list of relevant documents, the returned relevant documents may each have a relevancy score or indication that indicates the relevancy of the documents to the input query, identify a depth score indicating how many documents are likely needed to accurately respond to the input query, based on the depth score of the input query and the relevancy scores of the documents, a minimal set of documents are identified; Paul in [0033], [0055], and [0058] discloses assessing for quality and accuracy, quality scoring module provides a readability score and can selectively inhibit sending in response to a readability score being below a threshold score, inhibit sending 
based on monitored profanity, bias, slurs, and the like, code injection by a malicious user, concealed malicious questions and/or surpassed protection boundaries, potential leakage, analyzing quality for readability); calculating the average quality score based on an average of the quality score for each of the plurality of evaluation criteria (Hintz in [0128] discloses using a weighted average of relevancy score of a document of the identified relevant documents, pre-summarizing each of the identified relevant documents by determining relevancy score of the summary of the document and pre-summarizing until the document summary’s relevancy score is at least the document’s relevancy score; Paul in [0033], [0055], and [0058] discloses assessing for quality and accuracy, quality scoring module provides a readability score and can selectively inhibit sending in response to a readability score being below a threshold score, inhibit sending based on monitored profanity, bias, slurs, and the like, code injection by a malicious user, concealed malicious questions and/or surpassed protection boundaries, potential leakage, analyzing quality for readability). 
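Claim 17's final steps reduce to grading the response on several evaluation criteria and averaging the per-criterion scores. A minimal sketch, assuming a 0-1 scale and using criterion names drawn from claim 20 (the scale and the `score_response` name are illustrative, not from the application):

```python
def score_response(quality_levels: dict) -> float:
    """Average the per-criterion quality scores into a single value."""
    if not quality_levels:
        raise ValueError("at least one evaluation criterion is required")
    return sum(quality_levels.values()) / len(quality_levels)

# Criterion names mirror claim 20; the 0-1 scale is an assumption.
scores = {
    "relevancy": 0.9,
    "coherence": 0.8,
    "hallucination": 0.7,  # by this convention, higher means fewer hallucinations
    "fluency": 0.8,
}
average_quality = score_response(scores)  # (0.9 + 0.8 + 0.7 + 0.8) / 4 ≈ 0.8
```

The resulting average is what claims 18 and 19 then compare against the threshold quality value.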
With respect to claim 18, Hintz in view of Paul and in further view of Cook discloses the method of claim 17, further comprising: determine whether average quality score is above a threshold quality value (Hintz in [0061] discloses review answer synthesis prompt to determine its sufficiency to generate response, look for minimal sufficiency of information to effectively respond to input query, upon finding answer synthesis prompt to be insufficient sending a request to regenerate an updated answer synthesis prompt until answer synthesis prompt deemed sufficient, regenerate an updated search query and updated answer synthesis prompt based on search results produced by updated search query, iteratively generate new search queries and answer synthesis prompt until a sufficiency confirmation to prepare response for input query; Hintz in [0128] discloses using a weighted average of relevancy score of a document of the identified relevant documents, pre-summarizing each of the identified relevant documents by determining relevancy score of the summary of the document and pre-summarizing until the document summary’s relevancy score is at least the document’s relevancy score; Paul in [0033], [0055], and [0058] discloses assessing for quality and accuracy, quality scoring module provides a readability score and can selectively inhibit sending in response to a readability score being below a threshold score); and upon determining that the average quality score meets a threshold quality value, send the response to the tenant device (Hintz in [0061] discloses review answer synthesis prompt to determine its sufficiency to generate response, look for minimal sufficiency of information to effectively respond to input query, upon finding answer synthesis prompt to be insufficient sending a request to regenerate an updated answer synthesis prompt until answer synthesis prompt deemed sufficient, regenerate an updated search query and updated answer synthesis prompt based on search 
results produced by updated search query, iteratively generate new search queries and answer synthesis prompt until a sufficiency confirmation to prepare response for input query; Hintz in [0128] discloses using a weighted average of relevancy score of a document of the identified relevant documents, pre-summarizing each of the identified relevant documents by determining relevancy score of the summary of the document and pre-summarizing until the document summary’s relevancy score is at least the document’s relevancy score; Paul in [0033], [0055], and [0058] discloses assessing for quality and accuracy, quality scoring module provides a readability score and can selectively inhibit sending in response to a readability score being below a threshold score). With respect to claim 19, Hintz in view of Paul and in further view of Cook discloses the method of claim 17, further comprising: determine whether average quality score is above a threshold quality value (Hintz in [0061] discloses review answer synthesis prompt to determine its sufficiency to generate response, look for minimal sufficiency of information to effectively respond to input query, upon finding answer synthesis prompt to be insufficient sending a request to regenerate an updated answer synthesis prompt until answer synthesis prompt deemed sufficient, regenerate an updated search query and updated answer synthesis prompt based on search results produced by updated search query, iteratively generate new search queries and answer synthesis prompt until a sufficiency confirmation to prepare response for input query; Hintz in [0128] discloses using a weighted average of relevancy score of a document of the identified relevant documents, pre-summarizing each of the identified relevant documents by determining relevancy score of the summary of the document and pre-summarizing until the document summary’s relevancy score is at least the document’s relevancy score; Paul in [0033], [0055], and [0058] discloses 
assessing for quality and accuracy, quality scoring module provides a readability score and can selectively inhibit sending in response to a readability score being below a threshold score); and upon determining that the average quality score does not meet the threshold quality value, generate a modified prompt with at least one of: modified contextual information and modified instructions and submit the modified prompt to one of: the LLM-based generative AI system or a different LLM-based generative AI system of the plurality of LLM-based generative AI systems (Hintz in [0061] discloses review answer synthesis prompt to determine its sufficiency to generate response, look for minimal sufficiency of information to effectively respond to input query, upon finding answer synthesis prompt to be insufficient sending a request to regenerate an updated answer synthesis prompt until answer synthesis prompt deemed sufficient, regenerate an updated search query and updated answer synthesis prompt based on search results produced by updated search query, iteratively generate new search queries and answer synthesis prompt until a sufficiency confirmation to prepare response for input query; Hintz in [0128] discloses using a weighted average of relevancy score of a document of the identified relevant documents, pre-summarizing each of the identified relevant documents by determining relevancy score of the summary of the document and pre-summarizing until the document summary’s relevancy score is at least the document’s relevancy score; Paul in [0033], [0055], and [0058] discloses assessing for quality and accuracy, quality scoring module provides a readability score and can selectively inhibit sending in response to a readability score being below a threshold score).
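Claims 18 and 19 describe two branches of one control loop: forward the response when the average quality score meets the threshold, otherwise regenerate a modified prompt and resubmit, possibly to a different LLM system. A sketch of that loop under stated assumptions: `submit` and `evaluate` are hypothetical stand-ins for the platform's submission and scoring components, and the prompt-modification and system-rotation strategies are placeholders:

```python
def moderate(prompt, systems, submit, evaluate, threshold=0.75, max_rounds=3):
    """Submit the prompt, score the response, and retry with a modified
    prompt (and possibly a different LLM system) while quality is low."""
    system = systems[0]
    response = None
    for round_no in range(max_rounds):
        response = submit(system, prompt)
        if evaluate(response) >= threshold:
            return response  # quality met the threshold: send to the tenant
        # Below threshold: modify the prompt and rotate to another system.
        prompt += "\n[additional context requested]"
        system = systems[(round_no + 1) % len(systems)]
    return response  # best effort after exhausting the retry budget
```

Bounding the loop with `max_rounds` avoids unbounded resubmission cost; the claims themselves do not specify a retry limit.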
With respect to claim 20, Hintz in view of Paul and in further view of Cook discloses the method of claim 17, wherein the quality scores include one or more of: a relevancy score, a toxicity score, a consistency score, a fluency score, a bias score, a diversity score, a hallucination score, a coherence score, a context awareness score and an understanding ambiguity score (Hintz in [0004], [0007], and [0044] discloses selection of most relevant documents improving quality of processed information and producing accurate high quality results, ensuring that the response provided is accurate, relevant and based on the most relevant subset of documents available; Hintz in [0005] and [0023] discloses generative AI model produces a search query that is executed against the database of documents to produce a list of relevant documents, the returned relevant documents may each have a relevancy score or indication that indicates the relevancy of the documents to the input query, identify a depth score indicating how many documents are likely needed to accurately respond to the input query, based on the depth score of the input query and the relevancy scores of the documents, a minimal set of documents are identified; Paul in [0033], [0055], and [0058] discloses assessing for quality and accuracy, quality scoring module provides a readability score and can selectively inhibit sending in response to a readability score being below a threshold score, inhibit sending based on monitored profanity, bias, slurs, and the like, code injection by a malicious user, concealed malicious questions and/or surpassed protection boundaries, potential leakage, analyzing quality for readability). Conclusion Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. Contact Information Any inquiry concerning this communication or earlier communications from the examiner should be directed to REZWANUL MAHMOOD whose telephone number is (571)272-5625. The examiner can normally be reached M-F 9-5:30. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann J. Lo can be reached at 571-272-9767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. 
For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /R.M/Examiner, Art Unit 2159 /MARC S SOMERS/Primary Examiner, Art Unit 2159

Prosecution Timeline

Mar 04, 2025
Application Filed
Dec 04, 2025
Non-Final Rejection — §101, §103
Jan 23, 2026
Interview Requested
Jan 29, 2026
Applicant Interview (Telephonic)
Jan 29, 2026
Examiner Interview Summary
Mar 16, 2026
Response Filed
Apr 04, 2026
Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12579192
PROMISE KEYS FOR RESULT CACHES OF DATABASE SYSTEMS
2y 5m to grant Granted Mar 17, 2026
Patent 12548309
LABEL INHERITANCE FOR SOFT LABEL GENERATION IN INFORMATION PROCESSING SYSTEM
2y 5m to grant Granted Feb 10, 2026
Patent 12541537
DEVICE DISCOVERY SYSTEM
2y 5m to grant Granted Feb 03, 2026
Patent 12524465
SYSTEMS AND METHODS FOR BROWSER EXTENSIONS AND LARGE LANGUAGE MODELS FOR INTERACTING WITH VIDEO STREAMS
2y 5m to grant Granted Jan 13, 2026
Patent 12450226
EFFICIENTLY ANALYZING TRACE DATA
2y 5m to grant Granted Oct 21, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

3-4
Expected OA Rounds
46%
Grant Probability
81%
With Interview (+34.7%)
4y 5m
Median Time to Grant
Moderate
PTA Risk
Based on 402 resolved cases by this examiner. Grant probability derived from career allow rate.
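The projection figures above follow directly from the examiner's career counts shown earlier on this page. A quick reproduction of the arithmetic (the additive interview adjustment is an assumption about how the tool combines the baseline rate with the +34.7-point lift):

```python
granted, resolved = 186, 402           # examiner's career counts from this page
allow_rate = granted / resolved        # about 0.463, displayed as 46%

interview_lift = 0.347                 # +34.7 points shown for interviews
with_interview = allow_rate + interview_lift  # about 0.810, displayed as 81%

print(f"{allow_rate:.0%} baseline, {with_interview:.0%} with interview")
```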
