Prosecution Insights
Last updated: April 19, 2026
Application No. 18/736,763

SYSTEMS FOR GENERATION OF PROMPTS FOR EVALUATION OF LANGUAGE MODELS

Non-Final OA: §101, §103
Filed: Jun 07, 2024
Examiner: WOZNIAK, JAMES S
Art Unit: 2655
Tech Center: 2600 (Communications)
Assignee: Amazon Technologies, Inc.
OA Round: 1 (Non-Final)
Grant Probability: 59% (Moderate)
Expected OA Rounds: 1-2
Time to Grant: 3y 7m
Grant Probability with Interview: 99%

Examiner Intelligence

Career Allow Rate: 59% (227 granted / 385 resolved; -3.0% vs TC avg)
Interview Lift: +40.1% (strong; resolved cases with interview vs. without)
Typical Timeline: 3y 7m avg prosecution; 42 currently pending
Career History: 427 total applications across all art units

Statute-Specific Performance

§101: 18.1% (-21.9% vs TC avg)
§103: 40.1% (+0.1% vs TC avg)
§102: 18.4% (-21.6% vs TC avg)
§112: 16.1% (-23.9% vs TC avg)
Values compared against a Tech Center average estimate; based on career data from 385 resolved cases.

Office Action

§101, §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Examiner Note on Patent Subject Matter Eligibility under 35 U.S.C. 101

Independent Claims 1 and 4 regard a combination of trained ML models directed towards prompt modifications, prompt generation, and the generation of responses based upon the modified prompt. Since this combination of models is directed towards prompts that relate to large language model (LLM) inputs and features machine learning models trained for such LLM processes in combination, claims 1 and 4 are directed towards a technical improvement in evaluating LLM outputs based upon prompt modification with machine learning models, and thus are found to be eligible under Step 2A, Prong 2. Claim 12 does not feature such improvements, as this claim has been drafted to be a broader/generic version of the invention that is more clearly set forth in claims 1 and 4. Accordingly, claim 12 and its dependents have been rejected under 35 U.S.C. 101 for being directed towards a judicial exception.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 12-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea under the broadest reasonable interpretation (BRI) without significantly more. Independent Claim 12 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims regard a process that, as drafted under its BRI, covers performance of the limitations as a mental process, but for the recitation of generic computer components and high-level machine learning models.
In regards to the process of claim 12, the claimed functionality could be practiced as a mental process in the following manner: (a human can mentally evaluate an input text and decide upon an appropriate output text mentally or using pen and paper), wherein: (a human can mentally decide upon a value for an output; for example, "I want to know the temperature for tomorrow" can involve insertion of a date value, or a human can decide upon a modification with respect to a negative or positive tone value), wherein the first output includes second text indicative of a modification to the first text (a human can write text using pen and paper); and use a second machine learning model to determine a second input based on the first input and the first output, wherein the second input comprises third text (a human can mentally rely on the output of an ML model (i.e., use in a general sense) to determine a third text based upon their rewritten text via mentally understanding/reading a printout).

This judicial exception is not integrated into a practical application. Outside of the identified abstract idea, the claimed invention only recites a generic machine learning model trained to generate some modification of a generic input and generic computer components (i.e., a processor and memory), which amount to no more than mere instructions to implement an otherwise abstract idea using generic computer components and machine automation of an otherwise mental process that does not involve a particular machine learning model aimed at a specific task; i.e., this claim does not include the specifics of Applicant's practical application set forth in claims 1 and 4. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
The above identified additional generic computer components are no more than mere instructions to apply the exception using generic computer components that are well-known, routine, and conventional, as is evidenced by Bancorp Services v. Sun Life (Fed. Cir. 2012) and Alice Corp. v. CLS Bank (2014). As for evidence that the claimed machine learning model is well-known, routine, and conventional activity that does not direct patent-ineligible subject matter to significantly more than the abstract idea, see the following prior art: Diesendruck, et al. (U.S. PG Publication 2025/0371282 A1, Paragraph 0005: adjustments to a prompt are "known" as prompt engineering) and Rawte, et al. ("'Sorry, Come Again?' Prompting – Enhancing Comprehension and Diminishing Hallucination with [PAUSE]-injected Optimal Paraphrasing," March 2024; see that input modification can be performed by known and publicly available ML models in Section 7.1, Page 5). Moreover, see Recentive Analytics, Inc. v. Fox Corp. (Fed. Cir. April 18, 2025): "Machine learning is a burgeoning and increasingly important field and may lead to patent-eligible improvements in technology. Today, we hold only that patents that do no more than claim the application of generic machine learning to new data environments, without disclosing improvements to the machine learning models to be applied, are patent ineligible under § 101." Accordingly, claim 12 is not directed towards patent-eligible subject matter under 35 U.S.C. 101. Note that claim 13 captures the improvements/practical application by providing further information on how the first machine learning model is trained that goes beyond the generic input engineering expressed in independent claim 12 under the BRI, and is not included in this rejection, nor are its further dependent claims that incorporate the subject matter of claim 13 by virtue of their dependency (i.e., claims 14-18).
Claims 19-20 add further information: claim 19 as to how the first ML model is trained, while claim 20 adds a combination of an additional machine learning model that explains how the first input is selected based upon a response. Accordingly, these claims have also not been included in this rejection.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 1 is rejected under 35 U.S.C. 103 as being unpatentable over Rawte, et al. ("'Sorry, Come Again?' Prompting – Enhancing Comprehension and Diminishing Hallucination with [PAUSE]-injected Optimal Paraphrasing," March 2024) in view of Qu, et al. (U.S. PG Publication: 2025/0356204 A1).

With respect to Claim 1, Rawte discloses: access a first prompt comprising first text (access of "original prompt," Fig. 1 (showing prompt text) and Fig.
9, Element 1; Section 10, Page 9); use a first machine learning model to determine a modification to the first prompt based on the first prompt, wherein the first machine learning model is trained to determine modifications to prompts that are associated with generation, by a second machine learning model, of responses having characteristics associated with an invalid response (modification in the form of paraphrasing of an "original prompt" that is associated with hallucinations/invalid responses at a second ML model in the form of an LLM, Section 1, Page 2; Sections 7-7.1, Page 5; Section 9, Page 8; Fig. 5; Fig. 9 (see LLM plus "Hallucinated response")); determine a second prompt based on the first prompt and the modification determined using first output from the first machine learning model, wherein the second prompt comprises second text (determination of a second prompt based upon the output of the first ML model (e.g., GPT-3) that is a paraphrase/rewrite of the original prompt that comprises second text, Sections 7-8.1, Pages 5-6; see Fig. 1 showing the rewritten prompt and Fig. 9 showing the optimal paraphrases and adjusted prompt to be input into an LLM to produce AI-generated text); use the second machine learning model to determine a first response based on the second prompt, wherein the second machine learning model is trained to determine responses based on text and semantic information associated with prompts, wherein the responses are associated with one or more constraints (LLM that determines a response to the optimized/adjusted prompt in the form of "AI-generated text" shown in Fig.
9; see also Section 1, Page 2 discussing LLMs trained for tasks related to text generation based upon constraints in a request; Section 4, Page 4 discussing accuracy considerations of LLM responses; Section 6, Page 4), and wherein the first response comprises third text that deviates from the one or more constraints (even with an adjusted prompt, hallucinations may still be present, Section 9, Page 8; see refute answers in Fig. 9); and determine an output based on the first response (AI-generated text, Fig. 9; see newly generated response based upon the rewritten prompt in Fig. 1).

Although Rawte teaches the claimed process set forth in claim 1 under the broadest reasonable interpretation (BRI), Rawte represents a scholarly paper/non-patent literature and so does not explicitly teach the hardware-based components of the claim in the form of one or more non-transitory memories storing computer-executable instructions; and one or more hardware processors to execute the computer-executable instructions. Qu, however, teaches a system including prompt revision implemented using a processor and a processor-executable program stored on a non-transitory memory (Paragraphs 0076 and 0104). Rawte and Qu are analogous art because they are from a similar field of endeavor in LM response generation utilizing revised prompting. Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date, to modify the teachings of Rawte to include the computer hardware-based embodiment disclosed by Qu to provide a predictable result of enabling process implementation on a general-purpose computing device.

Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Rawte, et al. in view of Qu, et al. and further in view of Sato (U.S. PG Publication 2025/0209102 A1). With respect to Claim 3, Rawte in view of Qu teaches the system for prompt rewriting to reduce hallucinations in LLM responses as applied to Claim 1.
Although Rawte considers various rationales in prompt assessment (see readability, formality, and concreteness discussed in Section 3, Pages 2-4), Rawte in view of Qu does not include scoring machine learning models (the claimed third and fourth models for making such assessments). Sato, however, discloses: use a third machine learning model to determine a score based on one or more characteristics of the second prompt and the first response, wherein the third machine learning model is trained to determine scores based on characteristics of text, and wherein the score is indicative of one or more semantic characteristics of one or more of the second text or the third text (trained determination model that determines whether a replacement prompt contains correct information (note: such appropriateness of answer text given prompt text is indicative of a semantic appropriateness) in generative-text AI in a response based upon total values of scores, Paragraphs 0017, 0068-0069, 0076-0077, and 0155); use a fourth machine learning model to determine a rationale associated with the score based on the first response and the score, wherein the fourth machine learning model is trained to determine rationales associated with determination of scores relative to first responses; and store the score and the rationale as data accessible to control generation of prompts for input to the second machine learning model (comparison determinations of the model for individual score items related to rationales, e.g., degree of brevity, concreteness, etc. of the prompt, Paragraphs 0069 and 0153; information stored in prompt database including the related information items in the form of scoring, Paragraphs 0022, 0029, and 0108). Rawte, Qu, and Sato are analogous art because they are from a similar field of endeavor in LM response generation utilizing revised prompting.
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date, to modify the teachings of Rawte in view of Qu with the revised prompt scoring taught by Sato to provide a predictable result of better ensuring that an appropriate prompt is provided to an LLM (Sato, Paragraph 0004).

Claims 4-6, 8-9, 12-13, and 15-18 are rejected under 35 U.S.C. 103 as being unpatentable over Rawte, et al. in view of Sato.

With respect to Claim 4, Rawte discloses: use a first machine learning model to determine a first output based on a first prompt comprising first text, wherein the first machine learning model is trained to determine modifications to prompts associated with a first characteristic of responses to the prompts, and wherein the first output includes second text indicative of a modification to the first prompt (modification in the form of paraphrasing of an "original prompt" that is associated with hallucinations/invalid responses at a ML model in the form of an LLM, Section 1, Page 2; Sections 7-7.1, Page 5; Section 9, Page 8; Fig. 5; Fig. 9 (see LLM plus "Hallucinated response")); (determination of a second prompt based upon the output of the first ML model (e.g., GPT-3) that is a paraphrase/rewrite of the original prompt that comprises second text, Sections 7-8.1, Pages 5-6; see Fig. 1 showing the rewritten prompt and Fig. 9 showing the optimal paraphrases and adjusted prompt to be input into an LLM to produce AI-generated text); and use a third machine learning model to determine a second output comprising fourth text based on the second prompt, wherein the third machine learning model is trained to determine responses based on one or more of text or semantic information associated with prompts, wherein the responses are associated with one or more constraints (LLM that determines a response to the optimized/adjusted prompt in the form of "AI-generated text" shown in Fig.
9; see also Section 1, Page 2 discussing LLMs trained for tasks related to text generation based upon constraints in semantic, natural language requests; Section 4, Page 4 discussing accuracy considerations of LLM responses; Section 6, Page 4; see responses pertaining to an input prompt in Fig. 1).

While Rawte discloses determining a second prompt based upon the original prompt and a rewritten/paraphrased output via assessment, the assessment in Rawte is not based upon a trained machine-learning model. Sato, however, teaches the use of a machine learning model to compare and/or determine a prompt for an LLM that is trained to assess original prompt text and modified prompts (Paragraphs 0017, 0068-0069, 0076-0077, and 0155). Also, Rawte represents a scholarly paper/non-patent literature and so does not explicitly teach the hardware-based components of the claim in the form of one or more non-transitory memories storing computer-executable instructions; and one or more hardware processors to execute the computer-executable instructions. Sato, however, provides such a hardware-based implementation having a computer processor and non-transitory memory (Paragraphs 0013 and 0123). Rawte and Sato are analogous art because they are from a similar field of endeavor in LM response generation utilizing revised prompting. Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date, to modify the teachings of Rawte with the ML model for prompt assessment and computer-implementation taught by Sato to provide a predictable result of better ensuring that an appropriate prompt is provided to an LLM in a manner that is learnable over time and for process implementation on a general-purpose computing device.
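For context on the technology at issue, the flow the rejections map onto Rawte (a first model proposes a prompt modification, the LLM under evaluation responds to the modified prompt, and the response is checked against constraints) can be sketched as follows. This is an illustrative sketch only; the stand-in models and helper names (`rewriter`, `responder`, `constraint_check`) are hypothetical and are not drawn from the application or the cited references.

```python
# Minimal sketch of the claimed rewrite-then-evaluate pipeline (illustrative only).
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalResult:
    original_prompt: str
    modified_prompt: str
    response: str
    violates_constraints: bool

def evaluate_llm(
    prompt: str,
    rewriter: Callable[[str], str],         # stands in for the "first machine learning model"
    responder: Callable[[str], str],        # stands in for the "second machine learning model" (LLM under test)
    constraint_check: Callable[[str], bool],
) -> EvalResult:
    modified = rewriter(prompt)             # determine a modification to the first prompt
    response = responder(modified)          # determine a first response based on the second prompt
    # An output is determined based on whether the response deviates from the constraints.
    return EvalResult(prompt, modified, response, not constraint_check(response))

# Toy stand-ins so the sketch runs end to end.
rewriter = lambda p: p + " Answer concisely and cite a source."
responder = lambda p: "Paris is the capital of France."
constraint_check = lambda r: "capital" in r  # e.g., the response must address the question

result = evaluate_llm("What is the capital of France?", rewriter, responder, constraint_check)
print(result.violates_constraints)
```

In practice each callable would wrap a trained model; the dispute in the OA is whether claim 12's generic version of this loop, absent the training specifics of claims 1 and 4, amounts to more than an abstract idea.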
With respect to Claim 5, Rawte further discloses: The system of claim 4, further comprising computer-executable instructions to: determine a relationship between the fourth text and the one or more constraints; and determine output data based on the relationship between the fourth text and the one or more constraints, wherein the output data is indicative of the fourth text deviating from the one or more constraints (generation of labeled output based upon comparing the AI generated text to various response constraints (e.g., entailment) wherein the data is labeled as being refuted/improper, Fig. 9; Section 10, Page 9; Section G, Page 19).

With respect to Claim 6, Rawte further discloses: The system of claim 4, wherein the first machine learning model is trained to determine modifications that are associated with causing the third machine learning model to determine responses that deviate from the one or more constraints (the first ML model is trained to generate prompt modifications, Sections 7-7.1, Page 5, that may lead to refuted responses/hallucinations, Fig. 9 and Section 4, Page 4 detailing various types of hallucinations).
With respect to Claim 8, Sato further discloses: The system of claim 4, further comprising computer-executable instructions to: use a fourth machine learning model to determine a score based on the third text, wherein the fourth machine learning model is trained to determine scores based on one or more second characteristics associated with text (trained determination model that determines whether a replacement prompt/third text contains correct information (note: such appropriateness of answer text given prompt text is indicative of a semantic appropriateness) in generative-text AI in a response based upon total values of scores, Paragraphs 0017, 0068-0069, 0076-0077, and 0155); and store the score as data accessible to control generation of prompts for input to the third machine learning model (information stored in prompt database including the related information items in the form of scoring for use in prompt generation for an LLM/third ML model, Paragraphs 0022, 0029, and 0108).

Claim 9 contains subject matter similar to Claim 3, and thus, is rejected under similar rationale (minus the teachings of Qu), where clarity and concreteness are examples of rationales.

With respect to Claim 12, Rawte discloses: use a first machine learning model to determine a first output based on a first input comprising first text, wherein: the first machine learning model is trained to determine modifications to inputs associated with a first characteristic of responses to the inputs (modification in the form of word-based paraphrasing of an "original prompt" that is associated with hallucinations/invalid responses at a ML model in the form of an LLM, Section 1, Page 2; Sections 7-7.1, Page 5; Section 9, Page 8; Fig. 5; Fig.
9 (see LLM plus "Hallucinated response")), the first machine learning model includes a function for determining a value based at least in part on a first state comprising the first input and an action comprising the first output, and the first output is determined based on the value, wherein the first output includes second text indicative of a modification to the first text (computing a scoring function based upon the outputs of the first ML model (e.g., GPT-3) that are a paraphrase/rewrite of the original prompt that comprises altered wordings (i.e., an edit action) using a function, Sections 7-8.1, Pages 5-6; see Fig. 1 showing the rewritten prompt and Fig. 9 showing the generated paraphrase candidates); and (determination of an optimal second prompt based upon the output of the first ML model (e.g., GPT-3) that is a paraphrase/rewrite of the original prompt that comprises second text, Sections 7-8.1, Pages 5-6; see Fig. 1 showing the rewritten prompt and Fig. 9 showing the optimal paraphrase to be input into an LLM to produce AI-generated text).

While Rawte discloses determining a second prompt based upon the original prompt and a rewritten/paraphrased output via assessment, the assessment in Rawte is not based upon a trained machine-learning model. Sato, however, teaches the use of a machine learning model to compare and/or determine a prompt for an LLM that is trained to assess original prompt text and modified prompts (Paragraphs 0017, 0068-0069, 0076-0077, and 0155). Also, Rawte represents a scholarly paper/non-patent literature and so does not explicitly teach the hardware-based components of the claim in the form of one or more non-transitory memories storing computer-executable instructions; and one or more hardware processors to execute the computer-executable instructions. Sato, however, provides such a hardware-based implementation having a computer processor and non-transitory memory (Paragraphs 0013 and 0123).
Rawte and Sato are analogous art because they are from a similar field of endeavor in LM response generation utilizing revised prompting. Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date, to modify the teachings of Rawte with the ML model for prompt assessment and computer-implementation taught by Sato to provide a predictable result of better ensuring that an appropriate prompt is provided to an LLM in a manner that is learnable over time and for process implementation on a general-purpose computing device.

With respect to Claim 13, Rawte further discloses: The system of claim 12, wherein the first machine learning model is trained to determine modifications to inputs associated with responses, by a third machine learning model, that deviate from one or more constraints (LLM that determines a response to the optimized/adjusted prompt in the form of "AI-generated text" shown in Fig. 9; see also Section 1, Page 2 discussing LLMs trained for tasks related to text generation based upon constraints in a request; Section 4, Page 4 discussing accuracy considerations of LLM responses; Section 6, Page 4; even with an adjusted prompt, hallucinations may still be present, Section 9, Page 8; see refute answers in Fig. 9).

Claim 15 recites subject matter similar to the use of the second machine learning model in claim 1 as addressed by Rawte, and thus, is rejected under similar rationale. See also the determination of refuted results/hallucinations as depicted in Fig. 9. Claim 16 recites subject matter similar to the use of the second machine learning model in claim 4 as addressed by Rawte, and thus, is rejected under similar rationale. See also the determination of supported results/hallucinations as depicted in Fig. 9. Claim 17 contains subject matter similar to Claim 8, and thus, is rejected under similar rationale.
Note that claim 17 uses broader claim terms where the various outputs in the claim correspond to the text of claim 8. Claim 18 contains subject matter similar to Claim 9, and thus, is rejected under similar rationale. Note that while the numerical labels associated with the ML models differ, the functionality is equivalent with broader terms to express text in the form of outputs in claim 18.

Claims 7, 10, 14, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Rawte, et al. in view of Sato and further in view of Jung, et al. ("Discrete Prompt Compression with Reinforcement Learning," May 2024).

With respect to Claim 7, Rawte in view of Sato teach the system for prompt rewriting to reduce hallucinations in LLM responses as applied to Claim 1. Rawte in view of Sato do not teach the reward function set forth in claim 7. Jung, however, discloses: the first machine learning model includes a reward function for determining a reward value based on one or more of: a first state comprising the first text, an action comprising the first output, or a second state comprising the second text; and the reward value is associated with the second output deviating from the one or more constraints (Section III.B., Page 72580: "reward is calculated from the output sequences of the LMs and the reduced prompt length"; thus the reward is based on the second text using a compressed prompt; Fig. 1, Rewards; Section III.B., Page 72580: an action is selected that has a reward greater than (i.e., deviating more from) a current policy (i.e., a constraint)). Rawte, Sato, and Jung are analogous art because they are from a similar field of endeavor in LM response generation utilizing revised prompting.
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date, to modify the teachings of Rawte in view of Sato with the reward policy for prompt restructuring taught by Jung to provide a predictable result of an efficient policy that can edit prompts for application to various types of LMs (Jung, Abstract).

With respect to Claim 10, Rawte in view of Sato do not teach, but Jung discloses for the previously mentioned predictable result: The system of claim 4, further comprising computer-executable instructions to: train the first machine learning model to determine modifications to prompts (training a model for modifications in the form of prompt compression, Section II.B., Page 72579 and Sections III.A-B., Page 72580), wherein the modifications are associated with a first characteristic of responses to the prompts (reward consideration after Generation LM inference to update the compression policy, Fig. 1, Section III.B., Pages 72580-72581), using training data comprising a plurality of prompts, each prompt of the plurality of prompts associated with an indication of one of: the first characteristic or an absence of the first characteristic (prompts in a training pool are considered and are associated with reward characteristics (e.g., based upon similarity), Section III.B., Pages 72580-72581).
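To make the reward-function language concrete, a reward of the general shape Jung describes (computed from the LM's output for the edited prompt plus a bonus for a shorter prompt) can be sketched as below. This is an illustrative sketch only: the `output_similarity` input and the weighting are hypothetical, not the actual formula in Jung, Qu, or the application.

```python
# Illustrative sketch of a prompt-edit reward combining output fidelity and compression.
def prompt_edit_reward(
    original_prompt: str,
    edited_prompt: str,
    output_similarity: float,   # similarity between LM outputs for the two prompts, in [0, 1]
    length_weight: float = 0.1, # hypothetical weight on the compression bonus
) -> float:
    # Fraction of tokens removed by the edit (crude whitespace tokenization).
    orig_len = len(original_prompt.split())
    edit_len = len(edited_prompt.split())
    compression = max(0.0, (orig_len - edit_len) / max(orig_len, 1))
    # Reward preserved output quality plus a bonus for a shorter prompt.
    return output_similarity + length_weight * compression

r = prompt_edit_reward(
    "Please kindly tell me what the capital city of France is, thank you",
    "What is the capital of France?",
    output_similarity=0.95,
)
print(round(r, 3))
```

In an RL formulation of this kind, the original prompt is the state, the edit is the action, and the policy is trained to pick edits that maximize this reward, which is the general structure claims 7 and 14 recite.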
With respect to Claim 14, Rawte in view of Sato does not teach, but Jung further discloses for the previously mentioned predictable result: The system of claim 13, wherein: the first machine learning model further determines the value based on: a second state comprising the second input, and one or more intervals of time; and the value is associated with a second output associated with the second input deviating from the one or more constraints (prompt comparison scoring based upon an original prompt and a modified prompt (i.e., time expressed in intervals of prompt processing) in relation to a highest probability "action" and a deviation in terms of a reward to select a particular prompt, Section III.B., Pages 72580-72581). Claim 19 contains subject matter similar to Claim 10, and thus, is rejected under similar rationale.

Claims 11 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Rawte, et al. in view of Sato and further in view of Sateli, et al. (U.S. PG Publication: 2025/0238347 A1).

With respect to Claim 11, Rawte in view of Sato teach the system for prompt rewriting to reduce hallucinations in LLM responses as applied to Claim 1. Rawte in view of Sato do not teach the fourth learning model that determines sets of prompts as set forth in claim 11.
Sateli, however, discloses: determine the first prompt by: providing a plurality of prompts to a fourth machine learning model, wherein the fourth machine learning model is trained to determine sets of prompts based on text and semantic information associated with prompts; determining a first set of prompts associated with second characteristics and a second set of prompts associated with third characteristics, based on output from the fourth machine learning model (generative AI model that receives static input test data (i.e., a plurality of prompts) to generate sets of test/evaluation data "where two or more phrases, sentences, or expressions share a similar meaning or convey similar ideas, but they are composed of different words or have distinct lexical forms" (note: words/lexical forms correspond to differing characteristics), Paragraphs 0132-0134); and using the first prompt as an input to the first machine learning model based on the first prompt being included in the first set of prompts (prompts are fed into the LLM for evaluation, Paragraphs 0135-0137; it is noted that in the system of Rawte, the initial prompting undergoes paraphrase modification via the first machine learning model before submission to the LLM for response generation).

Rawte, Sato, and Sateli are analogous art because they are from a similar field of endeavor in LM response generation utilizing revised prompting. Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date, to modify the teachings of Rawte in view of Sato with the semantic prompt variation taught by Sateli to provide a predictable result of capturing a broader perspective of LLM responses and enhancing test coverage (Sateli, Paragraph 0132). Claim 20 contains subject matter similar to claim 11, and thus, is rejected under similar rationale.
Note that claim 20 is broader in some aspects, where the inputs correspond to prompts in claim 11 and the clustering algorithms correspond to the shared semantic information relied upon in the claim 11 rejection.

Allowable Subject Matter

Claim 2 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. The following is a statement of reasons for the indication of allowable subject matter: With respect to Claim 2, the prior art of record fails to explicitly teach or fairly suggest, either taken individually or in combination, the system of claim 2 having the particularly claimed and processed reward function.

Most Pertinent Prior Art: Rawte discloses the process for prompt revision using a machine-learning model in the form of an LLM and generating a response via an LLM using the revised/paraphrased prompt (see Sections 7-8.1, Pages 5-6; see Fig. 1 showing the rewritten prompt and Fig. 9 showing the optimal paraphrases and adjusted prompt to be input into an LLM to produce AI-generated text). Rawte is silent on any type of reinforcement learning (RL), particularly the reward function as recited and processed in claim 2. While secondary reference Qu, et al. (U.S.
PG Publication: 2025/0356204 A1) does teach a reinforcement learning (RL) agent comprising a "reward function" including a set of instructions (Paragraph 0047), along with a first state comprising the first text, an action comprising the modification, a second state comprising the second text, and one or more intervals of time (prompt from a first iteration, generated response/action/evaluation, revised prompt, and intervals of time as a particular iteration, Paragraphs 0059, 0065, and 0075), Qu does not teach that the reward value is associated with a probability of the first response associated with the second text deviating from the one or more constraints, or that the first machine learning model is trained to maximize that particularly defined reward value based upon constraint deviation. Thus, the prior art of record fails to explicitly teach or fairly suggest the invention set forth in dependent claim 2.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:

Dong, et al. (U.S. PG Publication: 2025/0018298 A1) - teaches a dataset having bad examples of input prompts in reinforcement learning of LLMs (Paragraph 0012).

Gardner (U.S. PG Publication: 2024/0296219 A1) - teaches an orchestrator, which can be an AI with ML detection of problematic inputs, that then generates a response based upon a modified prompt including adverse input-output pairs (Paragraphs 0067 and 0097).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAMES S. WOZNIAK, whose telephone number is (571) 272-7632. The examiner can normally be reached 7-3, off alternate Fridays. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant may use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders, can be reached at (571) 272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

JAMES S. WOZNIAK
Primary Examiner
Art Unit 2655

/JAMES S WOZNIAK/
Primary Examiner, Art Unit 2655

Prosecution Timeline

Jun 07, 2024
Application Filed
Feb 09, 2026
Non-Final Rejection — §101, §103
Mar 30, 2026
Interview Requested
Apr 08, 2026
Examiner Interview Summary
Apr 08, 2026
Applicant Interview (Telephonic)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12597422
SPEAKING PRACTICE SYSTEM WITH RELIABLE PRONUNCIATION EVALUATION
2y 5m to grant Granted Apr 07, 2026
Patent 12586569
Knowledge Distillation with Domain Mismatch For Speech Recognition
2y 5m to grant Granted Mar 24, 2026
Patent 12511476
CONCEPT-CONDITIONED AND PRETRAINED LANGUAGE MODELS BASED ON TIME SERIES TO FREE-FORM TEXT DESCRIPTION GENERATION
2y 5m to grant Granted Dec 30, 2025
Patent 12512100
AUTOMATED SEGMENTATION AND TRANSCRIPTION OF UNLABELED AUDIO SPEECH CORPUS
2y 5m to grant Granted Dec 30, 2025
Patent 12475882
METHOD AND SYSTEM FOR AUTOMATIC SPEECH RECOGNITION (ASR) USING MULTI-TASK LEARNED (MTL) EMBEDDINGS
2y 5m to grant Granted Nov 18, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
59%
Grant Probability
99%
With Interview (+40.1%)
3y 7m
Median Time to Grant
Low
PTA Risk
Based on 385 resolved cases by this examiner. Grant probability derived from career allow rate.
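The projection figures above appear to follow from simple arithmetic on the examiner's career data (227 granted of 385 resolved, with a +40.1 percentage-point interview lift). A minimal sketch of that derivation, assuming the interview lift is simply added to the base allow rate and capped at 100% (variable names are illustrative, not from the report):

```python
# Hypothetical reconstruction of the dashboard's headline percentages.
granted, resolved = 227, 385          # examiner's career grant counts
interview_lift = 40.1                 # reported lift, in percentage points

allow_rate = round(100 * granted / resolved)              # base grant probability
with_interview = min(round(allow_rate + interview_lift), 100)  # capped at 100%

print(allow_rate, with_interview)
```

Under these assumptions, 227/385 yields the 59% career allow rate shown, and adding the 40.1-point lift reproduces the 99% "with interview" figure.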
