Prosecution Insights
Last updated: April 19, 2026
Application No. 18/755,440

SYSTEM, METHOD, AND COMPUTER PROGRAM FOR EVOLVING MULTI-TURN CHATBOT DIALOGS

Status: Non-Final OA (§102)
Filed: Jun 26, 2024
Examiner: SINGH, SATWANT K
Art Unit: 2653
Tech Center: 2600 — Communications
Assignee: Amdocs Development Limited
OA Round: 1 (Non-Final)
Grant Probability: 90% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 2y 6m
Grant Probability with Interview: 99%

Examiner Intelligence

Career Allow Rate: 90% (707 granted / 788 resolved; +27.7% vs TC avg, above average)
Interview Lift: +9.7% across resolved cases with interview (moderate)
Typical Timeline: 2y 6m average prosecution; 13 applications currently pending
Career History: 801 total applications across all art units
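The headline figures above follow from simple arithmetic on the raw counts shown. As a sanity check, a short Python sketch reproduces the 90% and 99% numbers, assuming a plain grant ratio and an additive interview lift (the dashboard does not state its actual methodology):

```python
# Reconstruct the dashboard's headline figures from its raw counts.
# Assumption: allow rate is a simple grant ratio, and the interview-adjusted
# probability is the base rate plus the +9.7 point lift, capped at 100%.

granted = 707            # cases granted by this examiner
resolved = 788           # total resolved cases

allow_rate = granted / resolved * 100             # ~89.7%
interview_lift = 9.7                              # percentage-point lift shown
with_interview = min(allow_rate + interview_lift, 100.0)

print(round(allow_rate))       # 90  (matches "Career Allow Rate")
print(round(with_interview))   # 99  (matches "With Interview")
```

Under these assumptions the displayed values are consistent: 89.7% rounds to 90%, and 89.7% + 9.7 points rounds to 99%.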

Statute-Specific Performance

§101: 20.2% (-19.8% vs TC avg)
§103: 26.4% (-13.6% vs TC avg)
§102: 34.8% (-5.2% vs TC avg)
§112: 3.0% (-37.0% vs TC avg)
Tech Center averages are estimates • Based on career data from 788 resolved cases
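Each statute row pairs the examiner's rate with a delta "vs TC avg", so the Tech Center baseline implied by each row is simply the rate minus the delta. The page does not define exactly what these rates measure, so this sketch only verifies internal consistency; every row implies the same 40.0% baseline:

```python
# Recover the implied Tech Center baseline from each row: if
# delta = examiner_rate - tc_avg, then tc_avg = examiner_rate - delta.
rows = {
    "§101": (20.2, -19.8),
    "§103": (26.4, -13.6),
    "§102": (34.8, -5.2),
    "§112": (3.0, -37.0),
}

for statute, (rate, delta) in rows.items():
    tc_avg = round(rate - delta, 1)
    print(f"{statute}: implied TC avg = {tc_avg}%")   # 40.0% for every row
```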

Office Action — §102 (Non-Final)
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement

The information disclosure statements (IDSs) submitted on 07/09/2024, 01/15/2025, and 10/23/2025 were filed in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action: A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Ruochen Zhao et al. ("Auto Arena of LLMs: Automating LLM Evaluations with Agent Peer-Battles and Committee Discussions", ARXIV.ORG, Cornell University Library, 201 Olin Library Cornell University Ithaca, NY 14853, 30 May 2024 (2024-05-30), XP091772881, IDS supplied).
Regarding Claim 1, Ruochen Zhao et al. discloses a non-transitory computer-readable media storing computer instructions which when executed by one or more processors of a device cause the device to evolve a large language model (LLM)-based chatbot (Peer Battle) (page 2, Figure 2, Section 3.2) over at least one iteration (Overall, the peer battle consists of 3 rounds, where the candidates take turns to speak) (page 2, Figure 2, Section 3.2) that includes: presenting, by a large language model (LLM)-based evaluator, a question to a LLM-based chatbot during a dialog with the LLM-based chatbot (For debate questions, as using a static dataset could incur data contamination concerns and result in unfair evaluations, we ask an LLM examiner agent to dynamically generate questions. The examiner agent could be any capable LLM) (page 4, Section 3.1) comprised of a sequence of question and answer pairs (The process is illustrated in Figure 2. In the first round, A gives an initial response to the examiner's question; B criticizes the weaknesses in A's response and raises a targeted follow-up question; and A responds to B's question) (page 2, Figure 2, Section 3.2); receiving, by the LLM-based evaluator, an answer to the question from the LLM-based chatbot (In the first round, A gives an initial response to the examiner's question) (page 2, Figure 2, Section 3.2: Peer Debate); evaluating, by the LLM-based evaluator, the answer according to one or more evaluation metrics and a ground truth (Given the questions, the LLM-produced answers are compared to ground-truth answers using metrics such as accuracy) (page 1, Section 1); determining, by the LLM-based evaluator, that a result of the evaluation is unsatisfactory (Candidate A (powered by Yi-34B-Chat) gives a wrong answer as it miscounts occurrences for repeated letters and miscalculates factorials) (page 9, Section 5.1); and presenting, by the LLM-based evaluator, a follow-up question to the LLM-based chatbot designed to encourage a new answer of the LLM-based chatbot (The opponent B (powered by Claude-3-Haiku) quickly and precisely points out these two issues and skillfully raised a follow-up that targets A's weaknesses: "how about the word 'BANANA'?") (page 9, Section 5.1) to be satisfactory with respect to the ground truth and to cause an optimization of the LLM-based chatbot (Given the questions, the LLM-produced answers are compared to ground-truth answers using metrics such as accuracy) (page 1, Section 1).

Regarding Claim 2, Ruochen Zhao et al. discloses the non-transitory computer-readable media, wherein the LLM-based chatbot is evolved over a plurality of iterations each corresponding to a different question and answer pair in the sequence of question and answer pairs (Overall, the peer battle consists of 3 rounds, where the candidates take turns to speak. The entire dialogue history is visible to both candidates. The process is illustrated in Figure 2. In the first round, A gives an initial response to the examiner's question; B criticizes the weaknesses in A's response and raises a targeted follow-up question; and A responds to B's question. In the second round, A and B are reversed: B gives an initial response to the examiner's question (without seeing A's response); A criticizes and raises questions; and B responds to A's question. In the third round, A and B cross-examine each other. A starts by criticizing B's previous loopholes and raises follow-up questions. After responding, B also criticizes A's loopholes and raises questions. A concludes the battle by responding again. In this process, both A and B get an equal number of each action to ensure fairness. To further reduce position bias, A and B's order is randomly shuffled at the beginning of each debate) (pages 4 and 5, Section 3.2).
Regarding Claim 3, Ruochen Zhao et al. discloses the non-transitory computer-readable media, wherein when the LLM-based evaluator determines that a result of the evaluation for a given question and answer pair is satisfactory with respect to the ground truth (Given the questions, the LLM-produced answers are compared to ground-truth answers using metrics such as accuracy) (page 1, Section 1), then the LLM-based evaluator begins a next iteration of the plurality of iterations (Overall, the peer battle consists of 3 rounds, where the candidates take turns to speak. The entire dialogue history is visible to both candidates. The process is illustrated in Figure 2. In the first round, A gives an initial response to the examiner's question; B criticizes the weaknesses in A's response and raises a targeted follow-up question; and A responds to B's question. In the second round, A and B are reversed: B gives an initial response to the examiner's question (without seeing A's response); A criticizes and raises questions; and B responds to A's question. In the third round, A and B cross-examine each other. A starts by criticizing B's previous loopholes and raises follow-up questions. After responding, B also criticizes A's loopholes and raises questions. A concludes the battle by responding again. In this process, both A and B get an equal number of each action to ensure fairness. To further reduce position bias, A and B's order is randomly shuffled at the beginning of each debate) (pages 4 and 5, Section 3.2).

Regarding Claim 4, Ruochen Zhao et al. discloses the non-transitory computer-readable media, wherein the evaluating of the answer is further performed according to prior question and answer pairs occurring in the dialog (In the third round, A and B cross-examine each other. A starts by criticizing B's previous loopholes and raises follow-up questions. After responding, B also criticizes A's loopholes and raises questions. A concludes the battle by responding again. In this process, both A and B get an equal number of each action to ensure fairness. To further reduce position bias, A and B's order is randomly shuffled at the beginning of each debate) (pages 4 and 5, Section 3.2).

Regarding Claim 5, Ruochen Zhao et al. discloses the non-transitory computer-readable media, wherein the one or more evaluation metrics include one or more automatically calculable natural language processing (NLP) measures (This is a competitive chatbot arena. You are competing against another chatbot assistant in a debate and being judged by a committee on factors such as helpfulness, relevance, accuracy, depth, and creativity) (Page 15, Section A.1.2, Prompts).

Regarding Claim 6, Ruochen Zhao et al. discloses the non-transitory computer-readable media, wherein evaluating, by the LLM-based evaluator, the answer according to the one or more evaluation metrics and the ground truth includes: calculating a score for the answer based on the one or more evaluation metrics and the ground truth (For logical-reasoning questions that have ground-truth answers (reasoning, code, math), LLM-as-a-judge is known to show weak performances in judging the quality of responses. We adopt prior approaches to establish the reference-based judge [32]. Specifically, we utilize the strongest model (according to the current ranking) to generate a reference answer and provide it to the judge when evaluating the peer battle) (Page 5, Section 3.3).

Regarding Claim 7, Ruochen Zhao et al. discloses the non-transitory computer-readable media, wherein the result of the evaluation is unsatisfactory when the score is below a predefined threshold (In the first round, the committee is initialized with MMLU [15] scores to approximate LLM performances.
They will first be asked to read through the battle history, elaborate judgment reasons, and give a verdict on whether A is better, or B is better, or if there is a tie) (Page 5, Section 3.3).

Regarding Claim 8, Ruochen Zhao et al. discloses the non-transitory computer-readable media, wherein the LLM-based evaluator presents up to a threshold number of follow-up questions until the new answer of the LLM-based chatbot is evaluated to be satisfactory with respect to the ground truth (Each pair of candidates engage in 40 peer battles, with 5 questions from each of the 8 categories. The questions are generated by GPT-4. As each battle consists of 3 rounds (each candidate speaks for 4 times), we expect the competition scale to be approximately the same as MT-Bench (80 questions, each candidate speaks twice)) (Page 6, Section 4.1).

Regarding Claim 9, Ruochen Zhao et al. discloses the non-transitory computer-readable media, wherein when the LLM-based evaluator presents the threshold number of follow-up questions without the new answer of the LLM-based chatbot being evaluated as satisfactory with respect to the ground truth, then an error analysis is caused to be performed on the LLM-based chatbot (Each pair of candidates engage in 40 peer battles, with 5 questions from each of the 8 categories. The questions are generated by GPT-4. As each battle consists of 3 rounds (each candidate speaks for 4 times), we expect the competition scale to be approximately the same as MT-Bench (80 questions, each candidate speaks twice)) (Page 6, Section 4.1).

Regarding Claim 10, Ruochen Zhao et al. discloses the non-transitory computer-readable media, wherein the LLM-based chatbot is initially trained on a dataset comprised of individual question and answer pairs (One line of research conducts automatic evaluation with static datasets. Among these, static datasets with predefined metrics, such as GSM8k [9] and MMLU [15], are constructed with aspect-specific input-output pairs, such as questions and their corresponding answers) (page 1, Section 1).

Regarding Claim 11, Ruochen Zhao et al. discloses the non-transitory computer-readable media, wherein the LLM-based chatbot is evolved to include a multi-turn question and answer dataset (Secondly, two candidate LLMs interact with each other and engage in a multi-round peer battle by answering the seed question individually, criticizing the opponent's weaknesses, and raising targeted follow-up queries to challenge the opponent further) (page 2, Section 1).

Regarding Claim 12, Ruochen Zhao et al. discloses the non-transitory computer-readable media, wherein the device is further caused to: output the evolved LLM-based chatbot for use (A noticeable example is Chatbot Arena [32], which is a crowdsourced voting platform that gathers anonymous votes on LLM performances and calculates ELO scores to rank these models) (page 2, Section 1).

Claims 13 and 20 are rejected for the same reason as claim 1. Claim 14 is rejected for the same reason as claim 2. Claim 15 is rejected for the same reason as claim 3. Claim 16 is rejected for the same reason as claim 4. Claim 17 is rejected for the same reason as claim 5. Claim 18 is rejected for the same reason as claim 6. Claim 19 is rejected for the same reason as claim 7.

Cited Art

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Barron et al. (US 2024/0311407) discloses an artificial intelligence agricultural advisor chatbot system powered by large language models (LLMs) and customized for the agricultural domain using a blend of agricultural datasets, which can include tools providing custom context relevant to user queries. Gado et al. (US 2025/0384280) discloses training data generation for large language model (LLM) training and/or benchmarking.
Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SATWANT K SINGH whose telephone number is (571) 272-7468. The examiner can normally be reached Monday through Friday, 9:00 AM to 6:00 PM EST.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Paras D Shah, can be reached at (571) 270-1650. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/SATWANT K SINGH/
Primary Examiner, Art Unit 2653

Prosecution Timeline

Jun 26, 2024
Application Filed
Feb 07, 2026
Non-Final Rejection — §102 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602550
NATURAL LANGUAGE DRIVEN PLANNING WITH MACHINE LEARNING MODELS
2y 5m to grant • Granted Apr 14, 2026
Patent 12602411
Method for Collaborative Knowledge Base Development
2y 5m to grant • Granted Apr 14, 2026
Patent 12585881
NATURAL LANGUAGE PROCESSING SYSTEM, NATURAL LANGUAGE PROCESSING METHOD, AND NATURAL LANGUAGE PROCESSING PROGRAM
2y 5m to grant • Granted Mar 24, 2026
Patent 12587274
SATELLITE OPTIMIZATION MANAGEMENT SYSTEM BASED ON NATURAL LANGUAGE INPUT AND ARTIFICIAL INTELLIGENCE
2y 5m to grant • Granted Mar 24, 2026
Patent 12579368
System, device, and method to provide generalized knowledge routing utilizing machine learning to a user within the system
2y 5m to grant • Granted Mar 17, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 90%
With Interview: 99% (+9.7%)
Median Time to Grant: 2y 6m
PTA Risk: Low
Based on 788 resolved cases by this examiner. Grant probability derived from career allow rate.
