Last updated: May 29, 2026

Application No. 18/215,312

DIALOGUE STATE TRACKING WITH IN-CONTEXT TUNING

Non-Final OA §103

Filed

Jun 28, 2023

Examiner

HWA, SHYUE JIUNN

Art Unit

2156

Tech Center

2100 — Computer Architecture & Software

Assignee

International Business Machines Corporation

OA Round

1 (Non-Final)

Interview Optional

— +38.4% interview lift. Interview already conducted in this application's prosecution history. This examiner has a 83% grant rate with +38.4% interview lift. Since an interview has already been tried, recommend written response with narrowed claims based on precedent claim evolution patterns.

Based on 858 resolved cases, 2023–2026

Examiner Intelligence

HWA, SHYUE JIUNN View full profile →

Grants 83% — above average

Career Allowance Rate

709 granted / 858 resolved

+27.6% vs TC avg

Strong +38% interview lift

Without

With

+38.4%

Interview Lift

resolved cases with interview

Typical timeline

3y 0m

Avg Prosecution

19 currently pending

Career history

880

Total Applications

across all art units

Statute-Specific Performance

§101

0.7%

-39.3% vs TC avg

§103

81.1%

+41.1% vs TC avg

§102

15.8%

-24.2% vs TC avg

§112

0.7%

-39.3% vs TC avg

Black line = Tech Center average estimate • Based on career data from 858 resolved cases

Office Action

§103

Notice of Pre-AIA  or AIA  Status
1. 	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .

2. 	Claims 1-20 are pending in this office action. This action is responsive to Applicant’s application filed 06/28/2023.

Information Disclosure Statement
3.	The references listed in the IDS filed 06/28/2023 has been considered. A copy of the signed or initialed IDS is hereby attached.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains.  Patentability shall not be negatived by the manner in which the invention was made.

This application currently names joint inventors.  In considering patentability of the claims under 35 U.S.C. 103(a), the examiner presumes that the subject matter of the various claims was commonly owned at the time any inventions covered therein were made absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and invention dates of each claim that was not commonly owned at the time a later invention was made in order for the examiner to consider the applicability of 35 U.S.C. 103(c) and potential 35 U.S.C. 102(e), (f) or (g) prior art under 35 U.S.C. 103(a).
4.	Claims 1-3, and 8-20 are rejected under 35 U.S.C. 103(a) as being unpatentable over Dinu et al. (US Patent Publication No. 2024/0354319 A1, hereinafter “Dinu”) in view of Mostafazadeh et al. (US Patent Publication No. 2022/0343903 A1, hereinafter “Mostafazadeh”).
As to Claim 1, Dinu teaches the claimed limitations:
“A system comprising:” as a variety of different systems such as automotive systems, systems implemented using a robot, aerial systems, medial systems, boating systems, smart area monitoring systems, systems for performing deep learning operations (paragraphs 0023-0024). 
 	“a memory configured to store program instructions; and a processor operatively coupled to the memory to execute the program instructions to:” as the computing system includes processors and memory coupled to a parallel processing subsystem via a memory bridge and a communication path. Memory bridge is further coupled to an I/O (input/output) bridge via a communication path and I/O bridge is, in turn, coupled to a switch (paragraphs 0025-0026).
 	“obtain input dialogue data; identify, from at least one historical dialogue dataset, one or more historical dialogue examples having at least a given semantic similarity to at least a portion of the input dialogue data” as in at least one embodiment, unsupervised training can also be used to perform anomaly detection, which allows identification of data points in new dataset that deviate from normal patterns of new dataset (paragraph 0108; see also element 108 of figure 1, input devices). The canonical form generator converts the user input into a canonical form input by determining one or more most similar example user inputs in the canonical form input definitions and associated canonical form inputs; and prompting the language model to generate the canonical form input using a few-shot prompt that includes the most similar example user inputs, the corresponding canonical form inputs, and the current conversation (e.g., the dialog history) with the user. In such cases, the canonical form generator can determine the number of most similar example user inputs by generating an embedding (e.g., a vector embedding) of the user input in a semantic or latent space, such as by inputting the user input into a sentence transformer or other trained machine learning model that outputs the embedding as a vector, and then comparing the embedding of the user input to embeddings of the example user inputs in the canonical form input definitions 212 to determine one or more dial user inputs whose embeddings are closest, according to a distance metric, to the embedding of the user in the semantic or latent space (paragraphs 0042, 0044; see also figures 6-7).
 	“generate, based at least in part on the one or more historical dialogue examples, one or more prompts related to at least a portion of the input dialogue data” as given the matching dialog flow in the dialog flow definitions or a dialog flow that is generated by the dialog flow generator as input, the dialog flow execution module executes the matching or generated dialog flow to generate an output. Executing the matching or generated dialog flow includes executing the one or more next steps included in the matching or generated dialog flow. In such cases, executing the one or more next steps can include determining a context, which can include, e.g., the last user input, a full history of the current conversation, information about an application such as application state variables, and environmental context such as in a multi-modal application; Optionally causing external tools to execute based on the context and/or other parameters to generate an intermediate output; Matching (or matching within a threshold similarity) a canonical form output associated with the matching or generated dialog flow to a predefined canonical form output or, if no such match exists, determining one or more most similar predefined canonical form outputs to the canonical form output associated with the matching or generated dialog flow; and Outputting an example output associated with a matching predefined canonical form output or, if the canonical form output associated with the matching or generated dialog flow does not match any predefined canonical form output, prompting the language model to generate an output using a few-shot prompt that includes the most similar canonical form outputs, corresponding example outputs, and/or the current conversation with the user (paragraphs 0042, 0044, 0048).
Dinu does not explicitly teach the claimed limitation “generate tuning data, associated with at least one dialogue state tracking task related to the input dialogue data, for one or more artificial intelligence techniques by augmenting at least a portion of the one or more prompts in connection with at least one given dialogue state value derived from at least a portion of the at least one historical dialogue dataset; perform one or more automated actions based at least in part on the generated tuning data“.
Mostafazadeh teaches an example implementation for the natural language AI engine is shown and described in more detail. The natural language AI engine advantageously parses queries and converts them to an executable program which gets executed on various data sources by the distributed runtime engine, the natural language AI engine comprises: a speech recognition module, a query rewriter module, neural question answering (QA) system, a neural semantic parser, a deep information retrieval module, a dialogue manager, a natural language generation module, and a speech synthesis module. The natural language AI engine orchestrates handing the input query down the three various systems, with varying degrees of precision and recall. These systems are the neural QA system, the neural semantic parser, and the deep information retrieval module. Each of these modules output a confidence score tied to their predictions to the dialog manager. The dialog manager is in charge of tracking the state of the ongoing dialog and making a decision as to the best next response to the user, given the historical context of the dialog. The dialog manager has a specialized thresholding algorithm based on the confidence scores for deciding on what to output from each module. These thresholds, learned jointly, are one of the hyperparameters that get tuned throughout the system with the objective of increasing the system's F-score, which is the harmonic mean of the precision and recall of each module. The speech recognition module and the query rewriter are coupled to receive speech and text queries from the AI-enabled visualization & reporting engine. Any speech queries are routed to the neural speech recognition module which processes the speech and converts it to text which is output by the speech recognition module to the query rewriter. The query rewriter combines the text from the AI-enabled visualization & reporting engine and the text from the speech recognition module and uses them to generate a new more accurate query that is void of any vocabulary mismatch in the underlying domain (paragraph 0050-0052). Based on the needs of the domain general AI platform or system, the human-in-the-loop ML module automatically instantiates specific teaching actions and/or instances and provides them to the natural language AI engine or the AI-enabled distributed deep semantic data compositor (paragraphs 0050-0052, 0054).
		Therefore, it would have been obvious to one of ordinary skill in the art before the effective filling date of the claimed invention, having the teachings of Dinu and Mostafazadeh before him/her, to modify Dinu generate tuning data, associated with at least one dialogue state tracking task because that would provide a natural language understanding and dialogue engine, answering questions about any kind of underlying data, does not require pre-defined templates or predefined patterns, and generates interactive reports as answers to queries as taught by Mostafazadeh (paragraph 0024). 

As to Claim 2, Dinu teaches the claimed limitations:
“wherein performing one or more automated actions comprises automatically tuning the one or more artificial intelligence techniques using at least a portion of the generated tuning data” as (paragraphs 0004, 0022-0023).

As to Claim 3, Dinu does not explicitly teach the claimed limitation “wherein performing one or more automated actions comprises predicting at least one dialogue state related to the input dialogue data by processing, using the one or more tuned artificial intelligence techniques, at least a portion of the input dialogue data”.
Mostafazadeh teaches (paragraphs 0046, 0051). 
		Therefore, it would have been obvious to one of ordinary skill in the art before the effective filling date of the claimed invention, having the teachings of Dinu and Mostafazadeh before him/her, to modify Dinu tuned artificial intelligence techniques, at least a portion of the input dialogue data because that would provide a natural language understanding and dialogue engine, answering questions about any kind of underlying data, does not require pre-defined templates or predefined patterns, and generates interactive reports as answers to queries as taught by Mostafazadeh (paragraph 0024). 

As to Claim 8, Dinu does not explicitly teach the claimed limitation 	“wherein performing one or more automated actions comprises automatically training at least one multi-domain automated conversation system using the at least one predicted dialogue state”. 
Mostafazadeh teaches (abstract, paragraphs 0006, 0024, 0038-0039, 0045-0048). 
		Therefore, it would have been obvious to one of ordinary skill in the art before the effective filling date of the claimed invention, having the teachings of Dinu and Mostafazadeh before him/her, to modify Dinu at least one multi-domain automated conversation system using the at least one predicted dialogue state because that would provide a natural language understanding and dialogue engine, answering questions about any kind of underlying data, does not require pre-defined templates or predefined patterns, and generates interactive reports as answers to queries as taught by Mostafazadeh (paragraph 0024). 

As to Claim 9, Dinu teaches the claimed limitations:
 	“wherein obtaining input dialogue data comprises obtaining dialogue content involving at least one user and at least one automated conversation system, and query slot information related to the dialogue content” as (paragraphs (paragraphs 0039-0040, 0044, 0048, 00547, 0081, 0091).

As to Claim 10, Dinu teaches the claimed limitations:
 	“wherein identifying one or more historical dialogue examples comprises comparing semantic similarity of one or more embeddings within the input dialogue data and one or more embeddings within the at least one historical dialogue dataset” as (paragraphs 0040, 0042, 0044-0046, 0048). 
		Mostafazadeh teaches (paragraphs 0009, 0050-0053, 0062-0055). 

As to Claim 11, Dinu teaches the claimed limitations:
 	“wherein generating one or more prompts comprises generating, based at least in part on the one or more historical dialogue examples, one or more prompts related to at least a portion of the input dialogue data in at least one of a zero- shot context and a few-shot context”  as (paragraphs 0040, 0048, 0051-0055).
		Mostafazadeh teaches (paragraphs 0035-0036, 0040, 0042-0046, 0048, 0052-0055). 

As to Claim 12, Dinu teaches the claimed limitations:
 	“wherein the processor is further operatively coupled to the memory to execute the program instructions to: train at least a portion of the one or more artificial intelligence techniques using at least a portion of the one or more prompts” as (paragraphs 0004, 0022, 0040, 0048, 0052, 0067, 0093, 0147).
		Mostafazadeh teaches (abstract, paragraphs 0006-0007, 0032-0033, 0036-0038, 0043, 0046, 0050-0053, 0055-0057, 0065). 

As to Claim 13, Dinu does not explicitly teach the claimed limitation “wherein training at least a portion of the one or more artificial intelligence techniques using at least a portion of the one or more prompts comprises encoding the at least a portion of the one or more prompts and using, via the one or more artificial intelligence techniques, the encoded prompts to predict one or more slot values in connection with the input dialogue data”.  
		Mostafazadeh teaches (paragraphs 0008, 0024, 0051, 0057, 0065).
		Therefore, it would have been obvious to one of ordinary skill in the art before the effective filling date of the claimed invention, having the teachings of Dinu and Mostafazadeh before him/her, to modify Dinu using, via the one or more artificial intelligence techniques, the encoded prompts to predict one or more slot values in connection with the input dialogue data because that would provide a natural language understanding and dialogue engine, answering questions about any kind of underlying data, does not require pre-defined templates or predefined patterns, and generates interactive reports as answers to queries as taught by Mostafazadeh (paragraph 0024). 

As to claims 14-16 are rejected under 35 U.S.C 103(a), the limitations therein have substantially the same scope as claims 1-3. In addition, Dinu teaches one embodiment, code is stored on a computer-readable storage medium in form of a computer program comprising a plurality of instructions executable by one or more processors (paragraph 0120). Therefore, these claims are rejected for at least the same reasons as claims 1-3.

As to claims 17-19 are rejected under 35 U.S.C 103(a), the limitations therein have substantially the same scope as claims 1-3. In addition, Dinu teaches The systems and methods described herein may be used for a variety of purposes, by way of example and without limitation, for use in systems associated with machine control, machine locomotion, machine driving, synthetic data generation, model training (paragraph 0022. Therefore, these claims are rejected for at least the same reasons as claims 1-3.

As to Claim 20, Dinu teaches the claimed limitations:
“wherein software implementing the method is provided as a service in a cloud environment” as (paragraphs 0004, 002200024, 0026). 

4.	Claim 4 is rejected under 35 U.S.C. 103(a) as being unpatentable over Dinu et al. (US Patent Publication No. 2024/0354319 A1) as applied to claims 1 above, and further in view of Mostafazadeh et al. (US Patent Publication No. 2022/0343903 A1) and Thomson et al. (US Patent Publication No. 2018/0330721 A1, hereinafter “Thomson”).
As to Claim 4, Dinu does not explicitly teach the claimed limitation 	“wherein predicting at least one dialogue related to the input dialogue data using the one or more tuned artificial intelligence techniques comprises processing at least a portion of the input dialogue data in conjunction with at least a portion of the one or more prompts using at least one language model”.  
Thomson teaches (paragraphs 0048, 0264, 0267, 0289). 
		Therefore, it would have been obvious to one of ordinary skill in the art before the effective filling date of the claimed invention, having the teachings of Dinu, Mostafazadeh and Thomson before him/her, to modify Dinu tuned artificial intelligence techniques, at least a portion of the input dialogue data because that would provide a natural language understanding and dialogue engine, answering questions about any kind of underlying data, does not require pre-defined templates or predefined patterns, and generates interactive reports as answers to queries as taught by Mostafazadeh (paragraph 0024). Or predicting at least one dialogue related to the input dialogue data using the one or more tuned artificial intelligence techniques provide a hierarchical belief state that enables accurate and robust decision making for responding to complex user requests while maintaining tractability and computational efficiency as taught by Thomson (paragraph 0027).

5.	Claim 5 is rejected under 35 U.S.C. 103(a) as being unpatentable over Dinu et al. (US Patent Publication No. 2024/0354319 A1) as applied to claims 1 above, and further in view of Mostafazadeh et al. (US Patent Publication No. 2022/0343903 A1), Thomson et al. (US Patent Publication No. 2018/0330721 A1), and Wang et al. (US Patent Publication No. 2021/0232773 A1, hereinafter “Wang”).
As to Claim 5, Dinu does not explicitly teach the claimed limitation 	“wherein the at least one language model comprises one or more of at least one encoder-decoder model and at least one decoder-only model”.  
Wang teaches (abstract, paragraphs 0017-0018, 0034-0035, 0041-0043, 0052-0054).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filling date of the claimed invention, having the teachings of Dinu, Mostafazadeh and Thomson before him/her, to modify Dinu tuned artificial intelligence techniques, at least a portion of the input dialogue data because that would provide a natural language understanding and dialogue engine, answering questions about any kind of underlying data, does not require pre-defined templates or predefined patterns, and generates interactive reports as answers to queries as taught by Mostafazadeh (paragraph 0024). Or predicting at least one dialogue related to the input dialogue data using the one or more tuned artificial intelligence techniques provide a hierarchical belief state that enables accurate and robust decision making for responding to complex user requests while maintaining tractability and computational efficiency as taught by Thomson (paragraph 0027). Or language model comprises one or more of at least one encoder-decoder model provides systems and methods that implement a unified visual-dialogue transformer-based approach or model that leverages a Bidirectional Encoder-Decoder Representations from Transformers (BERT) pre-trained language models for visual dialogue tasks as taught by Wang (paragraph 0017).

6.	Claim 6 is rejected under 35 U.S.C. 103(a) as being unpatentable over Dinu et al. (US Patent Publication No. 2024/0354319 A1) as applied to claims 1 above, and further in view of Mostafazadeh et al. (US Patent Publication No. 2022/0343903 A1), and Hanson et al. (US Patent Publication No. 2022/0199079 A1, hereinafter “Hanson”).
As to Claim 6, Dinu does not explicitly teach the claimed limitation 	“wherein performing one or more automated actions comprises automatically generating at least one response, based at least in part on the at least one predicted dialogue state, in connection with an automated conversation system involved in the input dialogue data”.
 Hanson teaches (paragraphs 0070, 0085, 0210, 0271). 
		Therefore, it would have been obvious to one of ordinary skill in the art before the effective filling date of the claimed invention, having the teachings of Dinu, Mostafazadeh and Hanson before him/her, to modify Dinu tuned artificial intelligence techniques, at least a portion of the input dialogue data because that would provide a natural language understanding and dialogue engine, answering questions about any kind of underlying data, does not require pre-defined templates or predefined patterns, and generates interactive reports as answers to queries as taught by Mostafazadeh (paragraph 0024). Or predicted dialogue state, in connection with an automated conversation system because that would provide services (to facilitate social interaction between or among users as taught by Hanson (paragraph 0004).

7.	Claim 7 is rejected under 35 U.S.C. 103(a) as being unpatentable over Dinu et al. (US Patent Publication No. 2024/0354319 A1) as applied to claims 1 above, and further in view of Mostafazadeh et al. (US Patent Publication No. 2022/0343903 A1), and Gupta et al. (US Patent Publication No. 2024/0220732 A1, hereinafter “Gupta”).
As to Claim 7, Dinu does not explicitly teach the claimed limitation 	“wherein performing one or more automated actions comprises automatically training at least a portion of the one or more tuned artificial intelligence techniques using feedback related to the at least one predicted dialogue state”.  
Gupta teaches (abstract, paragraphs 0030, 0063).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filling date of the claimed invention, having the teachings of Dinu, Mostafazadeh and Gupta before him/her, to modify Dinu tuned artificial intelligence techniques, at least a portion of the input dialogue data because that would provide a natural language understanding and dialogue engine, answering questions about any kind of underlying data, does not require pre-defined templates or predefined patterns, and generates interactive reports as answers to queries as taught by Mostafazadeh (paragraph 0024). Or tuned artificial intelligence techniques using feedback related to the at least one predicted dialogue state because that would provide benefits by making services more accurate and efficient. These techniques are flexible, and so can apply a wide variety of tasks and domains as taught by Gupta (paragraph 0024).

Examiner’s Note
Examiner has cited particular columns/paragraph and line numbers in the references applied to the claims above for the convenience of the applicant. Although the specified citations are representative of the teachings of the art and are applied to specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested from the applicant in preparing responses, to fully consider the references in entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner.
In the case of amending the Claimed invention, Applicant is respectfully requested to indicate the portion(s) of the specification which dictate(s) the structure relied on for proper interpretation and also to verify and ascertain the metes and bounds of the claimed invention. This will assist in expediting compact prosecution.  MPEP 714.02 recites: “Applicant should also specifically point out the support for any amendments made to the disclosure. See MPEP § 2163.06. An amendment which does not comply with the provisions of 37 CFR 1.121(b), (c), (d), and (h) may be held not fully responsive. See MPEP § 714.”  Amendments not pointing to specific support in the disclosure may be deemed as not complying with provisions of 37 C.F.R.  1.131(b), (c), (d), and (h) and therefore held not fully responsive.  Generic statements such as “Applicants believe no new matter has been introduced” may be deemed insufficient.
Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to James Hwa whose telephone number is 571-270-1285, email address is james.hwa@uspto.gov. The examiner can normally be reached on 9:00 am – 5:30 pm EST. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ajay Bhatia can be reached on 571-272-3906. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. 
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only, for more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the PAIR system contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
02/09/2026											
										
/SHYUE JIUNN HWA/
Primary Examiner, Art Unit 2156

Read full office action

Prosecution Timeline

Jun 28, 2023

Application Filed

Feb 13, 2026

Non-Final Rejection mailed — §103

Apr 24, 2026

Interview Requested

May 12, 2026

Applicant Interview (Telephonic)

May 12, 2026

Examiner Interview Summary

May 13, 2026

Response Filed

Precedent Cases

Applications granted by this same examiner with similar technology

18/924,937

Patent 12639342

SYSTEMS AND METHODS FOR DETERMINING DATA-ORIGIN LINKS BETWEEN NAMED ENTITIES

1y 7m to grant Granted May 26, 2026

17/984,717

Patent 12632786

NAMED ENTITY BIAS DETECTION AND MITIGATION TECHNIQUES FOR SENTENCE SENTIMENT ANALYSIS

3y 6m to grant Granted May 19, 2026

18/049,138

Patent 12619873

Providing unlabelled training data for training a computational model

3y 6m to grant Granted May 05, 2026

17/819,047

Patent 12602571

NETWORK PARTITIONING FOR SENSOR-BASED SYSTEMS

3y 8m to grant Granted Apr 14, 2026

18/819,789

Patent 12596683

LOG-STRUCTURED FILE SYSTEM FOR A ZONED BLOCK MEMORY DEVICE

1y 7m to grant Granted Apr 07, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.

Strategy Recommendation AI-generated — please review before filing

Get a prosecution strategy drawn from examiner precedents, rejection analysis, and claim mapping.

Typically takes 5-10 seconds — AI-generated, attorney review required before filing

Prosecution Projections

1-2

Expected OA Rounds

83%

Grant Probability

99%

With Interview (+38.4%)

3y 0m (~1m remaining)

Median Time to Grant

Low

PTA Risk

Based on 858 resolved cases by this examiner. Grant probability derived from career allowance rate.