Prosecution Insights
Last updated: April 19, 2026
Application No. 18/341,224

DIALOG MANAGEMENT WITH MULTIPLE MODALITIES

Final Rejection — §103, §DP
Filed: Jun 26, 2023
Examiner: HE, JIALONG
Art Unit: 2659
Tech Center: 2600 — Communications
Assignee: Amazon Technologies, Inc.
OA Round: 4 (Final)
Grant Probability: 81% (Favorable); 99% with interview
Expected OA Rounds: 5-6
Time to Grant: 3y 1m

Examiner Intelligence

Career Allow Rate: 81% (742 granted / 911 resolved; +19.4% vs TC avg) — above average
Interview Lift: +33.1% for resolved cases with an interview — strong
Typical Timeline: 3y 1m average prosecution; 23 applications currently pending
Career History: 934 total applications across all art units

Statute-Specific Performance

§101: 13.7% (-26.3% vs TC avg)
§103: 39.7% (-0.3% vs TC avg)
§102: 15.6% (-24.4% vs TC avg)
§112: 19.6% (-20.4% vs TC avg)
Tech Center averages are estimates. Based on career data from 911 resolved cases.

Office Action

§103 §DP
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Response to Amendments and Arguments

Regarding the obviousness-type double patenting rejection, applicant amended the independent claims and requested that the double patenting rejection be held in abeyance (Remarks, page 8). Comparing the amended independent claims with the claims of the parent patent, the examiner finds that the independent claims are broader than a claim of the parent patent. The obviousness-type double patenting rejection is therefore maintained.

Regarding the rejection under 35 U.S.C. § 103, applicant amended independent claims 1 and 11 by adding new limitations. These limitations were presented during an interview (see the interview summary mailed on 12/29/2025). The added limitations relate to sharing context information between applications. The added feature is known in the prior art. For example, a reference submitted by the applicant in an IDS (Busayapongchai, US PG Pub. 2005/0288936, paragraph [0030]) discloses the added claim limitations. In the Remarks (page 9), applicant generally alleged that the previously cited references fail to teach the limitations newly added to independent claims 1 and 11. In the following rejection, the examiner combines Busayapongchai (submitted by the applicant in an IDS filed on 09/12/2023) with the previously cited prior art references to reject the amended claims. Applicant's arguments have been considered but are moot because they do not apply to the combined teaching relied upon in the following rejection.
Double Patenting

The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the "right to exclude" granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).

A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b). The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection.
A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional, the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to a final Office action, see 37 CFR 1.113(c). A request for reconsideration, while not provided for in 37 CFR 1.113(c), may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13. The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.

Claims 2 and 11 are rejected on the ground of nonstatutory double patenting as being unpatentable over claim 6 or 14 of U.S. Patent No. 11,688,402. Although the conflicting claims are not identical, they are not patentably distinct from each other because the instant independent claims are broader than a dependent claim of the parent '402 patent. In other words, a claim of the parent '402 patent anticipates the instant independent claims. Anticipation is "the ultimate or epitome of obviousness" (In re Kalm, 154 USPQ 10 (CCPA 1967); see also In re Dailey, 178 USPQ 293 (CCPA 1973) and In re Pearson, 181 USPQ 641 (CCPA 1974)).

Claim Rejections - 35 USC § 103

Claims 2, 4-11, and 14-21 are rejected under 35 U.S.C. 103 as being unpatentable over Kalns et al. (US PG Pub.
2014/0310001, submitted in an IDS, referred to as Kalns) in view of Busayapongchai et al. (US PG Pub. 2005/0288936, referred to as Busayapongchai), and further in view of Nash (US PG Pub. 2013/0159377, referred to as Nash). Kalns was cited previously for rejecting the original claims of the parent patents. Kalns discloses that a user can converse with a virtual personal assistant (VPA). The VPA maintains a dialog context so that the user can continue or resume a previous dialog session at a later time ([0108], Fig. 9). Kalns further discloses that the VPA provides responses in text or speech format (the claimed "first modality") and also outputs images (the claimed "second modality"). Kalns further discloses that the VPA can deduce higher-level or implicit/unstated information about the user ([0019], [0060]). Busayapongchai discloses that, in a conversational environment, a dialog context is stored and shared between different applications (Busayapongchai, [0030]: the context may be transferred across applications; for example, if a user asked for "Chinese in Atlanta" and then switched to weather, the system may ask whether the user is interested in Atlanta weather, and the chance of recognizing "Atlanta" compared to other cities may be increased; also see [0034-0035]). Nash discloses an automated personal assistant (APA) that proactively provides suggestions to a user by considering context information and the user's schedule ([0035]: an agent 104 may generate suggestions for delivery to a user 106, may provide information for delivery to a user 106 or to another agent 104 of the user 106, may provide information that may be used by the platform 102 to calculate context variables, and may automate tasks for a user 106 without requiring the user 106 to initiate the task).
Nash discloses that the APA proactively provides information such as coupons, ads, or offers (Abstract, [0080-0084]). Regarding claims 2 and 11, Kalns discloses a computer-implemented system and method ([0092], Fig. 1, a computer-implemented virtual personal assistant (VPA) for conducting spoken dialogs with a user), comprising: under control of a computing system comprising one or more processors configured with specific computer-executable instructions (Fig. 1 and Fig. 11), receiving audio data representing at least one utterance ([0018], Fig. 5, a user has a spoken dialog with a VPA); generating, using natural language understanding ("NLU") processing based at least partly on the audio data, command data that represents the at least one utterance ([0017-0018], conversation in natural language between the VPA and a user; Fig. 9); sending the command data to a first application resulting in first output content being presented in an audio modality during a first period of time in response to receiving the command data ([0058-0059], a user performs online shopping by speaking voice requests to a VPA, and the VPA presents search results in text, voice, and image formats, see Fig. 9; note that the output text or voice format corresponds to the claimed "first modality"); receiving second output content in a second modality different from the first modality, wherein the second output content is associated with the audio output content and a second application ([0108], the VPA also displays an image associated with the text/voice search results as shown in Fig. 9, #926; note that the displayed image is the claimed "second modality", and the claimed "second application" is interpreted as software programs for displaying images), wherein the second output content comprises content not included in the first output content, and wherein the second output content is received without a request for output content in the second modality (Kalns, [0020-0021], [0108], Fig.
9, the VPA presents a promoted product and pictures without the user's request); and causing presentation of the second output content in the second modality during a second period of time beginning subsequent to a beginning of the first period of time (Fig. 9, the image #926 is displayed after displaying text / outputting voice #924, which is the claimed "second period of time beginning subsequent to a beginning of the first period of time"). Kalns discloses, in response to a user's voice command, outputting text and image responses (Fig. 9). Kalns further discloses that the system may also generate and output synthesized speech ([0057]). In addition, Kalns further discloses a dialog context builder that extracts context from the user's input. The context is used when processing the user's requests (Kalns, [0062-0070], processing newly received user input based on the stored context; if a user asks to "search movie", the context intent extractor may search the dialog context for other instances of the "search movie" intent). Although Kalns implicitly discloses sharing and reusing context information, the examiner further cites Busayapongchai to explicitly disclose the newly added limitations: storage of context data associated with the first output content; determining, during a second period of time subsequent to the first period of time, that a second application is to use the context data; and sending the context data to the second application (Busayapongchai, [0030], [0034-0035]). Kalns discloses that when the VPA interacts with a user, the VPA promotes a product with a 20% discount; the output (product picture) was not requested by the user. Although Kalns meets the newly added limitations, the examiner further cites Nash, which discloses an automated personal assistant (APA) that proactively provides suggestions based on context information and the user's schedule without the user's request (Nash, [0035], [0080-0084]). Nash meets the added limitations "wherein the second output …" in claims 2 and 11.
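The claim mapping above traces a concrete pipeline: NLU produces command data, a first application answers in an audio modality and stores context, and a second application later uses that shared context to present unrequested content in a visual modality. A minimal editorial sketch of that flow follows; every function and field name here is hypothetical, drawn from neither the claims verbatim nor the cited references:

```python
# Hypothetical sketch of the claimed multi-modal dialog flow.
# None of these names come from the application or the cited art.
from dataclasses import dataclass, field

@dataclass
class ContextStore:
    """Context data shared across applications (cf. Busayapongchai [0030])."""
    data: dict = field(default_factory=dict)

    def put(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)

def nlu(utterance_text):
    # Toy NLU: derive command data from the (already transcribed) utterance.
    return {"intent": "search", "query": utterance_text}

def first_app(command, ctx):
    # First application: presents output in an audio modality and
    # stores context associated with that output.
    ctx.put("last_query", command["query"])
    return {"modality": "audio", "content": f"Results for {command['query']}"}

def second_app(ctx):
    # Second application: uses the shared context to present visual
    # content without an explicit user request (cf. Kalns Fig. 9, Nash [0035]).
    return {"modality": "visual", "content": f"Image for {ctx.get('last_query')}"}

ctx = ContextStore()
cmd = nlu("gift ideas")
audio_out = first_app(cmd, ctx)     # first period of time, first modality
visual_out = second_app(ctx)        # later period, second modality, unrequested
```

The point of the sketch is only the data flow the rejection relies on: the second application never receives a request; it reads the stored context.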
Kalns, Busayapongchai, and Nash are all related to a user interacting with a computer. It would have been obvious to a person having ordinary skill in the art at the time the invention was filed to modify Kalns' teaching with Busayapongchai's teaching to share context information between applications. It would also have been obvious to modify Kalns to implement proactive functions so that a virtual assistant provides information without the user explicitly requesting it. One having ordinary skill in the art would have been motivated to make such a modification to improve recognition of subsequent requests and to provide relevant information proactively (Busayapongchai, [0030]; Nash, [0035]). In addition, all the claimed elements were known in the prior art, and one skilled in the art could have combined the elements as claimed by known methods; in the combination, each element merely would have performed the same function as it did separately. "A combination of familiar elements according to known methods is likely to be obvious when it does no more than yield predictable results." KSR, 550 U.S. ___, 82 USPQ2d at 1395 (2007). One of ordinary skill in the art would have recognized that the results of the combination were predictable. Regarding claims 4 and 14, the combined teaching of Kalns in view of Busayapongchai and Nash further discloses that the first output content comprises an audio presentation, and wherein the second output content comprises a visual presentation (Kalns, [0108], the VPA outputs speech and images showing products as shown in Fig. 9, #924, #926). Regarding claim 5, Kalns further discloses that the one or more processors are programmed by further executable instructions to manage a multi-turn dialog comprising the at least one utterance, at least a second utterance, and at least one system-generated response (Kalns, Fig. 5 and Fig. 9, showing a multi-turn dialog between a user and the VPA).
Regarding claim 6, the combined teaching of Kalns in view of Busayapongchai and Nash further discloses that the one or more processors are programmed by further executable instructions to determine to send the command data to the first application based at least partly on a subject of the at least one utterance (Kalns, [0059], Fig. 5, Fig. 9, a user performs online shopping by having a dialog with the VPA). Regarding claim 7, the combined teaching of Kalns in view of Busayapongchai and Nash further discloses that the one or more processors are programmed by further executable instructions to: generate, using automatic speech recognition ("ASR") processing and the audio data, utterance data representing the at least one utterance (Kalns, [0041], recognizing the user's speech using ASR), wherein the command data being generated using NLU processing based at least partly on the audio data comprises the command data being generated using NLU processing and the utterance data (Kalns, [0035], [0039-0041], Fig. 5, the VPA understands the user's voice request in natural language). Regarding claim 8, the combined teaching of Kalns in view of Busayapongchai and Nash further discloses that the one or more processors are programmed by further executable instructions to: store context data, wherein the context data is generated by the first application ([0018], [0020], [0024], Fig. 2, #212); generate, using second NLU processing based at least partly on second audio data, second command data ([0098], continuation of a dialog session based on context, Fig. 5 and Fig. 9); determine that a second application is to use the context data ([0036], Fig. 2, Fig. 7); and send the second command data and the context data to the second application ([0018], [0036], [0057], Fig. 9, displaying an image of a suggested product).
Regarding claim 9, the combined teaching of Kalns in view of Busayapongchai and Nash further discloses that the one or more processors are programmed by further executable instructions to: generate context data during the NLU processing, wherein the context data is based on at least one of: named entity data associated with the at least one utterance, or utterance data representing a plurality of previously processed utterances ([0036], [0084-0085], [0087], Fig. 7, #710; Fig. 9, continuing to shop for a gift based on context); store the context data (Fig. 7, #710); and generate second command data using second NLU processing based at least partly on second audio data and the context data ([0107-0109], Fig. 9, continuing the previous gift-shopping dialog based on the dialog context and the user's intent). Regarding claim 10, the combined teaching of Kalns in view of Busayapongchai and Nash further discloses: an audio input device configured to generate the audio data based on the at least one utterance ([0107-0108], using a microphone to receive the user's speech); and an audio output device configured to present at least one of the first output content or the second output content ([0057], [0115], Fig. 9, generating synthesized speech and outputting it using a speaker). Regarding claim 15, the combined teaching of Kalns in view of Busayapongchai and Nash further discloses that sending the command data to the first application is performed by a dialog manager application, and wherein causing presentation of the second output content in the second modality comprises sending, by the dialog manager application, the second output content to the second application configured to present content in the second modality ([0018-0020], [0050], Fig. 9 and Fig. 10).
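The context carry-over cited for claims 8 and 9, and illustrated by Busayapongchai's "Chinese in Atlanta" then weather example, can be sketched as a toy two-turn interpreter. This is an editorial illustration only; the entity list and interpretation logic are invented, not the cited reference's implementation:

```python
# Hypothetical sketch of context carry-over between dialog turns,
# in the spirit of Busayapongchai [0030]. All names are invented.

def extract_entities(utterance):
    # Toy named-entity extraction against a tiny gazetteer.
    known_cities = {"Atlanta", "Boston"}
    return {w for w in utterance.replace(",", "").split() if w in known_cities}

def interpret(utterance, context):
    # Second NLU pass: reuse a stored entity when the new utterance
    # omits one, so "weather" inherits the earlier "Atlanta".
    entities = extract_entities(utterance)
    city = next(iter(entities), context.get("city"))
    if entities:
        context["city"] = next(iter(entities))
    return {"utterance": utterance, "city": city}

context = {}
first = interpret("Chinese in Atlanta", context)
second = interpret("what about the weather", context)  # inherits "Atlanta"
```

This mirrors the examiner's reading: context generated during one NLU pass is stored and then consulted when generating the second command data.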
Regarding claim 16, the combined teaching of Kalns in view of Busayapongchai and Nash further discloses that causing presentation of the second output content in the second modality comprises causing presentation of the second output content in the second modality using a different output device than an output device used by the first application to cause presentation of the first output content in the first modality ([0057], outputting synthesized speech using a speaker and outputting images on a display screen, which is a different output device than the speaker used for outputting TTS speech). Regarding claim 17, the combined teaching of Kalns in view of Busayapongchai and Nash further discloses storing context data associated with the at least one utterance, wherein the context data comprises the second output content ([0036], Fig. 2, #212, #216; Fig. 8, #826). Regarding claim 18, the combined teaching of Kalns in view of Busayapongchai and Nash further discloses: receiving second audio data representing a second utterance (Fig. 7, Fig. 9 and Fig. 10; continuing the dialog based on previous contexts); determining that the second utterance is part of a conversation including the at least one utterance ([0018], Fig. 7, Fig. 9); and determining, based at least partly on the context data, to cause presentation of the second output content in the second modality in response to receiving the second audio data ([0107-0108], Fig. 9, continuing the previous dialog based on contexts and presenting text and image outputs). Regarding claim 19, the combined teaching of Kalns in view of Busayapongchai and Nash further discloses: determining to send the command data to the first application based at least partly on a subject of the at least one utterance ([0047-0048], outputting search results based on topic or subject). Regarding claim 20, the combined teaching of Kalns in view of Busayapongchai and Nash further discloses: generating context data during the NLU processing (Fig.
3, #210); storing the context data (Fig. 3, #212); and generating second command data using second NLU processing based at least partly on second audio data and the context data ([0036], [0061-0066], Fig. 7, #712, #716). Regarding claim 21, the combined teaching of Kalns in view of Busayapongchai and Nash further discloses: the receiving the audio data comprises receiving the audio data over a network connection to a computing device ([0112], Fig. 11, #1130). Claim 22 is rejected under 35 U.S.C. 103 as being unpatentable over Kalns in view of Busayapongchai and Nash, and further in view of Jacobsen et al. (US PG Pub. 2009/0099836, referred to as Jacobsen). Regarding claim 22, Kalns discloses that a user can interact with a VPA using voice or text input (Kalns, [0028], Fig. 9, display screen; [0057], outputting machine-generated speech using a speaker). Although Kalns implicitly discloses outputting to a second device, the examiner further cites Jacobsen to show "the second output content to a second output device different from a first output device through which the first output content was presented, wherein the second output content is presented using the second output device" (Jacobsen, [0042-0043], if content cannot be displayed on a device #100, transmitting the content to a remote computer #125 with a high-resolution display; see Fig. 1, a wearable device #100 and a wirelessly connected computer #125). It would have been obvious to a person having ordinary skill in the art at the time the invention was filed to modify Kalns' teaching with Jacobsen's teaching to transmit certain content to a remote computer for display after outputting audio. One having ordinary skill in the art would have been motivated to make such a modification because the remote computer has a high-resolution display and is easy to read.
In addition, all the claimed elements were known in the prior art and one skilled in the art could have combined the elements as claimed by known methods, and in the combination each element merely would have performed the same function as it did separately. "A combination of familiar elements according to known methods is likely to be obvious when it does no more than yield predictable results." KSR, 550 U.S. ___, 82 USPQ2d at 1395 (2007). One of ordinary skill in the art would have recognized that the results of the combination were predictable.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jialong He, whose telephone number is (571) 270-5359. The examiner can normally be reached Monday – Friday, 8:00AM – 4:30PM, EST. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Pierre Desir, can be reached at (571) 272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /JIALONG HE/Primary Examiner, Art Unit 2659

Prosecution Timeline

Jun 26, 2023
Application Filed
Sep 12, 2023
Response after Non-Final Action
Nov 09, 2024
Non-Final Rejection — §103, §DP
Jan 24, 2025
Interview Requested
Jan 30, 2025
Applicant Interview (Telephonic)
Jan 30, 2025
Examiner Interview Summary
Feb 13, 2025
Response Filed
Feb 27, 2025
Final Rejection — §103, §DP
May 10, 2025
Interview Requested
May 20, 2025
Applicant Interview (Telephonic)
May 20, 2025
Examiner Interview Summary
Jul 01, 2025
Request for Continued Examination
Jul 02, 2025
Response after Non-Final Action
Aug 04, 2025
Examiner Interview (Telephonic)
Sep 24, 2025
Examiner Interview Summary
Oct 01, 2025
Non-Final Rejection — §103, §DP
Dec 22, 2025
Applicant Interview (Telephonic)
Dec 22, 2025
Examiner Interview Summary
Jan 30, 2026
Response Filed
Feb 22, 2026
Final Rejection — §103, §DP (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12597426
METHOD AND SYSTEM FOR GENERATING SYMPATHETIC BACK-CHANNEL SIGNAL
Granted Apr 07, 2026 (2y 5m to grant)
Patent 12579721
Generating video content from user input data
Granted Mar 17, 2026 (2y 5m to grant)
Patent 12581165
SYSTEM AND METHOD FOR AUDIO VISUAL CONTENT CREATION AND PUBLISHING WITHIN A CONTROLLED ENVIRONMENT
Granted Mar 17, 2026 (2y 5m to grant)
Patent 12573360
AUDIOVISUAL CONTENT RENDERING WITH DISPLAY ANIMATION SUGGESTIVE OF GEOLOCATION AT WHICH CONTENT WAS PREVIOUSLY RENDERED
Granted Mar 10, 2026 (2y 5m to grant)
Patent 12561535
ELECTRONIC APPARATUS FOR REAL-TIME CONVERSATION INTERPRETATION AND METHOD FOR CONTROLLING THE SAME
Granted Feb 24, 2026 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 81% (99% with interview, +33.1%)
Median Time to Grant: 3y 1m
PTA Risk: High
Based on 911 resolved cases by this examiner. Grant probability derived from career allow rate.
