Prosecution Insights
Last updated: April 19, 2026
Application No. 18/435,024

CROSS-ASSISTANT COMMAND PROCESSING

Status: Non-Final OA (§103), Round 3
Filed: Feb 07, 2024
Examiner: PULLIAS, JESSE SCOTT
Art Unit: 2655
Tech Center: 2600 — Communications
Assignee: Amazon Technologies, Inc.
Grant Probability: 83% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 2y 8m
With Interview: 96%

Examiner Intelligence

Career Allow Rate: 83% (873 granted / 1052 resolved; +21.0% vs TC avg; above average)
Interview Lift: +13.0% (moderate), among resolved cases with an interview
Typical Timeline: 2y 8m average prosecution; 47 applications currently pending
Career History: 1099 total applications across all art units

Statute-Specific Performance

§101: 15.0% (-25.0% vs TC avg)
§103: 50.4% (+10.4% vs TC avg)
§102: 19.7% (-20.3% vs TC avg)
§112: 4.9% (-35.1% vs TC avg)
Tech Center averages are estimates. Based on career data from 1052 resolved cases.
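The per-statute deltas above are internally consistent: subtracting each reported delta from the statute's rate backs out the same implied Tech Center average. A quick illustrative check, with the values transcribed from the figures above:

```python
# Back out the implied Tech Center average from each statute's allow rate
# and its reported delta (values transcribed from the dashboard above).
stats = {
    "101": (0.150, -0.250),
    "103": (0.504, +0.104),
    "102": (0.197, -0.203),
    "112": (0.049, -0.351),
}
for statute, (rate, delta) in stats.items():
    implied_tc_avg = rate - delta
    print(f"§{statute}: implied TC avg = {implied_tc_avg:.1%}")
```

Every statute implies the same Tech Center average of about 40%, which suggests the deltas were all computed against a single baseline.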

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first-inventor-to-file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 11/25/25 has been entered.

This Office action is in response to correspondence filed 11/25/25 regarding application 18/435,024, in which claims 1, 6, 11, and 16 were amended. Claims 1-8, 10-18, and 20 are pending and have been considered.

Response to Arguments

Applicant's arguments with respect to section II and the rejection of the claims under 35 U.S.C. 102 have been considered but are moot in view of the new grounds of rejection under 35 U.S.C. 103, based in part on the newly discovered reference to Lee et al. (hereinafter Lee, U.S. Patent Application Publication 2020/0349952). Lee describes a voice command resolving system in which a voice assistant server queries an IoT server for device information about registered devices, including device ID information, i.e., an identifier of a second computing system, and transmits this to a hub device [0146]. The voice assistant server also transmits the function determination model to the hub device [0207], which is a directive to send second data to the second computing system, since the hub device checks whether the model is present for the given device and, if so, transmits text to the device [0140].

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-3, 6, 8, 11-13, 16, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Wood et al. (hereinafter Wood, U.S. Patent Application Publication 2017/0269975) in view of Lee et al. (hereinafter Lee, U.S. Patent Application Publication 2020/0349952).

Regarding Claim 1, Wood discloses: A computer-implemented method, comprising: receiving, by an electronic device, first data representing at least a portion of a first natural language input (e.g., speech input to virtual assistant arbiter; Fig. 2 elements 200, 206, and 218; also see corresponding process in Fig. 1 step 1 and [0024]); sending, from the electronic device to a first computing system (e.g., networked elements shown in Fig. 1 and corresponding elements in Fig. 2), the first data to cause the first computing system to perform first natural language processing using the first data (e.g., utterance sent to speech recognition engine; Fig. 2 elements 220 and 208; also see corresponding process in Fig. 1 step 2 and [0026]); receiving, by the electronic device and from the first computing system (e.g., speech recognition engine 208 returns a message 222, containing the text and intent of the user's speech, to the virtual assistant arbiter 206; also see corresponding process in Fig. 1 step 3 and [0027]), a directive to send second data to a second computing system (e.g., the virtual assistant arbiter next determines which of several different possible independent specialized virtual assistants to use to process the user's intent; that is, the intent serves as a directive as to which virtual assistant the command should be sent to, Fig. 2 elements 206, 210, 214 and [0036] in conjunction with steps 2 and 3 above; note particularly that high-level virtual assistant module 106 may be in communication with high-level virtual assistant network 107; in other words, the intent determination is performed by the networked elements), the second data representing a first natural language command determined by the first computing system (e.g., request/user intent is returned; [0027]; note the example command of "order pizza" in [0028]), the first natural language command corresponding to the portion of the first natural language input (e.g., message returned containing the intent of the user's speech; [0036]; further note at least one user request intent, determined based on the user request; [0040]), wherein the first natural language command is different from the first natural language input (e.g., intent arbitration selects one of the specialized virtual assistants based on the user request intent, which is performed based on created rules; [0041]; in other words, the selection and execution of the command is performed based on the determined intent, which is not an exact duplicate of the command, or is "different," by virtue of the inclusion of additional information, for example location; [0041]); and sending, from the electronic device to the second system (e.g., the virtual assistant arbiter 206 sends an intent notification 230 to the selected specialized virtual assistant 214; Fig. 2 elements 206, 214, 230; also see corresponding process in Fig. 1 step 6 and [0030]; high-level virtual assistant module 106 notifies the next independent specialized virtual assistant interested in the "order pizza" intent), the second data to cause the second computing system to perform second natural language processing using the second data to determine a response to the first natural language command determined by the first computing system (e.g., acknowledges the intent notification and instantiates a specialized virtual assistant; the specialized virtual assistant then initiates a dialog with the end user, who issues further speech; Fig. 2 elements 210, 234, 236, 238, and 240; also see corresponding process in Fig. 1 step 6 and [0031]; particularly that Network 116 consumes the request and responds to initiate an active Pizza Restaurant 1 Virtual Assistant session; further note that PR2VA 110 delivers a virtual assistant initiation request with its own independent specialized virtual assistant network, which here is Pizza Restaurant 2 (PR2) Network 112 [0028]).

Wood does not specifically mention an identifier of a second computing system and a directive to send second data to the second computing system.
In the same field of voice command devices (Abstract), Lee teaches an identifier of a second computing system and a directive to send second data to the second computing system (the voice assistant server queries the IoT server for device information about registered devices, including device ID information, i.e., an identifier of a second computing system, and transmits this to the hub device, [0146]; the voice assistant server also transmits the function determination model to the hub device, [0207], which is a directive to send second data to the second computing system, since the hub device checks whether the model is present for the given device and, if so, transmits text to the device, [0140]). It would have been obvious to one of ordinary skill in the art at the time of effective filing to combine the device ID information and function determination model as taught by Lee with the disclosure of Wood in order to confer the benefits of reducing user inconvenience and increasing response speed, as discussed in paras. [0004]-[0005] of Lee.

Regarding Claim 2, in addition to the elements stated above regarding claim 1, Wood further discloses: wherein sending the first data to the first computing system comprises using a first component of the electronic device to send the first data to the first computing system, the first component corresponding to the first computing system (e.g., note the communication network elements of the various client and other devices, particularly the internet, Bluetooth, TCP/IP communication, etc.; [0046] [0047] and Fig. 7 element 86; which serve to effect the communication performed in the various figures).

Regarding Claim 3, in addition to the elements stated above regarding claim 2, Wood further discloses: wherein sending the second data to the second computing system comprises using a second component of the electronic device to send the second data to the second computing system, the second component corresponding to the second system (e.g., note the communication network elements of the various client and other devices, particularly the internet, Bluetooth, TCP/IP communication, etc.; [0046] [0047] and Fig. 7 element 86; which serve to effect the communication performed in the various figures; further consider that communication over TCP/IP or the internet requires particular configurations and settings, for example assigning IP addresses and the like, which, under the broadest reasonable interpretation, program these elements to perform different and unique transmissions based on protocol, sender, and recipient; as such, these differing configurations can be said to be another, or "second," component, since when transmitting to one device element 86 will be set in a different manner than when transmitting to any of the other devices).

Regarding Claim 6, in addition to the elements stated above regarding claim 1, Wood further discloses: sending, to the second computing system, an indication that the first natural language input is part of a dialog conducted in a first language (e.g., after activation, signaling to the high-level virtual assistant module 106 that it is now taking control of the dialog with the user 100; [0032]); and receiving, from the second computing system in response to the second data and the indication, output data in the first language (e.g., vocalizing to the user 100 a greeting and next step; [0032]).

Regarding Claim 8, in addition to the elements stated above regarding claim 1, Wood further discloses: wherein receiving the first data comprises receiving data representing text of the portion of the first natural language input (e.g., the text of the utterance 104, along with the user's intent, is returned; [0027]).

Regarding Claim 11, claim 11 is directed to the system corresponding to the method presented in claim 1 and is rejected for the same reasons as stated above. Additionally, Wood further discloses: at least one processor (e.g., processor; [0048]); and at least one memory comprising instructions that, when executed by the at least one processor (e.g., a computer program product provides software instructions for the system for execution; [0047] [0048]), cause the system to perform the method of claim 1 (see rejection of claim 1 above).

Regarding Claim 12, claim 12 is directed to the system corresponding to the method presented in claim 2 and is rejected for the same reasons as stated above.

Regarding Claim 13, claim 13 is directed to the system corresponding to the method presented in claim 3 and is rejected for the same reasons as stated above.

Regarding Claim 16, claim 16 is directed to the system corresponding to the method presented in claim 6 and is rejected for the same reasons as stated above.

Regarding Claim 18, claim 18 is directed to the system corresponding to the method presented in claim 8 and is rejected for the same reasons as stated above.

Claims 4, 5, 7, 14, 15, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Wood in view of Lee, and in further view of Wang et al. (hereinafter Wang, U.S. Patent Application Publication 2019/0332680).

Regarding Claim 4, in addition to the elements stated above regarding claim 1, Wood and Lee fail to explicitly disclose: determining the first natural language input corresponds to a first language, wherein the first natural language command corresponds to a second language different from the first language. In the same field of voice assistants ([0061]), Wang discloses: determining the first natural language input corresponds to a first language ([0364]; the virtual personal assistant 2500 may optionally include a language identification subsystem 2504; the language identification subsystem 2504 can analyze a spoken word or phrase and determine the language in which the word or phrase was spoken), wherein the first natural language command corresponds to a second language different from the first language (e.g., Wood, now adapted such that the natural language command in the transcription text corresponds to a second language different from the first language as taught by Wang in [0360]; Fig. 25 illustrates an example of the audio input and output system of a virtual personal assistant 2500, where the audio input 2542 can be provided in one language, and natural language understanding and reasoning can be conducted in a different language; in the example of Fig. 25, the virtual personal assistant 2500 includes machine translation (MT) components; in various implementations, the machine translation components can process the output from an automatic speech recognition engine and convert the automatic speech recognition output from an input language into the internal processing language of the virtual personal assistant 2500). It would have been obvious to one of ordinary skill in the art at the time of effective filing to modify Wood and Lee as taught by Wang in order to confer the benefit of a quickly developable and deployable virtual assistant that can handle multilingual input and output, which allows the assistant to be used more easily by multilingual persons or groups of people, as discussed in paras. [0076]-[0078] of Wang.

Regarding Claim 5, in addition to the elements stated above regarding claim 1, Wood further discloses: receiving, from the second computing system, output data corresponding to the response (e.g., specialized virtual assistant 210 then initiates a dialog 240; [0036]; note further, after initiating an active Pizza Restaurant 1 Virtual Assistant session, vocalizing to the user a greeting and next steps; [0031] [0032]). Wood and Lee do not, however, explicitly disclose: determining that the output data is in the first language; and in response to determining that the output data is in the first language, causing presentation of an output based on the output data. In the same field of voice assistants ([0061]), Wang discloses: determining that the output data is in the first language ([0375]; the output generated by the virtual personal assistant engine 2512 is translated to the speaker's input language); and in response to determining that the output data is in the first language, causing presentation of an output based on the output data ([0374]; the NLG/TTS 2518 engines can formulate the task output as a natural language phrase and then convert the phrase to machine-synthesized speech; the speech can then be output using, for example, a speaker included in the virtual personal assistant 2500, or in the device into which the virtual personal assistant is incorporated). It would have been obvious to one of ordinary skill in the art at the time of effective filing to combine the translation as taught by Wang with the disclosures of Wood and Lee in order to confer the benefit of a quickly developable and deployable virtual assistant that can handle multilingual input and output, which allows the assistant to be used more easily by multilingual persons or groups of people, as discussed in paras. [0076]-[0078] of Wang.

Regarding Claim 7, in addition to the elements stated above regarding claim 1, Wood further discloses: receiving, from the second computing system, first output data (e.g., specialized virtual assistant 210 then initiates a dialog 240; [0036]; note further, after initiating an active Pizza Restaurant 1 Virtual Assistant session, vocalizing to the user a greeting and next steps; [0031] [0032]); sending the first output data to the first computing system (e.g., processes the user's intent 248 and responds to the user's utterance 250; [0036]; see also communication denoted by line 580f in Fig. 5); causing presentation of an output based on output data (e.g., speech presented to a user via the input/output devices of Figs. 6/7; see for example "Thanks for ordering with <PR1>..."; [0032]). Wood and Lee do not explicitly disclose: determining that the first output data is in a second language different from a first language corresponding to a dialog of the first natural language input; in response to determining that the first output data is not in a same language as the dialog, sending the first output data for translation; receiving, from the first computing system, second output data representing a translation of the first output data into the first language; and causing presentation of an output based on the second output data.
In the same field of voice assistants ([0061]), Wang discloses: determining that the first output data is in a second language different from a first language corresponding to a dialog of the first natural language input ([0375]; the output generated by the virtual personal assistant engine 2512 is translated to the speaker's input language; thus the system must determine that the first output data is in a second language different from a first language corresponding to a dialog of the first natural language input); in response to determining that the first output data is not in a same language as the dialog, sending the first output data for translation ([0375]; the output generated by the virtual personal assistant engine 2512 is translated to the speaker's input language (e.g., Japanese) or into another desired language; in these implementations, the system-generated output may be provided to the machine translation subsystem 2510 to perform a reverse translation (e.g., from the VPA's internal processing language to the speaker's input language)); receiving, from the first computing system, second output data representing a translation of the first output data into the first language ([0375]; the reverse translation of the output); and causing presentation of an output based on the second output data ([0375]; use of natural language generation (NLG)/text-to-speech (TTS) to present translated output, i.e., presentation of an output based on the second output data). It would have been obvious to one of ordinary skill in the art at the time of effective filing to combine the translation as taught by Wang with the disclosures of Wood and Lee in order to confer the benefit of a quickly developable and deployable virtual assistant that can handle multilingual input and output, which allows the assistant to be used more easily by multilingual persons or groups of people, as discussed in paras. [0076]-[0078] of Wang.
Regarding Claim 14, claim 14 is directed to the system corresponding to the method presented in claim 4 and is rejected for the same reasons as stated above.

Regarding Claim 15, claim 15 is directed to the system corresponding to the method presented in claim 5 and is rejected for the same reasons as stated above.

Regarding Claim 17, claim 17 is directed to the system corresponding to the method presented in claim 7 and is rejected for the same reasons as stated above.

Claims 10 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Wood in view of Lee, and in further view of Hicks (U.S. Patent Application Publication 2014/0362024).

Regarding Claim 10, in addition to the elements stated above regarding claim 1, Wood and Lee fail to explicitly disclose: the first natural language input represents a second natural language command; and the second natural language command represents a request for processing by the second computing system. In the same field of voice command devices (Abstract), Hicks teaches wherein: the first natural language input represents a second natural language command ([0017]; the first command is to get a pizza; the second command is to send a message to Tom); and the second natural language command represents a request for processing by the second computing system ([0017]; the second command is to send a message to Tom, which can be considered a request for processing, i.e., the generation and sending of the message). It would have been obvious to one of ordinary skill in the art at the time of effective filing to combine the second natural language command of Hicks with the systems of Wood and Lee. The disclosures are directed to voice-controlled speech processing devices: Wood similarly teaches a voice-controlled system that can also order a pizza, and Lee similarly teaches a voice-controlled system that can control an air conditioner, TV, air purifier, etc. Incorporating the features taught by Hicks into the disclosures of Wood and Lee provides the benefit of system flexibility for handling complex inputs. Therefore, it would have been obvious to one skilled in the art to further modify Wood and Lee with the features of Hicks to process a multi-command input.

Regarding Claim 20, claim 20 is directed to the system corresponding to the method presented in claim 10 and is rejected for the same reasons as stated above.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
US 20210103452 (Pratt) discloses receiving a query at a client device, parsing it into sub-tasks at a server, identifying a set of assistants that can handle the sub-tasks, and presenting a proposal to a user at a client device to select a list of tasks and assistants for handling them.
US 20210072953 (Amarilla) discloses conditionally assigning various automated assistant functions to interaction with a peripheral assistant control device.
US 20210316682 (Broy) discloses determining a digital assistant from among multiple digital assistants in a vehicle for carrying out a task.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jesse Pullias, whose telephone number is 570/270-5135. The examiner can normally be reached M-F, 8:00 AM-4:30 PM. Examiner interviews are available via telephone, in person, and via video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Andrew Flanders, can be reached at 571/272-7516.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (in USA or Canada) or 571-272-1000.

/JESSE S PULLIAS/
Primary Examiner, Art Unit 2655
02/13/26
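For context, the claim 1 limitations being mapped describe a routing pattern: the electronic device sends the raw input to a first computing system for NLP, receives back a directive that identifies a second computing system along with a derived command (different from the raw input), and forwards that command to the identified system for a second NLP pass. A minimal sketch of that flow, with entirely hypothetical names (this is not code from Wood, Lee, or the application):

```python
# Hypothetical illustration of the cross-assistant routing flow recited
# in claim 1. Class and key names are invented for the sketch.

class FirstSystem:
    """First NLP pass: derives a command from the raw input and names
    the second computing system that should handle it."""
    def process(self, utterance: str) -> dict:
        # Toy intent extraction: the derived command differs from the input.
        if "pizza" in utterance:
            return {"target_system_id": "pizza-assistant",
                    "command": "order_pizza"}
        return {"target_system_id": "general-assistant",
                "command": "fallback"}

class SecondSystem:
    """Second NLP pass: determines a response to the derived command."""
    def __init__(self, name: str):
        self.name = name
    def handle(self, command: str) -> str:
        return f"{self.name} handling: {command}"

def route(utterance: str, first: FirstSystem, registry: dict) -> str:
    directive = first.process(utterance)              # directive from first system
    target = registry[directive["target_system_id"]]  # identifier -> second system
    return target.handle(directive["command"])        # send second data onward

registry = {"pizza-assistant": SecondSystem("PizzaVA"),
            "general-assistant": SecondSystem("GeneralVA")}
print(route("I want a pizza", FirstSystem(), registry))
```

The Lee mapping corresponds to the `registry` lookup here: the device holds identifiers of registered second systems and checks for a match before forwarding.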

Prosecution Timeline

Feb 07, 2024: Application Filed
Mar 19, 2025: Non-Final Rejection — §103
May 05, 2025: Interview Requested
Jun 09, 2025: Response Filed
Sep 27, 2025: Final Rejection — §103
Oct 30, 2025: Interview Requested
Nov 17, 2025: Applicant Interview (Telephonic)
Nov 17, 2025: Examiner Interview Summary
Nov 25, 2025: Response after Non-Final Action
Dec 12, 2025: Request for Continued Examination
Jan 13, 2026: Response after Non-Final Action
Feb 13, 2026: Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12596885: Automatically Labeling Items using a Machine-Trained Language Model. Granted Apr 07, 2026 (2y 5m to grant).
Patent 12573378: SPEECH TENDENCY CLASSIFICATION. Granted Mar 10, 2026 (2y 5m to grant).
Patent 12572740: MULTI-LANGUAGE DOCUMENT FIELD EXTRACTION. Granted Mar 10, 2026 (2y 5m to grant).
Patent 12566929: COMBINING DATA SELECTION AND REWARD FUNCTIONS FOR TUNING LARGE LANGUAGE MODELS USING REINFORCEMENT LEARNING. Granted Mar 03, 2026 (2y 5m to grant).
Patent 12536389: TRANSLATION SYSTEM. Granted Jan 27, 2026 (2y 5m to grant).
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 83%
With Interview: 96% (+13.0%)
Median Time to Grant: 2y 8m
PTA Risk: High
Based on 1052 resolved cases by this examiner. Grant probability derived from career allow rate.
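The headline figures follow from simple arithmetic on the examiner's career data, assuming the interview lift is applied additively to the base rate (the tool's actual model may differ):

```python
# Illustrative arithmetic behind the dashboard's probability figures.
# Assumes the "+13.0% interview lift" is an additive percentage-point bump;
# the vendor's actual model is not stated.
granted, resolved = 873, 1052
allow_rate = granted / resolved
print(f"Career allow rate: {allow_rate:.1%}")   # ~83.0%

interview_lift = 0.13
with_interview = allow_rate + interview_lift
print(f"With interview: {with_interview:.0%}")  # ~96%
```

So the 83% grant probability is simply the career allow rate, and the 96% figure is that rate plus the observed 13-point interview lift.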
