Prosecution Insights
Last updated: April 19, 2026
Application No. 18/435,024

CROSS-ASSISTANT COMMAND PROCESSING

Status: Non-Final OA (§103), Round 3
Filed: Feb 07, 2024
Examiner: PULLIAS, JESSE SCOTT
Art Unit: 2655
Tech Center: 2600 — Communications
Assignee: Amazon Technologies, Inc.
Grant Probability: 83% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 2y 8m
With Interview: 96%

Examiner Intelligence

Career Allow Rate: 83% (873 granted / 1052 resolved; +21.0% vs TC avg; above average)
Interview Lift: +13.0% (moderate), among resolved cases with an interview
Typical Timeline: 2y 8m average prosecution; 47 applications currently pending
Career History: 1099 total applications across all art units

Statute-Specific Performance

§101: 15.0% (-25.0% vs TC avg)
§103: 50.4% (+10.4% vs TC avg)
§102: 19.7% (-20.3% vs TC avg)
§112: 4.9% (-35.1% vs TC avg)
Tech Center averages are estimates. Based on career data from 1052 resolved cases.
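The per-statute deltas above are internally consistent: subtracting each reported delta from the statute's rate backs out the same implied Tech Center average. A quick illustrative check, with the values transcribed from the figures above:

```python
# Back out the implied Tech Center average from each statute's allow rate
# and its reported delta (values transcribed from the dashboard above).
stats = {
    "101": (0.150, -0.250),
    "103": (0.504, +0.104),
    "102": (0.197, -0.203),
    "112": (0.049, -0.351),
}
for statute, (rate, delta) in stats.items():
    implied_tc_avg = rate - delta
    print(f"§{statute}: implied TC avg = {implied_tc_avg:.1%}")
```

Every statute implies the same Tech Center average of about 40%, which suggests the deltas were all computed against a single baseline.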

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first-inventor-to-file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 11/25/25 has been entered.

This Office action is in response to correspondence filed 11/25/25 regarding application 18/435,024, in which claims 1, 6, 11, and 16 were amended. Claims 1-8, 10-18, and 20 are pending and have been considered.

Response to Arguments

Applicant's arguments with respect to section II and the rejection of the claims under 35 U.S.C. 102 have been considered but are moot in view of the new grounds of rejection under 35 U.S.C. 103, based in part on the newly discovered reference to Lee et al. (hereinafter Lee, U.S. Patent Application Publication 2020/0349952). Lee describes a voice command resolving system in which a voice assistant server queries an IoT server for device information about registered devices, including device ID information, i.e., an identifier of a second computing system, and transmits this to a hub device [0146]. The voice assistant server also transmits the function determination model to the hub device [0207], which is a directive to send second data to the second computing system, since the hub device checks whether the model is present for the given device and, if so, transmits text to the device [0140].

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-3, 6, 8, 11-13, 16, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Wood et al. (hereinafter Wood, U.S. Patent Application Publication 2017/0269975) in view of Lee et al. (hereinafter Lee, U.S. Patent Application Publication 2020/0349952).

Regarding Claim 1, Wood discloses: A computer-implemented method, comprising: receiving, by an electronic device, first data representing at least a portion of a first natural language input (e.g., speech input to virtual assistant arbiter; Fig. 2 elements 200, 206, and 218; also see corresponding process in Fig. 1 step 1 and [0024]); sending, from the electronic device to a first computing system (e.g., networked elements shown in Fig. 1 and corresponding elements in Fig. 2), the first data to cause the first computing system to perform first natural language processing using the first data (e.g., utterance sent to speech recognition engine; Fig. 2 elements 220 and 208; also see corresponding process in Fig. 1 step 2 and [0026]); receiving, by the electronic device and from the first computing system (e.g., speech recognition engine 208 returns a message 222, containing the text and intent of the user's speech, to the virtual assistant arbiter 206; also see corresponding process in Fig. 1 step 3 and [0027]), a directive to send second data to a second computing system (e.g., the virtual assistant arbiter next determines which of several different possible independent specialized virtual assistants to use to process the user's intent; that is, the intent serves as a directive as to which virtual assistant the command should be sent to, Fig. 2 elements 206, 210, 214 and [0036] in conjunction with steps 2 and 3 above; note particularly that high-level virtual assistant module 106 may be in communication with high-level virtual assistant network 107; in other words, the intent determination is performed by the networked elements), the second data representing a first natural language command determined by the first computing system (e.g., request/user intent is returned; [0027]; note the example command of "order pizza" in [0028]), the first natural language command corresponding to the portion of the first natural language input (e.g., message returned containing the intent of the user's speech; [0036]; further note at least one user request intent, determined based on the user request; [0040]), wherein the first natural language command is different from the first natural language input (e.g., intent arbitration selects one of the specialized virtual assistants based on the user request intent, which is performed based on created rules; [0041]; in other words, the selection and execution of the command is performed based on the determined intent, which is not an exact duplicate of the command, or is "different," by virtue of the inclusion of additional information, for example location; [0041]); and sending, from the electronic device to the second system (e.g., the virtual assistant arbiter 206 sends an intent notification 230 to the selected specialized virtual assistant 214; Fig. 2 elements 206, 214, 230; also see corresponding process in Fig. 1 step 6 and [0030]; high-level virtual assistant module 106 notifies the next independent specialized virtual assistant interested in the "order pizza" intent), the second data to cause the second computing system to perform second natural language processing using the second data to determine a response to the first natural language command determined by the first computing system (e.g., acknowledges the intent notification and instantiates a specialized virtual assistant; the specialized virtual assistant then initiates a dialog with the end user, who issues further speech; Fig. 2 elements 210, 234, 236, 238, and 240; also see corresponding process in Fig. 1 step 6 and [0031]; particularly that Network 116 consumes the request and responds to initiate an active Pizza Restaurant 1 Virtual Assistant session; further note that PR2VA 110 delivers a virtual assistant initiation request with its own independent specialized virtual assistant network, which here is Pizza Restaurant 2 (PR2) Network 112 [0028]).

Wood does not specifically mention an identifier of a second computing system and a directive to send second data to the second computing system.
In the same field of voice command devices (Abstract), Lee teaches an identifier of a second computing system and a directive to send second data to the second computing system (the voice assistant server queries the IoT server for device information about registered devices, including device ID information, i.e., an identifier of a second computing system, and transmits this to the hub device, [0146]; the voice assistant server also transmits the function determination model to the hub device, [0207], which is a directive to send second data to the second computing system, since the hub device checks whether the model is present for the given device and, if so, transmits text to the device, [0140]). It would have been obvious to one of ordinary skill in the art at the time of effective filing to combine the device ID information and function determination model as taught by Lee with the disclosure of Wood in order to confer the benefits of reducing user inconvenience and increasing response speed, as discussed in paras. [0004]-[0005] of Lee.

Regarding Claim 2, in addition to the elements stated above regarding claim 1, Wood further discloses: wherein sending the first data to the first computing system comprises using a first component of the electronic device to send the first data to the first computing system, the first component corresponding to the first computing system (e.g., note the communication network elements of the various client and other devices, particularly the internet, Bluetooth, TCP/IP communication, etc.; [0046] [0047] and Fig. 7 element 86; which serve to effect the communication performed in the various figures).

Regarding Claim 3, in addition to the elements stated above regarding claim 2, Wood further discloses: wherein sending the second data to the second computing system comprises using a second component of the electronic device to send the second data to the second computing system, the second component corresponding to the second system (e.g., note the communication network elements of the various client and other devices, particularly the internet, Bluetooth, TCP/IP communication, etc.; [0046] [0047] and Fig. 7 element 86; which serve to effect the communication performed in the various figures; further consider that communication over TCP/IP or the internet requires particular configurations and settings, for example assigning IP addresses and the like, which, under the broadest reasonable interpretation, program these elements to perform different and unique transmissions based on protocol, sender, and recipient; as such, these differing configurations can be said to be another, or "second," component, since when transmitting to one device element 86 will be set in a different manner than when transmitting to any of the other devices).

Regarding Claim 6, in addition to the elements stated above regarding claim 1, Wood further discloses: sending, to the second computing system, an indication that the first natural language input is part of a dialog conducted in a first language (e.g., after activation, signaling to the high-level virtual assistant module 106 that it is now taking control of the dialog with the user 100; [0032]); and receiving, from the second computing system in response to the second data and the indication, output data in the first language (e.g., vocalizing to the user 100 a greeting and next step; [0032]).

Regarding Claim 8, in addition to the elements stated above regarding claim 1, Wood further discloses: wherein receiving the first data comprises receiving data representing text of the portion of the first natural language input (e.g., the text of the utterance 104, along with the user's intent, is returned; [0027]).

Regarding Claim 11, claim 11 is directed to the system corresponding to the method presented in claim 1 and is rejected for the same reasons as stated above. Additionally, Wood further discloses: at least one processor (e.g., processor; [0048]); and at least one memory comprising instructions that, when executed by the at least one processor (e.g., a computer program product provides software instructions for the system for execution; [0047] [0048]), cause the system to perform the method of claim 1 (see rejection of claim 1 above).

Regarding Claim 12, claim 12 is directed to the system corresponding to the method presented in claim 2 and is rejected for the same reasons as stated above.

Regarding Claim 13, claim 13 is directed to the system corresponding to the method presented in claim 3 and is rejected for the same reasons as stated above.

Regarding Claim 16, claim 16 is directed to the system corresponding to the method presented in claim 6 and is rejected for the same reasons as stated above.

Regarding Claim 18, claim 18 is directed to the system corresponding to the method presented in claim 8 and is rejected for the same reasons as stated above.

Claims 4, 5, 7, 14, 15, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Wood in view of Lee, and in further view of Wang et al. (hereinafter Wang, U.S. Patent Application Publication 2019/0332680).

Regarding Claim 4, in addition to the elements stated above regarding claim 1, Wood and Lee fail to explicitly disclose: determining the first natural language input corresponds to a first language, wherein the first natural language command corresponds to a second language different from the first language. In the same field of voice assistants ([0061]), Wang discloses: determining the first natural language input corresponds to a first language ([0364]; the virtual personal assistant 2500 may optionally include a language identification subsystem 2504; the language identification subsystem 2504 can analyze a spoken word or phrase and determine the language in which the word or phrase was spoken), wherein the first natural language command corresponds to a second language different from the first language (e.g., Wood, now adapted such that the natural language command in the transcription text corresponds to a second language different from the first language as taught by Wang in [0360]; Fig. 25 illustrates an example of the audio input and output system of a virtual personal assistant 2500, where the audio input 2542 can be provided in one language, and natural language understanding and reasoning can be conducted in a different language; in the example of Fig. 25, the virtual personal assistant 2500 includes machine translation (MT) components; in various implementations, the machine translation components can process the output from an automatic speech recognition engine and convert the automatic speech recognition output from an input language into the internal processing language of the virtual personal assistant 2500). It would have been obvious to one of ordinary skill in the art at the time of effective filing to modify Wood and Lee as taught by Wang in order to confer the benefit of a quickly developable and deployable virtual assistant that can handle multilingual input and output, which allows the assistant to be used more easily by multilingual persons or groups of people, as discussed in paras. [0076]-[0078] of Wang.

Regarding Claim 5, in addition to the elements stated above regarding claim 1, Wood further discloses: receiving, from the second computing system, output data corresponding to the response (e.g., specialized virtual assistant 210 then initiates a dialog 240; [0036]; note further, after initiating an active Pizza Restaurant 1 Virtual Assistant session, vocalizing to the user a greeting and next steps; [0031] [0032]). Wood and Lee do not, however, explicitly disclose: determining that the output data is in the first language; and in response to determining that the output data is in the first language, causing presentation of an output based on the output data. In the same field of voice assistants ([0061]), Wang discloses: determining that the output data is in the first language ([0375]; the output generated by the virtual personal assistant engine 2512 is translated to the speaker's input language); and in response to determining that the output data is in the first language, causing presentation of an output based on the output data ([0374]; the NLG/TTS 2518 engines can formulate the task output as a natural language phrase and then convert the phrase to machine-synthesized speech; the speech can then be output using, for example, a speaker included in the virtual personal assistant 2500, or in the device into which the virtual personal assistant is incorporated). It would have been obvious to one of ordinary skill in the art at the time of effective filing to combine the translation as taught by Wang with the disclosures of Wood and Lee in order to confer the benefit of a quickly developable and deployable virtual assistant that can handle multilingual input and output, which allows the assistant to be used more easily by multilingual persons or groups of people, as discussed in paras. [0076]-[0078] of Wang.

Regarding Claim 7, in addition to the elements stated above regarding claim 1, Wood further discloses: receiving, from the second computing system, first output data (e.g., specialized virtual assistant 210 then initiates a dialog 240; [0036]; note further, after initiating an active Pizza Restaurant 1 Virtual Assistant session, vocalizing to the user a greeting and next steps; [0031] [0032]); sending the first output data to the first computing system (e.g., processes the user's intent 248 and responds to the user's utterance 250; [0036]; see also communication denoted by line 580f in Fig. 5); causing presentation of an output based on output data (e.g., speech presented to a user via the input/output devices of Figs. 6/7; see for example "Thanks for ordering with <PR1>..."; [0032]). Wood and Lee do not explicitly disclose: determining that the first output data is in a second language different from a first language corresponding to a dialog of the first natural language input; in response to determining that the first output data is not in a same language as the dialog, sending the first output data for translation; receiving, from the first computing system, second output data representing a translation of the first output data into the first language; and causing presentation of an output based on the second output data.
In the same field of voice assistants ([0061]), Wang discloses: determining that the first output data is in a second language different from a first language corresponding to a dialog of the first natural language input ([0375]; the output generated by the virtual personal assistant engine 2512 is translated to the speaker's input language; thus the system must determine that the first output data is in a second language different from a first language corresponding to a dialog of the first natural language input); in response to determining that the first output data is not in a same language as the dialog, sending the first output data for translation ([0375]; the output generated by the virtual personal assistant engine 2512 is translated to the speaker's input language (e.g., Japanese) or into another desired language; in these implementations, the system-generated output may be provided to the machine translation subsystem 2510 to perform a reverse translation (e.g., from the VPA's internal processing language to the speaker's input language)); receiving, from the first computing system, second output data representing a translation of the first output data into the first language ([0375]; the reverse translation of the output); and causing presentation of an output based on the second output data ([0375]; use of natural language generation (NLG)/text-to-speech (TTS) to present translated output, i.e., presentation of an output based on the second output data). It would have been obvious to one of ordinary skill in the art at the time of effective filing to combine the translation as taught by Wang with the disclosures of Wood and Lee in order to confer the benefit of a quickly developable and deployable virtual assistant that can handle multilingual input and output, which allows the assistant to be used more easily by multilingual persons or groups of people, as discussed in paras. [0076]-[0078] of Wang.
Regarding Claim 14, claim 14 is directed to the system corresponding to the method presented in claim 4 and is rejected for the same reasons as stated above.

Regarding Claim 15, claim 15 is directed to the system corresponding to the method presented in claim 5 and is rejected for the same reasons as stated above.

Regarding Claim 17, claim 17 is directed to the system corresponding to the method presented in claim 7 and is rejected for the same reasons as stated above.

Claims 10 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Wood in view of Lee, and in further view of Hicks (U.S. Patent Application Publication 2014/0362024).

Regarding Claim 10, in addition to the elements stated above regarding claim 1, Wood and Lee fail to explicitly disclose: the first natural language input represents a second natural language command; and the second natural language command represents a request for processing by the second computing system. In the same field of voice command devices (Abstract), Hicks teaches wherein: the first natural language input represents a second natural language command ([0017]; the first command is to get a pizza; the second command is to send a message to Tom); and the second natural language command represents a request for processing by the second computing system ([0017]; the second command is to send a message to Tom, which can be considered a request for processing, i.e., the generation and sending of the message). It would have been obvious to one of ordinary skill in the art at the time of effective filing to combine the second natural language command of Hicks with the systems of Wood and Lee. The disclosures are directed to voice-controlled speech processing devices: Wood similarly teaches a voice-controlled system that can also order a pizza, and Lee similarly teaches a voice-controlled system that can control an air conditioner, TV, air purifier, etc. Incorporating the features taught by Hicks into the disclosures of Wood and Lee provides the benefit of system flexibility for handling complex inputs. Therefore, it would have been obvious to one skilled in the art to further modify Wood and Lee with the features of Hicks to process a multi-command input.

Regarding Claim 20, claim 20 is directed to the system corresponding to the method presented in claim 10 and is rejected for the same reasons as stated above.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
US 20210103452 (Pratt) discloses receiving a query at a client device, parsing it into sub-tasks at a server, identifying a set of assistants that can handle the sub-tasks, and presenting a proposal to a user at a client device to select a list of tasks and assistants for handling them.
US 20210072953 (Amarilla) discloses conditionally assigning various automated assistant functions to interaction with a peripheral assistant control device.
US 20210316682 (Broy) discloses determining a digital assistant from among multiple digital assistants in a vehicle for carrying out a task.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jesse Pullias, whose telephone number is 570/270-5135. The examiner can normally be reached M-F, 8:00 AM-4:30 PM. Examiner interviews are available via telephone, in person, and via video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Andrew Flanders, can be reached at 571/272-7516.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (in USA or Canada) or 571-272-1000.

/JESSE S PULLIAS/
Primary Examiner, Art Unit 2655
02/13/26
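For context, the claim 1 limitations being mapped describe a routing pattern: the electronic device sends the raw input to a first computing system for NLP, receives back a directive that identifies a second computing system along with a derived command (different from the raw input), and forwards that command to the identified system for a second NLP pass. A minimal sketch of that flow, with entirely hypothetical names (this is not code from Wood, Lee, or the application):

```python
# Hypothetical illustration of the cross-assistant routing flow recited
# in claim 1. Class and key names are invented for the sketch.

class FirstSystem:
    """First NLP pass: derives a command from the raw input and names
    the second computing system that should handle it."""
    def process(self, utterance: str) -> dict:
        # Toy intent extraction: the derived command differs from the input.
        if "pizza" in utterance:
            return {"target_system_id": "pizza-assistant",
                    "command": "order_pizza"}
        return {"target_system_id": "general-assistant",
                "command": "fallback"}

class SecondSystem:
    """Second NLP pass: determines a response to the derived command."""
    def __init__(self, name: str):
        self.name = name
    def handle(self, command: str) -> str:
        return f"{self.name} handling: {command}"

def route(utterance: str, first: FirstSystem, registry: dict) -> str:
    directive = first.process(utterance)              # directive from first system
    target = registry[directive["target_system_id"]]  # identifier -> second system
    return target.handle(directive["command"])        # send second data onward

registry = {"pizza-assistant": SecondSystem("PizzaVA"),
            "general-assistant": SecondSystem("GeneralVA")}
print(route("I want a pizza", FirstSystem(), registry))
```

The Lee mapping corresponds to the `registry` lookup here: the device holds identifiers of registered second systems and checks for a match before forwarding.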

Prosecution Timeline

Feb 07, 2024: Application Filed
Mar 19, 2025: Non-Final Rejection — §103
May 05, 2025: Interview Requested
Jun 09, 2025: Response Filed
Sep 27, 2025: Final Rejection — §103
Oct 30, 2025: Interview Requested
Nov 17, 2025: Applicant Interview (Telephonic)
Nov 17, 2025: Examiner Interview Summary
Nov 25, 2025: Response after Non-Final Action
Dec 12, 2025: Request for Continued Examination
Jan 13, 2026: Response after Non-Final Action
Feb 13, 2026: Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12596885: Automatically Labeling Items using a Machine-Trained Language Model. Granted Apr 07, 2026 (2y 5m to grant).
Patent 12573378: SPEECH TENDENCY CLASSIFICATION. Granted Mar 10, 2026 (2y 5m to grant).
Patent 12572740: MULTI-LANGUAGE DOCUMENT FIELD EXTRACTION. Granted Mar 10, 2026 (2y 5m to grant).
Patent 12566929: COMBINING DATA SELECTION AND REWARD FUNCTIONS FOR TUNING LARGE LANGUAGE MODELS USING REINFORCEMENT LEARNING. Granted Mar 03, 2026 (2y 5m to grant).
Patent 12536389: TRANSLATION SYSTEM. Granted Jan 27, 2026 (2y 5m to grant).
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 83%
With Interview: 96% (+13.0%)
Median Time to Grant: 2y 8m
PTA Risk: High
Based on 1052 resolved cases by this examiner. Grant probability derived from career allow rate.
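The headline figures follow from simple arithmetic on the examiner's career data, assuming the interview lift is applied additively to the base rate (the tool's actual model may differ):

```python
# Illustrative arithmetic behind the dashboard's probability figures.
# Assumes the "+13.0% interview lift" is an additive percentage-point bump;
# the vendor's actual model is not stated.
granted, resolved = 873, 1052
allow_rate = granted / resolved
print(f"Career allow rate: {allow_rate:.1%}")   # ~83.0%

interview_lift = 0.13
with_interview = allow_rate + interview_lift
print(f"With interview: {with_interview:.0%}")  # ~96%
```

So the 83% grant probability is simply the career allow rate, and the 96% figure is that rate plus the observed 13-point interview lift.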
