Last updated: April 19, 2026

Application No. 18/207,386

AUDIO OUTPUT CONTROL

Final Rejection §103

Filed

Jun 08, 2023

Examiner

MCCORD, PAUL C

Art Unit

2692

Tech Center

2600 — Communications

Assignee

Amazon Technologies, Inc.

OA Round

4 (Final)

Interview Optional

— +26.6% interview lift. This examiner has a relatively high allow rate; a written response may suffice.

Based on 569 resolved cases, 2023–2026

Examiner Intelligence

MCCORD, PAUL C

Grants 69% — above average

Career Allow Rate

393 granted / 569 resolved

+7.1% vs TC avg

Strong +27% interview lift

+26.6%

Interview Lift

resolved cases with interview

Typical timeline

3y 5m

Avg Prosecution

41 currently pending

Career history

610

Total Applications

across all art units

Statute-Specific Performance

§101

10.5%

-29.5% vs TC avg

§103

54.0%

+14.0% vs TC avg

§102

6.8%

-33.2% vs TC avg

§112

20.9%

-19.1% vs TC avg

Black line = Tech Center average estimate • Based on career data from 569 resolved cases
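The four "vs TC avg" deltas above can be checked against one another. The following is a minimal Python sketch, assuming each delta is a percentage-point difference between the examiner's rate and the Tech Center average estimate (the page does not state this explicitly). The names `STATUTE_STATS` and `implied_tc_average` are illustrative only.

```python
# Figures copied from the Statute-Specific Performance panel:
# statute -> (examiner rate in %, delta vs TC avg in percentage points).
# Assumption: delta = examiner rate - Tech Center average estimate.
STATUTE_STATS = {
    "101": (10.5, -29.5),
    "103": (54.0, 14.0),
    "102": (6.8, -33.2),
    "112": (20.9, -19.1),
}


def implied_tc_average(rate: float, delta: float) -> float:
    """Back out the baseline that a rate and its delta imply."""
    return round(rate - delta, 1)


implied = {s: implied_tc_average(r, d) for s, (r, d) in STATUTE_STATS.items()}
for statute, baseline in implied.items():
    print(f"Sec. {statute}: implied TC average estimate = {baseline}%")
```

Under that assumption every row backs out to the same 40.0% baseline, which suggests the panel compares all four statutes against a single Tech Center estimate rather than a per-statute average.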

Office Action

§103

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 2, 4-8, 10-12, 14-18, 20-23 are rejected under 35 U.S.C. 103 as being unpatentable over Lang: 20170242653 in view of Millington: 20070022207, hereinafter Mill, and further in view of Gold: 8571612.

Regarding claim 2
Lang teaches:
A device comprising: one or more processors; and non-transitory computer-readable media storing instructions that, when executed by the one or more processors, causes the one or more processors to perform operations (Lang: ¶ 49-51, 66-70; Fig 2, 3: control device and/or playback device operable to receive user instructions and direct playback in, among, etc. a constellation of synchronized playback devices) comprising: 
receiving, at the device, first input data requesting that first audio be output by a target device associated with the device (Lang: ¶ 71-76; Fig 4: such as by operation of the user interface to designate a particular output device or room; a particular first audio track and a particular music source; said operation conveyed to the device; said device comprising a device proximal to a user),
wherein the device is configured to wirelessly receive audio data corresponding to the first audio from one or more audio streaming services (Lang: ¶ 49-51, 66-70, 294: such as over the network interface of the controller or playback device, operable to wirelessly request and retrieve first audio content from one or more music content sources);
determining, from user account data associated with the device, multiple audio-output devices that have been configured to communicate with the device (Lang: ¶ 189, 296, 325-328, etc.: user preferences associated with a device, streaming account, etc. comprise playback preferences with respect to the configuration of a device, groupings of devices and the output of media retrieved from a stream service thereupon),
 selecting the target device from the multiple audio-output devices based at least in part on the first input data (Lang: ¶ 46-57, 71-76; Fig 4: such as by operation of the user interface to designate a particular output device or room in response to a spoken selection; such as for designating a particular track, particular changes thereto as well as a particular source to retrieve the track for playback; said operation conveyed to the device) and the user account data indicating that the target device has been configured to transition device states in response to commands sent to the target device from the device (Lang: ¶ 296, 325-327: user preferences dictate a particular time at which a first, second, etc. device should connect to a first, second, etc. streaming service; that is, in Lang the playback devices operate to exchange metadata and perform automated and user directed actions based on user identification based on voice commands and user profile information for a registered user (see additionally Lang: ¶ 33, 45); based on Examiner review of relevant paragraphs ¶ 195, 200, 202 of the specification as filed the user account data indicating that a target has been configured is considered user registry, account, preference, profile, metadata, etc. which designates device states and determines a current device state such as for determining a device to which a media playback queue or active queue may be transitioned);
receiving, via the device, second input data representing a user utterance (Lang: Abstract; ¶ 33, : system operable of microphone(s) for voice control of media playback in concert with determination of a user location by the microphone(s)) directing playback of audio at a target device requesting to alter output of the first audio at the target device (Lang: ¶ 60-62, 84, 207, 303-309, etc.: such as by receiving a voice command to perform a user interface operation upon the audio such as by iteratively directing playback to a particular zone player, playback device, etc.; or by receipt of further commands to alter playback parameters of the active queue such as volume, transport control commands such as pause, equalization thereof)
causing, at the device, a queue associated with the audio data to be altered based at least in part on receiving the user utterance (Lang: ¶ 50-62, 84, 98, 105, 111, 172-176, 458, 538, 547, etc.: such as in response to an instruction to move music to a separate or additional room or changes to playback or other user interface parameters based on receipt of a voice command, that is, user voice control alters playback settings including changes to the playback of a first audio list or queue; changes to output parameters of first audio; and changes to the list or queue such as direction thereof to a particular zone or additional zone for output; such as receipt of a command to “turn up the balcony,” or other command to change operational parameters, output parameters, or indeed any voice control of any user interface operations relevant to music playback control for output on “the balcony”);
and causing output of an audible response at a device of the system indicating that the command has been received, enacted, etc. (Lang: ¶ 403-409).
 

and causing output of second audio at the device indicating that output of the first audio at the target device has been altered.

Lang discusses transitioning a second device to operate in concert with a first device (Lang: ¶ 50-52) and discusses causing a queue, playlist, etc. to be moved among zones and an audible response thereto (Lang: ¶ 206, 211) thereby causing, at the device, a queue associated with the audio data to be altered based at least in part on receiving the user utterance (Lang: ¶ 206, 211, 393). Further, Lang strongly suggests the audible output in response to a voice command being directable to a particular output device but does not teach the explicit use case of causing output of second audio at the device indicating that output of the first audio at the target device has been altered.

In a related field of endeavor Mill teaches a similar system of networked playback devices wherein certain devices upon the network access streaming audio sources and provide the data therefrom to other devices upon the network (Mill: Abstract); the system operative for sending a first command from the device to the target device (Mill: ¶ 101-104; Figs 5, 5A, 5B: current information channel device determines a target device to stream information from a streaming source to the local network and provides notification message thereto), wherein at the time of the first command the target device lacks a connection to the audio streaming source (Mill: ¶ 101-104; Figs 5, 5A, 5B: target device instantiates a connection to the streaming audio), the first command configured to cause the target device to transition to a state where the audio data from the one or more audio streaming services is utilized by the target device instead of the device to output the audio (Mill: ¶ 101-104; Figs 5, 5A, 5B: upon receipt of message target device transitions to the information channel device for the local network, establishes a connection with the network, streaming, etc. source and begins streaming therefrom, outputting based thereon, etc.). It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to transition a device in the manner claimed and taught by Mill within the Lang system and method for synchronized audio playback by a network of local devices; one of ordinary skill would have been motivated to do so for at least the purpose of changing the particular device operating the channel and would have expected only predictable results therefrom.
Lang in view of Mill teaches operation of voice control to alter output of a target device (Mill: ¶ 207, 400: system operable under voice command to change playback parameters of particular devices representative of named zones, rooms, etc.) and further operative to receive user instructions to alter a queue (Mill: 347). Lang in view of Mill strongly suggests the audible output in response to a voice command being directable to a particular output device but does not teach the explicit use case of causing output of second audio at the device indicating that output of the first audio at the target device has been altered.

In a related field of endeavor Gold teaches a system for voice management of a plurality of target devices by a user operative of a device in communication with a controller device (Gold: Col 4:14-4:30, 9:22-9:35, 16:26-16:56) wherein the controller operates to provide confirmation data to the device acknowledging execution of commands, such as indicating success, failure and/or errors thereof (Gold: Col 16:26-16:56) and the device generates an audible response based on the confirmation data such as corresponding to changes to operations of target devices in response to first and subsequent commands (Gold: Col 4:14-4:30, 9:22-9:35, 10:15-10:29, 16:26-16:56). It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to produce similar voice feedback upon a device receiving configuration instructions such as taught or suggested by Gold upon a playback device receiving commands to operate a target device in the Lang in view of Mill system and method for at least the purpose of improving a user experience by explicitly confirming a successful or unsuccessful voice command of operations on a target device upon the device receiving the command and without resort to visual confirmation; one of ordinary skill in the art would have expected only predictable results therefrom.


Regarding claim 4
Lang in view of Mill in view of Gold teaches or suggests:
 The device of claim 2, wherein: the first input data represents speech input received at a speech interface device; and the first input data indicates that the speech interface device has received a request to cause the target device to output the audio (Lang: Abstract: ¶ 35, 400, 408, 453; Fig 4: voice command over a playback system such as to operate aspects of the figure 4 user interface to designate media to be played upon particular output devices); (Mill: 101-104; Figs 5: such as when the device and target device are handed off in keeping with figures 5 in response to a user command or stored preferences). The claim is considered obvious over Lang as modified by Mill and Gold as addressed in the base claim as it would have been obvious to apply the further teaching of Lang, Mill, and/or Gold to the modified device of Lang, Mill, and Gold; one of ordinary skill in the art would have expected only predictable results therefrom.

Regarding claim 5
Lang in view of Mill in view of Gold teaches or suggests:
 The device of claim 2, the operations further comprising: associating the device with the target device based at least in part on the target device being physically connected to the device; and wherein selecting the target device comprises selecting the target device based at least in part on the device being physically connected to the target device (Lang: ¶ 35, 49, 56, 296, 325-327, 400, 408, etc.: such as when the required devices selected by user voice input and/or stored preferences are connected over a wired network or a physical layer of a wireless network); (Mill: 101-104; Figs 5: such as when the device and target device are handed off in keeping with figures 5 in response to a user command or stored preferences). The claim is considered obvious over Lang as modified by Mill and Gold as addressed in the base claim as it would have been obvious to apply the further teaching of Lang, Mill, and/or Gold to the modified device of Lang, Mill, and Gold; one of ordinary skill in the art would have expected only predictable results therefrom.

Regarding claim 6
Lang in view of Mill in view of Gold teaches or suggests:
 The device of claim 2, wherein: the first input data is received from a personal device running an application associated with the device; the first input data indicates a selection of the device as the target device; and selecting the target device comprises selecting the target device based at least in part on the device being configured to cause the target device to output the audio (Lang: ¶ 35, 49, 56, 101-104, 296, 325-327, 400, 408, etc.; Fig 4: such as when the required devices are upon a user premises or owned by the user and selected by user voice input and/or stored preferences); (Mill: 101-104; Figs 5: such as when the device and target device are handed off in keeping with figure 4 in response to a user command or stored preferences). The claim is considered obvious over Lang as modified by Mill and Gold as addressed in the base claim as it would have been obvious to apply the further teaching of Lang, Mill, and/or Gold to the modified device of Lang, Mill, and Gold; one of ordinary skill in the art would have expected only predictable results therefrom.

Regarding claim 7
Lang in view of Mill in view of Gold teaches or suggests:
 The device of claim 2, the operations further comprising: receiving second input data requesting to cause at least one of the multiple audio-output devices to output the audio in time synchronization with output of the audio by the target device; and sending, to the at least one of the multiple audio-output devices and based at least in part on the second input data: the audio data; and a second command to output the audio in time synchronization with output of the audio by the target device (Lang: ¶ 400, 408, 534, etc.: such as when the system is in receipt of a voice command to play media upon two devices); (Mill: 101-104; Figs 5: such as when the device and target device are handed off in keeping with figures 5 in response to a user command or stored preferences and the command requires invocation of the figures 5 method). The claim is considered obvious over Lang as modified by Mill and Gold as addressed in the base claim as it would have been obvious to apply the further teaching of Lang, Mill, and/or Gold to the modified device of Lang, Mill, and Gold; one of ordinary skill in the art would have expected only predictable results therefrom.

Regarding claim 8
Lang in view of Mill in view of Gold teaches or suggests:
 The device of claim 2, wherein: the device excludes a speaker; and selecting the target device comprises selecting the target device based at least in part on the target device being configured to output audio sent to the device (Lang: Abstract; ¶ 49, 53, Fig 2: such as one or more of the playback devices on the synchronized playback network); (Mill: ¶ 3; Fig 1: zone players of the system such as those of figure 1 comprise speakers and/or are connected to speakers for audio output). The claim is considered obvious over Lang as modified by Mill and Gold as addressed in the base claim as it would have been obvious to apply the further teaching of Lang, Mill, and/or Gold to the modified device of Lang, Mill, and Gold; one of ordinary skill in the art would have expected only predictable results therefrom.

Regarding claim 10
Lang in view of Mill in view of Gold teaches or suggests:
The device of claim 2, the operations further comprising: receiving second input data requesting to transfer output of the audio from the target device to an additional device of the multiple audio-output devices; and based at least in part on receiving the second input data: sending a second command to the additional device, the second command configured to cause the additional device to output the audio; and sending a third command to the target device, the third command configured to cause the target device to cease output of the audio (Lang: Abstract; ¶ 35, 400, 408, 453; Fig 4: voice command over a playback system such as to operate aspects of the figure 4 user interface to designate media to be played upon a particular output device and additional output device(s)); (Mill: ¶ 101-104; Figs 5: such as a user or automatic command suitable to invoke the figs 5 method and to thereby instantiate playback upon a second device added to a group of third, etc. devices). The claim is considered obvious over Lang as modified by Mill and Gold as addressed in the base claim as it would have been obvious to apply the further teaching of Lang, Mill, and/or Gold to the modified device of Lang, Mill, and Gold; one of ordinary skill in the art would have expected only predictable results therefrom.

Regarding claim 11
Lang in view of Mill in view of Gold teaches or suggests:
 The device of claim 2, the operations further comprising associating a state of the device with the target device such that, when a state change occurs for the device, the state change is caused to occur for the target device (Mill: ¶ 101-104; Figs 5: such as a handoff state sufficient to invoke the figs 5 method). The claim is considered obvious over Lang as modified by Mill and Gold as addressed in the base claim as it would have been obvious to apply the further teaching of Lang, Mill, and/or Gold to the modified device of Lang, Mill, and Gold; one of ordinary skill in the art would have expected only predictable results therefrom.

Regarding claim 12 – the claim is considered to recite substantially similar subject matter to that of claim 2 and is similarly rejected.

Regarding claim 14 – the claim is considered to recite substantially similar subject matter to that of claim 4 and is similarly rejected.

Regarding claim 15 – the claim is considered to recite substantially similar subject matter to that of claim 5 and is similarly rejected.

Regarding claim 16 – the claim is considered to recite substantially similar subject matter to that of claim 6 and is similarly rejected.

Regarding claim 17 – the claim is considered to recite substantially similar subject matter to that of claim 7 and is similarly rejected.

Regarding claim 18 – the claim is considered to recite substantially similar subject matter to that of claim 8 and is similarly rejected.

Regarding claim 20 – the claim is considered to recite substantially similar subject matter to that of claim 10 and is similarly rejected.

Regarding claim 21 – the claim is considered to recite substantially similar subject matter to that of claim 11 and is similarly rejected.

Regarding claim 22, 23
Lang in view of Mill in view of Gold teaches or suggests:
The method of claim 12, wherein the user account data indicates a stored preference for the first device to utilize the second device for audio output (Lang: ¶ 1, 45, 310-314, etc.; Fig 9, etc.: system configures audio output for one or more playback devices based on system metadata and user profile information such as to create default playback devices based on a plurality of situations); (Mill: ¶ 22, 30-36, 101-104, etc.: such as by playback of media by a handed-off playback device based on established user preferences, system metadata, etc.). The claim is considered obvious over Lang as modified by Mill and Gold as addressed in the base claim as it would have been obvious to apply the further teaching of Lang, Mill, and/or Gold to the modified device of Lang, Mill, and Gold; one of ordinary skill in the art would have expected only predictable results therefrom.

Regarding claim 23 – the claim is considered to recite substantially similar subject matter to that of claim 22 and is similarly rejected.


Claims 24, 25 are rejected under 35 U.S.C. 103 as being unpatentable over Lang: 20170242653 in view of Millington: 20070022207, hereinafter Mill, and further in view of Gold: 8571612 as applied to claims 2, 4-8, 10-12, 14-18, 20-23 supra, and further in view of Solomon: 20180233141, hereinafter Sol.

Regarding claim 24
Lang in view of Mill in view of Gold teaches or suggests:
The device of claim 2, but does not explicitly teach operations further comprising: determining that the first input data includes an anaphora; determining that the anaphora is associated with the target device; and wherein selecting the target device comprises selecting the target device based at least in part on the anaphora being associated with the target device. 
In a related field of endeavor Sol teaches a system and method for an intelligent voice user interface (Sol: Abstract; ¶ 2, 46, etc.; Fig 3, etc.) where information resolving a user location is further operable to resolve anaphoric references such as for the purpose of resolving ambiguous statements based on user context (Sol: ¶ 279, 280, etc.; Fig 23, etc.). It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to improve the Lang in view of Mill in view of Gold system and method with the Sol taught means for disambiguating a user input for at least the purpose of creating a voice user interface able to interpret less explicit user commands, pronouns therein based on user context such as location; one of ordinary skill in the art would have expected only predictable results therefrom.

Regarding claim 25 – the claim is considered to recite substantially similar subject matter to that of claim 24 and is similarly rejected.



Response to Arguments
Applicant’s arguments and amended claims, see Claims and Remarks, filed 10/07/2025, with respect to the rejection(s) of claim(s) 2, 4-12, 14-23 under 35 USC 103 over Lang in view of Millington in view of Wilberding and claim(s) 24, 25 under 35 USC 103 over Lang in view of Millington in view of Wilberding in view of Solomon have been fully considered and are persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made in view of Lang in view of Millington in view of Gold and Lang in view of Millington in view of Gold in view of Solomon.

Conclusion
 Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
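The reply windows described in the paragraph above can be sketched as simple date arithmetic. This is an illustrative Python sketch only, not docketing advice. It assumes the mailing date equals the Oct 24, 2025 "Final Rejection" entry in the Prosecution Timeline, and it does not model weekend or holiday adjustments or the advisory-action scenario. The helper `add_months` and the variable names are hypothetical.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the same day-of-month `months` later, clamped to month end."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


# Assumption: mailing date taken from the timeline entry for this action.
mailing_date = date(2025, 10, 24)

two_month_mark = add_months(mailing_date, 2)        # first-reply window
shortened_period_end = add_months(mailing_date, 3)  # due date without extension
statutory_maximum = add_months(mailing_date, 6)     # outer limit for any reply

print("First reply within two months by:", two_month_mark.isoformat())
print("Shortened statutory period ends:", shortened_period_end.isoformat())
print("Six-month statutory maximum:", statutory_maximum.isoformat())
```

Under the stated assumption the three dates are 2025-12-24, 2026-01-24, and 2026-04-24. Replies filed after the three-month date and before the six-month date would require an extension of time under 37 CFR 1.136(a), as the paragraph above notes.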
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAUL C MCCORD whose telephone number is (571)270-3701. The examiner can normally be reached 730-630 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, CAROLYN EDWARDS can be reached at (571) 270-7136. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/PAUL C MCCORD/Primary Examiner, Art Unit 2692

Prosecution Timeline

Jun 08, 2023

Application Filed

Sep 04, 2024

Non-Final Rejection — §103

Dec 05, 2024

Response Filed

Feb 28, 2025

Final Rejection — §103

Apr 29, 2025

Examiner Interview Summary

Apr 29, 2025

Applicant Interview (Telephonic)

May 05, 2025

Response after Non-Final Action

May 20, 2025

Request for Continued Examination

May 21, 2025

Response after Non-Final Action

Jul 10, 2025

Non-Final Rejection — §103

Oct 06, 2025

Applicant Interview (Telephonic)

Oct 06, 2025

Examiner Interview Summary

Oct 07, 2025

Response Filed

Oct 24, 2025

Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/674,333

Patent 12603094

ADAPTIVE PROCESSING WITH MULTIPLE MEDIA PROCESSING NODES

2y 5m to grant Granted Apr 14, 2026

18/653,631

Patent 12592238

INFORMATION PROCESSING METHOD, INFORMATION PROCESSING DEVICE, AND NON-TRANSITORY COMPUTER READABLE RECORDING MEDIUM STORING INFORMATION PROCESSING PROGRAM

2y 5m to grant Granted Mar 31, 2026

19/029,744

Patent 12593192

MEDIA PLAYBACK BASED ON SENSOR DATA

2y 5m to grant Granted Mar 31, 2026

18/280,697

Patent 12572323

DYNAMIC AUDIO CONTENT GENERATION

2y 5m to grant Granted Mar 10, 2026

16/822,293

Patent 12567003

TECHNOLOGIES FOR DECENTRALIZED FLEET ANALYTICS

2y 5m to grant Granted Mar 03, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

5-6

Expected OA Rounds

69%

Grant Probability

96%

With Interview (+26.6%)

3y 5m

Median Time to Grant

High

PTA Risk

Based on 569 resolved cases by this examiner. Grant probability derived from career allow rate.
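The projection figures above can be reproduced from the career counts. The following is a minimal Python sketch, assuming the page adds the interview lift to the career allow rate as percentage points and caps the result at 100%. That method matches the displayed numbers, but the page does not document how it computes them. The constant names are illustrative.

```python
# Counts and lift copied from the Examiner Intelligence panel.
GRANTED = 393
RESOLVED = 569
INTERVIEW_LIFT_POINTS = 26.6  # assumption: percentage points, not a ratio

# Career allow rate as a percentage of resolved cases.
allow_rate = 100 * GRANTED / RESOLVED

# Assumed method: additive lift, capped at 100%.
with_interview = min(allow_rate + INTERVIEW_LIFT_POINTS, 100.0)

print(f"Career allow rate: {allow_rate:.2f}% (displayed as {round(allow_rate)}%)")
print(f"With interview: {with_interview:.2f}% (displayed as {round(with_interview)}%)")
```

The rounded results are 69% and 96%, matching the Grant Probability and With Interview values shown. Both are historical aggregates for this examiner, not a prediction for this specific application.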