Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments filed 9/4/25 have been fully considered but they are not persuasive.
Regarding Applicant’s first argument: “Applicant submits that claims recite, inter alia, "any number of voice input devices for receiving voice inputs vocalized by a user of the automation network." Wilberding does not disclose a plurality of voice input devices operating within an automation network. Instead, Wilberding describes a network microphone device (NMD) or playback devices that may provide microphone functionality. These devices are not disclosed as independent voice input devices forming part of an automation network, and thus fail to teach the claimed plurality of voice input devices communicatively connected as part of such a system.”
The examiner respectfully disagrees. Wilberding includes various example scenarios that indicate that the network microphone devices (NMDs) are part of an automation network.
See col. 1 ln. 62 – col. 2 ln. 9: “Networked microphone devices (NMDs) may be used to control a household using voice control. A NMD may be, for example, a SONOS® playback device, server, or system capable of receiving voice input via a microphone. Additionally, a NMD may be a device other than a SONOS® playback device, server, or system (e.g., AMAZON® ECHO®, APPLE® IPHONE®) capable of receiving voice inputs via a microphone. U.S. application Ser. No. 15/098,867 entitled, “Default Playback Device Designation,” which is hereby incorporated by reference, provides examples of voice-enabled household architectures. Voice control can be beneficial for various devices with a “smart” home, such as playback devices, wireless illumination devices, thermostats, door locks, home automation, as well as other examples.” (emphasis added)
Also see col. 12 ln. 66 – col. 13 ln. 5: “In one example, as with NMDs 512, 514, and 516, CR522 and PBDs 532, 534, 536, and 538 may also be components of one or more “Smart Home” systems. In one case, PBDs 532, 534, 536, and 538 may be distributed throughout the same household as the NMDs 512, 514, and 516. Further, as suggested above, one or more of PBDs 532, 534, 536, and 538 may be one or more of NMDs 512, 514, and 516.” (emphasis added)
Further, see col. 14 lns. 4-39: “In an illustrative example, NMDs 512, 514, and 516 may be configured to receive voice inputs to control PBDs 532, 534, 536, and 538. The available control commands may include any media playback system controls previously discussed, such as playback volume control, playback transport controls, music source selection, and grouping, among other possibilities. In one instance, NMD 512 may receive a voice input to control one or more of the PBDs 532, 534, 536, and 538.” (emphasis added)
Still further, see col. 17 lns. 21-31: “Example voice commands may include commands to modify any of the media playback system controls or playback settings. Playback settings may include, for example, playback volume, playback transport controls, music source selection, and grouping, among other possibilities. Other voice commands may include operations to adjust television control or play settings, mobile phone device settings, or illumination devices, among other device operations. As more household devices become “smart” (e.g., by incorporating a network interface), voice commands may be used to control various household devices.” (emphasis added)
Finally, see col. 20 ln. 54 – col. 21 ln. 3: “In some cases, the NMD may determine that the voice input includes a voice command that is directed to a particular type of device. In such cases, the NMD may identify a particular voice service that is configured to process voice inputs directed to that type of device to process the voice input. For example, the NMD may determine that a given voice input is directed to one or more wireless illumination devices (e.g., that “Turn on the lights in here” is directed to the “smart” lightbulbs in the same room as the NMD) and identify, as the voice service to process the voice input, a particular voice service that is configured to process voice inputs directed to wireless illumination devices. As another example, the NMD may determine that a given voice input is directed to a playback device and identify, as the voice service to process the voice input, a particular voice service that is configured to process voice inputs directed to playback devices.” (emphasis added)
Regarding applicant’s second argument: “Further, the claims recite, inter alia, an "automation controller for executing (1) a voice coordinator for coordinating WHV system operations, (2) any number of voice input proxies for proxying for the voice input devices; and (3) a voice daemon executing any number of voice daemon instances each respectively associated with one of any number of voice targets." Wilberding does not disclose an automation controller that executes these distinct functions. The NMD of Wilberding may detect a voice input and forward it to a selected voice service, at most, but the reference does not disclose an automation controller separate from the input devices, nor does it disclose a "voice coordinator" configured to coordinate WHV system operations. Likewise, Wilberding lacks any teaching of "voice input proxies" that proxy for multiple voice input devices.
Additionally, the claims recite, inter alia, a "voice daemon executing any number of voice daemon instances each respectively associated with one of any number of voice targets." Wilberding is silent as to any daemon-based architecture, and certainly does not describe executing multiple daemon instances, each mapped to an individual voice target. Instead, Wilberding merely forwards captured audio data to one or more voice services. This is fundamentally distinct from the claimed architecture in which the automation controller runs multiple daemon instances, each tied to a separate voice target.”
The examiner respectfully disagrees. Wilberding describes that all the NMDs include the functionality argued above, where the NMDs may take the form of standalone NMDs, playback devices, or controller devices, and where at least the controller devices map to an automation controller.
See col. 16 lns. 46-55: “At block 702, implementation 700 involves receiving voice data indicating a voice input. For instance, a NMD, such as NMD 600, may receive, via a microphone, voice data indicating a voice input. As further examples, any of playback devices 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, and 124 or control devices 126 and 128 of FIG. 1 may be a NMD and may receive voice data indicating a voice input. Yet further examples NMDs include NMDs 512, 514, and 516, PBDs 532, 534, 536, and 538, and CR 522 of FIG. 5.”
Wilberding further describes (1) that the controller NMDs operate as a voice coordinator for coordinating WHV system operations.
See col. 7 lns. 2-13: “Referring back to the media playback system 100 of FIG. 1, the environment may have one or more playback zones, each with one or more playback devices. The media playback system 100 may be established with one or more playback zones, after which one or more zones may be added, or removed to arrive at the example configuration shown in FIG. 1. Each zone may be given a name according to a different room or space such as an office, bathroom, master bedroom, bedroom, kitchen, dining room, living room, and/or balcony. In one case, a single playback zone may include multiple rooms or spaces. In another case, a single room or space may include multiple playback zones.” (emphasis added)
Also see col. 9 lns. 32-39: “The playback zone region 420 may include representations of playback zones within the media playback system 100. In some embodiments, the graphical representations of playback zones may be selectable to bring up additional selectable icons to manage or configure the playback zones in the media playback system, such as a creation of bonded zones, creation of zone groups, separation of zone groups, and renaming of zone groups, among other possibilities.” (emphasis added)
Further, see col. 19 ln. 57 – col. 20 ln. 3: “As noted above, some example systems may include multiple NMDs, possibly configured into multiple zones (e.g., media playback system 100 of FIG. 1 with Living Room, Kitchen, Dining Room, and Bedroom zones, each with respective playback devices). In such systems, a default voice service may be configured on a per NMD or per zone basis. Then, voice inputs detected by a given NMD or zone may be processed by the default voice service for that NMD or zone. In some cases, the NMD may assume that a voice input that was detected by a given NMD or zone is intended to be processed by the voice service associated with the zone. However, in other case, a wake-word or phrase may direct the voice input to a particular NMD or zone (e.g., “Hey, Kitchen” to direct a voice input to a Kitchen zone).” (emphasis added)
Thus, Wilberding’s described controllers read on a voice coordinator for coordinating WHV system operations.
Wilberding also describes (2) that the NMDs act as voice input proxies for proxying for the voice input devices.
See col. 24 ln. 59 – col. 25 ln. 4: “a computing device may register one or more voice services to process a voice command. Implementation 900 is an example technique to cause a NMD to register at least one voice service.
a. Receive Input Data Indicating a Command to Register Voice Service(s)
At block 902, implementation 900 involves receiving input data indicating a command to register one or more voice services on one or more second devices. For instance, a first device (e.g., a NMD) may receive, via a user interface (e.g., a touchscreen), input data indicating a command to register one or more voice services with a media playback system that includes one or more playback devices.” (emphasis added)
Also see col. 25 lns. 11-18: “At block 904, implementation 900 involves detecting one or more voice services that are registered to the first device (e.g., the NMD). Such voice services may include voice services that are installed on the NMD or that are native to the NMD (e.g., part of the operating system of the NMD).
For instance, a NMD that is a smartphone or tablet may have installed one or more applications (“apps”) that interface with voice services. The NMD may detect these applications using any suitable technique.” (emphasis added)
Further, see col. 25 lns. 34-50: “At block 906, implementation 900 involves causing registration of at least one of the detected voice services to be registered on the one or more second devices. For instance, the NMD may cause at least one of the detected voice services to be registered with a media playback system that includes one or more playback devices (e.g., media playback system 100 of FIG. 1). Causing the a voice service to be registered may involve transmitting, via a network interface, a message indicating credentials for that voice service to the media playback system (i.e., at least one device thereof). The message may also include a command, request, or other query to cause the media playback system to register with the voice service using the credentials from the NMD. In such manner, a user's media playback system may have registered one or more of the same voice services as registered on the user's NMD (e.g., smartphone) utilizing the same credentials as the user's NMD, which may hasten registration.”
Thus, Wilberding’s NMDs, which include controllers, include the functionality of registering with one or more voice services, such as voice services associated with a user’s smartphone or tablet voice input device, which maps to the creation of voice input proxies for proxying for the voice input devices.
Further, Wilberding describes (3) that the NMDs include a voice daemon executing any number of voice daemon instances each respectively associated with one of any number of voice targets.
See col. 18 lns. 43-51: “In other cases, multiple voice services may be available to the NMD for processing of the voice input. In such cases, the NMD may identify a particular voice service of the multiple voice services to process the voice input. For instance, the NMD may identify a particular voice service from among multiple voice services registered to a media playback system. As indicated above, the NMD may be part of the media playback system (e.g., as a playback device or controller device) or otherwise associated with the system.” (emphasis added)
Also see col. 19 lns. 5-19: “Determining that the particular wake-word corresponds to a specific voice service may involve querying one or more voice services with the voice data (e.g., the portion of the voice data corresponding to the wake-word or phrase). For instance, a voice service may provide an application programming interface that the NMD can invoke to determine that whether the voice data includes the wake-word or phrase corresponding to that voice service. The NMD may invoke the API by transmitting a particular query of the voice service to the voice service along with data representing the wake-word portion of the received voice data. Alternatively, the NMD may invoke the API on the NMD itself. Registration of a voice service with the NMD or with the media playback system may integrate the API or other architecture of the voice service with the NMD.” (emphasis added)
Thus, Wilberding’s NMDs, which include controllers, include the functionality of registering with one or more voice services, where the registration process includes the voice service providing an application programming interface that the NMD can invoke in order to send voice data to the voice service, which maps to a voice daemon executing any number of voice daemon instances each respectively associated with one of any number of voice targets.
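For illustration only, the wake-word identification and per-zone default behavior quoted above from Wilberding can be sketched as follows. This is a minimal sketch under stated assumptions and is not code from the reference: the class `Nmd`, the method `identify_service`, the table `WAKE_WORDS`, and the service labels are hypothetical names. Only the wake-word phrases and the notion of a default voice service per zone come from the quoted passages.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

# Hypothetical wake-word table. The phrases are those quoted from Wilberding;
# the service labels on the right are illustrative names only.
WAKE_WORDS: Dict[str, str] = {
    "hey, siri": "apple",
    "ok, google": "google",
    "alexa": "amazon",
    "hey, cortana": "microsoft",
}


@dataclass
class Nmd:
    """Illustrative stand-in for a networked microphone device (NMD)."""
    zone: str
    # Voice services registered to this NMD (assumed representation).
    registered: Set[str] = field(default_factory=set)
    # Zone name -> default voice service for that zone (assumed representation).
    zone_defaults: Dict[str, str] = field(default_factory=dict)

    def identify_service(self, transcript: str) -> Optional[str]:
        """Pick the voice service that should process a voice input.

        A wake-word for a registered service wins. Otherwise the default
        service configured for this NMD's zone is used, if it is registered.
        """
        text = transcript.strip().lower()
        for phrase, service in WAKE_WORDS.items():
            if text.startswith(phrase) and service in self.registered:
                return service
        default = self.zone_defaults.get(self.zone)
        return default if default in self.registered else None


kitchen = Nmd(
    zone="Kitchen",
    registered={"amazon", "google"},
    zone_defaults={"Kitchen": "google"},
)
print(kitchen.identify_service("Alexa, play jazz"))            # amazon
print(kitchen.identify_service("Turn on the lights in here"))  # google
```

In this sketch an input with no recognized wake-word falls back to the zone default, which mirrors the quoted statement that a voice input detected by a given zone may be processed by the default voice service for that zone.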
Regarding applicant’s third argument: “Moreover, the claims recite, inter alia, that "the automation controller is communicatively connected to the voice targets." Wilberding does not disclose an automation controller at all, and therefore cannot disclose such a controller being communicatively connected to voice targets in the manner claimed.”
The examiner respectfully disagrees. As described above, at least Wilberding’s controller devices read on an automation controller and, as also described above, the NMDs, which may be controller devices, can communicate with multiple registered voice services.
Regarding applicant’s fourth argument: “Accordingly, Wilberding fails to disclose or suggest the claimed arrangement of an automation controller executing a voice coordinator, voice input proxies, and a voice daemon with multiple daemon instances associated with respective voice targets. The cited reference therefore does not anticipate or render obvious the claimed subject matter.”
The examiner respectfully disagrees. As discussed above, at least Wilberding’s controllers teach the claimed arrangement of an automation controller executing a voice coordinator, voice input proxies, and a voice daemon with multiple daemon instances associated with respective voice targets.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-11 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Wilberding (US 10,115,400, hereinafter “Wilberding”).
RE claim 1, Wilberding describes a whole home voice (WHV) system of an automation network connecting an automation server (FIG. 5 and col. 12 lns. 11-23: “the computing devices 504, 506, and 508 may be part of a cloud network 502. The cloud network 502 may include additional computing devices. In one example, the computing devices 504, 506, and 508 may be different servers.”), an automation controller (FIG. 5, col. 11 ln. 65-col. 12 ln. 2 “controller device (CR) 522”), and any number of automation devices (FIG. 5 and col. 11 ln. 65-col. 12 ln. 2 “network microphone devices (NMDs) 512, 514, and 516; playback devices (PBDs) 532, 534, 536, and 538”), the WHV system comprising:
any number of voice input devices for receiving voice inputs vocalized by a user of the automation network (col. 12 lns. 24-48 describes network microphone devices (NMDs). Further, col. 16 lns. 45-55 describes that NMDs receive voice data indicating a voice input); and
an automation controller for executing: (1) a voice coordinator for coordinating WHV system operations (col. 8 lns. 8-22 “In one example, the control device 300 may be a dedicated controller for the media playback system 100. In another example, the control device 300 may be a network device on which media playback system controller application software may be installed, such as for example, an iPhone™, iPad™ or any other smart phone, tablet or network device (e.g., a networked computer such as a PC or Mac™)”), (2) any number of voice input proxies for proxying for the voice input devices (col. 18 lns. 3-51: “Available voice services may include voice services registered with the NMD. Registration of a given voice service with the NMD may involve providing user credentials (e.g., user name and password) of the voice service to the NMD and/or providing an identifier of the NMD to the voice service. Such registration may configure the NMD to receive voice inputs on behalf of the voice service and perhaps configure the voice service to accept voice inputs from the NMD for processing.”); and (3) a voice daemon executing any number of voice daemon instances each respectively associated with one of any number of voice targets (col. 19 lns. 5-34: “a voice service may provide an application programming interface that the NMD can invoke to determine that whether the voice data includes the wake-word or phrase corresponding to that voice service. The NMD may invoke the API by transmitting a particular query of the voice service to the voice service along with data representing the wake-word portion of the received voice data. Alternatively, the NMD may invoke the API on the NMD itself. Registration of a voice service with the NMD or with the media playback system may integrate the API or other architecture of the voice service with the NMD.”),
wherein the automation controller is communicatively connected to the voice targets (col. 22 lns. 31-41: “At block 706, implementation 700 involves causing the identified voice service(s) to process the voice input. For instance, the NMD may transmit, via a network interface to one or more servers of the identified voice service(s), data representing the voice input and a command or query to process the data presenting the voice input. The command or query may cause the identified voice service(s) to process the voice command. The command or query may vary according to the identified voice service so as to conform the command or query to the identified voice service (e.g., to an API of the voice service).”).
RE claim 2, Wilberding describes the WHV system of claim 1, wherein the voice targets are associated with respective voice assistant services (col. 18 lns. 52-65: “For instance, the particular wake-word may be “Hey, Siri” to invoke APPLE®'s voice service, “Ok, Google” to invoke GOOGLE®'s voice service, “Alexa” to invoke AMAZON®'s voice service, or “Hey, Cortana” to invoke Microsoft's voice service.”).
RE claim 3, Wilberding describes the WHV system of claim 1, wherein the voice input devices are any of: a touchscreen, a home speaker, a stand-alone microphone, a remote control, a television, a home appliance, an intercom device, a cell phone, a computer, and a personal electronic device (col. 1 ln. 61-col. 2 ln. 2: an NMD may be a Sonos playback device, an Amazon Echo, or an Apple iPhone).
RE claim 4, Wilberding describes the WHV system of claim 1, wherein the voice coordinator determines mapping information for mapping any of the voice input devices to any of the voice targets (col. 25 lns. 10-24 “At block 904, implementation 900 involves detecting one or more voice services that are registered to the first device (e.g., the NMD). Such voice services may include voice services that are installed on the NMD or that are native to the NMD (e.g., part of the operating system of the NMD).”).
RE claim 5, Wilberding describes the WHV system of claim 4, wherein the mapping information is determined according to any of device registration information and device polling information (col. 25 lns. 10-24 “At block 904, implementation 900 involves detecting one or more voice services that are registered to the first device (e.g., the NMD). Such voice services may include voice services that are installed on the NMD or that are native to the NMD (e.g., part of the operating system of the NMD).”).
RE claim 6, Wilberding describes the WHV system of claim 1, wherein the received voice inputs are recordings of sounds or speech vocalized by a user of the automation network (col. 17 lns. 32-37 “In some cases, the NMD may receive voice data indicating the voice input via a network interface, perhaps from another NMD within a household. The NMD may receive this recording in addition to receiving voice data indicating the voice input via a microphone (e.g., if the two NMDs are both within detection range of the voice input).”).
RE claim 7, Wilberding describes an automation controller having a processor, a memory, and a communication interface (FIG. 3 – control device 300), the automation controller configured to:
determine mapping information mapping any number of voice input devices to any number of voice targets (col. 25 lns. 10-31 “At block 904, implementation 900 involves detecting one or more voice services that are registered to the first device (e.g., the NMD). Such voice services may include voice services that are installed on the NMD or that are native to the NMD (e.g., part of the operating system of the NMD).”);
broadcast the mapping information to any number of devices communicatively connected to the automation controller via an automation network (col. 25 lns. 34-51 “the NMD may cause at least one of the detected voice services to be registered with a media playback system that includes one or more playback devices (e.g., media playback system 100 of FIG. 1). Causing the a voice service to be registered may involve transmitting, via a network interface, a message indicating credentials for that voice service to the media playback system (i.e., at least one device thereof).”);
transmit a message commanding a voice daemon to instantiate a voice daemon instance for each of the voice targets included in the mapping information (col. 25 lns. 34-51 “Causing the a voice service to be registered may involve transmitting, via a network interface, a message indicating credentials for that voice service to the media playback system (i.e., at least one device thereof). The message may also include a command, request, or other query to cause the media playback system to register with the voice service using the credentials from the NMD.” Also see col. 19 lns. 17-19: “Registration of a voice service with the NMD or with the media playback system may integrate the API or other architecture of the voice service with the NMD.”);
instantiate a respective voice input proxy for each of the voice input devices (col. 18 lns. 3-51: “Available voice services may include voice services registered with the NMD. Registration of a given voice service with the NMD may involve providing user credentials (e.g., user name and password) of the voice service to the NMD and/or providing an identifier of the NMD to the voice service. Such registration may configure the NMD to receive voice inputs on behalf of the voice service and perhaps configure the voice service to accept voice inputs from the NMD for processing.”); and
instruct the voice daemon to transmit user voice inputs according to the mapping information (col. 18 lns. 52-65 “Identification of a particular voice service to process the voice input may be based on a wake-word or phrase in the voice input. For instance, after receiving voice data indicating a voice input, the NMD may determine that a portion of the voice data represents a particular wake-word. Further, the NMD may determine that the particular wake-word corresponds to a specific voice service. In other words, the NMD may determine that the particular wake-word or phrase is used to invoke a specific voice service.”).
RE claim 8, Wilberding describes the automation controller of claim 7, wherein the voice targets are associated with respective voice assistant services (col. 18 lns. 52-65: “For instance, the particular wake-word may be “Hey, Siri” to invoke APPLE®'s voice service, “Ok, Google” to invoke GOOGLE®'s voice service, “Alexa” to invoke AMAZON®'s voice service, or “Hey, Cortana” to invoke Microsoft's voice service.”).
RE claim 9, Wilberding describes the automation controller of claim 7, wherein the voice input devices are any of: a touchscreen, a home speaker, a stand-alone microphone, a remote control, a television, a home appliance, an intercom device, a cell phone, a computer, and a personal electronic device (col. 1 ln. 61-col. 2 ln. 2: an NMD may be a Sonos playback device, an Amazon Echo, or an Apple iPhone).
RE claim 10, Wilberding describes the automation controller of claim 7, wherein the mapping information is determined by mapping any one or more of the voice input devices to any one or more of the voice targets (col. 25 lns. 10-24 “At block 904, implementation 900 involves detecting one or more voice services that are registered to the first device (e.g., the NMD). Such voice services may include voice services that are installed on the NMD or that are native to the NMD (e.g., part of the operating system of the NMD).”).
RE claim 11, Wilberding describes the automation controller of claim 10, wherein the mapping information is determined according to any of device registration information and device polling information (col. 25 lns. 10-24 “At block 904, implementation 900 involves detecting one or more voice services that are registered to the first device (e.g., the NMD). Such voice services may include voice services that are installed on the NMD or that are native to the NMD (e.g., part of the operating system of the NMD).”).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure – see references cited on PTO-892.
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
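As an arithmetic illustration of the reply period described above, the following sketch computes the expiration date from a mailing date. It assumes plain calendar-month arithmetic and does not account for weekends, federal holidays, or extension-of-time fees. The function names and the example dates are hypothetical and are not taken from this action.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(d: date, months: int) -> date:
    """Return the same day-of-month `months` later, clamped to the month's last day."""
    index = d.month - 1 + months
    year, month = d.year + index // 12, index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def reply_period_end(mailed: date,
                     first_reply: Optional[date] = None,
                     advisory_mailed: Optional[date] = None) -> date:
    """Expiration of the shortened statutory period for reply to a final action.

    Default: three months from mailing. If a first reply is filed within two
    months and the advisory action is mailed after the three-month date, the
    period ends on the advisory mailing date, capped at six months from mailing.
    """
    three = add_months(mailed, 3)
    six = add_months(mailed, 6)
    end = three
    if (first_reply is not None and advisory_mailed is not None
            and first_reply <= add_months(mailed, 2)
            and advisory_mailed > three):
        end = advisory_mailed
    return min(end, six)


mailed = date(2025, 1, 15)  # hypothetical mailing date
print(reply_period_end(mailed))                                      # 2025-04-15
print(reply_period_end(mailed, date(2025, 3, 1), date(2025, 5, 2)))  # 2025-05-02
print(reply_period_end(mailed, date(2025, 3, 1), date(2025, 8, 1)))  # 2025-07-15
```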
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Daniel C Washburn whose telephone number is (571)272-5551. The examiner can normally be reached Monday-Friday 9:00 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/DANIEL C WASHBURN/
Supervisory Patent Examiner, Art Unit 2657