DETAILED ACTION
1. This communication is in response to the Amendments and Arguments filed on 3/5/2026. Claims 1-20 are pending and have been examined.
Response to Amendments and Arguments
2. Regarding applicant’s amended claims, note that these claims necessitate the new ground(s) of rejection presented in this Office action, because these claims introduce new issues and/or change the scope of the claims.
Applicant's arguments with respect to claim rejections under 35 USC 101 (mental process) have been fully considered, but they are not persuasive. In particular, the applicant argues that the claims recite (1) detecting the first hotword based on audio features; and (2) enabling a first hotword detector and a second hotword detector in parallel.
Note that for (1), the claims do not recite any specific “audio features” such as MFCC as the applicant argues. As such, “audio features” can be broadly interpreted to include audio loudness, tone, pitch, and so on, which a person can mentally recognize. This also applies to hot phrases or key phrases, where more than a single word is to be detected or spotted.
For (2), note that a human is able to perform parallel tasks, unless the applicant can recite further details from the Specification showing how this cannot be performed by a human mental process.
Applicant's arguments with respect to claim rejections under 35 USC 103 have been fully considered, but they are not persuasive. In particular, the applicant argues that (1) the cited references do not teach the limitation “detect the first hotword .. based on audio features that are indicative of the first term ..” In response, the examiner respectfully disagrees.
WEINTRAUB teaches: [Abstract] “Hidden Markov model (HMM) systems have been used for speaker-independent continuous-speech recognition (CSR) as well as for keyword-spotting tasks <read on ‘hotword’>” and [sec 2.1, para 1] “Six spectral features are used to model the speech signal <read on ‘audio features’>: the cepstral vector (C1-CN) and its first and second derivatives, and cepstral energy (C0) and its first and second derivatives. These features are computed from an FFT filterbank and subsequent high-pass RASTA filtering of the filterbank log energies, and are modeled either with VQ and scalar codebooks or with tied-mixture Gaussian models.”
The applicant further argues that (2) the cited references do not teach the limitation “a first hotword detector and a second hotword detector in parallel ..” In response, the examiner respectfully disagrees.
NING teaches: [Abstract] “our approach can improve the efficiency of executing parallel keyword search” and [sec. II. C] “For keyword searching over ordinary XML documents with high efficiency, Kurita[6] and Zhang[12] provided an XML search method in distributed environment which can search the results parallel, and then sent all the results from distributed nodes.”
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
3. Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The independent claims 1, 11 recite a method and a system, thus relating to a statutory category.
Claims 1, 11 recite “receiving a user input indication that designates a first term as a first hotword; receiving another user input indication that designates a second term followed by a third term as a second hotword .. enabling a first hotword detector and a second hotword detector in parallel ..” The limitations as drafted cover mental processes, where a human, given a user’s voice input designating hotwords, can perform hotword detection on subsequent utterances containing the designated hotwords. Furthermore, see the examiner’s response in the Response to Amendments and Arguments section.
This judicial exception is not integrated into a practical application. In particular, independent claims 1, 11 recite additional elements of “data processing hardware” and “memory hardware” (SPEC [0075] - The computing device includes a processor, a memory, a storage device .. a display coupled to the high-speed interface) which amount to general purpose computing devices. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional element of using a processor amounts to a generic computer. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Further, the additional limitations in the claims noted above are directed towards insignificant extra-solution activity. The claims are not patent eligible.
With respect to claims 2 and 12, the claims recite “prompting a user to designate the first term as the first hotword ..” where a human can receive a prompt with any term and designate any hotword. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
With respect to claims 3 and 13, the claims recite “prompting a user to designate the second term followed by the third term as the second hotword ..” where a human can receive a prompt with multiple terms and designate any hotword. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
With respect to claims 4 and 14, the claims recite “wherein the first hotword detector .. to wake up the client device ..” where a human can detect the first hotword for a wakeup action (subject to BRI). The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
With respect to claims 5 and 15, the claims recite “wherein the second hotword detector .. to wake up the client device ..” where a human can detect the second hotword for a wakeup action (subject to BRI). The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
With respect to claims 6 and 16, the claims recite “the second term is different than the first term and the third term ..” where a human can differentiate the second term from the first and the third terms. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
With respect to claims 7 and 17, the claims recite “the first term is different than the third term ..” where a human can differentiate the first term from the third term. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
With respect to claims 8 and 18, the claims recite “the first hotword detector comprises a neural network classifier trained using machine learning ..” where a neural network is considered a generic computer. It is suggested that a particular type of neural network with specific training, such as an RNN with supervised training, be recited. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
With respect to claims 9 and 19, the claims recite “the second hotword detector comprises a neural network classifier trained using machine learning ..” where a neural network is considered a generic computer. It is suggested that a particular type of neural network with specific training, such as an RNN with supervised training, be recited. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
With respect to claims 10 and 20, the claims recite “after enabling the first hotword detector and the second hotword detector in parallel, undesignating a fourth term that is designated as a third hotword ..” where a human can designate/enable or undesignate/delete any term(s) as hotwords in any manner such as in parallel. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Claim Rejections - 35 USC § 103
4. Claims 1-3, 6-7, 10-13, 16-17, 20 are rejected under 35 U.S.C. 103 as being unpatentable over Gao, et al. (IEEE 2011; hereinafter GAO) in view of Ning, et al. (2012 International Conference on System Science and Engineering; hereinafter NING), further in view of Sakagami, et al. (Computer Networks and ISDN Systems, 1997; hereinafter SAKAGAMI) and further in view of Weintraub (1993 IEEE; hereinafter WEINTRAUB).
As per claim 1, GAO (Title: The Hot Keyphrase Extraction based on TF*PDF) discloses “A computer-implemented method executed on data processing hardware that causes the data processing hardware to perform operations comprising: [ receiving a user input indication that designates a first term as a first hotword ]; receiving another user input indication that designates a second term followed by a third term as a second hotword (GAO, [Introduction, para 3], Keyphrases <read on ‘a second term followed by a third term’> have clearer meanings than hot terms <read on ‘a first term’> and thus are better representations for topics, for example, “nature language processing” vs “nature”, “language”, “processing”); and based on designating the first term as the first hotword and the second term followed by the third term as the second hotword, [ enabling, on a client device, a first hotword detector and a second hotword detector ] [ in parallel ], wherein: the first hotword detector is configured to [ detect the first hotword in utterances based on audio features ] that are indicative of the first term that include the first term; and the second hotword detector configured to detect, in parallel with the first hotword detector detecting the first hotword, the second hotword in utterances [ based on audio features ] that are indicative of the second term followed by the second (?) term <The applicant is requested to correct this> that include the second term followed by the third term (GAO, [A. Hot Terms Extraction, para 4] <read on ‘hotword detector’ on any device>, the top-ranked k term can be chosen to construct the candidate hot terms list; [B. Hot Keyphrase Extraction, para 2] <read on ‘hotword detector’ on any device>, If there are some hot terms consecutively occurring in the same sentence, then these adjacent hot terms is combined into a candidate keyphrase).”
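For illustration only, the keyphrase step quoted from GAO (adjacent hot terms in the same sentence combined into a candidate keyphrase) can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `candidate_keyphrases`, the whitespace tokenization, and the minimum run length of two terms are hypothetical and are not taken from GAO, and the TF*PDF ranking that produces the hot terms list is not reproduced.

```python
from typing import List, Set


def candidate_keyphrases(sentence_tokens: List[str], hot_terms: Set[str]) -> List[str]:
    """Join runs of two or more consecutive hot terms into candidate keyphrases.

    Hypothetical helper: the hot_terms set is assumed to be given
    (e.g. a top-ranked k list); how it is ranked is out of scope here.
    """
    phrases: List[str] = []
    run: List[str] = []
    for token in sentence_tokens:
        if token in hot_terms:
            run.append(token)  # extend the current run of adjacent hot terms
        else:
            if len(run) >= 2:  # a single hot term is not a keyphrase
                phrases.append(" ".join(run))
            run = []
    if len(run) >= 2:  # flush a run that ends the sentence
        phrases.append(" ".join(run))
    return phrases


hot = {"nature", "language", "processing"}
tokens = "we study nature language processing and language".split()
print(candidate_keyphrases(tokens, hot))
```

In this sketch the trailing single hot term is left as a hot term rather than promoted to a keyphrase, which mirrors the single-term versus multi-term distinction relied on in the mapping above.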
GAO does not explicitly disclose “(a first hotword detector and a second hotword detector) in parallel ..” However, the limitation is taught by NING (Title: Parallel Processing the Keyword Search in Uncertain Environment).
In the related field of endeavor, NING teaches: [Abstract] “our approach can improve the efficiency of executing parallel keyword search” and [sec. II. C] “For keyword searching over ordinary XML documents with high efficiency, Kurita[6] and Zhang[12] provided an XML search method in distributed environment which can search the results parallel, and then sent all the results from distributed nodes.”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of NING in the system (as taught by GAO) for parallel processing of multiple hotword/keyword detection for improved efficiency.
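For illustration only, running a single-term detector and a two-term detector in parallel over the same input can be sketched as below. This is a minimal sketch under loud assumptions: `make_detector` and `run_in_parallel` are hypothetical names, text tokens stand in for utterances, and a thread pool stands in for whatever parallel mechanism is used; none of this is taken from NING, GAO, or the claims.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

Detector = Callable[[List[str]], bool]


def make_detector(phrase: Sequence[str]) -> Detector:
    """Build a detector that fires when the terms of `phrase` occur consecutively."""
    wanted = [p.lower() for p in phrase]

    def detect(tokens: List[str]) -> bool:
        toks = [t.lower() for t in tokens]
        n = len(wanted)
        # Slide a window of n tokens over the input and compare to the phrase.
        return any(toks[i:i + n] == wanted for i in range(len(toks) - n + 1))

    return detect


def run_in_parallel(detectors: Sequence[Detector], tokens: List[str]) -> List[bool]:
    """Submit every detector to a thread pool and collect results in order."""
    with ThreadPoolExecutor(max_workers=len(detectors)) as pool:
        futures = [pool.submit(d, tokens) for d in detectors]
        return [f.result() for f in futures]


first = make_detector(["computer"])        # one-term hotword (assumed example)
second = make_detector(["hey", "device"])  # two-term hotword (assumed example)
print(run_in_parallel([first, second], "hey device play music".split()))
```

Each detector sees the same input and reports independently, so adding or removing a detector does not change the others.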
GAO in view of NING does not explicitly disclose “receiving a user input indication that designates a first term as a first hotword … enabling .. a first hotword detector and a second hotword detector ..” However, the limitation is taught by SAKAGAMI (Title: Learning personal preferences on online newspaper articles from user behaviors).
In the related field of endeavor, SAKAGAMI teaches: [p. 1448, para 4] “The system asks users to register their interests in the form of keywords, topics, and so on” and [Introduction, para 1] “the user inputs keywords and then gets search results <read on the hotword detector having been enabled/activated/initialized to operate after the hotword having been registered/selected>.”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of SAKAGAMI in the system (as taught by GAO and NING) to prompt users to register hotwords/keywords in the system for hotword/keyword detection applications.
GAO in view of NING and SAKAGAMI does not explicitly disclose “detect the first hotword in utterances based on audio features ..” However, the limitation is taught by WEINTRAUB (Title: KEYWORD-SPOTTING USING SRI’S DECIPHER LARGE-VOCABULARY SPEECH-RECOGNITION SYSTEM).
In the related field of endeavor, WEINTRAUB teaches: [Abstract] “Hidden Markov model (HMM) systems have been used for speaker-independent continuous-speech recognition (CSR) as well as for keyword-spotting tasks” and [sec 2.1, para 1] “Six spectral features are used to model the speech signal: the cepstral vector (C1-CN) and its first and second derivatives, and cepstral energy (C0) and its first and second derivatives. These features are computed from an FFT filterbank and subsequent high-pass RASTA filtering of the filterbank log energies, and are modeled either with VQ and scalar codebooks or with tied-mixture Gaussian models.”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of WEINTRAUB in the system (as taught by GAO and NING and SAKAGAMI) to employ audio features for hotword or keyword spotting/detection applications.
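For illustration only, the feature set quoted from WEINTRAUB (a cepstral vector and cepstral energy, each with first and second time derivatives) can be sketched as below. This is a minimal sketch under stated assumptions: the quoted passage does not say how the derivatives are computed, so the central difference with repeated edge frames used here is an assumption, as are the names `deltas` and `stack_features`; the FFT filterbank and RASTA filtering steps are not reproduced.

```python
from typing import List

Frame = List[float]  # one frame of cepstral coefficients (energy may be included)


def deltas(frames: List[Frame]) -> List[Frame]:
    """Approximate the time derivative per coefficient.

    Assumption: central difference (next - previous) / 2, with the first and
    last frames repeated at the edges.
    """
    n = len(frames)
    out: List[Frame] = []
    for t in range(n):
        prev_frame = frames[max(t - 1, 0)]
        next_frame = frames[min(t + 1, n - 1)]
        out.append([(b - a) / 2.0 for a, b in zip(prev_frame, next_frame)])
    return out


def stack_features(cepstra: List[Frame]) -> List[Frame]:
    """Concatenate static, first-derivative, and second-derivative features per frame."""
    d1 = deltas(cepstra)
    d2 = deltas(d1)
    return [c + a + b for c, a, b in zip(cepstra, d1, d2)]


print(stack_features([[0.0], [2.0], [4.0]]))
```

Each output frame is three times the length of the input frame, which is the sense in which a static feature and its two derivatives form a single audio feature vector.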
As per claim 2 (dependent on claim 1), GAO in view of NING and SAKAGAMI and WEINTRAUB further discloses “generating an interface for output on a display, the interface prompting a user to designate the first term as the first hotword, wherein the user input indication that designates the first term as the first hotword is received after generating the interface for output on the display (SAKAGAMI, [p. 1448, para 4], The system asks users to register their interests in the form of keywords, topics, and so on <read on hotword, interface for any I/P and O/P devices, prompting, user input, designation and the associated timing for system components to work in sequence>).”
As per claim 3 (dependent on claim 1), GAO in view of NING and SAKAGAMI and WEINTRAUB further discloses “generating an interface for output on a display, the interface prompting a user to designate the second term followed by the third term as the second hotword, wherein the another user input indication that designates the second term followed by the third term as the second hotword is received after generating the interface for output on the display (SAKAGAMI, [p. 1448, para 4], The system asks users to register their interests in the form of keywords, topics, and so on <read on hotword, interface for any I/P and O/P devices, prompting, user input, designation and the associated timing for system components to work in sequence>).”
As per claim 6 (dependent on claim 1), GAO in view of NING and SAKAGAMI and WEINTRAUB further discloses “wherein the second term is different than the first term and the third term (GAO, [Introduction, para 3], Keyphrases have clearer meanings than hot terms and thus are better representations for topics, for example, “nature language processing” <read on general keyphrases with multiple terms> vs “nature”, “language”, “processing”).”
As per claim 7 (dependent on claim 6), GAO in view of NING and SAKAGAMI and WEINTRAUB further discloses “wherein the first term is different than the third term (GAO, [Introduction, para 3], Keyphrases have clearer meanings than hot terms and thus are better representations for topics, for example, “nature language processing” <read on general keyphrases with multiple terms> vs “nature”, “language”, “processing”).”
As per claim 10 (dependent on claim 1), GAO in view of NING and SAKAGAMI and WEINTRAUB further discloses “after enabling the first hotword detector and the second hotword detector in parallel, undesignating a fourth term that is designated as a third hotword (SAKAGAMI, [p. 1448, para 4], The system asks users to register their interests in the form of keywords, topics, and so on <read on a ready mechanism for registration/designation and the corresponding un-registration/undesignation of any hotword/keyword at any desired time, such as by simple deletion>).”
Claims 11-13, 16-17, 20 (similar in scope to claims 1-3, 6-7, 10, respectively) are rejected under the same rationale as detailed above for claims 1-3, 6-7, 10, respectively. Note that NING also teaches: [sec. V, para 1] “The algorithm run on a PC equipped with Intel Pentium 4 (R) 1.99 GHz CPU frequency processor, 1 G memory.”
5. Claims 4-5, 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over GAO in view of NING, SAKAGAMI and WEINTRAUB, and further in view of Parada San Martin, et al. (US 20150127594; also 61899449 filed on 11/4/2013; hereinafter Parada San Martin).
As per claim 4 (dependent on claim 1), GAO in view of NING, SAKAGAMI and WEINTRAUB further discloses “wherein the first hotword detector enabled on the client device is configured to [ wake up the client device from a sleep state or hibernation state ] when [[the]] first hotword detector detects an utterance of the first term.”
GAO in view of NING, SAKAGAMI and WEINTRAUB does not explicitly disclose “wake up the client device from a sleep state or hibernation state ..” However, the limitation is taught by Parada San Martin (Title: Transfer learning for deep neural network based hotword detection).
In the related field of endeavor, Parada San Martin teaches: [0002] “Automatic speech recognition .. to use voice commands to wake up and have basic spoken interactions with the device. For example, it may be desirable to recognize a “hotword” that signals that the mobile device should activate when the mobile device is in a sleep state.”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Parada San Martin in the system (as taught by GAO, NING, SAKAGAMI and WEINTRAUB) for waking up a client device when a hotword is detected.
As per claim 5 (dependent on claim 1), GAO in view of NING, SAKAGAMI and WEINTRAUB further discloses “wherein the second hotword detector enabled on the client device is configured to [wake up the client device from a sleep state or hibernation state ] when the second hotword detector detects an utterance of the second term followed by the third term.”
GAO in view of NING, SAKAGAMI and WEINTRAUB does not explicitly disclose “wake up the client device from a sleep state or hibernation state ..” However, the limitation is taught by Parada San Martin (Title: Transfer learning for deep neural network based hotword detection).
In the related field of endeavor, Parada San Martin teaches: [0002] “Automatic speech recognition .. to use voice commands to wake up and have basic spoken interactions with the device. For example, it may be desirable to recognize a “hotword” that signals that the mobile device should activate when the mobile device is in a sleep state.”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Parada San Martin in the system (as taught by GAO, NING, SAKAGAMI and WEINTRAUB) for waking up a client device when a hotword (with multiple terms) is detected.
Claims 14-15 (similar in scope to claims 4-5, respectively) are rejected under the same rationale as detailed above for claims 4-5, respectively.
6. Claims 8-9, 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over GAO in view of NING, SAKAGAMI and WEINTRAUB, and further in view of Fernandez, et al. (ICANN 2007; hereinafter Fernandez).
As per claim 8 (dependent on claim 1), GAO in view of NING, SAKAGAMI and WEINTRAUB further discloses “wherein the first hotword detector comprises [ a neural network classifier trained using machine learning to detect the first hotword based on audio features indicative of the first hotword ].”
GAO in view of NING, SAKAGAMI and WEINTRAUB does not explicitly disclose “a neural network classifier trained using machine learning to detect the first hotword based on audio features indicative of the first hotword.” However, the limitation is taught by Fernandez (Title: An Application of Recurrent Neural Networks to Discriminative Keyword Spotting).
In the related field of endeavor, Fernandez teaches: Title and [sec. 1, para 1] “The goal of keyword spotting <read on ‘hotword detection’> is to detect the presence of specific spoken words in (typically) unconstrained speech. Applications of keyword spotting include audio indexing, detection of command words in interactive environments and spoken password verification.” Note that RNN (in Title) also reads on training with audio features.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Fernandez in the system (as taught by GAO, NING, SAKAGAMI and WEINTRAUB) for the implementation of hotword/keyword detection based on a neural network.
As per claim 9 (dependent on claim 1), GAO in view of NING, SAKAGAMI and WEINTRAUB further discloses “wherein the second hotword detector comprises [ a neural network classifier trained using machine learning to detect the second hotword based on audio features indicative of the second hotword ].”
GAO in view of NING, SAKAGAMI and WEINTRAUB does not explicitly disclose “a neural network classifier trained using machine learning to detect the second hotword based on audio features indicative of the second hotword.” However, the limitation is taught by Fernandez (Title: An Application of Recurrent Neural Networks to Discriminative Keyword Spotting).
In the related field of endeavor, Fernandez teaches: Title and [sec. 1, para 1] “The goal of keyword spotting <read on ‘hotword detection’> is to detect the presence of specific spoken words in (typically) unconstrained speech. Applications of keyword spotting include audio indexing, detection of command words in interactive environments and spoken password verification.” Note that RNN (in Title) also reads on training with audio features.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Fernandez in the system (as taught by GAO, NING, SAKAGAMI and WEINTRAUB) for the implementation of hotword/keyword detection based on a neural network.
Claims 18-19 (similar in scope to claims 8-9, respectively) are rejected under the same rationale as detailed above for claims 8-9, respectively.
Conclusion
7. Applicant's amendment necessitates the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FENG-TZER TZENG whose telephone number is 571-272-4609. The examiner can normally be reached on M-F (8:30-5:00). The fax phone number where this application or proceeding is assigned is 571-273-4609.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Paras Shah (SPE) can be reached on 571-270-1650.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/FENG-TZER TZENG/ 3/25/2026
Primary Examiner, Art Unit 2653