DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statements (IDSs) submitted on 7/1/2024, 9/26/2024, 12/23/2024, 2/12/2025, 3/27/2025, 8/25/2025, 9/22/2025, and 10/14/2025 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-3, 8-10, and 15-17 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Shah et al. (US 2021/0390948).
Regarding claim 1, Shah et al. teach a method implemented by one or more processors (¶ 0043 S2: “Each user device 102” (client device) “may correspond to a computing device, such as a mobile phone, computer, smart speaker, smart appliance, smart headphones, wearable, etc., and is equipped with data processing hardware 103” (implemented by one or more processors) “and memory hardware 105”),
the method comprising:
receiving, via one or more microphones of a client device, audio data that captures a spoken utterance of a user (¶ 0022 2 S. before last: “The operations include receiving” (receiving) “streaming audio” (audio data) “captured by one or more microphones” (via one or more microphones of a client device) “in communication with the data processing hardware”);
processing the audio data using a machine learning model to generate a predicted output (¶ 0043 S. before last: “the user device 102” (the client device) “receives a trained neural network 130” (uses machine learning) “(e.g., a memorized neural network) from the remote system 110 via the network 104 and executes the trained neural network 130 to detect hotwords” (to predict an output (predicted output)) “in streaming audio 118” (by processing the audio data));
determining that the predicted output satisfies a threshold (¶ 0022 last 2 S.: “The operations also include generating, using a first stage hotword detector, a probability score indicating a presence of a hotword in audio features of the streaming audio. The operations include determining whether the probability score satisfies a hotword detection threshold” (determining that the “probability score” (predicted output) satisfies the “hotword detection threshold” (threshold)));
in response to determining that the predicted output satisfies the threshold, initiating an automated assistant function (¶ 0012 S3: “when the probability score satisfies the hotword detection threshold” (in response to the predicted output satisfying the threshold) “the method includes initiating, by the data processing hardware, a wake-up process” (initiating an automated assistant function) “on the user device for processing the hotword and/or one or more other terms following the hotword in the streaming audio”);
determining, based on user interface input received subsequent to initiating the automated assistant function, that the automated assistant function should not have been initiated (¶ 0012 last S: “When the hotword is not detected” (determining that the automated assistant function should not have been initiated even though the “wake-up process” was “initiat[ed]” (subsequent to initiating the automated assistant function)) “by the second stage hotword detector in the audio data, the method may include suppressing, by the data processing hardware, the wake-up process on the user device”); and
in response to determining that the predicted output satisfies the threshold and determining that the automated assistant function should not have been initiated, adjusting the threshold (¶ 0017 last S: “When the false acceptance rate associated with the first stage hotword detector satisfies the false acceptance rate threshold” (when it is determined that the automated assistant function should not have been initiated even while in the first stage the threshold was satisfied) “the operations include adjusting” (adjusting) “the hotword detection threshold” (the threshold) “of the first stage hotword detector”).
Regarding claim 2, Shah et al. teach the method according to claim 1, wherein the machine learning model is a hotword detection model, and further comprising training the hotword detection model based on determining that the automated assistant function should not have been initiated (¶ 0043 S. before last: “the user device 102” “receives a trained neural network 130” (the machine learning model) “(e.g., a memorized neural network) from the remote system 110 via the network 104 and executes the trained neural network 130 to detect hotwords” (is tailored to detection of hotwords) “in streaming audio 118”; ¶ 0046 S3: “the second stage” (when the automated assistant function is not supposed to be initiated in the “first stage”) “hotword detector 140 includes a different neural network” (training via the machine learning model is still performed) “that is potentially more computationally-intensive than the neural network 130 of the first stage hotword detector 120”).
Regarding claim 3, Shah et al. teach the method according to claim 1, wherein the adjusting the threshold comprises raising the threshold (¶ 0008 last S: “Adjusting” (adjusting the threshold) “the hotword detection threshold of the first stage hotword detector, in some examples, includes increasing a value of the hotword detection threshold” (comprises raising the threshold)).
Regarding claim 8, Shah et al. teach a computer program product comprising one or more non-transitory computer readable storage media having program instructions collectively stored on the one or more non-transitory computer-readable storage media (¶ 0071: “The storage device 830 is capable of providing mass storage for the computing device 800. In some implementations, the storage device 830 is a computer-readable medium. In various different implementations, the storage device 830 may be a floppy disk device, a hard disk device, an optical disk device, or a tape device, a flash memory or other similar solid state memory device, or an array of devices, including devices in a storage area network or other configurations. In additional implementations, a computer program product is tangibly embodied in an information carrier. The computer program product contains instructions that, when executed, perform one or more methods, such as those described above. The information carrier is a computer- or machine-readable medium, such as the memory 820, the storage device 830, or memory on processor 810”),
the program instructions executable to:
receive, via one or more microphones of a client device, audio data that captures a spoken utterance of a user (¶ 0022 2 S. before last: “The operations include receiving” (receiving) “streaming audio” (audio data) “captured by one or more microphones” (via one or more microphones of a client device) “in communication with the data processing hardware”);
process the audio data using a machine learning model to generate a predicted output (¶ 0043 S. before last: “the user device 102” (the client device) “receives a trained neural network 130” (uses machine learning) “(e.g., a memorized neural network) from the remote system 110 via the network 104 and executes the trained neural network 130 to detect hotwords” (to predict an output (predicted output)) “in streaming audio 118” (by processing the audio data));
determine that the predicted output satisfies a threshold (¶ 0022 last 2 S.: “The operations also include generating, using a first stage hotword detector, a probability score indicating a presence of a hotword in audio features of the streaming audio. The operations include determining whether the probability score satisfies a hotword detection threshold” (determining that the “probability score” (predicted output) satisfies the “hotword detection threshold” (threshold)));
in response to determining that the predicted output satisfies the threshold, initiating an automated assistant function (¶ 0012 S3: “when the probability score satisfies the hotword detection threshold” (in response to the predicted output satisfying the threshold) “the method includes initiating, by the data processing hardware, a wake-up process” (initiating an automated assistant function) “on the user device for processing the hotword and/or one or more other terms following the hotword in the streaming audio”);
determine, based on user interface input received subsequent to initiating the automated assistant function, that the automated assistant function should not have been initiated (¶ 0012 last S: “When the hotword is not detected” (determining that the automated assistant function should not have been initiated even though the “wake-up process” was “initiat[ed]” (subsequent to initiating the automated assistant function)) “by the second stage hotword detector in the audio data, the method may include suppressing, by the data processing hardware, the wake-up process on the user device”); and
in response to determining that the predicted output satisfies the threshold and determining that the automated assistant function should not have been initiated, adjusting the threshold (¶ 0017 last S: “When the false acceptance rate associated with the first stage hotword detector satisfies the false acceptance rate threshold” (when it is determined that the automated assistant function should not have been initiated even while in the first stage the threshold was satisfied) “the operations include adjusting” (adjusting) “the hotword detection threshold” (the threshold) “of the first stage hotword detector”).
Regarding claim 9, Shah et al. teach the computer program product according to claim 8, wherein the machine learning model is a hotword detection model, and the program instructions are further executable to train the hotword detection model based on determining that the automated assistant function should not have been initiated (¶ 0043 S. before last: “the user device 102” “receives a trained neural network 130” (the machine learning model) “(e.g., a memorized neural network) from the remote system 110 via the network 104 and executes the trained neural network 130 to detect hotwords” (is tailored to detection of hotwords) “in streaming audio 118”; ¶ 0046 S3: “the second stage” (when the automated assistant function is not supposed to be initiated in the “first stage”) “hotword detector 140 includes a different neural network” (training via the machine learning model is still performed) “that is potentially more computationally-intensive than the neural network 130 of the first stage hotword detector 120”).
Regarding claim 10, Shah et al. teach the computer program product according to claim 8, wherein the adjusting the threshold comprises raising the threshold (¶ 0008 last S: “Adjusting” (adjusting the threshold) “the hotword detection threshold of the first stage hotword detector, in some examples, includes increasing a value of the hotword detection threshold” (comprises raising the threshold)).
Regarding claim 15, Shah et al. teach a client device comprising:
one or more microphones (¶ 0022: “one or more microphones”);
one or more processors (¶ 0043 S2: “Each user device 102” (client device) “may correspond to a computing device, such as a mobile phone, computer, smart speaker, smart appliance, smart headphones, wearable, etc., and is equipped with data processing hardware 103” (one or more processors) “and memory hardware 105”); and
one or more non-transitory computer readable storage media and program instructions collectively stored on the one or more non-transitory computer-readable storage media (¶ 0071: “The storage device 830 is capable of providing mass storage for the computing device 800. In some implementations, the storage device 830 is a computer-readable medium. In various different implementations, the storage device 830 may be a floppy disk device, a hard disk device, an optical disk device, or a tape device, a flash memory or other similar solid state memory device, or an array of devices, including devices in a storage area network or other configurations. In additional implementations, a computer program product is tangibly embodied in an information carrier. The computer program product contains instructions that, when executed, perform one or more methods, such as those described above. The information carrier is a computer- or machine-readable medium, such as the memory 820, the storage device 830, or memory on processor 810”),
the program instructions executable to:
receive, via one or more microphones, audio data that captures a spoken utterance of a user (¶ 0022 2 S. before last: “The operations include receiving” (receiving) “streaming audio” (audio data) “captured by one or more microphones” (via one or more microphones of a client device) “in communication with the data processing hardware”);
process the audio data using a machine learning model to generate a predicted output (¶ 0043 S. before last: “the user device 102” (the client device) “receives a trained neural network 130” (uses machine learning) “(e.g., a memorized neural network) from the remote system 110 via the network 104 and executes the trained neural network 130 to detect hotwords” (to predict an output (predicted output)) “in streaming audio 118” (by processing the audio data));
determine that the predicted output satisfies a threshold (¶ 0022 last 2 S.: “The operations also include generating, using a first stage hotword detector, a probability score indicating a presence of a hotword in audio features of the streaming audio. The operations include determining whether the probability score satisfies a hotword detection threshold” (determining that the “probability score” (predicted output) satisfies the “hotword detection threshold” (threshold)));
in response to determining that the predicted output satisfies the threshold, initiating an automated assistant function (¶ 0012 S3: “when the probability score satisfies the hotword detection threshold” (in response to the predicted output satisfying the threshold) “the method includes initiating, by the data processing hardware, a wake-up process” (initiating an automated assistant function) “on the user device for processing the hotword and/or one or more other terms following the hotword in the streaming audio”);
determine, based on user interface input received subsequent to initiating the automated assistant function, that the automated assistant function should not have been initiated (¶ 0012 last S: “When the hotword is not detected” (determining that the automated assistant function should not have been initiated even though the “wake-up process” was “initiat[ed]” (subsequent to initiating the automated assistant function)) “by the second stage hotword detector in the audio data, the method may include suppressing, by the data processing hardware, the wake-up process on the user device”); and
in response to determining that the predicted output satisfies the threshold and determining that the automated assistant function should not have been initiated, adjusting the threshold (¶ 0017 last S: “When the false acceptance rate associated with the first stage hotword detector satisfies the false acceptance rate threshold” (when it is determined that the automated assistant function should not have been initiated even while in the first stage the threshold was satisfied) “the operations include adjusting” (adjusting) “the hotword detection threshold” (the threshold) “of the first stage hotword detector”).
Regarding claim 16, Shah et al. teach the client device according to claim 15, wherein the machine learning model is a hotword detection model, and the program instructions are executable to train the hotword detection model based on determining that the automated assistant function should not have been initiated (¶ 0043 S. before last: “the user device 102” “receives a trained neural network 130” (the machine learning model) “(e.g., a memorized neural network) from the remote system 110 via the network 104 and executes the trained neural network 130 to detect hotwords” (is tailored to detection of hotwords) “in streaming audio 118”; ¶ 0046 S3: “the second stage” (when the automated assistant function is not supposed to be initiated in the “first stage”) “hotword detector 140 includes a different neural network” (training via the machine learning model is still performed) “that is potentially more computationally-intensive than the neural network 130 of the first stage hotword detector 120”).
Regarding claim 17, Shah et al. teach the client device according to claim 15, wherein the adjusting the threshold comprises raising the threshold (¶ 0008 last S: “Adjusting” (adjusting the threshold) “the hotword detection threshold of the first stage hotword detector, in some examples, includes increasing a value of the hotword detection threshold” (comprises raising the threshold)).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 4, 11, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Shah et al. (US 2021/0390948) in view of Clark et al. (US 2017/0162192).
Regarding claim 4, Shah et al. teach the method according to claim 1 but do not specifically disclose prompting the user to indicate whether or not the spoken utterance includes the hotword, wherein the user interface input received subsequent to initiating the automated assistant function is received as a response to the prompting.
Clark et al. teach prompting the user to indicate whether or not the spoken utterance includes the hotword, wherein the user interface input received subsequent to initiating the automated assistant function is received as a response to the prompting (page 8 column 1 last ¶: “prompting a user to speak a candidate hotword, and receiving audio data corresponding to the user speaking the candidate hotword; and in response to determining that a length of the spoken candidate hotword satisfies a threshold, prompting the user to speak the candidate hotword again” (prompting a user to utter “again” a hotword that was not recognized properly the first time, which amounts to asking whether it was part of the initial “receiving audio data”), and these “hotwords,” also called “trigger” words as shown in Fig. 1 and ¶ 0028 line 3, serve “a speech recognition-enabled electronic device” (e.g., to initiate an automated assistant function)).
It would therefore have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the “prompting” procedure for “hotword” detection of Clark et al. into the “hotword” techniques of Shah et al. Doing so would enable the combined systems and their associated methods to perform in combination as they do separately and thus avoid “false rejection by the hotword detector,” as disclosed by Shah et al. (¶ 0040, column 1, last 5 lines).
Regarding claim 11, Shah et al. teach the computer program product according to claim 8 but do not specifically disclose wherein the program instructions are further executable to prompt the user to indicate whether or not the spoken utterance includes the hotword; and the user interface input received subsequent to initiating the automated assistant function is received as a response to the prompting.
Clark et al. teach prompting the user to indicate whether or not the spoken utterance includes the hotword, with the user interface input received subsequent to initiating the automated assistant function received as a response to the prompting (page 8 column 1 last ¶: “prompting a user to speak a candidate hotword, and receiving audio data corresponding to the user speaking the candidate hotword; and in response to determining that a length of the spoken candidate hotword satisfies a threshold, prompting the user to speak the candidate hotword again” (prompting a user to utter “again” a hotword that was not recognized properly the first time, which amounts to asking whether it was part of the initial “receiving audio data”), and these “hotwords,” also called “trigger” words as shown in Fig. 1 and ¶ 0028 line 3, serve “a speech recognition-enabled electronic device” (e.g., to initiate an automated assistant function)).
It would therefore have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the “prompting” procedure for “hotword” detection of Clark et al. into the “hotword” techniques of Shah et al. Doing so would enable the combined systems and their associated methods to perform in combination as they do separately and thus avoid “false rejection by the hotword detector,” as disclosed by Shah et al. (¶ 0040, column 1, last 5 lines).
Regarding claim 18, Shah et al. teach the client device according to claim 15 but do not specifically disclose wherein: the program instructions are further executable to prompt the user to indicate whether or not the spoken utterance includes the hotword, and the user interface input received subsequent to initiating the automated assistant function is received as a response to the prompting.
Clark et al. teach prompting the user to indicate whether or not the spoken utterance includes the hotword, with the user interface input received subsequent to initiating the automated assistant function received as a response to the prompting (page 8 column 1 last ¶: “prompting a user to speak a candidate hotword, and receiving audio data corresponding to the user speaking the candidate hotword; and in response to determining that a length of the spoken candidate hotword satisfies a threshold, prompting the user to speak the candidate hotword again” (prompting a user to utter “again” a hotword that was not recognized properly the first time, which amounts to asking whether it was part of the initial “receiving audio data”), and these “hotwords,” also called “trigger” words as shown in Fig. 1 and ¶ 0028 line 3, serve “a speech recognition-enabled electronic device” (e.g., to initiate an automated assistant function)).
It would therefore have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the “prompting” procedure for “hotword” detection of Clark et al. into the “hotword” techniques of Shah et al. Doing so would enable the combined systems and their associated methods to perform in combination as they do separately and thus avoid “false rejection by the hotword detector,” as disclosed by Shah et al. (¶ 0040, column 1, last 5 lines).
Claims 7, 14, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Shah et al. in view of Clark et al., and further in view of Cho et al. (US 2018/0131802).
Regarding claim 7, Shah et al. in view of Clark et al. teach the method according to claim 4 but do not specifically disclose wherein the prompting is further in response to determining that the user has not accessed the client device during a predetermined period of time.
Cho et al. teach wherein the prompting is further in response to determining that the user has not accessed the client device during a predetermined period of time (¶ 0016: “In the embodiment, if the set period” (a predetermined period of time) “of time elapses without” (that the user has not accessed the client device) “at least one touch input on the virtual keys being sensed, the speech recognition mode may be executed, and if there is no voice command input from the user in the speech recognition mode, a phrase prompting” (a prompt is made to the user to speak, e.g., a “command” (hotword)) “the user to speak a voice command may be output”).
It would therefore have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the methods of Cho et al. with respect to time management for uttering commands into the general hotword utterance techniques of Shah et al. in view of Clark et al. Doing so would enable the combined systems and their associated methods to perform in combination as they do separately and would further enable Shah et al. in view of Clark et al. to more efficiently manage reception of verbal utterances by reducing the time a device is idle.
Regarding claim 14, Shah et al. in view of Clark et al. teach the computer program product according to claim 11 but do not specifically disclose wherein the prompting is further in response to determining that the user has not accessed the client device during a predetermined period of time.
Cho et al. teach wherein the prompting is further in response to determining that the user has not accessed the client device during a predetermined period of time (¶ 0016: “In the embodiment, if the set period” (a predetermined period of time) “of time elapses without” (that the user has not accessed the client device) “at least one touch input on the virtual keys being sensed, the speech recognition mode may be executed, and if there is no voice command input from the user in the speech recognition mode, a phrase prompting” (a prompt is made to the user to speak, e.g., a “command” (hotword)) “the user to speak a voice command may be output”).
It would therefore have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the methods of Cho et al. with respect to time management for uttering commands into the general hotword utterance techniques of Shah et al. in view of Clark et al. Doing so would enable the combined systems and their associated methods to perform in combination as they do separately and would further enable Shah et al. in view of Clark et al. to more efficiently manage reception of verbal utterances by reducing the time a device is idle.
Regarding claim 20, Shah et al. in view of Clark et al. teach the client device according to claim 18 but do not specifically disclose wherein the prompting is further in response to determining that the user has not accessed the client device during a predetermined period of time.
Cho et al. teach wherein the prompting is further in response to determining that the user has not accessed the client device during a predetermined period of time (¶ 0016: “In the embodiment, if the set period” (a predetermined period of time) “of time elapses without” (that the user has not accessed the client device) “at least one touch input on the virtual keys being sensed, the speech recognition mode may be executed, and if there is no voice command input from the user in the speech recognition mode, a phrase prompting” (a prompt is made to the user to speak, e.g., a “command” (hotword)) “the user to speak a voice command may be output”).
It would therefore have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the methods of Cho et al. with respect to time management for uttering commands into the general hotword utterance techniques of Shah et al. in view of Clark et al. Doing so would enable the combined systems and their associated methods to perform in combination as they do separately and would further enable Shah et al. in view of Clark et al. to more efficiently manage reception of verbal utterances by reducing the time a device is idle.
Allowable Subject Matter
Claims 5-6, 12-13, 19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Double Patenting
A rejection based on double patenting of the “same invention” type finds its support in the language of 35 U.S.C. 101 which states that “whoever invents or discovers any new and useful process... may obtain a patent therefor...” (Emphasis added). Thus, the term “same invention,” in this context, means an invention drawn to identical subject matter. See Miller v. Eagle Mfg. Co., 151 U.S. 186 (1894); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Ockert, 245 F.2d 467, 114 USPQ 330 (CCPA 1957).
A statutory type (35 U.S.C. 101) double patenting rejection can be overcome by canceling or amending the claims that are directed to the same invention so they are no longer coextensive in scope. The filing of a terminal disclaimer cannot overcome a double patenting rejection based upon 35 U.S.C. 101.
Claims 1-19 are rejected under 35 U.S.C. 101 as claiming the same invention as that of claims 1-19 of prior U.S. Patent No. 12,027,160. This is a statutory double patenting rejection.
Instant Application No. 18/761,117:
1. A method implemented by one or more processors, the method comprising: receiving, via one or more microphones of a client device, audio data that captures a spoken utterance of a user; processing the audio data using a machine learning model to generate a predicted output; determining that the predicted output satisfies a threshold; in response to determining that the predicted output satisfies the threshold, initiating an automated assistant function; determining, based on user interface input received subsequent to initiating the automated assistant function, that the automated assistant function should not have been initiated; and in response to determining that the predicted output satisfies the threshold and determining that the automated assistant function should not have been initiated, adjusting the threshold.
2. The method according to claim 1, wherein the machine learning model is a hotword detection model, and further comprising training the hotword detection model based on determining that the automated assistant function should not have been initiated.
3. The method according to claim 1, wherein the adjusting the threshold comprises raising the threshold.
4. The method according to claim 1, further comprising prompting the user to indicate whether or not the spoken utterance includes the hotword, wherein the user interface input received subsequent to initiating the automated assistant function is received as a response to the prompting.
5. The method according to claim 4, wherein the prompting is in response to determining that a number of times that the user has been previously prompted does not exceed a rate limit.
6. The method according to claim 4, wherein the prompting is in response to determining that a do not disturb state is disabled.
7. The method according to claim 4, wherein the prompting is further in response to determining that the user has not accessed the client device during a predetermined period of time.
9. The computer program product according to claim 8, wherein the machine learning model is a hotword detection model, and the program instructions are further executable to train the hotword detection model based on determining that the automated assistant function should not have been initiated.
10. The computer program product according to claim 8, wherein the adjusting the threshold comprises raising the threshold.
11. The computer program product according to claim 8, wherein: the program instructions are further executable to prompt the user to indicate whether or not the spoken utterance includes the hotword; and the user interface input received subsequent to initiating the automated assistant function is received as a response to the prompting.
12. The computer program product according to claim 11, wherein the prompting is in response to determining that a number of times that the user has been previously prompted does not exceed a rate limit.
13. The computer program product according to claim 11, wherein the prompting is in response to determining that a do not disturb state is disabled.
14. The computer program product according to claim 11, wherein the prompting is further in response to determining that the user has not accessed the client device during a predetermined period of time.
16. The client device according to claim 15, wherein the machine learning model is a hotword detection model, and the program instructions are further executable to train the hotword detection model based on determining that the automated assistant function should not have been initiated.
17. The client device according to claim 15, wherein the adjusting the threshold comprises raising the threshold.
18. The client device according to claim 15, wherein: the program instructions are further executable to prompt the user to indicate whether or not the spoken utterance includes the hotword; and the user interface input received subsequent to initiating the automated assistant function is received as a response to the prompting.
19. The client device according to claim 18, wherein the prompting is in response to determining that a number of times that the user has been previously prompted does not exceed a rate limit.
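For context, the method recited in claims 1-7 above can be sketched as follows. This is an illustrative sketch only, not the patentee's implementation; all class, method, and parameter names (`HotwordGate`, `step`, `prompt_limit`, etc.) are hypothetical.

```python
class HotwordGate:
    """Sketch of the claimed threshold-adjustment method."""

    def __init__(self, threshold=0.5, step=0.05, prompt_limit=3):
        self.threshold = threshold      # claim 1: the threshold
        self.step = step                # amount to raise on a false accept
        self.prompt_limit = prompt_limit
        self.prompts_issued = 0

    def predict(self, score):
        # claim 1: determine whether the model's predicted output
        # satisfies the threshold (stand-in for the ML model output)
        return score >= self.threshold

    def may_prompt(self, do_not_disturb=False):
        # claims 5-6: prompt only under a rate limit and only when a
        # do-not-disturb state is disabled
        return (not do_not_disturb
                and self.prompts_issued < self.prompt_limit)

    def record_feedback(self, was_false_trigger):
        # claim 1: user interface input indicates the assistant
        # function should not have been initiated
        self.prompts_issued += 1
        if was_false_trigger:
            # claim 3: adjusting the threshold comprises raising it
            self.threshold = min(1.0, self.threshold + self.step)
```

Raising the threshold after a confirmed false accept makes future spurious activations less likely, at the cost of requiring a stronger hotword score, which matches the trade-off the claims describe.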
US Patent No. 12,027,160
1. A method implemented by one or more processors, the method comprising: receiving, via one or more microphones of a client device, audio data that captures a spoken utterance of a user; processing the audio data using a machine learning model to generate a predicted output; determining that the predicted output satisfies a threshold; in response to determining that the predicted output satisfies the threshold, initiating an automated assistant function; determining, based on user interface input received subsequent to initiating the automated assistant function, that the automated assistant function should not have been initiated; and in response to determining that the predicted output satisfies the threshold and determining that the automated assistant function should not have been initiated, adjusting the threshold.
2. The method according to claim 1, wherein the machine learning model is a hotword detection model, and further comprising training the hotword detection model based on determining that the automated assistant function should not have been initiated.
3. The method according to claim 1, wherein the adjusting the threshold comprises raising the threshold.
4. The method according to claim 1, further comprising prompting the user to indicate whether or not the spoken utterance includes the hotword, wherein the user interface input received subsequent to initiating the automated assistant function is received as a response to the prompting.
5. The method according to claim 4, wherein the prompting is in response to determining that a number of times that the user has been previously prompted does not exceed a rate limit.
6. The method according to claim 4, wherein the prompting is in response to determining that a do not disturb state is disabled.
7. The method according to claim 4, wherein the prompting is further in response to determining that the user has not accessed the client device during a predetermined period of time.
9. The computer program product according to claim 8, wherein: the machine learning model is a hotword detection model; and the program instructions are further executable to train the hotword detection model based on determining that the automated assistant function should not have been initiated.
10. The computer program product according to claim 8, wherein the adjusting the threshold comprises raising the threshold.
11. The computer program product according to claim 8, wherein: the program instructions are further executable to prompt the user to indicate whether or not the spoken utterance includes the hotword; and the user interface input received subsequent to initiating the automated assistant function is received as a response to the prompting.
12. The computer program product according to claim 11, wherein the prompting is in response to determining that a number of times that the user has been previously prompted does not exceed a rate limit.
13. The computer program product according to claim 11, wherein the prompting is in response to determining that a do not disturb state is disabled.
14. The computer program product according to claim 11, wherein the prompting is further in response to determining that the user has not accessed the client device during a predetermined period of time.
16. The system according to claim 15, wherein: the machine learning model is a hotword detection model; and the program instructions are further executable to train the hotword detection model based on determining that the automated assistant function should not have been initiated.
17. The system according to claim 15, wherein the adjusting the threshold comprises raising the threshold.
18. The system according to claim 15, wherein: the program instructions are further executable to prompt the user to indicate whether or not the spoken utterance includes the hotword; and the user interface input received subsequent to initiating the automated assistant function is received as a response to the prompting.
19. The system according to claim 18, wherein the prompting is in response to determining that a number of times that the user has been previously prompted does not exceed a rate limit.
20. The system according to claim 18, wherein the prompting is in response to determining that a do not disturb state is disabled.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FARZAD KAZEMINEZHAD whose telephone number is (571)270-5860. The examiner can normally be reached from 10:30 am to 11:30 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Paras D Shah can be reached at (571) 270-1650. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Farzad Kazeminezhad/
Art Unit 2653
January 10th 2026.