Prosecution Insights
Last updated: April 19, 2026
Application No. 18/765,108

VOICE SHORTCUT DETECTION WITH SPEAKER VERIFICATION

Non-Final OA §DP
Filed: Jul 05, 2024
Examiner: ROBERTS, SHAUN A
Art Unit: 2655
Tech Center: 2600 (Communications)
Assignee: Google LLC
OA Round: 1 (Non-Final)
Grant Probability: 76% (Favorable)
OA Rounds: 1-2
To Grant: 2y 10m
With Interview: 86%

Examiner Intelligence

Grants 76% (above average)
Career Allow Rate: 76% (491 granted / 647 resolved; +13.9% vs TC avg)
Interview Lift: +10.3% among resolved cases with interview (moderate, roughly +10%)
Typical timeline: 2y 10m average prosecution; 31 currently pending
Career history: 678 total applications across all art units
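The headline figures above can be cross-checked with a few lines of arithmetic. This is a minimal sketch, not the page's actual method: it assumes the "+13.9%" and "+10.3%" figures are percentage-point differences and that the interview lift is simply added to the career allow rate. The variable names are illustrative.

```python
# Figures copied from the examiner summary above.
granted = 491
resolved = 647
pending = 31
total_applications = 678

# Career allow rate, in percent.
allow_rate = granted / resolved * 100

# Assumption: "+13.9% vs TC avg" is a percentage-point difference.
implied_tc_avg = allow_rate - 13.9

# Assumption: the "+10.3%" interview lift is additive in percentage points.
with_interview = allow_rate + 10.3

print(f"Career allow rate:      {allow_rate:.1f}%")
print(f"Implied TC average:     {implied_tc_avg:.1f}%")
print(f"Allow rate w/ interview: {with_interview:.1f}%")

# Resolved plus pending cases should account for every application.
print("Counts reconcile:", resolved + pending == total_applications)
```

Under these assumptions the numbers are internally consistent: 491/647 rounds to the reported 76%, adding the interview lift rounds to the reported 86%, and 647 resolved plus 31 pending equals the 678 total applications.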

Statute-Specific Performance

§101: 7.6% (-32.4% vs TC avg)
§103: 49.2% (+9.2% vs TC avg)
§102: 29.5% (-10.5% vs TC avg)
§112: 3.5% (-36.5% vs TC avg)
Tech Center average is an estimate. Based on career data from 647 resolved cases.
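The Tech Center baseline behind each statute row can be recovered by subtracting the reported difference from the examiner's figure. The sketch below assumes the differences are percentage points; the page does not say what the per-statute percentages measure, so the code only checks the arithmetic. The names `statute_stats` and `implied_tc_average` are illustrative.

```python
# (examiner figure in %, reported difference vs. TC average in points)
statute_stats = {
    "§101": (7.6, -32.4),
    "§103": (49.2, +9.2),
    "§102": (29.5, -10.5),
    "§112": (3.5, -36.5),
}


def implied_tc_average(rate, delta):
    """Return the baseline implied by a rate and its difference from that baseline."""
    return round(rate - delta, 1)


implied = {
    statute: implied_tc_average(rate, delta)
    for statute, (rate, delta) in statute_stats.items()
}

for statute, baseline in implied.items():
    print(f"{statute}: implied TC average {baseline}%")
```

Every row implies the same baseline, which suggests the "Tech Center average estimate" is a single flat figure rather than a per-statute average. That reading is an inference from the arithmetic, not something the page states.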

Office Action

§DP
DETAILED ACTION
1. This action is responsive to Application No. 18/765,108. All claims have been examined and are currently pending.
Notice of Pre-AIA or AIA Status
2. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
3. The information disclosure statement (IDS) submitted is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Double Patenting
4. The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement.
See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to final Office action, see 37 CFR 1.113(c). A request for reconsideration while not provided for in 37 CFR 1.113(c) may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.
5. Claims 1-20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-4, 6-10, 13-18, 20 of U.S. Patent No. 11,568,878.
Although the claims at issue are not identical, they are not patentably distinct from each other because they recite similar limitations, and thus anticipate the limitations of the application claims; both teaching processing the plurality of device speaker embeddings to generate a multi-user speaker embedding, to generate and recognize separated audio for causing a device to perform an action.
6. Claims 1-20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-4, 6-16, 18-20 of U.S. Patent No. 12,033,641. Although the claims at issue are not identical, they are not patentably distinct from each other because they recite similar limitations, and thus anticipate the limitations of the application claims; both teaching processing the plurality of device speaker embeddings to generate a multi-user speaker embedding, to generate and recognize separated audio for causing a device to perform an action.
Conclusion
7. The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: See PTO-892, where the application would be in condition for allowance once the double patenting rejection is overcome.
The Application at hand teaches:
[0081] FIG. 5 illustrates an example of generating an attended speaker embedding for multiple users in accordance with various implementations disclosed herein. Speaker-aware technologies, such as voice filter technology, generally assume the neural network takes a single embedding (also referred to herein as a d-vector) as a side input, thus can only be personalized for a single user at runtime. However, many smart devices, such as home speakers, can be a shared device among multiple users. For example, smart home speakers are usually shared between multiple family members. In such cases, conventional voice filter model techniques may be impractical to use.
[0084] In some implementations, a system, such as a shared smart home speaker, may have an unknown number of users.
In some of those implementations, the system may have multiple speaker embeddings, each corresponding to a distinct user of the shared device. For example, assume we have three users of a shared device and three corresponding speaker embeddings: d_1, d_2, and d_3.
[0085] In some implementations, the speaker embeddings can be concatenated from multiple enrolled users. The concatenated speaker embeddings can be processed using the voice filter model to generate the predicted mask. In some versions of those implementations, the system needs to know the maximal number of enrolled users in advance. For example, the system can have three speaker embeddings d_1, d_2, and d_3 corresponding to three enrolled users.
The closest prior art is as follows:
Zhang et al. (2021/0390970) [0028] The speaker embedding network 206 may process the enrollment audio of a target speaker, such as the audio data 212, and may generate an utterance-level speaker embedding. Speaker embedding may be a bias signal that may inform the multi-modal separation network 208 to perform and enhance target speaker separation. A pre-trained speaker model may be introduced and utilized for producing speaker embeddings to characterize the target speaker. The speaker model may be pre-trained on the task of speaker verification, using one or more convolution layers followed by a fully connected layer. The input to the speaker model may be an enrollment utterance of the target speaker, such as saying one's name when entering a teleconference. The speaker embedding network 206 may output the utterance-level speaker embedding.
Zhao et al. (2021/0082438) [0032] Speaker identifier 155 computes a similarity between speaker embedding z associated with the unknown speaker and each of enrollment embeddings 160, which comprise utterance-level speaker embeddings associated with each of several enrollment speakers.
Each of enrollment embeddings 160 was previously generated based on speech frames of a respective enrollment speaker using the same components 105, 115, 125, 140 and 150 which were used to generate speaker embedding z. Speaker identifier 155 may identify the unknown speaker as the enrollment speaker whose associated embedding is most similar to speaker embedding z associated with the unknown speaker. If none of enrollment embeddings 160 is sufficiently similar to speaker embedding z, speaker identifier 155 may output an indication that the unknown speaker cannot be identified from (i.e., is not one of) the enrollment speakers. Speaker identifier 155 may be implemented by a computing device executing an algorithm for computing similarities, by a trained neural network, etc.
Chen et al. (2019/0318757) Abstract: This document relates to separation of audio signals into speaker-specific signals. One example obtains features reflecting mixed speech signals captured by multiple microphones. The features can be input to a neural network and masks can be obtained from the neural network. The masks can be applied to one or more of the mixed speech signals captured by one or more of the microphones to obtain two or more separate speaker-specific speech signals, which can then be output.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHAUN A ROBERTS whose telephone number is (571)270-7541. The examiner can normally be reached Monday-Friday 9-5 EST. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders, can be reached at 571-272-7516.
The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SHAUN ROBERTS/
Primary Examiner, Art Unit 2655
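The speaker-identification step quoted from Zhao et al. (compare an unknown embedding z against each enrollment embedding, pick the most similar, and report "not identified" if none is similar enough) can be illustrated with a short sketch. The reference says only "similarity", so the use of cosine similarity, the 0.7 threshold, the function names, and the toy three-dimensional embeddings are all assumptions made for illustration, not details from the cited art.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify_speaker(z, enrollment_embeddings, threshold=0.7):
    """Return the best-matching enrolled speaker, or None if no match is close enough.

    The threshold value is a hypothetical choice, not taken from the reference.
    """
    best_name, best_score = None, -1.0
    for name, embedding in enrollment_embeddings.items():
        score = cosine_similarity(z, embedding)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None  # unknown speaker is not one of the enrolled speakers
    return best_name


# Toy enrollment embeddings for three users of a shared device.
enrolled = {
    "user_1": [1.0, 0.0, 0.0],
    "user_2": [0.0, 1.0, 0.0],
    "user_3": [0.0, 0.0, 1.0],
}

known = identify_speaker([0.9, 0.1, 0.0], enrolled)
unknown = identify_speaker([0.6, 0.6, 0.6], enrolled)
print(known, unknown)
```

In this toy data the first query lies close to `user_1` and is identified, while the second is equally distant from all three enrolled embeddings, falls below the threshold, and is reported as unidentified.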

Prosecution Timeline

Jul 05, 2024
Application Filed
Feb 19, 2026
Non-Final Rejection — §DP (current)
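The two timeline dates can be compared against the examiner's typical 2y 10m prosecution length. This is a rough sketch that assumes the 2y 10m figure is measured from the filing date, which the page does not state; `months_between` is an illustrative helper.

```python
from datetime import date

filed = date(2024, 7, 5)
non_final_rejection = date(2026, 2, 19)


def months_between(start, end):
    """Whole calendar months elapsed from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not yet complete
    return months


elapsed = months_between(filed, non_final_rejection)

# Assumption: the examiner's "2y 10m" average is counted from the filing date.
typical_total = 2 * 12 + 10
remaining = typical_total - elapsed

print(f"Elapsed since filing: {elapsed // 12}y {elapsed % 12}m")
print(f"Remaining if typical: {remaining // 12}y {remaining % 12}m")
```

On that assumption, the first action arrived 1y 7m after filing, leaving roughly 1y 3m if this application follows the examiner's average.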

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12586599
AUDIO SIGNAL PROCESSING METHOD AND APPARATUS, ELECTRONIC DEVICE, AND STORAGE MEDIUM WITH MACHINE LEARNING AND FOR MICROPHONE MUTE STATE FEATURES IN A MULTI PERSON VOICE CALL
2y 5m to grant; granted Mar 24, 2026
Patent 12586568
SYNTHETICALLY GENERATING INNER SPEECH TRAINING DATA
2y 5m to grant; granted Mar 24, 2026
Patent 12573376
Dynamic Language and Command Recognition
2y 5m to grant; granted Mar 10, 2026
Patent 12562157
GENERATING TOPIC-SPECIFIC LANGUAGE MODELS
2y 5m to grant; granted Feb 24, 2026
Patent 12555562
VOICE SYNTHESIS FROM DIFFUSION GENERATED SPECTROGRAMS FOR ACCESSIBILITY
2y 5m to grant; granted Feb 17, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 76%
With Interview: 86% (+10.3%)
Median Time to Grant: 2y 10m
PTA Risk: Low
Based on 647 resolved cases by this examiner. Grant probability derived from career allow rate.
