DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1, 2, 4, 6, 8-13, 15, and 17-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Krishnan et al. (US 2022/0093101 A1, “Krishnan”).
As to claims 1, 12, 20, Krishnan discloses a method implemented by one or more processors, the method comprising:
receiving a stream of audio data, the stream of audio data being generated by one or more microphones of a client device of a user, and the stream of audio data capturing at least a portion of a spoken utterance provided by the user that is directed to an automated assistant implemented at least in part at the client device (client device 110 includes a microphone array to capture audio including speech directed by a user to the digital assistant, para. 0060, and on-device language processing components, para. 0098);
determining, based on processing the stream of audio data, audio-based characteristics associated with the portion of the spoken utterance (conversation analyzer 1120 considers audio characteristics such as pause timing/length, para. 0320);
determining, based on the audio-based characteristics associated with the portion of the spoken utterance, whether the user has paused in providing the spoken utterance (system detects a silence that is classified as a pause, para. 0320, 0349); and
in response to determining that the user has paused in providing the spoken utterance:
determining natural conversation output to be provided for audible presentation to the user, the natural conversation output to be provided for audible presentation to the user to indicate the automated assistant is waiting for the user to continue providing of the spoken utterance (in response to a pause in speech, the system outputs a backchannel response to identify to the user that the system is continuing to pay attention and is waiting, para. 0040, 0349; system may act more human-like as a natural participant in a conversation and may answer questions or interject information that may be helpful to the conversation, para. 0311, 0342, 0495-0499); and
causing the natural conversation output to be provided for audible presentation to the user via one or more speakers of the client device (synthesized speech output, para. 0040, 0043, 0080, 0495-0499).
As to claims 2, 13, Krishnan discloses: wherein causing the natural conversation output to be provided for audible presentation to the user via the one or more speakers of the client device is further in response to determining that the user has paused in providing the spoken utterance for a threshold duration of time (pause time data may be used to determine the timing of an interjection, para. 0320).
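By way of illustration only, the pause handling mapped above to claims 1, 2, 12, 13, and 20 might be sketched as follows. This is a hypothetical sketch, not an implementation from Krishnan or from the claims; every name in it (handle_audio_stream, is_pause, the 0.7-second threshold, and the asr, tts, and speaker interfaces) is invented for illustration.

    import time

    PAUSE_THRESHOLD_S = 0.7  # hypothetical threshold duration of a pause

    def handle_audio_stream(frames, asr, pause_classifier, tts, speaker):
        """Classify silences in a captured utterance and, once a silence has
        lasted a threshold duration and is classified as a pause, play a short
        backchannel so the user knows the assistant is waiting."""
        silence_start = None
        for frame in frames:                   # audio from the device microphone(s)
            asr.feed(frame)                    # ASR keeps consuming the stream
            if frame.is_silence:
                silence_start = silence_start or time.monotonic()
                elapsed = time.monotonic() - silence_start
                # audio-based characteristics (pause timing/length) drive the decision
                if elapsed >= PAUSE_THRESHOLD_S and pause_classifier.is_pause(frame, elapsed):
                    speaker.play(tts.synthesize("mm-hmm"))  # natural conversation output
                    silence_start = None       # do not repeat the backchannel
            else:
                silence_start = None           # user resumed speaking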
As to claims 4, 15, Krishnan discloses:
determining whether the user has completed providing of the spoken utterance (determining an incomplete user utterance when detected silence is classified as a pause, para. 0349),
wherein determining natural conversation output to be provided for audible presentation to the user is further in response to determining that the user has not completed providing of the spoken utterance (a backchannel response is determined and output in order to encourage a completed utterance by the user, para. 0349).
As to claim 6, Krishnan discloses: in response to determining that the user has completed providing the spoken utterance: causing the automated assistant to initiate fulfillment of the spoken utterance (computing device performs task based on user’s spoken commands, para. 0002).
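Purely as a sketch of the mapping for claims 4, 15, and 6, and again with hypothetical names, the completeness determination branches as follows:

    def on_silence_classified(label, assistant):
        """A silence classified as a pause means the utterance is incomplete
        and draws a backchannel to encourage completion; an endpoint means
        the utterance is complete, so fulfillment is initiated."""
        if label == "pause":                  # user has not completed the utterance
            assistant.play_backchannel()      # e.g., "uh-huh"
        else:                                 # label == "endpoint"
            assistant.initiate_fulfillment()  # perform the requested task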
As to claims 8, 17, Krishnan discloses: keeping one or more automated assistant components that utilize the ASR model active while causing the natural conversation output to be provided for audible presentation to the user via one or more speakers of the client device (system may act more human-like as a natural participant in a conversation and may answer questions or interject information that may be helpful to the conversation, para. 0311, 0342, 0495-0499; synthesized speech output via loudspeakers of device 110, para. 0040, 0043, 0080, 0094-0095, 0495-0499).
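The concurrency mapped to claims 8 and 17 could, as a hypothetical sketch, be realized by playing the output on a separate thread so that the components using the ASR model are never deactivated (speaker.play and asr_loop.run are invented interfaces):

    import threading

    def play_without_blocking_asr(speaker, clip, asr_loop):
        """Play the backchannel on its own daemon thread; the ASR loop keeps
        running and continues consuming microphone audio during playback."""
        threading.Thread(target=speaker.play, args=(clip,), daemon=True).start()
        asr_loop.run()  # ASR components remain active throughout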
As to claims 9, 18, Krishnan discloses: wherein causing the natural conversation output to be provided for audible presentation to the user via the one or more speakers of the client device comprises:
processing, using a text-to-speech (TTS) model, the natural conversation output to generate synthesized speech audio data that includes the natural conversation output (TTS component 280 creates audio data corresponding to the system-generated natural language response, para. 0074, 0077); and
causing the synthesized speech audio data to be provided for audible presentation to the user via the one or more speakers of the client device (synthesized speech output via loudspeakers of device 110, para. 0094-0095).
As to claims 10, 19, Krishnan discloses: wherein causing the natural conversation output to be provided for audible presentation to the user via the one or more speakers of the client device comprises:
obtaining, from on-device memory of the client device, synthesized speech audio data that includes the natural conversation output (on-device language processing components include ASR, NLU, TTS, NLG, para. 0098); and
causing the synthesized speech audio data to be provided for audible presentation to the user via the one or more speakers of the client device (synthesized speech output via loudspeakers of device 110, para. 0094-0095).
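Claims 9/18 and 10/19 recite alternative sources for the audible output. As a hypothetical sketch only (get_output_audio and its parameters are invented), the two paths could be:

    def get_output_audio(text, tts_model=None, on_device_cache=None):
        """Return synthesized speech audio data for the natural conversation
        output, either retrieved from on-device memory (claims 10, 19) or
        generated by running a TTS model over the text (claims 9, 18)."""
        if on_device_cache is not None and text in on_device_cache:
            return on_device_cache[text]   # pre-synthesized audio from memory
        return tts_model.synthesize(text)  # TTS-generated audio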
As to claim 11, Krishnan discloses: wherein the one or more processors are implemented locally at the client device of the user (device 110 may conduct its own speech processing using on-device language processing components, para. 0098).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 3 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Krishnan in view of Bratt et al. (US 2022/0115001 A1, “Bratt”).
Krishnan differs from claims 3, 14 in that although it teaches the use of machine learning models, it does not specifically disclose: wherein determining whether the user has paused in providing the spoken utterance comprises:
processing, using an audio-based classification machine learning (ML) model, the audio-based characteristics associated with the portion of the spoken utterance to generate output; and
determining, based on the output generated using the audio-based classification ML model, whether the user has paused in providing the spoken utterance.
Bratt teaches a conversational assistant that uses pause analytics, implemented with a trained machine learning model (para. 0186, 0188), to analyze and interpret detected pauses in user speech (para. 0040, 0076, 0133, 0177). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Krishnan with the pause-analytics model of Bratt in order to improve the flow of dialogue between the user and the assistant.
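As an illustration of the kind of audio-based classification ML model recited in claims 3 and 14 and taught at a high level by Bratt, the following sketch trains a toy classifier; the features, training data, and decision threshold are all invented for illustration:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # each row: [silence_ms, pitch_slope, energy_drop], invented audio-based features
    X = np.array([[200, -0.1, 0.3], [1500, -0.9, 0.8],
                  [350, -0.2, 0.4], [1200, -0.7, 0.9]])
    y = np.array([1, 0, 1, 0])  # 1 = mid-utterance pause, 0 = utterance complete

    model = LogisticRegression().fit(X, y)
    output = model.predict_proba([[400, -0.3, 0.5]])[0, 1]  # output of the ML model
    user_has_paused = output > 0.5  # determination based on that output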
Claims 5 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Krishnan in view of Kodish-Wachs (US 11,062,704 B1).
Krishnan differs from claims 5, 16 in that it does not disclose: processing, using a natural language understanding (NLU) model, the stream of ASR output, to generate a stream of NLU output, and wherein determining whether the user has completed providing of the spoken utterance is based on the stream of NLU output.
Kodish-Wachs teaches a natural language processor, encompassing NLP, ASR, and NLU (col. 5, lines 42-45), that determines whether an utterance is complete or incomplete, for example by analyzing the grammatical structure of the utterance or by detecting that a portion of the utterance is undecipherable and therefore incomplete (col. 9, line 50 – col. 10, line 7). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Krishnan with the above teaching of Kodish-Wachs in order to further improve computing functions, as taught by Kodish-Wachs (col. 4, lines 27-67).
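Purely as a sketch of the grammatical-structure analysis Kodish-Wachs describes for claims 5 and 16, with a hypothetical token-level heuristic standing in for a full NLU model:

    def utterance_is_complete(tokens):
        """Treat an utterance as incomplete if it ends in a dangling function
        word or contains an undecipherable token, per the cited passage."""
        dangling = {"the", "a", "an", "to", "and", "or", "of", "for", "with"}
        if not tokens or tokens[-1].lower() in dangling:
            return False               # e.g., "set a timer for" is incomplete
        if "<unk>" in tokens:          # undecipherable portion detected
            return False
        return True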
Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Krishnan in view of Hansen et al. (US 11,038,934 B1, “Hansen”).
Krishnan differs from claim 7 in that it does not disclose: wherein causing the automated assistant to initiate the fulfillment of the spoken utterance comprises: causing, based on the stream of NLU output, a stream of fulfillment data to be generated, wherein the stream of fulfillment data includes an indication of the fulfillment of the spoken utterance.
Hansen teaches a digital assistant which outputs an audio and/or visual response based on the performance of one or more tasks in response to a user request in the form of a natural language command (col. 15, line 49 – col. 16, line 8; col. 87, lines 33-49; col. 88, lines 6-16). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Krishnan with the above teaching of Hansen in order to indicate to the user performance of a requested task.
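Finally, the stream of fulfillment data recited in claim 7 could, as a hypothetical sketch with invented field names, be a generator over NLU results:

    def fulfillment_stream(nlu_outputs):
        """For each NLU result (intent plus slots) in the stream, yield
        fulfillment data carrying an indication that fulfillment has begun."""
        for nlu in nlu_outputs:
            yield {
                "intent": nlu["intent"],            # e.g., "set_timer"
                "slots": nlu.get("slots", {}),
                "status": "fulfillment_initiated",  # indication of fulfillment
            }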
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Garg et al. (US 2022/0020376 A1) teach determining a pause using machine learning (para. 0050).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Stella L Woo whose telephone number is (571)272-7512. The examiner can normally be reached Monday - Friday, 8 a.m. to 5 p.m.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ahmad Matar can be reached at 571-272-7488. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
STELLA L. WOO
Primary Examiner
Art Unit 2693
/Stella L. Woo/ Primary Examiner, Art Unit 2693