Last updated: May 29, 2026

Application No. 18/390,495

VOICE-INTERACTION DEVICE FOR ENGAGING IN ACTIVITIES

Non-Final OA §103 §112

Filed

Dec 20, 2023

Examiner

SURVILLO, OLEG

Art Unit

2457

Tech Center

2400 — Computer Networks

Assignee

Capital One Financial Corporation

OA Round

1 (Non-Final)

Interview Optional

+28.0% interview lift. An interview has already been conducted in this application's prosecution history. This examiner has a 72% grant rate with a +28.0% interview lift. Since an interview has already been tried, a written response with narrowed claims, based on precedent claim evolution patterns, is recommended.

Based on 561 resolved cases, 2023–2026

Examiner Intelligence

SURVILLO, OLEG

Grants 72% — above average

Career Allowance Rate

405 granted / 561 resolved

+14.2% vs TC avg

Strong +28% interview lift

Interview Lift: +28.0% (with interview vs. without, among resolved cases)

Typical timeline

4y 4m

Avg Prosecution

22 currently pending

Career history

587

Total Applications

across all art units

Statute-Specific Performance

§101

1.5%

-38.5% vs TC avg

§103

84.3%

+44.3% vs TC avg

§102

7.8%

-32.2% vs TC avg

§112

4.3%

-35.7% vs TC avg

Based on career data from 561 resolved cases

Office Action

§103 §112

DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claim 14 is rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.
	As to claim 14, monitoring (or continuing to monitor) for receipt of the biometric input until the given activity is completed is ambiguous because claim 1 requires the biometric input to be received and validated before the given activity is even presented to the user. It is unclear why the system would monitor on a continuing basis for the biometric input that was required to be received while the device is operating in the first mode. No prior art can be reasonably applied until this issue is clarified by either an amendment or explanation.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 4, 8-9, 15-16, and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Weis et al. (US 2023/0119154 A1) in view of Gailloux et al. (US Patent 8,055,247 B1).
As to claim 1, Weis teaches a voice-interaction device [self-service terminal 100] comprising: 
a biometric sensor [sensing device 206]; 
an audio output interface [speakers] (par. [0056]); 
an audio input interface [microphone] (par. [0056]); 
at least one processor (par. [0068]); 
at least one non-transitory computer-readable medium (par. [0068]); and 
program instructions stored on the at least one non-transitory computer-readable medium that are executable by the at least one processor (par. [0014]) such that the voice-interaction device is configured to: 
operate in a first mode in which the voice-interaction device monitors for biometric input [operation of the self-service terminal is triggered by user approaching the terminal or touching the user interface] (par. [0105]); 
while operating in the first mode, receive a biometric input via the biometric sensor [step 401] (par. [0104]); 
validate the biometric input and thereby determine that the biometric input is valid [step 403 recognizing the user based on the biometric input] (par. [0106]-[0108], [0113]); 
based on determining that the biometric input is valid, transition from operating in the first mode to operating in a second mode in which the voice-interaction device is configured to facilitate selection by a given user of an available activity via voice-based interaction with the voice-interaction device [selecting a user profile that triggers reconfiguration of the user interface to activate non-visible auditory user interface] (par. [0116], [0135], Fig. 7); and 
while operating in the second mode: 
produce, via the audio output interface, a first spoken output that indicates at least one activity that is available for selection by the given user via voice-based interaction [presenting options 1-3 via sound over the speaker] (par. [0136], Fig. 7).
While Weis teaches having a microphone (par. [0056]), Weis fails to teach that the self-service terminal is configured to receive, via the audio input interface, a spoken input from the given user; based on an analysis of the spoken input, determine that the spoken input indicates a selection of a given activity that is available for selection by the given user; and based on determining that the spoken input indicates selection of the given activity, cause the given activity to be initiated. 
Gailloux is directed to providing requested content to users without requiring the use of a display or a keyboard (abstract). In particular, Gailloux teaches a self-service terminal [mobile device 104] is configured to receive, via the audio input interface [microphone], a spoken input from the given user [voice to select menu options] (col. 2 lines 49-64); based on an analysis of the spoken input, determine that the spoken input indicates a selection of a given activity that is available for selection by the given user [analyzing user request to determine if the requested content (menu selection) is available, where the menu selection could be a request to playback “my calendar for today”] (col. 5 lines 30-35, 49-57); and based on determining that the spoken input indicates selection of the given activity, cause the given activity to be initiated [rendering the requested content (menu selection)] (col. 5 lines 42-45). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method and system of Weis by having the self-service terminal configured to receive, via the audio input interface, a spoken input from the given user; based on an analysis of the spoken input, determine that the spoken input indicates a selection of a given activity that is available for selection by the given user; and based on determining that the spoken input indicates selection of the given activity, cause the given activity to be initiated, in order to allow the user of Weis to select one of the options 1-3 without using a visual display or a keyboard (col. 1 lines 39-49 in Gailloux).
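The two-mode flow recited in claim 1, as mapped above, can be pictured as a small state machine. The following is only an illustrative sketch under stated assumptions: the class, field, and method names are hypothetical, the set-membership check stands in for biometric validation, and nothing here is taken from Weis, Gailloux, or the application itself.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Mode(Enum):
    MONITOR_BIOMETRIC = auto()  # the "first mode" of claim 1
    VOICE_SELECTION = auto()    # the "second mode" of claim 1


@dataclass
class VoiceInteractionDevice:
    enrolled: set      # hypothetical store of inputs treated as valid
    activities: dict   # spoken phrase -> callable that initiates the activity
    mode: Mode = Mode.MONITOR_BIOMETRIC
    spoken_outputs: list = field(default_factory=list)

    def receive_biometric(self, biometric_input):
        """Validate the input; on success, switch modes and announce activities."""
        if self.mode is not Mode.MONITOR_BIOMETRIC:
            return False
        if biometric_input not in self.enrolled:
            return False  # invalid input: stay in the first mode
        self.mode = Mode.VOICE_SELECTION
        # First spoken output: list the activities available for selection.
        self.spoken_outputs.append(
            "Available activities: " + ", ".join(sorted(self.activities))
        )
        return True

    def receive_spoken_input(self, text):
        """Analyze the spoken input and initiate the selected activity, if any."""
        if self.mode is not Mode.VOICE_SELECTION:
            return None  # voice selection is not available before validation
        handler = self.activities.get(text.strip().lower())
        return handler() if handler is not None else None
```

In this sketch a spoken selection is ignored until a valid biometric input has moved the device into the second mode, which mirrors the ordering of the claim limitations.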

As to claim 2, Weis teaches that the biometric sensor comprises a fingerprint sensor and wherein the biometric input comprises a fingerprint input (par. [0059]).

As to claim 4, Weis in view of Gailloux teaches that the program instructions that are executable by the at least one processor such that the voice-interaction device is configured to cause the given activity to be initiated comprise program instructions that are executable by the at least one processor such that the voice-interaction device is configured to: transmit, to a computing platform configured to facilitate activities selected by the given user, a request on behalf of the given user to facilitate the given activity [query remote data service for requested content] (step 510 in Fig. 5 of Gailloux, col. 5 lines 46-63); receive, from the computing platform, a response to the request; and produce, via the audio output interface, a second spoken output indicating the response to the request (steps 606, 608 in Fig. 6 of Gailloux, col. 6 lines 10-18).

As to claim 8, Weis teaches that the at least one activity that is available for selection by the given user via voice-based interaction includes at least one of (i) checking an account balance for a given financial account, (ii) scheduling a payment related to a given financial account, or (iii) transferring funds from a first financial account to a second financial account (par. [0030]).

As to claim 9, Weis in view of Gailloux teaches that the first spoken output that indicates the at least one activity that is available for selection by the given user via voice-based interaction further indicates, for each respective activity, one or more corresponding responses that may be spoken by the given user to indicate selection of the respective activity [using audio prompts to state menu options] (col. 2 lines 49-64 in Gailloux).

As to claim 15, Weis in view of Gailloux teaches at least one non-transitory computer-readable medium, wherein the at least one non-transitory computer-readable medium is provisioned with program instructions (par. [0014] in Weis) that, when executed by at least one processor, cause a voice-interaction device to perform the method steps, as discussed above with respect to claim 1.

As to claims 16 and 19, Weis teaches all the elements, as discussed per claim 2 above.

As to claim 18, Weis teaches a method carried out by a voice-interaction device [self-service terminal 100] comprising: 
operating in a first mode in which the voice-interaction device monitors for biometric input [operation of the self-service terminal is triggered by user approaching the terminal or touching the user interface] (par. [0105]); 
while operating in the first mode, receiving a biometric input via the biometric sensor [step 401] (par. [0104]); 
validating the biometric input and thereby determining that the biometric input is valid [step 403 recognizing the user based on the biometric input] (par. [0106]-[0108], [0113]); 
based on determining that the biometric input is valid, transitioning from operating in the first mode to operating in a second mode in which the voice-interaction device is configured to facilitate selection by a given user of an available activity via voice-based interaction with the voice-interaction device [selecting a user profile that triggers reconfiguration of the user interface to activate non-visible auditory user interface] (par. [0116], [0135], Fig. 7); and 
while operating in the second mode: 
producing, via the audio output interface, a first spoken output that indicates at least one activity that is available for selection by the given user via voice-based interaction [presenting options 1-3 via sound over the speaker] (par. [0136], Fig. 7).
While Weis teaches having a microphone (par. [0056]), Weis fails to teach that the self-service terminal is configured to receive, via the audio input interface, a spoken input from the given user; based on an analysis of the spoken input, determine that the spoken input indicates a selection of a given activity that is available for selection by the given user; and based on determining that the spoken input indicates selection of the given activity, cause the given activity to be initiated. 
Gailloux is directed to providing requested content to users without requiring the use of a display or a keyboard (abstract). In particular, Gailloux teaches a self-service terminal [mobile device 104] is configured to receive, via the audio input interface [microphone], a spoken input from the given user [voice to select menu options] (col. 2 lines 49-64); based on an analysis of the spoken input, determine that the spoken input indicates a selection of a given activity that is available for selection by the given user [analyzing user request to determine if the requested content (menu selection) is available, where the menu selection could be a request to playback “my calendar for today”] (col. 5 lines 30-35, 49-57); and based on determining that the spoken input indicates selection of the given activity, cause the given activity to be initiated [rendering the requested content (menu selection)] (col. 5 lines 42-45). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method and system of Weis by having the self-service terminal configured to receive, via the audio input interface, a spoken input from the given user; based on an analysis of the spoken input, determine that the spoken input indicates a selection of a given activity that is available for selection by the given user; and based on determining that the spoken input indicates selection of the given activity, cause the given activity to be initiated, in order to allow the user of Weis to select one of the options 1-3 without using a visual display or a keyboard (col. 1 lines 39-49 in Gailloux).

Claims 3, 17, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Weis et al. in view of Gailloux et al. and in further view of Agrawal et al. (US 2025/0053626 A1).
As to claims 3, 17, and 20, Weis in view of Gailloux teaches all the elements except that the program instructions that are executable by the at least one processor such that the voice-interaction device is configured to validate the biometric input and thereby determine that the biometric input is valid comprise program instructions that are executable by the at least one processor such that the voice-interaction device is configured to: capture the biometric input; generate a representation of the captured biometric input; compare the generated representation of the captured biometric input against one or more stored biometric representations to determine whether or not the generated representation of the captured biometric input matches any stored biometric representations; and determine that the generated representation of the captured biometric input matches a stored biometric representation.
	Agrawal is directed to providing dynamic authentication and authorization on an electronic device (abstract). In particular, Agrawal teaches instructions that: capture the biometric input (par. [0037]); generate a representation of the captured biometric input [executing code to implement a function associated with the virtual button] (par. [0037]); compare the generated representation of the captured biometric input against one or more stored biometric representations to determine whether or not the generated representation of the captured biometric input matches any stored biometric representations (par. [0072]); and determine that the generated representation of the captured biometric input matches a stored biometric representation (par. [0073]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method and system of Weis in view of Gailloux by having the program instructions that are executable by the at least one processor such that the voice-interaction device is configured to validate the biometric input and thereby determine that the biometric input is valid comprise program instructions that are executable by the at least one processor such that the voice-interaction device is configured to: capture the biometric input; generate a representation of the captured biometric input; compare the generated representation of the captured biometric input against one or more stored biometric representations to determine whether or not the generated representation of the captured biometric input matches any stored biometric representations; and determine that the generated representation of the captured biometric input matches a stored biometric representation, in order to perform a well known and understood process of biometric authentication.
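The capture, represent, compare, and match sequence recited in claims 3, 17, and 20 can be sketched in a few lines. This is a toy illustration only: the function names, the unit-vector "template," the cosine-similarity comparison, and the 0.95 threshold are all assumptions chosen for brevity, not the technique of Agrawal or of the application, and real biometric matchers are considerably more involved.

```python
import math


def make_representation(raw_samples):
    """Generate a representation: scale captured samples to a unit-length vector."""
    norm = math.sqrt(sum(x * x for x in raw_samples))
    if norm == 0:
        raise ValueError("empty or all-zero capture")
    return [x / norm for x in raw_samples]


def find_match(candidate, stored_templates, threshold=0.95):
    """Compare the candidate against each stored representation.

    Returns the id of the most similar stored template whose cosine
    similarity is at least `threshold`, or None when nothing matches.
    Assumes every template has the same length as the candidate.
    """
    best_id, best_score = None, threshold
    for template_id, template in stored_templates.items():
        # Both vectors are unit length, so the dot product is the cosine.
        score = sum(a * b for a, b in zip(candidate, template))
        if score >= best_score:
            best_id, best_score = template_id, score
    return best_id
```

A return value of None corresponds to the "does not match any stored representation" branch; any other value corresponds to the final "determine that ... matches" step.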

Claims 5-7 are rejected under 35 U.S.C. 103 as being unpatentable over Weis et al. in view of Gailloux et al. and in further view of Lee et al. (US Patent 8,074,083 B1).
As to claim 5, Weis in view of Gailloux teaches that the request on behalf of the given user to facilitate the given activity includes an indication of the given activity [it is inherent that the query in Gailloux must include an indication (identification) of the requested content] (step 510 in Fig. 5 of Gailloux). 
Weis in view of Gailloux fails to teach an indication that the voice-interaction device has successfully authenticated the given user.
Lee is directed to controlling download and playback of media content (abstract). In particular, Lee teaches that a request to facilitate a given activity includes an indication that the device has successfully authenticated the given user [content request includes an authentication token] (col. 8 lines 39-44).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method and system of Weis in view of Gailloux by having an indication that the voice-interaction device has successfully authenticated the given user in order to confirm with the remote data service that the user is authorized to consume the requested content.

As to claim 6, Weis in view of Gailloux and Lee teaches that the indication that the voice-interaction device has successfully authenticated the user comprises a data communication confirming that the given user is authorized to engage in the given activity [content request includes an authentication token] (col. 8 lines 39-44 in Lee).

As to claim 7, Weis in view of Gailloux and Lee teaches that the data communication is encrypted using a randomly generated, unique code that indicates a particular decrypting algorithm that is to be used by the computing platform to decrypt the data communication (col. 8 line 57 to col. 9 line 7 in Lee).

Claims 10-11 are rejected under 35 U.S.C. 103 as being unpatentable over Weis et al. in view of Gailloux et al. and in further view of Wong (US 2019/0200223 A1).
As to claim 10, Weis teaches that before determining that the biometric input is valid, determine that the biometric input is not valid [user recognition is negative] (par. [0113]).
Weis in view of Gailloux fails to teach instructions configured to receive a second spoken input indicating a request to store the biometric input; and based on the request, authenticate the given user for storing the biometric input.
	Wong is directed to a biometric authentication system (abstract). In particular, Wong teaches instructions configured to receive a second input indicating a request to store the biometric input [user sends the fingerprinting command]; and based on the request, authenticate the given user for storing the biometric input [storing new fingerprint] (par. [0053], [0076], steps 6-9 in Fig. 2).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method and system of Weis in view of Gailloux by having the instructions configured to receive a second input indicating a request to store the biometric input; and based on the request, authenticate the given user for storing the biometric input, where the second input would be spoken as per system configuration of Weis in view of Gailloux, in order to store a new fingerprint such as when the previously stored information has been deleted or there was a software update requiring a new registration (par. [0003] in Wong).

As to claim 11, Weis in view of Gailloux and Wong teaches instructions configured to transmit, to a computing platform configured to authenticate the given user, a request to issue an original user verification code to the given user (step 2 in Fig. 2 of Wong); receive a third spoken input comprising a spoken user verification code (step 3 in Fig. 2 of Wong, where “spoken” is per configuration of Weis in view of Gailloux); based on a comparison of (i) the spoken user verification code and (ii) the original user verification code, determine that the spoken user verification code matches the original user verification code and thereby determine that the spoken user verification code is valid (step 4 in Fig. 2 of Wong); and based on determining that the spoken user verification code is valid, store a captured representation of the biometric input (steps 6-9 in Fig. 2 of Wong).

Claims 12-13 are rejected under 35 U.S.C. 103 as being unpatentable over Weis et al. in view of Gailloux et al. in view of Wong and in further view of Woodard et al. (US Patent 11,683,302 B2).
As to claim 12, Weis in view of Gailloux and Wong teaches all the elements except comprising program instructions stored on the at least one non-transitory computer-readable medium that are executable by the at least one processor such that the voice-interaction device is configured to: use one or more speech processing techniques to generate a text representation of the spoken user verification code; transmit, to the computing platform, a request to verify the spoken user verification code, wherein the request includes the text representation of the spoken user verification code; and receive, from the computing platform, a response indicating that the text representation of the spoken user verification code matches the original user verification code. 
	Woodard is directed to providing a verification process during an exchange between the parties (abstract). In particular, Woodard teaches program instructions configured to: use one or more speech processing techniques to generate a text representation of the spoken user verification code (Fig. 3C); transmit, to the computing platform, a request to verify the spoken user verification code, wherein the request includes the text representation of the spoken user verification code (col. 10 line 63 to col. 11 line 18); and receive, from the computing platform, a response indicating that the text representation of the spoken user verification code matches the original user verification code (Fig. 3D, col. 11 lines 46-59). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method and system of Weis in view of Gailloux and Wong by having program instructions stored on the at least one non-transitory computer-readable medium that are executable by the at least one processor such that the voice-interaction device is configured to: use one or more speech processing techniques to generate a text representation of the spoken user verification code; transmit, to the computing platform, a request to verify the spoken user verification code, wherein the request includes the text representation of the spoken user verification code; and receive, from the computing platform, a response indicating that the text representation of the spoken user verification code matches the original user verification code, in order to allow the system of Weis in view of Gailloux to convert verbal input by the user into text for transmission over the network as was well known in the art before the effective filing date of the claimed invention.

As to claim 13, Weis in view of Gailloux and Wong teaches all the elements except program instructions stored on the at least one non-transitory computer-readable medium that are executable by the at least one processor such that the voice-interaction device is configured to: use one or more speech processing techniques to generate a text representation of the spoken user verification code; obtain, from the computing platform, a text representation of the original user verification code; and compare (i) the text representation of the spoken user verification code and (ii) the text representation of the original user verification code to determine if the spoken user verification code matches the original user verification code.
Woodard is directed to providing a verification process during an exchange between the parties (abstract). In particular, Woodard teaches program instructions configured to: use one or more speech processing techniques to generate a text representation of the spoken user verification code (col. 10 line 63 to col. 11 line 18); obtain, from the computing platform, a text representation of the original user verification code; and compare (i) the text representation of the spoken user verification code and (ii) the text representation of the original user verification code to determine if the spoken user verification code matches the original user verification code (col. 11 lines 19-27, Fig. 3D).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method and system of Weis in view of Gailloux and Wong by having program instructions stored on the at least one non-transitory computer-readable medium that are executable by the at least one processor such that the voice-interaction device is configured to: use one or more speech processing techniques to generate a text representation of the spoken user verification code; obtain, from the computing platform, a text representation of the original user verification code; and compare (i) the text representation of the spoken user verification code and (ii) the text representation of the original user verification code to determine if the spoken user verification code matches the original user verification code, in order to allow the system of Weis in view of Gailloux to convert verbal input by the user into text for transmission over the network as was well known in the art before the effective filing date of the claimed invention.
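The on-device comparison recited in claim 13 (text representation of the spoken code against text representation of the original code) can be illustrated as follows. This is a hedged sketch, not the method of Woodard or of the application: the function names and the word-to-digit table are hypothetical, and the transcript is assumed to come from a speech-to-text step that is not shown.

```python
import hmac

# Hypothetical mapping from spoken digit words to digit characters.
_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}


def text_representation(transcript):
    """Normalize a transcript of a spoken numeric code into a digit string."""
    digits = []
    for token in transcript.lower().replace("-", " ").split():
        if token in _WORDS:
            digits.append(_WORDS[token])
        elif token.isdigit():
            digits.append(token)
        else:
            raise ValueError(f"unrecognized token: {token!r}")
    return "".join(digits)


def codes_match(spoken_transcript, original_code):
    """Compare the spoken code with the original code in constant time."""
    spoken = text_representation(spoken_transcript)
    return hmac.compare_digest(spoken.encode(), original_code.encode())
```

Under claim 12 the same normalized text would instead be sent to the computing platform, which would perform the comparison and return the result.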

Related Prior Art
Park et al. (US 2018/0218739 A1) is directed to receiving voice commands, performing voiceprint recognition, and performing a requested function upon user authentication (par. [0070]-[0072], Fig. 3). Park also teaches performing fingerprint recognition (par. [0105]-[0106]). Therefore, Park teaches subject matter similar to the claimed invention and can be used in the rejection of claims 1-20.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to OLEG SURVILLO whose telephone number is (571)272-9691. The examiner can normally be reached 9:00am - 5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ario Etienne, can be reached at 571-272-4001. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/OLEG SURVILLO/Primary Examiner, Art Unit 2457


Prosecution Timeline

Dec 20, 2023

Application Filed

Jan 02, 2026

Non-Final Rejection mailed — §103, §112

Mar 25, 2026

Examiner Interview Summary

Mar 25, 2026

Applicant Interview (Telephonic)

Apr 02, 2026

Response Filed

Precedent Cases

Applications granted by this same examiner with similar technology

17/809,996

Patent 12639430

TECHNIQUES FOR DIFFERENTIAL INSPECTION OF CONTAINER LAYERS

3y 11m to grant Granted May 26, 2026

18/191,848

Patent 12627661

ENHANCED NETCONF ACCESS CONTROL MODEL (NACM) OPERATIONS AND GRANULAR CONTROLS FOR SHARED DATA NODE MANAGEMENT

3y 1m to grant Granted May 12, 2026

18/537,095

Patent 12625993

PROVIDING A GRAPHICAL REPRESENTATION OF ANOMALOUS EVENTS

2y 5m to grant Granted May 12, 2026

18/309,580

Patent 12621323

CONTENT-OBLIVIOUS FRAUDULENT EMAIL DETECTION SYSTEM

3y 0m to grant Granted May 05, 2026

18/441,826

Patent 12615269

DETECTION RULE OUTPUT METHOD, SECURITY SYSTEM, AND DETECTION RULE OUTPUT DEVICE

2y 2m to grant Granted Apr 28, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2

Expected OA Rounds

72%

Grant Probability

99%

With Interview (+28.0%)

4y 4m (~1y 11m remaining)

Median Time to Grant

Low

PTA Risk

Based on 561 resolved cases by this examiner. Grant probability derived from career allowance rate.
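The grant probability above can be reproduced from the career counts shown earlier (405 granted of 561 resolved). The sketch below is illustrative only: the function name is hypothetical, and the additive lift with a 99% ceiling is an assumption made to show one way the "With Interview" figure could arise, not a documented formula.

```python
def allowance_rate(granted, resolved):
    """Career allowance rate as a percentage of resolved cases."""
    if resolved <= 0:
        raise ValueError("resolved must be positive")
    return 100.0 * granted / resolved


rate = allowance_rate(405, 561)            # roughly 72.2
# ASSUMPTION: lift is added in percentage points and capped at 99%.
with_interview = min(rate + 28.0, 99.0)

print(f"Grant probability: {rate:.0f}%")
print(f"With interview:    {with_interview:.0f}%")
```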