DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments filed 11/03/25 have been fully considered but they are not persuasive.
Applicant argues that the prior art does not teach that the voice command capability is initiated in response to the initiating gesture data (Amendment, pages 10-11).
The examiner disagrees, since Robaina et al. disclose “The gesture associated with the initiation/termination condition may be set by the HCP. For example, the healthcare provider may designate “tapping a finger on the left hand” as an initiation condition of recording audio data and “tapping a finger on the right hand” as a termination condition of recording the audio data… Because this hand gesture is associated with the initiation condition, the wearable device may start recording the audio data using the dictation program as the patient and the HCP converse.” (paragraphs 228, 229).
Applicant’s arguments, see pages 6-9, filed 11/03/25, with respect to claims 1-15 have been fully considered and are persuasive. The rejection of claims 1-15 under 35 U.S.C. 101 has been withdrawn.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-5, 9, 10, 14, 15, and 18-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Robaina et al. (US PAP 2018/0197624).
As per claim 1, Robaina et al. teach a computing system comprising:
a gesture input device (“gesture sensors”; paragraph 62);
an audio input device (“an audio sensor (e.g., a microphone 232)”; paragraph 51); and
a processor communicatively coupled to the gesture input device and the audio input device, the processor to (paragraph 251):
receive, by the gesture input device, initiating gesture data performed by a user which indicates an initiation of a voice command; in response to receiving the initiating gesture data, initiate a voice command capability of the computing system, and capture, by the audio input device and using the voice command capability, the voice command spoken by the user; and receive, by the gesture input device, terminating gesture data performed by the user which indicates a termination of the voice command (“One or more poses may be used to activate or to turn off voice recordings of a patient's visit. For example, the doctor may use a certain hand gesture to indicate whether to start dictating the diagnosis of the patient… The gesture associated with the initiation/termination condition may be set by the HCP. For example, the healthcare provider may designate “tapping a finger on the left hand” as an initiation condition of recording audio data and “tapping a finger on the right hand” as a termination condition of recording the audio data… Because this hand gesture is associated with the initiation condition, the wearable device may start recording the audio data using the dictation program as the patient and the HCP converse”; paragraphs 50, 51; 228, 229).
As per claim 2, Robaina et al. further disclose the terminating gesture data comprises a release of a gesture associated with the initiating gesture data (“One or more poses may be used to activate or to turn off voice recordings of a patient's visit. For example, the doctor may use a certain hand gesture to indicate whether to start dictating the diagnosis of the patient… The wearable device can also detect a pose of the HCP to determine whether an initiation condition or a termination condition is present. The wearable device can use data acquired by its environmental sensors to detect the pose of the HCP. For example, the wearable device can use IMUs to determine whether the user's head pose has changed.”; paragraphs 50, 51; 228, 229).
As per claim 3, Robaina et al. further disclose the initiating gesture data and the terminating gesture data comprises a sequence of gestures performed by the user (“One or more poses may be used to activate or to turn off voice recordings of a patient's visit. For example, the doctor may use a certain hand gesture to indicate whether to start dictating the diagnosis of the patient… The wearable device can also detect a pose of the HCP to determine whether an initiation condition or a termination condition is present. The wearable device can use data acquired by its environmental sensors to detect the pose of the HCP. For example, the wearable device can use IMUs to determine whether the user's head pose has changed.”; paragraphs 50, 51; 228, 229).
As per claim 4, Robaina et al. further disclose the initiating gesture data and the terminating gesture data comprises a velocity of a gesture performed by the user (paragraphs 49, 50, 116, 228, 229).
As per claim 5, Robaina et al. further disclose the initiating gesture data and the terminating gesture data comprises a depth of a gesture performed by the user (paragraphs 50, 68, 116, 228, 229).
As per claim 9, Robaina et al. further disclose the gesture input device comprises at least one of a camera, a depth sensor, and a motion sensor (paragraph 62).
As per claim 10, Robaina et al. further disclose the processor to maintain the initiating gesture data and the terminating gesture data in a cloud-based data repository to be ingested by a machine learning computing system (paragraphs 65, 104-106).
As per claim 14, Robaina et al. teach a non-transitory computer readable medium comprising program instructions executable by a processor to:
maintain gesture sequence data in a cloud-based data repository to be ingested by a machine learning computing system (paragraphs 65, 104-106);
detect a gesture sequence (“gesture sensors”; paragraph 62); and
query the machine learning computing system to determine that the gesture sequence data indicates an initiation of a voice command based on the detected gesture sequence and the gesture sequence data maintained in the cloud-based data repository, and in response to a determination that the gesture sequence data indicates the initiation of the voice command, initiate a voice assistant to capture a voice command (“One or more poses may be used to activate or to turn off voice recordings of a patient's visit. For example, the doctor may use a certain hand gesture to indicate whether to start dictating the diagnosis of the patient… “tapping a finger on the left hand” as an initiation condition of recording audio data and “tapping a finger on the right hand” as a termination condition of recording the audio data… Because this hand gesture is associated with the initiation condition, the wearable device may start recording the audio data using the dictation program as the patient and the HCP converse.”; paragraphs 50, 51; 228, 229).
As per claim 15, Robaina et al. further disclose the cloud-based data repository is updated with the detected gesture sequence (“The user can also edit the medical records using poses or the user input device 466, such as, e.g., head poses, gestures, voice input, totem, etc.”; paragraphs 150-153).
As per claim 18, Robaina et al. further disclose the instructions are executable by the processor to update the cloud-based data repository with the gesture sequence (“The user can also edit the medical records using poses or the user input device 466, such as, e.g., head poses, gestures, voice input, totem, etc.”; paragraphs 150-153).
As per claim 19, Robaina et al. further disclose the gesture sequence includes a first pose data indicating a first user pose followed by a second pose data indicating a second user pose (“The pose of the user may include head pose, eye pose, hand gestures, foot pose, or other body poses. One or more poses may be used to activate or to turn off voice recordings of a patient's visit.”; paragraphs 50, 224, 229).
As per claim 20, Robaina et al. further disclose the second user pose is different from the first user pose (“The pose of the user may include head pose, eye pose, hand gestures, foot pose, or other body poses. One or more poses may be used to activate or to turn off voice recordings of a patient's visit.”; paragraphs 50, 224, 229).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 6-8, 11-13, 16, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Robaina et al. (US PAP 2018/0197624) in view of Querze et al. (US PAP 2020/0142667).
As per claim 6, Robaina et al. do not specifically teach that the gesture is detected during execution of a conferencing application.
Querze et al. disclose that the configuration of the array 830 is useful for conference call applications, and that the spatialized VPA audio engine 240 is configured to mute the user 310 on the call(s) with the other phone call sources while the user is making the VPA command or looking in the look direction of the VPA response zone (paragraph 117).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to detect a gesture during execution of a conferencing application, as taught by Querze et al., in the system of Robaina et al., because doing so would help enhance user interaction with the wearable audio device (Querze et al., paragraph 48).
As per claim 7, Robaina et al. in view of Querze et al. further disclose the internet call audio is muted in response to receiving the initiating gesture data performed by the user which indicates the initiation of the voice command (“the spatialized VPA audio engine 240 is configured to mute the user 310 on the call(s) with the other phone call sources while the user is making the VPA command or looking in the look direction of the VPA response zone”; Querze et al., paragraph 117).
As per claim 8, Robaina et al. in view of Querze et al. further disclose the internet call audio is unmuted in response to receiving the terminating gesture data performed by the user which indicates a termination of the voice command (Querze et al., paragraph 117; Robaina et al. paragraphs 50, 51; 228, 229).
As per claim 11, Robaina et al. teach a method of operating an internet call computing system comprising:
detecting a user pose which indicates an activation of a voice recognition key; detecting a release of the user pose which indicates a deactivation of the voice recognition key; in response to detecting the release of the user pose, initiating a voice command application of the computing system; terminating the voice command (“One or more poses may be used to activate or to turn off voice recordings of a patient's visit. For example, the doctor may use a certain hand gesture to indicate whether to start dictating the diagnosis of the patient… “tapping a finger on the left hand” as an initiation condition of recording audio data and “tapping a finger on the right hand” as a termination condition of recording the audio data… Because this hand gesture is associated with the initiation condition, the wearable device may start recording the audio data using the dictation program as the patient and the HCP converse”; paragraphs 50, 51; 228, 229).
However, Robaina et al. do not specifically teach in response to detecting the user pose, muting the internet call to receive a voice command via the voice command application; in response to detecting the release of the user pose, unmuting the internet call.
Querze et al. disclose that the configuration of the array 830 is useful for conference call applications, and that the spatialized VPA audio engine 240 is configured to mute the user 310 on the call(s) with the other phone call sources while the user is making the VPA command or looking in the look direction of the VPA response zone (paragraph 117).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to mute the internet call to receive a voice command, as taught by Querze et al., in the system of Robaina et al., because doing so would help enhance user interaction with the wearable audio device (Querze et al., paragraph 48).
As per claim 12, Robaina et al. in view of Querze et al. further disclose the user pose is detected based on a determination of a distance of an input device from a predetermined location (Robaina et al., paragraphs 67, 68, 339; Querze et al., paragraphs 89-92, 104, 119).
As per claim 13, Robaina et al. in view of Querze et al. further disclose the user pose is detected by a movement of a mouse or stylus (Robaina et al., paragraph 228).
As per claim 16, Robaina et al. in view of Querze et al. further disclose in response to detecting the release of the user pose, terminating the voice command application (“The wearable device can also detect a pose of the HCP to determine whether an initiation condition or a termination condition is present. The wearable device can use data acquired by its environmental sensors to detect the pose of the HCP. For example, the wearable device can use IMUs to determine whether the user's head pose has changed… “tapping a finger on the left hand” as an initiation condition of recording audio data and “tapping a finger on the right hand” as a termination condition of recording the audio data… Because this hand gesture is associated with the initiation condition, the wearable device may start recording the audio data using the dictation program as the patient and the HCP converse”; Robaina et al., paragraphs 228, 229).
As per claim 17, Robaina et al. in view of Querze et al. further disclose muting the internet call includes receiving the voice command without transmitting the voice command to other users of the internet call computing system (Querze et al. paragraph 117).
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LEONARD SAINT-CYR whose telephone number is (571)272-4247. The examiner can normally be reached Monday through Friday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached at (571)272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/LEONARD SAINT-CYR/ Primary Examiner, Art Unit 2658