Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
1. This is in response to the amendment filed 11/07/2025. Claims 21-22 have been added. Claims 9 and 18 have been canceled. Claims 1-4, 8, 10-13, 16-17 and 19-20 have been amended. Claims 1-8, 10-17 and 19-22 remain pending in this application.
Claim Rejections - 35 USC § 103
2. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-8, 10-17 and 19-22 are rejected under 35 U.S.C. 103 as being unpatentable over Liu et al. (US PAT # 10,798,506 B2) in view of Lang (US PAT # 10,475,449 B2), in view of Carter et al. (US PAT # 10,210,726 B2), and further in view of Gilson (US PAT # 9,313,440 B2).
Regarding claims 1, 10 and 19, Liu teaches a system, method and computer-readable storage medium (see abstract), comprising:
an event detector (see event detection module 215 as shown in Fig.2 of system 200) that:
receives a signal (see col. 2, lines 10-40), and
detects a presence of a user based on an analysis of the signal (see col. 2, lines 10-40).
Liu's features are addressed above in the rejection of claims 1, 10 and 19. While Liu teaches event detection via audio signal analysis using a microphone-triggered detection system, Liu does not specifically teach a “microphone control component that: determines to activate a first microphone of a remote control device based at least on the detected presence of the user”.
However, Lang teaches a control system for a plurality of microphones that, upon certain conditions, transmits commands to disable or alter wake responses of microphones (see col. 6, lines 5-30). Note that Lang thus teaches transmitting commands to control microphone behavior.
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the feature of controlling a microphone by transmitting a command to a networked microphone to disable or alter its response, as taught by Lang, into the teaching of Liu, in order to dynamically manage microphone activity based on detected events, thereby improving efficiency and responsiveness.
Liu and Lang features are addressed in the rejection of claims 1, 10 and 19. Neither Liu nor Lang teaches “responsive to the determination, wirelessly transmits a first command to the remote-control device, the first command including instructions to activate the first microphone”.
However, Carter teaches a system that detects specific audio events (e.g., gunshots) and, upon such detection, transmits notifications or initiates commands for external system responses (see, e.g., col. 3, lines 10-40 and col. 1, lines 48-67).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the command transmission behavior taught by Carter into the combination of Liu and Lang, in order to allow for system-wide responses upon event detection, as such notification mechanisms are common in distributed audio processing and surveillance systems. The combination of these references yields the claimed invention without the exercise of inventive skill.
Liu, Lang and Carter features are addressed in the rejection of claims 1, 10 and 19. None of the applied references specifically teaches detecting presence via movement of the remote control device (i.e., a signal “generated based on a sensor of a remote control device externally located from the system”) and enabling/activating the microphone based on that detection.
However, Gilson teaches a sensor detecting motion of a remote control device (the remote control device registers certain user actions, e.g., sensing movement of the remote control device using an accelerometer; see col. 8, lines 53-60). Gilson also teaches that the remote control device may sense movement of the remote control device using, for example, an accelerometer, and, based on the detected movement, the remote control may initiate the active state (see col. 9, lines 17-25). Note that in Gilson the microphone is activated in the active state (see col. 8, lines 28-36) and may be deactivated in standby (see col. 8, lines 60-67); after transitioning to the active state, the remote control device may “activate a microphone”.
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate motion-based activation of the microphone, as taught by Gilson, into the primary reference in order to conserve power and automatically enable voice capture upon detecting user handling of the device, since Gilson expressly teaches this benefit (battery conservation and conditional activation).
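For illustration only, the accelerometer-triggered activation pattern relied on above can be sketched as follows; the motion threshold and function name are hypothetical and are not drawn from Gilson or any other applied reference:

    import math

    MOTION_THRESHOLD_G = 1.15  # hypothetical: resting gravity (~1 g) plus margin

    def on_accelerometer_sample(ax: float, ay: float, az: float,
                                mic_active: bool) -> bool:
        """Return the new microphone state: activate on detected movement
        (active state); otherwise keep the current state (standby conserves
        battery while the remote is not being handled)."""
        magnitude = math.sqrt(ax * ax + ay * ay + az * az)
        if magnitude > MOTION_THRESHOLD_G:
            return True       # user handled the remote: activate the microphone
        return mic_active     # no motion event: leave the state unchanged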
Regarding claims 2, 11 and 20, the combination of Liu, Lang, Carter and Gilson teaches wherein the first signal comprises at least one of:
a media content signal that is provided to a media presentation device that presents media content based on the media content signal;
an audio signal captured by a second microphone that is proximate to the media presentation device;
a network signal received by a network interface; or
an image or a video of the media presentation device captured by a camera (see Liu, col. 6, lines 1-16).
Regarding claims 3 and 12, the combination of Liu, Lang, Carter and Gilson teaches wherein the transmission of the first command to the remote control device causes the remote control device to provide power to the first microphone to cause the first microphone to capture the audio; and the system comprises an interface that receives, from the remote control device, the audio captured by the first microphone (this reads on Lang's teaching that, upon detecting a wake word, an NMD may respond by listening, via a microphone, for a voice command following the wake word, a response referred to as the “wake response” of an NMD; see Lang, col. 2, lines 21-63).
Regarding claims 4 and 13, the combination of Liu, Lang, Carter and Gilson teaches wherein the transmission of the first command to the remote control device causes the remote control device to provide audio captured by the first microphone to an application executing on a network device for processing thereof (see Lang col. 19, lines 11-30).
Claim 5 recites “wherein the event detector is further configured to:
compare an audio signal captured by the first microphone to an expected audio output of a media presentation device;
determine a level of similarity between the audio signal and the expected audio output meets a threshold condition;
in response to the level of similarity being determined to meet the threshold condition, determine that processing of the audio captured by the first microphone is enabled”.
Note that Liu teaches analyzing microphone input to detect certain events (e.g., audio patterns or triggers) to enable device functionality, and Lang teaches dynamically enabling or disabling microphone processing in response to certain operating conditions or system context (e.g., user activity, environmental conditions), demonstrating the importance of controlling audio capture to preserve system performance or privacy.
While neither Liu nor Lang specifically compares a captured signal to an “expected output,” it would have been obvious to use known signal comparison techniques, such as correlation or similarity scoring, to distinguish between relevant and irrelevant audio inputs, particularly to avoid processing audio that matches known media output (e.g., a movie or broadcast) being played by the system itself. Such comparison functionality is well known in the fields of acoustic echo cancellation, voice activity detection, and smart speaker optimization, and would have been an obvious design choice for one skilled in the art seeking to reduce processing load or to avoid redundant audio input (e.g., the device processing its own playback as if it were external speech). In addition, applying a threshold condition to a similarity metric is a standard control mechanism to ensure that activation or enablement occurs only when a meaningful deviation from the expected output is detected.
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the systems of Liu, Lang and Carter to include functionality for comparing a microphone-captured audio signal to an expected audio output of a media presentation device, determining a similarity level, and enabling audio processing when that similarity meets a threshold condition. Such a modification would have been a predictable and routine enhancement for improving contextual awareness and reducing false-positive activations in microphone-controlled environments.
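For illustration only, a minimal sketch of the correlation-based similarity scoring and threshold test described above; the function and the 0.8 threshold are hypothetical and are not drawn from any applied reference:

    import numpy as np

    SIMILARITY_THRESHOLD = 0.8  # hypothetical tuning value

    def similarity(captured: np.ndarray, expected: np.ndarray) -> float:
        """Peak of the normalized cross-correlation between captured audio
        and the expected media output (1.0 = identical, 0.0 = unrelated)."""
        captured = captured - captured.mean()
        expected = expected - expected.mean()
        corr = np.correlate(captured, expected, mode="full")
        denom = np.linalg.norm(captured) * np.linalg.norm(expected)
        return float(np.max(np.abs(corr)) / denom) if denom else 0.0

    def processing_enabled(captured: np.ndarray, expected: np.ndarray) -> bool:
        # Claim 5 logic: processing is enabled only when the similarity
        # level meets the threshold condition.
        return similarity(captured, expected) >= SIMILARITY_THRESHOLD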
Claim 6 recites “wherein the event detector is further configured to:
compare an audio signal captured by the first microphone to an expected audio output of a media presentation device;
determine a level of similarity between the audio signal and the expected audio output does not meet a threshold condition;
in response to the level of similarity being determined to not meet the threshold condition, performing a corrective action”.
Note that Liu teaches detecting specific events via microphone input, providing a foundation for audio analysis, and Lang teaches dynamic control of microphone functionality in response to environmental or system-based triggers, indicating a framework for responsive audio control. Building on these teachings, it would have been a routine and predictable step to implement signal comparison logic to distinguish between actual external inputs and known expected playback (e.g., voice commands versus media playback from a device). For example, systems combining media output and voice capture (e.g., smart TVs, virtual assistants) inherently face the problem of differentiating between external signals and internal audio outputs; ensuring accurate interpretation and avoiding false triggers or feedback would necessitate comparison with the expected output. Also, threshold-based decision-making is a standard control technique: if the similarity falls below the threshold (indicating non-matching, possibly novel or user-generated input), it would be natural to initiate corrective actions such as adjusting gain, retransmitting a control command, re-sampling, enhancing noise filtering, or alerting the system to reanalyze the input. In addition, signal comparison and similarity threshold mechanisms are well established in the art of acoustic signal processing (e.g., echo cancellation, voice recognition calibration, background-signal suppression). Applying these to decide on corrective actions in real-world environments is a logical extension, and one skilled in the art would recognize the desirability of confirming that captured audio differs sufficiently from expected media output before proceeding with normal processing. Finally, once such a comparison and threshold were applied, it would be routine to define corrective behavior for when the similarity is too low, such as adjusting microphone sensitivity, pausing playback, or reinitiating detection routines.
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the event detection of Liu, the microphone control of Lang, and the notification/command features of Carter with standard threshold-based logic; such a combination constitutes no more than an obvious design choice to improve system robustness and reliability.
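For illustration only, a standalone sketch of the threshold gate and corrective-action branch discussed above; the score cutoffs and the particular corrective actions are hypothetical examples, not taken from any applied reference:

    SIMILARITY_THRESHOLD = 0.8  # hypothetical tuning value, as in the earlier sketch

    def handle_capture(sim: float) -> str:
        """Threshold gate over a precomputed similarity score, e.g., one
        produced by the correlation sketch above."""
        if sim >= SIMILARITY_THRESHOLD:
            return "enable processing"                 # claim 5 branch
        # Claim 6 branch: similarity does not meet the threshold, so a
        # corrective action is taken; these choices are illustrative.
        if sim < 0.2:
            return "re-sample and reanalyze the input"
        return "adjust microphone gain / noise filtering"

    print(handle_capture(0.9))  # enable processing
    print(handle_capture(0.5))  # adjust microphone gain / noise filtering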
Dependent claim 14 is rejected for the same reasons addressed in claims 5 and 6 above.
Regarding claims 7 and 15, the combination of Liu, Lang, Carter and Gilson teaches wherein the detected first event comprises one of:
detect an incoming audio or video call;
obtain an indication that an audio input feature of an application has been enabled;
determine that an application is in a state to accept user input; or
launching of an application with audio input features (see Liu, col. 6, lines 17-25).
Claims 8 and 16 recite “receive, from the remote control device, an audio signal captured by the first microphone while the first microphone is on; and
determine whether to accept the incoming call based at least on the audio signal”.
Note that none of the applied references specifically teaches “the detected first event comprises the incoming call” as recited in claims 8 and 16. However, Liu teaches that local computing devices 115, 120 may be a personal computer and/or a smart phone (see col. 6, lines 20-22). Thus, it would have been obvious for the smart phone utilized in Liu to detect a first event, wherein the first event detected is an incoming call. This feature is considered obvious, if not inherent, within the teaching of Liu.
Claim 17 recites “detecting a second event; determining to cease processing audio captured by the first microphone based at least on the detected event; and transmitting a second command to the remote control device, the second command including instructions to cease processing audio captured by the first microphone”.
Note that the Liu, Lang, and Carter references, in view of Gilson, collectively demonstrate systems that detect and respond to single audio-based events (e.g., wake words, environmental triggers). While they do not explicitly disclose “a second event” as recited in claim 17, the use of event-driven architectures naturally lends itself to sequential event handling, especially in systems where user interaction or environmental conditions are dynamic and temporally spaced. Note that modifying the systems of Liu, Lang and Carter to include detecting a second event would enhance user-experience consistency, because sequential detection enables smoother, more intuitive interactions (for example, detecting a first event, such as media playback, followed by a second event, such as speech directed at the device, allows systems to differentiate between passive background activity and intentional user input). It also offers technical predictability, because detecting multiple events over time is a common control pattern in user interface systems, command pipelines, and audio recognition.
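For illustration only, a minimal sketch of the sequential, event-driven dispatch pattern described above; the event names and command strings are hypothetical and are not drawn from any applied reference:

    from enum import Enum, auto

    class Event(Enum):
        USER_PRESENT = auto()   # first event: activate the remote's microphone
        USER_IDLE = auto()      # second event: cease processing captured audio

    def handle_event(event: Event, send_command) -> None:
        """Dispatch sequential events: a first event triggers the first
        (activation) command; a later second event triggers the second
        (cease-processing) command."""
        if event is Event.USER_PRESENT:
            send_command("ACTIVATE_MIC")        # first command
        elif event is Event.USER_IDLE:
            send_command("CEASE_PROCESSING")    # second command

    # Example: two temporally spaced events arriving in sequence.
    handle_event(Event.USER_PRESENT, print)
    handle_event(Event.USER_IDLE, print)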
Regarding claim 21, the combination of Liu, Lang, Carter and Gilson teaches wherein said transmitting the first command to the remote-control device causes the remote-control device to: provide power to the first microphone to cause the first microphone to capture the audio; and the operations further comprising: receiving the audio captured by the first microphone from the remote-control device (reads on the microphone enablement and activation feature; see Gilson, col. 8, lines 28-33 and 60-67).
Regarding claim 22, the combination of Liu, Lang, Carter and Gilson teaches wherein said transmitting the first command to the remote-control device causes the remote-control device to: provide audio captured by the first microphone to an application executing on a network device for processing thereof (reads on accelerometer-based sensing; see Gilson, col. 8, lines 53-60 and col. 9, lines 17-25).
Response to Arguments
3. Applicant’s arguments for independent claims have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Conclusion
4. Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
5. Any inquiry concerning this communication or earlier communications from the examiner should be directed to Rasha S. AL-Aubaidi whose telephone number is (571) 272-7481. The examiner can normally be reached on Monday-Friday from 8:30 am to 5:30 pm.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Ahmad Matar, can be reached on (571) 272-7488.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
/RASHA S AL AUBAIDI/ Primary Examiner, Art Unit 2693