Prosecution Insights
Last updated: April 19, 2026
Application No. 18/218,953

Accelerometer-Based Voice Activity Detection With Optimized Detection Axis

Non-Final OA — §102, §103
Filed
Jul 06, 2023
Examiner
OPSASNICK, MICHAEL N
Art Unit
2658
Tech Center
2600 — Communications
Assignee
Invensense Inc.
OA Round
3 (Non-Final)
Grant Probability: 82% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 3y 3m
With Interview: 92%

Examiner Intelligence

Career Allow Rate: 82%, above average (737 granted / 900 resolved; +19.9% vs TC avg)
Interview Lift: +10.5% on resolved cases with interview (moderate, ~+10%)
Typical Timeline: 3y 3m average prosecution (46 currently pending)
Career History: 946 total applications across all art units
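The headline figures in this panel follow from simple arithmetic on the career record shown above; a minimal sketch, with all numbers taken from this report:

```python
# Career record shown above: 737 granted out of 900 resolved cases.
granted, resolved = 737, 900
career_allow_rate = granted / resolved          # ~0.819, displayed as 82%

# Interview lift shown above: +10.5 points on resolved cases with interview.
interview_lift = 0.105
with_interview = career_allow_rate + interview_lift

print(f"Career allow rate: {career_allow_rate:.1%}")  # 81.9%
print(f"With interview:    {with_interview:.1%}")     # 92.4%, displayed as 92%
```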

Statute-Specific Performance

§101: 17.7% (-22.3% vs TC avg)
§103: 33.0% (-7.0% vs TC avg)
§102: 29.9% (-10.1% vs TC avg)
§112: 6.3% (-33.7% vs TC avg)
Tech Center averages are estimates • Based on career data from 900 resolved cases
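The per-statute deltas above are stated relative to a Tech Center average estimate; recovering that implied baseline from the displayed numbers is a one-line check (figures copied from the panel):

```python
# Per-statute allowance rates and deltas vs the Tech Center average, as displayed above.
statute_stats = {
    "101": (17.7, -22.3),
    "103": (33.0, -7.0),
    "102": (29.9, -10.1),
    "112": (6.3, -33.7),
}

# Implied TC-average baseline for each statute: rate minus delta.
tc_avg = {s: round(rate - delta, 1) for s, (rate, delta) in statute_stats.items()}
print(tc_avg)  # each statute implies the same 40.0% baseline
```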

Office Action

Grounds: §102, §103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 2/3/2026 has been entered.

Allowable Subject Matter

Claims 16 and 20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. The following is a statement of reasons for the indication of allowable subject matter: as per claims 16 and 20, Rivolta et al (20220270593) teaches the claim elements of the independent claim, as mapped below. As to the beamforming aspect, it is old and notoriously well known in the prior art to use beamforming to direct/affect filtering coefficients based on directionality (e.g., see Condorelli et al, 20210383824, teaching beamforming of acoustic transducers based on the voice accelerometer – para 0040, referring back to Figure 9), but the prior art does not explicitly teach the use of beamforming calculations to map the accelerometer axes to the inner canal/bone conductor axes.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: "A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made."

Claims 1, 3, 5-7, 9, 11-13, 15, 17, and 19 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Rivolta et al (20220270593) in view of Condorelli et al (20210383824).
As per claim 1, Rivolta et al (20220270593) teaches a head worn electronic device comprising: a transceiver for communicating with a host device (as using the disclosed technique for wireless headphones – para 0015; i.e., the wireless headphones, by definition, communicate with a base device); an accelerometer having a plurality of axes for detecting three-dimensional forces applied to the head worn electronic device; and a processor configured to: receive a three-dimensional vibration vector from the accelerometer caused by a voice of a user while the head worn electronic device is positioned in a user's ear (as, the device(s)/headphones process accelerometer data as well as bone-conduction accelerometer data – para 0016 and para 0018; the accelerometer measures in 3 axes, i.e., 3 dimensional axes – para 0019, measuring information from the user's ear canal – end of para 0019); process the three-dimensional vibration vector to determine a voice activity detection axis (as, the devices are on the ear/inside the ear canal of the person's face – para 0019 – the examiner notes that, by definition, at least one of the axes would be perpendicular to the user's ear; and, using the accelerometer 3-axis measurement for voice activity detection – para 0019 – wherein the accelerometer determines a filtered acceleration signal, tying that signal to human speech – para 0031; and extracting different features based on the different axes of the device – para 0036); perform processing of data from the voice activity detection axis to detect voice activity of the user; and send an instruction to the host device via the transceiver to control the host device based on the voice activity detection (based on detected voice activity via accelerometer information, notifying the device whether the information was human speech or not, after matriculating through the decision tree – para 0039-0043, with para 0048 outputting the final decision).

Further as to claim 1, Rivolta et al (20220270593) teaches the separation of the bone conductor accelerometer and the device accelerometers (figure 1 and para 0017), but is silent on the relationship between the axes of both; however, Condorelli et al (20210383824) teaches the steering of the acoustic transducers in a different direction than that of the voice accelerometers (fig. 9 and para 0040). Therefore, it would have been obvious to one of ordinary skill in the art of voice activity detection processing to modify the measuring system of Rivolta et al (20220270593) with separate axes of calculation, as taught above by Condorelli et al (20210383824), because it would advantageously allow the voice activity detection to be directed toward background noise sources, so as to dominate/cancel the effect of the background noise (see Condorelli et al (20210383824), para 0040).

As per claim 3, the combination of Rivolta et al (20220270593) in view of Condorelli et al (20210383824) teaches the head worn electronic device of claim 1, wherein the voice activity detection axis is remote from the plurality of axes of the accelerometer (see Rivolta et al (20220270593), figure 1, wherein the bone conduction accelerometer is separate from the regular accelerometer, and para 0017).
As per claims 5 and 6, the combination of Rivolta et al (20220270593) in view of Condorelli et al (20210383824) teaches axes coefficients/transfer functions to map/project onto the voice activity detection axes (see Rivolta et al (20220270593), para 0036, wherein the signals are processed differently based on the axes; when the accelerometer shows speech activity, the bone conduction accelerometer is activated). In other words, considering, in vector form, the three-axis measurements of the accelerometers through the decision tree (see Rivolta et al (20220270593), para 0036: Xa, Ya, Za), culminating in a final affirmative speech detection, the (X, Y, Z) bone-conductor vector would be {1, 1, 1}; oppositely, if the final decision tree result is "no speech", the vector axes representation for the bone conductor accelerometer would be {0, 0, 0}.

As per claim 7, the combination of Rivolta et al (20220270593) in view of Condorelli et al (20210383824) teaches a head worn electronic device comprising: a transceiver for communicating with a host device (as using the disclosed technique for wireless headphones – see Rivolta et al (20220270593), para 0015; i.e., the wireless headphones, by definition, communicate with a base device); an accelerometer with a plurality of axes for detecting three-dimensional forces applied to the head worn electronic device; and a processor configured to: receive a three-dimensional vibration vector from the accelerometer caused by a voice of a user while the head worn electronic device is positioned in a user's ear (see Rivolta et al (20220270593): the device(s)/headphones process accelerometer data as well as bone-conduction accelerometer data – para 0016 and para 0018; the accelerometer measures in 3 axes, i.e., 3 dimensional axes – para 0019, measuring information from the user's ear canal – end of para 0019); transmit, via the transceiver, the three-dimensional vibration vector to the host device; receive, via the transceiver, voice activity detection axis coefficients from the host device (see Rivolta et al (20220270593): transmitting the information from the bone vibration signals and the acoustic signals, to be processed by the device using time-division multiplexing synchronization – para 0014, para 0020); compute a voice activity detection axis based on the voice activity detection axis coefficients, wherein the voice activity detection axis correlates with vibrations caused by the voice of the user (see Rivolta et al (20220270593): using the accelerometer 3-axis measurement for voice activity detection – para 0019 – wherein the accelerometer determines a filtered acceleration signal, tying that signal to human speech – para 0031; and extracting different features based on the different axes of the device – para 0036); perform processing of data from the voice activity detection axis to detect voice activity of the user; and send an instruction to the host device via the transceiver to control the host device based on the voice activity detection (see Rivolta et al (20220270593): based on detected voice activity via accelerometer information, notifying the device whether the information was human speech or not, after matriculating through the decision tree – para 0039-0043, with para 0048 outputting the final decision).

Claims 9, 11, and 12 are device claims that are similar/identical to the claim limitations in claims 3, 5, and 6; as such, claims 9, 11, and 12 are similar in scope and content to claims 3, 5, and 6 above and are rejected under similar rationale.

Claims 13 and 15 are device claims whose elements/steps are similar in scope and content to those found in claims 1, 5-7, 9, 11, and 12 above; as such, claims 13 and 15 are rejected under similar rationale as presented against claims 1, 5-7, 9, 11, and 12 above.
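The mapped limitation (projecting a three-axis accelerometer sample onto a computed voice activity detection axis via per-axis coefficients, then thresholding to detect voice) can be sketched as a toy calculation. Everything below, including the coefficient values, threshold, and function name, is hypothetical illustration and is not taken from the application or the cited references:

```python
import math

def detect_voice(sample_xyz, axis_coeffs, threshold=0.5):
    """Project a 3-axis vibration sample onto a detection axis and threshold it."""
    # Normalize the coefficient vector into a unit detection axis.
    norm = math.sqrt(sum(c * c for c in axis_coeffs))
    axis = [c / norm for c in axis_coeffs]
    # Signed magnitude of the sample along the detection axis.
    projection = sum(s * a for s, a in zip(sample_xyz, axis))
    return abs(projection) > threshold

coeffs = (0.1, 0.2, 0.97)                       # hypothetical mapping coefficients
print(detect_voice((0.0, 0.1, 0.9), coeffs))    # vibration mostly along the axis: True
print(detect_voice((0.6, 0.0, 0.0), coeffs))    # vibration mostly off-axis: False
```

The same projection generalizes to a detection axis that need not coincide with any single accelerometer axis, which is the distinction the claims draw.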
Claims 17 and 19 are method claims whose steps are performed by device claims 1, 5-7, 9, 11, and 12 above; as such, claims 17 and 19 are similar in scope and content to claims 1, 5-7, 9, 11, and 12 above and are rejected under similar rationale.

Claims 4, 10, 14, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Rivolta et al (20220270593) in view of Condorelli et al (20210383824) in further view of El Guindi et al (20210306726).

As per claims 4, 10, 14, and 18, the combination of Rivolta et al (20220270593) in view of Condorelli et al (20210383824) teaches the claim elements of the independent claims from which these claims depend (see mapping above toward independent claims 1, 7, 13, and 17). Rivolta et al (20220270593) further teaches switching modes of the device to listening mode (as noted above in the rejection of claims 1-3, 5-9, 11-13, 15, 17, and 19); during listening mode, the signal is marked as "voice activity" for further processing. However, the combination of Rivolta et al (20220270593) in view of Condorelli et al (20210383824) does not explicitly teach keyword spotting as part of the processing. El Guindi et al (20210306726) teaches a hearing device using accelerometers for VAD (para 0062), analyzing voice characteristics based on sound features (para 0071, first two sentences), which includes a further speech classification to include keyword spotting (para 0071, 4th/5th sentence to the end of para 0071).

Therefore, it would have been obvious to one of ordinary skill in the art of accelerometer-based VAD to modify the system of signal classification of Rivolta et al (20220270593) in view of Condorelli et al (20210383824) with an added category of keyword spotting, with other word/phrase recognition, as taught by El Guindi et al (20210306726), because it would provide the added benefit of detecting whether there is a conversation that the user is conducting (El Guindi et al (20210306726), para 0071; improving upon the mere voice signal analysis of Rivolta et al (20220270593)).

Response to Arguments

Applicant's arguments filed 2/3/2026 have been fully considered but they are not persuasive. Regarding Applicant's arguments on p. 10 of the response, toward the features pertaining to maintaining the voice activity detection axis perpendicular to an anatomical feature of the user, the examiner argues that the constant accelerometer calculations, at differing angles, are a form of "maintaining" (see para 0036 of Rivolta et al (20220270593), showing peak-to-peak calculation of the z axis, and para 0037-0042, using a decision tree to constantly determine the location of human speech). Furthermore, the Condorelli et al (20210383824) reference teaches the concept of adaptive beamforming to maximize signal measurement for speech while maximizing noise cancellation (para 0040). In maintaining these maximized signals, the angle of maximization is a ninety-degree angle of incidence. For further evidence of beamforming to adjust for speech/non-speech, see Dusan et al (20140093091), figure 12 for non-speech noise, and para 0004 to maximize speech.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. The following references were found toward using multiple accelerometers and performing beamforming:

Dusan et al (20140093091) – figure 12 for non-speech noise, and para 0004 to maximize speech.
Sabin et al (20210345047) – para 0012, para 0034.
Wax et al (20220068298) – para 0031, 0040, 0043.

The following reference(s) were found toward VAD using accelerometers with/without keyword spotting:

Burnett et al (20040133421) teaches the concept of voice activity detection using an accelerometer on a user to determine vibrations indicating voice activity (para 0023).
Pedersen et al (20210409878) teaches VAD using accelerometers (for picking up vibrations of the tissue/flesh/bone of the user – para 0024), with keyword detection (para 0015).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Michael Opsasnick, telephone number (571) 272-7623, who is available Monday-Friday, 9am-5pm. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Mr. Richemond Dorvil, can be reached at (571) 272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/Michael N Opsasnick/
Primary Examiner, Art Unit 2658
02/17/2026

Prosecution Timeline

Jul 06, 2023
Application Filed
May 08, 2025
Non-Final Rejection — §102, §103
Aug 13, 2025
Response Filed
Nov 12, 2025
Final Rejection — §102, §103
Feb 03, 2026
Request for Continued Examination
Feb 10, 2026
Response after Non-Final Action
Feb 17, 2026
Non-Final Rejection — §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602554
SYSTEMS AND METHODS FOR PRODUCING RELIABLE TRANSLATION IN NEAR REAL-TIME
2y 5m to grant • Granted Apr 14, 2026
Patent 12592246
SYSTEM AND METHOD FOR EXTRACTING HIDDEN CUES IN INTERACTIVE COMMUNICATIONS
2y 5m to grant • Granted Mar 31, 2026
Patent 12586580
System For Recognizing and Responding to Environmental Noises
2y 5m to grant • Granted Mar 24, 2026
Patent 12579995
Automatic Speech Recognition Accuracy With Multimodal Embeddings Search
2y 5m to grant • Granted Mar 17, 2026
Patent 12567432
VOICE SIGNAL ESTIMATION METHOD AND APPARATUS USING ATTENTION MECHANISM
2y 5m to grant • Granted Mar 03, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 82%
With Interview: 92% (+10.5%)
Median Time to Grant: 3y 3m
PTA Risk: High

Based on 900 resolved cases by this examiner. Grant probability derived from career allow rate.
