Prosecution Insights
Last updated: April 19, 2026
Application No. 18/282,115

METHOD AND APPARATUS FOR IMPROVED SPEAKER IDENTIFICATION AND SPEECH ENHANCEMENT

Final Rejection §103
Filed: Sep 14, 2023
Examiner: KANG, ANNABELLE
Art Unit: 2695
Tech Center: 2600 — Communications
Assignee: Magic Leap Inc.
OA Round: 2 (Final)

Grant Probability: 80% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 2y 8m
Grant Probability With Interview: 63%

Examiner Intelligence

Career Allow Rate: 80% — above average (12 granted / 15 resolved, +18.0% vs TC avg)
Interview Lift: -16.7% (resolved cases with interview)
Avg Prosecution: 2y 8m (typical timeline)
Total Applications: 39 across all art units (24 currently pending)
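The headline figures above are simple ratios. As a minimal sketch (assuming the dashboard computes the career allow rate as granted over resolved cases, and the interview lift as a percentage-point difference from that baseline):

```python
# Career allow rate: granted applications over resolved applications.
granted, resolved = 12, 15
allow_rate = granted / resolved  # 0.80 -> shown as 80%

# Interview lift, shown as a percentage-point delta: the dashboard's
# with-interview grant probability (63%, rounded from ~63.3%) vs. the
# 80% career baseline. The unrounded figure yields the -16.7% shown.
with_interview = 0.63
interview_lift_pp = (with_interview - allow_rate) * 100

print(f"Career allow rate: {allow_rate:.0%}")
print(f"Interview lift: {interview_lift_pp:+.1f} pp")
```

Using the rounded 63% input, the lift computes to -17.0 pp, matching the "-17% lift" annotation; the -16.7% figure follows from the unrounded 63.3%.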

Statute-Specific Performance

§101: 7.3% (-32.7% vs TC avg)
§103: 53.7% (+13.7% vs TC avg)
§102: 33.5% (-6.5% vs TC avg)
§112: 5.5% (-34.5% vs TC avg)

"vs TC avg" = difference from the Tech Center average estimate • Based on career data from 15 resolved cases
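The per-statute deltas are internally consistent with a single Tech Center baseline. A quick sanity check (assuming "vs TC avg" is a percentage-point difference between the examiner's rate and the TC estimate):

```python
# (examiner allow rate %, reported delta vs TC avg in percentage points)
rows = {
    "101": (7.3, -32.7),
    "103": (53.7, +13.7),
    "102": (33.5, -6.5),
    "112": (5.5, -34.5),
}

# Implied TC average = examiner rate minus the reported delta.
implied_tc_avg = {s: round(rate - delta, 1) for s, (rate, delta) in rows.items()}
print(implied_tc_avg)  # every statute backs out to 40.0
```

Every row backs out to the same 40.0% baseline, which suggests the tool compares each statute against one overall TC 2600 average rather than per-statute averages.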

Office Action

§103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 86-105 are rejected under 35 U.S.C. 103 as being unpatentable over Jing (US 20110026722 A1, hereinafter "Jing").

Regarding claim 86, Jing teaches a user speech subsystem, comprising: a vibration voice pickup (VVPU) sensor configured for capturing vibration originating from a voiced sound of a user and generating a vibration signal (see [0073]: a voice activity detector (VAD) combines an acoustic VAD and a vibration sensor VAD appropriate to the environment or condition in which a user is operating a host device); and at least one processor configured for acquiring the vibration signal (see Fig. 45, [0232]: at least one processor 30 controlling subsystems including detection and collection), acquiring an audio signal output by at least one microphone in response to capturing voiced sound from the user and ambient noise containing voiced sound of others (see [0083]-[0085]: two omnidirectional microphones 10), and using the vibration signal to discriminate between voiced sound of the user and the voiced sound from others in the audio signal captured by the at least one microphone (see [0231]-[0237], Fig. 11: a system and method for discriminating between the voiced sound of the user and both voiced and unvoiced speech from background noise are provided, including a Non-Acoustic Sensor Voiced Speech Activity Detection (NAVSAD) system, which is captured by at least one microphone. The user's voice "SIGNAL s(n)" and other voices "NOISE n(n)" "use" a vibration signal from VAD 1104 in the noise removal process to discriminate the voiced sound of the user ("Cleaned Speech") from the noise, as the claim recites only to "use" the vibration signal without further detail on how it is used. First the user's voiced sounds are analyzed with respect to their surrounding sounds, both voiced and unvoiced (see [0138]-[0140]): "The noise removal and reduction methods provided herein, while allowing for the separation and classification of unvoiced and voiced human speech from background noise" (see [0231]). Jing does not exclude this possibility; therefore it would have been obvious to have used the vibration signal to discriminate a voiced sound of the user from the noise, including other voices.)
Regarding claim 87, Jing teaches that the at least one processor is further configured for performing an analysis of the vibration signal, determining that the at least one microphone has captured the voiced sound of the user based on the analysis of the vibration signal, and discriminating between voiced sound of the user in the audio signal and voiced sound from others captured by the at least one microphone in response to the determination that the at least one microphone has captured the voiced sound of the user (see Fig. 46, [0231]-[0240]: the Pathfinder Speech Activity Detection (PSAD) system couples microphones 10 to at least one processor 30 including a detection subsystem 50, a detection algorithm, and a denoising subsystem 50; the NAVSAD system has a sensor to enable detection of voiced speech in any environment).

Regarding claim 88, Jing teaches that the at least one processor is configured for discriminating between voiced sound of the user and voiced sound from others captured by the at least one microphone by detecting voice activity in the audio signal, generating a voice stream corresponding to the voiced sound of the user and the voiced sound of others, and discriminating between the voiced sound of the user and the voiced sound of the others in the voice stream (see [0231]-[0240], [0315]-[0318]: at least one processor 30 discriminates between voiced sound of the user and background noise, separating and classifying unvoiced and voiced human speech; the system generates a voice activity detection (VAD) signal to indicate the presence of voiced speech).

Regarding claim 89, Jing teaches that the at least one processor is further configured for outputting a voice stream corresponding to the voiced sound of the user (see [0136], [0315]-[0318]: a VAD signal means a signal indicating when user speech is detected; the system generates a VAD signal to indicate the presence of voiced speech).
Regarding claim 90, Jing teaches that the at least one processor is further configured for outputting a voice stream corresponding to the voiced sound of the others (see [0231]-[0240], [0315]-[0318]: a VAD signal means a signal indicating when user speech is detected; the system generates a VAD signal to indicate the presence of voiced speech, then discriminates between the voiced sound of the user and background noise, separating and classifying unvoiced and voiced human speech).

Regarding claim 91, Jing teaches a speech recognition engine configured for interpreting the enhanced voiced sound of the user in the voice stream into speech (see [0073], [0278]: speech recognition for enhanced performance and accurate VAD).

Regarding claim 92, Jing teaches a frame structure configured for being worn on the head of a user (see Fig. 16, [0160]: headset or head-worn device 1600 that includes the DOMA) and the user speech subsystem of claim 86, wherein the VVPU sensor and the at least one microphone are affixed to the frame structure (see Fig. 16, [0085], [0098]: an accelerometer or radio-vibrometer for vibration measurements and a microphone affixed to the headset or head-worn device 1600 that includes the DOMA and at least one microphone).

Regarding claim 93, Jing teaches at least one speaker affixed to the frame structure, the at least one speaker configured for conveying sound to the user (see Fig. 16, [0160]: speaker 1602 that can be worn on headset 1600).

Regarding claim 94, Jing is silent as to at least one display screen affixed to the frame structure and at least one projection assembly affixed to the frame structure, the at least one projection assembly configured for projecting virtual content onto the at least one display screen for viewing by the user. However, it would have been obvious that the designer could have chosen such an arrangement for projecting virtual content on a display screen for viewing by the user based on the users' needs/preferences, and no unexpected result is produced.
Regarding claim 95, Jing is silent as to the VVPU being further configured for being vibrationally coupled to one of a nose, an eyebrow, and a temple of the user when the frame structure is worn by the user. However, it would have been obvious that the designer could have chosen such an arrangement to couple to one of a nose, an eyebrow, and a temple of the user when the frame structure is worn by the user. Jing mentions the detector in contact with the skin of a user 206; this serves the same function of being coupled to a nose, eyebrow, or temple of a user for accurate data, or signal, collection.

Regarding claim 96, Jing is silent as to the frame structure comprising a nose pad in which the VVPU sensor is affixed. However, it would have been obvious that the designer could have chosen such an arrangement, including a nose pad in which the VVPU sensor is affixed to the frame structure, based on the users' needs/preferences, and no unexpected result is produced.

Regarding claims 97-102, the claimed limitations are method claims directly corresponding to system claims 86-91; therefore, they are rejected for substantially similar reasons as claims 86-91, as discussed above.

Regarding claim 103, the claimed limitations directly correspond to claims 86 and 97; therefore, it is rejected for substantially similar reasons as claims 86 and 97, as discussed above.

Regarding claim 104, Jing teaches that the analysis of the vibration signal comprises determining that one or more characteristics of the vibration signal exceed a threshold level (see [0075]-[0082]: the VAD component determines whether vibration signals exceed first and second threshold levels).

Regarding claim 105, Jing teaches that the at least one processor is further configured for performing an analysis of the audio signal, and determining that the at least one microphone has captured voiced sound from the user based on the analyses of the audio signal and the vibration signal
(see [0232]-[0234]: at least one processor 30 controls detection subsystem 50 based on the audio and vibration signals collected by the microphone).

Response to Arguments

Applicant's arguments filed October 22, 2025 have been fully considered but are not persuasive. On pages 8-11 of the remarks, Applicant mainly argues that the art of record fails to disclose a vibration signal used to discriminate between the voiced sound of the user and the voiced sound of others. The Office disagrees. As pointed out in the rejection above, Jing clearly teaches a user speech subsystem, comprising: a vibration voice pickup (VVPU) sensor configured for capturing vibration originating from a voiced sound of a user and generating a vibration signal (see [0073]); and at least one processor configured for acquiring the vibration signal (see Fig. 45, [0232]), acquiring an audio signal output by at least one microphone in response to capturing voiced sound from the user and ambient noise containing voiced sound of others (see [0083]-[0085]), and using the vibration signal to discriminate between voiced sound of the user and the voiced sound from others in the audio signal captured by the at least one microphone (see [0231]-[0237], Fig. 11, and the discussion of claim 86 above). First the user's voiced sounds are analyzed with respect to their surrounding sounds, both voiced and unvoiced (see [0168]-[0170]). "The noise removal and reduction methods provided herein, while allowing for the separation and classification of unvoiced and voiced human speech from background noise" may include other voices (see [0231]). Jing does not exclude this possibility; therefore it would have been obvious to have used the vibration signal to discriminate a voiced sound of the user from the noise, including other voices.

"Using the vibration signal…" is very broad language which does not require "comparing the vibration signal to the microphone signal in order to determine a voice of the user". Under the broadest reasonable interpretation of this claim language, Jing teaches this in that a user's voice "SIGNAL s(n)" and other voices "NOISE n(n)" will "use" a vibration signal from VAD 1104 in the noise removal process to discriminate the voiced sound of the user ("Cleaned Speech") from the noise.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.
In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANNABELLE KANG, whose telephone number is (571) 270-3403. The examiner can normally be reached Monday-Thursday, 8:00-5:00.

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Vivian Chin, can be reached at 571-272-7848. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ANNABELLE KANG/
Examiner, Art Unit 2695

/VIVIAN C CHIN/
Supervisory Patent Examiner, Art Unit 2695

Prosecution Timeline

Sep 14, 2023: Application Filed
Jul 18, 2025: Non-Final Rejection — §103
Oct 22, 2025: Response Filed
Jan 28, 2026: Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12604141
ULTRA-LOW FREQUENCY SOUND COMPENSATION METHOD AND SYSTEM BASED ON HAPTIC FEEDBACK, AND COMPUTER-READABLE STORAGE MEDIUM
Granted Apr 14, 2026 (2y 5m to grant)
Patent 12581255
SYSTEMS AND METHODS FOR ASSESSING HEARING HEALTH BASED ON PERCEPTUAL PROCESSING
Granted Mar 17, 2026 (2y 5m to grant)
Patent 12556868
Speaker
Granted Feb 17, 2026 (2y 5m to grant)
Patent 12549895
DYNAMIC WIND DETECTION FOR ADAPTIVE NOISE CANCELLATION (ANC)
Granted Feb 10, 2026 (2y 5m to grant)
Patent 12513372
AUDIO DATA PROCESSING METHOD, AUDIO DATA PROCESSING APPARATUS, COMPUTER READBLE STORAGE MEDIUM, AND ELECTRONIC DEVICE SUITABLE FOR STAGE
Granted Dec 30, 2025 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 80%
With Interview: 63% (-16.7%)
Median Time to Grant: 2y 8m
PTA Risk: Moderate

Based on 15 resolved cases by this examiner. Grant probability derived from career allow rate.
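The with-interview figure follows directly from the two numbers above. A sketch, assuming the dashboard adds the interview lift to the career allow rate as percentage points and rounds for display:

```python
grant_probability = 80.0   # career allow rate, %
interview_lift_pp = -16.7  # interview lift, percentage points

# With-interview probability: baseline plus lift, rounded for display.
with_interview = grant_probability + interview_lift_pp  # 63.3
print(round(with_interview))  # 63
```

This matches the "63% With Interview (-16.7%)" line: 80.0 - 16.7 = 63.3, displayed as 63%.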
