Prosecution Insights
Last updated: April 19, 2026
Application No. 18/794,843

EAR-WORN DEVICE WITH NEURAL NETWORK-BASED NOISE MODIFICATION AND/OR SPATIAL FOCUSING

Non-Final OA (§103)

Filed: Aug 05, 2024
Examiner: YU, NORMAN
Art Unit: 2693
Tech Center: 2600 — Communications
Assignee: Fortell Research Inc.
OA Round: 1 (Non-Final)

Grant Probability: 88% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 2y 1m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 88% (above average; 525 granted / 598 resolved; +25.8% vs TC avg)
Interview Lift: +13.5% among resolved cases with interview (moderate)
Avg Prosecution: 2y 1m (fast prosecutor; 35 currently pending)
Total Applications: 633 across all art units

Statute-Specific Performance

§101: 2.2% (-37.8% vs TC avg)
§103: 51.8% (+11.8% vs TC avg)
§102: 17.2% (-22.8% vs TC avg)
§112: 16.8% (-23.2% vs TC avg)

Tech Center averages are estimates. Based on career data from 598 resolved cases.
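As a sanity check, each "vs TC avg" delta should equal the examiner's rate minus the Tech Center average. A small sketch (assuming that simple definition of the delta) recovers the implied baselines from the table above:

```python
# Figures copied from the table above: examiner rate (%) and delta vs TC avg (%).
rates = {"101": (2.2, -37.8), "103": (51.8, 11.8),
         "102": (17.2, -22.8), "112": (16.8, -23.2)}

for statute, (examiner_pct, delta_pct) in rates.items():
    tc_avg = examiner_pct - delta_pct  # delta = examiner rate - TC average
    print(f"§{statute}: examiner {examiner_pct}% vs TC avg ~{tc_avg:.1f}%")
```

Notably, all four statutes back out to the same ~40% baseline, which suggests the deltas were computed against a single Tech Center figure rather than per-statute averages.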

Office Action

§103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3, 5, and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over DiCenso (US 2015/0195641) in view of Jelcicova (US 2022/0232331).
Regarding claim 1, DiCenso teaches an ear-worn device, comprising: two or more microphones (DiCenso figure 1 and ¶0022, “Wearable device 130 may be implemented by headphones or ear buds 134 that each contain an associated speaker and one or more microphones or transducers, which may include an ambient microphone to detect ambient sounds within the ambient auditory environment, and an internal microphone used in a closed loop feedback control system for cancellation of user selected sounds”); and noise reduction circuitry (DiCenso figure 3), wherein: the noise reduction circuitry is configured to generate an output audio signal (DiCenso ¶0028, “The changes made by user 120 using the user interface are communicated to the wearable device 130 to control corresponding processing of input signals to create auditory output signals that implement the user preferences”) comprising: a target speech signal comprising a first spatially-focused version of a speech signal (DiCenso figure 1 and ¶0019, “a voice from a person 104 talking to user 120”), wherein the speech signal comprises speech in a first audio signal among the multiple audio signals (DiCenso figure 1 and ¶0019, “a voice from a person 104 talking to user 120”); an interfering speech signal comprising a second spatially-focused version of the speech signal (DiCenso figure 1 and ¶0019, “voices from a crowd or conversations 108 either not directed to user 120 or in a different spatial location than voice from person 104”); and a background noise signal comprising background noise in the first audio signal (DiCenso figure 1 and ¶0019, “may include tens or hundreds of other types of sounds or noises”); wherein the noise reduction circuitry is configured to generate the output audio signal such that, in the output audio signal: a change in volume of the background noise signal is different from a change in volume of the target speech signal by a first volume change difference amount; a change in volume of the interfering speech signal is different from the change in volume of the target speech signal by a second volume change difference amount; and the first volume change difference amount and the second volume change difference amount are independently controllable (DiCenso figure 1 and ¶0028, “a user interface (FIGS. 5-6) allows user 120 to create a personalized or customized auditory experience by setting his/her preferences indicated by symbols 140, 142, 144, 146, for associated sound types to indicate which sounds to amplify, cancel, add or insert, or attenuate, respectively”).

DiCenso does not, however, explicitly teach the noise reduction circuitry comprising neural network circuitry, wherein: the neural network circuitry is configured to: receive multiple audio signals wherein at least two of the multiple audio signals each originate from a different one of the two or more microphones and/or at least one of the multiple audio signals is a beamformed audio signal originating from the two or more microphones; and implement one or more neural network layers trained to perform background noise modification and spatial focusing, such that the neural network circuitry generates, based on the multiple audio signals, two or more neural network outputs.
Jelcicova teaches noise reduction circuitry comprising neural network circuitry (Jelcicova figure 11B and ¶0263, “implementing noise reduction”), wherein: the neural network circuitry is configured to: receive multiple audio signals wherein at least two of the multiple audio signals each originate from a different one of the two or more microphones (Jelcicova figure 11B, neural network NN receiving signals originating from M1 and M2) and/or at least one of the multiple audio signals is a beamformed audio signal originating from the two or more microphones; and implement one or more neural network layers (Jelcicova ¶0005, “neural network comprising at least on layer”) trained to perform background noise modification (Jelcicova ¶0266, “The maximum amount of noise reduction provided by the neural network may be controlled by level, or modulation (e.g. SNR), or a degree of sparsity of the inputs to the neural network. A degree of sparsity may e.g. be represented by a degree of overlap in time and/or frequency of background noise with (target) speech”) and spatial focusing (Jelcicova ¶0263 and figure 11B, beamform filter BFa), such that the neural network circuitry generates, based on the multiple audio signals, two or more neural network outputs (Jelcicova figure 11B, outputs of neural network NN).

Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the known technique of Jelcicova to improve the known ear-worn device of DiCenso to achieve the predictable result of a faster and more efficient learning device.
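In signal terms, claim 1's "independently controllable volume change difference amounts" reduce to applying separate gains to the separated components before re-mixing them. The following NumPy sketch is purely illustrative (the function name, signal names, and dB values are assumptions, not taken from the application or the cited references):

```python
import numpy as np

def remix(target, interfering, noise, interfering_db, noise_db):
    """Re-mix separated components. The two dB offsets play the role of the
    claim's 'volume change difference amounts' relative to the target
    speech; each is controllable independently of the other."""
    g_i = 10.0 ** (interfering_db / 20.0)  # interfering-speech gain vs. target
    g_n = 10.0 ** (noise_db / 20.0)        # background-noise gain vs. target
    return target + g_i * interfering + g_n * noise

# Hypothetical separated signals (1 s at 16 kHz of white noise stand-ins).
rng = np.random.default_rng(0)
t, i_sig, n_sig = (rng.standard_normal(16000) for _ in range(3))

# Attenuate interfering speech by 6 dB and background noise by 12 dB
# relative to the target speech.
out = remix(t, i_sig, n_sig, interfering_db=-6.0, noise_db=-12.0)
```

Because the two offsets are independent arguments, changing one leaves the other untouched, which is the independence the claim recites.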
Regarding claim 3, DiCenso in view of Jelcicova teaches wherein the target speech signal comprises the speech signal to which has been applied a particular spatial focusing pattern, the particular spatial focusing pattern comprising different weights applied to the speech originating from different directions-of-arrival relative to the wearer of the ear-worn device (Jelcicova ¶0049-0050 and ¶0063, “the MVDR beamformer keeps the signals from the target direction (also referred to as the look direction) unchanged, while attenuating sound signals from other directions maximally”).

Regarding claim 5, DiCenso in view of Jelcicova teaches wherein the neural network circuitry is further configured to: receive one or more spatial focusing control inputs indicating the particular spatial focusing pattern (Jelcicova ¶0049-0050 and ¶0063, “the MVDR beamformer keeps the signals from the target direction (also referred to as the look direction) unchanged, while attenuating sound signals from other directions maximally”); and use the one or more spatial focusing control inputs (Jelcicova figure 11B, ¶0263, “target estimate” and “noise estimate”) to generate the two or more neural network outputs such that the target speech signal comprises the speech signal to which has been applied the particular spatial focusing pattern (Jelcicova figure 11B, gain values G(k,m)).
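The MVDR behavior quoted from Jelcicova ¶0063 (look-direction signal kept unchanged, other directions attenuated maximally) is the textbook distortionless-response beamformer, with weights w = R⁻¹d / (dᴴR⁻¹d). A minimal NumPy sketch with made-up values (the steering vector and noise covariance below are hypothetical, not taken from any cited reference):

```python
import numpy as np

def mvdr_weights(noise_cov, steering):
    """Textbook MVDR: minimize output noise power subject to w^H d = 1,
    so the look-direction signal passes through unchanged."""
    r_inv_d = np.linalg.solve(noise_cov, steering)
    return r_inv_d / (steering.conj() @ r_inv_d)

# Hypothetical two-microphone example: a phase-delay steering vector and
# a mildly correlated noise covariance matrix.
d = np.array([1.0, np.exp(-1j * 0.3)])
R = np.array([[1.0, 0.2], [0.2, 1.0]], dtype=complex)
w = mvdr_weights(R, d)

# Distortionless constraint: unity gain toward the look direction.
assert np.isclose(w.conj() @ d, 1.0)
```

Because the constraint pins the look-direction gain at exactly 1, every remaining degree of freedom goes toward suppressing sound from other directions, which is precisely the behavior the examiner maps to "spatial focusing."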
Regarding claim 17, DiCenso in view of Jelcicova teaches wherein: the background noise signal is not spatially-focused (Jelcicova ¶0049-0050 and ¶0063, “the MVDR beamformer keeps the signals from the target direction (also referred to as the look direction) unchanged, while attenuating sound signals from other directions maximally”); and the interfering speech signal does not comprise a portion of the background noise in the first audio signal (DiCenso figure 1 and ¶0019, “voices from a crowd or conversations 108 either not directed to user 120 or in a different spatial location than voice from person 104,” voices from the crowd can be classified separately from other noises).

Regarding claim 18, DiCenso in view of Jelcicova teaches wherein: the background noise signal comprises a first spatially-focused version of the background noise in the first audio signal; and the interfering speech signal comprises the second spatially-focused version of the speech signal plus a second spatially-focused version of the background noise in the first audio signal (DiCenso figure 1 and ¶0019, “voices from a crowd or conversations 108 either not directed to user 120 or in a different spatial location than voice from person 104,” voices from the crowd can be classified separately from other noises).

Regarding claim 19, DiCenso in view of Jelcicova teaches wherein the ear-worn device comprises a hearing aid (Jelcicova ¶0002).

Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over DiCenso (US 2015/0195641) in view of Jelcicova (US 2022/0232331) in further view of Lopatka (US 11711648).

Regarding claim 2, DiCenso in view of Jelcicova does not explicitly teach wherein at least two of the multiple audio signals have different beamformed directional patterns. Lopatka teaches wherein at least two of the multiple audio signals have different beamformed directional patterns (Lopatka figure 4 and Col 4 lines 51-53, event detection circuit 240 receives “Each Beam”).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the known technique of Lopatka to improve the known ear-worn device of DiCenso in view of Jelcicova to achieve the predictable result of improved reliability for tracking sources of acoustic events (Lopatka ¶0015).

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over DiCenso (US 2015/0195641) in view of Jelcicova (US 2022/0232331) in further view of Xu (US 2017/0295439).

Regarding claim 4, DiCenso in view of Jelcicova does not explicitly teach wherein the particular spatial focusing pattern comprises higher weights applied to speech originating from directions-of-arrival towards a front of the wearer of the ear-worn device than weights applied to speech originating from directions-of-arrival towards sides and a back of the wearer. Xu teaches wherein the particular spatial focusing pattern comprises higher weights applied to speech originating from directions-of-arrival towards a front of the wearer of the ear-worn device than weights applied to speech originating from directions-of-arrival towards sides and a back of the wearer (Xu ¶0030, “neural network 408 was trained on synthesized speech in babble noise conditions with the desired speech coming from front”). Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the known technique of Xu to improve the known ear-worn device of DiCenso in view of Jelcicova to achieve the predictable result of improving the SNR of the target speech.

Claims 6-7 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over DiCenso (US 2015/0195641) in view of Jelcicova (US 2022/0232331) in further view of Chen (US 2010/0123785).
Regarding claim 6, DiCenso in view of Jelcicova does not explicitly teach communication circuitry configured to receive, from a processing device, an indication of a user selection of the particular spatial focusing pattern; and control circuitry configured to generate, based at least in part on the indication of the user selection of the particular spatial focusing pattern, the one or more spatial focusing control inputs indicating the particular spatial focusing pattern. Chen teaches communication circuitry configured to receive, from a processing device, an indication of a user selection of the particular spatial focusing pattern; and control circuitry configured to generate, based at least in part on the indication of the user selection of the particular spatial focusing pattern, the one or more spatial focusing control inputs indicating the particular spatial focusing pattern (Chen ¶0020, “The GUI 20 further receives a selection 18 of at least one of the plurality of audio sources from a user. The GUI 20 provides the selection to the signal processor 24 for aiming the audio beamforming toward the selected audio source 30 as suggested by the dashed line”). Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the known technique of Chen to improve the known device of DiCenso in view of Jelcicova to achieve the predictable result of user-controllable audio reproduction.

Regarding claim 7, DiCenso in view of Jelcicova in further view of Chen teaches the processing device in communication with the ear-worn device and configured to: display a graphical user interface including options for different spatial focusing patterns; and receive the user selection of the particular spatial focusing pattern (Chen ¶0020, “The GUI 20 further receives a selection 18 of at least one of the plurality of audio sources from a user. The GUI 20 provides the selection to the signal processor 24 for aiming the audio beamforming toward the selected audio source 30 as suggested by the dashed line”).

Regarding claim 15, DiCenso in view of Jelcicova in further view of Chen teaches wherein the ear-worn device is further configured to receive a user selection to turn spatial focusing off (Chen ¶0020, “The GUI 20 further receives a selection 18 of at least one of the plurality of audio sources from a user. The GUI 20 provides the selection to the signal processor 24 for aiming the audio beamforming toward the selected audio source 30 as suggested by the dashed line.” When the user selects a second audio source, it may be considered turning off spatial filtering for the first audio source).

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over DiCenso (US 2015/0195641) in view of Jelcicova (US 2022/0232331) in further view of Song (US 2018/0330726).

Regarding claim 8, DiCenso in view of Jelcicova does not explicitly teach wherein the interfering speech signal comprises a remainder when the target speech signal is subtracted from the speech signal. Song teaches wherein the interfering speech signal comprises a remainder when the target speech signal is subtracted from the speech signal (Song figure 1, step 104). Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the known technique of Song to improve the known ear-worn device of DiCenso in view of Jelcicova to achieve the predictable result of improving speech recognition efficiency (Song ¶0068).

Claims 10-14 are rejected under 35 U.S.C. 103 as being unpatentable over DiCenso (US 2015/0195641) in view of Jelcicova (US 2022/0232331) in further view of Chhetri (US 11646009). Regarding claim 10, DiCenso in view of Jelcicova does not explicitly teach wherein the two or more neural network outputs comprise two different masks.
Chhetri teaches wherein the two or more neural network outputs comprise two different masks (Chhetri figure 5B, real mask data 524 and imaginary mask data 526). Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the known technique of Chhetri to improve the known ear-worn device of DiCenso in view of Jelcicova to achieve the predictable result of improved convergence in training (Chhetri Col 26 lines 8-23).

Regarding claim 11, DiCenso in view of Jelcicova in further view of Chhetri teaches mixing circuitry configured to: generate the output audio signal by mixing a combination of audio signals (Chhetri figure 5B); or generate the output audio signal by mixing a combination of masks; or wide dynamic range compression (WDRC) circuitry comprising multiple WDRC pipelines configured to generate the output audio signal by performing WDRC on the combination of audio signals.

Regarding claim 12, DiCenso in view of Jelcicova in further view of Chhetri teaches wherein the mixing circuitry is further configured to: receive a first volume change control input and a second volume change control input; and perform the mixing using the first volume change control input and the second volume change control input such that the first volume change difference amount is controlled, at least in part, by the first volume change control input and the second volume change difference amount is controlled, at least in part, by the second volume change control input (DiCenso figure 1 and ¶0028, “a user interface (FIGS. 5-6) allows user 120 to create a personalized or customized auditory experience by setting his/her preferences indicated by symbols 140, 142, 144, 146, for associated sound types to indicate which sounds to amplify, cancel, add or insert, or attenuate, respectively”).
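Chhetri's "real mask data 524" and "imaginary mask data 526" match what the speech-enhancement literature calls a complex ratio mask: the two network outputs are combined into a single complex gain per time-frequency bin and multiplied with the noisy STFT. A hedged sketch of that reading (the shapes and variable names are hypothetical, not from Chhetri):

```python
import numpy as np

def apply_complex_mask(noisy_stft, mask_real, mask_imag):
    """Combine the two neural-network outputs (real and imaginary mask
    components) into one complex mask and apply it per time-frequency bin."""
    mask = mask_real + 1j * mask_imag
    return mask * noisy_stft  # element-wise complex multiplication

# Hypothetical shapes: 257 frequency bins x 100 frames.
rng = np.random.default_rng(1)
Y = rng.standard_normal((257, 100)) + 1j * rng.standard_normal((257, 100))
Mr = rng.standard_normal((257, 100))  # "real mask data"
Mi = rng.standard_normal((257, 100))  # "imaginary mask data"
S_hat = apply_complex_mask(Y, Mr, Mi)
```

Unlike a real-valued magnitude mask, the complex form can also correct phase, which is consistent with the training-convergence benefit the examiner cites from Chhetri.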
Regarding claim 13, DiCenso in view of Jelcicova in further view of Chhetri teaches communication circuitry configured to receive the first volume change control input and the second volume change control input from a processing device; memory configured to store the first volume change control input and the second volume change control input; and control circuitry configured to retrieve the first volume change control input and the second volume change control input from the memory and output the first volume change control input and the second volume change control input to the mixing circuitry (DiCenso figure 1 and ¶0028, “a user interface (FIGS. 5-6) allows user 120 to create a personalized or customized auditory experience by setting his/her preferences indicated by symbols 140, 142, 144, 146, for associated sound types to indicate which sounds to amplify, cancel, add or insert, or attenuate, respectively.” It is inherent that upon selecting the preferences, they will be stored in the memory).

Regarding claim 14, DiCenso in view of Jelcicova in further view of Chhetri teaches control circuitry configured to: generate the first volume change control input based on a level of background noise in the first audio signal; and generate the second volume change control input based on a level of interfering speech in the first audio signal (DiCenso figure 1 and ¶0028, “a user interface (FIGS. 5-6) allows user 120 to create a personalized or customized auditory experience by setting his/her preferences indicated by symbols 140, 142, 144, 146, for associated sound types to indicate which sounds to amplify, cancel, add or insert, or attenuate, respectively.” It is inherent that upon selecting the preferences, they will be stored in the memory. The user can make the adjustment based on what is heard).

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over DiCenso (US 2015/0195641) in view of Jelcicova (US 2022/0232331) in further view of Chan (US 8175291).
Regarding claim 20, DiCenso in view of Jelcicova does not explicitly teach wherein the noise reduction circuitry is implemented on a chip. Chan teaches wherein the noise reduction circuitry is implemented on a chip (Chan Col 23 lines 4-13, “any other noise reduction elements of the device, such as an implementation of a single-channel noise reduction module (which may be included, for example, within a baseband portion of a mobile station modem (MSM) chip or chipset)”). Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to use the known technique of Chan to improve the known ear-worn device of DiCenso in view of Jelcicova to achieve the predictable result of increased processing efficiency.

Allowable Subject Matter

Claim 9 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, because the closest prior art, either alone or in combination, fails to anticipate or render obvious the claimed limitation of “wherein: the neural network circuitry is configured to use: a first subset of the one or more neural network layers to generate a first of the two or more neural network outputs; and a second subset of the one or more neural network layers to generate a second of the two or more neural network outputs; and the noise reduction circuitry is configured to obtain the speech signal and/or the background noise signal from the first of the two or more neural network outputs, and to obtain the target speech signal and/or the interfering speech signal from the second of the two or more neural network outputs” in combination with all other limitations in the claim as defined by the applicant.
Claim 16 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, because the closest prior art, either alone or in combination, fails to anticipate or render obvious the claimed limitation of “wherein: at least one of the two or more neural network outputs comprise: the speech signal; a mask configured to generate the speech signal; the background noise signal; a mask configured to generate the background noise signal; the target speech signal; a mask configured to generate the target speech signal; the interfering speech signal; a mask configured to generate the interfering speech signal” in combination with all other limitations in the claim as defined by the applicant.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to NORMAN YU, whose telephone number is (571) 270-7436. The examiner can normally be reached Mon - Fri, 11am-7pm. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ahmad Matar, can be reached at 571-272-7488. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Any response to this action should be mailed to: Commissioner of Patents and Trademarks, P.O. Box 1450, Alexandria, VA 22313-1450, or faxed to (571) 273-8300. For formal communications intended for entry and for informal or draft communications, please label “PROPOSED” or “DRAFT”. Hand-delivered responses should be brought to: Customer Service Window, Randolph Building, 401 Dulany Street, Alexandria, VA 22314.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/NORMAN YU/
Primary Examiner, Art Unit 2693

Prosecution Timeline

Aug 05, 2024: Application Filed
Oct 16, 2024: Response after Non-Final Action
Mar 05, 2026: Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12604123: APPARATUS AND VEHICULAR APPARATUS INCLUDING THE SAME (granted Apr 14, 2026; 2y 5m to grant)
Patent 12598409: IN-EAR WEARABLE DEVICE (granted Apr 07, 2026; 2y 5m to grant)
Patent 12594882: AUTOMOTIVE SOUND AMPLIFICATION (granted Apr 07, 2026; 2y 5m to grant)
Patent 12593165: ACOUSTIC INPUT-OUTPUT DEVICES (granted Mar 31, 2026; 2y 5m to grant)
Patent 12581238: BINDING BAND ASSEMBLY FOR HEADSET AND HEADSET (granted Mar 17, 2026; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 88%
With Interview (+13.5%): 99%
Median Time to Grant: 2y 1m
PTA Risk: Low

Based on 598 resolved cases by this examiner. Grant probability derived from career allow rate.
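The headline projections are consistent with the career counts. A quick arithmetic check (assuming, as one plausible reading, that the interview lift is applied multiplicatively to the baseline grant probability):

```python
granted, resolved = 525, 598      # career counts reported for this examiner
allow_rate = granted / resolved   # baseline grant probability, shown as 88%
interview_lift = 0.135            # the reported +13.5% interview lift

# Multiplicative lift, capped at 100%.
with_interview = min(allow_rate * (1 + interview_lift), 1.0)

print(f"career allow rate: {allow_rate:.1%}")
print(f"with interview:    {with_interview:.1%}")  # lands near the 99% shown
```

525/598 is about 87.8%, and applying the 13.5% relative lift lands just under 100%, consistent with the 88% and 99% figures displayed above.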
