DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1 and 12 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Shumard et al. (US2021/0058702).
Consider claims 1 and 12. Shumard teaches an audio signal processing apparatus (600, fig. 6), comprising: an audio signal conversion unit configured to convert a first audio signal (i.e., the audio signal detected by microphones Mic1-2, fig. 6; para 79: the microphones included in the array microphone may be, for example, MEMS transducers which are inherently omnidirectional, other types of omnidirectional microphones, electret or condenser microphones, or other types of omnidirectional transducers or sensors) into a first unidirectional audio signal (e.g., the directional output; para 65: referring to fig. 6, sum and difference beamformer 600 may be configured to combine audio signals captured by a given set or pair of microphones 602 and generate a combined output signal for said microphone pair that has a directional polar pattern, in accordance with embodiments; more specifically, beamformer 600 may be configured to use appropriate sum and difference techniques on each set of first and second microphones 602 arranged orthogonally to a first axis, or front face, of an array microphone, such as, e.g., array microphone 100 in fig. 1, to form cardioid elements with narrowed lobes (or sound pick-up patterns), for example, as compared to the full omnidirectional polar pattern of the individual microphones 602; para 67: beamformer 600 further includes a correction component 608 for correcting the differential output generated by the difference component 606; the correction component 608 may be configured to correct the differential output for a gradient response caused by the difference calculation, e.g., a 6 dB per octave slope in the frequency response of the microphone pair; in order to generate a first-order polar pattern (e.g., cardioid) for the microphone pair over a broad frequency range, the differential output must be corrected so that it has the same magnitude as the summation output; also see para 95-95, fig. 11a-b and 12: array microphone 100 forms a unidirectional or cardioid polar pattern), wherein the first audio signal is from a non-directional microphone (i.e., an omnidirectional microphone, para 79, fig. 6), and the non-directional microphone collects first sound to generate the first audio signal (i.e., the audio signal detected by microphones Mic1-2, fig. 6).
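For illustration only, the sum-and-difference technique cited from paras 65-67 of Shumard can be sketched as follows. The two-centimeter microphone spacing, the variable names, and the equalization expression are assumptions made for this example and are not drawn from Shumard; the sketch merely shows how a sum, a difference, and a gradient correction of two omnidirectional signals can yield a cardioid (unidirectional) output.

```python
import numpy as np

def cardioid_from_omni_pair(x1, x2, fs, d=0.02, c=343.0):
    """Illustrative sum-and-difference beamforming of two omnidirectional
    microphone signals into a cardioid (unidirectional) output.

    x1, x2 : equal-length time-domain signals from the omni pair
             (x1 assumed to be the front microphone)
    fs     : sample rate in Hz
    d      : assumed microphone spacing in metres (hypothetical)
    c      : speed of sound in m/s
    """
    X1 = np.fft.rfft(x1)
    X2 = np.fft.rfft(x2)
    f = np.fft.rfftfreq(len(x1), 1.0 / fs)
    k = 2.0 * np.pi * f / c                     # wavenumber per frequency bin

    summation = X1 + X2                         # near-omnidirectional component
    difference = X1 - X2                        # gradient component (~6 dB/octave rise)

    # Correct the gradient response so the difference term matches the
    # summation term in magnitude before combining (cf. Shumard para 67).
    eq = np.zeros_like(X1)
    nz = f > 0                                  # skip DC; spatial aliasing not handled
    eq[nz] = 1.0 / (2j * np.sin(k[nz] * d / 2.0))
    corrected = difference * eq

    # Half-sum plus equalized difference approximates a first-order cardioid.
    out = 0.5 * (0.5 * summation + corrected)
    return np.fft.irfft(out, n=len(x1))
```

Under the cited portions of Shumard, analogous summation, difference, correction, and combination operations are performed by components 604, 606, 608, and 610 of beamformer 600.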
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Shumard et al. (US2021/0058702) in view of Gigandet (US2022/0240027).
Consider claim 2. Shumard does not explicitly teach that the audio signal conversion unit is configured with a deep neural network.
Gigandet teaches an audio signal conversion unit configured with a deep neural network (para 0059: system 100 may analyze sound from each of the multidirectional audio input signals 418 and, within that sound, may recognize (e.g., based on voice recognition technologies, machine learning or artificial intelligence technologies, etc.) the voice of listener 312).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Gigandet into the teachings of Shumard in order to present users with improved sound quality when an audio input signal is received from an external microphone assembly instead of, or in addition to, audio input signals captured by one or more built-in microphones of the hearing device.
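As an illustration only, an "audio signal conversion unit configured with a deep neural network" in the sense of claim 2 could be modeled along the lines below. The layer sizes, the log-spectral feature representation, and the class and helper names are hypothetical assumptions for this sketch and do not come from Shumard or Gigandet.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class OmniToDirectionalDNN:
    """Toy feed-forward network mapping per-frame spectral features of the
    non-directional (omni) microphone signal to an estimate of the
    corresponding unidirectional signal. Purely illustrative; a real
    conversion unit would be trained rather than randomly initialized."""

    def __init__(self, n_bins=257, hidden=512):
        self.w1 = rng.standard_normal((n_bins, hidden)) * 0.01
        self.b1 = np.zeros(hidden)
        self.w2 = rng.standard_normal((hidden, hidden)) * 0.01
        self.b2 = np.zeros(hidden)
        self.w3 = rng.standard_normal((hidden, n_bins)) * 0.01
        self.b3 = np.zeros(n_bins)

    def forward(self, omni_frame):
        # omni_frame: log-magnitude spectrum (length n_bins) of one frame
        # captured by the omnidirectional microphone.
        h1 = relu(omni_frame @ self.w1 + self.b1)
        h2 = relu(h1 @ self.w2 + self.b2)
        return h2 @ self.w3 + self.b3   # estimated unidirectional-frame spectrum
```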
Claims 3-4 are rejected under 35 U.S.C. 103 as being unpatentable over Shumard et al. (US2021/0058702) in view of Gigandet (US2022/0240027) as applied to claims 1-2 above, and further in view of Komeilipoor (WO 2021/237368).
Consider claim 3. Shumard further teaches extracting a first acoustic feature amount from an audio signal, wherein the audio signal is from the non-directional microphone (i.e., omnidirectional microphone Mic 1); extracting a second acoustic feature amount from at least one second unidirectional audio signal (e.g., all sounds would have been picked up by Mic 1-2 regardless of whether they are characterized as non-directional or unidirectional sounds), wherein the at least one second unidirectional audio signal is from a unidirectional microphone (i.e., the combination of at least two omnidirectional microphones, Mic 1-2); and learning, based on the extracted first acoustic feature amount and the extracted second acoustic feature amount, to convert the first audio signal into the first unidirectional audio signal (e.g., the directional output of combiner 610).
Shumard in view of Gigandet does not expressly teach that, at a time of a first training process on the deep neural network, the deep neural network is configured to: extract a first acoustic feature amount from at least one learning audio signal, wherein the at least one learning audio signal is from the non-directional microphone; and extract a second acoustic feature amount from at least one second unidirectional audio signal, wherein the at least one second unidirectional audio signal is from a unidirectional microphone, and the unidirectional microphone collects second sound to generate the at least one second unidirectional audio signal.
Komeilipoor teaches extracting a first acoustic feature amount from at least one learning audio signal, wherein the at least one learning audio signal is from the non-directional microphone, and extracting a second acoustic feature amount from at least one second unidirectional audio signal, wherein the at least one second unidirectional audio signal is from a unidirectional microphone (para 13: the solution should provide a speech separation and enhancement system that comprises the combination of binaural beamforming and deep learning using microphone arrays comprising directional and omnidirectional microphones on each of a pair of hearing devices, in addition to an input from external microphones, in order to improve the quality and reliability of speech enhancement using both beamforming and neural network approaches).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Komeilipoor into the teachings of Shumard and Gigandet in order to combine binaural beamforming and deep learning using microphone arrays comprising directional and omnidirectional microphones, thereby improving the quality and reliability of speech enhancement using both beamforming and neural network approaches.
Consider claim 4. Komeilipoor further teaches that the audio signal processing apparatus is configured with a convolutional neural network, and that the extracted first acoustic feature amount and the extracted second acoustic feature amount correspond to information of a plurality of layers of the convolutional neural network (paras 85-86).
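For illustration of how feature amounts may "correspond to information of a plurality of layers of the convolutional neural network," the following sketch uses intermediate layer activations of a small 1-D CNN as the first and second acoustic feature amounts. The architecture, channel counts, and use of PyTorch are assumptions made for this example and are not taken from Komeilipoor paras 85-86.

```python
import torch
import torch.nn as nn

class AcousticFeatureCNN(nn.Module):
    """Illustrative 1-D CNN whose per-layer outputs play the role of the
    claimed first and second acoustic feature amounts."""

    def __init__(self):
        super().__init__()
        self.layer1 = nn.Sequential(nn.Conv1d(1, 16, kernel_size=9, padding=4), nn.ReLU())
        self.layer2 = nn.Sequential(nn.Conv1d(16, 32, kernel_size=9, padding=4), nn.ReLU())

    def forward(self, waveform):
        # waveform: (batch, 1, samples) audio from either microphone type.
        feat1 = self.layer1(waveform)   # shallow-layer feature amount
        feat2 = self.layer2(feat1)      # deeper-layer feature amount
        return feat1, feat2

# Example: extract feature amounts from an omni-microphone training signal
# and from a unidirectional-microphone training signal (random stand-ins here).
cnn = AcousticFeatureCNN()
omni_sig = torch.randn(1, 1, 16000)
unidir_sig = torch.randn(1, 1, 16000)
first_feature_amount, _ = cnn(omni_sig)
_, second_feature_amount = cnn(unidir_sig)
```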
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Shumard et al. (US2021/0058702) in view of Gigandet (US2022/0240027) and Komeilipoor (WO 2021/237368), as applied to claims 1-4 above, and further in view of Bhadrik et al. (KR 20210099059A).
Consider claim 5. Bhadrik teaches that, at a time of a second training process on the convolutional neural network, the convolutional neural network is configured to learn to distinguish between the at least one learning audio signal (e.g., the historical data; see fig. 8: figure 8 shows an evaluation system 800 according to an embodiment; system 800 may implement a self-consistent model to evaluate and train a composite model composed of various machine learning blocks; the accuracy of a composite model is affected by each of the machine learning blocks in it; to evaluate each of these blocks, the system 800 may compare the outputs, introduce a major source of error, and then the models may update the lossy blocks; system 800 includes sensors 818 (which may serve as data sources), other data sources 808-1 and 808-2, data source models 891-0/1/2, parameter estimators 889, subject model 887-0/1, comparison section 885, and evaluation section 883; the data sources may include at least one primary or direct data source (sensor 818) and one or more secondary data sources (808-1 and 808-2 in the illustrated example); primary data source 818 may provide data for a value that may be inferred from secondary data sources 808-1/808-2 and serve as a reference value; data sources 818, 808-1, 808-2 may provide current data as well as historical data; in the illustrated embodiment, the primary data source 818 may be a sensor; in some embodiments, the primary data source 818 may be a sensor making a biophysical measurement of the subject) and the at least one second unidirectional audio signal (e.g., the collected or current data; fig. 8 and its descriptions).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Bhadrik into the teachings of Shumard in view of Gigandet and Komeilipoor in order to train a neural network by providing inputs to an untrained neural network to produce predicted outputs, comparing the predicted outputs to expected outputs, and taking into account the differences between the predicted and expected outputs.
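For illustration only, one reading of a "second training process" in which the convolutional neural network learns to distinguish the learning audio signal from the second unidirectional audio signal is a simple binary classification objective, sketched below. The network layout, loss function, optimizer settings, and batch format are assumptions for this example and are not drawn from Bhadrik.

```python
import torch
import torch.nn as nn

# Small CNN classifier: label 0 = frame from the non-directional microphone
# (learning audio signal), label 1 = frame from the unidirectional microphone.
classifier = nn.Sequential(
    nn.Conv1d(1, 8, kernel_size=15, stride=4), nn.ReLU(),
    nn.Conv1d(8, 16, kernel_size=15, stride=4), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(16, 1),
)
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-3)

def training_step(omni_batch, unidir_batch):
    # omni_batch, unidir_batch: (batch, 1, samples) tensors of recorded audio.
    x = torch.cat([omni_batch, unidir_batch], dim=0)
    y = torch.cat([torch.zeros(omni_batch.shape[0], 1),
                   torch.ones(unidir_batch.shape[0], 1)], dim=0)
    optimizer.zero_grad()
    loss = loss_fn(classifier(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()

# Example call with random stand-in batches.
print(training_step(torch.randn(4, 1, 16000), torch.randn(4, 1, 16000)))
```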
Claims 6-8 are rejected under 35 U.S.C. 103 as being unpatentable over Shumard et al. (US2021/0058702) in view of Gigandet (US2022/0240027) as applied to claim 1 above, and further in view of Yang et al. (US2023/0185518).
Consider claim 6. Shumard teaches an electronic device, the electronic device comprising: a non-directional microphone configured to collect sound to generate an audio signal (fig. 6, microphones Mic1-2; para 79: the microphones included in the array microphone may be, for example, MEMS transducers which are inherently omnidirectional, other types of omnidirectional microphones, electret or condenser microphones, or other types of omnidirectional transducers or sensors); and an audio signal conversion unit (fig. 6: 604, 606, 608 and 610) configured to convert the generated audio signal into a unidirectional audio signal (e.g., the directional output; see paras 65 and 67 as quoted in the rejection of claims 1 and 12 above; also see para 95-95, fig. 11a-b and 12: array microphone 100 forms a unidirectional or cardioid polar pattern).
Shumard in view of Gigandet does not clearly teach that the electronic device includes an image capturing function.
Yang teaches an electronic device including an image capturing function (fig. 9, camera 1 to N [193]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Yang into the teachings of Shumard in view of Gigandet in order to enhance a sound signal of a target sound source based on a zoom operation performed by the user on the image content.
Consider claim 7. Yang further teaches that the electronic device includes a smartphone (see fig. 9).
Consider claim 8. Shumard further teaches that the audio signal conversion unit is further configured to convert a mixed signal (e.g., by combiner 610, fig. 6) into a third unidirectional audio signal (e.g., the directional output), wherein the mixed signal includes a second audio signal and a third audio signal, the second audio signal corresponds to a signal from the first microphone (e.g., Mic 1), the first microphone collects third sound to generate the second audio signal, the third audio signal corresponds to a signal from the second microphone (e.g., Mic 2), and the second microphone collects fourth sound to generate the third audio signal. It is noted that all sounds would have been picked up by Mic 1-2 regardless of whether they are characterized as non-directional or unidirectional sounds.
Yang teaches that the smartphone includes, as the non-directional microphone, a first microphone on a top side and a second microphone on a bottom side of the smartphone (see fig. 4 of Yang, reproduced below).
[media_image1.png: fig. 4 of Yang, reproduced in greyscale]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Yang into the teachings of Shumard in view of Gigandet in order to enhance a sound signal of a target sound source based on a zoom operation performed by the user on the image content.
Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Shumard et al. (US2021/0058702) in view of Yang et al. (US2023/0185518).
Consider claim 13. Shumard teaches an electronic device, the electronic device comprising: a non-directional microphone configured to collect sound to generate an audio signal (fig. 6, microphones Mic1-2; para 79: the microphones included in the array microphone may be, for example, MEMS transducers which are inherently omnidirectional, other types of omnidirectional microphones, electret or condenser microphones, or other types of omnidirectional transducers or sensors); and an audio signal conversion unit (fig. 6: 604, 606, 608 and 610) configured to convert the generated audio signal into a unidirectional audio signal (e.g., the directional output; see paras 65 and 67 as quoted in the rejection of claims 1 and 12 above; also see para 95-95, fig. 11a-b and 12: array microphone 100 forms a unidirectional or cardioid polar pattern).
Shumard does not clearly teach that the electronic device includes an image capturing function.
Yang teaches an electronic device including an image capturing function (fig. 9, camera 1 to N [193]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Yang into the teachings of Shumard in order to enhance a sound signal of a target sound source based on a zoom operation performed by the user on the image content.
Allowable Subject Matter
Claims 9-11 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Response to Arguments
Applicant’s arguments with respect to claim(s) 1-13 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DUC M NGUYEN, whose telephone number is (571) 272-7503. The examiner can normally be reached from 6:30 AM to 3:45 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Duc M. Nguyen, can be reached at 571-272-7503. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
DUC M. NGUYEN
Supervisory Patent Examiner
Art Unit 2691
/DUC NGUYEN/Supervisory Patent Examiner, Art Unit 2691