Prosecution Insights
Last updated: April 19, 2026
Application No. 18/729,993

UPMIXING SYSTEMS AND METHODS FOR EXTENDING STEREO SIGNALS TO MULTI-CHANNEL FORMATS

Non-Final OA: §101, §103
Filed
Jul 18, 2024
Examiner
ANWAH, OLISA
Art Unit
2692
Tech Center
2600 — Communications
Assignee
Zynaptiq GmbH
OA Round
1 (Non-Final)
Grant Probability: 89% (Favorable)
OA Rounds: 1-2
To Grant: 2y 1m
With Interview: 93%

Examiner Intelligence

Career Allow Rate: 89% (above average): 1036 granted / 1162 resolved; +27.2% vs TC avg
Interview Lift: +4.2% (minimal; based on resolved cases with interview)
Avg Prosecution: 2y 1m (fast prosecutor); 38 currently pending
Total Applications: 1200 across all art units (career history)
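The headline figures above are simple ratios of the counts shown on this page. As a sanity check (note: the 62% Tech Center average below is implied by the +27.2% delta, not stated directly here):

```python
# Illustrative arithmetic only: deriving the dashboard's headline figures
# from the counts shown on this page.

granted = 1036    # from "1036 granted / 1162 resolved"
resolved = 1162
tc_avg = 0.62     # implied Tech Center 2600 average (89.2% - 27.2%), an assumption

allow_rate = granted / resolved
print(f"Career allow rate: {allow_rate:.1%}")             # ~89.2%, shown as 89%
print(f"Lift vs TC average: {allow_rate - tc_avg:+.1%}")  # ~+27.2%

# The "with interview" probability simply adds the reported +4.2% lift.
with_interview = allow_rate + 0.042
print(f"With interview: {with_interview:.1%}")            # ~93.4%, shown as 93%
```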

Statute-Specific Performance

§101: 4.5% (-35.5% vs TC avg)
§103: 42.0% (+2.0% vs TC avg)
§102: 29.1% (-10.9% vs TC avg)
§112: 5.0% (-35.0% vs TC avg)
Tech Center averages are estimates. Based on career data from 1162 resolved cases.

Office Action

Rejections: §101, §103
DETAILED ACTION

1. Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Information Disclosure Statement

2. The information disclosure statements submitted are being considered by the examiner.

Claim Rejections - 35 USC § 101

3. 35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

4. Claims 13-23 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. More specifically, the instant rejection is applicable because the specification does not explicitly exclude transitory media from its definition of a computer-readable medium.

Claim Objections

5. Claim 1 is objected to because of the following informalities: --within the two-dimensional positional distribution-- should be changed to “within the two-dimensional positional distribution plotting frequency versus normalized magnitude”. Appropriate correction is required.

Claim Rejections - 35 USC § 103

6. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

7. Claims 1-6, 8-18 and 20-23 are rejected under 35 U.S.C. 103 as being unpatentable over Reams et al., U.S. Patent Application Publication No. 2006/0093164 (hereinafter Reams) in view of Avendano et al., U.S. Patent Application Publication No. 2008/0247555 (hereinafter Avendano).

Regarding claim 1, Reams discloses a method comprising: receiving a stereo signal containing a left input channel and a right input channel (from paragraph 0080, see System 800 receives a left channel stereo signal L(T) and a right channel stereo signal R(T) at time-frequency analysis units 802 and 804); transforming, based at least on a Fast Fourier Transform, windowed overlapping sections of the stereo signal containing the left input channel and the right input channel to generate a set of frequency bins for the left input channel and the right input channel (from paragraph 0080, see which convert the time domain signals into frequency domain signals. These time-frequency analysis units could be an appropriate filter bank, such as a finite impulse response (FIR) filter bank, a quadrature mirror filter (QMF) bank, a discrete Fourier transform (DFT), a time-domain aliasing cancellation (TDAC) filter bank, or other suitable filter bank. The output from time-frequency analysis units 802 and 804 are a set of frequency domain values covering a sufficient frequency range of the human auditory system, such as a 0 to 20 kHz frequency range where the analysis filter bank sub-band bandwidths could be processed to approximate psycho-acoustic critical bands, equivalent rectangular bandwidths, or some other perceptual characterization. Likewise, other suitable numbers of frequency bands and ranges can be used); generating a two-dimensional positional distribution plotting frequency versus normalized magnitude for the transformed stereo signal to identify a position in a left-right plane of each frequency bin in the set of frequency bins (from paragraph 0088, see For each frequency band, the normalized lateral coordinate and depth coordinate form a 2-dimensional vector (LAT(F), DEP(F)) which is input into a 2-dimensional channel map); identifying multiple portions of the transformed stereo signal to be extracted, wherein each portion of the transformed stereo signal is identified based on a respective region of interest within the two-dimensional positional distribution (from paragraph 0085, see Relevant inter-channel spatial cues are extracted for each frequency band of the M channel input signals, and a spatial position vector is generated for each frequency band. This spatial position vector is interpreted as the perceived source location for that frequency band for a listener under ideal listening conditions. Each channel filter is then generated such that the resulting spatial position for that frequency element in the up-mixed N channel output signal is reproduced consistently with the inter-channel cues. Estimates of the inter-channel level differences (ICLD's) and inter-channel coherence (ICC) are used as the inter-channel cues to create the spatial position vector); applying a filtering function to each respective region of interest to extract the multiple identified portions of the transformed stereo signal, wherein the filtering function attenuates the transformed stereo signal outside of the respective region of interest (from paragraph 0088, see FIGS. 10A through 10E, to produce a filter value H.sub.i(F) for each channel i. These channel filters H.sub.i(F) for each channel i are output from the filter generation unit, such as filter generation unit 606 of FIG. 6, filter generation unit 706 of FIG. 7, and filter generation unit 806 of FIG. 8); and transforming each of the multiple identified portions of the transformed stereo signal into a time domain output signal to generate an upmixed multi-channel time domain audio signal, wherein the upmixed multi-channel time domain audio signal is used for playback in a multi-channel sound field via a plurality of output components (from paragraph 0078, see FIG. 8 is a diagram of a system 800 for up-mixing data from M channels to N channels in accordance with an exemplary embodiment of the present invention. System 800 converts stereo time domain data into 7.1 channel time domain data).

Still on the issue of claim 1, Reams does not teach the FFT is short-time. All the same, Avendano discloses this feature (from abstract, see short-time). Therefore, it would have been obvious to one of ordinary skill in the art to modify Reams wherein the FFT is short-time as taught by Avendano. This modification would have improved flexibility by allowing adjustment of window size as suggested by Avendano.

Regarding claim 2, Reams discloses the identified position of each frequency bin is expressed as an angle relative to a stereo center line (see Figures 10A through 10E).

Regarding claim 3, Reams discloses the multiple identified portions of the transformed stereo signal are extracted without regard to individual sound sources within the stereo signal (from paragraph 0085, see Relevant inter-channel spatial cues are extracted for each frequency band of the M channel input signals, and a spatial position vector is generated for each frequency band. This spatial position vector is interpreted as the perceived source location for that frequency band for a listener under ideal listening conditions. Each channel filter is then generated such that the resulting spatial position for that frequency element in the up-mixed N channel output signal is reproduced consistently with the inter-channel cues. Estimates of the inter-channel level differences (ICLD's) and inter-channel coherence (ICC) are used as the inter-channel cues to create the spatial position vector).

Regarding claim 4, Reams discloses the multiple portions of the transformed stereo signals are identified based solely on a range of locations defined by frequency and positional coordinates relative to the two-dimensional positional distribution (see Figures 10A through 10E).

Regarding claim 5, Reams discloses a number of the regions of interest is based on a number of the plurality of output components (from paragraph 0085, see Relevant inter-channel spatial cues are extracted for each frequency band of the M channel input signals, and a spatial position vector is generated for each frequency band. This spatial position vector is interpreted as the perceived source location for that frequency band for a listener under ideal listening conditions. Each channel filter is then generated such that the resulting spatial position for that frequency element in the up-mixed N channel output signal is reproduced consistently with the inter-channel cues. Estimates of the inter-channel level differences (ICLD's) and inter-channel coherence (ICC) are used as the inter-channel cues to create the spatial position vector).

Regarding claim 6, Reams discloses generating a visual representation comprising the positions of the frequency bins in the set of frequency bins and positions of each of the identified portions of the transformed stereo signal in the multi-channel sound field, wherein the visual representation is generated to facilitate analysis of relative positions of the frequency bins and the identified portions of the transformed stereo signal in the multi-channel sound field; and providing the visual representation for display in a user interface (see Figures 10A through 10E).

Regarding claim 8, Reams discloses providing the upmixed multi-channel time domain audio signal to an audio playback device (from paragraph 0094, see speakers) for playback.

Regarding claim 9, Reams discloses determining, for each frequency bin in the set of frequency bins, a magnitude, a phase, or combinations thereof, wherein the magnitude is indicative of a frequency amplitude of a frequency bin (from paragraph 0086, see In the exemplary embodiment shown in system 900, sub-band magnitude or energy components are used to estimate inter-channel level differences, and sub-band phase angle components are used to estimate inter-channel coherence. The left and right frequency domain inputs L(F) and R(F) are converted into a magnitude or energy component and phase angle component where the magnitude/energy component is provided to summer 902 which computes a total energy signal T(F) which is then used to normalize the magnitude/energy values of the left M.sub.L(F) and right channels M.sub.R(F) for each frequency band by dividers 904 and 906, respectively. A normalized lateral coordinate signal LAT(F) is then computed from M.sub.L(F) and M.sub.R(F), where the normalized lateral coordinate for a frequency band is computed as: LAT(F)=M.sub.L(F)*X.sub.MIN+M.sub.R(F)*X.sub.MAX).

Regarding claim 10, Reams discloses calculating, based at least on the magnitude for each frequency bin in the set of frequency bins, a spectral summation (from paragraph 0086, see In the exemplary embodiment shown in system 900, sub-band magnitude or energy components are used to estimate inter-channel level differences, and sub-band phase angle components are used to estimate inter-channel coherence. The left and right frequency domain inputs L(F) and R(F) are converted into a magnitude or energy component and phase angle component where the magnitude/energy component is provided to summer 902 which computes a total energy signal T(F) which is then used to normalize the magnitude/energy values of the left M.sub.L(F) and right channels M.sub.R(F) for each frequency band by dividers 904 and 906, respectively. A normalized lateral coordinate signal LAT(F) is then computed from M.sub.L(F) and M.sub.R(F), where the normalized lateral coordinate for a frequency band is computed as: LAT(F)=M.sub.L(F)*X.sub.MIN+M.sub.R(F)*X.sub.MAX).

Regarding claim 11, the combination of Reams and Avendano discloses the stereo signal containing the left input channel and the right input channel is a recorded signal received from a database (from paragraph 0003 of Avendano, see While surround multi-speaker systems are already popular in the home and desktop settings, the number of multi-channel audio recordings available is still limited. Recent movie soundtracks and some musical recordings are available in multi-channel format, but most music recordings are still mixed into two channels and playback of this material over a multi-channel system poses several questions).

Regarding claim 12, the combination of Reams and Avendano discloses the stereo signal containing the left input channel and the right input channel is a live-stream signal received in near real-time from a live event (from paragraph 0030 of Avendano, see The second class, or live recording, is done when the number of instruments is large such as in a symphony orchestra or a jazz big band, and/or the performance is captured live).

Regarding claim 13, Reams discloses a computer-readable medium carrying instructions that, when executed by at least one processor, cause a computing system to perform operations comprising: receiving a stereo signal containing a left input channel and a right input channel (from paragraph 0080, see System 800 receives a left channel stereo signal L(T) and a right channel stereo signal R(T) at time-frequency analysis units 802 and 804); transforming, based at least on a Fast Fourier Transform, windowed overlapping sections of the stereo signal containing the left input channel and the right input channel to generate a set of frequency bins for the left input channel and the right input channel (from paragraph 0080, see which convert the time domain signals into frequency domain signals. These time-frequency analysis units could be an appropriate filter bank, such as a finite impulse response (FIR) filter bank, a quadrature mirror filter (QMF) bank, a discrete Fourier transform (DFT), a time-domain aliasing cancellation (TDAC) filter bank, or other suitable filter bank. The output from time-frequency analysis units 802 and 804 are a set of frequency domain values covering a sufficient frequency range of the human auditory system, such as a 0 to 20 kHz frequency range where the analysis filter bank sub-band bandwidths could be processed to approximate psycho-acoustic critical bands, equivalent rectangular bandwidths, or some other perceptual characterization. Likewise, other suitable numbers of frequency bands and ranges can be used); generating a two-dimensional positional distribution plotting frequency versus normalized magnitude for the transformed stereo signal to identify a position in a left-right plane of each frequency bin in the set of frequency bins (from paragraph 0088, see For each frequency band, the normalized lateral coordinate and depth coordinate form a 2-dimensional vector (LAT(F), DEP(F)) which is input into a 2-dimensional channel map); identifying multiple portions of the transformed stereo signal to be extracted, wherein each portion of the transformed stereo signal is identified based on a respective region of interest within the two-dimensional positional distribution (from paragraph 0085, see Relevant inter-channel spatial cues are extracted for each frequency band of the M channel input signals, and a spatial position vector is generated for each frequency band. This spatial position vector is interpreted as the perceived source location for that frequency band for a listener under ideal listening conditions. Each channel filter is then generated such that the resulting spatial position for that frequency element in the up-mixed N channel output signal is reproduced consistently with the inter-channel cues. Estimates of the inter-channel level differences (ICLD's) and inter-channel coherence (ICC) are used as the inter-channel cues to create the spatial position vector); applying a filtering function to each respective region of interest to extract the multiple identified portions of the transformed stereo signal, wherein the filtering function attenuates the transformed stereo signal outside of the respective region of interest (from paragraph 0088, see FIGS. 10A through 10E, to produce a filter value H.sub.i(F) for each channel i. These channel filters H.sub.i(F) for each channel i are output from the filter generation unit, such as filter generation unit 606 of FIG. 6, filter generation unit 706 of FIG. 7, and filter generation unit 806 of FIG. 8); and transforming each of the multiple identified portions of the transformed stereo signal into a time domain output signal to generate an upmixed multi-channel time domain audio signal, wherein the upmixed multi-channel time domain audio signal is used for playback in a multi-channel sound field via a plurality of output components (from paragraph 0078, see FIG. 8 is a diagram of a system 800 for up-mixing data from M channels to N channels in accordance with an exemplary embodiment of the present invention. System 800 converts stereo time domain data into 7.1 channel time domain data).

Still on the issue of claim 13, Reams does not teach the FFT is short-time. All the same, Avendano discloses this feature (from abstract, see short-time). Therefore, it would have been obvious to one of ordinary skill in the art to modify Reams wherein the FFT is short-time as taught by Avendano. This modification would have improved flexibility by allowing adjustment of window size as suggested by Avendano.

Claim 14 is rejected for the same reasons as claim 2. Claim 15 is rejected for the same reasons as claim 3. Claim 16 is rejected for the same reasons as claim 4. Claim 17 is rejected for the same reasons as claim 5. Claim 18 is rejected for the same reasons as claim 6. Claim 20 is rejected for the same reasons as claim 8. Claim 21 is rejected for the same reasons as claim 9. Claim 22 is rejected for the same reasons as claim 10. Claim 23 is rejected for the same reasons as claim 11.

Allowable Subject Matter

8. Claims 7 and 19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion

9. Any inquiry concerning this communication or earlier communications from the examiner should be directed to OLISA ANWAH whose telephone number is 571-272-7533. The examiner can normally be reached Monday to Friday from 8:30 AM to 6 PM. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Carolyn Edwards, can be reached at 571-270-7136. The fax phone numbers for the organization where this application or proceeding is assigned are 571-273-8300 for regular communications and 571-273-8300 for After Final communications. Any inquiry of a general nature or relating to the status of this application or proceeding should be directed to the receptionist whose telephone number is 571-272-2600.

Olisa Anwah
Patent Examiner
March 6, 2026

/CAROLYN R EDWARDS/
Supervisory Patent Examiner, Art Unit 2692

/OLISA ANWAH/
Primary Examiner, Art Unit 2692
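The claim-1 mapping above describes a complete upmixing pipeline: an STFT over windowed overlapping sections, a per-bin lateral position derived from normalized channel magnitudes (analogous to Reams' LAT(F) coordinate), region-of-interest masks as the filtering function, and an inverse transform per output channel. A minimal numpy sketch of that pipeline, assuming illustrative window, hop, and region-boundary values and a crude boxcar mask, none of which are taken from the application or the cited references:

```python
# Sketch of a stereo-to-3-channel upmix by per-bin lateral position.
# Window size, hop, and region edges are illustrative assumptions.
import numpy as np

def stft(x, win=512, hop=256):
    """Short-time FFT of windowed, overlapping sections (Hann window)."""
    w = np.hanning(win)
    frames = [np.fft.rfft(w * x[i:i + win])
              for i in range(0, len(x) - win + 1, hop)]
    return np.array(frames)                    # shape: (n_frames, n_bins)

def istft(frames, win=512, hop=256, length=None):
    """Overlap-add inverse of the STFT above, with window-power normalization."""
    w = np.hanning(win)
    n = hop * (len(frames) - 1) + win
    out, norm = np.zeros(n), np.zeros(n)
    for k, f in enumerate(frames):
        out[k * hop:k * hop + win] += w * np.fft.irfft(f, n=win)
        norm[k * hop:k * hop + win] += w ** 2
    out /= np.maximum(norm, 1e-12)
    return out[:length] if length else out

def upmix(left, right, edges=(0.33, 0.67)):
    """Split a stereo pair into 3 output channels by per-bin lateral position.

    The position of each frequency bin is the normalized right-channel
    magnitude (0 = hard left, 1 = hard right); each region-of-interest mask
    attenuates bins outside its lateral range (a boxcar "filtering function").
    """
    L, R = stft(left), stft(right)
    mag_l, mag_r = np.abs(L), np.abs(R)
    pos = mag_r / np.maximum(mag_l + mag_r, 1e-12)   # lateral coordinate per bin
    lo, hi = edges
    masks = [pos < lo, (pos >= lo) & (pos <= hi), pos > hi]
    mono = L + R                                      # spectrum to carve up
    return [istft(mono * m, length=len(left)) for m in masks]

# Usage: a tone hard-panned left should land in the first output channel.
t = np.arange(4096) / 44100.0
left = np.sin(2 * np.pi * 440 * t)      # 440 Hz only in the left channel
right = np.zeros_like(left)
chans = upmix(left, right)
energies = [float(np.sum(c ** 2)) for c in chans]
print(energies.index(max(energies)))    # → 0 (the leftmost region)
```

A production implementation would presumably use smoother region filters (compare Reams' channel filters H_i(F)) rather than hard boxcar masks, to avoid musical-noise artifacts at region boundaries.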

Prosecution Timeline

Jul 18, 2024
Application Filed
Feb 27, 2026
Non-Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12604130
HEARING DEVICE WITH A BLEEDING CIRCUIT FOR DELIVERING MESSAGES TO A CHARGING DEVICE
2y 5m to grant; granted Apr 14, 2026
Patent 12598710
Terminal Device
2y 5m to grant; granted Apr 07, 2026
Patent 12597251
VIDEO FRAMING BASED ON TRACKED CHARACTERISTICS OF MEETING PARTICIPANTS
2y 5m to grant; granted Apr 07, 2026
Patent 12596515
FIRST DEVICE, COMMUNICATION SERVER, SECOND DEVICE AND METHODS IN A COMMUNICATIONS NETWORK
2y 5m to grant; granted Apr 07, 2026
Patent 12598437
EARPHONES AND EARPHONE SYSTEM
2y 5m to grant; granted Apr 07, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 89%
With Interview (+4.2%): 93%
Median Time to Grant: 2y 1m
PTA Risk: Low
Based on 1162 resolved cases by this examiner. Grant probability derived from career allow rate.
