Prosecution Insights
Last updated: April 19, 2026
Application No. 18/447,486

APPARATUS, METHOD AND COMPUTER PROGRAM FOR ENCODING, DECODING, SCENE PROCESSING AND OTHER PROCEDURES RELATED TO DIRAC BASED SPATIAL AUDIO CODING USING DIRECT COMPONENT COMPENSATION

Final Rejection — §102, §103
Filed
Aug 10, 2023
Examiner
KRZYSTAN, ALEXANDER J
Art Unit
2694
Tech Center
2600 — Communications
Assignee
Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
OA Round
4 (Final)
81%
Grant Probability
Favorable
5-6
OA Rounds
3y 1m
To Grant
88%
With Interview

Examiner Intelligence

Grants 81% — above average
81%
Career Allow Rate
913 granted / 1121 resolved
+19.4% vs TC avg
Moderate +7% lift
+6.9%
Interview Lift
resolved cases with interview
Typical timeline
3y 1m
Avg Prosecution
38 currently pending
Career history
1159
Total Applications
across all art units

Statute-Specific Performance

§101
2.7%
-37.3% vs TC avg
§103
37.1%
-2.9% vs TC avg
§102
24.3%
-15.7% vs TC avg
§112
21.0%
-19.0% vs TC avg
Black line = Tech Center average estimate • Based on career data from 1121 resolved cases

Office Action

§102 §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Examiner's Comments

The terminal disclaimer filed 8-23-2024 has been approved. Applicant's use of 'time frequency tile' as now amended into the claims is used in a manner where said tile cannot be read as a singular entity in the various recitations, because a single signal can only exist at a single point in time and space. For example, in claim 1, the same time frequency tile is recited for both the estimator step and the step of calculating a compensation gain. However, since the calculating step requires the results of the estimating step, it cannot occur at the same point in time as the estimating step. As such, 'for the time frequency tile' is read as using parameters based on a common frequency band in the context of a DSP receiving and processing an audio stream.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action: A person shall be entitled to a patent unless – (a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim(s) 1, 19, 20, 3, 5-6, 8-13, 15, 18 is/are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Kelloniemi (9838821).

As per claim 1, an apparatus for generating a sound field description from an input signal comprising at least two channels, the apparatus comprising: an input signal analyzer for acquiring direction data and diffuseness data from a time frequency tile (the process is performed on a stream of audio input as shown in fig.
4, where the processor is digital and as such operates at distinct points in time, where said points in time, coupled with the cited bins/frequency bands, are time frequency tiles) from the input signal (the means for performing 606 in fig. 6, where, per para. 75: the sound diffuseness/diffuseness data may be estimated for each of the frequency bins by computing the speed of variation of sound arrival direction/direction data); an estimator for estimating for the time frequency tile, (the cited process is performed on the same set of time frequency tiles as the above cited signal analyzer, via the input streams into processor 411) a first energy- or amplitude-related measure for an omnidirectional component derived from the input signal (the estimated energy density and intensity in stage 606 for a particular frequency bin that has the diffuseness exceeding a limit is an omnidirectional component) and for estimating for the time frequency tile, (the cited process is performed on the same set of time frequency tiles as the above cited signal analyzer, via the input streams into processor 411) a second energy- or amplitude-related measure for a directional component derived from the input signal (the estimated energy density and intensity in stage 606 for a particular frequency bin that has the diffuseness not exceeding a limit is a direct component), and a sound component generator for generating sound field components of the sound field (the means for performing 605-611 where the frequency bins are the components, since the bins derive from microphone signals per 601, the microphone captures direct sound from sound sources and captures diffuse/omnidirectional sound from the environment where the microphones are located, where the first group is the group of bins which exceed the limit in 607), wherein the sound component generator comprises an energy compensator configured to perform an energy compensation (means to perform step 611) of the directional
component using the energy compensator comprising a compensation gain calculator configured for calculating a compensation gain for the time frequency tile, (the cited process is performed on the same set of time frequency tiles as the above cited signal analyzer, via the input streams into processor 411) using: the first energy- or amplitude-related measure for the omnidirectional component for the time frequency tile, (the cited process is performed on the same set of time frequency tiles as the above cited signal analyzer, via the input streams into processor 411), the second energy- or amplitude-related measure for the directional component for the time frequency tile, (the cited process is performed on the same set of time frequency tiles as the above cited signal analyzer, via the input streams into processor 411), the direction data and the diffuseness data for the time frequency tile, (the cited process is performed on the same set of time frequency tiles as the above cited signal analyzer, via the input streams into processor 411) (the energy compensation in steps 607-611 uses the direction data and diffuseness data as cited above, where the compensation for a particular component can be based on a diffuseness estimation in stage 606, that is based on combinations of audio signals from different microphones, where the bins from different microphones are different components including the component used to generate the first energy related measure and the component used to generate the second energy related measure); wherein the compensation gain calculator is configured to: increase the compensation gain for the time frequency tile, (the cited process is performed on the same set of time frequency tiles as the above cited signal analyzer, via the input streams into processor 411) with an increasing first energy- or amplitude-related measure for the omnidirectional component (para.
98: select one of a plurality of different weighting factors for each frequency bin, such that the weighting factor is the higher, the higher the estimated diffuseness/OMNIDIRECTIONAL COMPONENT of sound), and to decrease the compensation gain for the time frequency tile, (the cited process is performed on the same set of time frequency tiles as the above cited signal analyzer, via the input streams into processor 411) with an increasing second energy- or amplitude-related measure for the directional component (para. 98: if it is determined in action 607 that the diffuseness of sound exceeds the predetermined limit for a particular frequency bin, a weighting factor for this frequency bin is set to a value larger than 1, where, if the diffuseness decreases to below the limit/directional component, the compensation gain is decreased/set to 1).

As per claim 19, a method for generating a sound field description from an input signal comprising at least two channels, comprising: acquiring direction data and diffuseness data from the input signal for a time-frequency tile; estimating, for the time-frequency tile, a first energy- or amplitude-related measure for an omnidirectional component derived from the input signal and estimating, for the time-frequency tile, a second energy- or amplitude-related measure for a directional component derived from the input signal, and generating sound field components of the sound field description, wherein the generating comprises performing an energy compensation of the directional component, the performing the energy compensation comprising calculating a compensation gain for the time-frequency tile using the first energy- or amplitude-related measure for the omnidirectional component for the time-frequency tile, the second energy- or amplitude-related measure for the directional component for the time-frequency tile, the direction data and the diffuseness data for the time-frequency tile, wherein the calculating is configured to
increase the compensation gain for the time-frequency tile with an increasing first energy- or amplitude-related measure for the omnidirectional component for the time-frequency tile, and to decrease the compensation gain for the time-frequency tile with an increasing second energy- or amplitude-related measure for the directional component for the time-frequency tile (per the claim 1 rejection).

As per claim 20, the system of the claim 1 rejection requires a non-transitory digital storage medium having stored thereon a computer program for performing a method for generating a sound field description from an input signal comprising at least two channels, comprising: acquiring direction data and diffuseness data from the input signal for a time-frequency tile; estimating, for the time-frequency tile, a first energy- or amplitude-related measure for an omnidirectional component derived from the input signal and estimating, for the time-frequency tile, a second energy- or amplitude-related measure for a directional component derived from the input signal, and generating sound field components of the sound field description, wherein the generating comprises performing an energy compensation of the directional component, the performing the energy compensation comprising calculating a compensation gain for the time-frequency tile using the first energy- or amplitude-related measure for the omnidirectional component for the time-frequency tile, the second energy- or amplitude-related measure for the directional component for the time-frequency tile, the direction data and the diffuseness data for the time-frequency tile, wherein the calculating is configured to increase the compensation gain for the time-frequency tile with an increasing first energy- or amplitude-related measure for the omnidirectional component for the time-frequency tile, and to decrease the compensation gain for the time-frequency tile with an increasing second energy- or
amplitude-related measure for the directional component for the time-frequency tile, when said computer program is run by a computer (the processor-based apparatus of the claim 1 rejection requires non-transitory memory with software thereon to implement the disclosed functions).

As per claim 3, the apparatus of claim 1, wherein the input signal comprises the omnidirectional component and one or more directional components (per the claim 1 rejection), and wherein the estimator is configured to calculate the first amplitude-related measure for the omnidirectional component using the input signal and to calculate the second energy- or amplitude-related measure for each of the one or more directional components from the input signal (amplitude-related measures from multiple microphones/components as cited in the claim 1 rejection are used for each of the omnidirectional and directional components as cited in the claim 1 rejection).

As per claim 5, the apparatus of claim 1, wherein the input signal analyzer is configured to extract the diffuseness data from metadata associated with the input signal or to extract the diffuseness data from the input signal by a signal analysis (stages 606, 607) of the input signal comprising the at least two channels or components.

As per claim 6, the apparatus of claim 1, wherein the estimator is configured to calculate the first energy- or amplitude-related measure or the second energy- or amplitude-related measure from an absolute value of a complex amplitude or a magnitude raised to a power greater than 1 and lower than 5 or being equal to 2 or 3 (step 606, magnitude squared coherence).

As per claim 8, the apparatus of claim 1, wherein the sound component generator is configured to calculate, from the direction data, a directional gain (factors in 608 and 609) and to combine the directional gain and the diffuseness data for performing the energy compensation (combined as per those elements as cited in the claim 1 rejection).
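The gain behavior the examiner maps onto Kelloniemi for claim 1 — a per-tile compensation gain that rises with the omnidirectional energy measure and falls with the directional energy measure — can be sketched in code. This is an illustrative sketch only: the ratio form, the `eps` guard, and the `g_max` ceiling are assumptions for the example, not constructions from the record or the reference.

```python
import math

def compensation_gain(omni_energy, direct_energy, eps=1e-12, g_max=4.0):
    """Illustrative compensation gain for one time-frequency tile.

    The gain increases with the omnidirectional energy measure and
    decreases with the directional energy measure, matching the
    behavior recited in claim 1. The square-root ratio and the g_max
    ceiling are assumptions of this sketch, not claim language.
    """
    g = math.sqrt(omni_energy / (direct_energy + eps))
    return min(g, g_max)
```

Any monotone-increasing function of the omnidirectional measure and monotone-decreasing function of the directional measure would exhibit the claimed behavior; the ratio form is simply the most compact choice.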
As per claim 9, the estimator is configured to estimate the second energy- or amplitude-related measure for the directional component and a third energy- or amplitude-related measure for a second directional component (there can be more than two bins or components and more than two microphones which provide first, second, third, fourth measures, etc., for subsequent diffuse and omnidirectional components), to calculate the compensation gain (608 or 609) for the first directional component using the first and the second energy- or amplitude-related measures (per the claim 1 rejection), and to calculate a second compensation gain for the second directional component using the first and the third energy- or amplitude-related measures (per the claim 1 rejection via the step in 606 using signals from different microphones to determine a particular diffuseness value).

As per claim 10, wherein the compensation gain calculator is configured to calculate a first gain factor depending on the diffuseness data and at least one of the number of sound field components in the second group of sound field components, the maximum order of sound field components of the first group of sound field components and the maximum number of sound field components of the second group (the sound field components of the claim 1 rejection by definition have respective orders based on the signals that adapt each component within each bin, where the orders affect the value of the signal in each bin, which affects all subsequent processing including the energy compensation), to calculate a second gain factor depending on the first energy- or amplitude-related measure for the omnidirectional component, the second energy- or amplitude-related measure for the directional component, the direction data and the diffuseness data (each bin/component has a respective calculated gain factor), and to calculate the compensation gain using the first gain factor and the second gain factor (the gain is calculated via
the gain factors, and applied in stage 611), wherein the sound component generator is configured to use the same direction data and diffuseness data for calculating the diffuse compensation gain and the direct compensation gain (the direction data and diffuseness data is used for each component to determine first and second gain factors as per 608 and 609, which are used to make a compensation gain in stage 611 as first and second compensation gains, where the amount of diffuseness for each component defines it as either a direct or diffuse gain factor).

As per claim 11, the apparatus of claim 7, wherein the directional component belongs to a plurality of directional components having a number of directional components, and wherein the compensation gain calculator is configured to increase the compensation gain with a decreasing number of directional components (para. 69, the number of bins/components directly affects the bandwidth of each bin, where a lower number of bins requires greater bandwidth for each bin, which will require greater compensation gain to be applied to account for the expanded bandwidth of each bin).
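The Kelloniemi weighting scheme the rejection repeatedly cites (stage 607 compares each bin's diffuseness against a predetermined limit; bins that exceed it receive a weighting factor larger than 1, the rest receive 1) can be sketched as a per-bin rule. The concrete `limit` and `boost` values below are illustrative assumptions, not values from the reference.

```python
def bin_weights(diffuseness, limit=0.5, boost=2.0):
    """Illustrative per-frequency-bin weighting following the cited
    Kelloniemi scheme (stages 607-609): a bin whose diffuseness
    exceeds the predetermined limit gets a weighting factor larger
    than 1; otherwise the factor is 1. The limit and boost values
    are assumptions of this sketch."""
    return [boost if d > limit else 1.0 for d in diffuseness]
```

Applying these weights to the bins (the stage 611 step in the reference's flow) is then an element-wise multiplication of each bin's signal by its weight.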
As per claim 12, the apparatus of claim 7, wherein a sound component generator for generating, from the input signal, one or more sound field components of a first group of sound field components comprising for each sound field component a direct component and a diffuse component (means for providing the components per stage 609), and for generating, from the input signal, a second group of sound field components comprising only a direct component (per stage 608), wherein the compensation gain calculator is configured to calculate the compensation gain using the diffuseness data and at least one of the number of sound field components in the second group, the number of diffuse components in the first group, a maximum order of sound field components of the first group, and a maximum order of sound field components of the second group (per the claim 10 rejection).

As per claim 13, the apparatus of claim 7, wherein the compensation gain calculator is configured to perform a gain factor manipulation using a limitation with a fixed maximum threshold (predetermined limit, step 607) or a fixed minimum threshold or using a compression function for compressing low or high gain factors towards medium gain factors to acquire the compensation gain.

As per claim 15, the apparatus of claim 7, wherein the energy compensator comprises a compensation gain applicator for applying the compensation gain to at least one sound field component (stage 611, the means for the weighting).
As per claim 18, the apparatus of claim 1, an analysis filter bank for generating the one or more sound field components for a plurality of different time-frequency tiles (step 605 is frequency based and time based since the processor must be clocked and framed), wherein the input signal analyzer is configured to acquire a diffuseness data item for each time-frequency tile (each bin has a respective diffuseness value), and wherein the sound component generator is configured to perform the energy compensation separately for each time-frequency tile (step 611).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 2, 4, 14, 16, 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Kelloniemi (9838821).

As per claim 2, the apparatus of claim 1, wherein the input signal comprises the at least two channels, wherein the estimator is configured to calculate the omnidirectional component using an addition of the at least two channels, and to calculate the directional component using a subtraction of the at least two channels (the input signal has at least two channels but does not specify the format of particular stereo pairs). The examiner takes official notice it is well known in the art to use M/S processing to represent stereo pairs for the purpose of compatibility with well known signaling standards.
Where a mid signal uses signal addition and a side signal uses signal subtraction.

As per claim 4, the rejection of claim 1 discloses wherein the estimator is configured to derive the omnidirectional component and the directional components using a weighted linear combination (610-612) of the at least two channels but does not specify wherein the input signal comprises an A-format or B-format representation with at least two channels. The examiner takes official notice it is well known in the art to use A- and B-format protocols to represent stereo pairs for the purpose of compatibility with well known signaling standards.

As per claim 14, the apparatus of claim 1, however that embodiment does not specify wherein the sound component generator is configured to generate other sound field components of other orders, wherein a combiner is configured to combine the sound field components of the sound field and the other sound field components of other orders to acquire a sound field description comprising an order being higher than an order of the input signal (the examiner takes official notice it is well known in the art to encode/combine multichannel audio signals into well known Ambisonics formats, including the low-order mono Ambisonics signal and higher-order Ambisonics signal, for the purpose of compatibility with existing standards).
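The M/S processing the examiner officially notices for claim 2 (mid = channel addition, side = channel subtraction) is a standard construction. A minimal sketch, assuming the common 0.5 normalization so the transform round-trips exactly:

```python
def ms_encode(left, right):
    """Illustrative M/S (mid/side) construction as invoked for claim 2:
    mid = addition of the two channels (omnidirectional-like component),
    side = subtraction (directional-like component). The 0.5 scaling is
    a common convention assumed here, not mandated by the claim."""
    mid = [0.5 * (l + r) for l, r in zip(left, right)]
    side = [0.5 * (l - r) for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Inverse transform: recovers the left/right channels exactly."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right
```

Because the encode/decode pair is an exact inverse, M/S representation loses no information relative to the left/right stereo pair, which is why it is interchangeable with it in signaling standards.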
As per claim 16, the apparatus of claim 1, wherein the sound component generator comprises a low-order component generator for generating a low-order sound field description from the input signal up to a predetermined order and a predetermined mode (obvious as per the claim 15 rejection in view of the lower order Ambisonics signals), wherein the low-order component generator is configured to derive the low-order sound field description by copying or taking the input signal or forming a weighted combination of the channels of the input signal (per stage 606, combinations of audio signals from different microphones), wherein the low-order sound field description comprises the omnidirectional component and the directional component generated by the copying or the taking of the linear combination (the components combined via different microphone signals as per 606 to determine the diffuseness, which determines the eventual sound field description output from 613).

As per claim 17, a first group of sound field components up to an order I of coefficients and a second group of sound field components above the order I of coefficients are orthogonal to each other, or wherein the sound field components are at least one of coefficients of orthogonal basis functions, coefficients of spatial basis functions, coefficients of spherical or circular harmonics, and Ambisonics coefficients (the Ambisonics cited in the claim 15 rejection by definition uses spherical harmonic and Ambisonics coefficients to represent audio components).

Claim(s) 21, 22, 23, 7 is/are rejected under 35 U.S.C. 103 as being unpatentable over Kelloniemi (9838821) and further in view of Goodwin (US 11205435 B2).
As per claim 21, the claim 1 rejection discloses an apparatus for generating a sound field description from an input signal comprising at least two channels, the apparatus comprising: an input signal analyzer configured for acquiring direction data and diffuseness data from the input signal; an estimator configured for estimating a first energy- or amplitude-related measure for an omnidirectional component derived from the input signal and configured for estimating a second energy- or amplitude-related measure for a directional component derived from the input signal, and a sound component generator configured for generating sound field components of the sound field description, wherein the sound component generator comprises an energy compensator configured to perform an energy compensation of the directional component, the energy compensator comprising a compensation gain calculator configured for calculating a compensation gain using the first energy- or amplitude-related measure for the omnidirectional component, the second energy- or amplitude-related measure for the directional component, the direction data and the diffuseness data (per the claim 1 rejection). Kelloniemi discloses generating sound field components of the sound field description, including using variation of sound arrival direction, and wherein the compensation gain calculator is configured to increase the compensation gain with an increasing direction gain (fig. 6, stage 607: not diffuse means direct, which means increasing direction gain, which has a gain of 1 per stage 608, and where, when the direction gain is less (more diffuse), the gain applied is less as per stage 609), but does not specify: the direction data comprises an indication on an azimuth angle φ and an indication on an elevation angle θ. Goodwin teaches an audio device can use an azimuth angle φ and an indication on an elevation angle θ in order to represent diffuse audio sources (para. 18).
Goodwin teaches that this allows modelling of point and diffuse sources (para. 18). It would have been obvious to one skilled in the art at the time of filing that the direction gain/diffuse indication could be represented using well known indicators including azimuth and elevation for the purpose of compatibility with known forms of signal representation used for modeling captured sources. Wherein, when the indicators of azimuth and elevation are used to determine direction in the system of Kelloniemi, the direction gain of Kelloniemi is derived from the indication on the azimuth angle φ and the indication on the elevation angle θ.

As per claim 22, a method for generating a sound field description from an input signal comprising at least two channels, comprising: acquiring direction data and diffuseness data from the input signal; estimating a first energy- or amplitude-related measure for an omnidirectional component derived from the input signal and estimating a second energy- or amplitude-related measure for a directional component derived from the input signal, and generating sound field components of the sound field description, wherein the generating comprises performing an energy compensation of the directional component, the performing the energy compensation comprising calculating a compensation gain using the first energy- or amplitude-related measure for the omnidirectional component, the second energy- or amplitude-related measure for the directional component, the direction data and the diffuseness data, wherein the direction data comprises an indication on an azimuth angle φ and an indication on an elevation angle θ, and wherein the calculating is configured to increase the compensation gain with an increasing direction gain, wherein the direction gain is derived from the indication on the azimuth angle φ and the indication on the elevation angle θ (per the claim 21 rejection).
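A direction gain derived from an azimuth angle φ and an elevation angle θ, as recited in claims 21-23, can be sketched via unit vectors. The mapping below (cosine of the angle to a reference look direction, half-wave rectified) is an assumption chosen for illustration; neither Kelloniemi nor Goodwin is cited as prescribing this particular formula.

```python
import math

def direction_gain(azimuth, elevation, look_azimuth=0.0, look_elevation=0.0):
    """Illustrative direction gain from azimuth/elevation (claims 21-23).

    Converts both the estimated arrival direction and a reference look
    direction (angles in radians) to unit vectors, takes their dot
    product (the cosine of the angle between them), and clamps negative
    values to zero. The cardioid-like clamped-cosine mapping is an
    assumption of this sketch, not a construction from the references.
    """
    def unit(az, el):
        return (math.cos(el) * math.cos(az),
                math.cos(el) * math.sin(az),
                math.sin(el))
    a, b = unit(azimuth, elevation), unit(look_azimuth, look_elevation)
    cos_angle = sum(x * y for x, y in zip(a, b))
    return max(cos_angle, 0.0)
```

With this mapping the gain is 1 when the source lies on the look direction and falls to 0 at 90 degrees off-axis, so a less direct (more diffuse) arrival receives a smaller direction gain, consistent with the behavior the rejection attributes to stages 608-609.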
As per claim 23, a non-transitory digital storage medium having stored thereon a computer program for performing a method for generating a sound field description from an input signal comprising at least two channels, comprising: acquiring direction data and diffuseness data from the input signal; estimating a first energy- or amplitude-related measure for an omnidirectional component derived from the input signal and estimating a second energy- or amplitude-related measure for a directional component derived from the input signal, and generating sound field components of the sound field description, wherein the generating comprises performing an energy compensation of the directional component, the performing the energy compensation comprising calculating a compensation gain using the first energy- or amplitude-related measure for the omnidirectional component, the second energy- or amplitude-related measure for the directional component, the direction data and the diffuseness data, wherein the direction data comprises an indication on an azimuth angle φ and an indication on an elevation angle θ, and wherein the calculating is configured to increase the compensation gain with an increasing direction gain, wherein the direction gain is derived from the indication on the azimuth angle φ and the indication on the elevation angle θ, when said computer program is run by a computer (per the claim 21 rejection).

As per claim 7, it is rejected per the claim 1 and 21 rejections.

Response to Arguments

The submitted arguments have been considered but are moot in view of the new grounds of rejection.
As per applicant’s argument that the cited prior art does not disclose an energy compensation of the direct component, the examiner notes the cited weighting of the bin, where the weighting provides a compensation to the direct component/bin, noting each bin, including the direct bins/components from each microphone, is combined, and then each bin is weighted/energy compensated per stage 611.

As per applicant’s argument that an energy compensation using the amplitude-related measure for the omnidirectional component is not done, the examiner notes the amplitude-related measure/diffuseness estimation is determined and used for the omnidirectional and direct components/bins per stages 606-609, noting that omnidirectional components are those with the diffuseness exceeding the threshold per stage 607.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALEXANDER KRZYSTAN, whose telephone number is 571-272-7498 and whose email address is alexander.krzystan@uspto.gov. The examiner can usually be reached M-F 7:30-4:00 EST. If attempts to reach the examiner by telephone or email are unsuccessful, the examiner’s supervisor, Fan Tsang, can be reached at (571) 272-7547. The fax phone numbers for the organization where this application or proceeding is assigned are 571-273-8300 for regular communications and 571-273-8300 for After Final communications.

/ALEXANDER KRZYSTAN/
Primary Examiner, Art Unit 2653

Examiner Alexander Krzystan
November 10, 2025

Prosecution Timeline

Aug 10, 2023
Application Filed
Apr 18, 2024
Non-Final Rejection — §102, §103
Aug 23, 2024
Response Filed
Sep 13, 2024
Final Rejection — §102, §103
Feb 16, 2025
Request for Continued Examination
Feb 18, 2025
Response after Non-Final Action
May 27, 2025
Non-Final Rejection — §102, §103
Oct 28, 2025
Response Filed
Nov 10, 2025
Final Rejection — §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12598440
RENDERING OF OCCLUDED AUDIO ELEMENTS
2y 5m to grant Granted Apr 07, 2026
Patent 12593170
SWITCHING METHOD FOR AUDIO OUTPUT CHANNEL, AND DISPLAY DEVICE
2y 5m to grant Granted Mar 31, 2026
Patent 12573410
DECODER, ENCODER, AND METHOD FOR INFORMED LOUDNESS ESTIMATION IN OBJECT-BASED AUDIO CODING SYSTEMS
2y 5m to grant Granted Mar 10, 2026
Patent 12574675
Acoustic Device and Method
2y 5m to grant Granted Mar 10, 2026
Patent 12541554
TRANSCRIPT AGGREGATION FOR NON-LINEAR EDITORS
2y 5m to grant Granted Feb 03, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

5-6
Expected OA Rounds
81%
Grant Probability
88%
With Interview (+6.9%)
3y 1m
Median Time to Grant
High
PTA Risk
Based on 1121 resolved cases by this examiner. Grant probability derived from career allow rate.
