Prosecution Insights
Last updated: April 19, 2026
Application No. 18/076,872

Spatial Audio Object Positional Distribution within Spatial Audio Communication Systems

Status: Non-Final OA (§103)
Filed: Dec 07, 2022
Examiner: BEKEE, CHIMEZIE EZERIWE
Art Unit: 2691
Tech Center: 2600 — Communications
Assignee: Nokia Technologies Oy
OA Round: 3 (Non-Final)

Grant Probability: 69% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 2y 8m
Grant Probability with Interview: 99%

Examiner Intelligence

Career Allow Rate: 69% (11 granted / 16 resolved); +6.8% vs TC avg, above average
Interview Lift: +33.3% (strong), based on resolved cases with interview
Avg Prosecution: 2y 8m, with 27 applications currently pending
Total Applications: 43, across all art units
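
The headline figures can be reproduced from the raw counts above. Below is a minimal sketch, assuming the interview lift is measured in percentage points and that the displayed with-interview probability is capped at 99%; neither formula is disclosed by the tool, and the same numbers feed the Prosecution Projections panel further down.

```python
# Minimal sketch reproducing the dashboard's headline figures.
# Assumptions (not confirmed by the page): the interview lift is in
# percentage points, and the displayed with-interview probability is
# capped at 99%.

granted, resolved = 11, 16               # career record from the page
allow_rate = granted / resolved          # 0.6875, displayed as 69%

interview_lift = 0.333                   # +33.3 points with an interview
with_interview = min(allow_rate + interview_lift, 0.99)

print(f"Career allow rate: {allow_rate:.0%}")      # -> 69%
print(f"With interview:    {with_interview:.0%}")  # -> 99%
```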

Statute-Specific Performance

§101: 1.6% (-38.4% vs TC avg)
§103: 67.7% (+27.7% vs TC avg)
§102: 18.2% (-21.8% vs TC avg)
§112: 6.8% (-33.2% vs TC avg)

Tech Center average is an estimate. Based on career data from 16 resolved cases.
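
The quoted deltas all imply the same Tech Center baseline: each rate minus its delta equals 40.0% (e.g., 67.7% - 27.7%). A short sketch that reproduces the displayed figures; the flat 40% average is inferred from those deltas, not stated by the tool.

```python
# Reproduces the Statute-Specific Performance figures. The per-statute
# rates are taken from the page; the flat 40% Tech Center average is
# inferred from the quoted deltas (each rate minus its delta is 40.0%).

examiner_rate = {"101": 0.016, "103": 0.677, "102": 0.182, "112": 0.068}
TC_AVERAGE = 0.40  # the chart's "Tech Center average estimate"

for statute, rate in examiner_rate.items():
    delta = (rate - TC_AVERAGE) * 100
    print(f"§{statute}: {rate:.1%} ({delta:+.1f}% vs TC avg)")
```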

Office Action (§103)
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

1. The amendment filed February 12, 2026 has been entered. Claims 1-7, 9-14, 16, and 18-22 are pending in the application. Claims 8, 15, and 17 are canceled. The rejection of Claim 16 under 35 U.S.C. 112(b) is withdrawn.

Claim Rejections - 35 USC § 103

2. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

3. Claims 1-7, 10-14, 16, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Atti et al. (U.S. Pub. No. 2019/0103118 A1, hereinafter "Atti") in view of Eubank et al. (U.S. Pub. No. 2021/0035597 A1, hereinafter "Eubank"), and further in view of Jot et al. (U.S. Pub. No. 2017/0127212 A1, hereinafter "Jot").

Regarding Claim 1, Atti teaches an apparatus for spatial audio communication (device 101, 600, Figs. 1 and 6, Paras. [0020] and [0114]-[0118]), the apparatus comprising: at least one processor (processor 104, 606, Figs. 1 and 6, Paras. [0020] and [0114]-[0118]); and at least one memory storing instructions that, when executed with the at least one processor (memory 653, Fig. 6, Para. [0118]), cause the apparatus to: obtain at least one audio signal (audio processor 104 is configured to receive the audio signals 136-139, Fig. 1, Para. [0025]), at least one first spatial audio parameter associated with a first sound object from a first group of one or more sound objects (processor 104 generates audio streams 131-133 [mapped as audio objects], Fig. 1, Para. [0026]; audio signals 136-139 are processed to estimate the spatial characteristics (e.g., azimuth, elevation, etc.) [mapped as spatial audio parameters, which will include a first spatial audio parameter] of the sound sources; audio signals 136-139 are mapped to independent streams corresponding to sound sources and the corresponding spatial metadata 124, Fig. 1, Para. [0030]), and at least one second spatial audio parameter associated with a second sound object from a second group of one or more sound objects (processor 104 generates audio streams 131-133 [mapped as audio objects], Fig. 1, Para. [0026]; audio signals 136-139 are processed to estimate the spatial characteristics (e.g., azimuth, elevation, etc.) [mapped as spatial audio parameters, which will include a second spatial audio parameter] of the sound sources; audio signals 136-139 are mapped to independent streams corresponding to sound sources and the corresponding spatial metadata 124, Fig. 1, Para. [0030]), the first sound object associated with a first physical sound source (a particular stream from the one or more streams 131-133 may be associated with the user's speech in a telephonic mode, Paras. [0024] and [0031]); after obtaining the at least one audio signal, the at least one first spatial audio parameter and the at least one second spatial audio parameter, allocate at least one of: the first physical sound source to a first direction associated with the at least one first spatial audio parameter, or the second physical sound source to a second direction associated with the at least one second spatial audio parameter (the IVAS codec 102 may determine whether a particular stream is within or outside of a targeted viewpoint; this determination may be based on estimation of a direction of arrival of the particular stream, which may be estimated by the IVAS codec 102 or front end audio processor 104, or may be based on prior statistical information of each stream, Para. [0082] [i.e., the first physical sound source (user's speech) is allocated to a first direction associated with the determined direction of arrival]); and low-bitrate encode a spatial audio signal based on the at least one audio signal, the at least one first spatial audio parameter and the at least one second spatial audio parameter (IVAS codec 102 is configured to encode the audio data 122 comprising the audio streams 131-133, Para. [0033]; spatial metadata may also be embedded in the input streams, Para. [0045]; spatial metadata 124 include first and second spatial parameters, Para. [0078]; low bit rate encoding is also done using the IVAS codec 102, Paras. [0050], [0057], and [0082]), wherein the spatial audio signal comprises the first and second sound objects that are allocated to a first direction and a second direction, respectively, wherein the first direction is different from the second direction (IVAS codec 102 can identify audio streams [i.e., audio objects] from the left side direction and the right side direction and enable encoding with fewer bits, Para. [0082]; the first and second sound objects can be allocated to a first direction and a second direction which are different, i.e., the left and right directions).

Atti fails to explicitly teach the first and second sound objects associated with a first physical sound source and a second physical sound source, respectively; wherein the spatial audio signal comprises controlled versions of the first and second sound objects that are allocated to the first direction and the second direction, respectively, wherein the first direction is different from the second direction.

However, Eubank teaches the first and second sound objects associated with a first physical sound source and a second physical sound source, respectively (sound object and sound bed identifier 10 is used to associate a sound object with a physical sound source, Fig. 1, Para. [0043]; for example, the sound objects may be a human speech 16 and a dog bark 17, and the identifier 10 is configured to identify a sound source (e.g., a position of the sound source within the environment) and produce spatial sound source data that specifically represents the sound source; spatial sound-source data of the dog bark 17 may include an audio signal that contains the bark 17 and position data of the source (e.g., the dog's mouth) of the bark 17, such as azimuth and elevation with respect to the device 1 and/or distance between the source and the device 1, Fig. 1, Para. [0044]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the apparatus (as taught by Atti) to include first and second sound objects associated with first and second physical sound sources (as taught by Eubank). Doing so would enable the creation of an enhanced immersive and realistic audio experience.

However, Jot teaches wherein the spatial audio signal comprises controlled versions of the first and second sound objects (the audio signals 110 include a first object-based audio signal that includes a dialog signal, and a second object-based audio signal that includes a non-dialog signal; the encoder device 120 can be configured to modify metadata 113 associated with one or more of the first and second object-based audio signals, Fig. 1, Para. [0034]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the apparatus (as taught by Atti in view of Eubank) to include the controlled versions of the first and second objects (as taught by Jot). Doing so allows a reduced bitrate to be used, which improves compression efficiency.

Regarding Claim 2, Atti in view of Eubank, and further in view of Jot teach wherein the instructions, when executed with the at least one processor, cause the apparatus to perform at least one of: amplify audio signals associated with at least one of the first sound object or the second sound object; attenuate audio signals associated with at least one of the first sound object or the second sound object; or modify at least one of the at least one first spatial audio parameter or the at least one second spatial audio parameter (Jot, the encoder device 120 can be configured to modify metadata 113 associated with one or more of the first and second object-based audio signals, Fig. 1, Para. [0034]).

Regarding Claim 3, Atti in view of Eubank, and further in view of Jot teach wherein the instructions, when executed with the at least one processor, cause the apparatus to modify at least one sound source direction or position in at least one of a first metadata which comprises the at least one first spatial audio parameter or a second metadata which comprises the at least one second spatial audio parameter (Jot, the encoder device 120 can be configured to modify metadata 113 associated with one or more of the first and second object-based audio signals; the metadata 113 can include spatial position, Fig. 1, Para. [0034]).

Regarding Claim 4, Atti in view of Eubank, and further in view of Jot teach wherein the instructions, when executed with the at least one processor, cause the apparatus to at least partially discard at least one of: the association between the at least one first spatial audio parameter and the first sound object, or the association between the at least one second spatial audio parameter and the second sound object (Jot, encoder device 120 can be configured to modify metadata 113 associated with one or more of the first and second object-based audio signals; the metadata 113 can include spatial position, Fig. 1, Para. [0034]; modifying the spatial position partially discards the association between the first and second metadata and the first and second sound objects from the group of sound objects).

Regarding Claim 5, Atti in view of Eubank, and further in view of Jot teach wherein the instructions, when executed with the at least one processor, cause the apparatus to: obtain a first sound object audio signal (Jot, audio signals 110 include a first object-based audio signal, Fig. 1, Para. [0034]); obtain a second sound object audio signal (Jot, audio signals 110 include a second object-based audio signal, Fig. 1, Para. [0034]); and mix the first sound object audio signal and the second sound object audio signal to at least partially generate the spatial audio signal, wherein the mix is based on at least one of: the at least one first spatial audio parameter or the at least one second spatial audio parameter (Jot, the encoder device 120 receives the audio signals 110 and adds respective metadata 113 to the audio signals 110; the metadata 113 can include spatial position; the object-based audio signals can be received at a multiplexer circuit 122 in the encoder device 120, and an output 111 [spatial audio signal] of the multiplexer circuit 122 can be coupled to an output of the encoder device 120, Fig. 1, Paras. [0034] and [0035]).

Regarding Claim 6, Atti in view of Eubank, and further in view of Jot teach wherein the instructions, when executed with the at least one processor, cause the apparatus to: identify at least two physical sound sources including the first and second physical sound sources (Eubank, two physical sound sources are identified, i.e., the person and the dog, Fig. 1, Paras. [0042] and [0043]); and associate for each of the at least two physical sound sources one of the first and second sound objects (Atti, audio signals 136-139 are mapped to independent streams [first and second objects] corresponding to sound sources, Fig. 1, Para. [0030]).

Regarding Claim 10, it is similarly rejected as Claim 1. The method can be found in Atti (Paras. [0103]-[0113]).

Regarding Claim 11, it is similarly rejected as Claim 2. The method can be found in Atti (Paras. [0103]-[0113]).

Regarding Claim 12, it is similarly rejected as Claim 3. The method can be found in Atti (Paras. [0103]-[0113]).

Regarding Claim 13, it is similarly rejected as Claim 4. The method can be found in Atti (Paras. [0103]-[0113]).

Regarding Claim 14, it is similarly rejected as Claims 5 and 6. The method can be found in Atti (Paras. [0103]-[0113]).

Regarding Claim 16, it is similarly rejected as Claim 7. The method can be found in Atti (Paras. [0103]-[0113]).

Regarding Claim 19, Atti in view of Eubank, and further in view of Jot teach wherein the method is performed for an application associated with at least one of: a spatial audio UI, where sounds are limited to application direction (Eubank, spatial audio UI 1, sounds are limited to the direction of sound sources, Fig. 1, Paras. [0038], [0045], [0053] and [0054]).

4. Claims 21 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Atti et al. (U.S. Pub. No. 2019/0103118 A1, hereinafter "Atti") in view of Eubank et al. (U.S. Pub. No. 2021/0035597 A1, hereinafter "Eubank"), in view of Jot et al. (U.S. Pub. No. 2017/0127212 A1, hereinafter "Jot"), and further in view of Laitinen et al. (U.S. Pub. No. 2022/0303711 A1, hereinafter "Laitinen").

Regarding Claim 21, Atti in view of Eubank, and further in view of Jot teach render spatial audio signals (Atti, render and binauralize circuit 218 is used to render the spatial audio signal, Fig. 2, Para. [0055]). Atti in view of Eubank, and further in view of Jot fail to explicitly teach wherein the instructions, when executed with the at least one processor, cause the apparatus to: estimate at least one target spatial audio feature based, at least partially, on at least one of the first direction or the second direction; and render spatial audio signals based, at least partially, on the estimated at least one target spatial audio feature.

However, Laitinen teaches estimating at least one target spatial audio feature based, at least partially, on at least one of the first direction or the second direction (spatial analysis is configured to provide multiple direction estimates or parametric-based direction of arrival estimates; in some circumstances, some of these parametric-based estimates may point to reflections instead of the actual direction of the sound source; thus, as described, another estimate of the sound source direction is obtained and the parametric-based directions are modified or biased based on the further estimates; this other estimate should relatively reliably point to the actual main direction of the sound source, Para. [0208]); and rendering spatial audio signals based, at least partially, on the estimated at least one target spatial audio feature (spatial audio is captured based on estimating at least one direction parameter in frequency bands and rendering the spatial sound at the frequency bands according to these direction parameters, Para. [0105]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the apparatus (as taught by Atti in view of Eubank, and further in view of Jot) to include estimating a target spatial feature based on direction and rendering the spatial audio (as taught by Laitinen). Doing so increases the accuracy of important audio sources (Laitinen, Para. [0242]).

Regarding Claim 22, it is similarly rejected as Claim 21. The method can be found in Atti (Paras. [0103]-[0113]).

Response to Arguments

5. Applicant's arguments filed February 12, 2026 have been fully considered, but they are not persuasive.

Regarding independent Claim 1, applicant argues (see applicant's remarks, pages 11-13) that Claim 1 claims that the low-bitrate encode of the spatial audio signal is based on the at least one audio signal, the at least one first spatial audio parameter and the at least one second spatial audio parameter, wherein the spatial audio signal comprises controlled versions (plural) of the first and second sound objects (multiple objects, not a merged object) that are allocated to the first direction and the second direction, respectively. Applicant argues that Atti teaches reducing streams by combining or merging them into a common stream; thus, applying the teachings of Jot to Atti would apply them to a common merged stream, not provide plural controlled versions. Applicant further argues that, in view of the teachings in Jot, it would not have been obvious to modify Atti and the other cited art to provide controlled versions (plural) by performing the allocating after obtaining the at least one audio signal, the at least one first spatial audio parameter and the at least one second spatial audio parameter; Jot describes activities at the capture side, not after sending the bitstream.

In response to applicant's argument above, Atti does not merely describe that the streams may be grouped, but teaches that the multi-stream audio data may be encoded by the IVAS codec using one or more encoders, such as an algebraic code-excited linear prediction (ACELP) encoder for speech and a frequency domain (e.g., modified discrete cosine transform (MDCT)) encoder for non-speech audio, to generate the bitstream (Para. [0033]). Also, Atti teaches that determining a priority configuration for a plurality of the audio streams and encoding each of the audio streams based on its priority enables the IVAS codec 102 to allocate different bit rates and use different coding modes and coding bandwidths (Para. [0021]). Furthermore, Atti teaches performing the allocating after obtaining the at least one audio signal, the at least one first spatial audio parameter and the at least one second spatial audio parameter (Para. [0082]). Jot teaches that the audio signal includes first and second object-based signals and that the encoder modifies the metadata associated with one or more of the first and second object-based audio signals [i.e., the controlled versions of the first and second sound objects] (Para. [0034]).

In response to applicant's argument that the references fail to show certain features of the invention, it is noted that the features upon which applicant relies (i.e., activities at the capture side, not after sending the bitstream) are not recited in the rejected claim(s). Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).

The combination of the teachings of Atti in view of Eubank, and further in view of Jot renders independent Claims 1, 10 and 20 obvious. The rejections of Claims 1, 10, and 20 under 35 U.S.C. 103 as being unpatentable over Atti in view of Eubank and further in view of Jot are maintained. Dependent Claims 2-7, 11-14, 16, and 19 have been rejected under 35 U.S.C. 103 as being unpatentable over Atti in view of Eubank and further in view of Jot. The rejections of Claims 2-7, 11-14, 16, and 19 under 35 U.S.C. 103 as being unpatentable over Atti in view of Eubank and further in view of Jot are maintained.

Allowable Subject Matter

6. Claims 9 and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion

7. Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHIMEZIE E BEKEE, whose telephone number is (571) 272-0202. The examiner can normally be reached M-F 7:30-5.

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Duc Nguyen, can be reached at 571-272-7503. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/CHIMEZIE EZERIWE BEKEE/
Examiner, Art Unit 2691

/DUC NGUYEN/
Supervisory Patent Examiner, Art Unit 2691
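
For readers tracking the claim mapping rather than the legal posture, the Claim 1 data flow the examiner describes (obtain an audio signal plus per-object spatial parameters, allocate each physical sound source to a distinct direction, then low-bitrate encode) can be sketched as follows. This is an illustration only: all names and the encoder stub are hypothetical, and it does not represent the IVAS codec, Atti's device, or the applicant's claimed implementation.

```python
from dataclasses import dataclass

# Illustrative sketch of the Claim 1 data flow as characterized in the
# rejection. All names and the encoder stub are hypothetical; this is not
# the IVAS codec or the applicant's implementation.

@dataclass
class SoundObject:
    source: str       # physical sound source, e.g. speech or a dog bark
    azimuth: float    # spatial audio parameters: direction of arrival
    elevation: float

def encode_spatial_audio(audio_signal, first: SoundObject, second: SoundObject):
    # Allocate each physical sound source to the direction given by its
    # spatial parameters; Claim 1 requires the two directions to differ.
    first_dir = (first.azimuth, first.elevation)
    second_dir = (second.azimuth, second.elevation)
    assert first_dir != second_dir, "first and second directions must differ"
    # Stand-in for the low-bitrate encoding step (e.g. fewer bits per
    # stream, as in Atti's priority-based bit allocation).
    return {"audio": audio_signal,
            "objects": [(first.source, first_dir), (second.source, second_dir)]}

# Example mirroring Eubank's speech-and-dog-bark illustration: a talker on
# the left, a dog bark on the right.
bitstream = encode_spatial_audio(
    audio_signal=b"...",
    first=SoundObject("speech", azimuth=-90.0, elevation=0.0),
    second=SoundObject("dog bark", azimuth=90.0, elevation=0.0),
)
```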

Prosecution Timeline

Dec 07, 2022: Application Filed
Feb 25, 2025: Non-Final Rejection — §103
Aug 27, 2025: Response Filed
Nov 11, 2025: Final Rejection — §103
Feb 12, 2026: Response after Non-Final Action
Feb 24, 2026: Request for Continued Examination
Feb 26, 2026: Response after Non-Final Action
Mar 19, 2026: Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602426 ("RECORDLESS"): granted Apr 14, 2026, 2y 5m to grant
Patent 12604146 ("BEAMFORMING DEVICE"): granted Apr 14, 2026, 2y 5m to grant
Patent 12586595 ("APPARATUS, METHOD AND COMPUTER PROGRAM FOR ENCODING AN AUDIO SIGNAL OR FOR DECODING AN ENCODED AUDIO SCENE"): granted Mar 24, 2026, 2y 5m to grant
Patent 12585145 ("SMART WEARABLE GLASSES"): granted Mar 24, 2026, 2y 5m to grant
Patent 12587798 ("Headphones with Sound-Enhancement and Integrated Self-Administered Hearing Test"): granted Mar 24, 2026, 2y 5m to grant
Study what changed to get past this examiner, based on the 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 69%
With Interview: 99% (+33.3%)
Median Time to Grant: 2y 8m
PTA Risk: High

Based on 16 resolved cases by this examiner. Grant probability derived from career allow rate.
