Prosecution Insights
Last updated: April 19, 2026
Application No. 18/583,311

SPACING-BASED AUDIO SOURCE GROUP PROCESSING

Non-Final OA: §102, §103
Filed
Feb 21, 2024
Examiner
ZHANG, LESHUI
Art Unit
2695
Tech Center
2600 — Communications
Assignee
Qualcomm Incorporated
OA Round
1 (Non-Final)
78%
Grant Probability
Favorable
1-2
OA Rounds
2y 10m
To Grant
99%
With Interview

Examiner Intelligence

Grants 78% — above average
78%
Career Allow Rate
719 granted / 928 resolved
+15.5% vs TC avg
Strong +36% interview lift
Without
With
+36.0%
Interview Lift
resolved cases with interview
Typical timeline
2y 10m
Avg Prosecution
47 currently pending
Career history
975
Total Applications
across all art units

Statute-Specific Performance

§101
5.5%
-34.5% vs TC avg
§103
42.5%
+2.5% vs TC avg
§102
13.6%
-26.4% vs TC avg
§112
28.7%
-11.3% vs TC avg
Black line = Tech Center average estimate • Based on career data from 928 resolved cases
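Each delta in the table above is simply the examiner's per-statute rejection rate minus the Tech Center average estimate. As a sanity check, a minimal Python sketch recovers the implied TC averages from the displayed figures (all four back out to the same 40.0% estimate):

```python
# Per-statute rejection rate (%) and delta vs. Tech Center average (%),
# as displayed in the table above. TC average estimate = rate - delta.
stats = {
    "101": (5.5, -34.5),
    "103": (42.5, 2.5),
    "102": (13.6, -26.4),
    "112": (28.7, -11.3),
}

for statute, (rate, delta) in stats.items():
    tc_avg = rate - delta  # back out the Tech Center average estimate
    print(f"§{statute}: examiner {rate}% vs TC avg {tc_avg:.1f}% ({delta:+.1f}%)")
```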

Office Action

§102 §103
DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This Office Action is in response to a reply communication filed on November 14, 2025 with respect to the Election/Restriction Requirement Office Action mailed on September 19, 2025, in which claims 9-10 were amended and applicant elected Group I, claims 1-15, with traverse, withdrawing Group II, claims 16-20, from further consideration on the merits pursuant to 37 CFR 1.142(b) as being drawn to a non-elected invention. A detailed response to the applicant's reply is set forth below. In view of this communication, claims 1-30 are currently pending in this Office Action.

In response to this Office Action, the Office respectfully requests that support be shown for language added to any original claims on amendment and for any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line number(s) in the specification and/or drawing figure(s). This will assist the Office in prosecuting this application.

Response to Applicant's Reply

Applicant, in the reply filed on November 14, 2025, argued, "from an encoding device and/or to a rendering device, that includes audio source group assignment information and audio stream content" and further that the "Office has not shown that a search for the subject matter of claim 5 of group 1 would be unlikely to identify references relevant to the subject matter of claim 21 of Group 2, since each include similar features related to the group assignment information at least partially based on comparisons of one or more source spacing metrics to a threshold", as asserted in paragraph 3 of page 8 of the Remarks filed on November 14, 2025.

In response to the arguments above, the Office respectfully disagrees because: (1) As discussed in the Election/Restriction Requirement Office Action mailed on September 19, 2025 (pages 3-4), the inventions acquired separate status in the art due to the recognized divergence and different classifications (Group 1 – G10L25/78, Group 2 – G10L19/008); searching different classifications and/or subclassifications in divergent subject matter is therefore a serious burden. (2) Although there may be some overlap in claimed features, e.g., the features of claim 5 indicated by applicant in the argument above, such overlap does not necessarily avoid distinctness in classification and/or subclassification, and thus the serious burden discussed above. In the instant application, for example, claim 5 recites "the group assignment information is determined at least partially based on comparison … one or more source spacing metrics to a threshold", but in light of the application specification this feature is performed only at the disclosed audio encoder side, before the bitstream is generated and transmitted (through source spacing condition 126 implemented by processor 120 at audio streaming device 102 in fig. 1, para 71-74, 117-118, etc.), while the disclosed audio decoder merely performs rendering using the feature created and transmitted by the encoder (fig. 1). Claim 5 does not recite that "the group assignment information is determined based on …" is performed by the claimed "device" "during an audio decoding operation", but merely claims an intended purpose, which is given weight in view of the search and application of prior art; see MPEP 2144.07. Therefore, the applicant's argument above is not persuasive and the requirement is still deemed proper and is therefore made FINAL. A complete reply to a future final Office Action must include cancellation of non-elected claims or other appropriate action (37 CFR 1.144). See MPEP § 821.01.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office Action:

A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-7, 9-10, and 12-15 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Munoz et al. (US 20210004201 A1, hereinafter Munoz).

Claim 1: Munoz teaches a device (title and abstract, ln 1-17, a content consumer device or one or more source devices in fig. 8) comprising: one or more processors (GPU 714, processor 712, display processor 718 in fig. 8) configured, during an audio decoding operation (carried out by at least audio decoding device 34 in figs. 1A-1C), to: obtain a set of audio streams (audio streams 27 in the transmission channel in figs. 1A-1C) associated with a set of audio sources (associated with audio sources obtained from audio capture devices of a live scene in fig. 1A, para 19, or synthesized sources in a virtual scene in fig. 1C, para 65, e.g., audio elements 302A-302J in fig. 3A, para 88); obtain group assignment information (audio location information ALI 45A included in metadata in figs. 1A-1C, para 80, device location information DLI 45B, audio source location ASL 49, para 86, and audio source distance, para 91, or audio source distance 306A/306B, para 90) indicating that particular audio sources in the set of audio sources (the ALI-defined coordinates for microphones that captured the respective audio streams 27, para 80, or the audio source distance for each of the audio streams above, para 90) are assigned to a particular audio source group (a subset of the audio streams 27 is selected based on the ALI 45A, as output of audio data 19' referred to as audio streams 19''', para 80, device location information DLI 45B and audio source location ASL, para 84-86, or audio source distance 306A/306B compared with an audio source distance threshold, para 90), the particular audio source group associated with a source spacing condition (capture location or synthesis location of the stream, by which at least one of the audio streams 27 is excluded, para 80, or satisfying an audio source distance threshold, para 90; a single audio stream from the streams 27 is selected when the audio source distance is greater than an audio source distance threshold, para 91, and multiple audio streams are selected if the audio source distance 306B is less than or equal to the threshold in fig. 3A, para 93-95); and render, based on a rendering mode assigned to the particular audio source group (one of the audio renderers 32 is selected to render the audio data 19' in figs. 1A-1C, para 58-62), particular audio streams that are associated with the particular audio sources (the audio data 19' to be rendered is associated with the selected subset of audio streams 27 from the stream selection unit 44 of the audio decoding device 34 in figs. 1A-1C, para 88).

Claim 15 has been analyzed and is rejected according to claim 1 above.
Claim 2: Munoz further teaches, according to claim 1 above, wherein at least one of the set of audio streams is received via a bitstream from an encoder device (bitstream 27 represents an encoded version of the audio data 19 in figs. 1A-1C, para 48, e.g., psychoacoustic audio encoding, para 46).

Claim 3: Munoz further teaches, according to claim 2 above, wherein the group assignment information is received via the bitstream (audio location information ALI is included as metadata, para 80, or metadata identifying a location of the audio object relative to a listener wearing the VR device in fig. 2 or another point of reference in the soundfield, i.e., audio source distance 306A/306B, para 23, and the metadata is transmitted from the source device 12B as part of the bitstream 27, para 39).

Claim 4: Munoz further teaches, according to claim 3 above, wherein the one or more processors are configured to update the received group assignment information (update streams to include the new audio stream and update associated metadata with new audio metadata, including capture location information representative of capture coordinates, para 119, 177, or snapping to a new audio stream in snapping mode, based on a new location of the audio object relative to the listener in the soundfield in figs. 5A-5D, para 120).

Claim 5: Munoz further teaches, according to claim 1 above, wherein the group assignment information is determined (e.g., the audio source distance in claim 1 above) at least partially based on comparisons of one or more source spacing metrics (the audio source distance for each of the transmitted audio streams 27, para 90) to a threshold (compared to an audio source distance threshold to determine whether the audio stream in question is placed into the subset or not, represented by the dashed area in figs. 3A-3E, para 87-88, 94-100).
Claim 6: Munoz further teaches, according to claim 5 above, wherein the threshold includes a dynamic threshold (the audio source distance threshold is dynamically adapted to a proximity distance threshold set by the user, a quality of the audio elements 302F-302J, a gain or loudness of the audio source 308, tracking information 41, or any other factors in figs. 3A-3C, para 95).

Claim 7: Munoz further teaches, according to claim 1 above, wherein the rendering mode assigned to the particular audio source group is one of multiple rendering modes that are supported by the one or more processors (binaural renderer 42 and audio renderers 32, including a number of different audio renderers 32 supported in the audio playback system 16A in figs. 1A-1C, para 52).

Claim 9: Munoz further teaches, according to claim 1 above, wherein the one or more processors are further configured to combine a first rendered audio signal associated with the set of audio sources (selected from the bitstream or audio streams 27 transmitted through the transmission channel in figs. 1A-1C and the discussion in claim 1 above) with a second rendered audio signal associated with a microphone input (320A in fig. 3D, one of a dedicated microphone 320A and smartphones 320B-320D, 320G, 320H, 320J in fig. 3D, para 99) to generate a combined signal (generating a combined audio feed to the VR device 400, para 90, 129).

Claim 10: Munoz further teaches, according to claim 9 above, wherein the one or more processors are further configured to binauralize the combined signal to generate a binaural output signal (by using binaural renderer 42 in fig. 7B), and further comprising one or more speakers coupled to the one or more processors and configured to play out the binaural output signal (headphones 48 to emit left and right speaker feeds 43 in fig. 7B, para 63).
Claim 12: Munoz further teaches, according to claim 1 above, wherein the one or more processors are integrated in a headset device (the consumer device 14A can be a headset, a smartphone, a laptop computer, a tablet computer, etc., para 49-50).

Claim 13: Munoz further teaches, according to claim 1 above, wherein the one or more processors are integrated in at least one of a mobile phone, a tablet computer device, or a wearable electronic device (the consumer device 14A can be a headset, a smartphone, a laptop computer, a tablet computer, etc., para 49-50).

Claim 14: Munoz further teaches, according to claim 1 above, wherein the one or more processors are integrated in a vehicle (UE 115 integrated in a vehicle, a smartphone, a microphone, an array of microphones, or an XR/VR/AR headset, etc., para 166).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office Action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Munoz (above) in view of Walsh et al. (US 20200382894 A1, hereinafter Walsh).
Claim 8: Munoz further teaches, according to claim 7 above, wherein the multiple rendering modes include: a baseline rendering mode in which signal processing (including application of Bessel functions, etc., for representing the sound field by spherical harmonic coefficients Anm(k), para 24-25), source direction analysis (the sound field at a point represented by radius, azimuth, and elevation angles, captured and recorded by microphones at the point, para 25-26) in the frequency domain (represented in the frequency domain via DFT or DCT, para 25), and source interpolation (interpolation between first and second streams 438, 440 in figs. 5B/5C, para 135) are performed; and a low-complexity rendering mode in which distance-weighted time domain interpolation is performed (interpolation having a weight updated with respect to the device location and the stream coordinates in figs. 5B-5C, para 122, 233); however, Munoz does not teach performing the source interpolation in the frequency domain.

Walsh teaches an analogous field of endeavor by disclosing a device (title and abstract, ln 1-18, and a mobile phone or electronic device, as a consumer electronic device, para 28, included in a virtual surround system in fig. 5) wherein source interpolation is performed in the frequency domain (frequency domain interpolation of audio sources via interpolation of personalized HRTFs related to each audio source, para 13) for the benefit of improving sound quality (by more accurately recreating interpolated HRTF audio source locations, and improving performance specifically in frontal localization and externalization, para 13). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied source interpolation performed in the frequency domain, as taught by Walsh, to the source interpolation in the device, as taught by Munoz, for the benefits discussed above.

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Munoz (above) in view of Swaminathan et al. (US 20210006976 A1, hereinafter Swaminathan).

Claim 11: Munoz further teaches, according to claim 1 above, further comprising a transceiver coupled to the one or more processors (transceiver module 722 coupled to processor 712 and GPU 714 in fig. 8), the transceiver configured to receive at least one audio stream of the set of audio streams via a bitstream from an encoder device (maintaining a connection from the source device 12 to the content consumer device 14 in figs. 1A-1C, para 161); however, Munoz does not teach a modem.

Swaminathan teaches an analogous field of endeavor by disclosing a device (title and abstract, ln 1-18, and a VR device in fig. 2) wherein a modem or a transceiver can be used for obtaining audio streams (obtaining audio streams 27 from source device 12 in figs. 1A-1B, para 95, 111) for the benefit of improving adaptability of the device (by using either a modem in the analog domain or a transceiver in the digital domain, para 95, and for transmitting and receiving ambisonics coefficients accurately representing 3D localization of sound sources, para 30). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied the option between the modem and the transceiver, as taught by Swaminathan, to the transceiver in the device, as taught by Munoz, for the benefits discussed above.

The prior art made of record and not relied upon (US 20140023197 A1 by Xiang et al.) is considered pertinent to applicant's disclosure because it discloses density-based clustering and a measurement of spatial concentration with respect to a threshold, etc., which parallels the application's disclosure of group assignment information and the comparison of source spacing metrics to a threshold.
Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to LESHUI ZHANG, whose telephone number is (571) 270-5589. The examiner can normally be reached Monday-Friday, 6:30am-4:00pm EST.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Vivian Chin, can be reached at 571-272-7848. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/LESHUI ZHANG/
Primary Examiner, Art Unit 2695

Prosecution Timeline

Feb 21, 2024
Application Filed
Jan 08, 2026
Non-Final Rejection — §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12585677
AUTOMATED GENERATION OF IMPROVED LIST-TYPE ANSWERS IN QUESTION ANSWERING SYSTEMS
2y 5m to grant Granted Mar 24, 2026
Patent 12572757
VIDEO PROCESSING METHOD, VIDEO PROCESSING APPARATUS, AND COMPUTER-READABLE STORAGE MEDIUM
2y 5m to grant Granted Mar 10, 2026
Patent 12567423
SYSTEM AND METHODS FOR UPSAMPLING OF DECOMPRESSED SPEECH DATA USING A NEURAL NETWORK
2y 5m to grant Granted Mar 03, 2026
Patent 12567424
METHOD AND DEVICE FOR MULTI-CHANNEL COMFORT NOISE INJECTION IN A DECODED SOUND SIGNAL
2y 5m to grant Granted Mar 03, 2026
Patent 12561354
SYSTEMS AND METHODS FOR ITEM-SPECIFIC KEYWORD RECOMMENDATION
2y 5m to grant Granted Feb 24, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
78%
Grant Probability
99%
With Interview (+36.0%)
2y 10m
Median Time to Grant
Low
PTA Risk
Based on 928 resolved cases by this examiner. Grant probability derived from career allow rate.
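The headline figures in this panel are plain ratios over the examiner's career counts. As a sanity check, a minimal Python sketch reproduces the allow rate and pending count from the numbers cited above; the +36.0% interview lift is reported as-is, since this page does not state how it is derived:

```python
# Career counts cited on this page: 719 granted out of 928 resolved,
# 975 total applications before this examiner.
granted = 719
resolved = 928
total = 975

allow_rate = granted / resolved   # ~0.775 (displayed above, rounded, as 78%)
pending = total - resolved        # applications still open

print(f"Career allow rate: {allow_rate:.1%}")  # -> 77.5%
print(f"Currently pending: {pending}")         # -> 47
```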
