Prosecution Insights
Last updated: April 19, 2026
Application No. 18/577,068

Three-Dimensional, Direction-Dependent Audio for Multi-Entity Telecommunication

Status: Final Rejection (§102)

Filed: Jan 05, 2024
Examiner: MOONEY, JAMES K
Art Unit: 2695
Tech Center: 2600 — Communications
Assignee: Google LLC
OA Round: 2 (Final)

Grant Probability: 76% (Favorable)
Expected OA Rounds: 3-4
Median Time to Grant: 2y 3m
Grant Probability With Interview: 98%

Examiner Intelligence

Career Allow Rate: 76%, above average (525 granted / 695 resolved; +13.5% vs TC avg)
Interview Lift: +22.2%, a strong gain in allow rate among resolved cases with an interview versus without
Typical Timeline: 2y 3m average prosecution (25 applications currently pending)
Career History: 720 total applications across all art units

Statute-Specific Performance

§101: 3.8% (-36.2% vs TC avg)
§103: 50.0% (+10.0% vs TC avg)
§102: 16.7% (-23.3% vs TC avg)
§112: 24.6% (-15.4% vs TC avg)

Tech Center averages are estimates. Based on career data from 695 resolved cases.

Office Action (Final Rejection, §102)

DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments

Applicant's arguments filed 11/3/25 have been fully considered but they are not persuasive. Applicant submits that Hvidsten does not disclose the information on a physical location of the first audio-producing entity with respect to the remote device. The examiner respectfully disagrees.

Hvidsten (Col. 8 line 63 – Col. 9 line 12, Fig. 3) discloses that headset 110(3) of user 105(3) transmits an audio signal 145(3) and audio perception factors 335(3) to headset 110(1) of user 105(1). The processing block 320(1) of headset 110(1) obtains the audio signal 145(3), the perception factors 335(3), and also the perception factors 335(1) of user 105(1). Based on the audio signal 145(3), perception factors 335(3) and 335(1), and a room model 325, the processing block 320(1) produces output audio signals 340(1) and 340(2) that correspond to the speech of user 105(3) that would be present at the respective ears of user 105(1) in the absence of headset 110(1) and any noise generated in the open office environment. These perception factors 335(3) and 335(1) are, for example, six degrees of freedom of user 105(3) and user 105(1).

The instant specification as filed discloses: "The spatial audio output model 412 may be configured to receive the processed data, analyze (e.g., compare) the processed data against device data of another electronic device, and reproduce an audio message (e.g., received in the multi-stream content) with a spatial audio effect. For example, analysis of the processed data against device data of another electronic device can include, as non-limiting examples, … determining an orientation (e.g., yaw, roll, tilt) of a face of a first user with respect to location coordinates and/or an orientation of a face of a second user." Yaw, roll, and pitch/tilt, as used in the instant application, are included in the six degrees of freedom of a user, as used in Hvidsten (Col. 4 lines 49-54), to reproduce the audio with a spatial effect. Therefore, Hvidsten discloses "orientation information comprising information on a physical location of the first audio-producing entity with respect to the remote device" and "three-dimensional audio that conveys the physical location of the first audio-producing entity with respect to the audio-receiving entity," as claimed.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-6, 8, 9, 11-14 and 16-23 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Hvidsten et al. (US 10,757,240 B1).
As to claim 1, Hvidsten discloses a method comprising: receiving, at a remote device and during an active, multi-entity audio communication (Col. 8 lines 19-29, Fig. 3: 105(1) in communication with 105(2) and 105(3); while 105(3) is local in this example, having multiple remote participants is disclosed), first audio information associated with a first audio-producing entity of multiple entities of the multi-entity audio communication (Col. 8 lines 26-29 and 63-65 and Col. 9 lines 13-15, Fig. 3. "Headset 110(2) transmits audio signal 145(2) from user 105(2) (e.g., speech) to headset 110(1)." "It will be appreciated that any suitable number of local (physically present) participants and/or remote video conference participants may participate in the ad-hoc communication."); obtaining orientation information associated with at least one of the first audio-producing entity or the remote device indicative of a relative positioning of the first audio-producing entity with respect to the remote device (Col. 4 lines 47-57, Col. 8 lines 65-67, and Col. 9 lines 14-17 and 39-42. Headsets 110(1/2/3) transmit audio perception factors 335(1/2/3) detected via tracking module 315(1/2/3) (e.g., six degrees of freedom of user 105(1/2/3), etc.)), the orientation information comprising information on a physical location of the first audio-producing entity with respect to the remote device and usable to determine a first direction between the first audio-producing entity and an audio-receiving entity (Col. 8 line 63 – Col. 9 line 37, Fig. 3. "Similarly, headset 110(2) transmits audio signal 145(2) from user 105(2) (e.g., speech) to headset 110(1). Headset 110(2) may also transmit audio perception factors 335(2), which may be detected via tracking module 315(2) (e.g., six degrees of freedom of user 105(2), etc.). Processing block 320(2) may obtain audio signal 145(2) and audio perception factors 335(2)." "Based on audio signal 145(2), audio perception factors 335(1) and 335(2), and room model 325, processing block 320(2) produces modified audio signals 345(1) and 345(2) that correspond to the speech of user 105(2) that would be present at respective ears of user 105(1) in the absence of headset 110(1) and any noise generated in the open office environment."); and providing three-dimensional, direction-dependent audio information, the three-dimensional, direction-dependent audio information sufficient to enable a multi-stereo audio output device associated with the audio-receiving entity to reproduce direction-dependent, three-dimensional audio that conveys the physical location of the first audio-producing entity with respect to the audio-receiving entity (Col. 9 lines 27-61, Fig. 3. "Based on audio signal 145(2), audio perception factors 335(1) and 335(2), and room model 325, processing block 320(2) produces modified audio signals 345(1) and 345(2) that correspond to the speech of user 105(2) that would be present at respective ears of user 105(1) in the absence of headset 110(1) and any noise generated in the open office environment." "Headset 110(1) may provide modified audio signals 370(1) and 370(2) to earpieces 142(1) and 142(2), respectively, via summing nodes 270(1) and 270(2). Thus, realistic audio rendering may be enabled by compatibility of headsets 110(1)-110(3), tracking of users 105(2) and 105(3) and the display of video conference endpoint 305, suitable audio signaling, referencing far-end geometry to video conference endpoint 305, etc.").
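
The rejection turns on deriving a direction between two tracked entities from their "perception factors." For reference, here is a minimal sketch of that geometry: converting two tracked 3-D positions into the relative azimuth, elevation, and distance of a source with respect to a listener. This illustrates the underlying math only; the function name and coordinate conventions are assumptions, and neither Hvidsten nor the application publishes code.

```python
import math

def relative_direction(listener_pos, source_pos):
    """Azimuth, elevation (degrees), and distance (meters) of a source
    relative to a listener, from two tracked (x, y, z) positions.
    Coordinate conventions (x-forward, y-left, z-up) are assumed."""
    dx = source_pos[0] - listener_pos[0]
    dy = source_pos[1] - listener_pos[1]
    dz = source_pos[2] - listener_pos[2]
    distance = math.sqrt(dx * dx + dy * dy + dz * dz)
    if distance == 0.0:
        return 0.0, 0.0, 0.0  # co-located: direction undefined
    azimuth = math.degrees(math.atan2(dy, dx))          # bearing in the horizontal plane
    elevation = math.degrees(math.asin(dz / distance))  # angle above/below the horizon
    return azimuth, elevation, distance

# Example: a talker 1 m to the listener's left and 0.5 m higher.
print(relative_direction((0.0, 0.0, 0.0), (0.0, 1.0, 0.5)))  # ~ (90.0, 26.6, 1.12)
```
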
As to claim 2, Hvidsten discloses wherein the remote device is the multi-stereo audio output device (Col. 2 lines 28-34. Headset.).

As to claim 3, Hvidsten discloses wherein the multi-stereo audio output device, using the three-dimensional, direction-dependent audio information, is configured to reproduce direction-dependent, three-dimensional audio that includes an audible-manipulation of the first audio information based on the orientation information of one or more of the multi-stereo audio output device and the first audio-producing entity (Col. 9 lines 1-12, 17-61, Fig. 3. "Summing node 350(1) may add modified audio signal 340(1) and modified audio signal 345(1) to produce modified audio signal 355(1), and summing node 350(2) may add modified audio signal 340(2) and modified audio signal 345(2) to produce modified audio signal 355(2)." "Summing node 365(1) may add modified audio signal 355(1) and modified audio signal 360(1) to produce modified audio signal 370(1), and summing node 365(2) may add modified audio signal 355(2) and modified audio signal 360(2) to produce modified audio signal 370(2). Headset 110(1) may provide modified audio signals 370(1) and 370(2) to earpieces 142(1) and 142(2), respectively, via summing nodes 270(1) and 270(2). Thus, realistic audio rendering may be enabled.").

As to claim 4, Hvidsten discloses wherein the audible-manipulation includes a machine-learned technique configured to adjust at least one of an inter-aural time difference, an inter-aural level difference, or a timbre difference (Col. 4 lines 34-41 and Col. 15 line 63 – Col. 16 line 3. "Headset 110(1) may also produce the modified audio signal based on one or more head-related transfer functions associated with a shape of the head of the user. A head-related transfer function models how an ear perceives sound. This may depend on a number of variables such as the size and shape of the head, ears, ear canal, and nasal and oral cavities, density of the head, etc. Each ear of user 105(1) may be modeled as receiving the modified audio signal at slightly different times and frequency/phase modifications due to the shape and shadowing of the head of user 105(1)." "The computer or other processing systems employed by the present embodiments may be implemented by… custom software (e.g., machine learning software, etc.).").

As to claim 5, Hvidsten discloses wherein the multi-stereo audio output device includes one or more of a smartphone, wireless earbuds, or wired headphones (Col. 2 lines 28-39, Fig. 3. "Headset 110(1)… Audio processor 135 may also convert received audio (via wired or wireless communication interface 125) to analog signals to drive speakers 140(1) and 140(2).").

As to claim 6, Hvidsten discloses wherein the remote device is a computing entity associated with a communication network through which the active, multi-entity audio communication is enabled (Col. 8 lines 20-29, Fig. 3. "A signal processing/operational flow for user 105(1) who is in an ad-hoc communication session with users 105(2) and 105(3). In this example, user 105(2) is a remote video conference participant, and participates through video conference endpoint 305. User 105(3) is local to user 105(1), and participates through wireless network 310.").

As to claim 8, Hvidsten discloses wherein receiving multi-entity audio communication and obtaining orientation information occur in real-time and concurrently (Col. 8 line 63 – Col. 9 line 16, Fig. 3.).
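
Claim 4's inter-aural time and level differences are the classic binaural cues. Below is a minimal sketch of direction-dependent rendering using those two cues: the textbook Woodworth ITD approximation plus a simple level roll-off. This is not Hvidsten's HRTF-based pipeline and not the claimed machine-learned technique, and the head radius and 6 dB maximum ILD are assumed values.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s
HEAD_RADIUS = 0.0875    # m; an assumed average adult head radius

def pan_mono_to_stereo(mono, azimuth_deg, sample_rate=48_000):
    """Render a mono signal as stereo (left, right) for a source at
    azimuth_deg (0 = front, +90 = full right) by delaying and
    attenuating the far ear. Woodworth ITD; crude broadband ILD."""
    az = np.radians(azimuth_deg)
    itd = (HEAD_RADIUS / SPEED_OF_SOUND) * (abs(az) + np.sin(abs(az)))  # seconds
    delay = int(round(itd * sample_rate))
    near = np.asarray(mono, dtype=float)
    far = np.concatenate([np.zeros(delay), near])[: len(near)]  # time difference
    far = far * 10 ** (-(abs(azimuth_deg) / 90) * 6 / 20)       # up to ~6 dB level difference
    # Positive azimuth: source on the right, so the left ear is the far ear.
    return (far, near) if azimuth_deg >= 0 else (near, far)

tone = np.sin(2 * np.pi * 440 * np.arange(4800) / 48_000)
left, right = pan_mono_to_stereo(tone, 45)  # source 45 degrees to the right
```
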
As to claim 9, Hvidsten discloses wherein the first audio information and orientation information associated with the first audio-producing entity are transmitted together in multi-stream data from the first audio-producing entity (Col. 9 lines 13-17, Fig. 3. Audio signal 145(2) and audio perception factors 335(2) are transmitted together.).

As to claim 11, Hvidsten discloses wherein the orientation information is further usable to determine a first rotation of the first audio-producing entity with respect to a relative rotation of the remote device (Col. 7 line 67 – Col. 8 line 3 and Col. 10 lines 47-55. "Six degree of freedom tracking 210 may include functionality for tracking the three dimensions in which headset 110(1) can move and the three directions in which headset 110(1) can rotate.").

As to claim 12, Hvidsten discloses wherein the first rotation of the first audio-producing entity with respect to the relative rotation of the remote device is further usable to determine a difference in elevation and a proximity between the first audio-producing entity and the remote device (Col. 7 line 67 – Col. 8 line 3, Col. 6 lines 6-9 and 40-45. "Headset 110(1) may model or estimate the acoustics of speech from the mouth of user 105(2) to user 105(1) based on the relative azimuth, elevation, and distance of user 105(2) relative to user 105(1).").

As to claim 13, Hvidsten discloses wherein the orientation information includes an orientation of a user's head or ears or an orientation of one or more speakers or exterior housing of the first audio-producing entity or the remote device (Col. 7 line 67 – Col. 8 line 3. "Six degree of freedom tracking 210 may include functionality for tracking the three dimensions in which headset 110(1) can move and the three directions in which headset 110(1) can rotate.").

As to claim 14, Hvidsten discloses receiving video information, and wherein providing the three-dimensional, direction-dependent audio provides video information enabling a display associated with the multi-stereo audio output device to provide video associated with the first or second audio-producing entity (Col. 5 lines 43-61 and Col. 9 lines 20-26, Fig. 3. "User 105(2) may be tracked by headset 110(2) in the remote room (audio perception factors 335(2)), as well as on the display of video conference endpoint 305 in the open office environment. For example, user 105(2) may be tracked on the display by video conference endpoint 305 and/or headset 110(1).").

As to claim 16, it is directed towards substantially the same subject matter as claim 1 and is therefore rejected using the same rationale as claim 1 above.

As to claims 17-20, they are rejected under claim 16 using the same rationale as claims 2-5 above.

As to claim 21, Hvidsten discloses wherein obtaining the orientation information comprises acquiring the information on a physical location of the first audio-producing entity with respect to the remote device based on at least one of Global Positioning System (GPS), cellular positioning techniques, triangulation, or a location-based application (Col. 4 line 58 – Col. 5 line 7. "Headset 110(1) may use any suitable mechanism to track the six degrees of freedom of the head of user 105(1)." "A Global Positioning System (GPS) may be used.").
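
Claims 11-13 involve the rotational half of the six degrees of freedom. For reference, a minimal sketch of how a yaw/pitch/roll head orientation rotates a world-frame source direction into the listener's head frame; the Z-up axis convention and rotation order are assumptions, not taken from Hvidsten or the application.

```python
import numpy as np

def world_to_head(vec_world, yaw, pitch, roll):
    """Rotate a world-frame direction vector into the head frame, given
    head yaw/pitch/roll in radians (intrinsic z-y-x order, assumed)."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])   # yaw about z
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])   # pitch about y
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])   # roll about x
    R = Rz @ Ry @ Rx                     # head orientation in world coordinates
    return R.T @ np.asarray(vec_world)   # inverse rotation: world -> head frame

# A source straight ahead in the world lands on the listener's left
# (+y in an x-forward, y-left frame) after a 90-degree head turn to the right.
print(world_to_head([1.0, 0.0, 0.0], yaw=np.radians(-90), pitch=0.0, roll=0.0))
```
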
As to claim 22, Hvidsten discloses wherein the physical location of the first audio-producing entity with respect to the audio-receiving entity comprises one or more of: a distance from the first audio-producing entity to the remote device; a geographic coordinate difference between the first audio-producing entity and the remote device; an elevation difference between the first audio-producing entity and the remote device; an acceleration difference between the first audio-producing entity and the remote device; a velocity difference between the first audio-producing entity and the remote device; a direction of travel difference between the first audio-producing entity and the remote device; or a facing direction of a first user of the first audio-producing entity relative to the remote device (Col. 4 line 58 – Col. 5 line 7 and Col. 6 lines 6-9. "Headset 110(1) may use any suitable mechanism to track the six degrees of freedom of the head of user 105(1). For example, headset 110(1) may include one or more accelerometers, gyroscopes, and/or magnetometers… As such, headset 110(1) may also/alternatively use ground truth measurements. For outdoor devices (e.g., smart phones, drones, etc.), a Global Positioning System (GPS) may be used." "Headset 110(1) may model or estimate the acoustics of speech from the mouth of user 105(2) to user 105(1) based on the relative azimuth, elevation, and distance of user 105(2) relative to user 105(1).").

As to claim 23, it is rejected under claim 16 using the same rationale as claim 21 above.

Conclusion

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAMES K MOONEY, whose telephone number is (571) 272-2412. The examiner can normally be reached Monday-Friday, 9:00 AM - 5:00 PM EST. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Vivian Chin, can be reached at (571) 272-7848. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov.
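
Claims 21-22 tie the location information to sources such as GPS. A minimal sketch of the kind of location-difference computation they recite (ground distance via the standard haversine formula, plus an elevation difference) follows; the function and its spherical-Earth model are illustrative assumptions, not code from the application or Hvidsten.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius

def gps_offset(lat1, lon1, alt1, lat2, lon2, alt2):
    """Great-circle ground distance (m) and elevation difference (m)
    between two GPS fixes (latitude/longitude in degrees, altitude in m)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    ground_distance = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))  # haversine
    return ground_distance, alt2 - alt1

# Two devices roughly 111 m apart in latitude and 10 m apart in elevation.
print(gps_offset(37.4220, -122.0841, 10.0, 37.4230, -122.0841, 20.0))
```
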
Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (in USA or Canada) or 571-272-1000.

/JAMES K MOONEY/
Primary Examiner, Art Unit 2695

Prosecution Timeline

Jan 05, 2024: Application Filed
Aug 13, 2025: Non-Final Rejection (§102)
Oct 21, 2025: Examiner Interview Summary
Oct 21, 2025: Applicant Interview (Telephonic)
Nov 03, 2025: Response Filed
Feb 17, 2026: Final Rejection (§102), the current action

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12598436: AUDIO SIGNAL COMPENSATION METHOD AND APPARATUS, EARPHONE AND STORAGE MEDIUM
Granted Apr 07, 2026 (2y 5m to grant)

Patent 12598422: KALMAN-FILTER-BASED ADAPTIVE MICROPHONE ARRAY NOISE REDUCTION METHOD AND APPARATUS
Granted Apr 07, 2026 (2y 5m to grant)

Patent 12593193: DETERMINING SPATIAL IMPULSE RESPONSE VIA ACOUSTIC SCRAMBLING
Granted Mar 31, 2026 (2y 5m to grant)

Patent 12587785: ADAPTIVE FILTERBANKS USING SCALE-DEPENDENT NONLINEARITY FOR PSYCHOACOUSTIC FREQUENCY RANGE EXTENSION
Granted Mar 24, 2026 (2y 5m to grant)

Patent 12581234: HOWLING SUPPRESSION DEVICE, HOWLING SUPPRESSION METHOD, AND NON-TRANSITORY COMPUTER READABLE RECORDING MEDIUM STORING HOWLING SUPPRESSION PROGRAM
Granted Mar 17, 2026 (2y 5m to grant)
Based on this examiner's five most recent grants in similar technology. Study what changed in each case to get past this examiner.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 76%
Grant Probability With Interview: 98% (+22.2%)
Median Time to Grant: 2y 3m
PTA Risk: Moderate

Based on 695 resolved cases by this examiner. Grant probability is derived from the career allow rate.
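
The headline numbers reduce to simple ratios over the examiner's resolved cases. A minimal sketch of how a career allow rate and an interview lift could be computed is below; the ResolvedCase fields and this exact methodology are assumptions about how the dashboard derives its figures.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ResolvedCase:
    granted: bool
    had_interview: bool

def allow_rate(cases):
    """Fraction of resolved cases that ended in a grant."""
    return sum(c.granted for c in cases) / len(cases)

def interview_lift(cases):
    """Allow-rate difference between cases with and without an interview."""
    with_iv = [c for c in cases if c.had_interview]
    without_iv = [c for c in cases if not c.had_interview]
    return allow_rate(with_iv) - allow_rate(without_iv)

# Toy data: 9/10 interviewed cases granted vs. 7/10 without an interview.
cases = ([ResolvedCase(True, True)] * 9 + [ResolvedCase(False, True)]
         + [ResolvedCase(True, False)] * 7 + [ResolvedCase(False, False)] * 3)
print(f"allow rate: {allow_rate(cases):.0%}, interview lift: {interview_lift(cases):+.1%}")
```
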
