Prosecution Insights
Last updated: April 19, 2026
Application No. 18/589,388

METHODS AND SYSTEMS FOR ADDING REAL-WORLD SOUNDS TO VIRTUAL REALITY SCENES

Status: Final Rejection (§103)
Filed: Feb 27, 2024
Examiner: PARK, SANGHYUK
Art Unit: 2623
Tech Center: 2600 — Communications
Assignee: Sony Interactive Entertainment Inc.
OA Round: 6 (Final)
Grant Probability: 71% (Favorable)
Expected OA Rounds: 7-8
Time to Grant: 2y 6m
With Interview: 88%

Examiner Intelligence

Career Allow Rate: 71% (509 granted / 717 resolved; +9.0% vs TC avg) — above average
Interview Lift: +16.5% higher allow rate for resolved cases with an interview — a strong lift
Typical Timeline: 2y 6m avg prosecution • 25 currently pending
Career History: 742 total applications across all art units
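
The headline numbers above reduce to simple ratios over the examiner's resolved cases. A minimal Python sketch, assuming a hypothetical per-case record with "granted" and "had_interview" flags (the page does not show the actual underlying schema):

def allow_rate(cases):
    # Share of resolved cases that ended in a grant.
    granted = sum(1 for c in cases if c["granted"])
    return granted / len(cases) if cases else 0.0

def interview_lift(cases):
    # Allow-rate gap between resolved cases with and without an interview.
    with_iv = [c for c in cases if c["had_interview"]]
    without_iv = [c for c in cases if not c["had_interview"]]
    return allow_rate(with_iv) - allow_rate(without_iv)

print(f"{509 / 717:.1%}")  # 71.0% career allow rate, as shown above

The +16.5% interview lift is the same arithmetic applied to the with-interview and without-interview subsets of the 717 resolved cases.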

Statute-Specific Performance

§101: 0.8% (-39.2% vs TC avg)
§103: 54.1% (+14.1% vs TC avg)
§102: 25.9% (-14.1% vs TC avg)
§112: 16.4% (-23.6% vs TC avg)
Tech Center average is an estimate • Based on career data from 717 resolved cases
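
Each "vs TC avg" delta above is consistent with a flat 40.0% baseline: subtracting 40.0 from every examiner percentage reproduces the stated delta exactly, which appears to be the single Tech Center average estimate the original chart plotted. A short sketch of that arithmetic (the 40% baseline is inferred from the deltas, not stated on the page):

examiner_share = {"§101": 0.8, "§103": 54.1, "§102": 25.9, "§112": 16.4}
TC_BASELINE = 40.0  # inferred: 54.1 - 40.0 = +14.1, 0.8 - 40.0 = -39.2, etc.

for statute, share in examiner_share.items():
    print(f"{statute}: {share}% ({share - TC_BASELINE:+.1f}% vs TC avg)")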

Office Action

§103
Detailed Action

Response to Amendment

The amendment filed on 11/12/2025 has been entered and considered by the examiner.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-5, 7-14, 16-18 and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Fullam (PGPUB 2016/0080874 A1) in view of Salter et al. (PGPUB 2014/0375683 A1).

Independent Claims

As to claim 1, Fullam (Figs. 1-3) teaches a method, comprising: presenting a virtual reality scene (augmented reality) on a display (¶ 30: HMD device 400 with augmented reality); outputting audio (i.e. output amplified or attenuated audio) related to a virtual reality space (augmented reality) via headphones (headphone, ¶ 11) while a noise canceling function (noise cancellation) of the headphones is active for filtering out sounds from a real-world space (¶ 17, ¶ 26: i.e. fully cancelling some audio signals); receiving sensor data (i.e. detected audio, visual environment detection, and gaze detection) from one or more sensors (¶ 27: microphone array 418, ¶ 39: optical sensor 414, ¶ 32: i.e. image sensor for acquiring image of user's eyes) in the real-world space in which the display is located (Fig. 4); identifying an object location of an object in the real-world space that produces a sound using the sensor data (¶ 14: i.e. detect the second user is the gaze target, ¶ 39: i.e. optical sensor 414 detects physical object within the field of view, ¶ 41: i.e. depth map using infrared illumination from 416); and generating a media cue (sound bars 204, 206, 212, 214) in the virtual reality scene presented in the display, wherein the media cue is presented in a virtual location (i.e. augmented next to the objects 106, 108) that is correlated to the object location of the object in the real-world space, and (ii) audio data (amplified or attenuated audio signal) that is generated to approximate the sound that is captured from the real-world space (i.e. beamforming biased in the direction of the gaze direction, ¶ 25) and canceled by the noise canceling function (¶ 25: i.e. fully cancel some signals, ¶ 17: i.e. noise cancellation), wherein the audio data is mixed with audio related to the virtual reality space when output via said headphones (¶ 42: i.e. 3D model system generates virtual environment that models physical environment surrounding the user, and ¶ 19: i.e. the sound level of the real sound is processed by being amplified and attenuated and is output according to the user's gaze; the processed audio is produced according to the 3D model based on the user's gaze direction).

Fullam's Fig. 2 shows the real-world space 202 visually augmented with sound bars 204, 206 indicating the volume status of the corresponding real-world sound, but Fullam does not specifically teach whether the sound bars shown in Fig. 2 are actually displayed on the display of the augmented reality system or appear only for illustrative purposes.

Salter (Fig. 4) teaches the media cue includes (i) a visual element (markers 410, 412, 414 such as tendril) that is presented in a virtual location (Fig. 4) that is correlated to the object location (¶ 25: i.e. represents location of out-of-view object) of the object in the real-world space, and (ii) audio data that is generated to approximate the sound that is captured from the real-world space (¶ 40: i.e. audio cue alerts the user by creating a tendril, ¶ 46: i.e. visual and audio indications, and ¶ 53: i.e. sound with varying volume, claim 17: i.e. sound indicates creation of tendril), wherein the visual element is associated with a match to at least one of an identity (i.e. restaurant in real world), a type (i.e. real object such as restaurant, or information to display such as distance), or a behavior of the object (i.e. velocity or distance) and depicts the object as at least one of an icon (tendril icon), a symbol (logo), an avatar (i.e. game object such as enemy), or a graphic (telephone number, menu) corresponding to the match (¶ 14: i.e. icon 153 may be any suitable information, ¶ 28: i.e. visual indicator varies based on properties of the object, ¶ 42: i.e. animation 706 is shown when velocity is high, ¶ 51), wherein interactivity related to the virtual reality scene is active when the media cue is presented (¶ 26: i.e. marker 412 continues to be displayed after object 416 comes into the field of view in order to direct the user's attention to object 416).

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Salter's method of presenting visual indicators of a real object's audio at the periphery of the display into Fullam's HMD, so as to prevent the user from missing out on the opportunity to view and interact with augmented reality information (¶ 10).

As to claim 10, Fullam (Figs. 1-3) teaches non-transitory computer-readable media (storage machine 504) storing instructions that, when executed by one or more processors (processors, ¶ 51), cause the one or more processors to perform a method comprising: presenting a virtual reality scene (augmented reality) on a display (¶ 30: HMD device 400 with augmented reality); outputting audio (i.e. output amplified or attenuated audio) related to a virtual reality space via headphones (headphone, ¶ 11) while a noise canceling function (noise cancellation) of the headphones is active for filtering out sounds from a real-world space (¶ 17, ¶ 26: i.e. fully cancelling some audio signals); receiving sensor data (i.e. detected audio, visual environment detection, and gaze detection) from one or more sensors (¶ 27: microphone array 418, ¶ 39: optical sensor 414, ¶ 32: i.e. image sensor for acquiring image of user's eyes) in the real-world space in which the display is located (Fig. 4); identifying an object location of an object in the real-world space that produces a sound using the sensor data (¶ 14: i.e. detect the second user is the gaze target, ¶ 39: i.e. optical sensor 414 detects physical object within the field of view, ¶ 41: i.e. depth map using infrared illumination from 416); and generating a media cue (sound bars 204, 206, 212, 214) in the virtual reality scene presented in the display, wherein the media cue is presented in a virtual location (i.e. augmented next to the objects 106, 108) that is correlated to the object location of the object in the real-world space, and (ii) audio data (amplified or attenuated audio signal) that is generated to approximate the sound that is captured from the real-world space (i.e. beamforming biased in the direction of the gaze direction, ¶ 25) and canceled by the noise canceling function (¶ 25: i.e. fully cancel some signals, ¶ 17: i.e. noise cancellation), wherein the audio data is mixed with audio related to the virtual reality space when output via said headphones (¶ 42: i.e. 3D model system generates virtual environment that models physical environment surrounding the user, and ¶ 19: i.e. the sound level of the real sound is processed by being amplified and attenuated and is output according to the user's gaze; the processed audio is produced according to the 3D model based on the user's gaze direction).

Fullam's Fig. 2 shows the real-world space 202 visually augmented with sound bars 204, 206 indicating the volume status of the corresponding real-world sound, but Fullam does not specifically teach whether the sound bars shown in Fig. 2 are actually displayed on the display of the augmented reality system or appear only for illustrative purposes.

Salter (Fig. 4) teaches the media cue includes (i) a visual element (markers 410, 412, 414 such as tendril) that is presented in a virtual location (Fig. 4) that is correlated to the object location (¶ 25: i.e. represents location of out-of-view object) of the object in the real-world space, and (ii) audio data that is generated to approximate the sound that is captured from the real-world space (¶ 40: i.e. audio cue alerts the user by creating a tendril, ¶ 46: i.e. visual and audio indications, and ¶ 53: i.e. sound with varying volume, claim 17: i.e. sound indicates creation of tendril), wherein the visual element is associated with a match to at least one of an identity (i.e. restaurant in real world), a type (i.e. real object such as restaurant, or information to display such as distance), or a behavior of the object (i.e. velocity or distance) and depicts the object as at least one of an icon (tendril icon), a symbol (logo), an avatar (i.e. game object such as enemy), or a graphic (telephone number, menu) corresponding to the match (¶ 14: i.e. icon 153 may be any suitable information, ¶ 28: i.e. visual indicator varies based on properties of the object, ¶ 42: i.e. animation 706 is shown when velocity is high, ¶ 51), wherein interactivity related to the virtual reality scene is active when the media cue is presented (¶ 26: i.e. marker 412 continues to be displayed after object 416 comes into the field of view in order to direct the user's attention to object 416).

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Salter's method of presenting visual indicators of a real object's audio at the periphery of the display into Fullam's HMD, so as to prevent the user from missing out on the opportunity to view and interact with augmented reality information (¶ 10).

As to claim 12, Fullam (Figs. 1-3) teaches a method, comprising: presenting a virtual reality scene (augmented reality) on a display (¶ 30: HMD device 400 with augmented reality); outputting audio (i.e. output amplified or attenuated audio) related to a virtual reality space (augmented reality) via headphones (headphone, ¶ 11) while a noise canceling function (noise cancellation) of the headphones is active for filtering out sounds from a real-world space (¶ 17, ¶ 26: i.e. fully cancelling some audio signals); receiving sensor data (i.e. detected audio, visual environment detection, and gaze detection) from one or more sensors (¶ 27: microphone array 418, ¶ 39: optical sensor 414, ¶ 32: i.e. image sensor for acquiring image of user's eyes) in the real-world space in which the display is located (Fig. 4); identifying an object location of an object in the real-world space that produces a sound using the sensor data (¶ 14: i.e. detect the second user is the gaze target, ¶ 39: i.e. optical sensor 414 detects physical object within the field of view, ¶ 41: i.e. depth map using infrared illumination from 416); and generating a media cue (sound bars 204, 206, 212, 214) in the virtual reality scene presented in the display, wherein the media cue is presented in a virtual location (i.e. augmented next to the objects 106, 108) that is correlated to the object location of the object in the real-world space (¶ 18: i.e. sound bar is depicted near the source of the audio corresponding to the object location, Fig. 2).

Fullam does not specifically teach that interactivity related to the virtual reality scene is active when the media cue is presented. Salter (Fig. 4) teaches interactivity related to the virtual reality scene is active when the media cue is presented (¶ 26: i.e. marker 412 continues to be displayed after object 416 comes into the field of view in order to direct the user's attention to object 416). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Salter's method of presenting visual indicators of a real object's audio at the periphery of the display into Fullam's HMD, so as to prevent the user from missing out on the opportunity to view and interact with augmented reality information (¶ 10).

Dependent Claims

As to claim 2, Fullam (Fig. 2) teaches wherein the virtual location of the media cue (sound bars 204, 206, 212 and 214) is presented relative to a point of view (POV) of a user viewing the display (¶ 19: i.e. amplify or attenuate the sounds based on the gaze target detection, ¶ 58: i.e. visually represent changes in the underlying data).

As to claim 3, Fullam (Fig. 2) teaches wherein the media cue is additionally represented as image data (sound bars 204, 206, 212, 214) for conveying the sound, the image data providing an indicator of direction of the sound relative to a point of view (POV) of a user viewing the display (Fig. 2: i.e. sound bar is presented next to the origin of the audio, including second user 106 and television 108).

As to claims 4 and 13, Fullam teaches the method of claim 1, but does not specifically teach interactivity. Salter (Fig. 1) teaches wherein interactivity and audio related to the virtual reality scene are active when the media cue is presented (¶ 24: i.e. markers 406-414 are displayed within the field of view 102 based on the real object, ¶ 40: i.e. audio cue, ¶ 41: i.e. visual indicator). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Salter's method of presenting visual indicators of a real object's audio at the periphery of the display into Fullam's HMD, so as to prevent the user from missing out on the opportunity to view and interact with augmented reality information (¶ 10).

As to claim 5, Fullam (Fig. 2) teaches wherein the media cue is additionally represented as image data (sound bars 204, 206, 212 and 214) that conveys an identity (i.e. sound bar) of the object in a graphical form (¶ 58: i.e. visually represent changes in the underlying data).

As to claims 7 and 16, Fullam teaches the method of claim 1, but does not specifically teach dynamic changes based on changes or movements of the objects in the real-world space. Salter (Fig. 4) teaches wherein the virtual location of the media cue is dynamically changed based on changes or movements of the object in the real-world space (¶ 26: i.e. marker 418 associated with object 402 is at the left edge of the display according to the user's movement; marker 412 is moved into the center region of the display as object 416 comes within the field of view of the user as shown in Fig. 4). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Salter's method of presenting visual indicators of a real object's audio at the periphery of the display into Fullam's HMD, so as to prevent the user from missing out on the opportunity to view and interact with augmented reality information (¶ 10).

As to claims 8 and 17, Fullam teaches the method of claim 1, but does not specifically teach that the virtual location of the media cue moves. Salter (Fig. 4) teaches wherein the virtual location of the media cue moves to represent movements of the object in the real-world space, updating a correlation of the object location of the object in the real-world space relative to the point of view of a user viewing the display (¶ 26: i.e. marker 418 associated with object 402 is at the left edge of the display according to the user's movement; marker 412 is moved into the center region of the display as object 416 comes within the field of view of the user as shown in Fig. 4). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Salter's method of presenting visual indicators of a real object's audio at the periphery of the display into Fullam's HMD, so as to prevent the user from missing out on the opportunity to view and interact with augmented reality information (¶ 10).

As to claim 9, Fullam (Fig. 2) teaches wherein the audio data is a computer-generated sound that represents the sound that is captured from the real-world space (i.e. attenuated or amplified sound captured by the microphone array to dynamically adjust the volume based on the user's gaze direction/detection) (¶¶ 25, 26).

As to claim 11, Fullam (Fig. 2) teaches, as to the computer-readable media of claim 10, wherein the virtual location of the media cue (sound bars 204, 206, 212 and 214) is presented relative to a point of view (POV) of a user viewing the display (¶ 19: i.e. amplify or attenuate the sounds based on the gaze target detection, ¶ 58: i.e. visually represent changes in the underlying data).

As to claim 12, Fullam (Fig. 2) teaches, as to the computer-readable media of claim 10, wherein the media cue is additionally represented as image data (sound bars 204, 206, 212, 214) for conveying the sound, the image data providing an indicator of direction of the sound relative to a point of view (POV) of a user viewing the display (Fig. 2: i.e. sound bar is presented next to the origin of the audio, including second user 106 and television 108).

As to claim 14, Fullam (Fig. 2) teaches, as to the computer-readable media of claim 10, wherein the media cue is additionally represented as image data (sound bars 204, 206, 212 and 214) that conveys an identity (i.e. sound bar) of the object in a graphical form (¶ 58: i.e. visually represent changes in the underlying data).

As to claim 18, Fullam (Fig. 2) teaches wherein the audio data is a computer-generated sound that represents the sound that is captured from the real-world space (i.e. attenuated or amplified sound captured by the microphone array to dynamically adjust the volume based on the user's gaze direction/detection) (¶¶ 25, 26).

Claim(s) 6 and 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Fullam and Salter as applied to claim 1 above, and further in view of Kohler et al. (PGPUB 2017/0061693 A1).

As to claims 6 and 15, Fullam and Salter teach the method of claims 1 and 15, but do not specifically teach wherein the audio related to the virtual reality space is native audio from the virtual reality scene. Kohler (Figs. 1, 2) teaches wherein the audio related to the virtual reality space is native audio from the virtual reality scene (¶ 104: i.e. audio stream including a virtual audio layer and a real-world audio layer). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Kohler's HMD device with virtual and real audio streams into Fullam's HMD as modified with the teaching of Salter, so as to provide an immersive, realistic mixed reality experience (¶ 27).

Allowable Subject Matter

Claim 19 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. The following is a statement of reasons for the indication of allowable subject matter: Claim 19 recites the limitation, "wherein the audio of the media cue comprises a computer-generated proxy sound that is selected or generated to contextually represent the identity of the object in the real-world space, such that the proxy sound is presented in place of the actual sound captured from the real-world space after the actual sound has been canceled by the noise canceling function". Examiner conducted a search to find prior art that teaches this limitation alone or in combination. However, none of the prior art specifically teaches the media cue being a computer-generated proxy for sound that has been noise canceled, in view of all of the limitations required by claims 1 or 10.

Response to Arguments

Applicant's arguments filed 11/12/2025 have been fully considered but they are not persuasive. Applicant has amended claims 1 and 10 to recite the new limitation, "wherein interactivity related to the virtual reality scene is active when the media cue is presented". Applicant argues that this limitation is not taught by the Fullam and Salter references in combination. However, Examiner respectfully disagrees.

The operation of markers 406, 408, 410 and 412 must be continuous. The markers are displayed to guide the wearer of the augmented reality device through the real world and its real-world objects. In other words, the markers, as a functional media cue, must be presented when interactivity is active. ¶ 26 describes directing the user's attention to object 416 by continuously displaying marker 412. The function of the augmented reality device directing the user's attention is considered the interactivity, and continuously displaying the marker is considered continuously presenting the media cue. Therefore, Examiner considers that the Fullam and Salter combination still teaches the new limitation presented in claims 1 and 10.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Inquiry

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SANGHYUK PARK, whose telephone number is (571) 270-7359. The examiner can normally be reached 10:00 AM - 6:00 PM, M-F. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Chanh Nguyen, can be reached at (571) 272-7772. The fax phone number for the organization where this application or proceeding is assigned is (571) 273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at (866) 217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call (800) 786-9199 (IN USA OR CANADA) or (571) 272-1000.

/SANGHYUK PARK/
Primary Examiner, Art Unit 2623
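
For orientation, the claim 1 method mapped in this rejection reduces to a short per-frame pipeline. The Python sketch below is purely illustrative: every function and attribute name in it is hypothetical, paraphrasing the claim language as quoted above rather than anything from Fullam, Salter, or the application's actual disclosure.

# Hypothetical stubs standing in for the recited components.
def locate_sound_source(sensor_data):
    # Placeholder for mic-array beamforming plus optical/depth localization.
    return (0.0, 0.0, 0.0), b""         # (object location, captured sound)

def map_to_virtual(object_location):
    return object_location              # real-world -> scene coordinates

def choose_visual(sound):
    return "icon"                       # icon / symbol / avatar / graphic match

def approximate(sound):
    return sound                        # audio data approximating the canceled sound

def mix(vr_audio, cue_audio):
    return vr_audio                     # placeholder mixer

def frame_step(vr_scene, display, headphones, sensors):
    display.show(vr_scene.frame())             # present the VR scene on a display
    headphones.noise_canceling = True          # filter out real-world sounds
    vr_audio = vr_scene.audio()                # audio related to the VR space
    sensor_data = [s.read() for s in sensors]  # mic array, optical, depth sensors
    obj_loc, sound = locate_sound_source(sensor_data)
    # Media cue part (i): a visual element at a virtual location correlated
    # to the real-world object location.
    vr_scene.place(choose_visual(sound), at=map_to_virtual(obj_loc))
    # Media cue part (ii): audio approximating the canceled real-world sound,
    # mixed with the VR-space audio on output via the headphones.
    headphones.play(mix(vr_audio, approximate(sound)))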

Prosecution Timeline

Feb 27, 2024: Application Filed
Sep 20, 2024: Non-Final Rejection — §103
Oct 25, 2024: Response Filed
Nov 02, 2024: Final Rejection — §103
Nov 25, 2024: Request for Continued Examination
Dec 04, 2024: Response after Non-Final Action
Dec 14, 2024: Non-Final Rejection — §103
Mar 28, 2025: Applicant Interview (Telephonic)
Apr 02, 2025: Examiner Interview Summary
Apr 21, 2025: Response Filed
May 17, 2025: Final Rejection — §103
Jul 31, 2025: Request for Continued Examination
Aug 01, 2025: Response after Non-Final Action
Aug 09, 2025: Non-Final Rejection — §103
Nov 12, 2025: Response Filed
Mar 12, 2026: Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602134
ELECTRONIC DEVICE
2y 5m to grant • Granted Apr 14, 2026
Patent 12603055
DISPLAY DEVICE INCLUDING A SWEEP DRIVER THAT PROVIDES A SWEEP SIGNAL, AND ELECTRONIC DEVICE INCLUDING THE DISPLAY DEVICE
2y 5m to grant • Granted Apr 14, 2026
Patent 12594141
SYSTEMS, METHODS, AND MEDIA FOR PRESENTING BIOPHYSICAL SIMULATIONS IN AN INTERACTIVE MIXED REALITY ENVIRONMENT
2y 5m to grant • Granted Apr 07, 2026
Patent 12591322
TOUCH INPUT SYSTEM INCLUDING PEN AND CONTROLLER
2y 5m to grant • Granted Mar 31, 2026
Patent 12592207
GATE LINE DRIVING CIRCUIT WITH TOP GATE AND BOTTOM GATE
2y 5m to grant • Granted Mar 31, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 7-8
Grant Probability: 71%
With Interview: 88% (+16.5%)
Median Time to Grant: 2y 6m
PTA Risk: High
Based on 717 resolved cases by this examiner. Grant probability derived from career allow rate.
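
The projection figures follow from the career statistics above. A sketch of the presumed arithmetic; treating the interview lift as a simple additive bump is an assumption about how the 88% figure is produced, though the numbers line up:

base = round(509 / 717, 2)     # 0.71 career allow rate -> 71% grant probability
lift = 0.165                   # +16.5% interview lift from the examiner stats
print(f"{base:.0%} baseline")             # 71%
print(f"{base + lift:.0%} w/ interview")  # 0.875 -> 88%, matching the figure shown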

Free tier: 3 strategy analyses per month