Prosecution Insights
Last updated: April 19, 2026
Application No. 18/611,281

DEVICES, METHODS, AND GRAPHICAL USER INTERFACES FOR CAPTURING AND VIEWING IMMERSIVE MEDIA

Non-Final OA (§103)

Filed: Mar 20, 2024
Examiner: HODGES, SUSAN E
Art Unit: 2425
Tech Center: 2400 (Computer Networks)
Assignee: Apple Inc.
OA Round: 3 (Non-Final)

Grant Probability: 67% (Favorable)
Projected OA Rounds: 3-4
Projected Time to Grant: 2y 4m
Grant Probability with Interview: 81%

Examiner Intelligence

Career Allow Rate: 67% (250 granted / 375 resolved), above average, +8.7% vs TC avg
Interview Lift: +14.4% (moderate); allowance rate with an interview vs. without, among resolved cases with an interview
Avg Prosecution: 2y 4m (typical timeline); 31 applications currently pending
Total Applications: 406 across all art units (career history)
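
For readers who want to sanity-check the card figures, the arithmetic they imply is sketched below. This is a minimal sketch assuming the usual definitions; the with/without-interview split counts are not shown in the card, so the helper takes them as inputs rather than inventing numbers.

    def allowance_rate(granted: int, resolved: int) -> float:
        """Share of resolved applications that ended in a grant."""
        return granted / resolved

    career_rate = allowance_rate(250, 375)   # ~0.667, displayed as 67%
    implied_tc_avg = career_rate - 0.087     # "+8.7% vs TC avg" implies a TC baseline near 58%

    # Interview lift: allowance rate among resolved cases that had an examiner
    # interview minus the rate among those that did not (reported as +14.4%).
    def interview_lift(with_interview: tuple[int, int],
                       without_interview: tuple[int, int]) -> float:
        return allowance_rate(*with_interview) - allowance_rate(*without_interview)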

Statute-Specific Performance

§101: 6.0% (-34.0% vs TC avg)
§103: 48.7% (+8.7% vs TC avg)
§102: 20.9% (-19.1% vs TC avg)
§112: 22.6% (-17.4% vs TC avg)

Tech Center average (estimate) shown for comparison • Based on career data from 375 resolved cases
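
Each per-statute delta is simply the listed rate minus the estimated Tech Center average for that statute; back-solving the figures above, every row resolves to the same ~40% baseline. A minimal sketch of that check, assuming this reading of the card:

    # Rates and deltas as listed above; the implied TC baseline is rate - delta.
    statute_rates = {"101": 0.060, "103": 0.487, "102": 0.209, "112": 0.226}
    deltas_vs_tc  = {"101": -0.340, "103": +0.087, "102": -0.191, "112": -0.174}

    implied_tc_avg = {s: round(statute_rates[s] - deltas_vs_tc[s], 3) for s in statute_rates}
    # {'101': 0.4, '103': 0.4, '102': 0.4, '112': 0.4} -> a single ~40% comparison baseline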

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on December 19, 2025 has been entered.

Information Disclosure Statement

The information disclosure statements (IDS) were submitted on November 6, 2025, December 19, 2025 and January 6, 2026. The submissions are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the Examiner.

Applicant's Response to Official Action

The arguments filed on December 19, 2025 in response to the Final Office Action mailed on November 10, 2025 have been made of record. Claims 1, 3, 32 and 33 have been amended. Claims 2, 34 and 35 have been cancelled. Claims 1 and 3 - 33 are currently pending.

Response to Arguments

Applicant's arguments (see pages 12 – 17) with respect to the rejection of Claims 1 – 10, 16 – 18, 21 – 25, 28 and 30 - 33 under 35 U.S.C. 103 as being unpatentable over Frappiea (US 2018/0001198 A1) in view of Raffle (US 8,854,452 B1) have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground of rejection is made in view of a newly discovered reference directed to the limitation "after capturing the immersive media item, displaying, via the display generation, the immersive media item as a virtual object," as claimed in amended Claims 1, 32 and 33.

Claim Objections

Claims 1, 32 and 33 recite the limitation "after capturing the immersive media item, displaying, via the display generation". For proper antecedent basis, the Examiner has interpreted "after capturing the immersive media item, displaying, via the display generation" to mean "after capturing the immersive media item, displaying, via the display generation component". Appropriate correction is required.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3, 6 – 8, 10, 16, 22 and 30 - 33 are rejected under 35 U.S.C.
103 as being unpatentable over KIM et al. (US 2023/0222746 A) referred to as KIM hereinafter, in view of Yasuda et al. (US 2010/0026787 A1) referred to as Yasuda hereinafter, and in further view of Reif (US 2019/0020843 A1) referred to as Reif hereinafter. Regarding Claim 1, Kim teaches a head-mounted display device (Fig.1 Par. [0035], the electronic device 100 of FIGS. 1A and 1B) is a device that is capable of expressing an ‘augmented reality’, and may generally include augmented reality glasses provided in the form of glasses that a user wears on his or her face, a head mounted display (HMD) apparatus that a user wears on the his or her head, an augmented reality helmet, and the like) that includes a display generation component (Par. [0039], an electronic device 100 displays an augmented reality scene 200 via a display 101 (i.e. display generation component)), a plurality of sensors that includes at least a first camera (Fig. 2, camera 102, Par. [0040], the electronic device 100 may shoot the surrounding area of the electronic device 100 via a camera (not illustrated) (i.e. first camera) and Par. [0158] The camera module 1280 may capture a still image or moving images and may include one or more lenses, image sensors (i.e. plurality of sensors), image signal processors, or flashes, and Par. [0154] the sensor module 1276 (i.e. plurality of sensors) may include, for example, a gesture sensor, a gyro sensor, an atmospheric pressure sensor, a magnetic sensor, an acceleration sensor, a grip sensor, a proximity sensor, a color sensor, an infrared (IR) sensor, a biometric sensor, a temperature sensor, a humidity sensor, or an illuminance sensor), and a hardware input device (Fig. 2, Par. [0055], the user input unit 105 may be a device that a user uses to input data for controlling an augmented reality device (e.g., the electronic device 100)), the head-mounted display device comprising: one or more processors (Fig. 2, Processor 104); and memory storing one or more programs configured to be executed by the one or more processors (Fig. 2, Par. [0053], the memory 103 may store a program to be executed by the processor 104), the one or more programs including instructions for: detecting an activation of the hardware input device (Fig. 5, Par. [0077], in operation 331, the processor 104 of the electronic device 100 may receive a user input (i.e. activation of hardware input device) to command capturing an augmented reality scene. the user input may be a touch input, for example, a touch & hold input that keeps touching during a predetermined period of time); in response to detecting the activation of the hardware input device, capturing immersive media (Par. [0077], the processor 104 of the electronic device 100 may receive a user input to command capturing an augmented reality scene 200, where in Par. [0034], An ‘augmented reality scene’ is a scene that shows a virtual image together (i.e. immersive media) in a physical environment space of the real world, or that shows a real object and a virtual object together (i.e. immersive media), as illustrated in Fig. 1B ) that corresponds to a viewpoint from a perspective of the head-mounted display device (Par. [0040], the image 110 obtained via the camera may include a real scene that a user views via the electronic device 100 (i.e. viewpoint from perspective of HMD). Par. 
[0056], the user input unit 105 may shoot a surrounding area of the electronic device 100 using the camera 102), wherein capturing immersive media generate an immersive media item (Par. [0041], based on the image 110 obtained via the camera, the electronic device 100 may display an augmented reality scene 200 (i.e. immersive media item) in the display 101. The augmented reality scene 200 may include a scene that shows a virtual image together in a physical environment or space of the real world (i.e. immersive media), or a scene that shows a real object and a virtual object together. FIG. 1B is an augmented reality scene 200 displayed via the display 101) that, when viewed via the display generation component of the head-mounted display device (Par. [0063] in operation 310, the processor 104 of the electronic device 100 may obtain an image via the camera 102 and control the display 101 to display an augmented reality scene based on an image obtained via the camera 102); capturing immersive media that corresponds to the viewpoint from the perspective of the head-mounted display device (Par. [0040], the image 110 obtained via the camera may include a real scene that a user views via the electronic device 100 (i.e. viewpoint from perspective of HMD)); and after capturing the immersive media item, displaying, via the display generation (Par. [0063], in operation 310, the processor 104 of the electronic device 100 may obtain an image via the camera 102. The processor 104 may control the display 101 to display an augmented reality scene (i.e. immersive media item) based on an image obtained via the camera 102), the immersive media item (Par. [0041], The augmented reality scene 200 may include a scene that shows a virtual image together in a physical environment or space of the real world, or a scene that shows a real object and a virtual object together. FIG. 1B is an augmented reality scene 200 displayed via the display 101), wherein: the display generation component includes a first display generation component that corresponds to the right eye of the user and a second display generation component that corresponds to the left eye of the user (Par. [0051], the display 101 may provide an augmented reality (AR) image. The display 101 may include a light guide panel and an optical engine. The light guide panel may be formed of a transparent material of which a part of the rear side is shown when a user puts on (i.e. corresponding to the right and left eyes as illustrated in Fig. 1A) the electronic device 100. When the light guide panel may be provided in a single flat panel structure or multi-flat panel structure (i.e. first and second display generation component) made of a transparent material in which light is reflected inside the light guide panel and is propagated. The light guide panel is disposed to face an exit surface of the optical engine, and may receive input of light of a virtual image projected from the optical engine); and the immersive media item, when displayed via the display generation component of the head-mounted display device, presents content (Par. [0041], based on the image 110 obtained via the camera, the electronic device 100 may display an augmented reality scene 200 in the display 101. The augmented reality scene 200 may include a scene that shows a virtual image together in a physical environment or space of the real world, or a scene that shows a real object and a virtual object together. FIG. 1B is an augmented reality scene 200 displayed (i.e. 
presents content) via the display 101) based on the data captured with the camera to the display generation component (Par. [0063], in operation 310, the processor 104 of the electronic device 100 may obtain an image via the camera 102. The processor 104 may control the display 101 to display an augmented reality scene (i.e. immersive media item) based on an image (i.e. based on data captured) obtained via the camera 102). While KIM teaches in Par. [0158] that the camera module 1280 may include one or more lenses, image sensors, image signal processors, or flashes and further in Par. [0154] that the sensor module 1276 has a plurality of sensors, KIM does not specifically teach a first camera corresponding to a viewpoint of a right eye of a user and second camera corresponding to a viewpoint of a left eye of a user and combining the data from both cameras for display on the display device, and after capturing the immersive media item, displaying, via the display generation the immersive media item as a virtual object. However, Yasuda teaches capturing immersive media includes combining data obtained by two or more sensors of the plurality of sensors to generate an immersive media item that, when viewed (Par. [0062], generating a composite image (i.e. immersive media item) for providing an observer (i.e. when viewed) with a sense of virtual reality or mixed reality (i.e. immersive media)) via the display generation component of the head-mounted display device (Fig.1, head mounted image-sensing display device 10, Par. [0057] The composite image for the right eye and the composite image for the left eye are transferred to the video output units 21R and 21L to be displayed on the LCD modules 11R and 11L of the display units 13R and 13L via the video input units 14R and 14L (i.e. display generation component)) appears three-dimensional (Fig. 4, Par. [0034] The parallax image generating units 22R and 22L generate images (parallax images) of a virtual object for the right eye and left eye based on the input image sensing parameters and three-dimensional model information of the virtual object); the two or more sensors of the plurality of sensors includes a second camera different from the first camera (Fig. 1, Par. [0027], A head mounted image-sensing display device 10 has a pair of optical units consisting of a right eye optical unit 110R (i.e. first camera) and a left eye optical unit 110L (i.e. second camera, different from first camera) that correspond to a right eye 100R and a left eye 100L of an observer, where the pair of image-sensing units 18R and 18L that are stereoscopic image-sensing devices have image sensors 19R and 19L (i.e. two sensors), Par. [0029]); the first camera is oriented to capture content corresponding to a viewpoint of a right eye of a user of the head-mounted display device (Par. [0027], a right eye optical unit 110R (i.e. first camera) that corresponds to a right eye 100R of an observer); the second camera is oriented to capture content corresponding to a viewpoint of a left eye of the user of the head-mounted display device (Par. [0027], a left eye optical unit 110L (i.e. second camera) that corresponds to a left eye 100L of an observer); capturing immersive media that corresponds to the viewpoint from the perspective of the head-mounted display device (Fig. 2. viewpoint of HMD, Fig. 4, Par. [0016], fusion between an image of a virtual object and an image of a physical space represented by the MR system (i.e. 
immersive media)) includes capturing data with the first camera and capturing data with the second camera (Par. [0035], The image computing units 27R and 27L combine stereoscopic images from the captured image input units 25R and 25L (i.e. includes captured data of first and second camera) with parallax images from the parallax image generating units 22R and 22L to generate composite images for the right eye and left eye); and combining data obtained by two or more sensors of the plurality of sensors includes combining the data captured with the first camera and the data captured with the second camera (Par. [0029] The image-sensing units 18R and 18L sense and output stereoscopic images (i.e. combined data) of a physical space); and after capturing the immersive media item, displaying, via the display generation, the immersive media item (Par. [0057] The composite image for the right eye and the composite image for the left eye are transferred (i.e. after capturing) to the video output units 21R and 21L to be displayed on the LCD modules 11R and 11L (i.e. display generation) of the display units 13R and 13L via the video input units 14R and 14L) as virtual (Fig. 7, Par. [0062], when aligning an image of a virtual object with an image that is picked up with an image-sensing device to form a composite image, an effect is obtained whereby it is possible to generate a composite image with high alignment accuracy providing an observer with a sense of virtual reality (i.e. as a virtual object) or mixed reality), wherein: the display generation component includes a first display generation component that corresponds to the right eye of the user and a second display generation component that corresponds to the left eye of the user (Par. [0027], The display units 13R (i.e. first display generation component) and 13L (i.e. second display generation component) function as a right eye display device that displays images for the right eye (i.e. corresponding to right eye) and a left eye display device that displays images for the left eye, respectively (i.e. corresponding to left eye)); and the immersive media item, when displayed virtually via the display generation component of the head-mounted display device (Par. [0035] The image computing units 27R and 27L combine stereoscopic images from the captured image input units 25R and 25L with parallax images from the parallax image generating units 22R and 22L to generate composite images for the right eye and left eye. The composite images (i.e. immersive media item) are supplied to the video input units 14R and 14L through the video output units 21R and 21L.), presents content based on the data captured with the first camera to the first display generation component and presents content based on the data captured with the second camera to the second display generation component (Par. [0057] The composite image for the right eye and the composite image for the left eye are transferred to the video output units 21R and 21L to be displayed (i.e. presents content) on the LCD modules 11R and 11L of the display units 13R and 13L via the video input units 14R and 14L). References KIM and Yasuda are considered to be analogous art because they relate to imaging devices on HMD. 
Therefore, it would be obvious to one possessing ordinary skill in the art before the effective filing date of the claimed invention to specify a first camera corresponding to a viewpoint of a right eye of a user and second camera corresponding to a viewpoint of a left eye of a user and combining the data from both cameras for display on the display device as taught by Yasuda in the invention of Kim. This modification would allow the observer who observes the composite images generated by the right and left image computing units to perceive that the virtual object is correctly disposed on the real object in a physical space and with a sense of virtual reality or mixed reality (See Yasuda, Par. [0058], Par. [0062]). In addition, KIM in view of Yasuda fails to explicitly teach after capturing the immersive media item, displaying, via the display generation, the immersive media item as a virtual object.. However, Reif teaches after capturing the immersive media item (Fig. 1, Par. [0016], The input component 120 receives, such as by detecting, acquiring, and/or capturing, a view of the environment that can include various real-world objects. The VR system 300 analyzes the view of the environment (i.e. after capturing) to identify one or more target objects (i.e. immersive media item) to be presented to the user via the HMD 150.), displaying, via the display generation (Par. [0016], a target object may be a real world object with a recognizable size, shape, color, or other characteristics identifiable by the VR system 300. A representation of these target objects is generated by the VR system 300 and combined with the other content (such as rendered VR content) to generate a combined scene for viewing (i.e. displaying) in the HMD 150 (i.e. display generation)), the immersive media item as a virtual object (Fig.2D, Par. [0026] the VR system 300 can generate a representation of the stick target object 216 based on a portion(s) of received image data identified as representing or depicting the stick target object 216. The representation of the stick target object 216 (i.e. virtual object) can be combined with the virtual environment, including the virtual scene 220 with the virtual objects 222, to produce the combined scene 230). References KIM, Yasuda and Reif are considered to be analogous art because they relate to imaging devices on HMD. Therefore, it would be obvious to one possessing ordinary skill in the art before the effective filing date of the claimed invention to specify displaying the immersive media item as a virtual object as taught by Reif in the inventions of Kim and Yasuda. This modification would allow the user to interact with the real-world target object (or a representation thereof) in combination with the virtual environment (See Reif, Abstract). Regarding Claim 2, it has been cancelled. Regarding Claim 3, KIM in combination with Yasuda and Reif teaches Claim 1. Yasuda further teaches wherein the immersive media item includes one or more images for a right eye and one or more images for a left eye different from the one or more images for the right eye that, when viewed concurrently, create a three-dimensional appearance ( Par. [0034] The parallax image generating units 22R and 22L generate images (parallax images) of a virtual object for the right eye and left eye based on the input image sensing parameters and three-dimensional model information of the virtual object). Regarding Claim 6, KIM in combination with Yasuda and Reif teaches Claim 1. 
Kim further teaches wherein the activation of the hardware input device is detected while a media capture mode is enabled (Par. [0056], the user input unit 105 may shoot a surrounding area of the electronic device 100 using the camera 102(i.e. capture mode enabled)), and wherein the one or more programs further include instructions for: while the media capture mode is not enabled (Par. [0064], The processor 104 may receive a user input that commands capturing (i.e. while not enabled) the entirety or a part of the augmented reality scene), detecting an input directed to the hardware input device (Par. [0152], the display module 1260 may include a touch sensor adapted to detect a touch, or a pressure sensor adapted to measure the intensity of force incurred by the touch); and in response to detecting the input and in accordance with a determination that the input satisfies a first set of one or more criteria (Par. [0077], the user input may be a touch input, for example, a touch & hold (i.e. criteria) input that keeps touching during a predetermined period of time), enabling the media capture mode (Par. [0077], the processor 104 of the electronic device 100 may receive a user input to command capturing an augmented reality scene). Regarding Claim 7, KIM in combination with Yasuda and Reif teaches Claim 6. Kim further teaches wherein the first set of one or more criteria includes a respective criterion that is satisfied when a duration of a detected input exceeds a respective threshold duration (Par. [0077], the user input may be a touch input, for example, a touch & hold (i.e. duration criteria) input that keeps touching during a predetermined period of time (i.e. threshold duration)). Regarding Claim 8, KIM in combination with Yasuda and Reif teaches Claim 7. Kim further teaches wherein capturing the immersive media includes: in accordance with a determination that the activation of the hardware input device includes an input with a duration that exceeds the respective threshold duration (Par. [0077], the user input may be a touch input, for example, a touch & hold (i.e. duration criteria) input that keeps touching (i.e. exceeds threshold duration) during a predetermined period of time), initiating capture of immersive video media (Par. [0077], the processor 104 of the electronic device 100 may receive a user input to command capturing an augmented reality scene). Regarding Claim 10, KIM combination with Yasuda and Reif teaches Claim 7. Kim further teaches wherein the immersive media item includes one or more audio outputs for a right ear and one or more audio outputs for a left ear that, when heard concurrently, provide virtual placement of sound in a three-dimensional environment (Par, [0153] the audio module 1270 may obtain the sound via the input module 1250, or output the sound via the sound output module 1255 or a headphone (i.e. left and right ear) of an external electronic device (e.g., an electronic device 1202) directly (e.g., wired) or wirelessly coupled with the electronic device 1201). Regarding Claim 16, KIM in combination with Yasuda and Reif teaches Claim 1. Kim further teaches wherein capturing the immersive media is performed while the head-mounted display device is in a head-mounted state (Fig. 1A, Par. [0040] the image 110 obtained via the camera may include a real scene that a user views via the electronic device 100 (i.e. while in head mounted state)). Regarding Claim 22, KIM in combination with Yasuda and Reif teaches Claim 1. 
KIM further teaches wherein capturing immersive media includes: obtaining augmented media data; and associating the immersive media item with the augmented media data (Par. [0041], based on the image 110 obtained via the camera, the electronic device 100 may display an augmented reality scene 200 (i.e. augmented media data) in the display 101. The augmented reality scene 200 may include a scene that shows a virtual image together in a physical environment or space of the real world, or a scene that shows a real object and a virtual object (i.e. immersive media item) together. FIG. 1B is an augmented reality scene 200 displayed via the display 101)). Regarding Claim 30, KIM in combination with Yasuda and Reif teaches Claim 1. KIM further teaches wherein the immersive media item includes additional information obtained from an electronic device that is nearby the head-mounted display device while capturing the immersive media (Par. [0049], The communication interface may perform, with an external device (not illustrated) and a server (not illustrated), transmission or reception of data for receiving a service based on the image 110 of FIGS. 1A and 1B obtained by shooting a surrounding area of the electronic device 100 or the augmented reality scene 200 of FIGS. 1A and 1B). Regarding Claim 31, KIM in combination with Yasuda and Reif teaches Claim 30. KIM further teaches wherein the electronic device is a second head-mounted display device (Fig. 12, Par. [0144], The electronic devices 1201, 1202 and 1204 (i.e. a second HMD) of FIG. 12 may be described with reference to the electronic device 100 of Fig. 1A, Par. [0035], a head mounted display (HMD) apparatus that a user wears on the his or her head). Computer-readable storage medium Claim 32 is drawn to the corresponding apparatus claimed in Claim 1. Therefore Claim 32 corresponds to apparatus Claim 1 and is rejected for the same reasons of obviousness as used above. Method Claim 33 is drawn to the method of using the corresponding apparatus claimed in Claim 1. Therefore method Claim 33 corresponds to apparatus Claim 1 and is rejected for the same reasons of obviousness as used above. Claims 4 and 5 are rejected under 35 U.S.C. 103 as being unpatentable over KIM et al. (US 2023/0222746 A), in view of Yasuda et al. (US 2010/0026787 A1), in view of Reif (US 2019/0020843 A1), and in further view of Rabinovich et al. (US 10,733,447 B2) referred to as Rabinovich hereinafter. Regarding Claim 4, KIM in combination with Yasuda and Reif teaches Claim 1. While Yasuda teaches in Par. [0029] that the pair of image-sensing units 18R and 18L that are stereoscopic image-sensing devices, KIM in combination with Yasuda and Reif do not specifically teach a depth sensor. Therefore, KIM in combination with Yasuda and Reif fails to explicitly teach the two or more sensors of the plurality of sensors include at least one depth sensor. However, Rabinovich teaches the two or more sensors of the plurality of sensors (Abstract, A head-mounted augmented reality (AR) device can include a hardware processor programmed to receive different types of sensor data from a plurality of sensors (e.g., an inertial measurement unit, an outward-facing camera, a depth sensing camera (i.e. one depth sensor), an eye imaging camera, or a microphone)) include at least one depth sensor (Fig. 16B, Col. 
22:25-28 three outward-facing world-capturing cameras (124) are shown with their FOVs (18, 20, 22), as is the depth camera (154) and its FOV (24), and the picture camera (156) and its FOV (26)). References KIM, Yasuda, Reif and Rabinovich are considered to be analogous art because they relate to imaging devices on HMD. Therefore, it would be obvious to one possessing ordinary skill in the art before the effective filing date of the claimed invention to specify a depth sensor as taught by Rabinovich in the inventions of Kim, Yasuda and Reif in order to use the depth sensor to provide hypotheses about stereo depth (See Rabinovich, Col. 22:41-42). Regarding Claim 5, KIM in combination with Yasuda, Reif and Rabinovich teaches Claim 4. Kim further teaches wherein the immersive media item includes a representation of a physical environment that is outside of the head-mounted display device (Fig. 1B, Par. [0041], based on the image 110 obtained via the camera, the electronic device 100 may display an augmented reality scene 200 in the display 101. The augmented reality scene 200 may include a scene that shows a virtual image together in a physical environment or space of the real world, or a scene that shows a real object and a virtual object together. FIG. 1B is an augmented reality scene 200 displayed via the display 101) when the activation of the hardware input device is detected (Par, [0059], the processor 104 may receive a user input for capturing the augmented reality scene displayed in the display 101). Kim does not specifically teach a three-dimensional representation. However, Yasuda further teaches wherein the immersive media item includes a three-dimensional representation of a physical environment that is outside of the head-mounted display device (Fig. 4, Par. [0034] The parallax image generating units 22R and 22L generate images (parallax images) of a virtual object for the right eye and left eye based on the input image sensing parameters and three-dimensional model information (i.e. three-dimensional representation) of the virtual object). It would be obvious to one possessing ordinary skill in the art before the effective filing date of the claimed invention to specify a three-dimensional representation as taught by Yasuda in the invention of Kim in order to represent parallax images in both the left and right eye of the virtual object (See Yasuda, [0034]). Claims 9, 17, 18, 21, 23 – 25 and 28 are rejected under 35 U.S.C. 103 as being unpatentable over KIM et al. (US 2023/0222746 A), in view of Yasuda et al. (US 2010/0026787 A1), in view of Reif (US 2019/0020843 A1), and in further view of Frappiea et al. (US 2018/0001198 A1) referred to as Frappiea hereinafter. Regarding Claim 9, KIM in combination with Yasuda and Reif teaches Claim 7. Kim further teaches wherein capturing the immersive media includes: in accordance with a determination that the activation of the hardware input device includes an input with a duration that does not exceed the respective threshold duration, initiating capture of immersive photo media (Par. [0077], the user input may be a touch input, for example, a touch & hold (i.e. duration criteria) input that keeps touching during a predetermined period of time (i.e. does not exceeds threshold duration)). KIM in combination with Yasuda and Reif does not specifically teach not satisfying a criteria. 
Therefore, KIM in combination with Yasuda and Reif fails to explicitly teach in response to detecting the input and in accordance with a determination that the input does not satisfy the first set of one or more criteria, performing a non-media capture action. However, Frappiea teaches in response to detecting the input and in accordance with a determination that the input does not satisfy the first set of one or more criteria, performing a non-media capture action (Par. [0062], when the gesture input identifies a gesture that is below a pre-defined threshold, the game processor 310 may ignore the gesture input (i.e. non-media capture action)). References KIM, Yasuda, Reif and Frappiea are considered to be analogous art because they relate to imaging devices on HMD. Therefore, it would be obvious to one possessing ordinary skill in the art before the effective filing date of the claimed invention to specify performing a non-media capture action as taught by Frappiea in the inventions of Kim, Yasuda and Reif in order to provide input to the interactive application (See Frappiea, Par. [0035]). Regarding Claim 17, KIM in combination with Yasuda and Reif teaches Claim 16. Kim further teaches the one or more programs further including instructions for: after capturing the immersive media, outputting, via the display generation component, the immersive media item (Par. [0063], in operation 310, the processor 104 of the electronic device 100 may obtain an image via the camera 102. The processor 104 may control the display 101 (i.e. display generation component) to display an augmented reality scene (i.e. immersive media) based on an image obtained (i.e. after capturing) via the camera 102). Kim in combination with Yasuda and Reif does not specifically teach displaying immersive media based on a viewpoint of a user. Therefore, Kim in combination with Yasuda and Reif fails to explicitly teach outputting, via the display generation component, the immersive media item, wherein a displayed viewpoint of the immersive media while outputting the immersive media item is based on a viewpoint of a user during capture of the immersive media. However, Frappiea teaches outputting, via the display generation component, the immersive media item (Par. [0070], the images and the video frames may be formatted (i.e. after capture) based on detected gaze direction of the user wearing the HMD), wherein a displayed viewpoint of the immersive media while outputting the immersive media item is based on a viewpoint of a user during capture of the immersive media (Par. [0070], When it is detected that the gaze direction of the user (i.e. viewpoint of user) is directed toward a pre-defined area on the display screen of the HMD, the image(s) of the user in the physical space (i.e. immersive media item) and the video frames from the video clip are formatted such that the image(s) of the user is presented (i.e. outputting) in the portion (i.e. display viewpoint) of the display screen (i.e. display generation component) corresponding to the user's gaze direction while the remaining portion of the display screen continues to render the video frames from the VR scene of the video game). References KIM, Yasuda, Reif and Frappiea are considered to be analogous art because they relate to imaging devices on HMD.
Therefore, it would be obvious to one possessing ordinary skill in the art before the effective filing date of the claimed invention to specify outputting the immersive media item is based on a viewpoint of a user as taught by Frappiea in the inventions of Kim, Yasuda and Reif in order that the images and the video frames may be formatted based on detected gaze direction of the user wearing the HMD (See Frappiea, Par. [0070]). Regarding Claim 18, KIM in combination with Yasuda and Reif teaches Claim 16. Kim further teaches the one or more programs further including instructions for: after capturing the immersive media, outputting, via the display generation component, the immersive media item (Par. [0063], in operation 310, the processor 104 of the electronic device 100 may obtain an image via the camera 102. The processor 104 may control the display 101 (i.e. display generation component) to display an augmented reality scene (i.e. immersive media) based on an image obtained (i.e. after capturing) via the camera 102). Kim in combination with Yasuda and Reif does not specifically teach displaying immersive media based on a viewpoint of a user. Therefore Kim in combination with Yasuda and Reif fails to explicitly teach outputting, via the display generation component, the immersive media item, wherein outputting the immersive media includes displaying the immersive media item from a viewpoint that does not match a viewpoint of a user during capture of the immersive media. However, Frappiea teaches the one or more programs further including instructions for: after capturing the immersive media (Par. [0070], the images and the video frames may be formatted (i.e. after capture) based on detected gaze direction of the user wearing the HMD), outputting, via the display generation component, the immersive media item, wherein outputting the immersive media includes displaying the immersive media item from a viewpoint that does not match a viewpoint of a user during capture of the immersive media (Par. [0069], the formatting of the images and the video frames may include rendering the image(s) of the user captured in the physical space (i.e. outputting the immersive media) in a first portion (i.e. viewpoint) of the display screen (i.e. display generation component) and the video frames from the video clip in a second portion, wherein the first portion and the second portion may be defined by splitting the area defined in the display screen vertically, horizontally, diagonally, etc. (i.e. viewpoint that does not match a viewpoint of a user)). References KIM, Yasuda, Reif and Frappiea are considered to be analogous art because they relate to imaging devices on HMD. Therefore, it would be obvious to one possessing ordinary skill in the art before the effective filing date of the claimed invention to specify outputting the immersive media item is not based on a viewpoint of a user as taught by Frappiea in the inventions of Kim, Yasuda and Reif so as to cause image of the user to switch from a mirror view orientation to a reverse mirror view orientation (See Frappiea, Par. [0069]). Regarding Claim 21, KIM in combination with Yasuda and Reif teaches Claim 1. Kim in combination with Yasuda and Reif does not specifically teach detecting a reposition of the HMD. 
Therefore, Kim in combination with Yasuda and Reif fails to explicitly teach detecting a repositioning of the head-mounted display device to a position near a face of a user of the head-mounted display device; and in response to detecting the repositioning of the head-mounted display device to the position near the face of the user, enabling a media capture mode. However, Frappiea teaches detecting a repositioning of the head-mounted display device to a position near a face of a user of the head-mounted display device (Par. [0032], Various technologies may be employed to detect and interpret the user input provided at the input interface, input provided by movement of the HMD to determine position and movement of the user and the HMD that is communicatively coupled to the HMD); and in response to detecting the repositioning of the head-mounted display device to the position near the face of the user(Par. [0032], the user input circuit of the HMD, in some implementations, may include global position systems (GPS), compass, etc., to detect the position of the user, HMD, in relation to one or more reference points), enabling a media capture mode (Fig. 9, Par. [0140] a signal is generated to activate an image capturing device (i.e. enabling media capture mode) that is communicatively coupled to a computer on which the video game is being executed, as illustrated in operation 940). References KIM, Yasuda, Reif and Frappiea are considered to be analogous art because they relate to imaging devices on HMD. Therefore, it would be obvious to one possessing ordinary skill in the art before the effective filing date of the claimed invention to specify detecting a repositioning of a user as taught by Frappiea in the inventions of Kim, Yasuda and Reif in order to interpret the input provided by user action/interaction, movement of the HMD (See Frappiea, Par. [0032]). Regarding Claim 23, KIM in combination with Yasuda and Reif teaches Claim 1. Kim further teaches the immersive media item includes captured photo media data (Par. [0041], based on the image 110 (i.e. captured photo media data) obtained via the camera, the electronic device 100 may display an augmented reality scene 200 in the display 101. The augmented reality scene 200 (i.e. immersive media) may include a scene that shows a virtual image together in a physical environment or space of the real world, or a scene that shows a real object and a virtual object (i.e. immersive media item) together). KIM in combination with Yasuda and Reif does not specifically teach motion data. Therefore, KIM in combination with Yasuda and Reif fails to explicitly teach the immersive media item includes captured photo media data and motion data, wherein the motion data represents a movement detected by at least one sensor of the plurality of sensors at a time proximate to detecting the activation of the hardware input device. Frappiea further teaches the immersive media item includes captured photo media data and motion data (Fig. 7B, Par. [0106], an image capturing device, such as a camera, that captures one or more images (i.e. photo media data) of the user interacting in a physical space defined in a real-world environment, while the user is interacting (i.e. motion data) with the video game during game play), wherein the motion data represents a movement detected by at least one sensor of the plurality of sensors (Par. 
[0036], the user 100 may be used to provide gestures, e.g., hand gestures, finger gestures, etc., that may be interpreted by interactive application and/or the logic within the HMD 104. In some implementations, the user 100 may wear an interactive glove with built-in sensors to provide tactile feedback. The interactive glove acts as the HHC 102, when worn by a user, and provides input in the form of interactive gestures/actions to the interactive program and/or the HMD 104) at a time proximate to detecting the activation of the hardware input device (Par. [0054] In response to the game data, the user 100 performs one or more head and/or eye motions, e.g., head tilting, winking, gazing, shifting gaze, staring, etc., or hand gestures, and each head or eye or hand motion triggers the user input circuit of the HMD to generate an input, which may be used as user interaction input provided during game play to influence an outcome of the game). References KIM, Yasuda, Reif and Frappiea are considered to be analogous art because they relate to imaging devices on HMD. Therefore, it would be obvious to one possessing ordinary skill in the art before the effective filing date of the claimed invention to specify motion data as taught by Frappiea in the inventions of Kim, Yasuda and Reif in order to provide input in the form of interactive gestures/actions to the interactive program and/or the HMD 104 (See Frappiea, Par. [0032]). Regarding Claim 24, KIM in combination with Yasuda and Reif teaches Claim 1. Kim further teaches the one or more programs further including instructions for: attaching the head-mounted display device to a user of the head-mounted display device (Fig. 1A, Par. [0035] an ‘augmented reality device’ (e.g., the electronic device 100 of FIGS. 1A and 1B) is a device that is capable of expressing an ‘augmented reality’, and may generally include augmented reality glasses provided in the form of glasses that a user wears on his or her face, a head mounted display (HMD) apparatus that a user wears on the his or her head, an augmented reality helmet, and the like). Kim in combination with Yasuda and Reif does not specifically teach a strap. Therefore, Kim in combination with Yasuda and Reif fails to explicitly teach attaching the HMD to a user via a strap. However, Frappiea teaches attaching the head-mounted display device to a user of the head-mounted display device via a strap (Fig. 2, Par. [0043], a headband of the HMD (i.e. strap)). References KIM, Yasuda, Reif and Frappiea are considered to be analogous art because they relate to imaging devices on HMD. Therefore, it would be obvious to one possessing ordinary skill in the art before the effective filing date of the claimed invention to specify a strap for attaching the HMD to the user as taught by Frappiea in the inventions of Kim, Yasuda and Reif in order to provide a visually immersive experience to the user (See Frappiea, Par. [0005]). Regarding Claim 25, Kim in combination with Yasuda and Reif teaches Claim 1. Kim in combination with Yasuda and Reif does not specifically teach gaze detection. Therefore, Kim in combination with Yasuda and Reif fails to explicitly teach further teaches detecting a gaze of a user of the head-mounted display device; and after capturing the immersive media, outputting, via the display generation component, the immersive media item, wherein outputting the immersive media item includes adjusting output of the immersive media item based on the detected gaze. 
However, Frappiea teaches detecting a gaze of a user of the head-mounted display device; and after capturing the immersive media (Par. [0070], the images and the video frames may be formatted (i.e. after capture) based on detected gaze direction of the user wearing the HMD), outputting, via the display generation component, the immersive media item, wherein outputting the immersive media item includes adjusting output of the immersive media item based on the detected gaze (Par. [0070], When it is detected that the gaze direction of the user (i.e. detecting gaze of user) is directed toward a pre-defined area on the display screen of the HMD, the image(s) of the user in the physical space (i.e. immersive media item) and the video frames from the video clip are formatted (i.e. adjusting the output) such that the image(s) of the user is presented (i.e. outputting) in the portion (i.e. display viewpoint) of the display screen (i.e. display generation component) corresponding to the user's gaze direction while the remaining portion of the display screen continues to render the video frames from the VR scene of the video game). References KIM, Yasuda, Reif and Frappiea are considered to be analogous art because they relate to imaging devices on HMD. Therefore, it would be obvious to one possessing ordinary skill in the art before the effective filing date of the claimed invention to specify gaze detection as taught by Frappiea in the inventions of Kim, Yasuda and Reif in order that the image(s) of the user is presented in the portion of the display screen corresponding to the user's gaze direction while the remaining portion of the display screen continues to render the video frames from the VR scene of the video game (See Frappiea, Par. [0070]). Regarding Claim 28, KIM in combination with Yasuda, Reif and Frappiea teaches Claim 25. Frappiea further teaches wherein: outputting the immersive media item includes outputting one or more audio outputs (Par. [0053], The processing may include de-packetizing, decoding, etc., the data stream, identifying the audio and video component, and forwarding the different components of data from the data stream to corresponding devices of the HMD. The audio data may be directed to speakers (i.e. audio outputs) of the HMD) for a right ear and one or more audio outputs for a left ear (Par. [0130], the one or more speakers 260 (i.e. left and right ears audio outputs)), that, when heard concurrently (Par. [0118] The audio codec 276 converts the synchronized audio data (i.e. concurrently) from a digital format into an analog format to generate audio signals and the audio signals are played back by the speakers 260 to generate sound), create an illusion that sound is emanating from one or more particular positions in three-dimensional space (Par. [0031], an interactive application, in response to a request from a user and provide audio and video content from the interactive application (i.e. illusion) for rendering on a display screen of the HMD 104)); and adjusting the output of the immersive media item based on the detected gaze (Par. [0070], the images and the video frames may be formatted (i.e. after capture) based on detected gaze direction of the user wearing the HMD) includes: in accordance with a determination that the detected gaze corresponds to a first region of an environment represented by the immersive media item (Par. [0070], When it is detected that the gaze direction of the user (i.e. 
viewpoint of user) is directed toward a pre-defined area on the display screen of the HMD, the image(s) of the user in the physical space (i.e. immersive media item) and the video frames from the video clip are formatted such that the image(s) of the user is presented (i.e. outputting) in the portion (i.e. display viewpoint) of the display screen (i.e. display generation component) corresponding to the user's gaze direction while the remaining portion of the display screen continues to render the video frames from the VR scene of the video game) adjusting the one or more audio outputs (Par. [0053], The processing may include de-packetizing, decoding, etc., the data stream, identifying the audio and video component, and forwarding the different components of data from the data stream to corresponding devices of the HMD. The audio data may be directed to speakers (i.e. audio outputs) of the HMD) for the right ear and the one or more audio outputs for the left ear that (Par. [0130], the one or more speakers 260 (i.e. left and right ears audio outputs)) such that, when heard concurrently (Par. [0118] The audio codec 276 converts the synchronized audio data (i.e. concurrently) from a digital format into an analog format to generate audio signals and the audio signals are played back by the speakers 260 to generate sound), the one or more audio outputs for the right ear and the one or more audio outputs for the left ear create an illusion that sound is emanating with more detail from a first position in three-dimensional space corresponding to the first region of the environment (Par. [0031], an interactive application, in response to a request from a user and provide audio and video content from the interactive application (i.e. illusion) for rendering on a display screen of the HMD 104)). Claims 11 – 13 are rejected under 35 U.S.C. 103 as being unpatentable over KIM et al. (US 2023/0222746 A), in view of Yasuda et al. (US 2010/0026787 A1), in view of Reif (US 2019/0020843 A1), and in further view of Tanaka et al. (US 2021/0003847 A1) referred to as Tanaka hereinafter. Regarding Claim 11, KIM in combination with Yasuda and Reif teaches Claim 1. KIM combination with Yasuda and Reif does not specifically teach a visual indication. Therefore, KIM in combination with Yasuda and Reif fails to explicitly teach capturing the immersive media includes: displaying a visual indication of immersive media capture the visual indication is visible from an exterior of the head-mounted display device. However, Tanaka teaches capturing the immersive media includes: displaying a visual indication of immersive media capture (Par. [0038], The LED indicator 67 is lit during execution of the imaging by the cameras 61R and 61L) the visual indication is visible from an exterior of the head-mounted display device (Fig. 1, Par. [0038], The LED indicator 67 is disposed at the right end portion of the front frame 27 of the HMD 100). References KIM, Yasuda, Reif and Tanaka are considered to be analogous art because they relate to imaging devices on HMD. Therefore, it would be obvious to one possessing ordinary skill in the art before the effective filing date of the claimed invention to specify a visual indication of the capture of images as taught by Tanaka in the inventions of KIM, Yasuda and Reif in order to inform that the imaging is being performed (See Tanaka, Par. [0038]). Regarding Claim 12, KIM in combination with Yasuda, Reif and Tanaka teaches Claim 11. 
Tanaka further teaches the display generation component includes at least one interior display (Par. [0029] The right light guide plate 26 is located in front of the right eye of the user in the worn state of the image display section 20 and causes the right eye to visually recognize an image (i.e. at least one interior display)) and at least one exterior display (Fig. 1, HMD 100, Par. [0027], the image display section 20 has an eyeglass shape which includes a front frame 27 (i.e. exterior display)), and wherein the visual indication is displayed via the exterior display (Par. [0035], an indicator 67 are provided in the image display section 20). Regarding Claim 13, KIM in combination with Yasuda, Reif and Tanaka teaches Claim 11. Tanaka further teaches wherein the visual indication includes an indication of a current state of the immersive media capture (Par. [0038], The LED indicator 67 is lit during execution (i.e. current state) of the imaging by the cameras 61R and 61L to inform that the imaging is being performed). Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over KIM et al. (US 2023/0222746 A), in view of Yasuda et al. (US 2010/0026787 A1), in view of Reif (US 2019/0020843 A1), in view of Tanaka (US 2021/0003847 A1), and in further view of Chen et al. (US 2019/0286406 A1) referred to as Chen hereinafter. Regarding Claim 14, KIM in combination with Yasuda, Reif and Tanaka teaches Claim 11. Tanaka further teaches wherein the visual indication includes an indication currently being captured in the immersive media (Par. [0038], The LED indicator 67 is lit (i.e. visual indication) during execution of the imaging by the cameras 61R and 61L to inform that the imaging is being performed). KIM in combination with Yasuda, Reif and Tanaka does not specifically teach an indication of a subject. Therefore, KIM in combination with Yasuda, Reif and Tanaka fails to explicitly teach wherein the visual indication includes an indication of a subject currently being captured in the immersive media. However, Chen teaches wherein the visual indication includes an indication of a subject (Par. [0064] FIG. 5D illustrates an expanded view of a VR scene that is rendered on the second display screen 311. The VR scene content (i.e. visual indication of a subject) may be rendered on the entire second display screen 311 or in a portion of the second display screen 311) currently being captured in the immersive media (Par. [0028], a camera 108 can be configured to capture image of the interactive environment in which the user 100 is located). References KIM, Yasuda, Reif, Tanaka and Chen are considered to be analogous art because they relate to imaging devices on HMD. Therefore, it would be obvious to one possessing ordinary skill in the art before the effective filing date of the claimed invention to specify a visual indication of a subject being captured as taught by Chen in the inventions of KIM, Yasuda, Reif and Tanaka in order to allow the non-HMD user to view the content that the HMD user is viewing or interacting with using the HMD (See Chen, Par. [0060]). Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over KIM et al. (US 2023/0222746 A), in view of Yasuda et al. (US 2010/0026787 A1), in view of Reif (US 2019/0020843 A1), and in further view of Iwasa (US 2021/0141227 A1) referred to as Iwasa hereinafter. Regarding Claim 15, KIM in combination with Yasuda and Reif teaches Claim 1. KIM in combination with Yasuda and Reif does not teach a non-head mounted state. 
Therefore KIM in combination with Yasuda and Reif fails to explicitly teach capturing the immersive media is performed while the head-mounted display device is in a non-head mounted state. However, Iwasa teaches capturing the immersive media (Par. [0016], generates an image (hereinafter, a composite image) of a mixed reality (MR) space (i.e. immersive media) in which a real space and a virtual space are fused, and provides the generated image to the HMD 10) is performed while the head-mounted display device is in a non-head mounted state (Par. [0034], The period A is a state in which the image capturing and displaying system 1 is activated and the HMD 10 is placed on a desk (i.e. non-head mounted state) or the like in a state (operation mode) ready for the MR experience). References KIM, Yasuda, Reif and Iwasa are considered to be analogous art because they relate to imaging devices on HMD. Therefore, it would be obvious to one possessing ordinary skill in the art before the effective filing date of the claimed invention to specify a non-head mounted state as taught by Iwasa in the inventions of KIM, Yasuda and Reif in order to provide technology that suppresses unnecessary lighting of a display (See Iwasa, Par. [0006]). Claims 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over KIM et al. (US 2023/0222746 A), in view of Yasuda et al. (US 2010/0026787 A1), in view of Reif (US 2019/0020843 A1), and in further view of KASAR et al. (US 2024/0377877 A1) referred to as KASAR hereinafter. Regarding Claim 19, KIM in combination with Yasuda and Reif teaches Claim 1. KIM in combination with Yasuda and Reif does not specifically teach a storage case for the HMD. Therefore, KIM in combination with Yasuda and Reif fails to explicitly teach detecting a removal of the head-mounted display device from a storage case; and in response to detecting the removal of the head-mounted display device from the storage case, enabling a media capture mode. However, KASAR teaches detecting a removal of the head-mounted display device from a storage case (Fig. 9, Par. [0058] The head-mountable device can determine whether it is within a case (step 304, case 200 see Fig. 2). If the head-mountable device is determined to be not within a case (i.e. removal of HMD) and/or in use or being worn by a user, then the head-mountable device can enter or maintain an active state thereof); and in response to detecting the removal of the head-mounted display device from the storage case (Par. [0062] While the head-mountable device is in an off state, it can be configured to reboot upon a detection that the lid has been opened and/or that the head-mountable device has been removed from the case), enabling a media capture mode (Par. [0053], in an active state, the head-mountable device can operate the displays to output an image based on a view captured by the cameras). References KIM, Yasuda, Reif and KASAR are considered to be analogous art because they relate to imaging devices on HMD. Therefore, it would be obvious to one possessing ordinary skill in the art before the effective filing date of the claimed invention to specify a storage case for the HMD as taught by KASAR in the inventions of KIM, Yasuda and Reif in order to manage a power state of a head-mountable device (See KASAR, Par. [0056]). Regarding Claim 20, KIM in combination with Yasuda, Reif and KASAR teaches Claim 19. 
Regarding Claim 20, KIM in combination with Yasuda, Reif and KASAR teaches Claim 19. KASAR further teaches the one or more programs further including instructions for: drawing electrical power from the storage case for the head-mounted display device (Fig. 3, Par. [0038], While the head-mountable device 100 is connected to the case 200, for example via the case communication interface 216 and the HMD communication interface 116, the head-mountable device 100 and the case 200 can communicate and/or transfer power there between).

References KIM, Yasuda, Reif and KASAR are considered to be analogous art because they relate to imaging devices on HMDs. Therefore, it would have been obvious to one possessing ordinary skill in the art before the effective filing date of the claimed invention to specify a storage case for the HMD as taught by KASAR in the inventions of KIM, Yasuda and Reif in order to manage a power state of a head-mountable device (See KASAR, Par. [0056]).

Claim 26 is rejected under 35 U.S.C. 103 as being unpatentable over KIM et al. (US 2023/0222746 A), in view of Yasuda et al. (US 2010/0026787 A1), in view of Reif (US 2019/0020843 A1), in view of Frappiea et al. (US 2018/0001198 A1), and in further view of Miller et al. (US 2004/0207635 A1), referred to as Miller hereinafter.

Regarding Claim 26, KIM in combination with Yasuda, Reif and Frappiea teaches Claim 25. Frappiea further teaches wherein adjusting the output of the immersive media item based on the detected gaze (Par. [0070], the images and the video frames may be formatted (i.e. after capture) based on detected gaze direction of the user wearing the HMD) includes: in accordance with a determination that the detected gaze corresponds to a first displayed portion of the immersive media item (Par. [0070], the formatting of the images and the video frames may include rendering the image(s) of the user captured in the physical space (i.e. outputting the immersive media) in a first portion (i.e. viewpoint) of the display screen (i.e. display generation component) and the video frames from the video clip in a second portion, wherein the first portion and the second portion may be defined by splitting the area defined in the display screen vertically, horizontally, diagonally, etc. (i.e. viewpoint that does not match a viewpoint of a user)) and a determination that the detected gaze does not correspond to a second displayed portion of the immersive media item (Par. [0070] while the remaining portion of the display screen (i.e. second displayed portion, not in the pre-defined area of gaze detection) continues to render the video frames from the VR scene of the video game) different from the first displayed portion (Fig. 6, first region is different from second region). KIM in combination with Yasuda, Reif and Frappiea does not specifically teach displaying two levels of detail. Therefore, KIM in combination with Yasuda, Reif and Frappiea fails to explicitly teach displaying the first displayed portion with a first level of detail; and displaying the second displayed portion with a second level of detail that is lower than the first level of detail.

However, Miller teaches wherein adjusting the output of the immersive media item based on the detected gaze (Par. [0019], The display 20 could be any visual display, but is preferably an immersive display having a field of view of at least X degrees vertical and Y degrees horizontal. The eye tracker 22 could be any device that can be used to monitor the gaze point of a viewer 26) includes: in accordance with a determination that the detected gaze corresponds to a first displayed portion of the immersive media item and a determination that the detected gaze does not correspond to a second displayed portion of the immersive media item, different from the first displayed portion (Par. [0019], capable of determining the viewer's point of gaze on the display 20), displaying the first displayed portion with a first level of detail (Par. [0040], the region of the image with the highest fidelity (i.e. first displayed portion) will be close to the final point of gaze once a viewer makes a constant point of gaze); and displaying the second displayed portion with a second level of detail that is lower than the first level of detail (Par. [0069], background image (i.e. second displayed portion, different from first displayed portion) that represents the low-resolution information (i.e. second level is lower than first level), it is then necessary to fill in higher resolution detail information in accordance with viewer's point of gaze and the corresponding contrast threshold function values across the field of view).

References KIM, Yasuda, Reif, Frappiea and Miller are considered to be analogous art because they relate to imaging devices on HMDs. Therefore, it would have been obvious to one possessing ordinary skill in the art before the effective filing date of the claimed invention to specify a level of detail for detected gaze as taught by Miller in the inventions of KIM, Yasuda, Reif and Frappiea in order to refine the point of gaze to allow the highest fidelity information to be selected and transmitted very close to the time of display, providing minimal errors in point of gaze estimates (See Miller, Par. [0040]).
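As a rough illustration of the two-level-of-detail behavior the Claim 26 rejection attributes to Miller, the following sketch assigns a higher level of detail to whichever displayed portion contains the gaze point. All names are hypothetical; this is not code from Miller, Frappiea, or the application.

```python
# Illustrative sketch only: rendering the gazed-at portion of an immersive
# media item at a higher level of detail than the remaining portion.

from dataclasses import dataclass


@dataclass
class Region:
    name: str
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, x: float, y: float) -> bool:
        """Return True if the (x, y) point falls inside this region."""
        return self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1


def detail_levels(gaze_xy, regions, high_detail=1.0, low_detail=0.25):
    """Return a per-region level of detail based on the detected gaze point."""
    gx, gy = gaze_xy
    return {
        r.name: (high_detail if r.contains(gx, gy) else low_detail)
        for r in regions
    }


if __name__ == "__main__":
    regions = [Region("left_half", 0.0, 0.0, 0.5, 1.0),
               Region("right_half", 0.5, 0.0, 1.0, 1.0)]
    # Gaze falls in the left half, so it receives the higher level of detail.
    print(detail_levels((0.3, 0.6), regions))
    # {'left_half': 1.0, 'right_half': 0.25}
```

The sketch captures only the comparison the claim turns on: the portion corresponding to the detected gaze is displayed with a first level of detail, and the other portion with a second, lower level of detail.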
Claim 27 is rejected under 35 U.S.C. 103 as being unpatentable over KIM et al. (US 2023/0222746 A), in view of Yasuda et al. (US 2010/0026787 A1), in view of Reif (US 2019/0020843 A1), in view of Frappiea et al. (US 2018/0001198 A1), and in further view of Yoon et al. (US 10,528,128 B1), referred to as Yoon hereinafter.

Regarding Claim 27, KIM in combination with Yasuda, Reif and Frappiea teaches Claim 25. KIM in combination with Yasuda, Reif and Frappiea does not specifically teach a visual indication of detected gaze. Therefore, KIM in combination with Yasuda, Reif and Frappiea fails to explicitly teach displaying, via the display generation component, a visual indication at a location corresponding to the detected gaze.

However, Yoon teaches displaying, via the display generation component (Col. 15:28-34, FIG. 5B also illustrates gaze indicator 520 displayed (i.e. displaying) by the one or more transparent displays (i.e. display generation component), where the position of gaze indicator 520 is determined based on the gaze direction of the one or more eyes of the wearer (e.g., the gaze direction is determined using infrared light reflected from the one or more eyes of the wearer as explained above with respect to FIG. 4)), a visual indication at a location corresponding to the detected gaze (Col. 15:34-39, gaze indicator 520 (i.e. visual indication) is positioned at a location that corresponds to the gaze direction of the one or more eyes of the wearer (e.g., gaze indicator 520 is positioned over a person or an object that one or more eyes of the wearer are gazing at)).

References KIM, Yasuda, Reif, Frappiea and Yoon are considered to be analogous art because they relate to imaging devices. Therefore, it would have been obvious to one possessing ordinary skill in the art before the effective filing date of the claimed invention to specify a visual indication of gaze point on the display as taught by Yoon in the inventions of KIM, Yasuda, Reif and Frappiea in order to provide an image based on augmented reality content 510 (e.g., name, company, position, etc. of a person that the one or more eyes of the wearer are gazing at) (See Yoon, Col. 15:41-43).

Claim 29 is rejected under 35 U.S.C. 103 as being unpatentable over KIM et al. (US 2023/0222746 A), in view of Yasuda et al. (US 2010/0026787 A1), in view of Reif (US 2019/0020843 A1), in view of Frappiea et al. (US 2018/0001198 A1), and in further view of AN et al. (US 2017/0061953 A1), referred to as AN hereinafter.

Regarding Claim 29, KIM in combination with Yasuda, Reif and Frappiea teaches Claim 28. KIM in combination with Yasuda, Reif and Frappiea does not specifically teach adjusting the audio output based on detected gaze. Therefore, KIM in combination with Yasuda, Reif and Frappiea fails to explicitly teach adjusting the output of the immersive media item based on the detected gaze includes: in accordance with a determination that the detected gaze does not correspond to a second region of an environment of the immersive media item different from the first region, adjusting the one or more audio outputs for the right ear and the one or more audio outputs for the left ear such that, when heard concurrently, the one or more audio outputs for the right ear and the one or more audio outputs for the left ear create an illusion that sound is emanating with less detail from a second position in three-dimensional space corresponding to the second region of the environment.

However, AN teaches adjusting the output of the immersive media item based on the detected gaze (Fig. 8A, adjust sound output based on gaze direction) includes: in accordance with a determination that the detected gaze does not correspond to a second region (Fig. 8A, beamforming in gaze direction is a first region, region outside of beamforming in gaze direction is a second region) of an environment of the immersive media item different from the first region (Fig. 4, step 403, Par. [0092], the electronic device can output some audio signals among the audio signals on the basis of at least one piece of information of user information, Par. [0102] The user information may include user health information, user gaze direction information, and user location information), adjusting the one or more audio outputs for the right ear and the one or more audio outputs for the left ear such that, when heard concurrently (Par. [0095], blocking outputting of a second sound obtained in a direction different from the predetermined direction or in a region different from the predetermined region), the one or more audio outputs for the right ear and the one or more audio outputs for the left ear (Par. [0119], cancel noise except for the necessary signal from among the external sounds entering between the ears 501 of the user) create an illusion that sound is emanating with less detail from a second position in three-dimensional space corresponding to the second region of the environment (Par. [0152], An electronic device 800a can cancel all noise (i.e. an illusion that sound is with less detail) not corresponding to a gaze direction or region of the user among a plurality of external sounds).

References KIM, Yasuda, Reif, Frappiea and AN are considered to be analogous art because they relate to electronic devices. Therefore, it would have been obvious to one possessing ordinary skill in the art before the effective filing date of the claimed invention to specify adjusting the audio output as taught by AN in the inventions of KIM, Yasuda, Reif and Frappiea in order to provide, to the user, the sounds which the user needs, and not provide, to the user, sounds which the user does not need (See AN, Par. [0012]).

Regarding Claims 34 and 35, they have been cancelled.

Conclusion

The prior art references made of record but not relied upon are considered pertinent to applicant's disclosure. Englert et al. (US 9,615,177 B2) teaches wireless immersive experience capture and viewing.

Any inquiry concerning this communication should be directed to SUSAN E HODGES, whose telephone number is (571) 270-0498. The Examiner can normally be reached Monday - Friday from 8:00 am (EST) to 4:00 pm (EST). If attempts to reach the Examiner by telephone are unsuccessful, the Examiner's supervisor, Brian T. Pendleton, can be reached at (571) 272-7527. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://portal.uspto.gov/external/portal. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/Susan E. Hodges/
Primary Examiner, Art Unit 2425

Prosecution Timeline

Mar 20, 2024: Application Filed
Feb 21, 2025: Response after Non-Final Action
Jul 10, 2025: Non-Final Rejection — §103
Oct 07, 2025: Examiner Interview Summary
Oct 07, 2025: Applicant Interview (Telephonic)
Oct 13, 2025: Response Filed
Nov 06, 2025: Final Rejection — §103
Dec 19, 2025: Request for Continued Examination
Jan 08, 2026: Response after Non-Final Action
Jan 19, 2026: Non-Final Rejection — §103
Apr 09, 2026: Examiner Interview Summary
Apr 09, 2026: Applicant Interview (Telephonic)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12603982: STEREOSCOPIC HIGH DYNAMIC RANGE VIDEO (granted Apr 14, 2026; 2y 5m to grant)
Patent 12604008: ADAPTIVE CLIPPING IN MODELS PARAMETERS DERIVATIONS METHODS FOR VIDEO COMPRESSION (granted Apr 14, 2026; 2y 5m to grant)
Patent 12574558: Method and Apparatus for Sign Coding of Transform Coefficients in Video Coding System (granted Mar 10, 2026; 2y 5m to grant)
Patent 12568212: ADAPTIVE LOOP FILTERING ON OUTPUT(S) FROM OFFLINE FIXED FILTERING (granted Mar 03, 2026; 2y 5m to grant)
Patent 12556671: THREE DIMENSIONAL STROBO-STEREOSCOPIC IMAGING SYSTEMS AND ASSOCIATED METHODS (granted Feb 17, 2026; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 67%
With Interview: 81% (+14.4%)
Median Time to Grant: 2y 4m
PTA Risk: High
Based on 375 resolved cases by this examiner. Grant probability derived from career allow rate.
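To show how the headline figures above fit together, here is a minimal sketch of one way such projections could be computed, assuming the grant probability is simply the career allow rate and the with-interview figure adds the observed lift. The exact methodology behind these projections is not stated here, so treat the formulas as an assumption that merely reproduces the displayed numbers.

```python
# Illustrative sketch only: deriving the projections shown above from an
# examiner's career statistics. The actual formula used by the report is an
# assumption; this simply reproduces the displayed figures.

def career_allow_rate(granted: int, resolved: int) -> float:
    """Allow rate = granted applications / resolved applications."""
    return granted / resolved


def with_interview(base_rate: float, interview_lift: float) -> float:
    """Add the observed interview lift (in probability points), capped at 1.0."""
    return min(base_rate + interview_lift, 1.0)


if __name__ == "__main__":
    base = 0.67    # displayed grant probability
    lift = 0.144   # displayed interview lift (+14.4 points)
    print(f"{with_interview(base, lift):.0%}")  # ~81%, matching the projection
```

With the displayed base of 67% and a lift of 14.4 points, this reproduces the 81% with-interview figure shown above.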
