Prosecution Insights
Last updated: April 19, 2026
Application No. 18/184,333

TIGHT IMU-CAMERA COUPLING FOR DYNAMIC BENDING ESTIMATION

Final Rejection §103
Filed: Mar 15, 2023
Examiner: RENZE, GEORGE NICHOLAS
Art Unit: 2613
Tech Center: 2600 — Communications
Assignee: Snap Inc.
OA Round: 4 (Final)
Grant Probability: 67% (Favorable)
Expected OA Rounds: 5-6
Time to Grant: 2y 7m
Grant Probability with Interview: 99%

Examiner Intelligence

Career Allow Rate: 67%, above average (16 granted / 24 resolved; +4.7% vs TC avg)
Interview Lift: +33.3% among resolved cases with an interview (strong)
Typical Timeline: 2y 7m average prosecution; 33 applications currently pending
Career History: 57 total applications across all art units
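
The headline figures above reduce to simple ratios over the examiner's resolved cases. The sketch below shows that arithmetic under assumptions: the record format and interview flags are hypothetical, not the analytics vendor's actual data model. With 16 grants out of 24 resolved cases, the career allow rate rounds to 67%.

```python
# Illustrative only: computing career allow rate and interview lift from
# per-application outcome records. Record format is a hypothetical assumption.

def allow_rate(cases):
    """Share of resolved cases (granted or abandoned) that were granted."""
    resolved = [c for c in cases if c["outcome"] in ("granted", "abandoned")]
    if not resolved:
        return 0.0
    return sum(c["outcome"] == "granted" for c in resolved) / len(resolved)

def interview_lift(cases):
    """Difference in allow rate between cases with and without an interview."""
    with_iv = [c for c in cases if c["interview"]]
    without_iv = [c for c in cases if not c["interview"]]
    return allow_rate(with_iv) - allow_rate(without_iv)

# 16 grants out of 24 resolved cases -> 0.666..., displayed as 67%.
cases = ([{"outcome": "granted", "interview": False}] * 16
         + [{"outcome": "abandoned", "interview": False}] * 8)
print(f"{allow_rate(cases):.0%}")  # 67%
```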

Statute-Specific Performance

§101: 2.7% (-37.3% vs TC avg)
§103: 73.3% (+33.3% vs TC avg)
§102: 16.0% (-24.0% vs TC avg)
§112: 8.0% (-32.0% vs TC avg)
Comparisons are against the Tech Center average estimate • Based on career data from 24 resolved cases

Office Action

§103
Notice of Pre-AIA or AIA Status The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA . Response to Amendment The Amendment filed December 5th, 2025 has been entered. Claims 1, 2, 4-6, 9, 11, 12, 14-16, 19 and 20 have been amended. Claims 1-20 remain pending and rejected in the application. Claim Rejections - 35 USC § 103 In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. Claims 1-3, 6, 11-13, 16 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Pease et al. (Pub. No.: US 2023/0281933 A1), hereinafter Pease, in view of Hudman (Pub. No.: US 2019/0220090 A1) and further in view of Kalinowski et al. (U.S. Patent: #11,763,779 B1), hereinafter Kalinowski. Regarding claim 1, Pease discloses a method comprising (Paragraph 47 teaches that various implementations disclosed herein include devices, systems, and methods): forming a plurality of sensor groups of an augmented reality (AR) display device, wherein each sensor group comprises a camera and an IMU (inertial measurement unit) sensor, (Paragraph 27 teaches that according to some implementations, the electronic device 120 presents a XR environment to the user while the user is present within the physical environment 105. For example, an XR environment may include virtual reality (VR) content, augmented reality (AR) content, mixed reality (MR) content, or the like. Paragraph 38 teaches that in some implementations, the one or more I/O devices and sensors 306 include at least one of an inertial measurement unit (IMU) and paragraph 40 teaches that the one or more image sensor systems 314 include one or more RGB cameras (e.g., with a complimentary metal-oxide-semiconductor (CMOS) image sensor or a charge-coupled device (CCD) image sensor), monochrome camera, IR camera, event-based camera, or the like). However, Pease fails to teach and wherein for each sensor group, the camera and the IMU sensor are both mechanically mounted to a common rigid support such that a camera-IMU sensor spatial relationship between the camera and the IMU sensor is fixed and predefined, and wherein the camera and the IMU sensor are located within a prescribed maximum separation on the common rigid support. Hudman discloses and wherein for each sensor group, the camera and the IMU sensor are both mechanically mounted to a common rigid support such that a camera-IMU sensor spatial relationship between the camera and the IMU sensor is fixed and predefined (FIG. 
2 and paragraph 34 teach that FIG. 2 shows a front view of an example HMD system 200 when worn on the head of a user 202. FIG. 3 shows a top plan view of the HMD system 200, showing example fields of view 210 and 212 for some of the components of the HMD system 200. The HMD system 200 includes a support structure 204 that supports a front facing or forward camera 206 and a plurality of sensor ICs 208a-208n (collectively, 208). Additionally, paragraph 38 teaches that in operation, the HMD system 200 may fuse or otherwise combine data from the forward camera 206, the sensor ICs 208 and optionally the IMU 216 to track the position of the HMD system 200 during operation by the user 202. ... Optionally, the HMD system 200 may also fuse sensor data from the IMU 216 to further improve the position tracking of the HMD system.). Since Pease teaches a method comprising a plurality of sensor groups with one of the groups comprising a combination of a camera and an IMU sensor and Hudman teaches a Head-Mounted Display system where a camera, group of sensors and an IMU sensor are physically and mechanically mounted onto a rigid HMD support and the camera and the IMU sensor are also in a fixed and predefined position with one another, it would have been obvious to a person having ordinary skill in the art to combine the teachings together so that any type of sensor group consisting of a combination of a camera and an IMU sensor could be physically and mechanically mounted together on some type of rigid HMD and the spatial relationship between the camera and the IMU sensor would then be fixed and predefined. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Pease to incorporate the teachings of Hudman so that the combined features together would allow for the sensor group to be mechanically mounted onto a rigid support (like an HMD) and any camera and sensor, including an IMU sensor, would consist of a spatial relationship that is fixed and predefined which would help provide greater and more precise position tracking. However, Pease in view of Hudman fail to disclose and wherein the camera and the IMU sensor are located within a prescribed maximum separation on the common rigid support. Kalinowski discloses and wherein the camera and the IMU sensor are located within a prescribed maximum separation on the common rigid support (FIG. 5 and Col. 8, Lines 32-44 teach that as shown in FIG. 5, in addition to displays 14 and optical combiner systems 46, device 10 may have components such as cameras 72 and inertial measurement units 74. Cameras 72 may be, for example, front-facing (forward facing) cameras that face outwardly in directions 80 towards real-life objects such as object 90 while facing away from eye boxes 30. There may be one or more cameras 72 on either side of device 10. Cameras 72 may operate at visible light wavelengths and/or infrared light wavelengths. If desired, cameras 72 may include cameras that face to the sides of device 10 and/or in other directions. Inertial measurement units 74 may be coupled to cameras 72 (or may be mounted to adjacent rigid portions of structure 26).). 
Since Pease in view of Hudman teach a method and device comprising a plurality of sensor groups that contain a camera and an IMU sensor mounted onto a rigid support with fixed separation of one another and Kalinowski teaches a method and device comprising a plurality of sensor groups that involve multiple cameras and IMU sensors being coupled to each other by fixed separation and mounted to a common rigid support with maximum separation of the two camera-IMU sensor groups in order to properly combine visual data with each other because there would be no more room available to separate the groups of sensors on the rigid device without compromising the data being received from those sensor groups, it would have been obvious to a person having ordinary skill in the art to combine the teachings together so that any type of camera-IMU sensor spatial relationship would be considered when combining a camera and an IMU sensor system together and when physically and mechanically mounting the sensors together on a rigid support structure, the spatial relationship between the camera and the IMU sensor would then be fixed and set a maximum relationship to each of the different camera-IMU sensor groups. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Pease in view of Hudman to incorporate the teachings of Kalinowski so that the combined features together would allow for the camera-IMU sensor groups to take into account the spatial relationship of one another and that the spatial relationship would be fixed and allow for a maximum separation between the two sensor groups, which would help improve overall accuracy when tracking position data in relation to the multiple sensor groups. Additionally, Pease in view of Hudman and Kalinowski disclose accessing sensor groups data from the plurality of sensor groups (Paragraph 53 of Pease teaches that in some implementations, data (e.g., metadata) about the capabilities of the capturing devices is recorded such as sensors, hardware, software applications, or additional stored data. In some implementations, additional data (e.g., metadata) about the capture conditions is recorded such as time of day, lighting, location, subject, input/output data, and additional characteristics about the physical environment involved in recording the 3D video data): estimating a sensor group spatial relationship between the plurality of sensor groups based on the sensor groups data (Paragraph 47 of Pease teaches that in some implementations, spatial relationships of content of the 3D video to positional references (e.g., ground plane) are determined when created and used to align content represented in the 3D video for later viewing (e.g., during playback in replay environments). Additionally, Col. 9, Lines 22-40 of Kalinowski teach that if desired, cameras and inertial measurement units can operate in conjunction with each other to form visual inertial odometry (VIO) systems. For example, the camera 72 and inertial measurement unit 74 on the left side of device 10 can operate together as a left visual inertial odometry system that gathers orientation information on the left side of support 26 (and that therefore measures misalignment of the left image in the left eye box 30). 
The camera 72 and inertial measurement unit 74 on the right side of device 10 can likewise operate together as a right visual inertial odometry system that gathers orientation information on the right side of support 26 (and therefore measures misalignment of the right image in the right eye box 30). Visual inertial odometry systems operate using both camera data from cameras 72 and orientation data from inertial measurement units 74 and may therefore be more accurate and responsive than systems that use only camera data or only inertial measurement unit data (although such single-sensor orientation data may be used in device 10, if desired).); displaying virtual content in a display of the AR display device based on the sensor group spatial relationship between the plurality of sensor groups (Paragraph 87 of Pease teaches that in some implementations, the multiple (e.g., 4) 3D videos are processed as described herein and during playback, recorded metadata, recorded corresponding static objects, spatial relationships between single 3D coordinate system of each of the multiple (e.g., 4) 3D videos, actual size and independence of capture device motion, or the like allow the viewing electronic device to play back the multiple processed (e.g., 4) 3D videos in a single replay environment); estimating a bending of the AR display based on changed in the sensor group spatial relationship between the plurality of sensor groups (Col. 8, Lines 45-65 of Kalinowski teach that during operation of device 10, cameras 72 may gather real-world image data while inertial measurement units 74 (e.g., units containing accelerometers, compasses, and/or gyroscopes such as six-degrees-of-freedom inertial measurement units) gather associated orientation measurements. Using information from cameras 72 and/or orientation sensors such as inertial measurement units 74, device 10 can monitor the orientations of the left and right portions of structure 26 (e.g., to determine whether structure 26 has bent about axis 82 so that the left and right images from the left and right portions of device 10 are misaligned with respect to each other and with respect to left and right eye boxes 30). For example, if this orientation information indicates that cameras 72 are pointing away from each other more than expected, control circuitry 12 can conclude that displays 14 and optical combiner assemblies 46 on the left and right of structure 26 have been bent away from each other about axis 82 and that the images in the left and right eye boxes 30 should therefore be shifted relative to each other to compensate and thereby ensure that the images remain aligned with the left and right eye boxes 30. Additionally, Col 7, Lines 23-35 of Kalinowski teach that when deformation of structure 26 (e.g., bending between side portion 26E and front portion 26F) causes display 40 (e.g., axis 44) to become misoriented relative to optical combiner assembly 46 and thereby causes the image from display 14 to become misaligned with respect to eye box 30, control circuitry 12 can take corrective action. 
For example, control circuitry 12 can be configured to shift or otherwise warp the image being displayed by display 14 by an amount that is based on the amount of measured misalignment, thereby compensating for the misalignment and ensuring that the displayed image is not misaligned relative to eye box 30 even though optical components of device 10 are physically misaligned.); and correcting display errors resulting from the bending by adjusting the display in real time based on the estimated bending (Col. 10, Lines 6-15 of Kalinowski teaches that the image misalignment measurements of block 100 may be used to compensate the images from the left and/or right display systems for misalignment. In particular, during the operations of block 102, control circuitry 12 may shift the left and/or right images based on the misalignment measurements from the camera/database systems or the VIO systems or control circuitry 12 may otherwise warp images associated with displays 14 to compensate for the misalignment and Col. 10, Lines 25-29 teach that the image warping transforms that are applied during misalignment compensation operations may include geometrical transforms such as shifts, shears, rotations, etc. and may be applied to the image data being provided to displays 14 in real time.). Regarding claim 2, Pease in view of Hudman and Kalinowski disclose everything claimed as applied above (see claim 1), in addition, Pease in view of Hudman and Kalinowski disclose the method of claim 1, further comprising: accessing factory calibration data indicating a static or dynamic spatial relationship among the plurality of sensor groups, wherein the sensor group spatial relationship between each sensor group is predefined, wherein estimating the sensor group spatial relationship between the plurality of sensor groups is based on the factory calibration data (Paragraph 64 of Pease teaches that in some implementations, the 3D video is in a preset multimedia format. In some implementations, the preset multimedia format specifies file(s) that contains one or more tracks, each of which stores a particular type of data (e.g., audio, video, etc.). In some implementations, dynamic versus static information masks is one file or track of data. In some implementations, the preset format is a 3D video format type approved by an organization that develops standards for encoding multimedia such as digital audio and video. In some implementations, files for the 3D video can include RGB files or image files, depth maps with confidences, segmentation information, point cloud files for static reconstruction, video capture device (e.g., camera) metadata, time or location metadata, spatial audio, or the like); and updating the sensor group spatial relationship at runtime based on IMU data and image data from the plurality of sensor groups (Col. 8, Lines 22-31 of Kalinowski teach that as shown by line 78, the optical component orientation measurements of block 75 to detect misalignment and the corresponding misalignment compensation image processing adjustments that are performed at block 76 may be performed continuously (e.g., periodically such as every T seconds, where T is at least 1 microsecond, at least 1 ms, at least 1 s, at least 100 s, less than 100 hours, less than 1 hour, less than 10 minutes, or other suitable time period), upon detection of a drop event, upon power up, in response to a user-initiated calibration request, etc. Additionally, FIGs. 5 and 7 and Col. 10, Line 50 through Col. 
11, Line 3 of Kalinowski teach that Item 106 may include one or more sensors 110. Sensors 110 may include one or more sensors 16 such as cameras. When it is desired to calibrate device 10 and thereby measure any misalignment in the left and right images being displayed by device 10, device 10 may be turned on and directed to produce images in directions 56. During these operations, device 10 is not worn on a user's head, but rather is maintained in a fixed relationship relative to sensors 110 and item 106. Sensors 110, which may be mounted at the locations of eye boxes 30, can capture images of the displayed left and right images for processing by control circuitry in item 106 and/or device 10 to detect misalignment. If desired, portions of item 106 between sensors 110 (see, e.g., portion 112) and/or portions of item 106 between sensors 110 and the portions of receptacle 108 holding device 10 may be formed from rigid structures (rigid polymer, metal, fiber-composite material, etc.) so that measurement accuracy is satisfactory. If desired, sensors 110 may themselves be compensated for misalignment using techniques of the type described in connection with FIGS. 3, 4, 5, and 6. Lastly, Col. 11, Lines 45-50 of Kalinowski teach that during the operations of block 122 (e.g., later, after device 10 has been removed from item 106 and is being worn on a user's head), control circuitry 12 may use the stored calibration data (e.g., information on the measured misalignment) to compensate for the measured misalignment.). Regarding claim 3, Pease in view of Hudman and Kalinowski disclose everything claimed as applied above (see claim 2), in addition, Pease in view of Hudman and Kalinowski disclose the method of claim 2, further comprising: wherein one of the plurality of sensor groups comprises one of a first IMU sensor tightly coupled to a first camera, a second IMU sensor tightly coupled to a first component, or a second camera tightly coupled to a second component, wherein a first spatial relationship between the first IMU sensor and the first camera is fixed and predefined, wherein a second spatial relationship between the second IMU sensor and the first component is fixed and predefined, wherein a third spatial relationship between the second camera and the second component is fixed and predefined, wherein the first component or the second component comprises one of a display component, a projector, an illuminator, a LIDAR component, or an actuator (Paragraph 28 of Hudman teaches that in operation, at least one processor operatively coupled to the forward camera and the plurality of sensor ICs may receive the image sensor data from the camera and the plurality of optical flow sensor ICs. The at least one processor may process the received image sensor data to track a position of the head-mounted display based at least in part on the processing of the received image sensor data. For example, the at least one processor may fuse the sensor data from the forward camera and the plurality of sensor ICs to track one or more features present in an environment. In at least some implementations, the image sensor data may be fused with sensor data from other sensors, such as sensor data from an inertial measurement unit (IMU) of the HMD. Additionally, paragraph 37 of Pease teaches that FIG. 3 is a block diagram of an example of an electronic device 120 in accordance with some implementations. 
To that end, as a non-limiting example, in some implementations the electronic device 120 includes one or more input/output (I/O) devices and sensors 306, one or more programming (e.g., I/O) interfaces 310, one or more displays 312, one or more interior or exterior facing sensor systems 314, a memory 320, and one or more communication buses 304 for interconnecting these and various other components). Regarding claim 6, Pease in view of Hudman and Kalinowski disclose everything claimed as applied above (see claim 1), in addition, Pease in view of Hudman and Kalinowski disclose the method of claim 1, wherein estimating the spatial relationship between the plurality of sensor groups further comprises: fusing data from the plurality of sensor groups (Paragraph 102 of Pease teaches that in some implementations, multiple 3D videos are captured in a single physical environment or related physical environments. In some implementations, the multiple 3D videos may be spatially oriented in a combined coordinated playback in a XR environment); and correcting one or more sensor data based on the fused data (Paragraph 108 of Pease teaches that at block 1020, the method 1000 obtains first adjustments (e.g., first transforms) to align content represented in the images and depth data, the first adjustments accounting for movement of a device that captured the images and the depth data. In some implementations, the first adjustments are transforms identified based on motion sensor data for the capturing device, based on image data comparisons, or based on identifying one or more static objects and determining the first adjustments based on the static objects). Regarding claim 11, the apparatus steps correspond to and are rejected the same as the method steps of claim 1 (see claim 1 above). Regarding claim 12, the apparatus steps correspond to and are rejected the same as the method steps of claim 2 (see claim 2 above). Regarding claim 13, the apparatus steps correspond to and are rejected the same as the method steps of claim 3 (see claim 3 above). Regarding claim 16, the apparatus steps correspond to and are rejected the same as the method steps of claim 6 (see claim 6 above). Regarding claim 20, a non-transitory computer-readable storage medium corresponds to and is rejected the same as the method steps of claim 1 (see claim 1 above), in addition, Pease in view of Hudman and Kalinowski discloses a non-transitory computer-readable storage medium (Paragraph 31 of Pease teaches that in some implementations, the memory 220 or the non-transitory computer readable storage medium of the memory 220 stores the following programs, modules and data structures, or a subset thereof including an optional operating system 230, a XR module 240, a 3D video capture unit 250, and a 3D video presentation unit 260). Claims 4, 5, 14 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Pease in view of Hudman and Kalinowski, as applied to claim 3 above, and further in view of Ardisana, II et al. (Pub. No.: US 2020/0204787 A1), hereinafter Ardisana. Regarding claim 4, Pease in view of Hudman and Kalinowski disclose everything claimed as applied above (see claim 3), however, Pease in view of Hudman and Kalinowski fail to disclose wherein the plurality of sensor groups are mechanically mounted to a flexible eyewear frame of the AR display device, the flexible eyewear frame being configured to bend during use. 
Ardisana discloses wherein the plurality of sensor groups are mechanically mounted to a flexible eyewear frame of the AR display device, the flexible eyewear frame being configured to bend during use (FIGS. 1C and 1D and Paragraph 29 teach that for example, the stereoscopic imaging algorithm may be set based on the known fields of view of the cameras as shown FIG. 1C, which have sight lines that are substantially parallel to each other. As illustrated in FIG. 1D, however, when the user places eyewear 12 on their head, frame 13 may flex due to temples 14A, 14B bowing outward to bowed temple positions 14A′, 14B′, resulting in a change in the orientation of the cameras 10, 11.). Since Pease in view of Hudman and Kalinowski teach a type of AR device that can be shaped to a user’s face and can consist of multiple types of cameras and sensors coupled to each other and Ardisana teaches a type of AR device that is flexible and is configured to bend, it would have been obvious to a person having ordinary skill in the art to combine the concepts together so that when creating a type of AR device that can physically mount and couple a camera and sensor(s) together on an AR device that can shape to a user’s head/face, that device would also be able to bend and flex in order to better properly shape to the user’s head more accurately. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing data of the claimed invention to have modified Pease in view of Hudman and Kalinowski to incorporate the concepts of Ardisana, so that the combined features together would allow for the AR device to be flexible and bendable, which would provide the user with better overall comfort and usability when wearing the AR device. Additionally, Pease in view of Hudman, Kalinowski and Ardisana disclose estimating the spatial relationship between a first sensor group and a second sensor group of the plurality of sensor groups based on the sensor groups data (Paragraph 104 of Pease teaches that in some implementations, a second 3D video is generated including a second sequence of images, second depth data, and second adjustments that align second content represented in the second sequence of images and the second depth data of the physical environment in a second coordinate system to reduce effects of motion of the second sensors during capturing by second sensors. In some implementations, a spatial relationship between the first coordinate system of the 3D video and second coordinate system of the second 3D video provides spatially related playback of the 3D video and the second 3D video in a XR environment); and estimating a bending value of the flexible eyewear frame based on changes in the spatial relationship between the first sensor group and the second sensor group (Paragraph 46 of Ardisana teaches that in an alternative embodiment, the calibration offset is determined based on an amount of flexure experienced by the eyewear. The amount of flexure may be estimated based on a value generated by a strain gauge in the frame of the eyewear. For example, predefined offset values may be associated with predefined levels of strain (e.g., none, low, medium, and high). 
A difference calibration offset may be determined for each flexure amount (e.g., using steps 302 308) enabling the system to properly render and display stereoscopic images taking into account the amount of flexure.); estimating a bending of the AR display device based on the bending value form the changes in the spatial relationship between the first sensor group and the second sensor group (Paragraph 51 of Ardisana teaches that at block 326, the stereoscopic algorithm obtains a calibration offset (e.g., from the process described above with respect to FIG. 3A). In an example, controller 100 retrieves the calibration offset from memory 106. Controller 100 may first determine an amount of flexure the frame 12 is experiencing and select a calibration offset from memory 106 corresponding to the amount of flexure. Additionally, paragraph 52 teaches that at block 328, the stereoscopic algorithm adjusts a three-dimensional rendering offset (i.e., an offset between two captured images of a scene captured by cameras having a known relationship to one another in order to provide a three-dimensional effect) in a rendering algorithm by the obtained calibration offset. In an example, controller 100 adjusts the three-dimensional rendering offset by the calibration offset.); and transforming the display of the virtual content in the AR display device based on the bending and the estimated bending value to compensate for frame deformation (Paragraph 53 of Ardisana teaches that at block 330, the stereoscopic algorithm presents three dimensional images based on the rendered stereoscopic images using the adjusted offset. In an example, the stereoscopic algorithm presents the right and left images of the stereoscopic images to the right and left eyes, respectively, of an observer (e.g., via displays of the eyewear). The presented images are projected, taking the adjusted offset into account, in order provide a more realistic three-dimensional effect to the wearer. In another example, the stereoscopic algorithm blends the right and left images of the stereoscopic images on a display, taking the adjusted offset into account, in order provide a more realistic three-dimensional effect to the viewer.). Regarding claim 5, Pease in view of Hudman, Kalinowski and Ardisana disclose everything claimed as applied above (see claim 4), in addition, Pease in view of Hudman, Kalinowski and Ardisana disclose the method of claim 4, wherein: estimating the spatial relationship between the plurality of sensor groups is based on the factory calibration data and a combination of image data and IMU data from each sensor group (Paragraph 32 of Ardisana teaches that generally, the eyewear 12 performs a calibration prior to generating stereoscopic images. The calibration algorithm includes capturing images from both cameras 10 and 11 and determining the relative fields of view between the cameras by matching features between corresponding images captured by each of the cameras (i.e., what is the relative movement of a feature between right camera 10 and left camera 11. This calibration may be performed automatically by the eyewear, or upon user request (e.g. the user pressing a button such as button 32 (FIG. 1B)). Once calibration is performed, the eyewear may capture stereoscopic images for use in producing three dimensional images and/or producing three dimensional effects by taking into account changes to the sight lines/fields of view. 
Additionally, paragraph 26 of Hudman teaches that in operation, at least one processor operatively coupled to the forward camera and the plurality of sensor ICs may receive the image sensor data from the camera and the plurality of optical flow sensor ICs. The at least one processor may process the received image sensor data to track a position of the head-mounted display based at least in part on the processing of the received image sensor data. For example, the at least one processor may fuse the sensor data from the forward camera and the plurality of sensor ICs to track one or more features present in an environment. In at least some implementations, the image sensor data may be fused with sensor data from other sensors, such as sensor data from an inertial measurement unit (IMU) of the HMD.). Regarding claim 14, the apparatus steps correspond to and are rejected the same as the method steps of claim 4 (see claim 4 above). Regarding claim 15, the apparatus steps correspond to and are rejected the same as the method steps of claim 5 (see claim 5 above). Claims 7, 8, 17 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Pease in view of Hudman and Kalinowski, as applied to claim 1 above, and further in view of Sivan (Pub. No.: US 2020/0195833 A1). Regarding claim 7, Pease in view of Hudman and Kalinowski disclose everything claimed as applied above (see claim 1), in addition, Pease in view of Hudman and Kalinowski disclose the method of claim 1, further comprising: capturing a first set of image frames from a set of cameras of the AR display device (Paragraph 40 of Pease teaches that in some implementations, the one or more interior or exterior facing sensor systems 314 include an image capture device or array that captures image data); accessing IMU data between the first set of image frames and a second set of image frames, wherein the second set of image frames is generated after the first set of image frames (Paragraphs 51 and 52 of Pease teach that in some implementations the processing includes a plurality of techniques. First, in some implementations, depth values are determined for every frame of every scene of the 3D video. In some implementations, multiple possible depth values are initially obtained for a frame from nearby frames or frames captured later in the 3D video data. Second, in some implementations, for each frame, what is in motion and what is static is determined. 
This can include classifying objects or pixels representing objects in each frame that are dynamic (e.g., likely to move, such as people) or static (e.g., unlikely to move, such as walls, floors, tables, etc.)); estimating a first spatial relationship between each camera of the set of cameras for the first set of image frames (Paragraph 69 of Pease teaches the XR environment 500 may be generated from a frame of a sequence of frames captured by the first or second device 420, 425, for example, when executing an application in the physical environment 405 and Paragraph 104 teaches that in some implementations, a spatial relationship between the first coordinate system of the 3D video and second coordinate system of the second 3D video provides spatially related playback of the 3D video and the second 3D video in a XR environment); estimating a second spatial relationship between each camera of the set of cameras for the second set of image frames (Paragraph 104 of Pease teaches that in some implementations, a spatial relationship between the first coordinate system of the 3D video and second coordinate system of the second 3D video provides spatially related playback of the 3D video and the second 3D video in a XR environment). However, Pease in view of Hudman and Kalinowski fail to disclose with the IMU data. Sivan discloses with the IMU data (Paragraph 348 teaches that the first or the second device, or any other device, component, or apparatus herein, may further comprise a second inertial sensor that may comprise one or more accelerometers, one or more gyroscopes, or one or more magnetometers, or an IMU, for measuring a second spatial direction of the respective third enclosure or component). Since Pease in view of Hudman and Kalinowski teach estimating a second spatial relationship between each camera of the set of cameras for the second set of image frames and Sivan teaches with IMU data , it would have been obvious to a person having ordinary skill in the art to combine the teachings together so that while estimating the second spatial relationship between each of the cameras, the IMU data would also be taken into account and used in the estimating of the second spatial relationship. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing data of the claimed invention to have modified Pease in view of Hudman and Kalinowski to incorporate the teachings of Sivan, so that the combined features together would allow for more accurate and detailed estimations of the spatial relationship between each of the cameras when taking into account the IMU data. In addition, Pease in view of Hudman, Kalinowski and Sivan disclose and processing the second set of image frames based on the first spatial relationship and the second spatial relationship (Paragraph 78 of Pease teaches that in some implementations during playback, second adjustments or viewing adjustments are continuously calculated and applied to address movements of the viewing electronic device around the replay environment (e.g., the real world in the XR environment) during playback of the processed 3D video. In some implementations, the viewing adjustments allow the processed 3D video being played back to remain stationary in the replay environment relative to the moving viewing electronic device. 
In some implementations, the viewing adjustments (e.g., second transform) counteract movement (e.g., reduces or eliminates motion) by the viewing electronic device so that the rendered 3D video stays fixed in real space as seen by the user at the viewing electronic device). Regarding claim 8, Pease in view of Hudman, Kalinowski and Sivan disclose everything claimed as applied above (see claim 7), in addition, Pease in view of Hudman, Kalinowski and Sivan disclose the method of claim 7, wherein processing the second set of image frames comprises: adjusting a predicted location of the virtual content in the second set of image frames based on the second spatial relationship between each camera of the set of cameras for the second set of image frames (Paragraph 79 of Pease teaches that in a video frame N1, 3D video content in a replay environment is displayed at a certain pose. Then, in a video frame N2, the 3D video content in the replay environment does not move. However, between the frame N1 and the frame N2, the user of the viewing electronic device (viewing the 3D video content in the replay environment) moves a first distance to the right and turns the viewing electronic device (e.g., the head of the user) 15o counterclockwise in a horizontal plane. In some implementations, the viewing adjustments (e.g., second transform) counters the physical displacement and angular horizontal displacement (e.g., movement between the frames N1 and N2 by the viewing electronic device) so that the rendered 3D video content in the replay environment stays fixed in the replay environment as seen by the user of the viewing electronic device (e.g., because the 3D video content did not move between frames N1 and N2)). Regarding claim 17, the apparatus steps correspond to and are rejected the same as the method steps of claim 7 (see claim 7 above). Regarding claim 18, the apparatus steps correspond to and are rejected the same as the method steps of claim 8 (see claim 8 above). Claims 9, 10 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Pease in view of Hudman and Kalinowski, as applied to claim 1 above, and further in view of Osterhout et al.(U.S. Patent: 10,860,100 B2), hereinafter Osterhout. Regarding claim 9, Pease in view of Hudman and Kalinowski disclose everything claimed as applied above (see claim 1), however, Pease in view of Hudman and Kalinowski fail to disclose the method of claim 1, wherein: the AR display device comprises a proximity sensor. Osterhout discloses the AR display device comprises a proximity sensor (Col. 78 Lines 51-54 teach that in embodiments, a remote control for the eyepiece may be activated and/or controlled through a proximity sensor. A proximity sensor may be a sensor able to detect the presence of nearby objects without any physical contact). Since Pease in view of Hudman and Kalinowski teach the initial method and AR display device and Osterhout teaches a proximity sensor, it would have been obvious to a person having ordinary skill in the art to combine the teachings together so that the AR display device would also comprise of a proximity sensor to help assist in estimating spatial relationships. 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing data of the claimed invention to have modified Pease in view of Hudman and Kalinowski to incorporate the teachings of Osterhout so that the combined features together would provide a proximity sensor for the AR display device to improve the overall functionality and accuracy of the spatial relationship estimations. In addition, Pease in view of Hudman, Kalinowski and Osterhout disclose wherein the method comprises: detecting a trigger event based on proximity data from the proximity sensor, wherein estimating the spatial relationship between the plurality of sensor groups in in response to detecting the trigger event (Col. 93 Lines 56-58 of Osterhout teach that in embodiments, eyepiece facilities may provide for presenting displayed content corresponding to an identified marker indicative of the intention to display the content and Col. 93 Line 66 and Col. 94 Lines 1-5 teach that in embodiments, visual marker cues and their associated content for display may be stored in memory on the eyepiece, in an external computer storage facility and imported as needed (such as by geographic location, proximity to a trigger target, command by the user, and the like)). Regarding claim 10, Pease in view of Hudman, Kalinowski and Osterhout disclose everything claimed as applied above (see claim 1), in addition, Pease in view of Hudman, Kalinowski and Osterhout disclose the method of claim 1, wherein: the AR display device includes an eyewear frame (Col. 29 Lines 13-22 of Osterhout teach that in embodiments, the see-through optics system including a planar illumination facility 8208 and reflective display 8210 as described herein may be applied to any head-worn device known to the art, such as including the eyepiece as described herein, but also to helmets (e.g. military helmets, pilot helmets, bike helmets, motorcycle helmets, deep sea helmets, space helmets, and the like) ski goggles, eyewear, water diving masks, dusk masks, respirators, Hazmat head gear, virtual reality headgear, simulation devices, and the like). Regarding claim 19, the apparatus steps correspond to and are rejected the same as the method steps of claim 9 (see claim 9 above). Response to Arguments Applicant’s arguments with respect to independent claims 1, 11 and 20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. The newly added limitations from the amendment is now rejected when considering the previous prior art of Pease in view of Hudman, along with the newly added prior art of Kalinowski. Therefore, the combination of Pease in view of Hudman and the newly added prior art of Kalinowski appear to be capable of performing the intended use set forth in the independent claims 1, 11 and 20 (See claims 1, 11 and 20 above). In regards to any additional arguments regarding the dependent claims 2-10 and 12-19, for the virtue of their dependency are moot because the independent claims are not allowable. Furthermore, the newly added prior art of Kalinowski was incorporated into the rejection of the newly amended dependent claims 2 and 12 and therefore, the combination of Pease in view of Hudman and the newly added prior art of Kalinowski appear to be capable of performing the intended use set forth in the dependent claims 2 and 12 (See claims 2 and 12 above). 
Conclusion The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Hernandez et al. (Pub. No.: US 2024/0171726 A1) teaches an online calibration of display alignment in a head-worn device using multiple fixed IMUs rigidly coupled to an HMD. Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. Any inquiry concerning this communication or earlier communications from the examiner should be directed to George Renze whose telephone number is (703)756-5811. The examiner can normally be reached Monday-Friday 9:00am - 6:00pm EST. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Xiao Wu can be reached at (571) 272-7761. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /G.R./Examiner, Art Unit 2613 /XIAO M WU/Supervisory Patent Examiner, Art Unit 2613
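
For readers skimming the technical substance of the rejection: claim 1, as characterized in the office action above, pairs a camera and an IMU on a common rigid support (a "sensor group"), estimates the spatial relationship between such groups at runtime, treats deviation from the factory-calibrated relationship as frame bending, and corrects the display in real time. The sketch below illustrates that pipeline in the simplest possible form; all names, the single-axis (yaw-only) model, and the pixel-shift compensation are illustrative assumptions, not the application's or the cited references' implementation.

```python
# Illustrative sketch only (hypothetical names, 1-DOF small-angle model).
# Pipeline: two rigidly coupled camera-IMU sensor groups -> current inter-group
# relationship -> bend = deviation from factory calibration -> display shift.
import math
from dataclasses import dataclass

@dataclass
class SensorGroup:
    """One camera + IMU mounted on a common rigid support; exposes a fused yaw."""
    yaw_rad: float  # orientation from the group's own camera/IMU fusion (e.g. VIO)

def relative_yaw(left: SensorGroup, right: SensorGroup) -> float:
    """Current spatial relationship between the two sensor groups (yaw only)."""
    return right.yaw_rad - left.yaw_rad

def estimate_bend(left: SensorGroup, right: SensorGroup,
                  calibrated_relative_yaw: float) -> float:
    """Bending = change in the inter-group relationship vs. factory calibration."""
    return relative_yaw(left, right) - calibrated_relative_yaw

def compensate_shift_px(bend_rad: float, focal_px: float) -> float:
    """Horizontal image shift (pixels) that keeps rendered content aligned
    despite the bend, under a small-angle pinhole approximation."""
    return focal_px * math.tan(bend_rad)

# Usage: the frame bends 0.5 degrees beyond the calibrated relationship.
left = SensorGroup(yaw_rad=0.0)
right = SensorGroup(yaw_rad=math.radians(10.5))
bend = estimate_bend(left, right, calibrated_relative_yaw=math.radians(10.0))
print(f"bend {math.degrees(bend):.2f} deg -> shift "
      f"{compensate_shift_px(bend, focal_px=600):.1f} px")
```

Kalinowski's compensation, as quoted above, applies more general real-time warps (shifts, shears, rotations) rather than the single horizontal shift used here for brevity.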
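
The Ardisana reference applied to claims 4-5 selects a pre-stored calibration offset by bucketed flexure level (e.g., none, low, medium, high from a strain gauge) and adjusts the stereoscopic rendering offset accordingly. A hedged sketch of that lookup follows; the thresholds and offset values are invented placeholders, not figures from the reference or the application.

```python
# Illustrative sketch of the flexure-bucketed calibration lookup described for
# Ardisana in the rejection of claims 4-5. All numbers are invented placeholders.
FLEXURE_OFFSETS_PX = {"none": 0.0, "low": 1.5, "medium": 3.0, "high": 5.0}

def flexure_level(strain: float) -> str:
    """Bucket a strain-gauge reading into the predefined flexure levels."""
    if strain < 0.1:
        return "none"
    if strain < 0.3:
        return "low"
    if strain < 0.6:
        return "medium"
    return "high"

def rendering_offset(base_offset_px: float, strain: float) -> float:
    """Adjust the 3D rendering offset by the calibration offset associated
    with the current flexure level before presenting stereoscopic images."""
    return base_offset_px + FLEXURE_OFFSETS_PX[flexure_level(strain)]

print(rendering_offset(base_offset_px=40.0, strain=0.45))  # -> 43.0
```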

Prosecution Timeline

Mar 15, 2023: Application Filed
Dec 23, 2024: Non-Final Rejection — §103
Apr 02, 2025: Response Filed
Jun 04, 2025: Final Rejection — §103
Aug 06, 2025: Request for Continued Examination
Aug 07, 2025: Response after Non-Final Action
Sep 03, 2025: Non-Final Rejection — §103
Dec 05, 2025: Response Filed
Mar 28, 2026: Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602407: SYSTEMS AND METHODS FOR GENERATING A UNIQUE IDENTITY FOR A GEOSPATIAL OBJECT CODE BY PROCESSING GEOSPATIAL DATA (granted Apr 14, 2026; 2y 5m to grant)
Patent 12573147: LANDMARK DATA COLLECTION METHOD AND LANDMARK BUILDING MODELING METHOD (granted Mar 10, 2026; 2y 5m to grant)
Patent 12555315: HEURISTIC-BASED VARIABLE RATE SHADING FOR MOBILE GAMES (granted Feb 17, 2026; 2y 5m to grant)
Patent 12530759: System and Method for Point Cloud Generation (granted Jan 20, 2026; 2y 5m to grant)
Patent 12505508: DIGITAL IMAGE RADIAL PATTERN DECODING SYSTEM (granted Dec 23, 2025; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 67%
With Interview: 99% (+33.3%)
Median Time to Grant: 2y 7m
PTA Risk: High
Based on 24 resolved cases by this examiner. Grant probability derived from career allow rate.
