DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 02/26/2026 has been entered. Claims 1, 12, and 20 were amended. Claims 1-10 and 12-23 are pending in the application.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 5-10, 12, and 16-23 are rejected under 35 U.S.C. 103 as being unpatentable over Guizilini et al. (US 2022/0148204) in view of Wilkinson et al. (US 9352230).
Regarding claim 1, Guizilini teaches/suggests: An apparatus comprising:
a non-transitory, machine-readable storage medium storing instructions; and at least one processor coupled to the non-transitory, machine-readable storage medium, the at least one processor being configured to execute the instructions (Guizilini [0009] “The depth system includes one or more processors and a memory communicably coupled to the one or more processors”) to:
process image data to generate feature data representing image features of the image data (Guizilini [0037]-[0038] “the image encoder 410 processes the monocular image 450 … image features from the image encoder 410 derived from the monocular image 450”);
sparse depth values in a two-dimensional (2D) space (Guizilini [0037] “the sensor data 250, which includes sparse depth data 440 and a monocular image 450”);
generate predicted depth values for the image data based on the feature data and the 2D space including the sparse depth values (Guizilini [0039] “the depth model 250 injects the sparse depth data 440 into the encoder/decoder structure and improves an accuracy of the depth map 260 as the final output”); and
store the predicted depth values in a data repository (Guizilini [0029] “the data store 230 includes … a depth map 260”).
Guizilini does not teach/suggest:
process, using a six degrees of freedom (6DOF) tracker, the image data and sensor data to determine three-dimensional (3D) feature data associated with keypoints in the image data;
apply a depth estimation process to the 3D feature data to generate sparse depth values in a two-dimensional (2D) space, wherein the sparse depth values correspond to the keypoints tracked by the 6DOF tracker and are generated based on projection of the 3D feature data into the 2D space at locations of the keypoints;
Wilkinson, however, teaches/suggests:
process, using a six degrees of freedom (6DOF) tracker, the image data and sensor data to determine three-dimensional (3D) feature data associated with keypoints in the image data (Wilkinson col. 7 ll. 11-35 “The sensor data 606 is processed by a tracker 607 as described previously, resulting in a set of consecutive points or positions 608 in a 3 dimensional space defined by the tracker 607” col. 2 ll. 37-53 “The motion of the controller in six degrees of freedom is tracked by analyzing sensor data from the inertial sensors in conjunction with video images”);
apply a depth estimation process to the 3D feature data to generate sparse depth values in a two-dimensional (2D) space, wherein the sparse depth values correspond to the keypoints tracked by the 6DOF tracker and are generated based on projection of the 3D feature data into the 2D space at locations of the keypoints (Wilkinson col. 7 ll. 11-35 “estimate the depth of each of the 2 dimensional points from the camera at 610, resulting in a set of consecutive points 611 in a 3 dimensional space determined by the camera” col. 9 ll. 1-7 “where the diagram 801 indicates three points (AA, BB, CC) in a tracker space and the diagram 802 indicates the corresponding three points (A, B, C) in a camera space as well as the projections of two of the points onto an image plane (pA, pC)”);
Before the effective filing date of the claimed invention, the substitution of one known element (the estimated sparse depth data of Wilkinson) for another (the sparse depth data of Guizilini) would have been obvious to one of ordinary skill in the art because such substitutions would have yielded predictable results, namely to improve the depth map.
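For illustration only (this sketch appears in neither Guizilini nor Wilkinson and forms no part of the record or the rejection), the claimed projection of 6DOF-tracked 3D keypoints into sparse depth values in a 2D space can be modeled as a standard pinhole projection. The intrinsic parameters (fx, fy, cx, cy) and keypoint coordinates below are hypothetical values chosen for the example.

```python
# Illustrative sketch only: projecting tracked 3D keypoints (camera frame)
# into the 2D image plane, recording each point's depth at its pixel location.
# Intrinsics and keypoints are hypothetical, not taken from either reference.

def project_keypoints(points_3d, fx, fy, cx, cy):
    """Pinhole projection: each 3D keypoint (X, Y, Z) yields a 2D pixel
    location (u, v) and a sparse depth value Z at that location."""
    sparse_depth = {}
    for X, Y, Z in points_3d:
        if Z <= 0:                          # behind the camera; skip
            continue
        u = int(round(fx * X / Z + cx))     # horizontal pixel coordinate
        v = int(round(fy * Y / Z + cy))     # vertical pixel coordinate
        sparse_depth[(u, v)] = Z            # depth stored at keypoint's pixel
    return sparse_depth

# Example: three tracked keypoints; the third is behind the camera.
depths = project_keypoints(
    [(0.0, 0.0, 2.0), (1.0, 0.4, 4.0), (0.2, 0.1, -1.0)],
    fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

The result is a sparse 2D map in which depth is defined only at the projected keypoint locations, consistent with the claim language "sparse depth values correspond to the keypoints tracked by the 6DOF tracker."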
Regarding claim 5, Guizilini as modified by Wilkinson teaches/suggests: The apparatus of claim 1, wherein the at least one processor is further configured to execute the instructions to:
apply a first encoding process to the image data to generate the feature data, the feature data including a first set of features (Guizilini [0038] “image features from the image encoder 410 derived from the monocular image 450”);
apply a second encoding process to the sparse depth values to generate a second set of features (Guizilini [0038] “The depth features are encoded features from the sparse depth data 440”); and
apply a decoding process to the first set of features and the second set of features to generate the predicted depth values (Guizilini [0039] “while the depth decoder 420 receives a feature map of image features from the image encoder 410 and iteratively decodes the feature map through subsequent spatial dimensions, the skip connections provide the residual image features concatenated with the depth features into the respective iterations”).
Regarding claim 6, Guizilini as modified by Wilkinson teaches/suggests: The apparatus of claim 5, wherein the at least one processor is further configured to execute the instructions to provide at least one skip connection from the second encoding process to the decoding process (Guizilini [0046] “which the encoding layers provide to subsequent layers in the depth model 250, including specific layers of an image decoder 420 via skip connections that may function to provide residual information between the image encoder 410 and the image decoder 420”).
Regarding claim 7, Guizilini as modified by Wilkinson teaches/suggests: The apparatus of claim 6,
wherein the at least one skip connection comprises a first skip connection and a second skip connection (Guizilini [0046] “which the encoding layers provide to subsequent layers in the depth model 250, including specific layers of an image decoder 420 via skip connections that may function to provide residual information between the image encoder 410 and the image decoder 420”),
wherein the at least one processor is configured to execute the instructions to:
provide the first skip connection from a first layer of the second encoding process to a first layer of the decoding process (Guizilini [0046] “which the encoding layers provide to subsequent layers in the depth model 250, including specific layers of an image decoder 420 via skip connections that may function to provide residual information between the image encoder 410 and the image decoder 420”); and
provide a second skip connection from a second layer of the second encoding process to a second layer of the decoding process (Guizilini [0046] “which the encoding layers provide to subsequent layers in the depth model 250, including specific layers of an image decoder 420 via skip connections that may function to provide residual information between the image encoder 410 and the image decoder 420”).
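For illustration only (a toy stand-in, not Guizilini's actual network), the architecture addressed in claims 5-7, in which a first encoder processes the image, a second encoder processes the sparse depth values, and the decoder receives per-layer skip connections from the second encoder, can be sketched as follows. The layer count and the plain-list "features" are illustrative assumptions.

```python
# Illustrative toy sketch of a dual-encoder/decoder layout with per-layer
# skip connections (claims 5-7). Not an implementation of any reference.

def encode(x, n_layers):
    """Stand-in encoder: returns one 'feature' list per layer."""
    feats = []
    for _ in range(n_layers):
        x = [v * 2 for v in x]        # stand-in for one encoding layer
        feats.append(x)
    return feats

def decode(image_feats, depth_feats):
    """Decoder walks the layers in reverse; at each step a skip connection
    concatenates the matching depth-encoder features (one skip connection
    per layer pair, as in claims 6-7)."""
    out = []
    for img_f, dep_f in zip(reversed(image_feats), reversed(depth_feats)):
        out = out + img_f + dep_f     # concatenation stands in for fusion
    return out

image_feats = encode([1], n_layers=2)    # first encoding process (image)
depth_feats = encode([10], n_layers=2)   # second encoding process (sparse depth)
pred = decode(image_feats, depth_feats)  # decoding with skip connections
```

The point of the sketch is structural: each decoding iteration consumes features from both encoders, mirroring the mapping of the first/second skip connections to first/second layers in claim 7.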
Regarding claim 8, Guizilini as modified by Wilkinson teaches/suggests: The apparatus of claim 5, wherein the at least one processor is configured to execute the instructions to:
obtain first parameters from the data repository, and establish the first encoding process based on the first parameters (Guizilini [0029] “the data store 230 includes … a depth model 250” [0040] “the separate encoder/decoder structures 400, 410, and 420 are comprised of multiple different layers” [The first parameters are inherent/implicit features of the encoder.]);
obtain second parameters from the data repository, and establish the second encoding process based on the second parameters (Guizilini [0029] “the data store 230 includes … a depth model 250” [0040] “the separate encoder/decoder structures 400, 410, and 420 are comprised of multiple different layers … the SAN 400 further includes learned weights 500 and 510” [The weights meet the second parameters.]); and
obtain third parameters from the data repository, and establish the decoding process based on the third parameters (Guizilini [0029] “the data store 230 includes … a depth model 250” [0040] “the separate encoder/decoder structures 400, 410, and 420 are comprised of multiple different layers” [The third parameters are inherent/implicit features of the decoder.]).
Regarding claim 9, Guizilini as modified by Wilkinson does not teach/suggest: The apparatus of claim 1, wherein the image data includes a monochrome image. However, the concept and advantages of a monochrome image are well known and expected in the art (Official Notice). It would have been obvious for the monocular images of Guizilini as modified by Wilkinson to include a monochrome image, e.g., to provide a black-and-white-only image.
Regarding claim 10, Guizilini as modified by Wilkinson teaches/suggests: The apparatus of claim 1 comprising at least one camera, wherein the at least one camera is configured to capture the image data (Guizilini [0030] “the monocular images are, for example, images from the camera 126”).
Claims 12, 16-19, and 21 recite limitation(s) similar in scope to those of claims 1 and 5-9, respectively, and are rejected for the same reason(s).
Claim 20 recites limitation(s) similar in scope to those of claim 1, and is rejected for the same reason(s).
Regarding claim 22, Guizilini as modified by Wilkinson teaches/suggests: The apparatus of claim 1, wherein, to apply the depth estimation process, the at least one processor is configured to execute the instructions to use a sparse point engine (Guizilini [0037] “the sensor data 250, which includes sparse depth data 440 and a monocular image 450” Wilkinson col. 7 ll. 11-35 “The image data 602 is processed by a module (e.g., a software module stored in memory and executed by a processor) capable of locating markers in the image data 602, resulting in a set of consecutive points 604 in the 2 dimensional image plane of the camera with some known correspondences to the tracker points 608”). The module of Wilkinson meets the claimed sparse point engine. The same rationale to combine as set forth in the rejection of claim 1 above is incorporated herein.
Claim 23 recites limitation(s) similar in scope to those of claim 22, and is rejected for the same reason(s).
Claims 2, 4, 13, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Guizilini et al. (US 2022/0148204) in view of Wilkinson et al. (US 9352230) as applied to claims 1 and 12 above, and further in view of Sinha (US 2021/0279904).
Regarding claim 2, Guizilini and Wilkinson are silent regarding: The apparatus of claim 1, wherein the at least one processor is configured to execute the instructions to generate an output image based on the predicted depth values. Sinha, however, teaches/suggests generate an output image based on the predicted depth values (Sinha [0020] “The estimated depths can be utilized by a spatial computing system, for example, to provide an accurate and effective 3D XR experience … the estimated depths may be used to generate a 3D reconstruction”). Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to use the depth map of Guizilini as modified by Wilkinson for display, as taught/suggested by Sinha, to provide a 3D XR experience.
Regarding claim 4, Guizilini as modified by Wilkinson and Sinha teaches/suggests: The apparatus of claim 2, wherein the at least one processor is configured to execute the instructions to provide the output image for viewing in an extended reality environment (Sinha [0020] “The estimated depths can be utilized by a spatial computing system, for example, to provide an accurate and effective 3D XR experience … the estimated depths may be used to generate a 3D reconstruction”). The same rationale to combine as set forth in the rejection of claim 2 above is incorporated herein.
Claims 13 and 15 recite limitation(s) similar in scope to those of claims 2 and 4, respectively, and are rejected for the same reason(s).
Claims 3 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Guizilini et al. (US 2022/0148204) in view of Wilkinson et al. (US 9352230) and Sinha (US 2021/0279904) as applied to claims 2 and 13 above, and further in view of Zhang (US 2017/0295373).
Regarding claim 3, Guizilini as modified by Wilkinson and Sinha does not teach/suggest: The apparatus of claim 2, wherein the at least one processor is configured to execute the instructions to generate pose data characterizing a pose of a user, and generate the output image based on the pose data. Zhang, however, teaches/suggests generate pose data characterizing a pose of a user, and generate the output image based on the pose data (Zhang [0021] “as the user's pose and gaze direction change over time, the HMD device 100 commensurately alters the focus region and peripheral region so that the high-resolution portion of the displayed image remains within the user's area of focus”). Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to display the 3D reconstruction of Guizilini as modified by Wilkinson and Sinha based on the user's pose/gaze, as taught/suggested by Zhang, so that the displayed image remains within the user's area of focus.
Claim 14 recites limitation(s) similar in scope to those of claim 3, and is rejected for the same reason(s).
Response to Arguments
Applicant's arguments filed 02/26/2026 have been fully considered but they are moot in view of the new ground(s) of rejection set forth in this Office action.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
US 2014/0152647 – depth estimation
US 2025/0299350 – depth estimation
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANH-TUAN V NGUYEN whose telephone number is 571-270-7513. The examiner can normally be reached on M-F 9AM-5PM ET. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, JASON CHAN can be reached on 571-272-3022. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ANH-TUAN V NGUYEN/
Primary Examiner, Art Unit 2619