Prosecution Insights
Last updated: April 19, 2026
Application No. 18/658,929

SYSTEM AND METHODS FOR DEPTH-AWARE VIDEO PROCESSING AND DEPTH PERCEPTION ENHANCEMENT

Status: Final Rejection (§103)
Filed: May 08, 2024
Examiner: CHEN, YU
Art Unit: 2613
Tech Center: 2600 — Communications
Assignee: Huawei Technologies Co., Ltd.
OA Round: 2 (Final)

Grant Probability: 68% (Favorable)
Expected OA Rounds: 3-4
Median Time to Grant: 2y 10m
Grant Probability With Interview: 98%

Examiner Intelligence

Career Allow Rate: 68% — above average (711 granted / 1052 resolved; +5.6% vs TC avg)
Interview Lift: +29.9% among resolved cases with an interview — a strong lift
Typical Timeline: 2y 10m average prosecution; 110 applications currently pending
Career History: 1162 total applications across all art units

Statute-Specific Performance

§101: 2.2% (-37.8% vs TC avg)
§103: 43.9% (+3.9% vs TC avg)
§102: 27.0% (-13.0% vs TC avg)
§112: 20.7% (-19.3% vs TC avg)

Tech Center averages are estimates. Based on career data from 1052 resolved cases.
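The per-statute deltas all point back to one baseline. A quick check, assuming (our reading of the layout, not stated by the dashboard) that each "vs TC avg" figure is a simple percentage-point difference:

```python
# Recover the implied Tech Center baseline from the reported deltas.
# Variable names are illustrative; the dashboard does not define them.
examiner = {"101": 2.2, "103": 43.9, "102": 27.0, "112": 20.7}
delta    = {"101": -37.8, "103": 3.9, "102": -13.0, "112": -19.3}

for statute in examiner:
    tc_avg = examiner[statute] - delta[statute]
    print(f"§{statute}: examiner {examiner[statute]:.1f}% vs TC avg {tc_avg:.1f}%")
# Every statute resolves to the same 40.0% baseline.
```

All four deltas resolve to the same 40.0% figure, which suggests the dashboard compares each statute-specific rate against a single overall Tech Center estimate rather than per-statute averages.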

Office Action

§103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION

Response to Amendment

This is in response to applicant’s amendment/response filed on 01/30/2026, which has been entered and made of record. Claims 1, 8, and 15 have been amended. No claim has been cancelled. No claim has been added. Claims 1-20 are pending in the application. As an initial matter, the rejections under 35 U.S.C. 112 for claims 1-20 have been withdrawn in view of applicant's amendments.

Response to Arguments

Applicant's arguments filed on 01/30/2026 regarding the rejection of the claims under 35 U.S.C. 103 have been fully considered, but they are not persuasive. Applicant submits that “the references do not teach: Generating a plurality of depth-aware processed signals wherein the plurality of depth-aware processed signals comprise depth-aware enhanced edge emphasis signals, depth-aware enhanced detail signals, and depth-aware enhanced imaging data; That these depth aware signals are generated from the plurality of edge emphasis signals, the plurality of detail signals, and the base signal, and using the scene lighting mode vector and the plurality of depth values, and a nonlinear emphasis amplitude modulation, Or the steps of generating, from the plurality of depth-aware processed signals, the depth-aware enhanced imaging data” (Remarks, Page 14).

The examiner disagrees with Applicant’s premises and conclusion. Roy teaches generating, from the plurality of depth-aware processed signals, the depth-aware enhanced imaging data (¶0035, “a depth map can be utilized to create relit “gain” maps that are applied to the images. The relit images can then be combined together to form a final image of a scene, ideally with more aesthetic-pleasing relighting effects.”). Xie teaches a nonlinear emphasis amplitude modulation (Page 7, “In this way, non-zero gradients will be finally aggregated near sharp features, while zero gradients mainly stay at the region with small scale variations in normal field. Thus, it is feasible to distinguish structure constituents now due to the amplitudes of details having dropped below these of features in section 4.1.” Non-zero and zero gradients suggest nonlinear emphasis.).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
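Before the formal rejection, it helps to pin down what the examiner's gradient argument amounts to. Neither reference states the modulation function; a soft-thresholding nonlinearity is one standard mechanism that drives small gradient amplitudes to zero while preserving large ones. The sketch below is ours, for illustration only, and appears in neither Roy nor Xie:

```python
import numpy as np

def soft_threshold(grad: np.ndarray, tau: float) -> np.ndarray:
    """Generic amplitude nonlinearity of the kind the gradient argument
    implies: gradients with magnitude below tau collapse to zero (small-
    scale detail), while larger ones (sharp features) survive, shrunk by
    tau. Illustrative only; this exact function is in neither reference."""
    return np.sign(grad) * np.maximum(np.abs(grad) - tau, 0.0)

g = np.array([-0.02, 0.05, -0.80, 1.30, 0.01])
print(soft_threshold(g, tau=0.1))  # [-0.   0.  -0.7  1.2  0. ]
```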
Claims 1-2, 5, 8-9, 12, 15-16, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Roy et al. (US Pub 2022/0020128 A1) in view of Xie, Haoran, et al. ("Joint weighted least squares for normal decomposition of 3D measurement surface." Measurement Science and Technology 31.4 (2020): 045401).

As to claim 1, Roy discloses a computer-implemented method comprising: obtaining imaging data (Abstract, “capturing a plurality of first images using a first image sensor and a plurality of second images using a second image sensor”); obtaining a depth map including a plurality of depth values for the imaging data (¶0068, “Each depth estimator 530 and 532 operates to calculate a depth map of the scene 506 using the same exposure views from the image sensors 508 and 510.”) and a scene lighting mode vector characterizing a scene lighting of the imaging data (Fig. 11C, ¶0099-0100, “The light vector map generation operation 1138 generates a light vector map from the depth map 616 and the lighting conditions 1124. The light source position(s) from the lighting conditions 1124 can be used to determine the light vector. In some embodiments, a unit incident light ray vector 1148 (l) can be determined as follows:”); generating, using the plurality of depth values, a plurality of edge emphasis signals by a depth edge filtering process (Fig. 7C, ¶0081, “the main block operation 704 includes an edge strength filter 742, a YUV transform operation 746, and a sub-main block operation 756. The edge strength filter 742 operates to identify edges in the Y reference frame 710 and to produce a normal map 744.”); generating a plurality of depth-aware processed signals from the plurality of edge emphasis signals (Abstract, Fig. 7C, Fig. 11C, ¶0068, ¶0081, ¶0099-0100; ¶0035, “a depth map can be utilized to create relit “gain” maps that are applied to the images. The relit images can then be combined together to form a final image of a scene, ideally with more aesthetic-pleasing relighting effects.”); generating, from the plurality of depth-aware processed signals, the depth-aware enhanced imaging data (¶0035, “a depth map can be utilized to create relit “gain” maps that are applied to the images. The relit images can then be combined together to form a final image of a scene, ideally with more aesthetic-pleasing relighting effects.”); and providing the depth-aware enhanced imaging data for display on a display device (¶0035 and ¶0042).

Roy does not explicitly disclose generating, using the imaging data and the plurality of depth values, a plurality of detail signals and a base signal by a joint three-dimensional (3D) spatial-depth-value filtering process; and a nonlinear emphasis amplitude modulation.
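Two of the mapped steps are easy to make concrete. Below is a minimal sketch of a per-pixel light vector map in the spirit of Roy Fig. 11C, plus a toy depth edge filter. The function names and the gradient-magnitude filter are our assumptions; note the mismatch an applicant could press, since Roy's edge strength filter runs on the Y reference frame (¶0081), not on the depth values the claim recites:

```python
import numpy as np

def unit_light_vectors(points_3d: np.ndarray, light_pos: np.ndarray) -> np.ndarray:
    """Per-pixel unit incident light ray vectors in the spirit of Roy's
    light vector map (Fig. 11C): the direction from each depth-derived 3D
    point toward the light source, normalized to unit length."""
    rays = light_pos - points_3d                        # (H, W, 3)
    norms = np.linalg.norm(rays, axis=-1, keepdims=True)
    return rays / np.maximum(norms, 1e-8)

def depth_edge_emphasis(depth: np.ndarray) -> np.ndarray:
    """Toy depth edge filter: the gradient magnitude of the depth map
    stands in for the claimed edge emphasis signals. This filters the
    depth values directly, as claim 1 recites, unlike Roy's Y-frame
    edge strength filter."""
    gy, gx = np.gradient(depth.astype(np.float64))
    return np.hypot(gx, gy)
```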
Xie teaches generating, using the imaging data and the plurality of depth values, a plurality of detail signals and a base signal by a joint three-dimensional (3D) spatial-depth-value filtering process (Xie, Abstract, “We propose a joint weighted least squares (JWLS) to solve the challenging problem of how to filter out the detailed appearance (geometric details) and preserve intrinsic geometric properties (structural patterns) of any measurement surface simultaneously”; Page 2, “Based on our new geometry assumption, we propose a novel 3D joint WLS algorithm or simply JWLS for decoupling a mesh normal field to a base layer and a detail layer,” “we design a three-step filter for effective decomposition of a surface’s detail layer and base layer,” “we assume that a 3D surface contains three geometric properties, geometric detail, structure pattern (step edge), and smooth-varying shape”; Page 9, “Surface normal decomposition, i.e. JWLS, has three parameters, i.e. λ1, λ2 and α, which are fixed as λ1 = 1.5, λ2 = 0.3 and α = 1.2 for 3D models, and λ1 = 2, λ2 = 0.45 and α = 1.5 for depth images”); depth-aware enhanced detail signals (Page 2, “we propose a novel 3D joint WLS algorithm or simply JWLS for decoupling a mesh normal field to a base layer and a detail layer,” “Under this geometric assumption, we separate a surface into a detail layer and a base layer, which correspond to geometric details, and the smooth-varying shape plus step edges, respectively.”; Page 9, “We have also done experiments on depth images. Depth images are a kind of height fields, which also need to be compressed to produce bas-reliefs.”); and a nonlinear emphasis amplitude modulation (Page 7, “In this way, non-zero gradients will be finally aggregated near sharp features, while zero gradients mainly stay at the region with small scale variations in normal field. Thus, it is feasible to distinguish structure constituents now due to the amplitudes of details having dropped below these of features in section 4.1.” Non-zero and zero gradients suggest nonlinear emphasis.).

Roy and Xie are considered to be analogous art because both pertain to image generation. It would have been obvious before the effective filing date of the claimed invention to have modified Roy with the features of “generating, using the imaging data and the plurality of depth values, a plurality of detail signals and a base signal by a joint three-dimensional (3D) spatial-depth-value filtering process; depth-aware enhanced detail signals; and a nonlinear emphasis amplitude modulation” as taught by Xie. The suggestion/motivation would have been in order to filter out the detailed appearance (geometric details) and preserve intrinsic geometric properties (structural patterns) (Xie, Abstract).

As to claim 2, claim 1 is incorporated and the combination of Roy and Xie discloses generating, from the imaging data, the depth map of the imaging data (Roy, Abstract, “generating a short single view, short depth map, long single view, and long depth map from the first and second images.”); and determining, using the depth map, the plurality of depth values (Roy, ¶0035, “a depth map can be utilized to create relit “gain” maps that are applied to the images.”; ¶0096, “A depth-based relighting operation 1102 applies relighting to the short single view 534 using the multi-frame depth map 616, and a depth-based relighting operation 1106 applies relighting to the long single view 538 using the multi-frame depth map 616.”).
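The contested "joint 3D spatial-depth-value filtering" limitation is easier to reason about with a concrete stand-in. Xie's actual JWLS solves a weighted least-squares system on surface normals, which is too long to reproduce here; the sketch below instead uses a joint bilateral filter whose range weight is driven by the depth map, an assumed (not Xie's) instance of filtering that yields a base signal and a detail signal:

```python
import numpy as np

def joint_bilateral_split(image: np.ndarray, depth: np.ndarray,
                          radius: int = 3, sigma_s: float = 2.0,
                          sigma_d: float = 0.1):
    """Split an image into a base signal and a detail signal using a joint
    bilateral filter whose range weight comes from the depth map -- an
    assumed stand-in for the claimed joint 3D spatial-depth-value
    filtering, not Xie's JWLS solver."""
    h, w = image.shape
    base = np.zeros((h, w), dtype=np.float64)
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xs ** 2 + ys ** 2) / (2 * sigma_s ** 2))
    img_p = np.pad(image.astype(np.float64), radius, mode="edge")
    dep_p = np.pad(depth.astype(np.float64), radius, mode="edge")
    for y in range(h):
        for x in range(w):
            win_i = img_p[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            win_d = dep_p[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            weight = spatial * np.exp(
                -(win_d - depth[y, x]) ** 2 / (2 * sigma_d ** 2))
            base[y, x] = (weight * win_i).sum() / weight.sum()
    detail = image - base   # the detail signal rides on top of the base
    return base, detail
```

A depth-driven range term means pixels only smooth across regions at similar depth, so the base signal preserves depth discontinuities while the detail signal captures fine texture, mirroring the base/detail decomposition Xie is cited for.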
As to claim 5, claim 1 is incorporated and the combination of Roy and Xie discloses obtaining the scene lighting mode vector comprises: generating, from the imaging data and depth values from the depth map of the imaging data, the scene lighting mode vector (Roy, ¶0099, “a light vector map generation operation 1138”; ¶0100, “The light vector map generation operation 1138 generates a light vector map from the depth map 616 and the lighting conditions 1124.”).

As to claim 8, the combination of Roy and Xie discloses one or more non-transitory computer-readable media coupled to one or more processors and having instructions stored thereon which, when executed by the one or more processors, cause the one or more processors to perform operations comprising: obtaining imaging data; obtaining a depth map including a plurality of depth values for the imaging data and a scene lighting mode vector characterizing a scene lighting of the imaging data; generating, using the plurality of depth values, a plurality of edge emphasis signals by a depth edge filtering process; generating, using the imaging data and the plurality of depth values, a plurality of detail signals and a base signal by a joint three-dimensional (3D) spatial-depth-value filtering process; generating a plurality of depth-aware processed signals, from the plurality of edge emphasis signals, the plurality of detail signals, and the base signal, and using the scene lighting mode vector and the plurality of depth values, and a nonlinear emphasis amplitude modulation, wherein the plurality of depth-aware processed signals comprise depth-aware enhanced edge emphasis signals, depth-aware enhanced detail signals, and depth-aware enhanced image data; generating, from the plurality of depth-aware processed signals, depth-aware enhanced imaging data; and providing the depth-aware enhanced imaging data for display on a display device (See claim 1 for detailed analysis.).

As to claim 9, claim 8 is incorporated and the combination of Roy and Xie discloses generating, from the imaging data, the depth map of the imaging data; and determining, using the depth map, the plurality of depth values (See claim 2 for detailed analysis.).

As to claim 12, claim 8 is incorporated and the combination of Roy and Xie discloses obtaining the scene lighting mode vector comprises: generating, from the imaging data and depth values from the depth map of the imaging data, the scene lighting mode vector (See claim 5 for detailed analysis.).
As to claim 15, the combination of Roy and Xie discloses a system, comprising: one or more processors; and a computer-readable media device coupled to the one or more processors and having instructions stored thereon which, when executed by the one or more processors, cause the one or more processors to perform operations comprising: obtaining imaging data; obtaining a depth map including a plurality of depth values for the imaging data and a scene lighting mode vector characterizing a scene lighting of the imaging data; generating, using the plurality of depth values, a plurality of edge emphasis signals by a depth edge filtering process; generating, using the imaging data and the plurality of depth values, a plurality of detail signals and a base signal by a joint three-dimensional (3D) spatial-depth-value filtering process; generating a plurality of depth-aware processed signals, from the plurality of edge emphasis signals, the plurality of detail signals, and the base signal, and using the scene lighting mode vector and the plurality of depth values, and a nonlinear emphasis amplitude modulation, wherein the plurality of depth-aware processed signals comprise depth-aware enhanced edge emphasis signals, depth-aware enhanced detail signals, and depth-aware enhanced imaging data; generating, from the plurality of depth-aware processed signals, the depth-aware enhanced imaging data; and providing the depth-aware enhanced imaging data for display on a display device (See claim 1 for detailed analysis.).

As to claim 16, claim 15 is incorporated and the combination of Roy and Xie discloses generating, from the imaging data, the depth map of the imaging data; and determining, using the depth map, the plurality of depth values (See claim 2 for detailed analysis.).

As to claim 18, claim 15 is incorporated and the combination of Roy and Xie discloses obtaining the scene lighting mode vector comprises: generating, from the imaging data and depth values from the depth map of the imaging data, the scene lighting mode vector (See claim 5 for detailed analysis.).

Claims 3-4, 10-11, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Roy et al. (US Pub 2022/0020128 A1) in view of Xie, Haoran, et al. ("Joint weighted least squares for normal decomposition of 3D measurement surface." Measurement Science and Technology 31.4 (2020): 045401) and Toksvig et al. (US Pub 2019/0037137 A1).

As to claim 3, claim 1 is incorporated and the combination of Roy and Xie does not disclose obtaining, coordinate data defining positions of one or more of i) a body ii) a head iii) a face and iv) eye(s) of a dominant viewer of a display of a user device by a camera.

Toksvig discloses obtaining, coordinate data defining positions of one or more of i) a body ii) a head iii) a face and iv) eye(s) of a dominant viewer of a display of a user device by a camera (Toksvig, ¶0022, “Face tracking can continuously (or at defined intervals) return a positional coordinate of an eye of the user (or other relevant point of interest) relative to the camera 130 (hereinafter, a “camera coordinate position”). For example, in FIG. 1A, the camera coordinate position 150 representing the user's viewpoint 170 can be a camera coordinate position generated using face tracking. In some implementations, face tracking returns a camera coordinate position including the XYZ coordinates (i.e.
3D coordinates) of the location of an eye (or eyes) of a user relative to the position of the camera 130, but other implementations may only return 2D coordinates (such as the coordinates of the user's location in the current image output of the camera 130). In some embodiments, camera coordinate positions determined through face tracking include an estimated depth of the viewpoint, for example, determined based on the size of the user's face in the camera output, based on other depth calibration or detection of the camera coordinate positioning, or based on any other suitable factor.”).

Roy, Xie, and Toksvig are considered to be analogous art because all pertain to image generation. It would have been obvious before the effective filing date of the claimed invention to have modified Roy with the features of “obtaining, coordinate data defining positions of one or more of i) a body ii) a head iii) a face and iv) eye(s) of a dominant viewer of a display of a user device by a camera” as taught by Toksvig. The suggestion/motivation would have been in order to use the spatial relationship determined through calibration to transform coordinates relative to the camera into coordinates relative to the screen (Toksvig, ¶0021).

As to claim 4, claim 3 is incorporated and the combination of Roy, Xie, and Toksvig discloses generating, from the plurality of depth-aware processed signals, depth-aware enhanced imaging data further comprises: generating, based on the coordinate data, a spatial modulation of the depth-aware enhanced imaging data, wherein the spatial modulation of the depth-aware enhanced imaging data specifies modification of one or more of shadow, shading, and halo of the depth-aware enhanced imaging data (Roy, ¶0051, “the multi-frame image 200 represents an image captured with a standard exposure time. This particular image 200 is an image of a scene outside an office building. The scene has plants in a rock garden next to a sidewalk. The sidewalk extends from the foreground to the background of the image 200, and a shadow falls across the sidewalk in the foreground of the image.”; ¶0058, “The depth-aware multi-frame relit image 302 uses depths within the scene, along with multiple image frames captured using different exposure times. In this example, the addition of light is only added to the foreground while keeping the overall well-exposedness of the scene. This approach allows for bright areas of the foreground to avoid saturation. In this particular example, this allows the shadow across the sidewalk in the foreground to remain visible. The following describes various techniques for performing multi-frame depth-based multi-camera relighting of images.”).

As to claim 10, claim 8 is incorporated and the combination of Roy, Xie, and Toksvig discloses obtaining, coordinate data defining positions of one or more of i) a body ii) a head iii) a face and iv) eye(s) of a dominant viewer of a display of a user device by a camera (See claim 3 for detailed analysis.).

As to claim 11, claim 8 is incorporated and the combination of Roy, Xie, and Toksvig discloses generating, from the plurality of depth-aware processed signals, depth-aware enhanced imaging data further comprises: generating, based on the coordinate data, a spatial modulation of the depth-aware enhanced imaging data, wherein the spatial modulation of the depth-aware enhanced imaging data specifies modification of one or more of shadow, shading, and halo of the depth-aware enhanced imaging data (See claim 4 for detailed analysis.).
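The Toksvig mapping combines two steps: a calibrated camera-to-screen coordinate transform (the motivation the examiner quotes from ¶0021) and a viewer-dependent spatial modulation (claims 4/11/17). A hedged sketch of both; the rigid transform is standard, while the falloff function is purely our invention for illustration, since the application's actual shadow/shading/halo rule is not reproduced in this record:

```python
import numpy as np

def camera_to_screen(eye_cam_xyz: np.ndarray, R: np.ndarray,
                     t: np.ndarray) -> np.ndarray:
    """Map an eye position from camera coordinates to screen coordinates
    with a calibrated rigid transform. R (3x3 rotation) and t (3-vector)
    are assumed to come from a one-time camera-to-screen calibration."""
    return R @ np.asarray(eye_cam_xyz, dtype=np.float64) + t

def viewer_modulation(pixel_depth: np.ndarray, eye_screen_xyz: np.ndarray,
                      strength: float = 0.5) -> np.ndarray:
    """Hypothetical spatial modulation weight driven by the viewer's
    position: emphasis decays for scene points far from the viewer's
    depth. The exponential falloff is an assumption, not the
    application's formula."""
    return 1.0 + strength * np.exp(-np.abs(pixel_depth - eye_screen_xyz[2]))
```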
As to claim 17, claim 15 is incorporated and the combination of Roy, Xie, and Toksvig discloses obtaining, coordinate data defining positions of one or more of i) a body ii) a head iii) a face and iv) eye(s) of a dominant viewer of a display of a user device by a camera, wherein generating, from the plurality of depth-aware processed signals, depth-aware enhanced imaging data further comprises: generating, based on the coordinate data, a spatial modulation of the depth-aware enhanced imaging data, wherein the spatial modulation of the depth-aware enhanced imaging data specifies modification of one or more of shadow, shading, and halo of the depth-aware enhanced imaging data (See claims 3-4 for detailed analysis.).

Allowable Subject Matter

Claims 6-7, 13-14, 19-20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

6. The method of claim 1, further comprising: converting pixel values of the imaging data to a perceptual color space prior to generating the plurality of edge emphasis signals, the plurality of detail signals, and the base signal; and converting pixel values of the depth-aware enhanced imaging data to a display color space prior to providing the depth-aware enhanced imaging data for display.

7. The method of claim 1, wherein generating the plurality of depth-aware processed signals comprising the depth-aware enhanced edge emphasis signals and depth-aware enhanced detail signals comprises: applying, to the plurality of edge emphasis signals and utilizing emphasis gain values obtained from the scene lighting mode vector, a nonlinear emphasis amplitude modulation to generate the depth-aware enhanced edge emphasis signals; and applying, to the plurality of detail signals and utilizing detail gain values obtained from the scene lighting mode vector, a nonlinear detail amplitude modulation to generate the depth-aware enhanced detail signals.

13. The computer-readable media of claim 8, further comprising: converting pixel values of the imaging data to a perceptual color space prior to generating the plurality of edge emphasis signals, the plurality of detail signals, and the base signal; and converting pixel values of the depth-aware enhanced imaging data to a display color space prior to providing the depth-aware enhanced imaging data for display.

14. The computer-readable media of claim 8, wherein generating the plurality of depth-aware processed signals comprising the depth-aware enhanced edge emphasis signals and depth-aware enhanced detail signals comprises: applying, to the plurality of edge emphasis signals and utilizing emphasis gain values obtained from the scene lighting mode vector, a nonlinear emphasis amplitude modulation to generate the depth-aware enhanced edge emphasis signals; and applying, to the plurality of detail signals and utilizing detail gain values obtained from the scene lighting mode vector, a nonlinear detail amplitude modulation to generate the depth-aware enhanced detail signals.

19. The system of claim 15, further comprising: converting pixel values of the imaging data to a perceptual color space prior to generating the plurality of edge emphasis signals, the plurality of detail signals, and the base signal; and converting pixel values of the depth-aware enhanced imaging data to a display color space prior to providing the depth-aware enhanced imaging data for display.
20. The system of claim 15, wherein generating the plurality of depth-aware processed signals comprising the depth-aware enhanced edge emphasis signals and depth-aware enhanced detail signals comprises: applying, to the plurality of edge emphasis signals and utilizing emphasis gain values obtained from the scene lighting mode vector, a nonlinear emphasis amplitude modulation to generate the depth-aware enhanced edge emphasis signals; and applying, to the plurality of detail signals and utilizing detail gain values obtained from the scene lighting mode vector, a nonlinear detail amplitude modulation to generate the depth-aware enhanced detail signals.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Error! Unknown document property name. whose telephone number is Error! Unknown document property name.. The examiner can normally be reached on Error! Unknown document property name.. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Error! Unknown document property name. can be reached on Error! Unknown document property name.. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/YU CHEN/
Primary Examiner, Art Unit 2613
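The allowable claims 7, 14, and 20 all turn on applying nonlinear amplitude modulations with gain values drawn from the scene lighting mode vector. The record does not disclose the actual modulation functions; one plausible compressive form, sketched under that stated assumption (the power-law shape, gamma, and the dictionary carrying the gains are all ours):

```python
import numpy as np

def nonlinear_modulation(signal: np.ndarray, gain: float,
                         gamma: float = 0.8) -> np.ndarray:
    """One plausible reading of allowable claims 7/14/20: scale a signal
    by a gain value from the scene lighting mode vector through a
    compressive power-law nonlinearity. The form is an assumption; the
    record does not disclose the actual function."""
    return gain * np.sign(signal) * np.abs(signal) ** gamma

# Hypothetical scene lighting mode vector carrying per-signal gains.
lighting_mode = {"emphasis_gain": 1.4, "detail_gain": 1.1}

rng = np.random.default_rng(0)
edge_emphasis, detail = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))

enhanced_edges = nonlinear_modulation(edge_emphasis, lighting_mode["emphasis_gain"])
enhanced_detail = nonlinear_modulation(detail, lighting_mode["detail_gain"])
```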

Prosecution Timeline

May 08, 2024 — Application Filed
Oct 28, 2025 — Non-Final Rejection (§103)
Jan 30, 2026 — Response Filed
Mar 03, 2026 — Final Rejection (§103, current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12604497 — THIN FILM TRANSISTOR AND ARRAY SUBSTRATE (2y 5m to grant; granted Apr 14, 2026)
Patent 12597176 — IMAGE GENERATOR AND METHOD OF IMAGE GENERATION (2y 5m to grant; granted Apr 07, 2026)
Patent 12589481 — TOOL ATTRIBUTE MANAGEMENT IN AUTOMATED TOOL CONTROL SYSTEMS (2y 5m to grant; granted Mar 31, 2026)
Patent 12588347 — DISPLAY DEVICE (2y 5m to grant; granted Mar 24, 2026)
Patent 12586265 — LINE DRAWING METHOD, LINE DRAWING APPARATUS, ELECTRONIC DEVICE, AND COMPUTER READABLE STORAGE MEDIUM (2y 5m to grant; granted Mar 24, 2026)
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 68%
With Interview: 98% (+29.9%)
Median Time to Grant: 2y 10m
PTA Risk: Moderate

Based on 1052 resolved cases by this examiner. Grant probability is derived from the career allow rate.
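The projections are simple functions of the career counts shown above. A quick check, assuming (as the footnote suggests) that the grant probability is the career allow rate and the interview lift is additive in percentage points:

```python
# Sanity-check the headline projections against the career counts.
granted, resolved = 711, 1052
career_allow = granted / resolved            # 0.6758... -> shown as 68%
with_interview = career_allow + 0.299        # +29.9 pp interview lift
print(f"{career_allow:.1%} career allow rate")  # 67.6%
print(f"{with_interview:.1%} with interview")   # 97.5% -> shown as 98%
```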
