Last updated: May 29, 2026

Application No. 18/497,940

TECHNIQUES FOR RECONSTRUCTING DIFFERENT THREE-DIMENSIONAL SCENES USING THE SAME TRAINED MACHINE LEARNING MODEL

Non-Final OA §103

Filed

Oct 30, 2023

Priority

Nov 15, 2022 — provisional 63/383,880

Examiner

NGUYEN, ANH TUAN V

Art Unit

2619

Tech Center

2600 — Communications

Assignee

Nvidia Corporation

OA Round

2 (Non-Final)

Interview Optional

Interview already conducted in this application's prosecution history. This examiner has a 72% grant rate with a +19.8% interview lift. Since an interview has already been tried, recommend a written response with narrowed claims based on precedent claim evolution patterns.

Based on 490 resolved cases, 2023–2026

Examiner Intelligence

NGUYEN, ANH TUAN V

Grants 72% — above average

Career Allowance Rate

354 granted / 490 resolved

+10.2% vs TC avg

Strong +20% interview lift

+19.8%

Interview Lift (resolved cases without vs. with an interview)

Typical timeline

2y 11m

Avg Prosecution

30 currently pending

Career history

529

Total Applications

across all art units

Statute-Specific Performance

§101

1.0%

-39.0% vs TC avg

§103

91.6%

+51.6% vs TC avg

§102

0.7%

-39.3% vs TC avg

§112

4.9%

-35.1% vs TC avg

Based on career data from 490 resolved cases

Office Action

§103

DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Applicant’s amendment/response filed 10/31/2025 has been entered and made of record. Claims 1, 11, and 20 were amended. Claims 2 and 12 were canceled. Claims 1, 3-11, and 13-20 are pending in the application.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim(s) 1, 3-4, 7, 11, 13, and 19-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Borisov (US 2017/0064287) in view of Tang et al. (US 2023/0154051) and Steedly et al. (US 2010/0238164).
Regarding claim 1, Borisov teaches/suggests: A computer-implemented method for generating three-dimensional (3D) representations of scenes (Borisov [0005] “The output of the algorithm is a 3D model of a scene consisting of a mesh and a texture”), the method comprising: 
mapping a first red, blue, green, and depth (RGBD) image associated with both a first scene and a first viewpoint to a first surface representation of at least a first portion of the first scene (Borisov [0007] “generated a seamless texture map for a surface from images from multiple frames, corresponding to different camera positions in space” [0005] “The source data for a scene reconstruction algorithm is a set of pairs of RGB and depth images” [The texture corresponding to a first camera position meets the first surface representation.]); 
mapping a second RGBD image associated with both the first scene and a second viewpoint to a second surface representation of at least a second portion of the first scene (Borisov [0007] “generated a seamless texture map for a surface from images from multiple frames, corresponding to different camera positions in space” [0005] “The source data for a scene reconstruction algorithm is a set of pairs of RGB and depth images” [The texture corresponding to a second camera position meets the second surface representation.]); 
aggregating at least the first surface representation and the second surface representation in a 3D space to generate a first fused surface representation of the first scene (Borisov [0024] “Blend several RGB images to create a seamless texture”); 
Borisov is silent regarding:
mapping the first fused surface representation of the first scene to a 3D representation of the first scene.
Tang, however, teaches/suggests mapping (Tang [0017] “the texture maps associated with these volumetric representations can necessitate the streaming of coordinates that correlate the texture maps to spatial positions on the mesh representations of the volumes”). Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the mesh and the texture of Borisov to be mapped as taught/suggested by Tang to generate the 3D model. As such, Borisov as modified by Tang teaches/suggests:
mapping the first fused surface representation of the first scene to a 3D representation of the first scene (Borisov [0005] “The output of the algorithm is a 3D model of a scene consisting of a mesh and a texture” [0024] “Blend several RGB images to create a seamless texture” Tang [0017] “the texture maps associated with these volumetric representations can necessitate the streaming of coordinates that correlate the texture maps to spatial positions on the mesh representations of the volumes”).

Borisov and Tang are silent regarding wherein at least part of the first portion of the first scene overlaps with at least part of the second portion of the first scene. Steedly, however, teaches/suggests at least part of the first portion of the first scene overlaps with at least part of the second portion of the first scene (Steedly [0084] “a geometric proxy generation module 520 operates to automatically generate a 3D model of the scene using any of a number of conventional techniques. For example, capturing overlapping images of a scene from two or more slightly different perspectives allows conventional stereo imaging techniques to be used”). Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the images of Borisov as modified by Tang to be overlapped as taught/suggested by Steedly for stereo imaging.

Regarding claim 3, Borisov as modified by Tang and Steedly teaches/suggests: The computer-implemented method of claim 1, wherein mapping the first RGBD image comprises: 
determining a first plurality of input vectors based on the first viewpoint and a first depth image included in the first RGBD image (Borisov [0005] “The source data for a scene reconstruction algorithm is a set of pairs of RGB and depth images” Tang [0049] “instances of machine-learned encoding and decoding models 120 can be used to encode and decode the geometry (e.g., voxels containing truncated signed distance fields, etc.) of a three-dimensional volumetric representation” [The signed distance fields meet the input vectors.]); and 
executing a trained geometry encoder on the first plurality of input vectors to generate a first geometric surface representation of the at least first portion of the first scene (Tang [0049] “instances of machine-learned encoding and decoding models 120 can be used to encode and decode the geometry (e.g., voxels containing truncated signed distance fields, etc.) of a three-dimensional volumetric representation”).
The same rationale to combine as set forth in the rejection of claim 1 above is incorporated herein.

Regarding claim 4, Borisov as modified by Tang and Steedly teaches/suggests:
determining a second plurality of input vectors based on a third viewpoint and a third RGBD image that is associated with both a second scene and the third viewpoint (Borisov [0005] “The source data for a scene reconstruction algorithm is a set of pairs of RGB and depth images” [Reconstructing another scene meets the second scene.] Tang [0049] “instances of machine-learned encoding and decoding models 120 can be used to encode and decode the geometry (e.g., voxels containing truncated signed distance fields, etc.) of a three-dimensional volumetric representation” [The signed distance fields meet the input vectors.]); and 
executing a trained geometry encoder on the second plurality of input vectors to generate a second geometric surface representation of at least a portion of the second scene (Tang [0049] “instances of machine-learned encoding and decoding models 120 can be used to encode and decode the geometry (e.g., voxels containing truncated signed distance fields, etc.) of a three-dimensional volumetric representation”).
The same rationale to combine as set forth in the rejection of claim 1 above is incorporated herein.

Regarding claim 7, Borisov as modified by Tang and Steedly teaches/suggests: The computer-implemented method of claim 1, wherein the first surface representation comprises a geometric surface representation of the at least first portion of the first scene and a texture surface representation of the at least first portion of the first scene (Borisov [0005] “The output of the algorithm is a 3D model of a scene consisting of a mesh and a texture”). The mesh meets the geometric surface representation.

Claims 11 and 13 recite limitation(s) similar in scope to those of claims 1 and 3, respectively, and are rejected for the same reason(s). Borisov as modified by Tang and Steedly further teaches/suggests one or more non-transitory computer readable media including instructions (Borisov [0005] “3d scene reconstruction from RGBD-camera images (especially using IOS Ipad device with attached Structure Sensor)”).

Regarding claim 19, Borisov as modified by Tang and Steedly teaches/suggests: The one or more non-transitory computer readable media of claim 11, wherein the first viewpoint is specified by at least one of a rotation matrix, a 3D translation, or an intrinsic matrix associated with a camera (Borisov [0008] “Intrinsic parameters: parameters of RGB camera that define how 3D point map to pixels in an image generated by the camera”).
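As an illustration of how the three viewpoint parameters named in claim 19 (rotation matrix, 3D translation, intrinsic matrix) relate a 3D point to a pixel, the following is a minimal sketch of a standard pinhole projection. It is not taken from Borisov or from the application; the function name `project_point`, the zero-skew assumption, and the example camera values are hypothetical.

```python
def project_point(point_world, rotation, translation, intrinsics):
    """Project a 3D world point to pixel coordinates (u, v).

    rotation    -- 3x3 world-to-camera rotation matrix (nested lists)
    translation -- 3-element world-to-camera translation
    intrinsics  -- 3x3 matrix [[fx, 0, cx], [0, fy, cy], [0, 0, 1]];
                   zero skew is assumed for this sketch
    """
    # Extrinsics: move the point into the camera frame, p_cam = R * p + t.
    cam = [
        sum(rotation[r][c] * point_world[c] for c in range(3)) + translation[r]
        for r in range(3)
    ]
    if cam[2] <= 0:
        raise ValueError("point is behind the camera")

    # Intrinsics: perspective divide, then scale by focal length and
    # shift by the principal point.
    fx, fy = intrinsics[0][0], intrinsics[1][1]
    cx, cy = intrinsics[0][2], intrinsics[1][2]
    u = fx * cam[0] / cam[2] + cx
    v = fy * cam[1] / cam[2] + cy
    return u, v


# Hypothetical camera: identity pose, 500 px focal length, 640x480 image.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
u, v = project_point([0.2, -0.1, 2.0], identity, [0.0, 0.0, 0.0], K)
```

Any one of the three parameters changes where the same 3D point lands in the image, which is why each can serve to specify a viewpoint.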

Claim 20 recites limitation(s) similar in scope to those of claim 1, and is rejected for the same reason(s). Borisov as modified by Tang and Steedly further teaches/suggests one or more memories storing instructions; and one or more processors coupled to the one or more memories (Borisov [0005] “3d scene reconstruction from RGBD-camera images (especially using IOS Ipad device with attached Structure Sensor)”).

Claim(s) 5-6 and 14-16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Borisov (US 2017/0064287) in view of Tang et al. (US 2023/0154051) and Steedly et al. (US 2010/0238164) as applied to claims 1 and 11 above, and further in view of Wang et al. (US 2023/0071559).
Regarding claim 5, Borisov as modified by Tang and Steedly teaches/suggests a first red, green, and blue (RGB) image included in the first RGBD image (Borisov [0005] “The source data for a scene reconstruction algorithm is a set of pairs of RGB and depth images”); and a trained texture encoder (Tang [0046] “the texture encoding operations can correspond to the operation and/or outputs of the instances of the machine-learned encoding and decoding models 120”). The same rationale to combine as set forth in the rejection of claim 1 above is incorporated herein.

Borisov, Tang, and Steedly are silent regarding: The computer-implemented method of claim 1, wherein mapping the first RGBD image comprises executing a trained texture encoder on a first red, green, and blue (RGB) image to generate a first plurality of texture feature vectors associated with a first plurality of pixels included in the first RGB image. Wang, however, teaches/suggests executing a trained texture encoder on a first red, green, and blue (RGB) image to generate a first plurality of texture feature vectors associated with a first plurality of pixels included in the first RGB image (Wang [0067] “assign to each 3D point p_i a feature vector f_i that encodes the object's appearance, its alpha matte, and its contextual information”). Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the texture encoding of Borisov as modified by Tang and Steedly to include the feature vectors of Wang to generate the texture.

Regarding claim 6, Borisov as modified by Tang, Steedly, and Wang teaches/suggests: The computer-implemented method of claim 5, further comprising projecting the first plurality of texture feature vectors onto a first plurality of 3D surface points to generate a first texture surface representation of the at least first portion of the first scene (Wang [0067] “When rendering a novel target viewpoint V, all points and thus their features are projected onto a view-dependent feature map M_q”). The same rationale to combine as set forth in the rejection of claim 5 above is incorporated herein.

Claim 14 recites limitation(s) similar in scope to those of claim 5, and is rejected for the same reason(s).

Regarding claim 15, Borisov as modified by Tang, Steedly, and Wang teaches/suggests: The one or more non-transitory computer readable media of claim 14, further comprising: 
executing the trained texture encoder on a second RGB image associated with a second scene to generate a second plurality of texture feature vectors associated with a second plurality of pixels included in the second RGB image (Borisov [0005] “The source data for a scene reconstruction algorithm is a set of pairs of RGB and depth images” [Reconstructing another scene meets the second scene.] Tang [0046] “the texture encoding operations can correspond to the operation and/or outputs of the instances of the machine-learned encoding and decoding models 120” Wang [0067] “assign to each 3D point p_i a feature vector f_i that encodes the object's appearance, its alpha matte, and its contextual information”); and 
projecting the second plurality of texture feature vectors onto a second plurality of 3D surface points to generate a second texture surface representation of at least a portion of the second scene (Wang [0067] “When rendering a novel target viewpoint V, all points and thus their features are projected onto a view-dependent feature map M_q”).
The same rationales to combine as set forth in the rejection of claims 1 and 5 above are incorporated herein.

Regarding claim 16, Borisov as modified by Tang and Steedly teaches/suggests: The one or more non-transitory computer readable media of claim 11, wherein the second surface representation comprises a plurality of 3D surface points that are associated with a plurality of geometry feature vectors (Tang [0049] “instances of machine-learned encoding and decoding models 120 can be used to encode and decode the geometry (e.g., voxels containing truncated signed distance fields, etc.) of a three-dimensional volumetric representation”). The same rationale to combine as set forth in the rejection of claim 1 above is incorporated herein.

Borisov, Tang, and Steedly are silent regarding a plurality of texture feature vectors. Wang, however, teaches/suggests a plurality of texture feature vectors (Wang [0067] “assign to each 3D point p_i a feature vector f_i that encodes the object's appearance, its alpha matte, and its contextual information”). The same rationale to combine as set forth in the rejection of claim 5 above is incorporated herein.

Claim(s) 8 and 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Borisov (US 2017/0064287) in view of Tang et al. (US 2023/0154051) and Steedly et al. (US 2010/0238164) as applied to claims 1 and 11 above, and further in view of Shen et al. (US 2022/0392162).
Regarding claim 8, Borisov as modified by Tang and Steedly teaches/suggests a plurality of geometry input vectors.
executing a trained geometry decoder on the plurality of geometry input vectors to generate a plurality of signed distance function values (Tang [0049] “instances of machine-learned encoding and decoding models 120 can be used to encode and decode the geometry (e.g., voxels containing truncated signed distance fields, etc.) of a three-dimensional volumetric representation” [The signed distance fields meet the plurality of geometry input vectors.]).
The same rationale to combine as set forth in the rejection of claim 1 above is incorporated herein.

Borisov, Tang, and Steedly are silent regarding: The computer-implemented method of claim 1, wherein mapping the first fused surface representation comprises: 
performing one or more interpolation operations on the first fused surface representation to generate a plurality of geometry input vectors; 
Shen, however, teaches/suggests:
performing one or more interpolation operations on the first fused surface representation to generate a plurality of geometry input vectors (Shen [0026] “The machine learning model(s) may then be used to generate a feature vector F_vol(v, x) for a grid vertex v ∈ ℝ^3 via trilinear interpolation. The initial prediction of the SDF value for each vertex in the initial deformable tetrahedral grid may be generated using, e.g., a fully connected network s(v) = MLP(F_vol(v, x), v)”); 
Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to modify the geometry encoding of Borisov as modified by Tang and Steedly to include the interpolation of Shen to generate the mesh.
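For readers weighing the Shen citation, the following is a minimal sketch of trilinear interpolation of feature vectors stored on a regular 3D grid, the operation named in the quoted passage. It is a generic textbook formulation, not Shen's implementation or the applicant's; the function name `trilinear_interpolate`, the nested-list grid layout, and the example data are hypothetical.

```python
from math import floor


def trilinear_interpolate(grid, point):
    """Interpolate a feature vector at a continuous (x, y, z) grid coordinate.

    grid[i][j][k] is the feature vector (list of floats) stored at integer
    vertex (i, j, k). Every axis must have at least two vertices.
    """
    sizes = (len(grid), len(grid[0]), len(grid[0][0]))
    base, frac = [], []
    for coord, size in zip(point, sizes):
        c = min(max(float(coord), 0.0), size - 1.0)  # clamp into the grid
        i = min(int(floor(c)), size - 2)             # lower corner of the cell
        base.append(i)
        frac.append(c - i)                           # offset in [0, 1]

    channels = len(grid[0][0][0])
    out = [0.0] * channels
    # Blend the 8 cell corners; each weight is the product of the per-axis
    # linear weights, and the 8 weights sum to 1.
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((frac[0] if dx else 1.0 - frac[0])
                     * (frac[1] if dy else 1.0 - frac[1])
                     * (frac[2] if dz else 1.0 - frac[2]))
                corner = grid[base[0] + dx][base[1] + dy][base[2] + dz]
                for ch in range(channels):
                    out[ch] += w * corner[ch]
    return out


# Hypothetical 2x2x2 grid with one channel holding f(x, y, z) = x + 2y + 3z.
grid = [[[[float(x + 2 * y + 3 * z)] for z in range(2)]
         for y in range(2)] for x in range(2)]
value = trilinear_interpolate(grid, (0.5, 0.25, 0.75))
```

In the quoted Shen passage the interpolated feature vector is then passed to a fully connected network to predict an SDF value; that step is outside this sketch.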

Claim 17 recites limitation(s) similar in scope to those of claim 8, and is rejected for the same reason(s).
Allowable Subject Matter
Claims 9-10 and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter: The steps of “generating” and “executing,” taken as a whole, render the claims patentably distinct over the prior art.
Response to Arguments
Applicant's arguments filed 10/31/2025 have been fully considered but they are moot in view of the new ground(s) of rejection set forth in this Office action.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
US 2010/0201682 – overlapping images
US 2023/0147722 – overlapping images
US 2023/0224576 – overlapping images
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
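The reply window described above can be checked with simple date arithmetic. The sketch below is illustrative only: it takes the Dec 30, 2025 mailing date from the Prosecution Timeline on this page, uses a hypothetical helper `add_months`, and ignores the advisory-action provision, extension fees, and any weekend or holiday adjustment. The dates should be confirmed against the official record before being relied on.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Return the same day-of-month `months` later, clamped to month end."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


# Assumption: mailing date as listed in the Prosecution Timeline on this page.
mailing_date = date(2025, 12, 30)

shortened_period_end = add_months(mailing_date, 3)  # THREE-MONTH period
two_month_mark = add_months(mailing_date, 2)        # first-reply threshold
statutory_maximum = add_months(mailing_date, 6)     # SIX-MONTH outer limit

print("Two-month mark:      ", two_month_mark.isoformat())
print("Shortened period end:", shortened_period_end.isoformat())
print("Six-month limit:     ", statutory_maximum.isoformat())
```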
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANH-TUAN V NGUYEN whose telephone number is 571-270-7513. The examiner can normally be reached on M-F 9AM-5PM ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, JASON CHAN can be reached on 571-272-3022. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ANH-TUAN V NGUYEN/
Primary Examiner, Art Unit 2619

Prosecution Timeline

Oct 30, 2023

Application Filed

Aug 22, 2025

Non-Final Rejection mailed — §103

Oct 31, 2025

Response Filed

Nov 13, 2025

Examiner Interview Summary

Nov 13, 2025

Applicant Interview (Telephonic)

Dec 30, 2025

Final Rejection mailed — §103

Feb 26, 2026

Response after Non-Final Action

Precedent Cases

Applications granted by this same examiner with similar technology

18/527,769

Patent 12626456

ELECTRONIC DEVICE FOR DISPLAYING VIRTUAL OBJECT AND OPERATION METHOD THEREOF

2y 5m to grant Granted May 12, 2026

18/188,600

Patent 12614358

AUGMENTED REALITY ENVIRONMENT MELDING

3y 1m to grant Granted Apr 28, 2026

17/988,499

Patent 12608856

SYSTEM FOR AND METHOD OF GRAPHICALLY REPRESENTING INFORMATION

3y 5m to grant Granted Apr 21, 2026

18/156,185

Patent 12591359

ELECTRONIC DEVICE COMPRISING DISPLAY THAT OPTIMALLY DISPLAY CONTENT WITH RESPECT TO CAMERA HOLE, AND METHOD FOR CONTROLLING DISPLAY THEREOF

3y 2m to grant Granted Mar 31, 2026

18/633,935

Patent 12592033

METHOD AND APPARATUS FOR DETECTING PICKED OBJECT, COMPUTER DEVICE, READABLE STORAGE MEDIUM, AND COMPUTER PROGRAM PRODUCT

1y 11m to grant Granted Mar 31, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

2-3

Expected OA Rounds

72%

Grant Probability

92%

With Interview (+19.8%)

2y 11m (~4m remaining)

Median Time to Grant

Moderate

PTA Risk

Based on 490 resolved cases by this examiner. Grant probability derived from career allowance rate.
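The headline figures can be reproduced from the counts shown on this page. The sketch below recomputes the career allowance rate from 354 grants out of 490 resolved cases. The page does not state how the with-interview figure is derived; treating the +19.8% lift as percentage points added to the allowance rate is an assumption that happens to reproduce the displayed 92%.

```python
granted = 354
resolved = 490
interview_lift_points = 19.8  # assumed to be additive percentage points

# Career allowance rate, shown on the page as "72%".
allowance_rate_pct = 100.0 * granted / resolved

# Assumed derivation of the "With Interview" projection.
with_interview_pct = allowance_rate_pct + interview_lift_points

print(f"Allowance rate: {allowance_rate_pct:.1f}%")
print(f"With interview: {with_interview_pct:.1f}%")
```

Because an interview has already been held in this application, the with-interview projection may not apply to the remaining prosecution.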