Last updated: May 29, 2026

Application No. 18/646,503

THREE-DIMENSIONAL RECONSTRUCTIONS BASED ON GAUSSIAN PRIMITIVES

Final Rejection §103

Filed

Apr 25, 2024

Examiner

KALHORI, DAN F

Art Unit

2618

Tech Center

2600 — Communications

Assignee

Adobe Inc.

OA Round

2 (Final)

Interview Optional

Interview lift: +0.0%. An interview has already been conducted in this application's prosecution history. This examiner has a 100% grant rate with +0.0% interview lift. Because an interview has already been tried, a written response with narrowed claims, based on precedent claim evolution patterns, is recommended.

Based on 3 resolved cases, 2023–2026

Examiner Intelligence

KALHORI, DAN F

Grants 100% — above average

Career Allowance Rate

3 granted / 3 resolved

+38.0% vs TC avg

Minimal +0% lift

+0.0%

Interview Lift

resolved cases with interview

Typical timeline

2y 4m

Avg Prosecution

13 currently pending

Career history

Total Applications

across all art units

Statute-Specific Performance

§103

100.0%

+60.0% vs TC avg

Based on career data from 3 resolved cases

Office Action

§103

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s remaining arguments regarding the prior § 103 rejection are persuasive in view of the amendments, and the Examiner has introduced new grounds of rejection based on new references to address the amended limitations.
Regarding the arguments directed to independent claims 10 and 17, these claims have been amended in a manner analogous to claim 1, and, for the reasons discussed above, the prior § 103 rejections of claims 10 and 17 are not maintained and new grounds of rejection are set forth below.
Applicant’s arguments that the dependent claims are in condition for allowance for the reasons given for the corresponding independent claims are not persuasive because the independent claims are not allowed; therefore, the dependent claims remain rejected.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


	Claims 1, 3-5, 8-10, 12-14, and 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over Wang (Wang J, Zhang Q, Sun J, Cao J, Han G, Zhao W, Zhang W, Shao Y, Guo Y, Xu R. Reinforcement Learning with Generalizable Gaussian Splatting. arXiv preprint arXiv:2404.07950. 2024 Mar 18.) in view of Minnen (Minnen, D., Toderici, G., Covell, M., Chinen, T., Johnston, N., Shor, J., Hwang, S.J., Vincent, D. and Singh, S., 2017, September. Spatially adaptive image compression using a tiled deep network. In 2017 IEEE International Conference on Image Processing (ICIP) (pp. 2796-2800). IEEE.), and Mao (CN114627023B).


Regarding claim 1, Wang teaches a method comprising: receiving, by a processing device (pg. 5 A. Experimental Settings ¶1), a first digital image depicting an object from a first angle and a second digital image depicting the object from a second angle (pg. 3-4 A. Generalizable 3D GS representation, describes using “N images”, their corresponding camera parameters {Kn, Pn}, and “two source views” which would read on a first and second image from a first and second angle and pg. 5, Fig. 2 showing the images as being digital as well as depicting objects.) and data involving one or more rays that indicate one or more angles of capture of the first digital image or the second digital image (Wang; pg. 3 B. 3D Gaussian Splatting, describes the view transformation (W) representing rays projecting from the camera to the image plane. Camera parameters {Kn, Pn} encode camera position and viewing direction. This teaches receiving data involving rays that indicate angles of capture of the input images.);
generating, by the processing device using a machine learning model, three-dimensional Gaussian primitives that predict parameters of points of the object in a three-dimensional space that correspond on a per-pixel basis to pixels based on analyzing the one or more rays to determine spatial relationships of the pixels of the patches (pg. 4 A. Generalizable 3D GS representation, Gaussian Properties Prediction, describes using the Eϕ encoder with “UNet-like architecture” (machine learning model) and that each 3D Gaussian is parameterized by properties (G = {X, R, S, c, o}) (generating Gaussian primitives) to define the shape and appearance. A Gaussian regressor module predicts properties in a per-pixel manner (per-pixel basis), with the corresponding properties given by equations (8) and (10), which teaches predicting parameters of points in three-dimensional space. Equation (8) uses camera characteristics and pose to unproject each pixel into 3D coordinate space. This teaches analyzing ray data to determine spatial relationships of pixels of the patches.);
and forming, by the processing device, a three-dimensional reconstruction of the object for display in a user interface by merging the three-dimensional Gaussian primitives (pg. 3, A. Generalizable 3D GS representation, describes reconstructing (forming) a 3D Gaussian representation and being able to render novel images with arbitrary viewpoints (for display in a user interface) and pg. 4 Gaussian Refinement, after obtaining the 3D Gaussian properties (primitives) being able to render novel views by merging the Gaussian properties).
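The per-pixel unprojection that the rejection attributes to Wang's Equation (8) can be illustrated with a standard pinhole camera model. The following is a minimal sketch only, not Wang's code or exact formula: it assumes intrinsics given as focal lengths fx, fy and a principal point cx, cy, a camera-to-world rotation R and camera position t, and a known depth per pixel. All function and variable names are hypothetical.

```python
# Illustrative pinhole unprojection; not taken from Wang. All names are assumptions.

def unproject(u, v, depth, fx, fy, cx, cy, R, t):
    """Lift pixel (u, v) with the given depth to a 3D world point.

    R is a 3x3 camera-to-world rotation (list of rows); t is the camera
    position in world coordinates.
    """
    # Back-project through the inverse intrinsics to get a camera-space point.
    p_cam = ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)
    # Apply the camera-to-world transform: X = R @ p_cam + t.
    return tuple(
        sum(R[i][j] * p_cam[j] for j in range(3)) + t[i] for i in range(3)
    )

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# One 3D position per pixel of a tiny 2x2 depth map (rows indexed by v).
depth_map = [[2.0, 2.0], [4.0, 4.0]]
points = [
    unproject(u, v, depth_map[v][u], fx=1.0, fy=1.0, cx=0.5, cy=0.5,
              R=IDENTITY, t=(0.0, 0.0, 0.0))
    for v in range(2) for u in range(2)
]
```

Each pixel yields exactly one 3D position, which is the sense in which a per-pixel prediction places one Gaussian center per input pixel.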
	However, Wang fails to teach, but Minnen teaches segmenting, by the processing device (pg. 4 ¶2), the first digital image and the second digital image into patches (pg. 1 2. CODEC OVERVIEW, describes dividing (segmenting) images into tiles (patches) and pg. 2 Fig. 2, shows tiling into NxN patches). 
It would have been obvious to one of ordinary skill in the art, before the effective filing date, to modify the method as taught by Wang to tile the input images as taught by Minnen, because organizing images into tiles for neural network processing is a well-known technique for improving the efficiency of image processing. 
However, Wang in view of Minnen does not explicitly disclose the patches are one-dimensional sequences of data. 
Mao teaches converting two-dimensional feature maps into one-dimensional feature sequences by flattening (Mao; S104 ¶8, describes downsampling images through convolutional branches to obtain feature maps and then flattening the two-dimensional feature maps into one-dimensional feature sequences of various lengths, that is, converting two-dimensional image data into one-dimensional sequences for neural network processing). In the combination, this teaches segmenting images into patches represented as one-dimensional sequences of data. 
It would have been obvious to one of ordinary skill in the art, before the effective filing date, to modify the method as taught by Wang in view of Minnen with the one-dimensional sequence representation of Mao. The motivation for such a combination would have been to provide the benefit of improved feature extraction across the segmented image patches.
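The combination relied on above, tiling an image and representing each tile as a one-dimensional sequence, can be sketched as follows. This is an illustration under stated assumptions, not code from Minnen or Mao: Minnen tiles images for a compression codec and Mao flattens convolutional feature maps, whereas the sketch applies row-major flattening directly to raw pixel tiles. The function names are hypothetical.

```python
# Illustrative only: tiling in the spirit of Minnen, flattening in the spirit of Mao.

def split_into_patches(image, patch):
    """Split an H x W image (list of rows) into non-overlapping patch x patch tiles."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    tiles = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tiles.append([row[left:left + patch] for row in image[top:top + patch]])
    return tiles

def flatten(tile):
    """Flatten a 2D tile into a 1D sequence in row-major order."""
    return [value for row in tile for value in row]

# A 4x4 single-channel image whose pixel values are 0..15.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
sequences = [flatten(tile) for tile in split_into_patches(image, 2)]
```

With a 4x4 input and 2x2 tiles, the result is four sequences of four values each, one per tile.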

Regarding claim 10, Wang in view of Minnen and Mao teaches a system that performs the method of claim 1. Wang describes a system with processing components (pg. 5 A. Experimental Settings, the processing device implementing the machine learning models and the NVIDIA GPU) that performs the operations recited in claim 10 (pg. 3-4 A. Generalizable 3D GS representation, see claim 1). Claim 10 recites “depicting a scene” where Wang depicts scenes with objects (pg. 5 Fig. 2 depicting the robot arm scene). The system performs the method of claim 1.

Claim 17 has limitations similar to those of claim 1 and is therefore rejected under the same rationale as claim 1, except that claim 17 recites a “non-transitory computer-readable storage medium storing executable instructions”. Wang (pg. 5 A. Experimental Settings) describes computer-implemented processing that downsamples predictions to improve memory consumption/efficiency, on a system using an Nvidia A6000. 

Regarding claim 3, Wang in view of Minnen and Mao teaches the method of claim 1, wherein the machine learning model is trained on images depicting objects captured from multiple camera angles (Wang, pg. 4-5 V. Experiments, teaches training with image-depth pairs and their associated camera parameters (multiple camera angles) rendered by the robosuite simulator, and Fig. 2 shows the method and displays objects from 2 views).

Claims 12 and 18 have limitations similar to those of claim 3 and are therefore rejected under the same rationale as claim 3.

Regarding claim 4, Wang in view of Minnen and Mao teaches the method of claim 1, wherein the three-dimensional Gaussian primitives have color values corresponding to colors of the pixels of the patches (Wang pg. 4 Gaussian Properties Prediction, describes parameterizing each 3D Gaussian by properties (G = {X, R, S, c, o}) to define shape and appearance (see claim 1) and the pixel color c = Is.)
Claims 13 and 19 have limitations similar to those of claim 4 and are therefore rejected under the same rationale as claim 4.

Regarding claim 5, Wang in view of Minnen and Mao teaches the method of claim 1, wherein merging the three-dimensional Gaussian primitives further comprises positioning points of the three-dimensional Gaussian primitives in the three-dimensional space using coordinates associated with the three-dimensional Gaussian primitives (Wang pg. 3-4 A. Generalizable 3D GS representation, describes transforming the image into 3D coordinate space as part of the reconstruction process and (equation (8)) showing X from the Gaussian properties (G = {X, R, S, c, o} see claim 1) represents the coordinates of the Gaussian primitive used for position in 3D space.)

Claims 14 and 20 have limitations similar to those of claim 5 and are therefore rejected under the same rationale as claim 5.

Regarding claim 8, Examiner interprets “Plücker ray” as described in the specification: [¶0026] A Plücker ray, for instance, indicates a direction and a location of a camera ray from a camera used to capture the first digital image or the second digital image and is understood to refer to data specifying the direction and/or position of the camera for each image.
Wang in view of Minnen and Mao teaches the method of claim 1, further comprising receiving Plücker rays indicating angles of capture for the first digital image and the second digital image (Wang pg. 3-4 A. Generalizable 3D GS representation, describes receiving images with camera parameters {Kn, Pn} (see claim 1) where K and P define camera position and viewing direction (angles of capture), and (pg. 3, B. 3D Gaussian Splatting) the view transformation (W) represents rays projecting from the camera to the image plane or lines in 3D space that Plücker rays represent.) 
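For context on the term at issue, the conventional Plücker representation of a ray with origin o and unit direction d is the six-vector (d, o × d), which encodes both the direction and the location of the line. The sketch below uses that textbook definition; it is not drawn from the specification, from Wang, or from any other cited reference, and the function names are hypothetical.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def pluecker_ray(origin, direction):
    """Return the 6D Pluecker coordinates (d, o x d) of a camera ray."""
    norm = math.sqrt(sum(c * c for c in direction))
    d = tuple(c / norm for c in direction)  # unit direction of the ray
    m = cross(origin, d)                    # moment: fixes where the line sits in space
    return d + m

# A ray leaving a camera at (1, 0, 0) and looking along +z.
ray = pluecker_ray((1.0, 0.0, 0.0), (0.0, 0.0, 2.0))
# Sliding the origin along the ray leaves the coordinates unchanged.
same_ray = pluecker_ray((1.0, 0.0, 5.0), (0.0, 0.0, 1.0))
```

The moment term is what distinguishes a Plücker ray from a bare viewing direction: two parallel rays from different camera positions share d but differ in o × d.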


Regarding claim 9, Wang in view of Minnen and Mao teaches the method of claim 8, further comprising generating the three-dimensional Gaussian primitives (Wang pg. 4 A. Generalizable 3D GS representation, generating 3D Gaussians (G = {X, R, S, c, o}), see claim 1) by analyzing the Plücker rays to determine depicted depths of the pixels (Wang pg. 3-4 A. Generalizable 3D GS representation, describes predicting the absolute depth value for each pixel where the camera parameters (Plücker rays, see claim 8) define the geometry used for disparity prediction Dpred (Equation 7) which is then used to compute the 3D Gaussian positions X (Equation 8), teaching analyzing ray information to determine per-pixel depths for generating Gaussians) of the patches (Minnen; pg. 1-2 Fig. 2 and 2. CODEC OVERVIEW, see claim 1) using the machine learning model (Wang pg. 4 Gaussian Properties Prediction, describes the disparity prediction network, D (Equation 7) and Gaussian regressor module (see claim 1).) 

Claim 16 has limitations similar to those of claim 9 and is therefore rejected under the same rationale as claim 9.


Claims 2, 7, and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Wang (Wang J, Zhang Q, Sun J, Cao J, Han G, Zhao W, Zhang W, Shao Y, Guo Y, Xu R. Reinforcement Learning with Generalizable Gaussian Splatting. arXiv preprint arXiv:2404.07950. 2024 Mar 18.), Minnen (Minnen, D., Toderici, G., Covell, M., Chinen, T., Johnston, N., Shor, J., Hwang, S.J., Vincent, D. and Singh, S., 2017, September. Spatially adaptive image compression using a tiled deep network. In 2017 IEEE International Conference on Image Processing (ICIP) (pp. 2796-2800). IEEE.), Mao (CN114627023B), and Wang2 (Wang, D., Cui, X., Chen, X., Zou, Z., Shi, T., Salcudean, S., Wang, Z.J. and Ward, R., 2021. Multi-view 3d reconstruction with transformers. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 5722-5731).)

Regarding claim 2, as previously discussed in claim 1, Wang in view of Minnen and Mao teaches the method of claim 1, the machine learning model that generates the three-dimensional Gaussian primitives (Wang pg. 3-4 A. Generalizable 3D GS representation, see claim 1) by analyzing depicted depth and the spatial relationships of the pixels of the patches (pg. 3-4 A. Generalizable 3D GS representation, describes depth estimation (analyzing) for each pixel and transforming into a 3D space (spatial relationship)) and (patches as taught by Minnen in claim 1).
However, Wang in view of Minnen fails to teach, but Wang2 teaches wherein the machine learning model is a Transformer model (ABST, describes for 3D reconstruction, using a Transformer model instead of a CNN model.) It would have been obvious to one of ordinary skill in the art, before the effective filing date, to modify the method as taught by Wang in view of Minnen and Mao, because Wang2 describes (ABST) the Transformer model as having stronger scaling capability than CNN-based methods.

Claim 11 has limitations similar to those of claim 2 and is therefore rejected under the same rationale as claim 2.

Regarding claim 7, as previously discussed Wang in view of Minnen and Mao teaches the method of claim 1 of processing image data (patches) and generating 3D Gaussian primitives, but Wang uses a CNN rather than transformer models with self-attention.
However, Wang2 teaches processing data through a series of transformer models including self-attention and multilayer perceptron layers using the machine learning model for generating the three-dimensional Gaussian primitives (pg. 4 Fig. 1, shows transformer architecture showing “2D-view Encoder” and “3D-Volume Decoder” with multiple stacked blocks, each having “Multi-Head…Attention” which uses the multi-head self-attention operation (pg. 2, 2.2. Transformer). Figure 1 also shows “Position-wise Feed-Forward” networks, which read on multilayer perceptron layers (as a multilayer perceptron is a feed-forward network). Wang2 describes (pg. 4, 3.1. Divergence-enhanced 2D-view Encoder ¶1) stacking N = 6 basic blocks with each block consisting of a multihead divergence-enhanced view attention layer and a position-wise feed-forward network, which reads on processing through a series of transformer models.) It would have been obvious to one of ordinary skill in the art, before the effective filing date, to modify the method as taught by Wang in view of Minnen and Mao, because Wang2 describes (ABST) the Transformer model as having stronger scaling capability than CNN-based methods.
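The block structure described above (an attention layer followed by a position-wise feed-forward network, stacked N = 6 times) can be sketched in a few lines. This is a simplified illustration and not Wang2's architecture: it uses plain single-head scaled dot-product self-attention rather than the divergence-enhanced multi-head attention Wang2 describes, omits layer normalization, and uses identity weight matrices purely as placeholders. All names are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    peak = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product self-attention over a token sequence."""
    q, k, v = matmul(tokens, wq), matmul(tokens, wk), matmul(tokens, wv)
    scale = math.sqrt(len(q[0]))
    scores = matmul(q, [list(col) for col in zip(*k)])  # q @ k^T
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights

def feed_forward(tokens, w1, w2):
    """Position-wise two-layer perceptron with a ReLU in between."""
    hidden = [[max(0.0, x) for x in row] for row in matmul(tokens, w1)]
    return matmul(hidden, w2)

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def transformer_block(tokens, params):
    attended, weights = self_attention(tokens, params["wq"], params["wk"], params["wv"])
    tokens = add(tokens, attended)  # residual connection around attention
    tokens = add(tokens, feed_forward(tokens, params["w1"], params["w2"]))
    return tokens, weights

NUM_BLOCKS = 6  # mirrors the N = 6 stacking cited from Wang2
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder weights, not learned values
params = {name: identity for name in ("wq", "wk", "wv", "w1", "w2")}
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for _ in range(NUM_BLOCKS):
    tokens, weights = transformer_block(tokens, params)
```

The sequence length and token width are preserved from block to block, which is what allows the blocks to be stacked in series.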


Claims 6 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Wang (Wang J, Zhang Q, Sun J, Cao J, Han G, Zhao W, Zhang W, Shao Y, Guo Y, Xu R. Reinforcement Learning with Generalizable Gaussian Splatting. arXiv preprint arXiv:2404.07950. 2024 Mar 18.), Minnen (Minnen, D., Toderici, G., Covell, M., Chinen, T., Johnston, N., Shor, J., Hwang, S.J., Vincent, D. and Singh, S., 2017, September. Spatially adaptive image compression using a tiled deep network. In 2017 IEEE International Conference on Image Processing (ICIP) (pp. 2796-2800). IEEE.), Mao (CN114627023B), and Shi (Shi Y, Wang P, Ye J, Long M, Li K, Yang X. Mvdream: Multi-view diffusion for 3d generation. arXiv preprint arXiv:2308.16512. 2023 Aug 31.).

Regarding claim 6, Wang in view of Minnen and Mao fails to teach, but Shi teaches the method of claim 1, wherein the first digital image and the second digital image are generated from a text input by a generative model (Shi ABST, describes a diffusion model (generative model) that generates multi-view images (first and second digital images) from a text input (prompt)). It would have been obvious to one of ordinary skill in the art, before the effective filing date, to apply the method as taught by Wang in view of Minnen and Mao to multi-view images generated from text prompts as taught by Shi because generating multi-view images from text is a well-known technique with the benefits of consistency and repeatability.

Claim 15 has limitations similar to those of claim 6 and is therefore rejected under the same rationale as claim 6.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
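The reply periods described above can be worked through for the April 8, 2026 mailing date shown in the prosecution timeline. The following is a rough sketch only: the `add_months` helper is hypothetical, and the calculation ignores extension fees and any adjustment for deadlines that fall on a weekend or federal holiday, so the resulting dates should be confirmed against the official record.

```python
import calendar
from datetime import date

def add_months(start, months):
    """Return the same day of the month `months` later, clamped to the month's end."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

mailed = date(2026, 4, 8)  # Final Rejection mailing date from the prosecution timeline

two_month_mark = add_months(mailed, 2)         # first reply by this date engages the advisory-action provision
shortened_period_ends = add_months(mailed, 3)  # end of the THREE-MONTH shortened statutory period
statutory_maximum = add_months(mailed, 6)      # no reply may be filed later than SIX MONTHS
```

Under these assumptions the two-month mark falls on June 8, 2026, the shortened statutory period ends on July 8, 2026, and the six-month limit falls on October 8, 2026.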
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DAN F KALHORI whose telephone number is (571)272-5475. The examiner can normally be reached Mon-Fri 8:30-5:30 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, DEVONA E FAULK can be reached at (571) 272-7515. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DAN F KALHORI/ Examiner, Art Unit 2618
/DEVONA E FAULK/Supervisory Patent Examiner, Art Unit 2618

Prosecution Timeline

Apr 25, 2024

Application Filed

Nov 14, 2025

Non-Final Rejection mailed — §103

Dec 13, 2025

Examiner Interview Summary

Dec 17, 2025

Response Filed

Apr 08, 2026

Final Rejection mailed — §103

May 21, 2026

Examiner Interview Summary

Precedent Cases

Applications granted by this same examiner with similar technology

18/613,146

Patent 12620172

METHOD AND IMAGE-PROCESSING DEVICE FOR DETECTING A REFLECTION OF AN IDENTIFIED OBJECT IN AN IMAGE FRAME

2y 1m to grant, granted May 05, 2026

18/329,379

Patent 12567392

METHOD FOR A TELEVISION TO ASSIST A VIEWER IN IMPROVING WATCHING EXPERIENCE IN A ROOM, AND A TELEVISION IMPLEMENTING THE SAME

2y 9m to grant, granted Mar 03, 2026

18/195,466

Patent 12469152

SYSTEM AND METHOD FOR THREE-DIMENSIONAL MULTI-OBJECT TRACKING

2y 6m to grant, granted Nov 11, 2025

18/144,285

Patent 12456181

METHOD, ELECTRONIC DEVICE, AND COMPUTER PROGRAM PRODUCT FOR VERIFYING VIRTUAL AVATAR

2y 5m to grant, granted Oct 28, 2025

Study what changed to get past this examiner. Based on 4 most recent grants.

Prosecution Projections

3-4

Expected OA Rounds

100%

Grant Probability

99%

With Interview (+0.0%)

2y 4m (~3m remaining)

Median Time to Grant

Moderate

PTA Risk

Based on 3 resolved cases by this examiner. Grant probability derived from career allowance rate.