Prosecution Insights
Last updated: April 19, 2026
Application No. 18/655,917

METHOD AND APPARATUS FOR NEURAL RENDERING BASED ON BINARY REPRESENTATION

Non-Final OA — §103

Filed: May 06, 2024
Examiner: MA, MICHELLE HAU
Art Unit: 2617
Tech Center: 2600 — Communications
Assignee: Postech Research And Business Development Foundation
OA Round: 1 (Non-Final)
Grant Probability: 81% (Favorable)
OA Rounds: 1-2
To Grant: 2y 7m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 81% (17 granted / 21 resolved; +19.0% vs TC avg) — above average
Interview Lift: +36.4% among resolved cases with an interview — strong
Avg Prosecution: 2y 7m (typical timeline)
Currently Pending: 35
Total Applications: 56 across all art units

Statute-Specific Performance

§101: 3.0% (-37.0% vs TC avg)
§102: 6.4% (-33.6% vs TC avg)
§103: 84.2% (+44.2% vs TC avg)
§112: 5.5% (-34.5% vs TC avg)

Tech Center averages are estimates • Based on career data from 21 resolved cases

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Specification

The disclosure is objected to because of the following informalities: In paragraph 0054, lines 7-8, “NSR module 220” should read “NSR model 222”. Appropriate correction is required.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 6-7, and 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Rho et al. (Masked Wavelet Representation for Compact Neural Radiance Fields), hereinafter Rho.

Regarding claim 1, Rho teaches a neural rendering method (Paragraph 5 in 2nd Col. of Page 3, Paragraph 1 in 1st Col. of Page 4 – “We consider a neural radiance field leveraging grid representations.
It takes an input coordinate x ∈ R3 and viewing direction, d ∈ R2, generating a four-dimensional vector consisting of a density and three-channel RGB colors…The following volumetric rendering equation is used to synthesize novel views”) comprising:

receiving a query input (input coordinate and input viewing direction → see quote below) comprising coordinates and a view direction of a query point of a three-dimensional (3D) scene in a 3D space (Paragraph 5 in 2nd Col. of Page 3 – “We consider a neural radiance field leveraging grid representations. It takes an input coordinate x ∈ R3 and viewing direction, d ∈ R2”; Note: the input coordinate is a query point, and it is in a 3-dimensional space, symbolized by R3);

extracting reference feature values (feature vectors used in bilinear interpolation → see quote and Fig. 2 below) around the query point from feature values (feature vectors → see quote below) in a binary format within a binary feature grid (masked/binarized wavelet coefficient grid → see quote below) representing the 3D scene (Fig. 2 Caption on Page 4, Paragraph 2 in 2nd Col. of Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid. Masked wavelet coefficients are then inverse-transformed to spatial features. We sample feature vectors for input coordinates using bilinear interpolation on the grids… In detail, we use a set of 2D matrices and 1D vectors for grid representation… More formally, a 3D grid representation G can be defined as follows. [equation image]”; Note: the feature vectors used in the bilinear interpolation operation are the reference feature values, and they come from masked wavelet coefficients, which are in a binary format);

determining an input feature value (sampled feature vectors resulting from bilinear interpolation → see quote below) based on the reference feature values in the binary format (Fig. 2 Caption on Page 4, Paragraph 2 in 2nd Col. of Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid. Masked wavelet coefficients are then inverse-transformed to spatial features. We sample feature vectors for input coordinates using bilinear interpolation on the grids… In detail, we use a set of 2D matrices and 1D vectors for grid representation… vxr ∈ RD, vyr ∈ RH, vzr ∈ RW, are vectors in x, y, z directions, respectively…More formally, a 3D grid representation G can be defined as follows. [equation image]”; Note: the sampled features are the input feature values, and they are based on the reference features used in the bilinear interpolation operation, disclosed above. The sampled features are a result of bilinear interpolation); and

generating a query output (color and density/opacity → see quotes below) corresponding to the query input by executing a neural scene representation (NSR) model (Fig. 2 Caption on Page 4 – “The overall architecture of our model… opacity and color for each input coordinate are estimated using the sampled feature vectors following the defined model”; Note: Fig. 2 shows the neural scene representation model; see screenshot of Fig. 2 below. The opacity and color are the query output) based on the query input and the input feature value (Paragraph 5 in 2nd Col. of Page 3, Paragraph 1 in 1st Col. of Page 4 – “We consider a neural radiance field leveraging grid representations. It takes an input coordinate x ∈ R3 and viewing direction, d ∈ R2, generating a four-dimensional vector consisting of a density and three-channel RGB colors, defined as follows. [equation image] where θ is the parameter of an MLP, γ = {γσ, γc} is a set of grid parameters, and M = {Mσ,Mc} is a set of masks for grid parameters. The following volumetric rendering equation is used to synthesize novel views: [equation image] where r(t) is a ray from a camera viewpoint, and C(r) is the expected color of the ray r(t)”; Note: the output is a four-dimensional vector consisting of a density and three-channel RGB colors. The output is generated from the coordinate and direction input, as well as the feature values obtained from the grid parameters and mask).

[image] Screenshot of Fig. 2 (taken from Rho)

Rho does not directly teach that the input feature value is in a real number format, from the limitation: “determining an input feature value in a real number format based on the reference feature values in the binary format”. However, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rho to explicitly have the input feature value be in a real number format, because in Rho, the input feature value is determined by performing bilinear interpolation (Fig. 2 Caption on Page 4 – “We sample feature vectors for input coordinates using bilinear interpolation on the grids”), and interpolation tends to result in a decimal/fractional number, which binary format cannot account for. Furthermore, there is a finite number of number formats, including binary, real number, hexadecimal, etc.
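The examiner's point that interpolating values stored in binary format tends to yield a fractional, real-valued result can be illustrated with a minimal sketch. This is our own illustration, not code from Rho; the 2x2 `patch` and the `bilerp` helper are hypothetical stand-ins for the masked (0/1) coefficients and the bilinear sampling described in the quoted passages.

```python
import numpy as np

# Hypothetical 2x2 patch of a binary feature grid: every stored value is 0 or 1,
# standing in for the masked/binarized coefficients discussed above.
patch = np.array([[0.0, 1.0],
                  [1.0, 1.0]])

def bilerp(grid, x, y):
    """Bilinearly interpolate a 2x2 grid patch at fractional (x, y) in [0, 1]."""
    top = grid[0, 0] * (1 - x) + grid[0, 1] * x
    bot = grid[1, 0] * (1 - x) + grid[1, 1] * x
    return top * (1 - y) + bot * y

# Sampling between grid vertices yields a fractional (real-number) feature
# even though every stored reference value is binary.
value = bilerp(patch, 0.5, 0.5)
print(value)  # 0.75
```

Only when the query lands exactly on a vertex does the interpolated value stay binary; anywhere in between, the output is a real number, which is the behavior the rejection attributes to Rho's bilinear sampling.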
One of ordinary skill in the art could have determined the input feature value in a real number format with a reasonable expectation of success and would have done so for the benefit of predicting a reasonable feature value using interpolation. Therefore, it would have been obvious to try the solution of producing the input feature value as a real number.

Regarding claim 2, Rho teaches the neural rendering method of claim 1. Rho further teaches wherein the determining of the input feature value comprises: determining the input feature value by performing an interpolation operation based on the reference feature values (Fig. 2 Caption on Page 4, Paragraph 2 in 2nd Col. of Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid. Masked wavelet coefficients are then inverse-transformed to spatial features. We sample feature vectors for input coordinates using bilinear interpolation on the grids… In detail, we use a set of 2D matrices and 1D vectors for grid representation… vxr ∈ RD, vyr ∈ RH, vzr ∈ RW, are vectors in x, y, z directions, respectively”; Note: the sampled features are the input feature values, and they are based on the features used in the bilinear interpolation operation).

Regarding claim 3, Rho teaches the neural rendering method of claim 1. Rho further teaches wherein the binary feature grid comprises: a 3D binary feature grid representing the 3D scene (Fig. 2 Caption, Paragraph 2 in 2nd Col. of Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid…In detail, we use a set of 2D matrices and 1D vectors for grid representation… Nr is the number of ranks in matrix-vector decomposition and Wxr ∈ RHxW, Wyr ∈ RWxD, Wzr ∈ RHxD are matrices, vxr ∈ RD, vyr ∈ RH, vzr ∈ RW, are vectors in x, y, z directions, respectively. H, W, D are the resolution of the grid.
More formally, a 3D grid representation G can be defined as follows. [equation image]”; Note: the masked grid is a binary feature grid, and the grid is 3D), and a two-dimensional (2D) binary feature grid representing a 2D scene in which the 3D scene is projected onto a 2D plane (Fig. 2 Caption, Paragraph 2 in 2nd Col. of Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid…For efficient 3D object and scene representation, recent studies proposed using a lower-dimensional grid (2D planes or 1D lines) [3, 4]. For example, EG3D [3] utilizes three 2D planes (tri-plane) and TensoRF [4] employs three sets, each consisting of a plane and a vector. We describe our method based on TensoRF. In detail, we use a set of 2D matrices and 1D vectors for grid representation… Nr is the number of ranks in matrix-vector decomposition and Wxr ∈ RHxW, Wyr ∈ RWxD, Wzr ∈ RHxD are matrices, vxr ∈ RD, vyr ∈ RH, vzr ∈ RW, are vectors in x, y, z directions, respectively. H, W, D are the resolution of the grid. More formally, a 3D grid representation G can be defined as follows. [equation image]”; Note: the matrices, when masked, are 2D binary feature grids. They represent 2D planes corresponding to the 3D scene).

Regarding claim 6, Rho teaches the neural rendering method of claim 1. Rho further teaches wherein the NSR model is trained using a real-valued feature grid comprising feature values in the real number format (Fig. 2 Caption on Page 4, Paragraph 2 in 2nd Col. of Page 4, Paragraph 1 in 2nd Col. of Page 5 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid. Masked wavelet coefficients are then inverse-transformed to spatial features. We sample feature vectors for input coordinates using bilinear interpolation on the grids… In detail, we use a set of 2D matrices and 1D vectors for grid representation… Nr is the number of ranks in matrix-vector decomposition and Wxr ∈ RHxW, Wyr ∈ RWxD, Wzr ∈ RHxD are matrices, vxr ∈ RD, vyr ∈ RH, vzr ∈ RW, are vectors in x, y, z directions…During training, the grid parameters and corresponding binarized masks are multiplied element by element”; Note: the wavelet coefficients/grid parameters (before masking) are real-valued reference feature values in a real-number format).

Rho does not explicitly teach when training of the NSR model is completed, neural rendering is performed using the binary feature grid without the real-valued feature grid. Instead, Rho teaches when training of the NSR model is completed, neural rendering is performed using the binary feature grid with the real-valued feature grid (Paragraph 4 in 2nd Col. of Page 3, Paragraph 1 in 1st Col. of Page 4 – “We consider a neural radiance field leveraging grid representations. It takes an input coordinate x ∈ R3 and viewing direction, d ∈ R2, generating a four-dimensional vector consisting of a density and three-channel RGB colors, defined as follows. [equation image] where θ is the parameter of an MLP, γ = {γσ, γc} is a set of grid parameters, and M = {Mσ,Mc} is a set of masks for grid parameters, which will be described shortly (Sec. 3.3). The following volumetric rendering equation is used to synthesize novel views: [equation image] where r(t) is a ray from a camera viewpoint, and C(r) is the expected color of the ray r(t)”; Note: the grid parameters are used in rendering, along with masks for the grid parameters, which implies that the real-valued grid (before masking) and the binary grid (after masking) are both used in rendering).
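The element-wise masking that the quoted passage describes ("the grid parameters and corresponding binarized masks are multiplied element by element") can be sketched as follows. This is a minimal illustration under our own assumptions; the array names (`grid_params`, `mask_logits`) and the thresholding rule are hypothetical, not taken from Rho's released code.

```python
import numpy as np

# Illustrative real-valued grid parameters and trainable mask parameters.
rng = np.random.default_rng(0)
grid_params = rng.normal(size=(4, 4))   # real-valued wavelet coefficients
mask_logits = rng.normal(size=(4, 4))   # trainable mask parameters

# Binarize the mask, then multiply element-wise with the real-valued grid.
binary_mask = (mask_logits > 0.0).astype(grid_params.dtype)
masked_grid = grid_params * binary_mask

# Coefficients where the mask is 0 are zeroed out and need not be stored,
# which is the compression argument behind the masked representation.
print(int(binary_mask.sum()), "of", binary_mask.size, "coefficients kept")
```

This also makes the claim-6 dispute concrete: during training both the real-valued `grid_params` and `binary_mask` exist, while the rejection argues it would have been obvious to keep only the surviving coefficients at inference time.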
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rho to perform neural rendering using the binary feature grid without the real-valued feature grid, because if the model is trained to binarize the features, then the real-valued grid would not be necessary for rendering; only the feature values themselves would be necessary, but not the whole grid. Furthermore, there are a finite number of ways to perform neural rendering with real-valued and binary grids; either it can be performed with both grids or with just one grid. One of ordinary skill in the art could have performed neural rendering using only the binary feature grid with a reasonable expectation of success and would have done so for the benefit of increasing the efficiency of rendering since the real-valued feature grid would not have to be generated. Therefore, it would have been obvious to try the solution of performing neural rendering using only the binary feature grid.

Regarding claim 7, Rho teaches the neural rendering method of claim 1. Rho further teaches wherein the query output comprises color information and a volume density based on the query input (Paragraph 5 in 2nd Col. of Page 3, Paragraph 1 in 1st Col. of Page 4 – “We consider a neural radiance field leveraging grid representations. It takes an input coordinate x ∈ R3 and viewing direction, d ∈ R2, generating a four-dimensional vector consisting of a density and three-channel RGB colors, defined as follows. [equation image] where θ is the parameter of an MLP, γ = {γσ, γc} is a set of grid parameters, and M = {Mσ,Mc} is a set of masks for grid parameters. The following volumetric rendering equation is used to synthesize novel views”; Note: the output is a four-dimensional vector consisting of a density and three-channel RGB colors).

Regarding claim 17, Rho teaches an electronic device (Paragraph 3 in 2nd Col. of Page 6, Paragraph 1 in 2nd Col. of Page 12 – “Figs. 1 and 5 show the quantitative performances of our method and the baselines on various datasets. Each graph displays the average PSNR and size of methods… current codes lower the GPU utilization rate from 80% (baseline) to 50%”; Note: it is implied that there is an electronic device to perform the method since it cannot be performed otherwise) comprising:

a processor (Paragraph 1 in 2nd Col. of Page 12 – “current codes lower the GPU utilization rate from 80% (baseline) to 50%”; Note: it is implied that a processor, at least the GPU, was used to perform the method); and

a memory configured to store instructions executable by the processor, wherein, in response to the instructions being executed by the processor (Paragraph 5 in 2nd Col. of Page 6, Paragraph 1 in 2nd Col. of Page 12 – “Even with the base network (VM-192), our method consistently outperforms other baselines on all novel view synthesis datasets, under a small memory budget of 2 MB…current codes lower the GPU utilization rate from 80% (baseline) to 50%”; Note: there is a memory and code. It is implied that the memory stores the code since it would not be able to run otherwise), the processor is configured to:

receive a query input (input coordinate and input viewing direction → see quote below) comprising coordinates and a view direction of a query point of a three-dimensional (3D) scene in a 3D space (Paragraph 5 in 2nd Col. of Page 3 – “We consider a neural radiance field leveraging grid representations. It takes an input coordinate x ∈ R3 and viewing direction, d ∈ R2”; Note: the input coordinate is a query point, and it is in a 3-dimensional space, symbolized by R3);

extract reference feature values (feature vectors used in bilinear interpolation → see quote and Fig. 2 below) around the query point from feature values (feature vectors → see quote below) in a binary format within a binary feature grid (masked/binarized wavelet coefficient grid → see quote below) representing the 3D scene (Fig. 2 Caption on Page 4, Paragraph 2 in 2nd Col. of Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid. Masked wavelet coefficients are then inverse-transformed to spatial features. We sample feature vectors for input coordinates using bilinear interpolation on the grids… In detail, we use a set of 2D matrices and 1D vectors for grid representation… More formally, a 3D grid representation G can be defined as follows. [equation image]”; Note: the feature vectors used in the bilinear interpolation operation are the reference feature values, and they come from masked wavelet coefficients, which are in a binary format);

determine an input feature value (sampled feature vectors resulting from bilinear interpolation → see quote below) based on the reference feature values in the binary format (Fig. 2 Caption on Page 4, Paragraph 2 in 2nd Col. of Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid. Masked wavelet coefficients are then inverse-transformed to spatial features. We sample feature vectors for input coordinates using bilinear interpolation on the grids… In detail, we use a set of 2D matrices and 1D vectors for grid representation… vxr ∈ RD, vyr ∈ RH, vzr ∈ RW, are vectors in x, y, z directions, respectively…More formally, a 3D grid representation G can be defined as follows. [equation image]”; Note: the sampled features are the input feature values, and they are based on the reference features used in the bilinear interpolation operation, disclosed above. The sampled features are a result of bilinear interpolation); and

generate a query output (color and density/opacity → see quotes below) corresponding to the query input by executing a neural scene representation (NSR) model (Fig. 2 Caption on Page 4 – “The overall architecture of our model… opacity and color for each input coordinate are estimated using the sampled feature vectors following the defined model”; Note: Fig. 2 shows the neural scene representation model; see screenshot of Fig. 2 above. The opacity and color are the query output) based on the query input and the input feature value (Paragraph 5 in 2nd Col. of Page 3, Paragraph 1 in 1st Col. of Page 4 – “We consider a neural radiance field leveraging grid representations. It takes an input coordinate x ∈ R3 and viewing direction, d ∈ R2, generating a four-dimensional vector consisting of a density and three-channel RGB colors, defined as follows. [equation image] where θ is the parameter of an MLP, γ = {γσ, γc} is a set of grid parameters, and M = {Mσ,Mc} is a set of masks for grid parameters. The following volumetric rendering equation is used to synthesize novel views: [equation image] where r(t) is a ray from a camera viewpoint, and C(r) is the expected color of the ray r(t)”; Note: the output is a four-dimensional vector consisting of a density and three-channel RGB colors. The output is generated from the coordinate and direction input, as well as the feature values obtained from the grid parameters and mask).
Rho does not directly teach that the input feature value is in a real number format, from the limitation: “determine an input feature value in a real number format based on the reference feature values in the binary format”. However, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rho to explicitly have the input feature value be in a real number format, because in Rho, the input feature value is determined by performing bilinear interpolation (Fig. 2 Caption on Page 4 – “We sample feature vectors for input coordinates using bilinear interpolation on the grids”), and interpolation tends to result in a decimal/fractional number, which binary format cannot account for. Furthermore, there is a finite number of number formats, including binary, real number, hexadecimal, etc. One of ordinary skill in the art could have determined the input feature value in a real number format with a reasonable expectation of success and would have done so for the benefit of predicting a reasonable feature value using interpolation. Therefore, it would have been obvious to try the solution of producing the input feature value as a real number.

Regarding claim 18, Rho teaches the electronic device of claim 17. Rho further teaches wherein the binary feature grid comprises: a 3D binary feature grid representing the 3D scene (Fig. 2 Caption, Paragraph 2 in 2nd Col. of Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid…In detail, we use a set of 2D matrices and 1D vectors for grid representation… Nr is the number of ranks in matrix-vector decomposition and Wxr ∈ RHxW, Wyr ∈ RWxD, Wzr ∈ RHxD are matrices, vxr ∈ RD, vyr ∈ RH, vzr ∈ RW, are vectors in x, y, z directions, respectively. H, W, D are the resolution of the grid. More formally, a 3D grid representation G can be defined as follows.
[equation image]”; Note: the masked grid is a binary feature grid, and the grid is 3D), and a two-dimensional (2D) binary feature grid representing a 2D scene in which the 3D scene is projected onto a 2D plane (Fig. 2 Caption, Paragraph 2 in 2nd Col. of Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid…For efficient 3D object and scene representation, recent studies proposed using a lower-dimensional grid (2D planes or 1D lines) [3, 4]. For example, EG3D [3] utilizes three 2D planes (tri-plane) and TensoRF [4] employs three sets, each consisting of a plane and a vector. We describe our method based on TensoRF. In detail, we use a set of 2D matrices and 1D vectors for grid representation… Nr is the number of ranks in matrix-vector decomposition and Wxr ∈ RHxW, Wyr ∈ RWxD, Wzr ∈ RHxD are matrices, vxr ∈ RD, vyr ∈ RH, vzr ∈ RW, are vectors in x, y, z directions, respectively. H, W, D are the resolution of the grid. More formally, a 3D grid representation G can be defined as follows. [equation image]”; Note: the matrices, when masked, are 2D binary feature grids. They represent 2D planes corresponding to the 3D scene).

Claims 4-5, 8-13, 15, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Rho in view of Xu et al. (Grid-guided Neural Radiance Fields for Large Urban Scenes), hereinafter Xu.

Regarding claim 4, Rho teaches the neural rendering method of claim 3. Rho teaches wherein the extracting of the reference feature values comprises: determining a 2D point by projecting the query point onto the 2D plane (Fig. 2 Caption, Paragraph 2 in 2nd Col. of Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid…We sample feature vectors for input coordinates using bilinear interpolation on the grids…For efficient 3D object and scene representation, recent studies proposed using a lower-dimensional grid (2D planes or 1D lines) [3, 4]. For example, EG3D [3] utilizes three 2D planes (tri-plane) and TensoRF [4] employs three sets, each consisting of a plane and a vector. We describe our method based on TensoRF. In detail, we use a set of 2D matrices and 1D vectors for grid representation… Nr is the number of ranks in matrix-vector decomposition and Wxr ∈ RHxW, Wyr ∈ RWxD, Wzr ∈ RHxD are matrices, vxr ∈ RD, vyr ∈ RH, vzr ∈ RW, are vectors in x, y, z directions, respectively. H, W, D are the resolution of the grid. More formally, a 3D grid representation G can be defined as follows. [equation image]”; Note: the matrices, when masked, are 2D binary feature grids. They represent 2D planes corresponding to the 3D scene. The input coordinate, which is the query point, is identified in the grids in order to sample feature vectors); and extracting second reference feature values around the 2D point from the 2D binary feature grid (Fig. 2 Caption on Page 4, Paragraph 2 in 2nd Col. of Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid. Masked wavelet coefficients are then inverse-transformed to spatial features. We sample feature vectors for input coordinates using bilinear interpolation on the grids… In detail, we use a set of 2D matrices and 1D vectors for grid representation”; Note: the spatial features used in the bilinear interpolation operation are the second reference feature values, and they come from masked wavelet coefficients, which are in a binary format. Fig. 2 above shows the interpolation on the 2D grid).
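The projection-and-sample step mapped to claim 4 (dropping one coordinate of the 3D query point to get a 2D point on each plane, then bilinearly sampling the binary plane around it) can be sketched as follows. This is our own TensoRF/tri-plane-style illustration under stated assumptions, not code from Rho or Xu; `sample_plane`, the plane names, and the query coordinates are all hypothetical.

```python
import numpy as np

def sample_plane(plane, u, v):
    """Bilinearly sample a 2D feature plane at continuous (u, v) grid coordinates."""
    h, w = plane.shape
    u = np.clip(u, 0, w - 1 - 1e-6)
    v = np.clip(v, 0, h - 1 - 1e-6)
    u0, v0 = int(u), int(v)           # lower-left reference vertex
    du, dv = u - u0, v - v0           # fractional offsets
    top = plane[v0, u0] * (1 - du) + plane[v0, u0 + 1] * du
    bot = plane[v0 + 1, u0] * (1 - du) + plane[v0 + 1, u0 + 1] * du
    return top * (1 - dv) + bot * dv

# Hypothetical binary feature planes for the xy, xz, and yz projections.
rng = np.random.default_rng(1)
planes = {axes: (rng.random((8, 8)) > 0.5).astype(float)
          for axes in ("xy", "xz", "yz")}

x, y, z = 2.3, 4.7, 1.1  # 3D query point in grid coordinates
# Projecting onto each plane discards one coordinate; sampling around the
# projected 2D point gives one interpolated feature per plane.
features = [sample_plane(planes["xy"], x, y),
            sample_plane(planes["xz"], x, z),
            sample_plane(planes["yz"], y, z)]
print(features)
```

The four plane entries touched by each `sample_plane` call play the role of the "second reference feature values around the 2D point" in the claim mapping.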
Rho does not teach extracting first reference feature values around the query point from the 3D binary feature grid. However, Xu teaches extracting first reference feature values around the query point from the 3D binary feature grid (Paragraph 3 in 1st Col. of Page 3, Paragraph 1 in 2nd Col. of Page 3 – “grid-based representations encode a scene into a feature grid, which can be intuitively thought of as a 3D voxel grid with grid resolution matched with the actual 3D space. Each voxel stores a feature vector at the vertices and can then be interpolated to extract a feature value at the query point coordinate and converted to the point density and color via a small network”; Note: feature values are extracted from a 3D voxel grid in order to perform interpolation. The 3D binary feature grid was previously taught by Rho in the rejection of claim 1. In this case, the 3D binary feature grid is represented by the 3D voxel grid). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rho to incorporate the teachings of Xu to extract reference feature values around the query point from the 3D binary feature grid because having reference feature values from both the 3D binary feature grid and corresponding 2D binary feature grids would provide plentiful data to the model in order to properly and accurately estimate the feature values of the query point for rendering.

Regarding claim 5, Rho in view of Xu teaches the neural rendering method of claim 4. Rho does not teach wherein the input feature value comprises a first input feature value and a second input feature value, and the determining of the input feature value comprises: determining the first input feature value by performing an interpolation operation based on the first reference feature values; and determining the second input feature value by performing an interpolation operation based on the second reference feature values.
However, Xu teaches wherein the input feature value comprises a first input feature value and a second input feature value (Paragraph 3 in 1st Col. of Page 3, Paragraph 1 in 2nd Col. of Page 3, Paragraph 4 in 2nd Col. of Page 3 – “grid-based representations encode a scene into a feature grid, which can be intuitively thought of as a 3D voxel grid with grid resolution matched with the actual 3D space. Each voxel stores a feature vector at the vertices and can then be interpolated to extract a feature value at the query point coordinate and converted to the point density and color via a small network… Formally, our grid-based radiance field is written as: [equation image] where [equation image] are the extracted interpolated feature values from the two grid-planes at location X ∈ R3.”; Note: the feature value interpolated from the 3D voxel grid is equivalent to the first input feature value, and the feature value interpolated from the grid-plane is equivalent to the second feature value), and determining the first input feature value by performing an interpolation operation based on the first reference feature values (Paragraph 3 in 1st Col. of Page 3, Paragraph 1 in 2nd Col. of Page 3 – “grid-based representations encode a scene into a feature grid, which can be intuitively thought of as a 3D voxel grid with grid resolution matched with the actual 3D space. Each voxel stores a feature vector at the vertices and can then be interpolated to extract a feature value at the query point coordinate and converted to the point density and color via a small network”; Note: feature values interpolated from feature values in a 3D voxel grid are equivalent to the first input feature values); and determining the second input feature value by performing an interpolation operation based on the second reference feature values (Paragraph 4 in 2nd Col. of Page 3 – “Formally, our grid-based radiance field is written as: [equation image] where [equation image] are the extracted interpolated feature values from the two grid-planes at location X ∈ R3.”; Note: feature values interpolated from feature values in a 2D grid-plane are equivalent to the second input feature values).

While Xu separately teaches the first and second input feature values, it would have been obvious to have both in order to more accurately approximate the features of the query point. Continuing on, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rho to incorporate the teachings of Xu to have first and second input feature values for the benefit of having additional data to contribute to the prediction of the query point feature values, which would help make the rendering more accurate. Since Rho already teaches interpolation (Fig. 2 Caption on Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid. Masked wavelet coefficients are then inverse-transformed to spatial features. We sample feature vectors for input coordinates using bilinear interpolation on the grids”), it also would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rho to incorporate the teachings of Xu to perform interpolation to obtain the first and second input feature values for the benefit of efficient and accurate prediction of the query point feature values.

Regarding claim 8, Rho teaches a training method (Paragraph 4 in 1st Col. of Page 2, Paragraph 5 in 2nd Col. of Page 3 – “To automatically filter out unnecessary coefficients, we propose a trainable binary mask.
For each 3D scene, we jointly optimize grid parameters and their corresponding masks…We consider a neural radiance field leveraging grid representations. It takes an input coordinate x ∈ R^3 and viewing direction, d ∈ R^2, generating a four-dimensional vector consisting of a density and three-channel RGB colors”) comprising: receiving a query input (input coordinate and input viewing direction → see quote below) comprising coordinates and a view direction of a query point of a three-dimensional (3D) scene in a 3D space (Paragraph 5 in 2nd Col. of Page 3 – “We consider a neural radiance field leveraging grid representations. It takes an input coordinate x ∈ R^3 and viewing direction, d ∈ R^2”; Note: the input coordinate is a query point, and it is in a 3-dimensional space, symbolized by R^3); determining an input feature value (sample feature vectors resulting from bilinear interpolation → see quote below) using binary reference feature values (Fig. 2 Caption on Page 4, Paragraph 2 in 2nd Col. of Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid. Masked wavelet coefficients are then inverse-transformed to spatial features. We sample feature vectors for input coordinates using bilinear interpolation on the grids… In detail, we use a set of 2D matrices and 1D vectors for grid representation… v_x^r ∈ R^D, v_y^r ∈ R^H, v_z^r ∈ R^W are vectors in the x, y, z directions, respectively…More formally, a 3D grid representation G can be defined as follows. [equation image]”; Note: the sampled features are the input feature values, and they are based on the features used in the bilinear interpolation operation, which are the binary reference feature values. The sampled features are a result of bilinear interpolation); and training a neural scene representation (NSR) model (Model shown in Fig.
2) to generate a query output (color and density/opacity → see quotes below) corresponding to the query input by using the input feature value in the real number format as an input of the NSR model (Paragraph 5 in 2nd Col. of Page 3, Paragraph 1 in 1st Col. of Page 4, Fig. 2 Caption on Page 4 – “We consider a neural radiance field leveraging grid representations. It takes an input coordinate x ∈ R^3 and viewing direction, d ∈ R^2, generating a four-dimensional vector consisting of a density and three-channel RGB colors, defined as follows. [equation image] where θ is the parameter of an MLP, γ = {γσ, γc} is a set of grid parameters, and M = {Mσ, Mc} is a set of masks for grid parameters. The following volumetric rendering equation is used to synthesize novel views… The overall architecture of our model… opacity and color for each input coordinate are estimated using the sampled feature vectors following the defined model”; Note: the output is a four-dimensional vector consisting of a density and three-channel RGB colors. The output is generated from the coordinate and direction input, as well as the sampled feature values. The sampled feature values are used as input into the model operation. Fig. 2 shows the neural scene representation model; see screenshot of Fig. 2 above. It is implied that the model, especially the MLP, was trained, otherwise it would not be able to produce the desired results). Rho does not directly teach that the input feature value is in a real number format, from the limitation: “determining an input feature value in the real number format using the binary reference feature values”.
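The rejection's real-number-format rationale rests on the arithmetic of interpolation. As an illustrative sketch of our own (not code from Rho or Xu; the grid contents and helper name are hypothetical), bilinear interpolation over strictly binary reference values already produces a fractional, real-valued input feature:

```python
import numpy as np

def bilinear_sample(grid, x, y):
    """Bilinearly interpolate a 2D grid of scalar features at real-valued (x, y)."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    wx, wy = x - x0, y - y0
    return ((1 - wx) * (1 - wy) * grid[y0, x0]
            + wx * (1 - wy) * grid[y0, x0 + 1]
            + (1 - wx) * wy * grid[y0 + 1, x0]
            + wx * wy * grid[y0 + 1, x0 + 1])

binary_grid = np.array([[0.0, 1.0],
                        [1.0, 0.0]])          # binary reference feature values
feature = bilinear_sample(binary_grid, 0.5, 0.5)
# feature == 0.5: a real-valued input feature, even though every grid entry is 0 or 1
```

Sampling at the cell center averages the four binary neighbors, so the result cannot itself be expressed in a binary 0/1 format.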
However, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rho to explicitly have the input feature value be in a real number format, because in Rho, the input feature value is determined by performing bilinear interpolation (Fig. 2 Caption on Page 4 – “We sample feature vectors for input coordinates using bilinear interpolation on the grids”), and interpolation generally produces fractional values, which a binary (0/1) format cannot represent. Furthermore, there is a finite number of number formats, including binary, real number, hexadecimal, etc. One of ordinary skill in the art could have determined the input feature value in a real number format with a reasonable expectation of success and would have done so for the benefit of predicting a reasonable feature value using interpolation. Therefore, it would have been obvious to try the solution of producing the input feature value as a real number. Moreover, Rho does not teach extracting real-valued reference feature values around the query point from feature values in a real number format of a real-valued feature grid representing the 3D scene. Instead, Rho teaches extracting binary reference feature values around the query point from feature values in a binary format of a binary feature grid representing the 3D scene (Fig. 2 Caption on Page 4, Paragraph 2 in 2nd Col. of Page 4, Paragraph 1 in 2nd Col. of Page 5 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid. Masked wavelet coefficients are then inverse-transformed to spatial features.
We sample feature vectors for input coordinates using bilinear interpolation on the grids… In detail, we use a set of 2D matrices and 1D vectors for grid representation… Nr is the number of ranks in matrix-vector decomposition and W_x^r ∈ R^{H×W}, W_y^r ∈ R^{W×D}, W_z^r ∈ R^{H×D} are matrices, v_x^r ∈ R^D, v_y^r ∈ R^H, v_z^r ∈ R^W are vectors in the x, y, z directions…During training, the grid parameters and corresponding binarized masks are multiplied element by element”; Note: the spatial features used in the bilinear interpolation operation are the reference feature values, and they come from masked wavelet coefficients, which are in a binary format). Xu teaches extracting real-valued reference feature values around the query point from feature values in a real number format of a real-valued feature grid representing the 3D scene (Paragraph 3 in 1st Col. of Page 3, Paragraphs 1 and 4 in 2nd Col. of Page 3 – “grid-based representations encode a scene into a feature grid, which can be intuitively thought of as a 3D voxel grid with grid resolution matched with the actual 3D space. Each voxel stores a feature vector at the vertices and can then be interpolated to extract a feature value at the query point coordinate and converted to the point density and color via a small network… Formally, our grid-based radiance field is written as: [equation image], where [the terms shown in the equation] are the extracted interpolated feature values from the two grid-planes at location X ∈ R^3”; Note: the feature values used in the interpolation operation are real-valued reference feature values. They are extracted and are of a real number format).
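Xu's extraction-and-interpolation step, as characterized above, can be sketched with a dense NumPy array standing in for the real-valued 3D voxel grid (the array layout and helper name are our assumptions, not Xu's implementation):

```python
import numpy as np

def trilinear_sample(voxels, p):
    """Interpolate a real-valued voxel grid of shape (D, H, W, C) at p = (z, y, x)."""
    i0 = np.floor(p).astype(int)
    w = p - i0  # fractional position of the query point inside its cell
    out = np.zeros(voxels.shape[-1])
    for dz in (0, 1):
        for dy in (0, 1):
            for dx in (0, 1):
                # weight each of the 8 surrounding vertex feature vectors
                wt = ((w[0] if dz else 1 - w[0])
                      * (w[1] if dy else 1 - w[1])
                      * (w[2] if dx else 1 - w[2]))
                out += wt * voxels[i0[0] + dz, i0[1] + dy, i0[2] + dx]
    return out

# real-valued reference feature vectors stored at the 8 vertices of one cell
voxels = np.linspace(0.0, 1.0, 2 * 2 * 2 * 3).reshape(2, 2, 2, 3)
feat = trilinear_sample(voxels, np.array([0.5, 0.5, 0.5]))  # query at the cell center
```

The eight vertex feature vectors surrounding the query point are exactly the "reference feature values around the query point" in the claim's sense.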
A person of ordinary skill in the art before the effective filing date of the claimed invention would have recognized that the real-valued reference feature values of Xu could have been substituted for the binary reference feature values of Rho because both the binary and real-valued reference feature values serve the purpose of representing feature values near the query point. Furthermore, a person of ordinary skill in the art would have been able to carry out the substitution. Finally, the substitution achieves the predictable result of obtaining feature values that are near the query point. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to substitute the real-valued reference feature values of Xu for the binary reference feature values of Rho according to known methods to yield the predictable result of obtaining feature values that are near the query point. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rho to incorporate the teachings of Xu to have a real-valued feature grid and extract reference feature values from it for the benefit of training the model in Rho by using the real-valued reference feature values as a ground truth, which would help the model learn and be more accurate. Furthermore, Rho does not directly teach determining binary reference feature values in a binary format of a binary feature grid by binarizing the real-valued reference feature values. Instead, Rho teaches determining binary reference feature values in a binary format of a binary feature grid by binarizing the real-valued feature values (Fig. 2 Caption on Page 4, Paragraph 2 in 2nd Col. of Page 4, Paragraph 1 in 2nd Col. of Page 5 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid.
Masked wavelet coefficients are then inverse-transformed to spatial features. We sample feature vectors for input coordinates using bilinear interpolation on the grids… In detail, we use a set of 2D matrices and 1D vectors for grid representation… More formally, a 3D grid representation G can be defined as follows. [equation image] …we have a set of trainable element-wise masks, M = {M_x^r, M_y^r, M_z^r, m_x^r, m_y^r, m_z^r}, r = 1, …, Nr (note that we omitted the subscript σ for brevity). During training, the grid parameters and corresponding binarized masks are multiplied element by element”; Note: the spatial features used in the bilinear interpolation operation are the binary reference feature values, and they come from masked wavelet coefficients, which are in a binary format. The wavelet coefficients, which are the real-valued feature values, were binarized/masked). While Rho does not directly teach binarizing the real-valued reference feature values, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rho to binarize the real-valued reference feature values, because the binarization operation would perform more quickly if done only for the reference feature values, rather than for all the feature values. Furthermore, there are a finite number of ways to binarize the feature values; either it can be performed with all the feature values or a subset of feature values. One of ordinary skill in the art could have performed binarization after reference feature values were extracted in order to determine binary reference feature values with a reasonable expectation of success and would have done so for the benefit of increasing the efficiency of rendering since the binarization operation would not have to be performed for each element in the grids.
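The efficiency rationale for binarizing only the extracted reference values, rather than the whole grid, can be checked concretely. A sketch assuming a simple fixed-threshold binarization (Rho's actual binarization uses trainable masks, so the rule and names here are assumptions):

```python
import numpy as np

def binarize(x, threshold=0.0):
    """Hard binarization: 1.0 where the value exceeds the threshold, else 0.0."""
    return (x > threshold).astype(x.dtype)

rng = np.random.default_rng(0)
grid = rng.standard_normal((256, 256))  # real-valued feature grid

# Option 1: binarize the entire grid, then read the 2x2 reference neighborhood.
whole = binarize(grid)[10:12, 20:22]

# Option 2: extract the 2x2 reference neighborhood first, then binarize only it.
subset = binarize(grid[10:12, 20:22])

# Element-wise binarization commutes with extraction, so the reference values
# are identical, while option 2 binarizes 4 values instead of 65,536.
assert np.array_equal(whole, subset)
```

Because the binarization is element-wise, the two orderings are interchangeable; only the amount of work differs, which is the "obvious to try" point being made.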
Therefore, it would have been obvious to try the solution of binarizing the real-valued reference feature values, instead of all the feature values, by performing binarization after the extraction of reference feature values. Regarding claim 9, Rho in view of Xu teaches the training method of claim 8. Rho further teaches wherein the determining of the input feature value comprises: determining the input feature value by performing an interpolation operation based on the reference feature values (Fig. 2 Caption on Page 4, Paragraph 2 in 2nd Col. of Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid. Masked wavelet coefficients are then inverse-transformed to spatial features. We sample feature vectors for input coordinates using bilinear interpolation on the grids… In detail, we use a set of 2D matrices and 1D vectors for grid representation… v_x^r ∈ R^D, v_y^r ∈ R^H, v_z^r ∈ R^W are vectors in the x, y, z directions, respectively”; Note: the sampled features are the input feature values, and they are based on the features used in the bilinear interpolation operation). Regarding claim 10, Rho in view of Xu teaches the training method of claim 8. Rho further teaches wherein the real-valued feature grid comprises: a 3D real-valued feature grid representing the 3D scene (Fig. 2 Caption, Paragraph 2 in 2nd Col. of Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid…In detail, we use a set of 2D matrices and 1D vectors for grid representation… Nr is the number of ranks in matrix-vector decomposition and W_x^r ∈ R^{H×W}, W_y^r ∈ R^{W×D}, W_z^r ∈ R^{H×D} are matrices, v_x^r ∈ R^D, v_y^r ∈ R^H, v_z^r ∈ R^W are vectors in the x, y, z directions, respectively. H, W, D are the resolution of the grid. More formally, a 3D grid representation G can be defined as follows.
[equation image]”; Note: the 3D grid before masking is a 3D real-valued feature grid), and a two-dimensional (2D) real-valued feature grid representing a 2D scene in which the 3D scene is projected onto a 2D plane (Fig. 2 Caption, Paragraph 2 in 2nd Col. of Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid…In detail, we use a set of 2D matrices and 1D vectors for grid representation… Nr is the number of ranks in matrix-vector decomposition and W_x^r ∈ R^{H×W}, W_y^r ∈ R^{W×D}, W_z^r ∈ R^{H×D} are matrices, v_x^r ∈ R^D, v_y^r ∈ R^H, v_z^r ∈ R^W are vectors in the x, y, z directions, respectively. H, W, D are the resolution of the grid. More formally, a 3D grid representation G can be defined as follows. [equation image]”; Note: the matrices, before masking, are 2D real-valued feature grids. They represent 2D planes corresponding to the 3D scene), and the binary feature grid comprises: a 3D binary feature grid corresponding to a binary version of the 3D real-valued feature grid (Fig. 2 Caption, Paragraph 2 in 2nd Col. of Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid…In detail, we use a set of 2D matrices and 1D vectors for grid representation… Nr is the number of ranks in matrix-vector decomposition and W_x^r ∈ R^{H×W}, W_y^r ∈ R^{W×D}, W_z^r ∈ R^{H×D} are matrices, v_x^r ∈ R^D, v_y^r ∈ R^H, v_z^r ∈ R^W are vectors in the x, y, z directions, respectively. H, W, D are the resolution of the grid. More formally, a 3D grid representation G can be defined as follows. [equation image]”; Note: the masked grid is a binary feature grid, and the grid is 3D. It is a binarized version of the real-valued feature grid), and a 2D binary feature grid corresponding to a binary version of the 2D real-valued feature grid (Fig. 2 Caption, Paragraph 2 in 2nd Col.
of Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid…For efficient 3D object and scene representation, recent studies proposed using a lower-dimensional grid (2D planes or 1D lines) [3, 4]. For example, EG3D [3] utilizes three 2D planes (tri-plane) and TensoRF [4] employs three sets, each consisting of a plane and a vector. We describe our method based on TensoRF. In detail, we use a set of 2D matrices and 1D vectors for grid representation… Nr is the number of ranks in matrix-vector decomposition and W_x^r ∈ R^{H×W}, W_y^r ∈ R^{W×D}, W_z^r ∈ R^{H×D} are matrices, v_x^r ∈ R^D, v_y^r ∈ R^H, v_z^r ∈ R^W are vectors in the x, y, z directions, respectively. H, W, D are the resolution of the grid. More formally, a 3D grid representation G can be defined as follows. [equation image]”; Note: the matrices, when masked, are 2D binary feature grids. They are the binarized/masked versions of the real-valued matrices/grids. They represent 2D planes corresponding to the 3D scene). Regarding claim 11, Rho in view of Xu teaches the training method of claim 10. Rho teaches wherein the extracting of the reference feature values comprises: determining a 2D point by projecting the query point onto the 2D plane (Fig. 2 Caption, Paragraph 2 in 2nd Col. of Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid…We sample feature vectors for input coordinates using bilinear interpolation on the grids…For efficient 3D object and scene representation, recent studies proposed using a lower-dimensional grid (2D planes or 1D lines) [3, 4]. For example, EG3D [3] utilizes three 2D planes (tri-plane) and TensoRF [4] employs three sets, each consisting of a plane and a vector. We describe our method based on TensoRF.
In detail, we use a set of 2D matrices and 1D vectors for grid representation… Nr is the number of ranks in matrix-vector decomposition and W_x^r ∈ R^{H×W}, W_y^r ∈ R^{W×D}, W_z^r ∈ R^{H×D} are matrices, v_x^r ∈ R^D, v_y^r ∈ R^H, v_z^r ∈ R^W are vectors in the x, y, z directions, respectively. H, W, D are the resolution of the grid. More formally, a 3D grid representation G can be defined as follows. [equation image]”; Note: the matrices, when masked, are 2D binary feature grids. They represent 2D planes corresponding to the 3D scene. The input coordinate, which is the query point, is identified in the grids in order to sample feature vectors); and extracting second reference feature values around the 2D point from the 2D binary feature grid (Fig. 2 Caption on Page 4, Paragraph 2 in 2nd Col. of Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid. Masked wavelet coefficients are then inverse-transformed to spatial features. We sample feature vectors for input coordinates using bilinear interpolation on the grids… In detail, we use a set of 2D matrices and 1D vectors for grid representation”; Note: the spatial features used in the bilinear interpolation operation are the second reference feature values, and they come from masked wavelet coefficients, which are in a binary format. Fig. 2 above shows the interpolation on the 2D grid). Rho does not directly teach wherein the determining of the binary reference feature values comprises: determining first binary reference feature values by binarizing the first real-valued reference feature values; and determining second binary reference feature values by binarizing the second real-valued reference feature values. Instead, Rho teaches determining second binary feature values by binarizing the second real-valued feature values (Fig. 2 Caption on Page 4, Paragraph 2 in 2nd Col.
of Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid. Masked wavelet coefficients are then inverse-transformed to spatial features. We sample feature vectors for input coordinates using bilinear interpolation on the grids… In detail, we use a set of 2D matrices and 1D vectors for grid representation”; Note: the wavelet coefficients are real-valued feature values, and when masked, they become binary feature values). While Rho does not directly teach binarizing the real-valued reference feature values, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rho to binarize the real-valued reference feature values, because the binarization operation would perform more quickly if done only for the reference feature values, rather than for all the feature values. Furthermore, there are a finite number of ways to binarize the feature values; either it can be performed with all the feature values or a subset of feature values. One of ordinary skill in the art could have determined the binary reference feature values by binarizing the real-valued reference feature values with a reasonable expectation of success and would have done so for the benefit of increasing the efficiency of rendering since the binarization operation would not have to be performed for each element in the grids. Therefore, it would have been obvious to try the solution of binarizing the real-valued reference feature values, instead of all the feature values. Additionally, while Rho does not directly teach determining first binary reference feature values by binarizing the first real-valued reference feature values, Xu teaches first real-valued reference feature values (Paragraph 3 in 1st Col. of Page 3, Paragraph 1 in 2nd Col.
of Page 3 – “grid-based representations encode a scene into a feature grid, which can be intuitively thought of as a 3D voxel grid with grid resolution matched with the actual 3D space. Each voxel stores a feature vector at the vertices and can then be interpolated to extract a feature value at the query point coordinate and converted to the point density and color via a small network”; Note: the feature values used in the interpolation operation are the first real-valued reference feature values), and a person of ordinary skill in the art before the effective filing date of the claimed invention would have recognized that the first real-valued reference feature values could have been substituted for the second real-valued reference feature values in Rho's binarization because both the first and second reference feature values serve the purpose of representing feature values near the query point. Furthermore, a person of ordinary skill in the art would have been able to carry out the substitution. Finally, the substitution achieves the predictable result of changing the real-valued reference feature values into the binary format. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to substitute the first real-valued reference feature values for the second real-valued reference feature values according to known methods to yield the predictable result of changing the real-valued reference feature values into the binary format. Furthermore, Rho does not teach extracting first reference feature values around the query point from the 3D binary feature grid. However, Xu teaches extracting first reference feature values around the query point from the 3D binary feature grid (Paragraph 3 in 1st Col. of Page 3, Paragraph 1 in 2nd Col.
of Page 3 – “grid-based representations encode a scene into a feature grid, which can be intuitively thought of as a 3D voxel grid with grid resolution matched with the actual 3D space. Each voxel stores a feature vector at the vertices and can then be interpolated to extract a feature value at the query point coordinate and converted to the point density and color via a small network”; Note: feature values are extracted from a 3D voxel grid in order to perform interpolation. The 3D binary feature grid was previously taught by Rho in the rejection of claim 1. In this case, the 3D binary feature grid is represented by the 3D voxel grid). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rho to incorporate the teachings of Xu to extract reference feature values around the query point from the 3D binary feature grid because having reference feature values from both the 3D binary feature grid and corresponding 2D binary feature grids would provide plentiful data to the model in order to properly and accurately estimate the feature values of the query point for rendering. Regarding claim 12, Rho in view of Xu teaches the training method of claim 11. Rho does not teach wherein the input feature value comprises a first input feature value and a second input feature value, and the determining of the input feature value comprises: determining the first input feature value by performing an interpolation operation based on the first reference feature values; and determining the second input feature value by performing an interpolation operation based on the second reference feature values. However, Xu teaches wherein the input feature value comprises a first input feature value and a second input feature value (Paragraph 3 in 1st Col. of Page 3, Paragraph 1 in 2nd Col. of Page 3, Paragraph 4 in 2nd Col. 
of Page 3 – “grid-based representations encode a scene into a feature grid, which can be intuitively thought of as a 3D voxel grid with grid resolution matched with the actual 3D space. Each voxel stores a feature vector at the vertices and can then be interpolated to extract a feature value at the query point coordinate and converted to the point density and color via a small network… Formally, our grid-based radiance field is written as: [equation image], where [the terms shown in the equation] are the extracted interpolated feature values from the two grid-planes at location X ∈ R^3.”; Note: the feature value interpolated from the 3D voxel grid is equivalent to the first input feature value, and the feature value interpolated from the grid-plane is equivalent to the second input feature value), and determining the first input feature value by performing an interpolation operation based on the first reference feature values (Paragraph 3 in 1st Col. of Page 3, Paragraph 1 in 2nd Col. of Page 3 – “grid-based representations encode a scene into a feature grid, which can be intuitively thought of as a 3D voxel grid with grid resolution matched with the actual 3D space. Each voxel stores a feature vector at the vertices and can then be interpolated to extract a feature value at the query point coordinate and converted to the point density and color via a small network”; Note: feature values interpolated from feature values in a 3D voxel grid are equivalent to the first input feature values); and determining the second input feature value by performing an interpolation operation based on the second reference feature values (Paragraph 4 in 2nd Col.
of Page 3 – “Formally, our grid-based radiance field is written as: [equation image], where [the terms shown in the equation] are the extracted interpolated feature values from the two grid-planes at location X ∈ R^3.”; Note: feature values interpolated from feature values in a 2D grid-plane are equivalent to the second input feature values). While Xu separately teaches the first and second input feature values, it would have been obvious to have both in order to more accurately approximate the features of the query point. Further, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rho to incorporate the teachings of Xu to have first and second input feature values for the benefit of having additional data to contribute to the prediction of the query point feature values, which would help make the rendering more accurate. Since Rho already teaches interpolation (Fig. 2 Caption on Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid. Masked wavelet coefficients are then inverse-transformed to spatial features. We sample feature vectors for input coordinates using bilinear interpolation on the grids”), it also would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rho to incorporate the teachings of Xu to perform interpolation to obtain the first and second input feature values for the benefit of efficient and accurate prediction of the query point feature values. Regarding claim 13, Rho in view of Xu teaches the training method of claim 8. Rho further teaches wherein the real-valued feature grid and the binary feature grid have a same size (Fig. 2 Caption, Paragraph 1 in 2nd Col.
of Page 5 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid. Masked wavelet coefficients are then inverse-transformed to spatial features…the grid parameters and corresponding binarized masks are multiplied element by element… replacing grid parameters (Wr, vr) from Eq. (7) with masked grid parameters”; Note: the grid before masking, which is the real-valued feature grid, and the grid after masking, which is the binary feature grid, are the same size since when performing masking, only the element values are changed, so the size of the grid is implied to remain the same), and positions of the real-valued reference feature values in the real-valued feature grid respectively correspond to positions of the binary reference feature values in the binary feature grid in a one-to-one correspondence (Fig. 2 Caption, Paragraph 1 in 2nd Col. of Page 5 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid. Masked wavelet coefficients are then inverse-transformed to spatial features…the grid parameters and corresponding binarized masks are multiplied element by element… replacing grid parameters (Wr, vr) from Eq. (7) with masked grid parameters”; Note: the grid before masking, which is the real-valued feature grid, and the grid after masking, which is the binary feature grid, have elements that correspond to each other, since when performing masking, only the element values are changed, so the positions of the elements in the grid are implied to remain the same). Regarding claim 15, Rho in view of Xu teaches the training method of claim 8. Rho further teaches wherein the NSR model is trained using a real-valued feature grid comprising feature values in the real number format (Fig. 2 Caption on Page 4, Paragraph 2 in 2nd Col. of Page 4, Paragraph 1 in 2nd Col. 
of Page 5 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid. Masked wavelet coefficients are then inverse-transformed to spatial features. We sample feature vectors for input coordinates using bilinear interpolation on the grids… In detail, we use a set of 2D matrices and 1D vectors for grid representation… Nr is the number of ranks in matrix-vector decomposition and W_x^r ∈ R^{H×W}, W_y^r ∈ R^{W×D}, W_z^r ∈ R^{H×D} are matrices, v_x^r ∈ R^D, v_y^r ∈ R^H, v_z^r ∈ R^W are vectors in the x, y, z directions…During training, the grid parameters and corresponding binarized masks are multiplied element by element”; Note: the wavelet coefficients/grid parameters (before masking) are real-valued reference feature values in a real-number format). Rho does not explicitly teach when training of the NSR model is completed, neural rendering is performed using the binary feature grid without the real-valued feature grid. Instead, Rho teaches when training of the NSR model is completed, neural rendering is performed using the binary feature grid with the real-valued feature grid (Paragraph 4 in 2nd Col. of Page 3, Paragraph 1 in 1st Col. of Page 4 – “We consider a neural radiance field leveraging grid representations. It takes an input coordinate x ∈ R^3 and viewing direction, d ∈ R^2, generating a four-dimensional vector consisting of a density and three-channel RGB colors, defined as follows. [equation image] where θ is the parameter of an MLP, γ = {γσ, γc} is a set of grid parameters, and M = {Mσ, Mc} is a set of masks for grid parameters, which will be described shortly (Sec. 3.3).
The following volumetric rendering equation is used to synthesize novel views: [equation image] where r(t) is a ray from a camera viewpoint, and C(r) is the expected color of the ray r(t)”; Note: the grid parameters are used in rendering, along with masks for the grid parameters, which implies that the real-valued grid (before masking) and the binary grid (after masking) are both used in rendering). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rho to perform neural rendering using the binary feature grid without the real-valued feature grid, because if the model is trained to binarize the features, then the real-valued grid would not be necessary for rendering; only the feature values themselves would be necessary, but not the whole grid. Furthermore, there are a finite number of ways to perform neural rendering with real-valued and binary grids; either it can be performed with both grids or with just one grid. One of ordinary skill in the art could have performed neural rendering using only the binary feature grid with a reasonable expectation of success and would have done so for the benefit of increasing the efficiency of rendering since the real-valued feature grid would not have to be generated. Therefore, it would have been obvious to try the solution of performing neural rendering using only the binary feature grid. Regarding claim 19, Rho teaches the electronic device of claim 18. Rho teaches wherein, to extract the reference feature values, the processor is further configured to: determine a 2D point by projecting the query point onto the 2D plane (Fig. 2 Caption, Paragraph 2 in 2nd Col.
of Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid…We sample feature vectors for input coordinates using bilinear interpolation on the grids…For efficient 3D object and scene representation, recent studies proposed using a lower-dimensional grid (2D planes or 1D lines) [3, 4]. For example, EG3D [3] utilizes three 2D planes (tri-plane) and TensoRF [4] employs three sets, each consisting of a plane and a vector. We describe our method based on TensoRF. In detail, we use a set of 2D matrices and 1D vectors for grid representation… N_r is the number of ranks in matrix-vector decomposition and W^x_r ∈ R^(H×W), W^y_r ∈ R^(W×D), W^z_r ∈ R^(H×D) are matrices, v^x_r ∈ R^D, v^y_r ∈ R^H, v^z_r ∈ R^W are vectors in x, y, z directions, respectively. H, W, D are the resolution of the grid. More formally, a 3D grid representation G can be defined as follows. [equation image: the definition of the 3D grid representation G in terms of these matrices and vectors]”; Note: the matrices, when masked, are 2D binary feature grids. They represent 2D planes corresponding to the 3D scene. The input coordinate, which is the query point, is identified in the grids in order to sample feature vectors); and extract second reference feature values around the 2D point from the 2D binary feature grid (Fig. 2 Caption on Page 4, Paragraph 2 in 2nd Col. of Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid. Masked wavelet coefficients are then inverse-transformed to spatial features. We sample feature vectors for input coordinates using bilinear interpolation on the grids… In detail, we use a set of 2D matrices and 1D vectors for grid representation”; Note: the spatial features used in the bilinear interpolation operation are the second reference feature values, and they come from masked wavelet coefficients, which are in a binary format. Fig. 2 above shows the interpolation on the 2D grid).
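The operation the examiner maps to claim 19, masking a real-valued 2D grid with a binarized mask, projecting the 3D query point onto a plane, and bilinearly interpolating the surrounding grid values, can be sketched as follows. This is an illustrative sketch only: `project_to_plane`, `bilinear_sample`, and all shapes and values are hypothetical and do not come from Rho or the record.

```python
import numpy as np

def project_to_plane(query_xyz, axes=(0, 1)):
    """Drop one coordinate to project a 3D query point onto a 2D plane
    (an illustrative stand-in for the tri-plane projection discussed above)."""
    return np.array([query_xyz[axes[0]], query_xyz[axes[1]]])

def bilinear_sample(grid, uv):
    """Bilinearly interpolate a 2D grid at continuous coordinates uv."""
    h, w = grid.shape
    u = np.clip(uv[0], 0, h - 1 - 1e-6)
    v = np.clip(uv[1], 0, w - 1 - 1e-6)
    u0, v0 = int(u), int(v)
    du, dv = u - u0, v - v0
    return ((1 - du) * (1 - dv) * grid[u0, v0]
            + du * (1 - dv) * grid[u0 + 1, v0]
            + (1 - du) * dv * grid[u0, v0 + 1]
            + du * dv * grid[u0 + 1, v0 + 1])

rng = np.random.default_rng(0)
real_grid = rng.standard_normal((8, 8))          # real-valued coefficients
mask = (rng.random((8, 8)) > 0.5).astype(float)  # binarized mask
masked_grid = real_grid * mask                   # element-wise product, as quoted

query = np.array([2.3, 5.7, 1.1])                # hypothetical 3D query point
feature = bilinear_sample(masked_grid, project_to_plane(query))
```

Dropping one axis is only a schematic surrogate for the plane projection; the point of the sketch is that the sampled feature depends on the four grid values around the projected 2D point, which is the "second reference feature values" mapping above.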
Rho does not teach extracting first reference feature values around the query point from the 3D binary feature grid. However, Xu teaches extracting first reference feature values around the query point from the 3D binary feature grid (Paragraph 3 in 1st Col. of Page 3, Paragraph 1 in 2nd Col. of Page 3 – “grid-based representations encode a scene into a feature grid, which can be intuitively thought of as a 3D voxel grid with grid resolution matched with the actual 3D space. Each voxel stores a feature vector at the vertices and can then be interpolated to extract a feature value at the query point coordinate and converted to the point density and color via a small network”; Note: feature values are extracted from a 3D voxel grid in order to perform interpolation. The 3D binary feature grid was previously taught by Rho in the rejection of claim 1. In this case, the 3D binary feature grid is represented by the 3D voxel grid). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rho to incorporate the teachings of Xu to extract reference feature values around the query point from the 3D binary feature grid because having reference feature values from both the 3D binary feature grid and corresponding 2D binary feature grids would provide plentiful data to the model in order to properly and accurately estimate the feature values of the query point for rendering. Regarding claim 20, Rho in view of Xu teaches the electronic device of claim 19. Rho does not teach wherein the input feature value comprises a first input feature value and a second input feature value, and wherein, the processor is further configured to: determine the first input feature value by performing an interpolation operation based on the first reference feature values; and determine the second input feature value by performing an interpolation operation based on the second reference feature values. 
However, Xu teaches wherein the input feature value comprises a first input feature value and a second input feature value (Paragraph 3 in 1st Col. of Page 3, Paragraph 1 in 2nd Col. of Page 3, Paragraph 4 in 2nd Col. of Page 3 – “grid-based representations encode a scene into a feature grid, which can be intuitively thought of as a 3D voxel grid with grid resolution matched with the actual 3D space. Each voxel stores a feature vector at the vertices and can then be interpolated to extract a feature value at the query point coordinate and converted to the point density and color via a small network… Formally, our grid-based radiance field is written as: [equation image: the grid-based radiance field] where [the interpolated feature terms in a second equation image] are the extracted interpolated feature values from the two grid-planes at location X ∈ R^3.”; Note: the feature value interpolated from the 3D voxel grid is equivalent to the first input feature value, and the feature value interpolated from the grid-plane is equivalent to the second feature value), and determining the first input feature value by performing an interpolation operation based on the first reference feature values (Paragraph 3 in 1st Col. of Page 3, Paragraph 1 in 2nd Col. of Page 3 – “grid-based representations encode a scene into a feature grid, which can be intuitively thought of as a 3D voxel grid with grid resolution matched with the actual 3D space. Each voxel stores a feature vector at the vertices and can then be interpolated to extract a feature value at the query point coordinate and converted to the point density and color via a small network”; Note: feature values are interpolated from feature values in a 3D voxel grid); and determining the second input feature value by performing an interpolation operation based on the second reference feature values (Paragraph 4 in 2nd Col.
of Page 3 – “Formally, our grid-based radiance field is written as: [equation image: the grid-based radiance field] where [the interpolated feature terms in a second equation image] are the extracted interpolated feature values from the two grid-planes at location X ∈ R^3.”; Note: feature values are interpolated from the feature values in the 2D grid-planes). While Xu separately teaches the first and second input feature values, it would have been obvious to have both in order to more accurately approximate the features of the query point. Further, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rho to incorporate the teachings of Xu to have first and second input feature values for the benefit of having additional data to contribute to the prediction of the query point feature values, which would help make the rendering more accurate. Since Rho already teaches interpolation (Fig. 2 Caption on Page 4 – “The wavelet coefficients in each 2D grid are multiplied with a binarized mask to form a masked wavelet coefficient grid. Masked wavelet coefficients are then inverse-transformed to spatial features. We sample feature vectors for input coordinates using bilinear interpolation on the grids”), it also would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rho to incorporate the teachings of Xu to perform interpolation to obtain the first and second input feature values for the benefit of efficient and accurate prediction of the query point feature values. Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Rho in view of Xu and Ferrarini et al. (Binary Neural Networks for Memory-Efficient and Effective Visual Place Recognition in Changing Environments), hereinafter Ferrarini. Regarding claim 14, Rho in view of Xu teaches the training method of claim 8.
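The claim 19-20 combination mapped above, a first input feature value trilinearly interpolated from a 3D grid plus second input feature values interpolated from 2D planes, concatenated as input to the small rendering network, might look like the following sketch. All names, shapes, and the concatenation layout are assumptions for illustration, not details taken from the references.

```python
import numpy as np

def trilinear_sample(grid, xyz):
    """Trilinearly interpolate a scalar feature from a 3D voxel grid:
    a weighted sum over the 8 corner voxels surrounding the query point."""
    clipped = [float(np.clip(c, 0, s - 1 - 1e-6)) for c, s in zip(xyz, grid.shape)]
    lo = [int(c) for c in clipped]
    frac = [c - i for c, i in zip(clipped, lo)]
    value = 0.0
    for corner in range(8):                      # enumerate the 8 corner voxels
        weight, idx = 1.0, []
        for axis in range(3):
            bit = (corner >> axis) & 1
            weight *= frac[axis] if bit else 1.0 - frac[axis]
            idx.append(lo[axis] + bit)
        value += weight * grid[tuple(idx)]
    return value

# First input feature value: interpolated from a (binarized) 3D grid.
rng = np.random.default_rng(1)
grid3d = (rng.random((4, 4, 4)) > 0.5).astype(float)
query = np.array([1.2, 2.8, 0.5])
first_feature = trilinear_sample(grid3d, query)

# Second input feature values: interpolated from the 2D plane grids
# (stubbed here as precomputed scalars for brevity).
second_features = [0.25, 0.75, 0.5]

# Concatenated input to the small rendering network.
mlp_input = np.concatenate([[first_feature], second_features])
```

The concatenation at the end is what the motivation-to-combine argument relies on: both the 3D-grid feature and the plane features contribute data to the same prediction.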
Rho does not teach applying a sign function for forward propagation of the real-valued feature grid and the binary feature grid, and applying a substitution function of the sign function for backward propagation of the real-valued feature grid and the binary feature grid. However, Ferrarini teaches applying a sign function for forward propagation of the real-valued feature grid and the binary feature grid (Paragraph 4 in 1st Col. of Page 4 – “Training BNNs with backpropagation is not applicable as it requires a sufficient precision to allow gradient accumulation to work [41]. Courbariaux and Bengio [26] solved this problem with STE [17]. The fundamental idea of STE is that the quantization function is applied in the forward pass but skipped during backpropagation. STE keeps a set of full-precision weights denoted as proxies (WF) which are binarized (WB) on forward pass to make a prediction and compute a loss. Any function can be used as binarization function. Courbariaux and Bengio used sign function sign(x) = +1 if x ≥ 0, −1 otherwise”; Note: the full-precision weights are equivalent to real-valued features, and the binarized weights are equivalent to the binary features. The sign function is applied in forward propagation for the real-valued features to obtain binarized features. The real-valued feature grid and binary feature grid were previously taught in the rejection of claim 8), and applying a substitution function of the sign function for backward propagation of the real-valued feature grid and the binary feature grid (Paragraph 4 in 1st Col. of Page 4, Paragraph 1 in 2nd Col. of Page 4 – “The fundamental idea of STE is that the quantization function is applied in the forward pass but skipped during backpropagation.
STE keeps a set of full-precision weights denoted as proxies (WF) which are binarized (WB) on forward pass to make a prediction and compute a loss…In the backpropagation phase, WF are updated accordingly to the loss gradient as in a regular network [equation image: the proxy-weight update from the loss gradient]… In the forward pass, it behaves as the sign function performing binarization. In the backward pass, the function returns a clipped identity of the gradient. Courbariaux and Bengio [26] observed that canceling the gradient when activation (aF) exceeds 1.0 improves a model’s accuracy [equation image: the clipped identity gradient, cancelled where |aF| > 1]”; Note: the full-precision weights are equivalent to real-valued features, and the binarized weights are equivalent to the binary features. The real-valued feature grid and binary feature grid were previously taught in the rejection of claim 8. The straight-through estimator (STE) is used as a substitution function for the sign function). Because Rho already teaches binarization and STE (Paragraph 1 in 2nd Col. of Page 5 – “During training, the grid parameters and corresponding binarized masks are multiplied element by element. Since calculating gradients directly from binarized masks is not feasible, we used the straight-through-estimator technique [1] to train and use masks”), Ferrarini is merely used to supplement Rho. Ferrarini further demonstrates how binarization occurs and how STE is used in back propagation. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rho to incorporate the teachings of Ferrarini to apply a sign function for forward propagation because Rho already teaches binarization to produce an output, and a sign function is a simple way of performing binarization.
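The straight-through-estimator behavior quoted from Ferrarini, binarize with the sign function on the forward pass, then substitute a clipped identity for the sign function's zero gradient on the backward pass, reduces to a few lines. This is an illustrative sketch of STE as described above; the function names and the `clip` parameter are assumptions.

```python
import numpy as np

def sign_binarize(w_real):
    """Forward pass: binarize real-valued proxy weights with the sign
    function (+1 for w >= 0, -1 otherwise)."""
    return np.where(w_real >= 0, 1.0, -1.0)

def ste_backward(grad_binary, w_real, clip=1.0):
    """Backward pass: pass the gradient straight through as a clipped
    identity, cancelling it wherever |w_real| exceeds the clip threshold."""
    return grad_binary * (np.abs(w_real) <= clip)

w_proxy = np.array([-1.7, -0.3, 0.4, 2.1])   # full-precision proxies (WF)
w_bin = sign_binarize(w_proxy)               # binarized weights (WB): [-1., -1., 1., 1.]
grad = np.array([0.5, 0.5, 0.5, 0.5])        # loss gradient w.r.t. WB
grad_proxy = ste_backward(grad, w_proxy)     # [0., 0.5, 0.5, 0.]
```

Rho's quoted passage applies the same estimator to binarized masks rather than weights, which is why the examiner treats Ferrarini as merely supplementing Rho's disclosure.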
It also would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rho to incorporate the teachings of Ferrarini to apply a substitution function of the sign function for backward propagation because as Rho suggests, “calculating gradients directly from binarized masks is not feasible” (Paragraph 1 in 2nd Col. of Page 5), and thus, the STE function acts as a substitute to help compute the loss. Additionally, back propagation is beneficial for accurate and efficient learning in models. Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Rho in view of Xu et al. (US 20240013477 A1), hereinafter Xu 2. Regarding claim 16, Rho teaches a processor that performs the method of claim 1 (Paragraph 3 in 2nd Col. of Page 6, Paragraph 1 in 2nd Col. of Page 12 – “Figs. 1 and 5 show the quantitative performances of our method and the baselines on various datasets. Each graph displays the average PSNR and size of methods… current codes lower the GPU utilization rate from 80% (baseline) to 50%”; Note: it is implied that a processor, at least the GPU, was used to perform the method, because it would not be able to run otherwise). Rho does not teach a non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the method of claim 1. However, Xu 2 teaches a non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to perform a method (Paragraph 0026 – “The scene representation subsystem 114 and the model training subsystem 116 may be implemented using software (e.g., code, instructions, program) executed by one or more processing devices (e.g., processors, cores), hardware, or combinations thereof. The software may be stored on a non-transitory storage medium”). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rho to incorporate the teachings of Xu 2 to have a non-transitory computer-readable storage medium storing instructions for the benefit of persistent and reliable storage: storing the instructions on such a medium allows the program to be run and reused multiple times. Conclusion The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Irshad et al. (US 20240171724 A1) teaches generating neural representations of outdoor scenes using various modules, including a 3D feature grid module and feature aggregation module. Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHELLE HAU MA whose telephone number is (571)272-2187. The examiner can normally be reached M-Th 7-5:30. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, King Poon can be reached at (571) 270-0728. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format.
For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /MICHELLE HAU MA/Examiner, Art Unit 2617 /KING Y POON/Supervisory Patent Examiner, Art Unit 2617

Prosecution Timeline

May 06, 2024
Application Filed
Dec 18, 2025
Non-Final Rejection — §103
Jan 29, 2026
Interview Requested
Feb 05, 2026
Applicant Interview (Telephonic)
Feb 05, 2026
Examiner Interview Summary

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602750
DIFFERENTIABLE EMULATION OF NON-DIFFERENTIABLE IMAGE PROCESSING FOR ADJUSTABLE AND EXPLAINABLE NON-DESTRUCTIVE IMAGE AND VIDEO EDITING
2y 5m to grant Granted Apr 14, 2026
Patent 12597208
BUILDING INFORMATION MODELING SYSTEMS AND METHODS
2y 5m to grant Granted Apr 07, 2026
Patent 12573217
SERVER, METHOD AND COMPUTER PROGRAM FOR GENERATING SPATIAL MODEL FROM PANORAMIC IMAGE
2y 5m to grant Granted Mar 10, 2026
Patent 12561851
HIGH-RESOLUTION IMAGE GENERATION USING DIFFUSION MODELS
2y 5m to grant Granted Feb 24, 2026
Patent 12536734
Dynamic Foveated Point Cloud Rendering System
2y 5m to grant Granted Jan 27, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
81%
Grant Probability
99%
With Interview (+36.4%)
2y 7m
Median Time to Grant
Low
PTA Risk
Based on 21 resolved cases by this examiner. Grant probability derived from career allow rate.
