DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of the Claims
Claims 1-6 and 9-15 are currently pending in the present application, with claims 1 and 11 being independent. Claims 16-20 are withdrawn from further consideration as being drawn to a non-elected invention.
Response to Amendments/Arguments
Applicant’s arguments with respect to claims 1-6 and 9-15 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Regarding the remaining arguments: Applicant’s arguments are directed to the amended claim language, which is fully addressed in the prior art rejections set forth below.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-6 and 10-14 are rejected under 35 U.S.C. 103 as being unpatentable over Clark, "Volumetric bundle adjustment for online photorealistic scene capture," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6124-6132, 2022, hereinafter referred to as “Clark”, in view of Liu et al., "Neural sparse voxel fields," Advances in Neural Information Processing Systems 33 (2020): 15651-15663, hereinafter referred to as “Liu”.
Regarding claim 1, Clark discloses a computing device for rendering a model volume data structure (Pg. 6126, Section 4; Our system efficiently constructs a dense volumetric representation of the scene from which photorealistic novel views can be rendered), the computing device comprising:
a processor coupled to a storage medium that stores instructions, which, upon execution by the processor, cause the processor to (Pg. 6128; We run all our experiments on a machine with an NVIDIA A6000 GPU and an AMD Ryzen 3990X CPU):
store the model volume data structure, wherein the model volume data structure comprises a B+ tree graph (Fig. 1 and Pg. 6125, Section 1; A neural volumetric dynamic B+Tree, called nVDB, that can efficiently represent 3D scenes and can grow as more areas of the scene are explored),
determine a camera view in which to render the model volume data structure (Fig. 1 and Section 4.2; We use volume rendering to synthesize views of the scene from a particular viewpoint given by the camera pose, Tj),
determine a plurality of rays in three-dimensional space based on the camera view (Section 4.2; For each pixel in the target image, we generate the corresponding ray xo and direction v),
for each ray in the plurality of rays, determine a color value by (Section 4.1; We construct a novel continuous 3D representation of a scene that maps each point and viewing direction to a color and opacity value, F0: (V(X),v) -> (c, σ),∀x ∈ V… Section 4.2; The color of the pixel corresponding to the ray is computed by raymarching through the volume…The output of the ray-marching is a single color value for the pixel):
Clark does not appear to explicitly disclose performing a first ray-marching process by:
determining a plurality of sample points along the ray; and determining a set of valid points from the plurality of sample points based at least upon density values of the plurality of sample points; and performing a second ray-marching process by:
querying the model volume data structure using the set of valid points to determine a plurality of color features; and determining a color value using a multilayer perceptron and the plurality of color features; and render the model volume data structure using the color values of the plurality of rays.
In the same art of neural volumetric rendering, Liu discloses performing a first ray-marching process (Section 3.2; Ray-voxel Intersection…Fig. 8; For any given camera position p0 and the ray direction v, we first intersect the ray with a set of sparse voxels…) by:
determining a plurality of sample points along the ray (Section 3.2; We return the color C (p0, v) by sampling points along a ray using Eq. (2) …Section A.1, Algorithm 1; …Stratified sampling: z1,…zm, with step size T, where …); and
determining a set of valid points from the plurality of sample points based at least upon density values of the plurality of sample points (Section 3.1; c and σ are the color and density of the 3D point p, v is ray direction…Section 3.2; As shown in Figure 1 (c), we create a set of query points using rejection sampling based on sparse voxels); and
performing a second ray-marching process (Section 3.2; Ray Marching inside voxels…Fig. 8; For any given camera position p0 and the ray direction v, we first…then predict the colors and densities with neural networks for points sampled along the ray inside voxels, and accumulate the colors and densities of the sampled points to get the rendered color C (p0, v)) by:
querying the model volume data structure using the set of valid points to determine a plurality of color features (Fig. 2 and Section 3.3 Self-Pruning; self-pruning strategy to effectively remove non-essential voxels during training based on the coarse geometry information…using model's prediction on density. Fig. 9; For any input (p,v), the model first obtains the feature representation by querying and interpolating the voxel embeddings with the 8 corresponding voxel vertices, and then uses the computed feature to further predicts (σ,c) using a MLP shared by all voxels); and
determining a color value using a multilayer perceptron and the plurality of color features (Section 3.1; Each Fiθ is modeled as a multi-layer perceptron (MLP) with shared parameters θ: Eq. (3). Fig. 9; predicts (σ,c) using a MLP shared by all voxels); and
render the model volume data structure using the color values of the plurality of rays (Section 3.2; Rendering is performed in two steps: (1) ray-voxel intersection; and (2) ray-marching inside voxels…Fig. 8-9).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate Liu’s density-based pruning and sparse voxel sampling technique into Clark’s volumetric rendering system. Doing so reduces unnecessary sampling and improves rendering efficiency. Clark’s ray-marching approach evaluates sample points along rays to determine a color value, while Liu provides a known technique for restricting computation to relevant spatial regions by avoiding evaluation of empty or low-contribution regions. Integrating Liu’s approach into Clark would therefore have yielded predictable results: reduced redundant sampling operations, improved cache and memory efficiency, and accelerated rendering performance while maintaining rendering quality in neural volumetric rendering frameworks (Liu Section 4.2 Results).
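For illustration only, the two-pass ray-marching flow discussed above can be sketched in a few lines of Python. This is not code from Clark or Liu; the density and feature functions are hypothetical stand-ins for queries into the model volume data structure, and a trained MLP is replaced by a placeholder reduction.

```python
import numpy as np

def density(points):
    # Hypothetical density query: nonzero only inside the unit sphere.
    return np.where(np.linalg.norm(points, axis=-1) < 1.0, 1.0, 0.0)

def color_feature(points):
    # Hypothetical feature lookup; here just the point coordinates.
    return points

def render_ray(origin, direction, n_samples=64, sigma_min=1e-3):
    # First pass: sample points along the ray and keep only those
    # whose queried density exceeds a minimum (the "valid" points).
    t = np.linspace(0.0, 4.0, n_samples)
    samples = origin + t[:, None] * direction
    valid = samples[density(samples) > sigma_min]
    if len(valid) == 0:
        return np.zeros(3)  # background color for rays that miss
    # Second pass: query features for the valid points only; a trained
    # MLP would map features to a color, replaced here by a mean.
    return color_feature(valid).mean(axis=0)

c = render_ray(np.array([-2.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0]))
```

The point of the split is that the (cheap) density pass prunes the sample set before the (expensive) feature/MLP pass runs.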
Regarding claim 2, Clark in view of Liu discloses the computing device of claim 1, and further discloses wherein the model volume data structure is generated by a machine learning model (Clark Pg. 6126, Section 4.1; Our representation combines a VDB tree with a neural network interpolator, which we call an nVDB tree…The function F(θ), projects the features sampled from the VDB grid to the color and opacity outputs. This projection is modelled using a shallow fully-connected neural network. Pg. 6127 and Fig. 3; During rendering the features are sampled from the volume using trilinear interpolation and a shallow MLP is used to project these features to color and occupancy values…once all the rays are rendered).
Clark and Liu are combined for the reason set forth above with respect to claim 1.
Regarding claim 3, Clark in view of Liu discloses the computing device of claim 2, and further discloses wherein the machine learning model uses a plurality of two-dimensional images to generate the model volume data structure (Clark Section 4; Our system takes as input a sequence of images I(i), a rough estimate of the depth D(i), at each frame and camera poses…efficiently constructs a dense volumetric representation of the scene from which photorealistic novel views can be rendered).
Clark and Liu are combined for the reason set forth above with respect to claim 1.
Regarding claim 4, Clark in view of Liu discloses the computing device of claim 1, but Clark does not disclose wherein determining the set of valid points comprises: querying the model volume data structure to determine the density values for the plurality of sample points; and determining the set of valid points from the plurality of sample points by removing sample points having a density value below a predetermined threshold.
In the same art of neural volumetric rendering, Liu discloses wherein determining the set of valid points comprises:
querying the model volume data structure to determine the density values for the plurality of sample points (Fig. 8-9 and Section 3.3; where {pj}G | j=1 are G uniformly sampled points inside the voxel Vi (G = 163 in our experiments), σ (gi (pj)) is the predicted density at point pj…); and
determining the set of valid points from the plurality of sample points by removing sample points having a density value below a predetermined threshold (Section 3.3; Vi is pruned if Eq. (6) …where {pj}G | j=1 are G uniformly sampled points inside the voxel Vi (G = 163 in our experiments) …γ is a threshold (γ = 0.5 in all our experiments)).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate Liu’s density-threshold-based pruning into Clark’s volumetric rendering system. Doing so would be a straightforward optimization to reduce the number of evaluated sample points and improve computational efficiency, yielding predictable benefits in improved performance in rendering (Liu Section 4.2 Results).
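The density-threshold filtering referenced above can be illustrated with a minimal, hypothetical sketch; the threshold value and the sample arrays are invented for the example and are not Liu’s actual parameters:

```python
import numpy as np

def prune_by_density(points, densities, threshold=0.5):
    """Return only the sample points whose density meets the threshold."""
    keep = densities >= threshold
    return points[keep]

# Three hypothetical sample points with queried density values.
pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
dens = np.array([0.9, 0.1, 0.7])
valid = prune_by_density(pts, dens, threshold=0.5)
```

Here the middle point falls below the threshold and is removed before any color computation.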
Regarding claim 5, Clark in view of Liu discloses the computing device of claim 4, but Clark does not disclose wherein determining the set of valid points further comprises removing sample points in the plurality of sample points that are outside a bounding box of a spatial representation of the model volume data structure.
In the same art of neural volumetric rendering, Liu discloses wherein determining the set of valid points further comprises removing sample points in the plurality of sample points that are outside a bounding box of a spatial representation of the model volume data structure (Section 3.1; We assume that the relevant non-empty parts of a scene are contained within a set of sparse (bounding) voxels V = {V1…Vk}, and the scene is modeled as a set of voxel-bounded implicit functions…Section 3.2; We first apply Axis Aligned Bounding Box intersection test (AABB-test)…Section 3.3; Voxel initialization…initial set of voxels subdividing an initial bounding box (with volume V)).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate Liu’s bounding-volume spatial restriction into Clark’s volumetric rendering system. Doing so limits sampling to a defined spatial region containing relevant scene content, avoiding evaluation of points outside the modeled scene space. Applying such spatial constraints would have been a straightforward optimization to eliminate unnecessary computations outside the scene bounds, yielding predictable results in improved rendering efficiency (Liu Section 4.2 Results).
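A minimal sketch of the bounding-box restriction, assuming hypothetical box bounds and sample points (not taken from either reference):

```python
import numpy as np

def inside_aabb(points, box_min, box_max):
    """Boolean mask of points inside the axis-aligned box [box_min, box_max]."""
    return np.all((points >= box_min) & (points <= box_max), axis=-1)

# Hypothetical scene bounds and two sample points: one inside, one outside.
box_min = np.array([-1.0, -1.0, -1.0])
box_max = np.array([1.0, 1.0, 1.0])
pts = np.array([[0.5, 0.5, 0.5], [2.0, 0.0, 0.0]])
mask = inside_aabb(pts, box_min, box_max)
pts_in = pts[mask]
```

Only points passing the test are carried forward into the valid set.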
Regarding claim 6, Clark in view of Liu discloses the computing device of claim 4, but Clark does not disclose wherein determining the set of valid points further comprises removing sample points along the ray after an accumulated density threshold is reached.
In the same art of neural volumetric rendering, Liu discloses wherein determining the set of valid points further comprises removing sample points along the ray after an accumulated density threshold is reached (Section 3.2 Early Termination; use a heuristic and stop evaluating points earlier when the accumulated transparency A (p0, v) drops below a certain threshold e…Section 3.3 Self-Pruning…).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate Liu’s early termination technique based on accumulated transparency into Clark’s volumetric rendering system. Doing so would be a predictable optimization to reduce redundant sampling along rays, thereby improving rendering speed and computational efficiency (Liu Section 4.2 Results).
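The early-termination heuristic can be sketched as follows; the densities, step size, and cutoff are illustrative values chosen for the example, not Liu’s actual parameters:

```python
import numpy as np

def march_with_early_exit(densities, colors, step=0.1, eps=1e-2):
    """Accumulate color along a ray, stopping once transmittance is negligible."""
    transmittance = 1.0
    out = np.zeros(3)
    evaluated = 0
    for sigma, c in zip(densities, colors):
        alpha = 1.0 - np.exp(-sigma * step)   # opacity of this segment
        out += transmittance * alpha * c
        transmittance *= 1.0 - alpha
        evaluated += 1
        if transmittance < eps:               # early termination
            break
    return out, evaluated

# A very dense red medium: transmittance saturates almost immediately,
# so the remaining samples along the ray are never evaluated.
dens = np.full(100, 50.0)
cols = np.tile(np.array([1.0, 0.0, 0.0]), (100, 1))
color, n_evaluated = march_with_early_exit(dens, cols)
```

The saving is the difference between `n_evaluated` and the full sample count.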
Regarding claim 10, Clark in view of Liu discloses the computing device of claim 2, and further discloses wherein leaf nodes of the B+ tree graph correspond to voxels (Clark Pg. 6126, Section 3; VDB trees also represent voxels as the leaf nodes…the structure of the VDB tree makes it possible to efficiently access voxel values), and wherein the B+ tree graph is arranged based on spatial locations of the voxels (Clark Fig. 1 and Fig. 3; spatial features).
Clark and Liu are combined for the reason set forth above with respect to claim 1.
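As a toy illustration of leaves keyed by spatial location (a simplified stand-in for the VDB/B+ tree arrangement discussed above; the two-level layout and leaf size are hypothetical, not Clark’s actual structure):

```python
LEAF = 4  # each hypothetical leaf covers a 4x4x4 block of voxels

def insert(tree, coord, value):
    """Store a voxel value under a coarse spatial branch key and a local offset."""
    branch_key = tuple(c // LEAF for c in coord)      # coarse spatial key
    leaf = tree.setdefault(branch_key, {})            # leaf node for that block
    leaf[tuple(c % LEAF for c in coord)] = value

def lookup(tree, coord, default=0.0):
    """Fetch a voxel value; empty regions are rejected at the branch level."""
    branch_key = tuple(c // LEAF for c in coord)
    leaf = tree.get(branch_key)
    if leaf is None:
        return default                                # empty region: cheap reject
    return leaf.get(tuple(c % LEAF for c in coord), default)

tree = {}
insert(tree, (5, 1, 2), 0.8)
v = lookup(tree, (5, 1, 2))
```

Because branch keys are derived from spatial coordinates, nearby voxels land in the same leaf, which is the spatial arrangement the claim language describes.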
Regarding claim 11, claim 11 is the method claim corresponding to system claim 1 and is accordingly rejected using substantially the same rationale as set forth above with respect to claim 1.
Regarding claim 12, Clark in view of Liu discloses the method of claim 11, and further discloses wherein the model volume data structure is generated by a machine learning model (Clark nVDB) using a plurality of two-dimensional images (Clark Section 4; Our system takes as input a sequence of images I(i), a rough estimate of the depth D(i), at each frame and camera poses…efficiently constructs a dense volumetric representation of the scene from which photorealistic novel views can be rendered).
Clark and Liu are combined for the reason set forth above with respect to claim 1.
Regarding claim 13, claim 13 recites limitations similar to those of claim 4 in method form and is therefore rejected under the same rationale as claim 4.
Regarding claim 14, claim 14 recites limitations similar to those of claims 5 and 6 in method form and is therefore rejected under the same rationale as claims 5 and 6.
Claims 9 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Clark, "Volumetric bundle adjustment for online photorealistic scene capture," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6124-6132, 2022, in view of Liu et al., "Neural sparse voxel fields," Advances in Neural Information Processing Systems 33 (2020): 15651-15663, hereinafter referred to as “Liu”, and in further view of Lightstone et al. (US 7,028,022), hereinafter referred to as “Lightstone”.
Regarding claim 9, Clark in view of Liu discloses the computing device of claim 1, but does not disclose wherein the B+ tree graph has a height of four.
In the same art of B+ trees, Lightstone discloses wherein the B+ tree graph (Column 7, lines 53-55; index 14 is stored in the form of a binary tree, such as a B-tree, B+-tree, or B*-tree) has a height of four (Column 8, lines 40-53, T4; In the table, T2...T8 represent the percentile thresholds for tree heights 2 to 8).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the B+ tree of Clark to incorporate the B+ tree graph having a height of four as taught by Lightstone. The motivation lies in the advantage of selecting an appropriate tree height based on memory usage and indexing efficiency. Furthermore, in accordance with MPEP 2144.05, the claimed value of “four” falls within the disclosed prior art range and therefore would have achieved results consistent with those expected from the disclosed range. Selecting a height of four from this known range would have yielded predictable results using routine optimization.
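The height-versus-capacity trade-off underlying the routine-optimization rationale above can be illustrated with simple arithmetic; the fanout of 16 is a hypothetical value chosen for the example, not taken from Lightstone:

```python
def leaf_capacity(fanout, height):
    """Number of leaf slots reachable in a tree with the given fanout and height."""
    return fanout ** height

# With a hypothetical per-node fanout of 16, a height-four tree addresses
# 16**4 = 65,536 leaf slots, versus 256 for a height-two tree.
cap4 = leaf_capacity(16, 4)
cap2 = leaf_capacity(16, 2)
```

Greater height means more addressable voxels per tree at the cost of deeper traversals, which is the memory-versus-indexing trade-off cited in the motivation.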
Regarding claim 15, Clark in view of Liu discloses the method of claim 11, and further discloses leaf nodes of the B+ tree graph correspond to voxels (Clark Pg. 6126, Section 3; VDB trees also represent voxels as the leaf nodes...the structure of the VDB tree makes it possible to efficiently access voxel values), and the B+ tree graph is arranged based on spatial locations of the voxels (Clark Fig. 1 and Fig. 3; spatial features).
Clark in view of Liu does not disclose wherein the B+ tree graph has a height of four.
In the same art of B+ trees, Lightstone discloses wherein the B+ tree graph (Column 7, lines 53-55; index 14 is stored in the form of a binary tree, such as a B-tree, B+-tree, or B*-tree) has a height of four (Column 8, lines 40-53, T4; In the table, T2...T8 represent the percentile thresholds for tree heights 2 to 8).
The motivation to combine would have been the same as that set forth above with respect to claim 9.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JENNY NGAN TRAN whose telephone number is (571)272-6888. The examiner can normally be reached Mon-Thurs 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alicia Harrington, can be reached at (571) 272-2330. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JENNY N TRAN/Examiner, Art Unit 2615
/ALICIA M HARRINGTON/Supervisory Patent Examiner, Art Unit 2615