DETAILED ACTION
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-12 are rejected under 35 U.S.C. 103 as being unpatentable over Lv et al. (US 20220239844 A1) in view of Wang et al. (US 20240233062 A9).
RE claim 1, Lv teaches a computer implemented method (abstract) comprising:
receiving one or more keyframe images of a scene captured at an initial time and one or more first images of the scene captured at a first time following the initial time (Figs. 1, 4-5, [0040]), where each of the one or more keyframe images is associated with a corresponding three-dimensional (3D) camera location and camera direction included within a set of keyframe camera extrinsics and each of the one or more first frame images is associated with a corresponding 3D camera location and camera direction included within a set of first frame camera extrinsics (Figs. 1, 4-5, [0010], [0029], [0040]);
training a keyframe neural network using the one or more keyframe images and the keyframe camera extrinsics, wherein the keyframe neural network includes a plurality of common layers and an initial plurality of adaptive layers (Figs. 2-5, 7, [0006]-[0007], [0028], [0030], [0032], [0047], [0058], etc., wherein the continuous space-time representation shares time-invariant information across the entire video for each time step);
training a first frame neural network using the one or more first frame images and the first frame camera extrinsics, the first frame neural network including a first plurality of adaptive layers (Figs. 2, 4, 5B, 6-7, [0007], [0038]-[0039], [0043], [0047], [0059], wherein initial latent codes are updated and optimized for each frame), and
wherein the first frame neural network is configured to be queried to produce a first novel view of an appearance of the scene at the first time ([0028], [0030]).
As set forth above, Lv teaches a system and method for dynamic neural scene representation using a continuous space-time representation that shares time-invariant information across frames while optimizing latent codes for each frame. Lv further teaches training neural models based on images of a scene and corresponding camera extrinsics in order to generate novel views of the scene. However, Lv does not explicitly disclose that the first frame neural network includes a plurality of common layers learned during training of a keyframe neural network.
Wang, however, teaches per-frame neural network models with adaptive incremental transfer (abstract, [0250]-[0252]). Specifically, Wang teaches that a separate neural model may be trained for each frame, wherein the features learned for a given frame are reused as a starting point when training neural models for subsequent frames (Figs. 22 and 24; [0038], [0251]-[0252], [0262], [0264]). In this manner, previously learned parameters of an earlier frame model may be shared or transferred to later frame-specific models. This incremental transfer reduces the number of parameters that must be retrained for subsequent frames, thereby providing a structural basis for neural networks that include shared parameters together with frame-specific adaptive parameters, implemented as the "plurality of common layers" and "first plurality of adaptive layers" as claimed.
It would have been obvious to one of ordinary skill in the art (PHOSITA) before the effective filing date of the claimed invention to incorporate into Lv the incremental transfer techniques taught by Wang. Doing so would allow features learned from earlier frames to be reused when training neural models for later frames, thereby reducing computational training cost and improving efficiency when generating novel views of a scene across multiple time steps (e.g., t₁, t₂).
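For purposes of illustration only, the claimed arrangement under the proposed Lv/Wang combination can be sketched as a pair of models that share "common" layers while each retrains its own "adaptive" layers. The sketch below is the examiner's illustration in PyTorch-style code; the module names, layer sizes, and 6-element extrinsics format are assumptions, not teachings quoted from either reference.

```python
import torch
import torch.nn as nn

class FrameNetwork(nn.Module):
    """Examiner's sketch: shared 'common' layers plus frame-specific
    'adaptive' layers (all names and sizes are hypothetical)."""
    def __init__(self, common: nn.Sequential):
        super().__init__()
        self.common = common                    # shared across frame models
        self.adaptive = nn.Sequential(          # retrained for each frame
            nn.Linear(128 + 6, 128), nn.ReLU(),
            nn.Linear(128, 4),                  # e.g., RGB + density output
        )

    def forward(self, encoded_extrinsics, raw_extrinsics):
        shared = self.common(encoded_extrinsics)
        return self.adaptive(torch.cat([shared, raw_extrinsics], dim=-1))

# Keyframe training learns both the common and the adaptive parameters.
common = nn.Sequential(nn.Linear(60, 128), nn.ReLU(), nn.Linear(128, 128))
keyframe_net = FrameNetwork(common)

# Per Wang's incremental transfer, a later frame model reuses the learned
# common layers; only its adaptive layers remain trainable.
first_frame_net = FrameNetwork(keyframe_net.common)
for p in first_frame_net.common.parameters():
    p.requires_grad = False    # fewer parameters retrained per frame
```

On this reading, Lv supplies the per-frame training and rendering pipeline while Wang supplies the shared/frozen split that maps onto the claimed common and adaptive layers.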
RE claim 2, Lv as modified by Wang teaches further including: receiving one or more second images of the scene captured at a second time following the first time, where each of the one or more second frame images is associated with a corresponding 3D camera location and camera direction included within a set of second frame camera extrinsics; training a second frame neural network using the one or more second frame images and the second frame camera extrinsics, the second frame neural network including a second plurality of adaptive layers and the plurality of common layers learned during training of the keyframe neural network; and wherein the second frame neural network is configured to be queried to produce a second novel view of an appearance of the scene at the second time (Lv Figs. 2-7; [0006]-[0007], [0011], [0028], [0030], [0032], [0038]-[0039], [0043], [0047], [0058]-[0059] and Wang Figs. 22, 24 and 26; [0038], [0251]-[0252], [0262], [0264], wherein a second frame neural network can equally be configured for a second frame sequence, similar to the first frame neural network as taught by Lv [0011]).
RE claim 3, Lv as modified by Wang teaches wherein the training the keyframe neural network includes: passing the keyframe camera extrinsics through a predetermined function and providing an output of the predetermined function to an input of the plurality of common layers; passing the keyframe camera extrinsics into the initial plurality of adaptive layers (Lv Figs. 2-3, 6-7, [0032], [0034]-[0036], [0047], [0058]-[0060]).
RE claim 4, Lv as modified by Wang teaches wherein the training the first frame neural network includes: passing the first frame camera extrinsics through the predetermined function and providing a resulting output to an input of the plurality of common layers within the first frame neural network; passing the first frame camera extrinsics into the first plurality of adaptive layers (Lv Figs. 2-3, 6-7, [0032], [0034]-[0036], [0047], [0058]-[0060] and Wang Figs. 22 and 24; [0038], [0251]-[0252], [0262], [0264]).
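For context on claims 3-4, the recited "predetermined function" is consistent with the frequency (positional) encodings commonly applied to coordinates in neural scene representations; that reading is the examiner's assumption rather than a quotation from Lv. A minimal sketch of the two claimed input paths:

```python
import torch

def positional_encoding(x: torch.Tensor, n_freqs: int = 5) -> torch.Tensor:
    """Assumed 'predetermined function': sin/cos features of each extrinsic
    coordinate at octave frequencies (a common NeRF-style choice)."""
    feats = []
    for k in range(n_freqs):
        feats.append(torch.sin((2.0 ** k) * x))
        feats.append(torch.cos((2.0 ** k) * x))
    return torch.cat(feats, dim=-1)

# Claims 3-4 recite two input paths: the encoded extrinsics feed the common
# layers, while the raw extrinsics pass directly into the adaptive layers.
extrinsics = torch.randn(1, 6)              # 3D camera location + direction
encoded = positional_encoding(extrinsics)   # shape (1, 60) with n_freqs = 5
```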
RE claim 5, Lv as modified by Wang teaches further including initializing the first plurality of adaptive layers using information included in the initial plurality of adaptive layers (Lv Figs. 2-3, 6-7, [0032], [0034]-[0036], [0047], [0058]-[0060] and Wang Figs. 22 and 24; [0038], [0251]-[0252], [0262], [0264]).
RE claim 6, Lv as modified by Wang teaches further including initializing the second plurality of adaptive layers using information included in the first plurality of adaptive layers (Lv Figs. 2-3, 6-7, [0032], [0034]-[0036], [0047], [0058]-[0060] and Wang Figs. 22 and 24; [0038], [0251]-[0252], [0262], [0264]).
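Illustrating claims 5-6, initializing one set of adaptive layers from another amounts to ordinary parameter copying between identically shaped modules. A minimal sketch, with hypothetical layer shapes:

```python
import torch.nn as nn

def make_adaptive() -> nn.Sequential:
    """Hypothetical frame-specific adaptive layers; shapes are illustrative."""
    return nn.Sequential(nn.Linear(134, 128), nn.ReLU(), nn.Linear(128, 4))

keyframe_adaptive = make_adaptive()   # learned during keyframe training
first_adaptive = make_adaptive()
first_adaptive.load_state_dict(keyframe_adaptive.state_dict())   # claim 5

second_adaptive = make_adaptive()
second_adaptive.load_state_dict(first_adaptive.state_dict())     # claim 6
```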
RE claim 7, Lv as modified by Wang teaches wherein the training the keyframe neural network includes training a keyframe encoder element included among the initial plurality of adaptive layers (Lv Figs. 2-3, 6-7, [0032], [0034]-[0036], [0047], [0058]-[0060] and Wang Figs. 22 and 24; [0038], [0251]-[0252], [0262], [0264]).
RE claim 8, Lv as modified by Wang teaches wherein the training the first frame neural network includes training a first encoder element included among the first plurality of adaptive layers (Lv Figs. 2-3, 6-7, [0032], [0034]-[0036], [0047], [0058]-[0060] and Wang Figs. 22 and 24; [0038], [0251]-[0252], [0262], [0264]).
RE claim 9, Lv as modified by Wang teaches wherein the training the second frame neural network includes training a second encoder element included among the first plurality of adaptive layers (Lv Figs. 2-3, 6-7, [0032], [0034]-[0036], [0047], [0058]-[0060] and Wang Figs. 22 and 24; [0038], [0251]-[0252], [0262], [0264]).
RE claim 10, Lv as modified by Wang teaches further including transferring encoding information learned during training of the keyframe encoder element to a first encoder element included among the first plurality of adaptive layers (Lv Figs. 2-3, 6-7, [0032], [0034]-[0036], [0047], [0058]-[0060] and Wang Figs. 22 and 24; [0038], [0251]-[0252], [0262], [0264]).
RE claim 11, Lv as modified by Wang teaches further including transferring the encoding information learned during training of the keyframe encoder element to a second encoder element included among the second plurality of adaptive layers (Lv Figs. 2-3, 6-7, [0032], [0034]-[0036], [0047], [0058]-[0060] and Wang Figs. 22 and 24; [0038], [0251]-[0252], [0262], [0264]).
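Illustrating claims 7-11, the "encoder element" can be read as a sub-module within each group of adaptive layers whose learned weights (the "encoding information") are copied forward to later frame models. The module structure below is hypothetical:

```python
import torch.nn as nn

class AdaptiveBlock(nn.Module):
    """Hypothetical adaptive-layer group containing an 'encoder element'."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(6, 32)   # the claimed encoder element
        self.head = nn.Linear(32, 4)

    def forward(self, x):
        return self.head(self.encoder(x))

keyframe_block = AdaptiveBlock()   # trained alongside the keyframe network
first_block = AdaptiveBlock()
second_block = AdaptiveBlock()

# Claims 10-11: transfer the encoding information learned by the keyframe
# encoder element to the first and second frame encoder elements.
first_block.encoder.load_state_dict(keyframe_block.encoder.state_dict())
second_block.encoder.load_state_dict(keyframe_block.encoder.state_dict())
```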
RE claim 12, Lv as modified by Wang teaches further including: transmitting at least the keyframe neural network and the first frame neural network to a viewing device including a volume rendering element and instantiating the keyframe neural network and the first frame neural network on the viewing device as a novel view synthesis (NVS) decoder; wherein the NVS decoder is configured to be queried with coordinates corresponding to novel 3D views of the scene and to responsively generate output causing the volume rendering element to produce imagery corresponding to the novel 3D views of the scene (Lv Fig. 7, [0010], [0030], [0033], [0060]-[0061], wherein encoded information is decoded in the viewing device for effective volume rendering; Wang [0099], [0268], [0275], etc.).
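Illustrating the decoder-side operation of claim 12, querying the instantiated NVS decoder at novel-view coordinates and compositing the returned samples follows the standard volume-rendering quadrature; the sketch below assumes the decoder returns per-sample densities and colors along each queried ray (an assumption consistent with, but not quoted from, the cited passages).

```python
import torch

def volume_render(densities, colors, deltas):
    """Standard alpha-compositing quadrature over one queried ray:
    densities (N,), colors (N, 3), deltas (N,) are per-sample values."""
    alpha = 1.0 - torch.exp(-densities * deltas)
    trans = torch.cumprod(
        torch.cat([torch.ones(1), 1.0 - alpha + 1e-10]), dim=0)[:-1]
    weights = trans * alpha                  # contribution of each sample
    return (weights.unsqueeze(-1) * colors).sum(dim=0)   # rendered RGB

# The viewing device queries the instantiated NVS decoder with novel-view
# coordinates, then composites the returned samples into imagery.
densities = torch.rand(64)
colors = torch.rand(64, 3)
deltas = torch.full((64,), 0.05)
pixel_rgb = volume_render(densities, colors, deltas)
```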
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SULTANA MARCIA ZALALEE whose telephone number is (571)270-1411. The examiner can normally be reached Monday-Friday, 8:00am-4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kent Chang can be reached at (571)272-7667. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Sultana M Zalalee/ Primary Examiner, Art Unit 2614