DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on January 8, 2024, is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Response to Arguments
This is in response to applicant’s amendment/response filed on 12/30/2025, which has been entered and made of record.
Applicant’s arguments regarding claim rejections under 35 U.S.C. 112(a) have been fully considered and are persuasive. The claim rejections under 35 U.S.C. 112(a) have been withdrawn.
Applicant’s arguments regarding claim rejections under 35 U.S.C. 103 have been fully considered but they are not persuasive.
Applicant argues the Martin-Brualla and Shen references fail to overcome the deficiencies of Chen and Moreau as applied to the independent claims. The remaining §103 rejections of claims 3 and 4 are therefore also respectfully traversed.
Notwithstanding the foregoing traversals, Applicant has amended the claims without prejudice and solely in order to expedite prosecution. The claim amendments herein are not made for reasons relating to patentability over the Chen, Moreau, Martin-Brualla and Shen references, or any other prior art references of record, as the claims as originally presented are believed to recite patentable subject matter over these references.
More particularly, independent claim 1 has been amended to recite as follows:
[media_image1.png (532 × 667, greyscale): text of amended independent claim 1]
Examiner respectfully disagrees. Chen teaches a three-dimensional scene model (Page 9 Para. 9 and Page 2 Section: Contents of the Invention Para. 2) that utilizes a 10-layer neural network (Page 10 Para. 4). The three-dimensional scene model obtains an image based on any viewing angle (Visual View/Angle, Page 9 Para. 10, Page 10 Para. 9, and Page 5 Para. 4) and the camera pose (Page 10 Para. 2-3). Thus, Chen teaches wherein the three-dimensional scene model comprises a first neural network.
Moreau teaches two neural networks: a CNN (Page 3 Lines 13-21) for feature extraction and a neural radiance field that has a multilayer perceptron (MLP) (Page 3 Lines 22-28). Moreau’s multilayer perceptron utilizes the features (Page 21 Lines 6-21 and Page 10 Lines 15-26) from the CNN and color/density values (Page 3 Lines 29-32 and Page 4 Lines 1-11) to generate images. Thus, Moreau teaches inputting the target compressed image feature, the density values, and the color values to the third neural network, and outputting, from the third neural network, at least a portion of the target image.
Martin-Brualla teaches a neural radiance field that has three multilayer perceptrons (Fig. 3 Page 7212 or Annotated Fig. 3 below), which are each neural networks. Thus, Martin-Brualla teaches the three-dimensional scene model comprises first, second and third neural networks. Martin-Brualla’s first neural network (Red MLP) utilizes XYZ positions and outputs density values (Fig. 3 Page 7212 or Annotated Fig. 3 below). Thus, Martin-Brualla teaches inputting the target camera pose to the first neural network, and outputting, from the first neural network, density values corresponding to the target camera pose. Martin-Brualla’s second neural network (Green MLP) utilizes the output from the first neural network, angles (Viewing Direction), and image features (Appearance Embeddings) to output color values (Fig. 3 Page 7212 or Annotated Fig. 3 below). Thus, Martin-Brualla teaches inputting the target view angle and the density values to the second neural network, and outputting, from the second neural network, color values corresponding to the target view angle.
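For illustration only, the three-network arrangement discussed above can be sketched in Python/PyTorch. The module names, layer widths, and wiring below are a hypothetical rendering of the claim language, not code from Chen, Moreau, or Martin-Brualla:

```python
# Hypothetical sketch of the claimed three-network pipeline (not taken from
# any cited reference): the first network maps the target camera pose to
# density values, the second maps the target view angle plus the density
# features to color values, and the third combines a compressed image feature
# with the density and color values to produce at least a portion of the image.
import torch
import torch.nn as nn

class ThreeNetworkSceneModel(nn.Module):
    def __init__(self, pose_dim=3, angle_dim=3, feat_dim=64, hidden=256):
        super().__init__()
        # First neural network: target camera pose -> density (plus features).
        self.density_net = nn.Sequential(
            nn.Linear(pose_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden + 1),
        )
        # Second neural network: target view angle + density features -> color.
        self.color_net = nn.Sequential(
            nn.Linear(angle_dim + hidden, hidden // 2), nn.ReLU(),
            nn.Linear(hidden // 2, 3),
        )
        # Third neural network: compressed feature + density + color -> image portion.
        self.image_net = nn.Sequential(
            nn.Linear(feat_dim + 1 + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, target_pose, target_view_angle, compressed_feature):
        h = self.density_net(target_pose)
        density, feats = h[..., :1], h[..., 1:]
        color = self.color_net(torch.cat([target_view_angle, feats], dim=-1))
        pixel = self.image_net(torch.cat([compressed_feature, density, color], dim=-1))
        return density, color, pixel
```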
The remaining arguments, which applicant makes with respect to the amended claim language, are fully addressed in the prior art rejections set forth below.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-3, 5-12, and 14-20 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al., Chinese Patent Application CN 115100339 A (hereinafter Chen), in view of Moreau et al., WIPO International Application WO 2024099593 A1 (hereinafter Moreau), in further view of NPL “NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections” by Ricardo Martin-Brualla, Noha Radwan, Mehdi S. M. Sajjadi, Jonathan T. Barron, Alexey Dosovitskiy, and Daniel Duckworth (hereinafter Martin-Brualla).
Regarding claim 1, Chen teaches a method for image generation (Model Generating Module, Page 2 Section: Contents of the Invention, Para. 3) for a particular view angle (Target Visual Angle), the method comprising:
acquiring a three-dimensional scene model (Page 9 Para. 9 and Page 2 Section: Contents of the Invention Para. 2), a target camera pose, and a target view angle (Visual View/Angle) corresponding to a target scene; (Page 9 Para. 10 and Page 10 Para. 9)
determining a target (Page 14, Para. 1 and Page 10, Para. 4) corresponding to the target camera pose and the target view angle (Visual View/Angle, Page 9 Para. 10 and Page 10 Para. 9) from a plurality of compressed image features; (Page 11 Para. 5)
and inputting the target camera pose, the target view angle (Visual View/Angle, Page 9 Para. 10 and Page 10 Para. 9), and the target (Page 14 Para. 1 and Page 10, Para. 4) to the three-dimensional scene model (Page 9 Para. 9 and Page 2 Section: Contents of the Invention Para. 2), and obtaining a target image corresponding to the target camera pose and the target view angle through rendering by the three-dimensional scene model; (Page 5 Para. 4)
wherein the three-dimensional scene model (Page 9 Para. 9 and Page 2 Section: Contents of the Invention Para. 2) comprises a first neural network (10-Layer Neural Network, Page 10 Para. 4), and obtaining the target image corresponding to the target view angle (Visual View/Angle, Page 9 Para. 10 and Page 10 Para. 9) through rendering by the three-dimensional scene model (Page 9 Para. 9 and Page 2 Section: Contents of the Invention Para. 2) comprises: Chen teaches a three-dimensional scene model that utilizes a 10-layer neural network (Page 10 Para. 4). The three-dimensional scene model obtains an image based on any viewing angle (Visual View/Angle, Page 9 Para. 10, Page 10 Para. 9, and Page 5 Para. 4) and the camera pose (Page 10 Para. 2-3).
However, Chen fails to teach:
the Compressed Image Features;
the three-dimensional scene model comprises first, second and third neural networks,
inputting the target camera pose to the first neural network, and outputting, from the first neural network, density values corresponding to the target camera pose;
inputting the target view angle and the density values to the second neural network, and outputting, from the second neural network, color values corresponding to the target view angle;
and inputting the target compressed image feature, the density values and the color values to the third neural network, and outputting, from the third neural network, at least a portion of the target image.
Chen and Moreau are analogous to the claimed invention because both are in the same field of scene generation using neural radiance field models.
Moreau teaches:
Compressed Image Features (Feature/Descriptors/Embeddings, Page 21 Lines 6-21 and Page 10 Lines 15-26). Compressing feature maps is well known, as feature maps can be dense and contain extensive information.
the three-dimensional scene model comprises first (CNN, Page 3 Lines 13-21) and third (Multilayer Perceptron (MLP), Page 3 Lines 22-28; referred to as the second neural network in Moreau) neural networks;
and inputting the target compressed image feature (Feature/Descriptors/Embeddings, Page 10 Lines 15-26), the density values, and the color values to the third neural network (Page 3 Lines 29-32 and Page 4 Lines 1-11), and outputting, from the third neural network, at least a portion of the target image (Page 6 Lines 8-27). Moreau teaches two neural networks: a CNN (Page 3 Lines 13-21) for feature extraction and a neural radiance field that has a multilayer perceptron (MLP) (Page 3 Lines 22-28). Moreau’s multilayer perceptron utilizes the features (Page 21 Lines 6-21 and Page 10 Lines 15-26) from the CNN and color/density values (Page 3 Lines 29-32 and Page 4 Lines 1-11) to generate images.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Chen’s three-dimensional model features to incorporate Moreau’s compressed image features and multilayer perceptron, since doing so would provide the benefit of reducing the storage that the image features require in hardware or in the cloud, as well as the benefit of generating images of the environment from density values, color values, and image features by utilizing multilayer perceptrons, which excel at pattern recognition.
However, Chen and Moreau fail to teach:
the three-dimensional scene model comprises first, second and third neural networks,
inputting the target camera pose to the first neural network, and outputting, from the first neural network, density values corresponding to the target camera pose;
inputting the target view angle and the density values to the second neural network, and outputting, from the second neural network, color values corresponding to the target view angle;
Chen, Moreau, and Martin-Brualla are analogous to the claimed invention because all of them are in the same field of scene generation using neural radiance fields.
Martin-Brualla teaches:
the three-dimensional scene model comprises first (Red MLP), second (Green MLP) and third (Blue MLP) neural networks (MLPs – Multi-Layer Perceptrons, see Annotated Fig. 3 below and Page 7211 Section: Neural Rendering). Martin-Brualla teaches a neural radiance field that has three multilayer perceptrons (Fig. 3 Page 7212 or Annotated Fig. 3 below), which are each neural networks.
[media_image2.png (332 × 407, greyscale): Annotated Fig. 3 of Martin-Brualla, labeling the Red, Green, and Blue MLPs]
inputting the target camera pose (XYZ Position) to the first neural network (Red MLP), and outputting, from the first neural network, density values corresponding to the target camera pose; (Annotated Fig. 3 above) Martin-Brualla’s first neural network (Red MLP) utilizes XYZ positions and outputs density values (Fig. 3 Page 7212 or Annotated Fig. 3 above).
inputting the target view angle (Viewing Direction Angles) and the density values to the second neural network (Green MLP), and outputting, from the second neural network, color values (RGB Color) corresponding to the target view angle; (Annotated Fig. 3 above) Martin-Brualla’s second neural network (Green MLP) utilizes the output from the first neural network, camera angles (Viewing Direction), and image features (Appearance Embeddings) to output color values (Fig. 3 Page 7212 or Annotated Fig. 3 above).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Chen’s neural radiance field, as altered by Moreau’s multilayer perceptron, to incorporate Martin-Brualla’s specific multi-layer perceptrons, since doing so would provide the benefit of generating scenes from images that contain moving objects and different illumination levels. (Martin-Brualla, Page 7213 Section: 4. NeRF in the Wild Col. 1)
Regarding claim 2, Chen teaches the method according to claim 1, wherein the three-dimensional scene model comprises a neural radiance field model (Page 9 Para. 10 and Page 10 Para. 1).
Regarding claim 3, Chen and Moreau fail to teach the method according to claim 1, wherein the first, second and third neural networks comprise respective first, second and third multi-layer perceptrons.
However, Martin-Brualla teaches the method according to claim 1, wherein the first, second and third neural networks comprise respective first, second and third multi-layer perceptrons (Fig. 3 and Page 7211 Section: Neural Rendering; see also the Blue, Green, and Red multi-layer perceptrons in Annotated Fig. 3 above). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Chen’s 3D scene model, as altered by Moreau, to incorporate Martin-Brualla’s multi-layer perceptrons, since doing so would provide the benefit of leveraging neural networks to accomplish the pattern recognition and prediction tasks often found in generating 3D scenes. Moreover, multi-layer perceptrons are a form of neural network and are used to model nonlinear relations in data.
Regarding claim 5, Chen teaches the method according to claim 1, wherein said acquiring a target camera pose and a target view angle comprises:
acquiring a camera movement trajectory (Roll Angle, Yaw Angle, Pitch Angle, Page 10 Para. 2) for the target scene;
and generating the target camera pose and the target view angle by interpolating the camera movement trajectory. (Page 10 Para. 2 and Page 11 Para. 3) Images can be collected at various heights around a scene while moving the image-collecting device to acquire different view angles.
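For concreteness, the interpolation step may be pictured with the following minimal sketch; the linear blending of positions and of roll/yaw/pitch angles is an assumption, not a method taught by Chen:

```python
# Illustrative only: generate intermediate camera poses and view angles by
# interpolating a recorded camera movement trajectory. A real system would
# handle angle wraparound (e.g., via quaternion slerp); this naive per-angle
# blend is just to make the claimed step concrete.
import numpy as np

def interpolate_trajectory(positions, angles, steps_between=4):
    """positions: (N, 3) camera centers; angles: (N, 3) roll/yaw/pitch."""
    poses, views = [], []
    for i in range(len(positions) - 1):
        for t in np.linspace(0.0, 1.0, steps_between, endpoint=False):
            poses.append((1 - t) * positions[i] + t * positions[i + 1])
            views.append((1 - t) * angles[i] + t * angles[i + 1])
    poses.append(positions[-1])
    views.append(angles[-1])
    return np.stack(poses), np.stack(views)
```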
Regarding claim 6, Chen teaches the method according to claim 1, further comprising:
acquiring an image sample sequence comprising a plurality of image samples obtained by photographing the same scene at different positions using different view angles; (Page 6 Para. 6 and Page 11 Para. 3)
constructing the three-dimensional scene model, which is configured with training parameters (Network Parameter); (Page 10 Para. 3)
inputting the image samples separately to the three-dimensional scene model to obtain a corresponding rendered image outputted from the three-dimensional scene model (Page 11, Para. 5), the rendered image being obtained through rendering according to compressed image features (Page 14 Para. 1) corresponding to the image samples; (Page 10 Para. 3 and Page 14 Para. 2) A training unit and a sampling unit are used to train the three-dimensional scene model. Images can be inputted in groups or by scene image. A minimal illustrative sketch of this training loop is provided after this rejection.
and adjusting the training parameters (Network Parameters) iteratively based on differences between the rendered image and the sample images (Scene Image) until the differences satisfy a preset requirement. (Page 10 Para. 3)
However, Chen fails to teach the Compressed Image Features.
Moreau teaches Compressed Image Features (Feature/Descriptors/Embeddings, Page 21 Lines 6-21 and Page 10 Lines 15-26). Compressing feature maps is well known, as feature maps can be dense and contain extensive information. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Chen’s three-dimensional model features to incorporate Moreau’s compressed image features, since doing so would provide the benefit of reducing the storage that the image features require in hardware or in the cloud.
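To make the iterative training recited in claim 6 concrete, the following minimal sketch renders each sample view, measures the difference against the sample image, and adjusts the parameters until a preset requirement is met. The loss, threshold, and optimizer are placeholders, not Chen’s actual training unit, and the model is assumed to have the three-network interface sketched earlier:

```python
# Minimal training-loop sketch: adjust network parameters iteratively based on
# the difference between the rendered image and the sample image until the
# difference satisfies a preset requirement.
import torch

def train(model, image_samples, poses, view_angles, features,
          threshold=1e-3, max_iters=10000, lr=5e-4):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for step in range(max_iters):
        idx = step % len(image_samples)
        _, _, rendered = model(poses[idx], view_angles[idx], features[idx])
        # Photometric difference between rendered image and sample image.
        loss = torch.mean((rendered - image_samples[idx]) ** 2)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if loss.item() < threshold:  # preset requirement satisfied
            break
    return model
```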
Regarding claim 7, Chen teaches the method according to claim 6, wherein the three-dimensional scene model comprises a feature extraction module (Deep Feature Extraction, Page 11 Para. 5) and a neural radiance field module, and said inputting the image samples separately to the three-dimensional scene model to obtain a corresponding rendered image outputted from the three-dimensional scene model comprises: (Page 10 Para. 3 and Page 14 Para. 2)
determining position embeddings corresponding to the image samples by encoding the image samples using the feature extraction module; (Deep Feature Extraction, Page 11 Para. 5)
and inputting camera poses, view angles (Visual View/Angle), and the compressed image features corresponding to the image samples to the neural radiance field module, and obtaining the corresponding rendered image (Page 5 Para. 4) through rendering by the neural radiance field module. (Page 9 Para. 10 and Page 10 Para. 1)
However, Chen fails to teach:
Compressed Image Features.
obtaining the compressed image features by compressing the position embeddings using the feature extraction module based on a preset compression coefficient;
Moreau teaches:
Compressed Image Features. (Features/Descriptors/Embeddings, Page 21 Lines 6-21) Compressing feature maps is well known as feature maps can be dense and can have extensive information.
obtaining the compressed image features (Feature/Descriptors/Embeddings, Page 21 Lines 6-21 and Page 10 Lines 15-26) by compressing the position embeddings using the feature extraction module (Feature/Descriptor Extractor, Page 10 Lines 15-26 and Page 13 Lines 15-25) based on a preset compression coefficient; (Page 21, Lines 6-21) Methods for compressing features extracted from images are well known. A preset compression coefficient can be any value that dictates the degree of compression for the compression method; any compression method will have such a coefficient, since compression methods compress data to a specific ratio. An illustrative sketch of this step is provided after this rejection.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Chen’s three-dimensional model features to incorporate Moreau’s compressed image features, since doing so would provide the benefit of reducing the storage that the image features require in hardware or in the cloud.
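As an illustrative sketch only, a preset compression coefficient can be realized as a fixed ratio between the embedding width and the compressed width; the linear projection below is an assumption, not Moreau’s disclosed compression scheme:

```python
# Hypothetical feature compressor: the output width equals the input width
# scaled by a preset compression coefficient.
import torch
import torch.nn as nn

class FeatureCompressor(nn.Module):
    def __init__(self, embed_dim=512, compression_coefficient=0.25):
        super().__init__()
        compressed_dim = max(1, int(embed_dim * compression_coefficient))
        self.proj = nn.Linear(embed_dim, compressed_dim)

    def forward(self, position_embeddings):
        # (batch, embed_dim) -> (batch, compressed_dim)
        return self.proj(position_embeddings)
```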
Regarding claim 8, Chen teaches the method according to claim 6, further comprising:
acquiring a camera movement trajectory (Roll Angle, Yaw Angle, Pitch Angle, Page 10 Para. 2) after training of the three-dimensional scene model is completed; (Page 6 Para. 6)
However, Chen fails to explicitly teach:
and storing the camera movement trajectory, the compressed image features corresponding to the plurality of image samples, and the trained three-dimensional scene model.
Moreau teaches:
and storing the camera movement trajectory, the compressed image features (Feature/Descriptors/Embeddings, Page 21 Lines 6-21 and Page 10 Lines 15-26) corresponding to the plurality of image samples, and the trained three-dimensional scene model. (Page 19 Lines 24-30 and Page 20 Lines 1-20)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Chen’s camera movement and image features to incorporate Moreau’s method of storing the camera movement and compressed image features, since doing so would provide the benefit of storing dense information in a compact way. (Moreau, Page 20 Lines 11-20)
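A minimal sketch of the claimed storage step follows; torch.save is one conventional serialization choice, and neither reference prescribes a particular format:

```python
# Persist the camera movement trajectory, the compressed image features, and
# the trained three-dimensional scene model together.
import torch

def save_scene_artifacts(path, trajectory, compressed_features, model):
    torch.save({
        "camera_trajectory": trajectory,          # e.g., roll/yaw/pitch per frame
        "compressed_features": compressed_features,
        "model_state": model.state_dict(),
    }, path)
```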
Regarding claim 9, Chen teaches the method according to claim 1, wherein said determining a target compressed image feature corresponding to the target camera pose and the target view angle comprises:
determining a first candidate camera pose, a first candidate view angle (Visual View/Angle, Page 9 Para. 10 and Page 10 Para. 9), a second candidate camera pose, and a second candidate view angle matching the target camera pose and the target view angle; (Page 14 Para. 1) Each image has its features, camera pose, and view angle extracted or determined. Adjacent images are compared to determine matching characteristic points based on features (Page 11 Para. 11). Thus, the first candidate camera pose and view angle can be those of one image, and the second candidate camera pose and view angle can be those of an adjacent image.
determining a first candidate compressed image feature (Page 14, Para. 1 and Page 10, Para. 4) corresponding to the first candidate camera pose and the first candidate view angle; (Page 14 Para. 1 and Page 10, Para. 4)
determining a second candidate compressed image feature (Page 14, Para. 1 and Page 10, Para. 4) corresponding to the second candidate camera pose and the second candidate view angle; (Page 14 Para. 1 and Page 10, Para. 4)
and determining the target compressed image feature (Page 14, Para. 1 and Page 10, Para. 4) by fusing the first candidate compressed image feature and the second candidate compressed image feature. (Page 14 Para. 1 and Page 10, Para. 4) Features are spliced with the camera pose information and inputted into the neural radiance field. Thus, desired features found in adjacent images can be spliced with the camera pose to form mapping relationships between images; a sketch of one such fusion is provided after this rejection.
However, Chen fails to teach the Compressed Image Features.
Moreau teaches Compressed Image Features (Feature/Descriptors/Embeddings, Page 21 Lines 6-21 and Page 10 Lines 15-26). Methods for compressing feature maps are well known, as feature maps can be dense and contain extensive information. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Chen’s three-dimensional model features to incorporate Moreau’s compressed image features, since doing so would provide the benefit of reducing the storage that the image features require in hardware or in the cloud.
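The fusion step of claim 9 can be illustrated with a distance-weighted blend of the two candidate features; the weighting rule below is an assumption used only to make the claimed fusing concrete:

```python
# Fuse two candidate compressed image features whose camera poses bracket the
# target pose; the candidate whose pose is nearer the target gets more weight.
import numpy as np

def fuse_candidate_features(target_pose, pose_a, feat_a, pose_b, feat_b):
    d_a = np.linalg.norm(target_pose - pose_a)
    d_b = np.linalg.norm(target_pose - pose_b)
    w_a = d_b / (d_a + d_b + 1e-8)  # nearer candidate -> larger weight
    return w_a * feat_a + (1.0 - w_a) * feat_b
```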
Regarding claim 10, Chen teaches the method of claim 1 and an electronic device (Page 2 Section: Contents of the Invention, Para. 1) for model processing, comprising: at least one processor; and memory coupled to the at least one processor and having instructions stored thereon (Page 2 Section: Contents of the Invention, Para. 4), wherein the instructions, when executed by the at least one processor, cause the electronic device to perform the method of claim 1; therefore, claim 10 is rejected under the same rationale as claim 1.
Regarding claim 11, it has similar limitations as claim 2; therefore, it is rejected under the same rationale as claim 2.
Regarding claim 12, it has similar limitations as claim 3; therefore, it is rejected under the same rationale as claim 3.
Regarding claim 14, it has similar limitations as claim 5; therefore, it is rejected under the same rationale as claim 5.
Regarding claim 15, it has similar limitations as claim 6; therefore, it is rejected under the same rationale as claim 6.
Regarding claim 16, it has similar limitations as claim 7; therefore, it is rejected under the same rationale as claim 7.
Regarding claim 17, it has similar limitations as claim 8; therefore, it is rejected under the same rationale as claim 8.
Regarding claim 18, it has similar limitations as claim 9; therefore, it is rejected under the same rationale as claim 9.
Regarding claim 19, Chen teaches a computer program product (Page 2 Section: Contents of the Invention Para. 5) that performs the method of claim 1; therefore, claim 19 is rejected under the same rationale as claim 1.
Regarding claim 20, it has similar limitations as claims 2 and 11; therefore, it is rejected under the same rationale as claims 2 and 11.
Claims 4 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al., Chinese Patent Application CN 115100339 A (hereinafter Chen), in view of Moreau et al., WIPO International Application WO 2024099593 A1 (hereinafter Moreau), and NPL “NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections” by Ricardo Martin-Brualla, Noha Radwan, Mehdi S. M. Sajjadi, Jonathan T. Barron, Alexey Dosovitskiy, and Daniel Duckworth (hereinafter Martin-Brualla), in further view of Shen et al., Chinese Patent Application CN 116977525 A (hereinafter Shen).
Regarding claim 4, Chen, Moreau, and Martin-Brualla fail to teach the method according to claim 1, wherein said acquiring a target camera pose and a target view angle comprises:
receiving an image generation instruction from a user;
and determining the target camera pose and the target view angle based on the image generation instruction.
Chen, Moreau, Martin-Brualla, and Shen are analogous to the claimed invention because all of them are in the same field of scene generation using neural radiance fields.
Shen teaches the method according to claim 1, wherein said acquiring a target camera pose and a target view angle comprises:
receiving an image generation instruction from a user; (Page 6, Para. 2)
and determining the target camera pose and the target view angle based on the image generation instruction. (Page 6, Para. 2) The user can select any target component they wish, which could include a target view angle and target camera pose.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Chen’s neural radiance field, as altered by Moreau, to incorporate Shen’s user-guided instructions, since doing so would provide the benefit of letting the user decide how to shape the neural radiance field and tailor the generated scenes to the user’s needs.
Regarding claim 13, it has similar limitations as claim 4; therefore, it is rejected under the same rationale as claim 4.
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRIANNA R COCHRAN whose telephone number is (571)272-4671. The examiner can normally be reached Mon-Fri. 7:30am - 5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alicia Harrington can be reached at (571) 272-2330. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/BRIANNA RENAE COCHRAN/Examiner, Art Unit 2615
/ALICIA M HARRINGTON/Supervisory Patent Examiner, Art Unit 2615