DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
This Office Action is in response to Applicant’s amendment/response filed on 24 November 2025, which has been entered and made of record.
Response to Arguments
Applicant's arguments filed 24 November 2025 have been fully considered but they are not persuasive.
Applicant argues “the combination of Dsouza and Mack fails to disclose or make obvious all features of amended claim 1” (Remarks, pg. 9), specifically arguing that Mack “does not combine with Dsouza to disclose or make obvious … ‘wherein the relighting image is generated based on lighting information of the configured 3D content and information indicative of learned surface characteristics of the foreground portion,’” because “Rendering a scene to include virtual lights applied to 3D subject geometry in a 3D graphics program, as described in Mack, does not combine with Dsouza to disclose or make obvious” the disputed limitations (Remarks, pg. 10; emphasis original). The Examiner respectfully disagrees.
Mack discloses that a 2D captured image of the foreground actor is “projected … onto the displaced mesh to create a 3D virtual reconstruction of the actor” (para. 18). The actor is considered the “foreground portion,” and both the 2D image characteristics projected onto the 3D mesh of the actor and the 3D mesh vertices themselves teach “information indicative of learned surface characteristics” of the foreground actor. For example, if an actor has a different surface shape or different color clothing, these would be considered “surface characteristics” and they would be reproduced in the 3D reconstruction of the foreground actor. Further, the same virtual lighting applied to the background portion of the scene is applied to the 3D reconstruction of the foreground actor. Therefore, Mack teaches “wherein the relighting image is generated based on lighting information of the configured 3D content and information indicative of learned surface characteristics of the foreground portion” as claimed.
Any remaining arguments are considered moot based on the foregoing.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 2, 6, 7, 9, 11, 12, 17, 18, 22, 23, 25, 27, and 28 are rejected under 35 U.S.C. 103 as being unpatentable over Dsouza et al. (US 2019/0355172; hereinafter “Dsouza”) in view of Mack (US 2003/0202120).
Regarding claim 1, Dsouza discloses A method comprising: determining an estimated camera pose corresponding to image data (“the orientation, position, and/or movement of the image capturing device (e.g., mobile devices),” para. 27); generating a background replacement view of a configured three-dimensional (3D) content (“to generate the background image, both 3D and 2D image resources may be used,” para. 36; “The background image may be generated from the selected image resource based at least in part on the determined orientation of the mobile device … the image resource is a three-dimensional (3D) world model for background, e.g., obtained from 3D modeling or 3D reconstruction,” paras. 54-55), wherein the background replacement view is associated with an angle-of-view (AOV) based on the estimated camera pose (“the virtual backgrounds can be generated based on the orientation, position, and/or location of the mobile device,” para. 45); determining a segmentation mask for the image data, the segmentation mask indicative of a foreground portion of the image data and a background portion of the image data (“a foreground image is segmented from the original image … a transparency mask (e.g., a binary mask) for the foreground object is produced,” para. 57); and generating an output image based on the image and the background replacement view of the configured 3D content (“obtain … a foreground image, a foreground mask, and a virtual background image and use such components to generate a composite image,” para. 38).
Dsouza does not disclose generating a relighting image corresponding to at least a portion of the image data, wherein the relighting image is generated based on lighting information of the configured 3D content and information indicative of learned surface characteristics of the foreground portion, or generating the output image based on the relighting image.
In the same art of compositing a foreground image with a background scene, Mack teaches generating a relighting image corresponding to at least a portion of the image data, wherein the relighting image is generated based on lighting information of the configured 3D content and information indicative of learned surface characteristics of the foreground portion and generating the output image based on the relighting image (“provide a virtual lighting system that enables automatic matching of keyed foreground and virtual background images,” para. 29; “the keyed 2D image from the television or film camera used in the principal photography is projected from the camera's point of view onto the displaced mesh to create a 3D virtual reconstruction of the actor,” para. 18; “FIG. 7 shows deformed mesh 54 with keyed 2D image map 40 applied to it, creating 3D subject geometry 56. The image demonstrates the effects of virtual light 60 applied to 3D subject geometry 56. As the live action subject now has depth and thickness in the 3D graphics program, virtual light 60 is reflected from the subject realistically,” para. 95; “A digital model of the desired background set is constructed inside a computer … the entire scene, including the background and foreground, is rendered by the same computer lighting and rendering algorithms,” paras. 16-18; note that both the 2D image characteristics projected onto the 3D mesh of the actor and the 3D mesh vertices themselves teach “information indicative of learned surface characteristics” of the foreground actor).
Before the effective filing date of the claimed invention, it would have been obvious to one having ordinary skill in the art to apply the teachings of Mack to Dsouza. The motivation would have been that “physically accurate lighting of both the actor and the background is achieved automatically” (Mack, para. 18).
Regarding claim 2, the combination of Dsouza and Mack renders obvious wherein: a foreground portion of the output image comprises pixel data included in the relighting image; and a background portion of the output image comprises pixel data included in the background replacement view of the configured 3D content (“obtain … a foreground image, a foreground mask, and a virtual background image and use such components to generate a composite image,” Dsouza, para. 38; “virtual light 60 applied to 3D subject geometry 56,” Mack, para. 95; see claim 1 for motivation to combine).
Regarding claim 6, the combination of Dsouza and Mack renders obvious wherein the image data is a frame of video data included in a plurality of frames of video data (“the existing or original background images, for example in a video, can be replaced with a virtual background associated with an orientation or position of the mobile device,” Dsouza, para. 19).
Regarding claim 7, the combination of Dsouza and Mack renders obvious wherein the lighting information (“the entire scene, including the background and foreground, is rendered by the same computer lighting,” Mack, para. 18; see claim 1 for motivation to combine) is included in an equirectangular projection of the configured 3D content (“rasterizes a part of an equirectangular projection of the panoramic image to form the virtual background image,” Dsouza, para. 22).
Regarding claim 9, the combination of Dsouza and Mack renders obvious wherein the configured 3D content is a computer-generated (CG) 3D model of a scene selected from a plurality of CG 3D models of scenes (“The user may browser various image resources … the image resource is a three-dimensional (3D) world model for background, e.g., obtained from 3D modeling or 3D reconstruction,” Dsouza, paras. 54-55).
Regarding claim 11, the combination of Dsouza and Mack renders obvious wherein the background replacement view of the CG 3D scene is generated using an image capture device used to obtain the image data (“system 200 [of Fig. 2] is embodied as a specialized computing device in a mobile phone,” Dsouza, para. 39; “the mobile device has a camera,” Dsouza, para. 50; Fig. 2 of Dsouza illustrates the Background Processor 230 being included within the mobile device).
Regarding claim 12, the combination of Dsouza and Mack renders obvious wherein: the background replacement view comprises a view of a portion of the CG 3D scene, wherein the view of the portion of the CG 3D scene is associated with an AOV corresponding to the image data and the estimated camera pose (“The background image may be generated from the selected image resource based at least in part on the determined orientation of the mobile device … the image resource is a three-dimensional (3D) world model for background … rasterize a part of the panoramic image to generate the background image,” Dsouza, paras. 54-56).
Regarding claim 17, it is rejected using the same citations and rationales described in the rejection of claim 1, with the additional limitations of An apparatus for processing image data, comprising: at least one memory; and at least one processor coupled to the at least one memory (see Dsouza, Fig. 8).
Regarding claims 18, 22, 23, 25, 27, and 28, they are rejected using the same citations and rationales described in the rejections of claims 2, 6, 7, 9, 11, and 12, respectively.
Claims 3-5 and 19-21 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Dsouza and Mack, and further in view of Liang et al. (US 2018/0115714; hereinafter “Liang”).
Regarding claim 3, the combination of Dsouza and Mack does not disclose performing camera motion stabilization to generate: a stabilized camera pose, the stabilized camera pose based on one or more of the estimated camera pose or inertial sensor data corresponding to an image capture device used to obtain the image data; and a stabilization warp grid corresponding to one or more stabilization corrections determined for the image data.
In the same art of using handheld mobile devices for recording images and video, Liang teaches performing camera motion stabilization to generate: a stabilized camera pose (“continuously model a virtual (i.e., corrected or stabilized) camera orientation,” para. 20), the stabilized camera pose based on one or more of the estimated camera pose or inertial sensor data corresponding to an image capture device used to obtain the image data (“video stabilization may be achieved by correlating gyroscope data and focal length sensor data of an image capture device,” para. 20); and a stabilization warp grid corresponding to one or more stabilization corrections determined for the image data (“a grid or stabilization mesh can be used to transform the pixels of an input frame associated with the real camera orientation to an output frame associated with the virtual camera orientation, such that the video may be stabilized,” para. 20).
Before the effective filing date of the claimed invention, it would have been obvious to one having ordinary skill in the art to apply the teachings of Liang to the combination of Dsouza and Mack. The motivation would have been to “more accurately model the motion of the camera to remove undesired movements” (Liang, para. 4).
Regarding claim 4, the combination of Dsouza, Mack, and Liang renders obvious wherein the background replacement view comprises a view of the configured 3D content rendered using an AOV corresponding to the stabilized camera pose (para. 45 of Dsouza teaches generating a background replacement view based on camera pose, and para. 20 of Liang renders obvious using a stabilized camera pose; see claim 3 for motivation to combine).
Regarding claim 5, the combination of Dsouza, Mack, and Liang renders obvious applying the one or more stabilization corrections to the image data, based on combining the stabilization warp grid and the image data to generate a stabilized image data (“a grid or stabilization mesh can be used to transform the pixels of an input frame,” Liang, para. 20; see claim 3 for motivation to combine); and generating the relighting image based on combining at least a portion of the stabilized image data and the lighting information (para. 95 of Mack teaches generating the relighting image based on image data and lighting information, and para. 20 of Liang renders obvious using stabilized image data; see claims 1 and 3 for motivation to combine).
Regarding claims 19-21, they are rejected using the same citations and rationales described in the rejections of claims 3-5, respectively.
Claims 8, 10, 13-15, 24, 26, 29, and 30 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Dsouza and Mack, and further in view of Wagner et al. (US 2015/0294499; hereinafter “Wagner”).
Regarding claim 8, the combination of Dsouza and Mack does not specifically recite wherein the estimated camera pose comprises an estimated 6 degrees-of-freedom (6DOF) pose of a camera used to capture the image data, at a time of capture of the image data.
In the same art of image capturing and 3D modeling, Wagner teaches wherein the estimated camera pose comprises an estimated 6 degrees-of-freedom (6DOF) pose of a camera used to capture the image data, at a time of capture of the image data (“camera pose may be determined for 6-Degrees Of Freedom (6DOF),” para. 39).
Before the effective filing date of the claimed invention, it would have been obvious to one having ordinary skill in the art to apply the teachings of Wagner to the combination of Dsouza and Mack. The motivation would have been “for power efficient 3D reconstruction” (Wagner, para. 47).
Regarding claim 10, the combination of Dsouza and Mack does not disclose wherein the 3D model of the scene is generated using a same image capture device used to obtain the image data.
In the same art of mobile device image capture and 3D modeling, Wagner teaches wherein the 3D model of the scene is generated using a same image capture device used to obtain the image data (“mobile device or mobile station (MS) 100 [of Fig. 1], may take the form of a cellular phone, mobile phone,” para. 24; “real-time 3d reconstruction on a mobile device … the Mobile Station (MS) may comprise at least one camera … capturing a first image with at least one camera, wherein the first image comprises color information for at least a portion of an environment being modeled by the MS and obtaining camera pose information for the first image,” para. 7).
Before the effective filing date of the claimed invention, it would have been obvious to one having ordinary skill in the art to apply the teachings of Wagner to the combination of Dsouza and Mack. Dsouza teaches using 3D models for generating a background viewpoint, but does not disclose that the 3D models were generated on the mobile device from images captured by the mobile device. Wagner remedies this deficiency by teaching that 3D models can be generated directly on a mobile device from images captured by the mobile device. The motivation would have been “to facilitate power efficient real-time 3D reconstruction on computing and mobile devices” (Wagner, para. 6).
Regarding claim 13, the combination of Dsouza and Mack does not disclose wherein the configured 3D content is based on an additional image captured using a same image capture device used to obtain the image data.
In the same art of mobile device image capture and 3D modeling, Wagner teaches wherein the configured 3D content is based on an additional image captured using a same image capture device used to obtain the image data (“MS [Mobile Station] 100 [of Fig. 1] may comprise multiple cameras 110, such as dual front cameras and/or a front and rear-facing camera,” para. 36; “processing one or more images captured by camera 110 to perform 3D reconstruction of an environment being modeled,” para. 45).
Before the effective filing date of the claimed invention, it would have been obvious to one having ordinary skill in the art to apply the teachings of Wagner to the combination of Dsouza and Mack. Dsouza teaches using 3D models for generating a background viewpoint, but does not disclose that the 3D models were generated on the mobile device from images captured by the mobile device. Wagner remedies this deficiency by teaching that 3D models can be generated directly on a mobile device from images captured by the mobile device. The motivation would have been “to facilitate power efficient real-time 3D reconstruction on computing and mobile devices” (Wagner, para. 6).
Regarding claim 14, the combination of Dsouza, Mack, and Wagner renders obvious wherein the image data and the additional image are captured using a same camera of the image capture device (“MS [Mobile Station] 100 [of Fig. 1] may comprise multiple cameras 110, such as dual front cameras and/or a front and rear-facing camera,” Wagner, para. 36; “processing one or more images captured by camera 110 to perform 3D reconstruction of an environment being modeled,” Wagner, para. 45; see claim 13 for motivation to combine; Wagner renders obvious capturing images from any or all of the cameras on a mobile device, including a same camera).
Regarding claim 15, the combination of Dsouza, Mack, and Wagner renders obvious wherein the image data is captured using a first camera of the image capture device, and wherein the additional image is captured using a second camera of the image capture device (“MS [Mobile Station] 100 [of Fig. 1] may comprise multiple cameras 110, such as dual front cameras and/or a front and rear-facing camera,” Wagner, para. 36; “processing one or more images captured by camera 110 to perform 3D reconstruction of an environment being modeled,” Wagner, para. 45; see claim 13 for motivation to combine; Wagner renders obvious capturing images from any or all of the cameras on a mobile device, including using different cameras).
Regarding claims 24, 26, 29, and 30, they are rejected using the same citations and rationales described in the rejections of claims 8, 10, 13, and 14, respectively.
Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Dsouza, Mack, and Wagner, and further in view of Tsuchiya (US 2025/0139879).
Regarding claim 16, the combination of Dsouza, Mack, and Wagner renders obvious wherein: the background replacement view comprises a viewport in the additional image, wherein an AOV of the viewport is a subset of an AOV of the additional image (“The background image may be generated from the selected image resource based at least in part on the determined orientation of the mobile device … rasterize a part of the panoramic image to generate the background image,” Dsouza, paras. 54-56).
The combination of Dsouza, Mack, and Wagner does not specifically recite the AOV of the viewport is determined based on camera intrinsic information associated with the image capture device.
In the same art of combining a foreground image with background 3D content, Tsuchiya teaches the AOV of the viewport is determined based on camera intrinsic information associated with the image capture device (“the viewpoint position with respect to the 3D background data is specified on the basis of the position, the imaging direction, the angle of view, or the like of the camera to be reflected in the current frame, and rendering is performed. At this time, video processing reflecting the focal length, the F-number, the shutter speed, the lens information, or the like can also be performed,” para. 120).
Before the effective filing date of the claimed invention, it would have been obvious to one having ordinary skill in the art to apply the teachings of Tsuchiya to the combination of Dsouza, Mack, and Wagner. The motivation would have been “to improve the overall video production efficiency” (Tsuchiya, para. 362).
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Ryan McCulley, whose telephone number is (571) 270-3754. The examiner can normally be reached Monday through Friday, 8:00am - 4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kee Tung, can be reached at (571) 272-7794. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/RYAN MCCULLEY/
Primary Examiner, Art Unit 2611