DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
Claims 1-17 and 19-21 are pending. Claims 1, 4, 7, 9-10, 16-17, and 20 are amended. Claim 18 is cancelled. Claim 21 is newly added.
Response to Arguments
Applicant's arguments filed 1/27/2026 have been fully considered but they are not persuasive.
In the response of 1/27/2026, Applicant argues:
With respect to the previously claimed "blending portions of the set of two-dimensional textures" (see, rejection of claim 9), the Office asserts that paragraph [0174] of Gronau teaches these features. While Gronau discusses "a continuous blending of the textures can be applied, e.g., by using a weighted average of the two texture maps," there is not a teaching that the continuous blending includes combining the set of two-dimensional textures (based at least in part on a set of images that depict a head of a user) and a neutral template texture map depicting evenly distributed illumination. As such, Gronau cannot reasonably be construed to teach generating a blended texture by combining the set of two-dimensional textures and a neutral template texture map depicting evenly distributed illumination, as recited in claim 1.
[Underlining added by Examiner]
Examiner does not agree with Applicant's arguments or the conclusions drawn therefrom. When the teachings of Gronau (US 20220051412) are taken as a whole, continuous blending to merge texture maps on 3D models indeed means combining a set of 2D textures, according to a texture map, onto the 3D model to generate a 3D avatar that is illumination-balanced.
Therefore, Examiner also contends that the newly amended feature,
generate a blended texture by combining the set of two-dimensional textures and a neutral template texture map depicting evenly distributed illumination,
is also taught in the Gronau reference. See, e.g., ¶0151-0155 of the reference:
A texture map is a 2D image in which each color pixel represents the red, green and blue reflectance coefficients of a certain area in the 3D model. An example of a texture map is shown in FIG. 20. Each color pixel in the texture map corresponds to certain coordinates within a specific polygon (e.g., triangle) on the surface of the 3D model.
An example of a 3D model composed of triangles and the mapping of the texture map to these triangles is shown in FIG. 15.
Generally, each pixel in the texture map has an index of the triangle to which it is mapped and 3 coordinates defining its exact location within the triangle.
A 3D model composed of a fixed number of triangles and vertices may be deformed as the 3D model changes. For example, a 3D model of a face may be deformed as the face changes its expression. Nevertheless, the pixels in the texture map correspond to the same locations in the same triangles, even though the 3D locations of the triangles change as the expression of the face changes.
Texture maps may be constant or may vary as a function of time, expression or of viewing angle. In any case, the correspondence of a given pixel in a texture map and a certain coordinate in a certain triangle in the 3D model doesn't change.
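For illustration only, the fixed pixel-to-triangle correspondence described in the quoted ¶0151-0155 can be sketched as follows. This is a minimal sketch and not code from Gronau: the names `Texel` and `surface_point`, the one-triangle mesh, and the reading of the "3 coordinates" as barycentric weights are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Texel:
    """One texture-map pixel bound to a fixed location in a fixed triangle."""
    rgb: Tuple[int, int, int]         # red, green, blue reflectance values
    triangle: int                     # index of the triangle the pixel maps to
    bary: Tuple[float, float, float]  # assumed barycentric weights (sum to 1)


def surface_point(texel: Texel, vertices: List[Vec3],
                  triangles: List[Tuple[int, int, int]]) -> Vec3:
    """Return the 3D position the texel currently covers on the mesh."""
    i, j, k = triangles[texel.triangle]
    w0, w1, w2 = texel.bary
    a, b, c = vertices[i], vertices[j], vertices[k]
    return tuple(w0 * a[n] + w1 * b[n] + w2 * c[n] for n in range(3))


# Hypothetical one-triangle mesh in a neutral pose and in a deformed pose.
triangles = [(0, 1, 2)]
neutral = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
deformed = [(0.0, 0.0, 0.0), (4.0, 0.0, 2.0), (0.0, 4.0, 0.0)]

texel = Texel(rgb=(200, 150, 120), triangle=0, bary=(0.5, 0.25, 0.25))

# The texel record is identical for both poses; only the vertices differ.
p_neutral = surface_point(texel, neutral, triangles)
p_deformed = surface_point(texel, deformed, triangles)
```

In this sketch the texel keeps the same triangle index and in-triangle coordinates in both poses, while the 3D point it covers moves with the triangle, which is the behavior the quoted passage describes for a face changing expression.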
In ¶0460 and ¶0465, Gronau discloses,
According to an embodiment a texture map of a face of the person can be generated based on texture maps of different areas of the face.
The generating of the texture map of the face from texture maps of different areas of the face may be executed in any manner and may include, for example, smoothing the borders between the different texture maps of the different areas, and the like.
In ¶0375-0377, Gronau discloses,
The input to the suggested method may be a 2D monocular video, a templated 3D model of a face (general) with deformation model (per person or general) for this 3D template (specified below) together with an approximation (specific parameters) of the tracked parameters of the first frame of the video: approximated deformation parameters (of the person) in the video and an approximated camera model.
A 3D face template mesh (templated 3D model)—may include a coarse triangular mesh of a generic human face. By coarse, we mean in the order of 5K or 10K polygons, which may be sufficient to represent the general shape but not wrinkles, microstructures or other fine details.
A 3D face deformation model for the template may include a standard parametric way to deform the template and change the general shape of the 3D mesh (jaw structure, nose length, etc), the expression of the face (smile, frown, etc) or the rigid position and orientation of it, based on positions and cues found in the images.
In ¶0380-0383, Gronau discloses,
At each frame, the deformed mesh will be referred to as the current 3D face mesh, and its deformation parameters on top of the template may be chosen based on a set of landmarks deduced from the 2D face parts segmentation and the pre-annotated segmentation. To that end, the suggested method may use a 2D face parts segmentation method, in conjunction with a classical 2D rigid registration technique utilizing an ICP (Iterative Closest Point) method to track and deform a model of a 3D face based on an input RGB monocular video.
The suggested method builds upon common face parts segmentation networks, that annotate each pixel with a given face part.
FIG. 13 illustrates face segmentation. Input image 131 may be a color image acquired by a camera. Image 132 illustrates a segmentation of different face parts, visualized by different colors.
In addition, the triangular mesh template may be pre-annotated with a predefined annotation of face parts (e.g. nose, eyes, ears, neck, etc). The mesh annotation may assist in finding correspondences between various face parts on the 3D model to face parts on a given target image. The face parts annotation may be done only once on the 3D template, such that the same annotation can be used for multiple people automatically. The annotation can be specified by listing the triangle belonging to each face part, or by using UV coordinates for the mesh along with a 2D texture map for colouring face parts in different colors as in FIG. 12.
Combining all of the above disclosures, the limitation of generating a blended texture by combining the set of two-dimensional textures (¶0151-0155, ¶0174, ¶0460-0465, ¶0375-0377) and a neutral template texture map (fig. 15, ¶0151-0155, ¶0460-0465, ¶0375-0377) depicting evenly distributed illumination ("The illumination may be appropriate and there may be no areas that may be too dark or too bright and saturated," ¶0209; see also ¶0327-0334) is indeed taught in Gronau.
For details see the rejection below.
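For illustration only, the operation described in ¶0174 of Gronau, as quoted in Applicant's argument above ("a weighted average of the two texture maps, where the weights change along a transition zone between the textures"), can be sketched as follows. This is a minimal sketch and not code from Gronau: the function names, the grayscale row-of-lists texture representation, and the linear weight ramp are assumptions made for the example.

```python
from typing import List

Texture = List[List[float]]  # rows of grayscale values (assumed representation)


def blend_weight(x: int, start: int, end: int) -> float:
    """Weight of the second texture at column x.

    0 at or before `start`, 1 at or after `end`, and a linear ramp inside
    the transition zone (the linear ramp is an assumption of this sketch).
    """
    if x <= start:
        return 0.0
    if x >= end:
        return 1.0
    return (x - start) / (end - start)


def blend_textures(tex_a: Texture, tex_b: Texture,
                   start: int, end: int) -> Texture:
    """Weighted average of two equally sized textures across columns."""
    if start >= end:
        raise ValueError("transition zone must have start < end")
    if len(tex_a) != len(tex_b):
        raise ValueError("textures must have the same number of rows")
    blended = []
    for row_a, row_b in zip(tex_a, tex_b):
        if len(row_a) != len(row_b):
            raise ValueError("rows must have the same length")
        out_row = []
        for x, (a, b) in enumerate(zip(row_a, row_b)):
            w = blend_weight(x, start, end)
            out_row.append((1.0 - w) * a + w * b)
        blended.append(out_row)
    return blended


# Hypothetical 1x5 textures: a darker capture and a brighter capture.
earlier = [[100.0, 100.0, 100.0, 100.0, 100.0]]
later = [[200.0, 200.0, 200.0, 200.0, 200.0]]
result = blend_textures(earlier, later, start=1, end=3)
```

In this sketch the output equals the first texture on one side of the transition zone, the second texture on the other side, and a gradual mix in between, so no distinct border line remains.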
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-9, 11-17, and 19-20 are rejected under 35 U.S.C. 102(a)(1) and/or 102(a)(2) as being anticipated by Gronau et al. (US 20220051412 A1, hereinafter Gronau).
Regarding claim 1, Gronau discloses a computing device (user devices 4000(1)-4000(R), figs. 2-4, ¶0130-¶0139) comprising:
circuitry (¶0130-¶0139) configured to:
generate a set of two-dimensional textures based at least in part on a set of images that depict a head of a user (¶0106, ¶0436, ¶0460-¶0461, ¶0490-0491, figs. 8-10, 14-16, 20);
generate a blended texture by combining the set of two-dimensional textures and a neutral template texture map depicting evenly distributed illumination (Illumination corrections between the current and previous texture maps can be calculated based on the areas that may be shown in both maps. These corrections may be applied to the current texture map, so that there may be no distinct border line between the textures captured at different times. In addition, in order to avoid sharp transitions between textures from different times, a continuous blending of the textures can be applied, e.g., by using a weighted average of the two texture maps, where the weights change along a transition zone between the textures. The methods mentioned above may be used for merging texture maps, material maps and also 3D models, ¶0174.
A texture map is a 2D image in which each color pixel represents the red, green and blue reflectance coefficients of a certain area in the 3D model. An example of a texture map is shown in FIG. 20. Each color pixel in the texture map corresponds to certain coordinates within a specific polygon (e.g., triangle) on the surface of the 3D model.
An example of a 3D model composed of triangles and the mapping of the texture map to these triangles is shown in FIG. 15.
Generally, each pixel in the texture map has an index of the triangle to which it is mapped and 3 coordinates defining its exact location within the triangle.
A 3D model composed of a fixed number of triangles and vertices may be deformed as the 3D model changes. For example, a 3D model of a face may be deformed as the face changes its expression. Nevertheless, the pixels in the texture map correspond to the same locations in the same triangles, even though the 3D locations of the triangles change as the expression of the face changes.
Texture maps may be constant or may vary as a function of time, expression or of viewing angle. In any case, the correspondence of a given pixel in a texture map and a certain coordinate in a certain triangle in the 3D model doesn't change, ¶0151-0155.
The input to the suggested method may be a 2D monocular video, a templated 3D model of a face (general) with deformation model (per person or general) for this 3D template (specified below) together with an approximation (specific parameters) of the tracked parameters of the first frame of the video: approximated deformation parameters (of the person) in the video and an approximated camera model.
A 3D face template mesh (templated 3D model)—may include a coarse triangular mesh of a generic human face. By coarse, we mean in the order of 5K or 10K polygons, which may be sufficient to represent the general shape but not wrinkles, microstructures or other fine details.
A 3D face deformation model for the template may include a standard parametric way to deform the template and change the general shape of the 3D mesh (jaw structure, nose length, etc), the expression of the face (smile, frown, etc) or the rigid position and orientation of it, based on positions and cues found in the images, ¶0375-0377.
At each frame, the deformed mesh will be referred to as the current 3D face mesh, and its deformation parameters on top of the template may be chosen based on a set of landmarks deduced from the 2D face parts segmentation and the pre-annotated segmentation. To that end, the suggested method may use a 2D face parts segmentation method, in conjunction with a classical 2D rigid registration technique utilizing an ICP (Iterative Closest Point) method to track and deform a model of a 3D face based on an input RGB monocular video.
The suggested method builds upon common face parts segmentation networks, that annotate each pixel with a given face part.
FIG. 13 illustrates face segmentation. Input image 131 may be a color image acquired by a camera. Image 132 illustrates a segmentation of different face parts, visualized by different colors.
In addition, the triangular mesh template may be pre-annotated with a predefined annotation of face parts (e.g. nose, eyes, ears, neck, etc). The mesh annotation may assist in finding correspondences between various face parts on the 3D model to face parts on a given target image. The face parts annotation may be done only once on the 3D template, such that the same annotation can be used for multiple people automatically. The annotation can be specified by listing the triangle belonging to each face part, or by using UV coordinates for the mesh along with a 2D texture map for colouring face parts in different colors as in FIG. 12, ¶0380-0383.
According to an embodiment a texture map of a face of the person can be generated based on texture maps of different areas of the face, ¶0460.
The generating of the texture map of the face from texture maps of different areas of the face may be executed in any manner and may include, for example, smoothing the borders between the different texture maps of the different areas, and the like, ¶0465);
generate a three-dimensional avatar that depicts the head of the user with substantially even illumination by applying the blended texture to a head model (During the process of creating the 3D avatar, the 3D model and the 2D texture maps, the quality of the 3D model that may be created may be evaluated by projecting it onto two dimensional images from different angles using a simple linear geometrical projection or a more complex model of a camera that includes optical distortions. The projections of the 3D model to 2D images may be compared to the images grabbed by the camera or cameras, ¶0192.
A generative Adversarial Network (GAN) may also be used in order to correct illumination in the texture map of the model, for example in cases where the user's face may be not illuminated uniformly, e.g. there exists a strong illumination from a window at the side of the face or from a spot projector above the user's head, ¶0185. Also see ¶0192); and
an output device configured to facilitate presentation of the three-dimensional avatar of the user (¶0060-0064, ¶0111-0114, ¶0192, 3d avatar 141, fig. 15, ¶0387).
Regarding claim 2, Gronau discloses the computing device of claim 1, wherein the circuitry is further configured to select the set of images for unwrapping into the set of two-dimensional textures due at least in part to the set of images depicting the head of the user from different viewing angles (During the process of creating the 3D avatar, the 3D model and the 2D texture maps, the quality of the 3D model that may be created may be evaluated by projecting it onto two dimensional images from different angles using a simple linear geometrical projection or a more complex model of a camera that includes optical distortions, ¶0192).
Regarding claim 3, Gronau discloses the computing device of claim 1, wherein the circuitry is further configured to shape the head model based at least in part on at least one of the set of images (The process is repeated where the comparison of the rendered textured 3D model is performed with several camera images from a set of images, such as from a video sequence. Since there may be many images in the image set or video, at each image the 3D model and texture map may be sampled by the camera at different positions, ¶0251).
Regarding claim 4, Gronau discloses the computing device of claim 1, wherein the circuitry is further configured to:
render an estimate of the head model based at least in part on one or more input parameters (The 3D model may have separate parameters for shape, pose and expression, ¶0104.
In one embodiment, a 3D model and texture maps are created before the beginning of the meeting and this model is then animated and rendered at run time according to the user's pose and expressions that are estimated from the video images, ¶0150);
compare the estimate of the head model to the at least one of the set of images (The projections of the 3D model to 2D images may be compared to the images grabbed by the camera or cameras, ¶0192. The projections of the 3D model can be compared to the grabbed 2D images in order to verify that the 3D geometrical structure may be accurate and also that the reflection maps may be accurate, ¶0193); and
update the estimate of the head model by modifying the one or more input parameters based at least in part on a result of the comparison (Step 96 may include monitoring each participant by a user device of the participant, during the conference call, updating parameters of 3D model of each participant accordingly and sending updated parameters (sending may be subjected to communication parameters). Step 98 may include receiving by a user device of each participant updated parameters of 3D models related to other participants and updating the display accordingly to reflect the changes to the model, ¶0231-0232. Also see ¶0277).
Regarding claim 5, Gronau discloses the computing device of claim 4, wherein the circuitry is further configured to:
compare the updated estimate of the head model to the at least one of the set of images according to a loss function (Approximated deformation parameters of the person in the video and an approximated camera model can be found by standard 3DMM fitting techniques, for example by using a face landmark detection method to detect known face parts parameters and optimize the camera and pre-annotated landmarks in a least-squared sense. The initialization does not need to be precise but only approximated and can be generated via commonly known techniques, ¶0378.
Step 175 may include using a deformation model (e.g., a 3DMM as explained above) to deform the face mesh and change the camera parameters such that the projection of the first image 3D features matches the 2D locations of the second image 2D locations, as in a typical sparse landmarks and camera fitting, ¶0396); and
refine the updated estimate of the head model by modifying the one or more input parameters based at least in part on an output rendered by the loss function (Approximated deformation parameters of the person in the video and an approximated camera model can be found by standard 3DMM fitting techniques, for example by using a face landmark detection method to detect known face parts parameters and optimize the camera and pre-annotated landmarks in a least-squared sense. The initialization does not need to be precise but only approximated and can be generated via commonly known techniques, ¶0378.
Step 175 may include using a deformation model (e.g., a 3DMM as explained above) to deform the face mesh and change the camera parameters such that the projection of the first image 3D features matches the 2D locations of the second image 2D locations, as in a typical sparse landmarks and camera fitting, ¶0396).
Regarding claim 6, Gronau discloses the computing device of claim 5, wherein the circuitry is further configured to iteratively compare the updated estimate of the head model to the at least one of the set of images and refine the updated estimate of the head model until the output rendered by the loss function satisfies a certain threshold (Approximated deformation parameters of the person in the video and an approximated camera model can be found by standard 3DMM fitting techniques, for example by using a face landmark detection method to detect known face parts parameters and optimize the camera and pre-annotated landmarks in a least-squared sense. The initialization does not need to be precise but only approximated and can be generated via commonly known techniques, ¶0378.
Step 175 may include using a deformation model (e.g., a 3DMM as explained above) to deform the face mesh and change the camera parameters such that the projection of the first image 3D features matches the 2D locations of the second image 2D locations, as in a typical sparse landmarks and camera fitting, ¶0396.
At each frame, the deformed mesh will be referred to as the current 3D face mesh, and its deformation parameters on top of the template may be chosen based on a set of landmarks deduced from the 2D face parts segmentation and the pre-annotated segmentation. To that end, the suggested method may use a 2D face parts segmentation method, in conjunction with a classical 2D rigid registration technique utilizing an ICP (Iterative Closest Point) method to track and deform a model of a 3D face based on an input RGB monocular video, ¶0380. Finding the closest point mandates a comparison against an inherent threshold).
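For illustration only, the pattern recited in claim 6 (repeatedly comparing an estimate to image data and refining its parameters until a loss value satisfies a threshold) can be sketched as follows. This is a minimal sketch and not Gronau's 3DMM or ICP procedure: plain gradient descent on a mean squared landmark error is a stand-in chosen for the example, and the function names, the scale-plus-translation parameterization, the step size, and the threshold value are all assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Params = Tuple[float, float, float]  # (scale, shift_x, shift_y), assumed model


def landmark_loss(params: Params, model: List[Point],
                  image: List[Point]) -> float:
    """Mean squared distance between transformed model and image landmarks."""
    s, tx, ty = params
    total = 0.0
    for (x, y), (u, v) in zip(model, image):
        total += (s * x + tx - u) ** 2 + (s * y + ty - v) ** 2
    return total / len(model)


def refine(params: Params, model: List[Point], image: List[Point],
           threshold: float = 1e-6, step: float = 0.2,
           max_iters: int = 10000) -> Tuple[Params, float, int]:
    """Compare, then update parameters, until the loss is at most `threshold`."""
    s, tx, ty = params
    n = len(model)
    for iteration in range(max_iters):
        current = landmark_loss((s, tx, ty), model, image)
        if current <= threshold:  # stopping criterion
            return (s, tx, ty), current, iteration
        grad_s = grad_tx = grad_ty = 0.0
        for (x, y), (u, v) in zip(model, image):
            rx = s * x + tx - u  # residual in x
            ry = s * y + ty - v  # residual in y
            grad_s += 2.0 * (rx * x + ry * y) / n
            grad_tx += 2.0 * rx / n
            grad_ty += 2.0 * ry / n
        s -= step * grad_s
        tx -= step * grad_tx
        ty -= step * grad_ty
    return (s, tx, ty), landmark_loss((s, tx, ty), model, image), max_iters


# Hypothetical landmarks: the image landmarks are the model landmarks
# scaled by 2 and shifted by (3, -1).
model_pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
image_pts = [(2.0 * x + 3.0, 2.0 * y - 1.0) for x, y in model_pts]

fitted, final_loss, iterations = refine((1.0, 0.0, 0.0), model_pts, image_pts)
```

The loop stops as soon as the loss falls to the threshold rather than after a fixed number of passes, which is the only aspect of the claim language this sketch is meant to show.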
Regarding claim 7, Gronau discloses the computing device of claim 1, wherein the circuitry is further configured to:
render the head model (The 3D model and texture map may be used to render an image of the head and/or body of the person, ¶0340);
compare one or more facial features represented in the head model to one or more facial features identified in the at least one of the set of images (Comparing the location of facial landmarks such as the corners of the eyes and lips, the tip and edges of the nose and the edges of the cheeks and chin, that may be found in the image pairs, ¶0195); and
modify the one or more facial features represented in the head model based at least in part on a result of the comparison (Each texture map may be selected and/or augmented based on at least one out of shape, pose and expression. The augmentation may include, modifying values due to lighting, facial make-up effects (lipstick, blush and the like . . . ), adding or removing facial hair features (such as beard, moustache), accessories (such as eyeglasses, ear buds) and the like, ¶0435).
Regarding claim 8, Gronau discloses the computing device of claim 1, wherein the circuitry is further configured to unwrap the set of images into the set of two-dimensional textures by flattening a depiction of a face of the user in the set of images to fit across a set of segmentation masks (In addition, the triangular mesh template may be pre-annotated with a predefined annotation of face parts (e.g. nose, eyes, ears, neck, etc). The mesh annotation may assist in finding correspondences between various face parts on the 3D model to face parts on a given target image. The face parts annotation may be done only once on the 3D template, such that the same annotation can be used for multiple people automatically. The annotation can be specified by listing the triangle belonging to each face part, or by using UV coordinates for the mesh along with a 2D texture map for colouring face parts in different colors as in FIG. 12, ¶0383.
An example of a 3D model composed of triangles and the mapping of the texture map to these triangles is shown in fig. 15, ¶0152.
Step 171 may include using the previous iteration's model of the deformed face mesh and the camera screen space projection parameters, the method uses the camera's extrinsic and intrinsic parameters to perform a perspective projection on the 3D face mesh to get the 2D screen space pixel locations of each visible annotated face part vertex. Using the 3D pre-annotation (FIG. 15—see 3D model 141 and UV map 142) the method finds the 2D position of vertices in each face part by matching the annotations, ¶0387).
Regarding claim 9, Gronau discloses the computing device of claim 1, wherein generating the blended texture comprises:
sequentially blending portions of the set of two-dimensional textures (All the models created from multiple images may be merged into one 3D model or into several different models that vary with the expression or illumination conditions, but all have common shape parameters, ¶0188.
Illumination corrections between the current and previous texture maps can be calculated based on the areas that may be shown in both maps. These corrections may be applied to the current texture map, so that there may be no distinct border line between the textures captured at different times. In addition, in order to avoid sharp transitions between textures from different times, a continuous blending of the textures can be applied, e.g., by using a weighted average of the two texture maps, where the weights change along a transition zone between the textures. The methods mentioned above may be used for merging texture maps, material maps and also 3D models, ¶0174); and
applying the sequentially blended portions of the set of two-dimensional textures to a template texture map (In addition, the triangular mesh template may be pre-annotated with a predefined annotation of face parts (e.g. nose, eyes, ears, neck, etc). The mesh annotation may assist in finding correspondences between various face parts on the 3D model to face parts on a given target image. The face parts annotation may be done only once on the 3D template, such that the same annotation can be used for multiple people automatically. The annotation can be specified by listing the triangle belonging to each face part, or by using UV coordinates for the mesh along with a 2D texture map for colouring face parts in different colors as in FIG. 12, ¶0383.
The new 3D model and texture map may be again rendered to obtain a second rendered image that is compared to the original camera image to create a second difference image that may be used as feedback for enhancing the resolution of the 3D model and texture map. This process may be repeated a given number of times or until a certain criterion is met, e.g., the difference between the actual camera image and the rendered image is below a certain threshold. The process is repeated where the comparison of the rendered textured 3D model is performed with several camera images from a set of images, such as from a video sequence. Since there may be many images in the image set or video, at each image the 3D model and texture map may be sampled by the camera at different positions, ¶0250-0251).
Regarding claim 11, Gronau discloses the computing device of claim 1, wherein the circuitry is further configured to mitigate, in the three-dimensional avatar, directional illumination depicted on the head of the user in the set of images (¶0174, ¶0185, ¶0292).
Regarding claim 12, Gronau discloses the computing device of claim 1, wherein the circuitry is further configured to evenly illuminate facial features of the user in the three-dimensional avatar despite the facial features being unevenly illuminated in the set of images (¶0174, ¶0185, ¶0292).
Regarding claim 13, Gronau discloses the computing device of claim 1, wherein the circuitry is further configured to refine the set of two-dimensional textures via a neural network architecture (¶0151-0155, ¶0183, ¶0174, ¶0254-0256, ¶0325, ¶0355, ¶0451-0453, ¶0491).
Regarding claim 14, Gronau discloses the computing device of claim 13, wherein the neural network architecture comprises at least one of:
a U-Net;
an artificial neural network; or
a convolutional neural network (¶0175, ¶0254-0256, ¶0325, ¶0333, ¶0363).
Regarding claim 15, Gronau discloses the computing device of claim 1, wherein the circuitry comprises a pipeline equipped with a plurality of data processing elements configured to generate the three-dimensional avatar from the set of images (¶0175, ¶0254).
Regarding claim 16, Gronau discloses a system (systems shown in figs. 2-4) comprising:
a camera configured to capture a set of images that depict a head of a user (This may be obtained by modifying the rendered image according to movements of the viewer and the viewer's eyes, thus creating a 3D effect. In order to do this, an image of the viewer is acquired by a camera such as a webcam, ¶0295); and
a computing device (user devices 4000(1)-4000(R), figs. 2-4, ¶0130-¶0139) configured to:
generate a set of two-dimensional textures based at least in part on the set of images (The generating of the 3D model and one or more texture maps may be based on images of the participant that were acquired under different circumstances, ¶0177);
generate a blended texture by combining the set of two-dimensional textures and a neutral template texture map depicting evenly distributed illumination (Illumination corrections between the current and previous texture maps can be calculated based on the areas that may be shown in both maps. These corrections may be applied to the current texture map, so that there may be no distinct border line between the textures captured at different times. In addition, in order to avoid sharp transitions between textures from different times, a continuous blending of the textures can be applied, e.g., by using a weighted average of the two texture maps, where the weights change along a transition zone between the textures. The methods mentioned above may be used for merging texture maps, material maps and also 3D models, ¶0174.
A texture map is a 2D image in which each color pixel represents the red, green and blue reflectance coefficients of a certain area in the 3D model. An example of a texture map is shown in FIG. 20. Each color pixel in the texture map corresponds to certain coordinates within a specific polygon (e.g., triangle) on the surface of the 3D model.
An example of a 3D model composed of triangles and the mapping of the texture map to these triangles is shown in FIG. 15.
Generally, each pixel in the texture map has an index of the triangle to which it is mapped and 3 coordinates defining its exact location within the triangle.
A 3D model composed of a fixed number of triangles and vertices may be deformed as the 3D model changes. For example, a 3D model of a face may be deformed as the face changes its expression. Nevertheless, the pixels in the texture map correspond to the same locations in the same triangles, even though the 3D locations of the triangles change as the expression of the face changes.
Texture maps may be constant or may vary as a function of time, expression or of viewing angle. In any case, the correspondence of a given pixel in a texture map and a certain coordinate in a certain triangle in the 3D model doesn't change, ¶0151-0155.
The input to the suggested method may be a 2D monocular video, a templated 3D model of a face (general) with deformation model (per person or general) for this 3D template (specified below) together with an approximation (specific parameters) of the tracked parameters of the first frame of the video: approximated deformation parameters (of the person) in the video and an approximated camera model.
A 3D face template mesh (templated 3D model)—may include a coarse triangular mesh of a generic human face. By coarse, we mean in the order of 5K or 10K polygons, which may be sufficient to represent the general shape but not wrinkles, microstructures or other fine details.
A 3D face deformation model for the template may include a standard parametric way to deform the template and change the general shape of the 3D mesh (jaw structure, nose length, etc), the expression of the face (smile, frown, etc) or the rigid position and orientation of it, based on positions and cues found in the images, ¶0375-0377.
At each frame, the deformed mesh will be referred to as the current 3D face mesh, and its deformation parameters on top of the template may be chosen based on a set of landmarks deduced from the 2D face parts segmentation and the pre-annotated segmentation. To that end, the suggested method may use a 2D face parts segmentation method, in conjunction with a classical 2D rigid registration technique utilizing an ICP (Iterative Closest Point) method to track and deform a model of a 3D face based on an input RGB monocular video.
The suggested method builds upon common face parts segmentation networks, that annotate each pixel with a given face part.
FIG. 13 illustrates face segmentation. Input image 131 may be a color image acquired by a camera. Image 132 illustrates a segmentation of different face parts, visualized by different colors.
In addition, the triangular mesh template may be pre-annotated with a predefined annotation of face parts (e.g. nose, eyes, ears, neck, etc). The mesh annotation may assist in finding correspondences between various face parts on the 3D model to face parts on a given target image. The face parts annotation may be done only once on the 3D template, such that the same annotation can be used for multiple people automatically. The annotation can be specified by listing the triangle belonging to each face part, or by using UV coordinates for the mesh along with a 2D texture map for colouring face parts in different colors as in FIG. 12. – ¶0380-0383.
According to an embodiment a texture map of a face of the person can be generated based on texture maps of different areas of the face, ¶0460.
The generating of the texture map of the face from texture maps of different areas of the face may be executed in any manner and may include, for example, smoothing the borders between the different texture maps of the different areas, and the like, ¶0465)
generate a three-dimensional avatar that depicts the head of the user with even illumination by applying the blended texture to a head model (During the process of creating the 3D avatar, the 3D model and the 2D texture maps, the quality of the 3D model that may be created may be evaluated by projecting it onto two dimensional images from different angles using a simple linear geometrical projection or a more complex model of a camera that includes optical distortions. The projections of the 3D model to 2D images may be compared to the images grabbed by the camera or cameras, ¶0192.
A generative Adversarial Network (GAN) may also be used in order to correct illumination in the texture map of the model, for example in cases where the user's face may be not illuminated uniformly, e.g. there exists a strong illumination from a window at the side of the face or from a spot projector above the user's head, ¶0185. Also see ¶0192).
Regarding claim 17, Gronau discloses the system of claim 16, wherein the camera comprises a webcam (¶0295).
Regarding claim 19, Gronau discloses the system of claim 16, wherein the computing device is further configured to select the set of images for unwrapping into the set of two-dimensional textures due at least in part to the set of images depicting the head of the user from different viewing angles (During the process of creating the 3D avatar, the 3D model and the 2D texture maps, the quality of the 3D model that may be created may be evaluated by projecting it onto two dimensional images from different angles using a simple linear geometrical projection or a more complex model of a camera that includes optical distortions, ¶0192).
Regarding method claim 20, although the wording is different, the subject matter is considered substantively equivalent to that of device claim 1, as described above.
Regarding claim 21, Gronau discloses the computing device of claim 1, wherein the circuitry is further configured to perform a mask operation to define at least a portion of the set of two-dimensional textures to be blended with the neutral template texture map, wherein the mask operation includes a set of segmentation masks derived from the set of images by applying a face segmentation neural network (¶0375-0377, ¶0380-0383).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Gronau in view of Lin et al. (US 20210183044 A1, hereinafter Lin).
Regarding claim 10, Gronau discloses the computing device of claim 9, wherein the circuitry is further configured to apply the sequentially blended portions of the set of two-dimensional textures to the template texture map (…a continuous blending of the textures can be applied, ¶0174).
Gronau does not expressly disclose that the blending is performed using Laplacian pyramids.
However, Lin discloses that blending (fusing) can be done using a Laplacian pyramid method (¶0062, ¶0078, ¶0097, ¶0110, ¶0121, claims 8, 18).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention (AIA) to implement the application of the sequentially blended portions of the set of two-dimensional textures to the template texture map of Gronau using one or more Laplacian pyramids as disclosed by Lin, because combining prior art elements that are ready for improvement, according to a known method, to yield predictable results is obvious.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NURUN FLORA whose telephone number is (571) 272-5742. The examiner can normally be reached M-F 9:30 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jason Chan, can be reached at (571) 272-3022. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/NURUN FLORA/
Primary Examiner, Art Unit 2619