Prosecution Insights
Last updated: April 19, 2026
Application No. 18/316,490

DIGITAL IMAGE DECALING

Status: Final Rejection (§103)
Filed: May 12, 2023
Examiner: PROVIDENCE, VINCENT ALEXANDER
Art Unit: 2617
Tech Center: 2600 — Communications
Assignee: Adobe Inc.
OA Round: 4 (Final)

Predictions: 83% grant probability (Favorable); 5-6 expected OA rounds; 2y 5m to grant; 99% grant probability with interview.

Examiner Intelligence

Career allowance rate: 83% (15 granted / 18 resolved), +21.3% vs TC avg (above average)
Interview lift: +25.0% on resolved cases with interview
Typical timeline: 2y 5m average prosecution; 38 applications currently pending
Career history: 56 total applications across all art units

Statute-Specific Performance

§101: 0.9% (-39.1% vs TC avg)
§103: 82.4% (+42.4% vs TC avg)
§102: 14.8% (-25.2% vs TC avg)
§112: 0.9% (-39.1% vs TC avg)
Based on career data from 18 resolved cases; TC averages are estimates.

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

The Amendment filed November 21, 2025 has been entered. Claims 1-4, 6, 8-16, 18-19, and 24-26 are pending in the application. Claims 5, 7, 17, 20, 21, 22, 23, and 24 are cancelled. Applicant’s amendments to Claims 1, 9, and 12 have overcome the rejections previously set forth in the Final Office Action mailed August 22, 2025. A further search has been performed to address the material amended in the aforementioned claims.

Response to Arguments

Applicant’s arguments with respect to Claims 1-4, 6, 8-16, 18-19, and 24-26 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. Newly found references Jin (NPL: NOVEL REPRESENTATIONS FOR 3D CLOTH SIMULATION), Wang (NPL: Designing Deep Networks for Surface Normal Estimation), and Liao (US 20230306673 A1) were used for the newly amended claim limitations.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 3, 4, 6, 9, 10, 11, 27, and 28 are rejected under 35 U.S.C.
103 as being unpatentable over Fang (NPL: RotoTexture: Automated Tools for Texturing Raw Video) in view of Jin (NPL: NOVEL REPRESENTATIONS FOR 3D CLOTH SIMULATION), and Wang (NPL: Designing Deep Networks for Surface Normal Estimation). Regarding claim 1: Fang teaches: A method comprising: identifying, by a processing device (see Note 1A), a portion of a two-dimensional (Fang: Rototexture also adds the ability to map an image onto the depiction of a surface in a photograph, Pg. 2, Fig. 1) digital image corresponding to an object (Fang: RotoTexture Synthesis, maintains a temporally coherent collection of surface patches that allow TextureShop to texture the surface depicted by each frame such that the texture continuously follows the moving undulations of the surface, Pg. 1, par. 5); generating, by the processing device, a two-dimensional output digital image (Fang: Figure 1, see Note 1B) using the surface map to configure a two-dimensional overlay object over the portion of the two-dimensional digital image on the surface (Fang: Fig. 8. Parts of the statue are not visible from its initial pose (a). New clusters are generated at a later moment (b) and advected with MAT to cover the whole surface (c). Pg. 7; see Note 1C) by warping the two-dimensional overlay object onto the two-dimensional digital image (Fang: warp an entire texture image onto the surface depicted in a single frame, Pg. 1, par. 6) according to the three-dimensional geometry of the object described by the surface map (Fang: Our spring model is a nonlinear least-squares fit, a common approach in vision for fitting geometry to image constraints, Pg. 2-3, Section 2.2: Optical flow, par. 1; see Note 1D). Note 1A: Fang teaches an “automated tool” (title) for texturing. Because Fang teaches the automated tool in relation to computer vision (Pg. 2, Section 2.1, Shape from Shading, par. 2), it is reasonable to conclude that the method taught by Fang is to be implemented on a computer. 
A computer inherently comprises a processor and memory.

Note 1B: Sample output digital images are shown in Figures 1, 9, and 10 of Fang.

Note 1C: Figure 8 of Fang showcases that as part of generating an image, “clusters” are formed that cover the image according to the detected geometry. A skeleton of such geometry can be seen in (b) of Figure 8.

Note 1D: Fang shows an example of such warping in Figure 1 on Pg. 2. In the Figure, Fang warps a texture onto skin utilizing the RotoTexture method. Fang further teaches that RotoTexture fits to “geometry” described by the surface of the image as cited above on Pg. 2-3, Section 2.2: Optical flow, par. 1.

Fang fails to teach: estimating, surface normals of the object based on the portion of the two-dimensional digital image; generating, by the processing device, a surface map using a neural network of a machine learning system applied to the surface normals to infer flexible object features that include one or more wrinkles or folds, the surface map describing a three-dimensional surface geometry of the object including the flexible object features based on a UV map of the portion of the two-dimensional digital image relative to a surface of a three-dimensional model of the object with the flexible object features; and generating, by the processing device, a two-dimensional output digital image using the surface map to configure a two-dimensional overlay object over the portion of the two-dimensional digital image by warping the two-dimensional overlay object onto the two-dimensional digital image according to the three-dimensional surface geometry of the object by simulating the flexible object features described by the surface map within the two-dimensional overlay object.
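The limitations above describe a three-stage pipeline: estimate surface normals from the image region, predict a surface (UV) map from them with a neural network, then warp the overlay object according to that map. Purely for orientation, the pipeline shape can be sketched in numpy; every function body here is a toy stand-in (intensity gradients instead of Wang's CNN, a perturbed pixel grid instead of Jin's learned displacement images), not the applicant's or any cited reference's actual implementation.

```python
import numpy as np

def estimate_surface_normals(image):
    # Toy stand-in for a CNN normal estimator (cf. Wang): pseudo-normals
    # derived from intensity gradients, purely for illustration.
    gy, gx = np.gradient(image.astype(float))
    n = np.stack([-gx, -gy, np.ones_like(gx)], axis=-1)
    return n / np.linalg.norm(n, axis=-1, keepdims=True)

def predict_surface_map(normals):
    # Toy stand-in for the learned surface (UV) map (cf. Jin): a plain
    # pixel-grid parameterization perturbed by the normals' tangential
    # components, mimicking wrinkle-induced distortion of the UV map.
    h, w, _ = normals.shape
    v, u = np.meshgrid(np.linspace(0.0, 1.0, h),
                       np.linspace(0.0, 1.0, w), indexing="ij")
    u = np.clip(u + 0.05 * normals[..., 0], 0.0, 1.0)
    v = np.clip(v + 0.05 * normals[..., 1], 0.0, 1.0)
    return np.stack([u, v], axis=-1)

def warp_overlay(decal, surface_map):
    # Warp the 2D overlay object by sampling it at the predicted UV
    # coordinates (nearest-neighbour sampling for brevity).
    dh, dw = decal.shape[:2]
    iu = np.round(surface_map[..., 0] * (dw - 1)).astype(int)
    iv = np.round(surface_map[..., 1] * (dh - 1)).astype(int)
    return decal[iv, iu]

rng = np.random.default_rng(0)
region = rng.random((32, 32))   # portion of the 2D image showing the object
decal = rng.random((16, 16))    # 2D overlay object
normals = estimate_surface_normals(region)
warped = warp_overlay(decal, predict_surface_map(normals))
```

The warped output has the spatial size of the image region, so it can be composited back over the object portion.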
Jin teaches: generating, by the processing device, a surface map using a neural network of a machine learning system (Jin: we describe training CNN models that take input pose parameters […] and predicts 256 × 256 displacement cloth images, emphasis added, Pg. 89, Section 4.9: Learning Cloth Images with CNNs) applied to the surface normals (Jin: one can add features to the cloth shape by painting in image space, especially using a blue brush that changes the offset values in the normal directions, emphasis added, Pg. 75, Section 4.4.1 Shape Modification via Image Editing; see Fig. 4.4) to infer flexible object features that include one or more wrinkles or folds (Jin: we predict cloth images […] These images represent offsets dx […] dx is parameterized in local geodesic coordinates u and v as well as the normal direction n in order to enable the representation of complex surfaces […] even small perturbations in offset directions [that] can lead to interesting structures, emphasis added, Pg. 79, Section 4.6: Network Considerations), the surface map describing a three-dimensional surface geometry of the object including the flexible object features (Jin: Figure 4.6: Left: part of a 3D cloth shape. Middle: cloth pixels embedded on the body mesh storing displacements dx as RGB values. Right: corresponding cloth image in the two dimensional pattern space. Pg. 77, Fig. 4.6; see also Note 1E) based on a UV map (Jin: Left: Triangle mesh depicted in texture space using the vertices' UV coordinates. Pg. 72, Fig. 4.1) of the portion of the two-dimensional digital image relative to a surface of a three-dimensional model of the object (Jin: displacements dx(u; v), Pg. 72, Section 4.3: Pixel-Based Cloth) with the flexible object features (Jin: Using UV space as the domain, each vertex stores a vector-valued function of displacements dx(u; v) = (Δu; Δv; Δn) representing perturbations in the texture coordinate and normal directions, emphasis added, Pg. 
72, Section 4.3: Pixel based cloth); and

Note 1E: Jin teaches that the flexible object features (displacements dx) are added to the cloth shape to generate the final cloth shape: “The final cloth shapes obtained by adding displacements dx depicted in Figure 4.1b to the cloth pixel locations in the top row,” (Pg. 76, Fig. 4.5). Therefore, it would at least be obvious to one of ordinary skill in the art to incorporate the flexible object features within the geometry of the 3D model.

Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to combine the teachings of Jin with Fang. Generating, by the processing device, a surface map using a neural network of a machine learning system, as in Jin, would benefit the Fang in view of Jin teachings by enabling accurate wrinkle generation of clothing when a template mesh is used: “This framework allows us to leverage standard texture mapping [117, 118, 113] as well as other common approaches, such as using bump maps [119] to perturb normal directions and displacement maps [120] to alter vertex positions; these techniques have been well-established over the years and have efficient implementations on graphics hardware enabling us to hijack and take advantage of the GPU-supported pipeline for optimized performance.” (Jin, Pg. 73, Section 4.3: Pixel-Based Cloth)

Fang in view of Jin teaches: generating, by the processing device, a two-dimensional output digital image using the surface map to configure a two-dimensional overlay object over the portion of the two-dimensional digital image by warping the two-dimensional overlay object onto the two-dimensional digital image according to the three-dimensional surface geometry of the object by simulating the flexible object features described by the surface map within the two-dimensional overlay object (see Note 1F).
Note 1F: The plain language of the amended claim 1 may suggest to one of ordinary skill in the art that a cloth physics simulation is performed based on “flexible object features”. The specification of the present application recites: “the decal application module 810 takes features from the decal input 124 and simulates wrapping them onto a surface geometry of the object in the digital image 120 by using the predicted surface map 128 or 728 (e.g., UV map) as an "unwrapped" version of the object in the digital image 120 to simulate an appearance of the decal input being "wrapped" or projected on a surface of the object,” emphasis added, [0066]. Therefore, the Examiner understands “simulating” to not be, for example, a cloth physics simulation for generating wrinkles and folds (though such functionality is taught by Jin on Pg. 85, Section 4.7.5 Simulation), but rather a rendering of the features that mimic a realistic appearance of cloth. Fang showcases a rendering of an overlay object according to the 3D surface that “simulates” cloth features in Fig. 4. Fang in view of Jin fails to teach: estimating, surface normals of the object based on the portion of the two-dimensional digital image; Wang teaches: estimating, surface normals of the object based on the portion of the two-dimensional digital image (Wang: In this paper, we use CNNs for the task of predicting surface normals from a single image, (Abstract)); Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to combine the teachings of Wang with Fang in view of Jin. Directly generating surface normals with a CNN, as in Wang, would benefit the Fang in view of Jin teachings by enhancing the quality of the generated surface normals. 
(Wang: Our overall objective motivation is to frame the single-view 3D problem so that the structure we know is captured and convolutional networks can do what they do best – learn strong mappings from visual data to labels. […] Each input network obtains strong performance by themselves, but by combining them, we obtain substantially better results, both quantitatively and qualitatively. Pg. 2, Section 3: Overview, par. 2)

Regarding claim 2: Fang in view of Jin and Wang teaches: The method of claim 1 (as shown above), wherein the two-dimensional overlay object includes an insertion two-dimensional digital image different than the two-dimensional digital image (Fang: Rototexture also adds the ability to map an image onto the depiction of a surface in a photograph or video, demonstrated by the shirt’s Da Vinci image whose deformation follows the wrinkles, Pg. 2, Fig. 1; see Note 2A).

Note 2A: Fang showcases in Figure 1 that the two-dimensional digital image (the person wearing a shirt in (a)) may have a two-dimensional texture overlaid onto it (the Da Vinci texture in (b) and (c)). The texture is different from the image.

Regarding claim 3: Fang in view of Jin and Wang teaches: The method of claim 1 (as shown above), wherein the two-dimensional overlay object includes a two-dimensional texture map different than the two-dimensional digital image (see Note 2A and Note 3A).

Note 3A: The Examiner notes that the amended claim 3 is similar to amended claim 2, save for the overlay object containing a 2D texture map rather than a 2D digital image. The Examiner submits that one of ordinary skill in the art would consider a texture map analogous to a digital image.
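Claims 2 and 3 turn on the overlay object being an insertion image or texture map distinct from the photograph itself, blended over only the object portion. As a minimal illustrative sketch (the function name, mask, and opacity parameter are hypothetical, not anything taught by Fang):

```python
import numpy as np

def composite_overlay(image, warped_overlay, object_mask, opacity=1.0):
    # Blend an already-warped overlay (insertion image or texture map)
    # over only the masked object portion; the rest of the photograph
    # is left untouched. Hypothetical helper for illustration only.
    m = object_mask.astype(float) * opacity
    return image * (1.0 - m) + warped_overlay * m

rng = np.random.default_rng(1)
photo = rng.random((32, 32))
overlay = rng.random((32, 32))   # e.g. a decal already warped to fit
mask = np.zeros((32, 32))
mask[8:24, 8:24] = 1.0           # the shirt region
result = composite_overlay(photo, overlay, mask)
```

Inside the mask the result shows the overlay; outside it the original photograph is preserved.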
Regarding claim 4: Fang in view of Jin and Wang teaches: The method of claim 1 (as shown above), wherein the estimating includes using convolutional neural networks to predict the surface normals directly (see Note 4A) from the two-dimensional digital image (Wang: In this paper, we use CNNs for the task of predicting surface normals from a single image, (Abstract)).

Note 4A: Wang showcases in Figure 2 that a single image is taken as input, said single image is used to directly generate surface normals of varying quality, and those normals are fused to obtain the output surface normal.

Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to combine the teachings of Wang with Fang in view of Jin. Directly generating surface normals with a CNN, as in Wang, would benefit the Fang in view of Jin teachings by enhancing the quality of the generated surface normals. (Wang: Our overall objective motivation is to frame the single-view 3D problem so that the structure we know is captured and convolutional networks can do what they do best – learn strong mappings from visual data to labels. […] Each input network obtains strong performance by themselves, but by combining them, we obtain substantially better results, both quantitatively and qualitatively. Pg. 2, Section 3: Overview, par. 2)

Regarding claim 6: Fang in view of Jin and Wang teaches: The method of claim 1 (as shown above), further comprising calculating a geometric loss (Jin: T-shirt Total Loss, Pg. 92, Equation 4.4), using a loss function implemented by the neural network (Jin: Equation 4.1 and 4.2, Pg. 91, Section 4.9.2: Loss functions), based on the surface normals and the surface map (Jin: For the T-shirts, the total loss is Ltshirt = Limage + λLnormal, Pg. 92, Equation 4.4; see Note 6A); and generating the surface map using the neural network based further on the loss function (Jin: Fig. 4.20; see also Note 6A).
Note 6A: Figure 4.20 on Pg. 89 of Jin is illustrative with respect to the citations. Notably, the image loss is calculated based on the cloth images (surface maps) and the normal loss is calculated based on the output 3D shape. Figure 4.20 showcases that the total loss Ltotal generated is fed back into the convolutional neural network for training for further generating other surface maps. Regarding claim 9: Fang teaches: A system (see Note 1A) comprising: a synthesizing module implemented by the processing device to generate a two-dimensional output digital image (Fang: Figure 1, see Note 1B) using the surface map to configure a two-dimensional overlay object over the portion of the two-dimensional digital image on the surface (Fang: Fig. 8. Parts of the statue are not visible from its initial pose (a). New clusters are generated at a later moment (b) and advected with MAT to cover the whole surface (c). Pg. 7; see Note 1C) by warping the two-dimensional overlay object onto the two-dimensional digital image (Fang: warp an entire texture image onto the surface depicted in a single frame, Pg. 1, par. 6) according to the three-dimensional geometry of the object described by the surface map (Fang: Our spring model is […] a common approach in vision for fitting geometry to image constraints, Pg. 2-3, Section 2.2: Optical flow, par. 1; see Note 1D). 
Fang fails to teach: a normal estimation module implemented by a processing device to estimate surface normals of an object based on a portion of a two-dimensional digital image corresponding to the object; a machine learning system including a neural network implemented by the processing device to generate a surface map using the surface normals as input to infer flexible object features that include one or more wrinkles or folds, the surface map describing a three-dimensional surface geometry of the object including the flexible object features based on a UV map of the portion of the two-dimensional digital image relative to a surface of a three-dimensional model of the object with the flexible object features; and a synthesizing module implemented by the processing device to generate a two-dimensional output digital image using the surface map to configure a two-dimensional overlay object over the portion of the two-dimensional digital image by warping the two-dimensional overlay object onto the two-dimensional digital image according to the three-dimensional surface geometry of the object by simulating the flexible object features described by the surface map within the two-dimensional overlay object.

Jin teaches: a machine learning system including a neural network implemented by the processing device to generate a surface map using a neural network of a machine learning system (Jin: we describe training CNN models that take input pose parameters […] and predicts 256 × 256 displacement cloth images, emphasis added, Pg. 89, Section 4.9: Learning Cloth Images with CNNs) using the surface normals as input (Jin: one can add features to the cloth shape by painting in image space, especially using a blue brush that changes the offset values in the normal directions, emphasis added, Pg. 75, Section 4.4.1 Shape Modification via Image Editing; see Fig.
4.4) to infer flexible object features that include one or more wrinkles or folds (Jin: we predict cloth images […] These images represent offsets dx […] dx is parameterized in local geodesic coordinates u and v as well as the normal direction n in order to enable the representation of complex surfaces […] even small perturbations in offset directions [that] can lead to interesting structures, emphasis added, Pg. 79, Section 4.6: Network Considerations), the surface map describing a three-dimensional surface geometry of the object including the flexible object features (Jin: Figure 4.6: Left: part of a 3D cloth shape. Middle: cloth pixels embedded on the body mesh storing displacements dx as RGB values. Right: corresponding cloth image in the two dimensional pattern space. Pg. 77, Fig. 4.6; see also Note 1E) based on a UV map (Jin: Left: Triangle mesh depicted in texture space using the vertices' UV coordinates. Pg. 72, Fig. 4.1) of the portion of the two-dimensional digital image relative to a surface of a three-dimensional model of the object (Jin: displacements dx(u; v), Pg. 72, Section 4.3: Pixel-Based Cloth) with the flexible object features (Jin: Using UV space as the domain, each vertex stores a vector-valued function of displacements dx(u; v) = (Δu; Δv; Δn) representing perturbations in the texture coordinate and normal directions, emphasis added, Pg. 72, Section 4.3: Pixel based cloth); and

Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to combine the teachings of Jin with Fang.
Generating, by the processing device, a surface map using a neural network of a machine learning system, as in Jin, would benefit the Fang in view of Jin teachings by enabling accurate wrinkle generation of clothing when a template mesh is used: “This framework allows us to leverage standard texture mapping [117, 118, 113] as well as other common approaches, such as using bump maps [119] to perturb normal directions and displacement maps [120] to alter vertex positions; these techniques have been well-established over the years and have efficient implementations on graphics hardware enabling us to hijack and take advantage of the GPU-supported pipeline for optimized performance.” (Jin, Pg. 73, Section 4.3: Pixel-Based Cloth)

Fang in view of Jin teaches: a synthesizing module implemented by the processing device to generate a two-dimensional output digital image using the surface map to configure a two-dimensional overlay object over the portion of the two-dimensional digital image by warping the two-dimensional overlay object onto the two-dimensional digital image according to the three-dimensional surface geometry of the object by simulating the flexible object features described by the surface map within the two-dimensional overlay object (see Note 1F).

Fang in view of Jin fails to teach: a normal estimation module implemented by a processing device to estimate surface normals of an object based on a portion of a two-dimensional digital image corresponding to the object;

Wang teaches: a normal estimation module implemented by a processing device to estimate surface normals of an object (Wang: we use CNNs for the task of predicting surface normals from a single image, Abstract) based on a portion of a two-dimensional digital image corresponding to the object (Wang: The goal of this network is to capture the coarse structure, enabling the interpretation of ambiguous portions of the image which cannot be decoded by local evidence alone, Pg.
3, Section 4.2: Top-down Global Network); Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to combine the teachings of Wang with Fang in view of Jin. Directly generating surface normals with a CNN, as in Wang, would benefit the Fang in view of Jin teachings by enhancing the quality of the generated surface normals. (Wang: Our overall objective motivation is to frame the single-view 3D problem so that the structure we know is captured and convolutional networks can do what they do best – learn strong mappings from visual data to labels. […] Each input network obtains strong performance by themselves, but by combining them, we obtain substantially better results, both quantitatively and qualitatively. Pg. 2, Section 3: Overview, par. 2) Regarding claim 10: Fang in view of Jin and Wang teaches: The system of claim 9 (as shown above), further comprising: a texture transfer module implemented by the processing device to output a texture map different than the two-dimensional digital image (Fang: Rototexture also adds the ability to map an image onto the depiction of a surface in a photograph or video, demonstrated by the shirt’s Da Vinci image whose deformation follows the wrinkles, Pg. 2, Fig. 1; see Note 2A) as the two-dimensional overlay object (Fang: RotoTexture Mapping, creates a temporally coherent mapping of a texture image onto the depiction of a moving surface, such that the texture image continuously deforms to follow the changing undulations of the surface, Pg. 1, par. 5; see Note 10A). Note 10A: Fang teaches overlaying a texture map different than the two-dimensional image. In order to overlay the texture image, there must be a module that outputs an image to overlay onto the original two-dimensional digital image. 
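Notes 10A and 11A treat the texture-map and insertion-image cases as analogous sources for the two-dimensional overlay object. A toy sketch of the two module roles; the function names are hypothetical illustrations, not drawn from Fang or any cited reference:

```python
import numpy as np

def texture_transfer_module(pattern, region_shape):
    # Claim 10 style: output a texture map (here, a tiled swatch) sized
    # to the target region, distinct from the photograph itself.
    reps = (-(-region_shape[0] // pattern.shape[0]),
            -(-region_shape[1] // pattern.shape[1]))
    return np.tile(pattern, reps)[:region_shape[0], :region_shape[1]]

def insertion_module(insertion_image):
    # Claim 11 style: pass a standalone insertion digital image through
    # as the overlay object; both cases then feed the same warping step.
    return np.asarray(insertion_image, dtype=float)

rng = np.random.default_rng(2)
swatch = rng.random((5, 7))
overlay_a = texture_transfer_module(swatch, (32, 32))  # tiled texture map
overlay_b = insertion_module(rng.random((32, 32)))     # decal image
```

Either output can then serve as the overlay object, consistent with the Examiner's position that a texture map is analogous to a digital image.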
Regarding claim 11: Fang in view of Jin and Wang teaches: The system of claim 9 (as shown above), further comprising: an insertion module implemented by the processing device to output an insertion digital image different than the two-dimensional digital image (Fang: Rototexture also adds the ability to map an image onto the depiction of a surface in a photograph or video, demonstrated by the shirt’s Da Vinci image whose deformation follows the wrinkles, Pg. 2, Fig. 1; see Note 2A) as the two-dimensional overlay object (Fang: RotoTexture Mapping, creates a temporally coherent mapping of a texture image onto the depiction of a moving surface, such that the texture image continuously deforms to follow the changing undulations of the surface, Pg. 1, par. 5; see Note 10A and Note 11A).

Note 11A: The Examiner notes that the amended claim 11 is similar to amended claim 10, save for the overlay object containing an insertion digital image rather than a 2D texture map. The Examiner submits that one of ordinary skill in the art would consider a texture map analogous to a digital image.

Regarding claim 27: Fang in view of Jin and Wang teaches: The method of claim 2 (as shown above), wherein the object represents a flexible article of clothing or a flexible garment depicted by the two-dimensional digital image (Fang: Rototexture also adds the ability to map an image onto the depiction of a surface in a photograph or video, demonstrated by the shirt’s Da Vinci image whose deformation follows the wrinkles, Pg. 2, Fig. 1; see Note 27A).

Note 27A: Fang showcases that the shirt in Fig. 1(a) may have an overlay decal (the Da Vinci image) projected onto it.

Regarding claim 28: Fang in view of Jin and Wang teaches: The method of claim 27 (as shown above), wherein the insertion two-dimensional digital image represents a two-dimensional decal of a logo or text (Fang: Fig. 1 and Fig. 4; see Note 27A and Note 28A).
Note 28A: It would be obvious to one of ordinary skill in the art to switch out the Da Vinci decal showcased in Fig. 1 for a logo or text, as Fang showcases in Fig. 4 that Rototexture works with text (i.e., the text seen within the Vitruvian Man decal). Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Fang (NPL: RotoTexture: Automated Tools for Texturing Raw Video) in view of Jin (NPL: NOVEL REPRESENTATIONS FOR 3D CLOTH SIMULATION), Wang (NPL: Designing Deep Networks for Surface Normal Estimation), and Jones (US 20210335039 A1). Fang in view of Jin and Wang teaches: The method of claim 1 (as shown above), further comprising: Fang in view of Jin and Wang fails to teach: determining, by the processing device, that the object belongs to a category of objects; and rendering, by the processing device, a mesh template associated with the category of objects as the three-dimensional model by adapting the mesh template to model the flexible object features. Jones teaches: determining, by the processing device, that the object belongs to a category of objects (Jones: determining a category of the object based on the 2D image using image matching [0007]); and rendering (Jones: In some implementations, a character is implemented as a 3D model and includes a surface representation used to draw the character (also known as a skin or mesh) [0086]), by the processing device, a mesh template associated with the category of objects as the three-dimensional model (Jones: The 3D mesh for the object is obtained by deforming the template 3D mesh [0007]). Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to combine the teachings of Jones with Fang in view of Jin and Wang. 
Determining, by the processing device, that the object belongs to a category of objects; and rendering, by the processing device, a mesh template associated with the category of objects as the three-dimensional object model, as in Jones, would benefit the Fang in view of Jin and Wang teachings by enabling the system to generate an accurate depiction of the object without having access to a mesh that replicates the specific details and dimensions of the object.

Fang in view of Jin, Wang, and Jones teaches: rendering, by the processing device, a mesh template associated with the category of objects as the three-dimensional model by adapting the mesh template to model the flexible object features (see Note 8A).

Note 8A: Jin teaches that they “shrink wrap a cloth mesh onto the underlying body shape, viewing the resulting shrink-wrapped vertex locations as pixels containing RGB values that represent displacements of the shrinkwrapped cloth vertices from their pixel locations in texture and normal coordinates,” (Pg. 68, par. 1). That is, given a cloth mesh, it is aligned to a body mesh and then displacements are applied, which manifest as folds and wrinkles, as seen in Figure 4.5 of Jin. Similarly, Jones teaches that a “3D mesh for the object is obtained by deforming the template 3D mesh” [0007]. Jones further teaches that “objects may include a part, model, character, […] clothing, […] components of the aforementioned (e.g., windows of a building), and so forth.” emphasis added, [0079]. Therefore, it would be obvious to one of ordinary skill in the art to obtain a template 3D mesh of clothing and render the clothing as a 3D model by adapting the mesh template to model the flexible object features.

Claims 12, 14, and 15 are rejected under 35 U.S.C.
103 as being unpatentable over Fang (NPL: RotoTexture: Automated Tools for Texturing Raw Video) in view of Jin (NPL: NOVEL REPRESENTATIONS FOR 3D CLOTH SIMULATION), Wang (NPL: Designing Deep Networks for Surface Normal Estimation), and Chen (US 20230274492 A1). Regarding claim 12: Fang teaches: A method comprising: selecting, by a processing device (see Note 1A), a portion of a two-dimensional (Fang: Rototexture also adds the ability to map an image onto the depiction of a surface in a photograph, Pg. 2, Fig. 1) digital image corresponding to an object (Fang: RotoTexture Synthesis, maintains a temporally coherent collection of surface patches that allow TextureShop to texture the surface depicted by each frame such that the texture continuously follows the moving undulations of the surface, Pg. 1, par. 5); obtaining, by the processing device, annotation data indicating a plurality of anchor points at respective pixel locations in the two-dimensional digital image corresponding to the plurality of anchor points (Fang: It is therefore necessary to fix the position and orientation of the image on the surface through the identification and tracking of a minimal collection of surface feature points, Pg. 4, Section 3.3: Feature Points, par. 1, see Note 12A); using, by the processing device, the surface map to configure a two-dimensional overlay object over the portion of the two-dimensional digital image (Fang: Fig. 8. Parts of the statue are not visible from its initial pose (a). New clusters are generated at a later moment (b) and advected with MAT to cover the whole surface (c). Pg. 7; see Note 1C) by warping the two-dimensional overlay object onto the two-dimensional digital image (Fang: warp an entire texture image onto the surface depicted in a single frame, Pg. 1, par. 6) according to the three-dimensional geometry of the object described by the surface map (Fang: Our spring model is […] a common approach in vision for fitting geometry to image constraints, Pg. 
2-3, Section 2.2: Optical flow, par. 1; see Note 1D). Note 12A: Figure 4 of Fang showcases the locations of various control points on the input image. As the input image is composed of pixels, each control point will have a corresponding pixel location. Jin teaches: training, by the processing device, a neural network of a machine learning system (Jin: CNN architecture, Pg. 89, Fig. 4.20) to infer flexible object features that include one or more wrinkles or folds (Jin: we predict cloth images […] These images represent offsets dx […] dx is parameterized in local geodesic coordinates u and v as well as the normal direction n in order to enable the representation of complex surfaces […] even small perturbations in offset directions [that] can lead to interesting structures, emphasis added, Pg. 79, Section 4.6: Network Considerations) and Fang in view of Jin fails to teach: generating, by the processing device, a normal map indicating surface normals of the object estimated based on the portion of the two-dimensional digital image; training, by the processing device, a neural network of a machine learning system to predict a surface map of the object based on the two-dimensional digital image, the annotation data, and the normal map, the surface map describing a three-dimensional surface geometry of the object including the flexible object features based on a UV map of the portion of the two-dimensional digital image relative to a surface of a three-dimensional model of the object with the flexible object features; and Wang teaches: generating, by the processing device, a normal map indicating surface normals of the object estimated based on the portion of the two-dimensional digital image (Wang: we use CNNs for the task of predicting surface normals from a single image, Abstract; see Note 12B); Note 12B: Wang shows output normal map examples in Figure 1. In Figure 2 on Pg. 
3, Wang showcases that their network may also predict a “structured local patch from a part of the image” and fuse the patch into the final image. Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to combine the teachings of Wang with Fang in view of Jin. Directly generating surface normals with a CNN, as in Wang, would benefit the Fang in view of Jin teachings by enhancing the quality of the generated surface normals. (Wang: Our overall objective motivation is to frame the single-view 3D problem so that the structure we know is captured and convolutional networks can do what they do best – learn strong mappings from visual data to labels. […] Each input network obtains strong performance by themselves, but by combining them, we obtain substantially better results, both quantitatively and qualitatively. Pg. 2, Section 3: Overview, par. 2) Fang in view of Jin and Wang still fails to teach: training, by the processing device, a neural network of a machine learning system to predict a surface map of the object based on the two-dimensional digital image, the annotation data, and the normal map, the surface map describing a three-dimensional surface geometry of the object including the flexible object features based on a UV map of the portion of the two-dimensional digital image relative to a surface of a three-dimensional model of the object with the flexible object features; and Chen teaches: training, by the processing device, a neural network of a machine learning system to predict a surface map of the object based on the two-dimensional digital image (Chen: a neural network can be trained to predict the UV mapping and the texture image jointly, resulting in high quality texture synthesis without needing to conform to a pre-defined shape topology [0028]), the annotation data (Chen: AI-assisted annotation 1310 may be used to aid in generating annotations corresponding to imaging data 
[0121]), and the normal map (Chen: in addition to a color map 326, the alignment module can also output a normal map 328 and a coordinate map 330, for example, where the normal map 328 predicts the unit normals for input points [0036]), the surface map describing a surface geometry of the object with the flexible object features (Chen: The visual features of the feature image 100 can be used to generate one or more two-dimensional representations, such as a 2D front texture 104 and a 2D back texture 106 (or other first and second textures) [0026]) including a UV map that maps the portion of the two-dimensional digital image relative to a surface of a three-dimensional model of the object with the flexible object features (Chen: When these deformed front and back textures are combined, the result is a 3D synthesized result 110 that is a representation of a new object [0026]; see Note 12C). Note 12C: When the teachings of Chen are combined with Fang, Jin, and Wang, the surface of the three-dimensional object will include flexible object features, as described in Note 1E above. Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to combine the teachings of Chen with Fang in view of Jin and Wang. 
Training, by the processing device, a neural network of a machine learning system to predict a surface map of the object based on the digital image, the annotation data, and the normal map, the surface map describing a surface geometry of the object including a UV map that maps the portion of the digital image to a surface of a three-dimensional object model of the object, as in Chen, would benefit the Fang in view of Jin and Wang teachings by enhancing the detail of textures on the generated object model: “Previous work on texture generation mostly relies on warping a spherical mesh template to the target shape, […] Although texture fields were successfully applied to multi-view image reconstruction, they have primarily been used for fitting a single object or scene. Generative models used for such purposes also usually suffer from overly smoothed output textures.” (Chen, [0002]). Regarding claim 14: Fang in view of Jin, Wang, and Chen teaches: The method of claim 12 (as shown above), wherein the selecting the portion of the two-dimensional digital image includes generating an image mask having an unmasked region that overlaps the portion of the two-dimensional digital image (Fang: RotoTexture Synthesis advects image clusters corresponding to surface patches and retextures these clusters at each frame of the video, Pg. 3, Fig. 2; see Note 14A). Note 14A: Fang teaches that segments overlapping the image that are part of a “cluster” are retextured. Generating these clusters is analogous to generating an image mask, as the clusters define which regions are retextured and which are not. Both regions (i.e., the regions that have clusters and the regions that do not) overlap the two-dimensional digital image. 
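The mask-overlap reading in Note 14A can be illustrated with a minimal sketch, assuming a toy RGB image and a hand-placed unmasked region; the array sizes and the `select_portion` helper are illustrative and not drawn from Fang or the claims:

```python
import numpy as np

def select_portion(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Zero out every pixel outside the unmasked (True) region of the mask."""
    assert image.shape[:2] == mask.shape
    return image * mask[..., None]

image = np.full((4, 4, 3), 255, dtype=np.uint8)  # toy 4x4 RGB image
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True  # unmasked region overlapping the object portion
portion = select_portion(image, mask)
print(int(portion.sum()))  # 4 unmasked pixels x 3 channels x 255 = 3060
```

The unmasked `True` region plays the role the examiner assigns to Fang's clusters: it defines which pixels of the two-dimensional digital image are selected for further processing.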
Regarding claim 15: Fang in view of Jin, Wang, and Chen teaches: The method of claim 14 (as shown above), further comprising estimating, by the processing device, the respective pixel locations that correspond to the plurality of anchor points (Fang: Let Fk be a feature point and let Xk be its corresponding control point, see Note 15A) based on a rendering of the three-dimensional model of the object with the flexible object features (see Note 15A). Note 15A: In Figure 3 of Fang, the underlying surface exhibits displacement, similar to the displacement seen in Figure 4.1 of Jin (Pg. 72, Section 4.3 Pixel-Based Cloth). Fang teaches that they determine points “Xi” based on a rendering of the model. Because Jin teaches that their cloth model may be augmented with displacement data, one of ordinary skill in the art would be motivated to combine the 3D model rendering of Jin with the pixel determinations of Fang. In other words, when Fang teaches “identifying a control node in our mesh”, one of ordinary skill in the art who has combined the teachings of Jin with Fang would utilize the final cloth model of Jin that was augmented by the displacements (e.g., the cloth model of Figure 4.5 of Jin). Fang later teaches that the positions Xi each have a corresponding feature point or “anchor point”: “Let Fk be a feature point and let Xk be its corresponding control point.” (Pg. 4, Section 3.3: Feature Points, par. 2). Therefore, the Examiner understands Fang to teach determining pixel locations that correspond to the plurality of feature points based on a rendering of the model of the object. Claims 13 and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Fang (NPL: RotoTexture: Automated Tools for Texturing Raw Video) in view of Jin (NPL: NOVEL REPRESENTATIONS FOR 3D CLOTH SIMULATION), Wang (NPL: Designing Deep Networks for Surface Normal Estimation), Chen (US 20230274492 A1), and Liao (US 20230306673 A1). 
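The Note 15A mapping, in which pixel locations of anchor points are estimated from a rendering of the 3D model, can be sketched as a simple pinhole projection of model vertices to pixel coordinates. The intrinsics (`fx`, `fy`, `cx`, `cy`) and the three camera-space anchor points below are hypothetical and not taken from Fang or Jin:

```python
import numpy as np

def project_anchor_points(vertices, fx=100.0, fy=100.0, cx=64.0, cy=64.0):
    """Perspective-project Nx3 camera-space anchor points to Nx2 pixel locations."""
    v = np.asarray(vertices, dtype=float)
    u = fx * v[:, 0] / v[:, 2] + cx  # horizontal pixel coordinate
    w = fy * v[:, 1] / v[:, 2] + cy  # vertical pixel coordinate
    return np.stack([u, w], axis=1)

# Three anchor points on a rendered mesh, 2 units in front of the camera.
anchors_3d = [(0.0, 0.0, 2.0), (0.5, 0.0, 2.0), (0.0, -0.5, 2.0)]
pixels = project_anchor_points(anchors_3d)
print(pixels)  # pixel locations (64, 64), (89, 64), and (64, 39)
```

Each 3D anchor point thus lands at a definite pixel location in the rendered two-dimensional image, which is the correspondence the Examiner reads onto Fang's feature/control points.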
Regarding claim 13: Fang in view of Jin, Wang, and Chen teaches: The method of claim 12 (as shown above), further comprising: adapting, by the processing device, a mesh template associated with a category of objects to model the flexible object features (Jin: “shrink wrap a cloth mesh onto the underlying body shape, viewing the resulting shrink-wrapped vertex locations as pixels containing RGB values that represent displacements of the shrinkwrapped cloth vertices from their pixel locations in texture and normal coordinates,” Pg. 68, par. 1; see Note 13A), wherein the object depicted in the portion of the two-dimensional digital image belongs to the category of objects (see Note 13A); rendering, by the processing device, the mesh template to generate the three-dimensional model of the object with the flexible object features (see Note 13B); and Note 13A: On Pg. 77, Jin teaches that: “we start in a rest pose and uniformly shrink the edges of the cloth mesh making it skin-tight on the body. […] this preprocessing step is only done once, and moreover can be accomplished on a template mesh”. That is, the adapting of the cloth mesh may be performed based on a mesh template. Furthermore, Jin teaches: “Once we have constructed generic models for each broad category of body shape and garment, we would further address fine grained personalization in order to obtain customized results for each user given their specific body measurements and preferences.” In other words, the garment and body mesh may be templates that are adapted based on the category of garments and bodies respectively. Note 13B: In Note 13A it was discussed that Jin may adapt a template mesh of bodies and/or garments. In Note 1E, it was shown that Jin also may generate a three-dimensional model of the object with flexible object features. Furthermore, because Jin showcases multiple images of the completed cloth shape (for example, Figure 4.17 on Pg. 
86), it would be obvious to one of ordinary skill in the art to render, by the processing device, said completed cloth shape. Fang in view of Jin, Wang, and Chen fails to teach: defining, by the processing device, the plurality of anchor points based on the rendering of the mesh template. Liao teaches: defining, by the processing device, the plurality of anchor points based on the rendering of the mesh template (Liao: Those skilled in the art will appreciate that the keypoints 402 on a training clothing image 312 are annotated or labelled (for example, manually or as a result of previous performing of the rendering subprocess 304), emphasis added, [0093]). Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to combine the teachings of Liao with Fang in view of Jin, Wang, and Chen. Defining, by the processing device, the plurality of anchor points based on the rendering of the mesh template, as in Liao, would benefit the Fang in view of Jin, Wang, and Chen teachings because it enhances realism of the generated output while also remaining convenient for the user: “For example, in the methods disclosed in References [1] and [2], the synthesized persons in existing datasets are different from realistic persons because synthesized persons are mostly cartoon-like and dress in random collocation. The method disclosed in Reference [3] requires the input of both the front-view and back-view images of the clothing, which is inconvenient. Other methods such as CMR and HPBTT (see References [5] and [6]) are based on generative models, which usually result in blurred texture and artifacts.” [0005]. Regarding claim 25: Fang in view of Jin, Wang, Chen, and Liao teaches: The method of claim 13, wherein the defining the plurality of anchor points relative to positions along edges of the three-dimensional model of the object (Liao, see Fig. 6A and Fig. 
7A; see also Note 25A) that include the flexible object features (see Note 1E). Note 25A: In Figure 6A and 7A, Liao showcases that the keypoints may be defined along the edge of various clothing. Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Fang (NPL: RotoTexture: Automated Tools for Texturing Raw Video) in view of Jin (NPL: NOVEL REPRESENTATIONS FOR 3D CLOTH SIMULATION), Wang (NPL: Designing Deep Networks for Surface Normal Estimation), Chen (US 20230274492 A1), and Liao (US 20230306673 A1). Regarding claim 16: Fang in view of Jin, Wang, and Chen teaches: The method of claim 12, wherein: the obtaining the annotation data includes annotating, by the processing device, the two-dimensional digital image to indicate the portion of the two-dimensional digital image and the respective pixel locations of the plurality of anchor points (see Note 12A and Note 16A), and Note 16A: In Figure 4, Fang showcases that control points may be identified that indicate a portion of the image that will be warped. Fang in view of Jin, Wang, and Chen fails to teach: the training the machine learning system is further based on the two-dimensional digital image after the annotating. Liao teaches: the training the machine learning system is further based on the two-dimensional digital image after the annotating (Liao: Those skilled in the art will appreciate that the keypoints 402 on a training clothing image 312 are annotated or labelled (for example, manually or as a result of previous performing of the rendering subprocess 304) before being used for training [0093]). Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to combine the teachings of Liao with Fang in view of Jin, Wang, and Chen. 
Training the machine learning system further based on the two-dimensional digital image after the annotating, as in Liao, would benefit the Fang in view of Jin, Wang, and Chen teachings by enhancing the realism of the generated output while also remaining convenient for the user: “For example, in the methods disclosed in References [1] and [2], the synthesized persons in existing datasets are different from realistic persons because synthesized persons are mostly cartoon-like and dress in random collocation. The method disclosed in Reference [3] requires the input of both the front-view and back-view images of the clothing, which is inconvenient. Other methods such as CMR and HPBTT (see References [5] and [6]) are based on generative models, which usually result in blurred texture and artifacts.” [0005]. Claims 18, 19, and 26 are rejected under 35 U.S.C. 103 as being unpatentable over Fang (NPL: RotoTexture: Automated Tools for Texturing Raw Video) in view of Jin (NPL: NOVEL REPRESENTATIONS FOR 3D CLOTH SIMULATION), Wang (NPL: Designing Deep Networks for Surface Normal Estimation), Chen (US 20230274492 A1), and Zatepyakin (US 20180181840 A1). Regarding claim 18: Fang in view of Jin, Wang, and Chen teaches: The method of claim 12, further comprising Fang in view of Jin, Wang, and Chen fails to teach: calculating an anchor loss, using a loss function implemented by the neural network of the machine learning system, based on reference locations of the plurality of anchor points in a reference surface map and corresponding locations of the plurality of anchor points in the surface map, wherein the training the machine learning system is further based on the loss function and the anchor loss. Zatepyakin teaches: calculating an anchor loss, using a loss function implemented by the neural network of the machine learning system (Zatepyakin: To reduce noise in training, a Huber cost function may be used. 
The Huber cost function is a loss function [0050]), based on reference locations of the plurality of anchor points in a reference surface map and corresponding locations of the plurality of anchor points in the surface map (Zatepyakin: An error is defined as a difference between a current shape and a target shape, which may be measured by the difference in labeled anchor points and the predicted anchor points after applying a shape modification from a set of trees [0048]), wherein the training the machine learning system is further based on the loss function and the anchor loss (Zatepyakin: To train the models, the facial anchor points predicted by the prediction model are compared to the labeled anchor points, and the error is evaluated by the Huber cost function. The models are adjusted to reduce the error defined by the cost function, [0050]; see Note 18A). Note 18A: Zatepyakin uses the error, which is calculated based on the loss function (the Huber cost function), to train the model. Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to combine the teachings of Zatepyakin with Fang in view of Jin, Wang, and Chen. Calculating an anchor loss, using a loss function implemented by the neural network of the machine learning system, based on reference locations of the plurality of anchor points in a reference surface map and corresponding locations of the plurality of anchor points in the surface map, as in Zatepyakin, would benefit the Fang in view of Jin, Wang, and Chen teachings by preventing outlier predictions caused by the anchor points. Regarding claim 19: Fang in view of Jin, Wang, Chen, and Zatepyakin teaches: The method of claim 18 (as shown above), further comprising calculating a geometric loss (Jin: T-shirt Total Loss, Pg. 92, Equation 4.4), using the (see Note 19B) loss function implemented by the neural network (Jin: Equation 4.1 and 4.2, Pg. 
91, Section 4.9.2: Loss functions), based on the surface map (Jin: For the T-shirts, the total loss is Ltshirt = Limage + λLnormal, Pg. 92, Equation 4.4; see Note 6A) and the normal map (see Note 19A); wherein the training the machine learning system is further based on the geometric loss (Jin: Fig. 4.20; see also Note 6A). Note 19A: The Examiner notes that claim 19 is similar to claim 6, except that the loss function is based on a “normal map” instead of “surface normals”. The Examiner submits that one of ordinary skill in the art would consider a normal map analogous to surface normals, because a normal map essentially describes surface normals that are stored within a container (such as a digital image). Such a normal map is showcased in Figure 1 of Wang. Note 19B: Jin shows that one of ordinary skill in the art could add loss functions together to obtain one loss function, as cited above. It would be obvious to one of ordinary skill in the art, when combining the teachings of Zatepyakin with Jin, to train using one loss function that adds the geometric and anchor losses. Therefore, it would be obvious to utilize the same loss function for calculating the geometric and anchor loss. Regarding claim 26: Fang in view of Jin, Wang, Chen, and Zatepyakin teaches: The method of claim 19, wherein the surface map is predicted by the neural network by: enforcing a geometric consistency (Zatepyakin: The Huber cost function is able to reduce noise and jitter in prediction by an order of magnitude. emphasis added, [0050], see Note 26A) between the surface normals, the plurality of anchor points, and the three-dimensional surface geometry of the object including the flexible object features (see Note 1E) described by the surface map by training the neural network based on the geometric loss and the anchor loss calculated from the loss function of the neural network (see Note 19B). 
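The single combined loss described in Note 19B above, summing an image loss, a weighted normal-map loss (in the spirit of Jin's Ltshirt = Limage + λLnormal, Equation 4.4), and a Huber anchor loss over anchor-point locations (in the spirit of Zatepyakin, [0050]), can be sketched as follows. The weight `lam`, the Huber threshold `delta`, and all function names are illustrative assumptions, not taken from the references:

```python
import numpy as np

def huber(error, delta=1.0):
    """Huber cost: quadratic near zero, linear for large errors, dampening outliers."""
    e = np.abs(error)
    return float(np.where(e <= delta, 0.5 * e**2, delta * (e - 0.5 * delta)).sum())

def total_loss(pred_img, ref_img, pred_normals, ref_normals,
               pred_anchors, ref_anchors, lam=0.5):
    image_loss = float(np.mean((pred_img - ref_img) ** 2))          # L_image
    normal_loss = float(np.mean((pred_normals - ref_normals) ** 2))  # L_normal
    anchor_loss = huber(np.asarray(pred_anchors) - np.asarray(ref_anchors))
    return image_loss + lam * normal_loss + anchor_loss

print(huber(np.array([0.5, 2.0])))  # 0.5*0.5**2 + 1.0*(2.0 - 0.5) = 1.625
```

Training against one such summed scalar is what the Examiner relies on in Note 19B: a single loss function that carries both the geometric terms and the anchor term.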
Note 26A: One of ordinary skill in the art would understand that training via a loss function that takes into account the anchor points, surface geometry, and surface map would inherently enforce a geometric consistency. To elaborate further, the Examiner understands “geometric consistency” to be a state where the neural network does not predict outlier values, such as the kind discussed by Zatepyakin in [0050]. By utilizing a loss function, the neural network “learns” to not predict outliers that would otherwise compromise the output data. Therefore, utilizing the loss function as described in Note 19B would prevent outlier generation between the normal, anchor points, and geometry and therefore “enforce a geometric consistency”. Conclusion The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: Liao (NPL: Cloning Outfits from Real-World Images to 3D Characters for Generalizable Person Re-Identification) corresponds to the US patent publication US 20230306673 A1 cited in this action. Liu et al. (NPL: Spatial-Aware Texture Transformer for High-Fidelity Garment Transfer) teaches a “novel coordinate-prior map that defines the spatial relationship between the coordinates in the UV texture map” (Abstract). Cushen et al. (NPL: Markerless Real-Time Garment Retexturing from Monocular 3D Reconstruction) teaches “a practical and photorealistic method for reconstruction of cloth geometry and retexturing for augmented reality in a non-lab environment” (Pg. 1, par. 3). Guo et al. (NPL: Mesh-Guided Optimized Retexturing for Image and Video) teaches a “novel approach for replacing textures of specified regions in the input image and video using stretch-based mesh optimization” (Abstract). Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). 
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. Any inquiry concerning this communication or earlier communications from the examiner should be directed to VINCENT ALEXANDER PROVIDENCE whose telephone number is (571)270-5765. The examiner can normally be reached Monday-Thursday 8:30-5:00. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, King Poon can be reached on (571)270-0728. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. 
For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /VINCENT ALEXANDER PROVIDENCE/Examiner, Art Unit 2617 /KING Y POON/Supervisory Patent Examiner, Art Unit 2617

Prosecution Timeline

May 12, 2023
Application Filed
Feb 05, 2025
Non-Final Rejection — §103
Mar 18, 2025
Interview Requested
Mar 26, 2025
Applicant Interview (Telephonic)
Mar 26, 2025
Response Filed
Mar 26, 2025
Examiner Interview Summary
Apr 23, 2025
Final Rejection — §103
Jul 01, 2025
Interview Requested
Jul 17, 2025
Examiner Interview Summary
Jul 17, 2025
Applicant Interview (Telephonic)
Jul 28, 2025
Request for Continued Examination
Jul 30, 2025
Response after Non-Final Action
Aug 19, 2025
Non-Final Rejection — §103
Nov 06, 2025
Interview Requested
Nov 13, 2025
Applicant Interview (Telephonic)
Nov 13, 2025
Examiner Interview Summary
Nov 21, 2025
Response Filed
Feb 05, 2026
Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12586303
GEOMETRY-AWARE THREE-DIMENSIONAL SYNTHESIS IN ALL ANGLES
2y 5m to grant Granted Mar 24, 2026
Patent 12530847
IMAGE GENERATION FROM TEXT AND 3D OBJECT
2y 5m to grant Granted Jan 20, 2026
Patent 12530808
Predictive Encoding/Decoding Method and Apparatus for Azimuth Information of Point Cloud
2y 5m to grant Granted Jan 20, 2026
Patent 12524946
METHOD FOR GENERATING FIREWORK VISUAL EFFECT, ELECTRONIC DEVICE, AND STORAGE MEDIUM
2y 5m to grant Granted Jan 13, 2026
Patent 12380621
COMPUTER-IMPLEMENTED SYSTEMS AND METHODS FOR GENERATING ENHANCED MOTION DATA AND RENDERING OBJECTS
2y 5m to grant Granted Aug 05, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

5-6
Expected OA Rounds
83%
Grant Probability
99%
With Interview (+25.0%)
2y 5m
Median Time to Grant
High
PTA Risk
Based on 18 resolved cases by this examiner. Grant probability derived from career allow rate.
