Prosecution Insights
Last updated: April 19, 2026
Application No. 18/780,264

SHAPE RECONSTRUCTION AND EDITING USING ANATOMICALLY CONSTRAINED IMPLICIT SHAPE MODELS

Status: Non-Final OA (§102)
Filed: Jul 22, 2024
Examiner: SOON, DAVID W
Art Unit: 2615
Tech Center: 2600 — Communications
Assignee: Disney Enterprises Inc.
OA Round: 1 (Non-Final)
Grant Probability: Favorable
Expected OA Rounds: 1-2
Expected Time to Grant: 2y 9m

Examiner Intelligence

Career Allow Rate: 0% (0 granted / 0 resolved; -62.0% vs TC avg)
Interview Lift: +0.0% (minimal lift; based on resolved cases with interview)
Avg Prosecution: 2y 9m (typical timeline)
Career History: 8 total applications across all art units; 8 currently pending

Statute-Specific Performance

§101: 5.3% (-34.7% vs TC avg)
§103: 52.6% (+12.6% vs TC avg)
§102: 10.5% (-29.5% vs TC avg)
§112: 31.6% (-8.4% vs TC avg)

Tech Center averages are estimates. Based on career data from 0 resolved cases.

Office Action

§102
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement

The information disclosure statements filed on 8/22/24, 1/15/25, and 2/10/26 were considered.

Claim Rejections - 35 U.S.C. § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-3, 7-12, and 15-20 are rejected under 35 U.S.C. 102(a) as being anticipated by Bailey et al. (US 20220262073 A1), hereinafter referred to as Bailey.
Regarding Claim 1, Bailey teaches a computer-implemented method for fitting a shape model for an object to a set of constraints associated with a target shape, the method comprising:

determining, based on the set of constraints (Bailey [0163] “input image”), one or more ground truth positions (Bailey [0163] “facial landmarks”) of one or more points on the target shape (Bailey [0106] “deformed mesh output”); (Bailey [0163] “The facial landmarks are detected on the input image.”)

generating, via execution of a set of neural networks, a set of fitting parameters (Bailey [0064] “For example, the rig parameters 302 may include inputs for controlling movement in the lip, eye brow, nose, etc.”) associated with the one or more points; (Bailey [0163] “To animate the mesh, the movement of the detected landmarks in the recording was tracked and the IK model was used to estimate the rig parameters required to match the new landmark configuration.”)

computing, via the shape model, one or more predicted positions (Bailey [0065] “final vertex positions”) of the one or more points based on the set of fitting parameters; (Bailey [0060] “According to at least one embodiment, a method of approximating utilizes artist-created texture coordinates U∈R.sup.n×2 of the facial mesh. The approximation may rely on CNNs, which generate deformation maps given input rig parameters.” [0065] “For each of the deformation maps, vertex offsets may be extracted by performing interpolation of the deformation map at each vertex position in texture coordinate space.” [0076] “With continued reference to FIG. 3, the coarse approximation computes the final vertex positions for the mesh by adding (e.g., at adder 310) the vertex offsets to the vertices of the neutral pose of the mesh”)

training the set of neural networks based on one or more losses associated with the one or more predicted positions and the one or more ground truth positions (Bailey [0163] “The rig parameters are then passed to a disclosed approximation model to produce the deformed target mesh.” [0077] “Given the approximation model, a loss function is defined to find the optimal model parameters θ.sub.k. According to at least one embodiment, a loss function is proposed that penalizes both inaccuracies in approximated vertex positions, as well as inaccuracies in face normals on the mesh. Given a target mesh V, and approximated vertex offsets Δ, the loss function may be defined in Equation (1)” [0081] “Because the coarse approximation model works on separate mesh segments, the model could produce discontinuities across boundaries and/or seams between particular mesh segments. To address (e.g., minimize) this potential problem, the error function (e.g., the loss function of Equation (1)) may strongly penalize inaccurate face normals, to encourage smooth (or smoother) results along mesh segment boundaries. Penalizing normal errors also suppresses low-amplitude, high-frequency errors that may be visually disturbing or distracting.” Examiner’s Note: To penalize losses, the predicted positions must be compared to the ground truth positions.); and

generating, via execution of the trained set of neural networks, a three-dimensional (3D) model (Bailey [0185] “mesh deformation”) corresponding to the target shape.
(Bailey [0180] “a second plurality of deformation maps is generated by applying a second plurality of neural network-trained models” [0182] “a second plurality of vertex offsets is extracted based on the second plurality of deformation maps.” [0185] “The second plurality of vertex offsets is applied to the neutral mesh by applying the second plurality of vertex offsets to values of at most a subset of the vertices of the neutral mesh to generate the mesh deformation. The subset of the vertices of the neutral mesh may correspond to one or more regions of the neutral mesh exhibiting a level of approximation error that is above a particular threshold.”)

Regarding Claim 2, Bailey teaches the computer-implemented method of claim 1, wherein determining the one or more ground truth positions comprises: deforming a template mesh (Bailey [0110] “deformed mesh”) to match the target shape and determining the one or more ground truth positions (Bailey [0110] “landmark vertices”) of the one or more points in the template mesh. (Bailey [0110] “the positions of the landmark vertices … in the deformed mesh were gathered.” Examiner’s Note: The rig function (Bailey [0059]) maps the rig parameters to the deformed mesh, and the rig function is based off the “character's facial rig with a polygonal mesh,” meaning the deformed mesh corresponds to the target shape, the input image.)

Regarding Claim 3, Bailey teaches the computer-implemented method of claim 1, wherein the one or more ground truth positions comprise a set of two-dimensional (2D) positions of a set of landmarks associated with an image of the target shape.
(Bailey [0163] “The facial landmarks are detected on the input image.” [0034] “While this disclosure is presented in the context of 3D animation applications, it is not limited thereto, and other implementations of the systems, media, and methods described herein are contemplated, including deformation of geometric models within a 2D or 3D coordinate system, as well as for various interactive geometric modeling applications involving production and modification of geometric models, including, but not limited to, rigging, animation, architecture, automotive design, consumer product design, virtual reality applications, augmented reality applications, game development, visual effects, 3D printing, and the like. Any reference in this disclosure to a geometric model or components thereof, within a 3D model or 3D space will be understood to include the disclosure as applicable to 2D models and 2D spaces.” Examiner Note: This embodiment shows these landmarks can be considered in either a 3D or 2D space.)

Regarding Claim 7, Bailey teaches the computer-implemented method of claim 1, wherein the one or more losses comprise one or more distances between the one or more predicted positions (Bailey [0076] “With continued reference to FIG. 3, the coarse approximation computes the final vertex positions for the mesh by adding (e.g., at adder 310) the vertex offsets to the vertices of the neutral pose of the mesh 312.” [0077] “Given the approximation model, a loss function is defined to find the optimal model parameters θ.sub.k. According to at least one embodiment, a loss function is proposed that penalizes both inaccuracies in approximated vertex positions, as well as inaccuracies in face normals on the mesh.”) and the one or more ground truth positions.
(Bailey [0066] “According to at least one embodiment, a loss function is proposed that penalizes both inaccuracies in approximated vertex positions, as well as inaccuracies in face normals on the mesh.” Examiner Note: Inaccuracies in the approximated vertex positions (which are predicted positions) are due to a comparison of the distance between the predicted and ground truth positions.)

Regarding Claim 8, Bailey teaches the computer-implemented method of claim 1, wherein the one or more losses comprise a coefficient regularization loss associated with a set of blending coefficients included in the set of fitting parameters. (Bailey [0114] “r(p) … may denote the rig function that maps the rig parameters p to the subset of vertices V_C.” [0117] “The loss function used to train the model contains both a point-matching component to ensure that the deformed mesh closely matches the control points, as well as a regularization component to avoid large rig parameters that would create unnatural poses.” Examiner’s Note: The rig parameters (fitting parameters) determine how the mesh is deformed; they act as blending coefficients that control the contribution of deformation influences on the final shape. It is also explicitly stated that there is a loss function that includes a regularization component. Accordingly, the rig parameters function as blending coefficients included in the fitting parameters, and the loss includes a coefficient regularization loss associated with those blending coefficients.)
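The point-matching-plus-regularization loss cited from Bailey [0117] for Claim 8 can be illustrated with a short sketch. This is a generic reconstruction, not Bailey's actual formulation: the squared-error form of both terms and the `reg_weight` hyperparameter are assumptions for illustration.

```python
import numpy as np

def ik_fitting_loss(predicted_points, target_points, rig_params, reg_weight=0.1):
    """Generic IK-style fitting loss: a point-matching term that pulls the
    deformed mesh toward the control points, plus a regularization term
    that penalizes large rig parameters (which would yield unnatural poses).
    The squared-error forms and reg_weight are illustrative assumptions."""
    point_term = np.mean(np.sum((predicted_points - target_points) ** 2, axis=-1))
    reg_term = reg_weight * np.mean(rig_params ** 2)
    return point_term + reg_term
```

A perfectly matched pose with zero rig parameters gives zero loss; larger rig-parameter magnitudes raise the loss even when the points match exactly, which is what steers the solver away from extreme poses.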
Regarding Claim 9, Bailey teaches the computer-implemented method of claim 1, wherein computing the one or more predicted positions of the one or more points comprises: generating, via the shape model, a set of attributes (Bailey [0060] “deformation maps”) associated with a set of learned shapes for the object (Bailey [0052] “a deformation model is learned without requiring (or otherwise utilizing) assumption of a skeletal system, which is appropriate for complex facial rigs.”) and computing the one or more predicted positions based on the set of attributes and the set of fitting parameters. (Bailey [0052] “a deformation model is learned without requiring (or otherwise utilizing) assumption of a skeletal system, which is appropriate for complex facial rigs.” [0060] “The approximation may rely on CNNs, which generate deformation maps given input rig parameters.” [0065] “For each of the deformation maps, vertex offsets may be extracted by performing interpolation of the deformation map at each vertex position in texture coordinate space.” Examiner’s Note: Deformation maps modify vertex positions, are based on the target shape, and refine or approximate the final deformed mesh, and are therefore working as corrective displacements.)

Regarding Claim 10, Bailey teaches the computer-implemented method of claim 9, wherein: the set of attributes comprises at least one of a bone point position, a soft tissue thickness, a bone normal, a skinning weight, or a set of corrective displacements (Bailey [0060] “deformation maps”); and the set of fitting parameters comprises at least one of an anatomical transformation or a set of blending coefficients associated with the set of corrective displacements.
(Bailey [0060] “The approximation may rely on CNNs, which generate deformation maps given input rig parameters.” [0065] “For each of the deformation maps, vertex offsets may be extracted by performing interpolation of the deformation map at each vertex position in texture coordinate space.” Examiner’s Note: The rig parameters (fitting parameters) determine how the mesh is deformed; they act as blending coefficients that control the contribution of deformation influences on the final shape. Deformation maps modify vertex positions, are based on the target shape, and refine or approximate the final deformed mesh, and are therefore working as corrective displacements.)

Regarding Claim 11, Bailey teaches one or more non-transitory computer readable media storing instructions that, when executed by one or more processors, cause the one or more processors to perform operations comprising:

determining one or more ground truth positions (Bailey [0163] “facial landmarks”) of one or more points on a target shape associated with an object (Bailey [0106] “deformed mesh output”); (Bailey [0163] “The facial landmarks are detected on the input image.”)

generating, via execution of a set of neural networks, a set of fitting parameters associated with the one or more points; (Bailey [0064] “For example, the rig parameters 302 may include inputs for controlling movement in the lip, eye brow, nose, etc.” [0163] “To animate the mesh, the movement of the detected landmarks in the recording was tracked and the IK model was used to estimate the rig parameters required to match the new landmark configuration.”)

computing, via a shape model, (Bailey [0010] “a method for generating a mesh deformation of a facial model includes” Examiner Note: The shape model is the system that generates deformation maps via neural networks, extracts vertex offsets, and applies those offsets to achieve the output; because the system takes input parameters and outputs a shape, it is a shape model.)
one or more predicted positions of the one or more points based on the set of fitting parameters (Bailey [0060] “According to at least one embodiment, a method of approximating utilizes artist-created texture coordinates U∈R.sup.n×2 of the facial mesh. The approximation may rely on CNNs, which generate deformation maps given input rig parameters.” [0065] “For each of the deformation maps, vertex offsets may be extracted by performing interpolation of the deformation map at each vertex position in texture coordinate space.” [0076] “With continued reference to FIG. 3, the coarse approximation computes the final vertex positions for the mesh by adding (e.g., at adder 310) the vertex offsets to the vertices of the neutral pose of the mesh”);

training the set of neural networks based on one or more losses associated with the one or more predicted positions and the one or more ground truth positions (Bailey [0163] “The rig parameters are then passed to a disclosed approximation model to produce the deformed target mesh.” [0077] “Given the approximation model, a loss function is defined to find the optimal model parameters θ.sub.k. According to at least one embodiment, a loss function is proposed that penalizes both inaccuracies in approximated vertex positions, as well as inaccuracies in face normals on the mesh. Given a target mesh V, and approximated vertex offsets Δ, the loss function may be defined in Equation (1)” [0081] “Because the coarse approximation model works on separate mesh segments, the model could produce discontinuities across boundaries and/or seams between particular mesh segments. To address (e.g., minimize) this potential problem, the error function (e.g., the loss function of Equation (1)) may strongly penalize inaccurate face normals, to encourage smooth (or smoother) results along mesh segment boundaries. Penalizing normal errors also suppresses low-amplitude, high-frequency errors that may be visually disturbing or distracting.” Examiner Note: To penalize losses, the predicted positions must be compared to the ground truth positions.); and

generating, via execution of the trained set of neural networks, a three-dimensional (3D) model corresponding to the target shape (Bailey [0180] “a second plurality of deformation maps is generated by applying a second plurality of neural network-trained models” [0182] “a second plurality of vertex offsets is extracted based on the second plurality of deformation maps.” [0185] “The second plurality of vertex offsets is applied to the neutral mesh by applying the second plurality of vertex offsets to values of at most a subset of the vertices of the neutral mesh to generate the mesh deformation. The subset of the vertices of the neutral mesh may correspond to one or more regions of the neutral mesh exhibiting a level of approximation error that is above a particular threshold.”).
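The loss described in Bailey [0077] and [0081], penalizing both vertex-position error and face-normal error, can be sketched roughly as follows. The Office Action does not reproduce Bailey's Equation (1), so the squared-error terms and the `normal_weight` factor here are illustrative assumptions rather than the reference's exact formula.

```python
import numpy as np

def face_normals(vertices, faces):
    """Unit normal of each triangular face (faces: integer index triples)."""
    v0 = vertices[faces[:, 0]]
    v1 = vertices[faces[:, 1]]
    v2 = vertices[faces[:, 2]]
    n = np.cross(v1 - v0, v2 - v0)
    return n / np.linalg.norm(n, axis=1, keepdims=True)

def approximation_loss(pred_vertices, target_vertices, faces, normal_weight=1.0):
    """Penalize inaccurate vertex positions and inaccurate face normals.
    A heavy normal term discourages seams between separately approximated
    mesh segments and suppresses low-amplitude, high-frequency errors.
    Exact term forms and normal_weight are assumptions for illustration."""
    position_term = np.mean(np.sum((pred_vertices - target_vertices) ** 2, axis=1))
    pred_n = face_normals(pred_vertices, faces)
    target_n = face_normals(target_vertices, faces)
    normal_term = np.mean(np.sum((pred_n - target_n) ** 2, axis=1))
    return position_term + normal_weight * normal_term
```

A mesh identical to its target scores zero; a pure translation keeps the normals intact but is still penalized through the position term, while a local dent is penalized by both terms.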
Regarding Claim 12, Bailey teaches the one or more non-transitory computer readable media of claim 11, wherein generating the set of fitting parameters comprises: generating, via execution of a first neural network included in the set of neural networks, one or more transformations associated with an anatomy of the object (Bailey [0010] “generating a first plurality of deformation maps by applying a first plurality of neural network-trained models; extracting a first plurality of vertex offsets based on the first plurality of deformation maps; and applying the first plurality of vertex offsets to a neutral mesh of the facial model to generate the mesh deformation of the facial model.” [0053] “In order to support inverse kinematics (IK) for a facial rig in real-time, an efficient and accurate inversion of the rig function may be necessary to compute character poses given a set of constraints.” Examiner’s Note: Changing character poses require a transformation of anatomy.); and generating, via execution of a second neural network included in the set of neural networks, a set of blending coefficients associated with a set of corrective displacements outputted by the shape model for the one or more points. (Bailey [0180] “a second plurality of deformation maps is generated by applying a second plurality of neural network-trained models” [0062] “To determine the position of the vertex, weighted sums of the pixel information in the deformation map (e.g., a weighted sum of the x-coordinate information of the two pixels, a weighted sum of the y-coordinate information of the two pixels, a weighted sum of the z-coordinate information of the two pixels) may be used to determine an offset for the vertex position in the mesh.” Examiner’s Note: Blending is the combination of multiple values using weights, which is done here with x, y, and z components combining using weighted sums.) 
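The weighted-sum offset extraction quoted from Bailey [0062] and [0065] for Claim 12 amounts to interpolating a deformation map at each vertex's texture coordinate. A bilinear variant is sketched below; the `(H, W, 3)` map layout and the [0, 1] UV convention are assumptions for illustration, not details taken from Bailey.

```python
import numpy as np

def sample_deformation_map(deformation_map, uv):
    """Bilinearly interpolate an (H, W, 3) deformation map at (n, 2) texture
    coordinates in [0, 1], returning per-vertex xyz offsets as weighted sums
    of the four neighboring pixel values (the x/y/z weighted sums described
    in Bailey [0062]). Map layout and UV convention are assumptions."""
    h, w, _ = deformation_map.shape
    x = uv[:, 0] * (w - 1)
    y = uv[:, 1] * (h - 1)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    # Weighted sums along x for the two bracketing rows, then along y.
    top = (1 - fx)[:, None] * deformation_map[y0, x0] + fx[:, None] * deformation_map[y0, x1]
    bot = (1 - fx)[:, None] * deformation_map[y1, x0] + fx[:, None] * deformation_map[y1, x1]
    return (1 - fy)[:, None] * top + fy[:, None] * bot
```

The sampled offsets would then be added to the neutral-pose vertices, mirroring the adder step quoted from Bailey [0076].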
Regarding Claim 15, Bailey teaches the one or more non-transitory computer readable media of claim 12, wherein: the 3D model comprises a deformation of a first face via the one or more transformations, (Bailey [Abstract] “and applying the first plurality of vertex offsets to a neutral mesh of the facial model to generate the mesh deformation of the facial model.”) and the one or more transformations are determined using a second face corresponding to the target shape. (Bailey [Abstract] “generating a first plurality of deformation maps by applying a first plurality of neural network-trained models; extracting a first plurality of vertex offsets based on the first plurality of deformation maps” Examiner’s Note: The first face is the neutral mesh; the second face is the face in the input image, which is used to generate the landmarks needed for the rig parameters (Bailey [0163]), which are then used for the deformation maps.)

Regarding Claim 16, Bailey teaches the one or more non-transitory computer readable media of claim 11, wherein the 3D model comprises a reconstruction of the target shape. (Bailey [Abstract] “and applying the first plurality of vertex offsets to a neutral mesh of the facial model to generate the mesh deformation of the facial model.” Examiner’s Note: The mesh deformation is a reconstruction of the targeted face.)

Regarding Claim 17, Bailey teaches the one or more non-transitory computer readable media of claim 11, wherein the 3D model comprises an edit to an anatomy of the object. (Bailey [0013] “first rig parameter pose; providing a second plurality of vertices to a second network to produce a second rig parameter pose; and processing the rig parameter pose and the second rig parameter pose to produce a composite rig parameter pose” Examiner’s Note: The composite rig parameter pose is used as an edit to anatomy.)
Regarding Claim 18, Bailey teaches the one or more non-transitory computer readable media of claim 11, wherein the shape model comprises an additional set of neural networks. (Bailey [0180] “a second plurality of deformation maps is generated by applying a second plurality of neural network-trained models” [0062] “To determine the position of the vertex, weighted sums of the pixel information in the deformation map (e.g., a weighted sum of the x-coordinate information of the two pixels, a weighted sum of the y-coordinate information of the two pixels, a weighted sum of the z-coordinate information of the two pixels) may be used to determine an offset for the vertex position in the mesh.”)

Regarding Claim 19, Bailey teaches the one or more non-transitory computer readable media of claim 11, wherein the one or more ground truth positions comprise at least one of one or more 3D positions of the one or more points in a mesh associated with the target shape or one or more two-dimensional (2D) positions of the one or more points in an image of the target shape. (Bailey [0163] “The facial landmarks are detected on the input image.” Examiner’s Note: Ground truth points are 2D positions on an image of the target shape.)

Claim 20 is a system claim (Bailey [0128] illustrates a computer system) that recites similar limitations to Claim 11, and is therefore rejected under similar rationale.

Allowable Subject Matter

Claims 4-6 and 13-14 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DAVID W SOON whose telephone number is (571) 272-8113. The examiner can normally be reached M-F 7:30-5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alicia Harrington, can be reached at (571) 272-2330. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DAVID W SOON/
Examiner, Art Unit 2615

/ALICIA M HARRINGTON/
Supervisory Patent Examiner, Art Unit 2615

Prosecution Timeline

Jul 22, 2024: Application Filed
Feb 24, 2026: Non-Final Rejection, §102 (current)


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: Favorable
Median Time to Grant: 2y 9m
PTA Risk: Low

Based on 0 resolved cases by this examiner. Grant probability derived from career allow rate.
