Prosecution Insights
Last updated: April 19, 2026
Application No. 18/735,893

GENERATIVE FACIAL MAPPING AND BODY BLENDING DURING VIDEO CAPTURE

Status: Final Rejection (§103)
Filed: Jun 06, 2024
Examiner: WAMBST, DAVID ALEXANDER
Art Unit: 2663
Tech Center: 2600 — Communications
Assignee: Emovid Corporation
OA Round: 4 (Final)

Grant Probability: 67% (Favorable)
Expected OA Rounds: 5-6
Time to Grant: 2y 11m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 67%, above average (18 granted / 27 resolved; +4.7% vs TC avg)
Interview Lift: strong, +47.4% among resolved cases with interview
Typical Timeline: 2y 11m avg prosecution; 25 currently pending
Career History: 52 total applications across all art units

Statute-Specific Performance

§101: 4.5% (-35.5% vs TC avg)
§103: 56.6% (+16.6% vs TC avg)
§102: 21.5% (-18.5% vs TC avg)
§112: 16.1% (-23.9% vs TC avg)
Deltas are measured against a Tech Center average estimate; based on career data from 27 resolved cases.
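The dashboard does not publish its methodology, but if each "vs TC avg" delta is simply the examiner's rate minus the Tech Center average, the implied TC baseline can be recovered from the figures shown. A minimal sketch, assuming that subtraction-based reading:

```python
# Illustrative arithmetic only; the product's actual methodology is not disclosed.
# Assumption: delta = examiner rate - Tech Center average, all in percent.
examiner_rate = {"101": 4.5, "103": 56.6, "102": 21.5, "112": 16.1}
delta_vs_tc   = {"101": -35.5, "103": 16.6, "102": -18.5, "112": -23.9}

implied_tc_avg = {s: round(examiner_rate[s] - delta_vs_tc[s], 1) for s in examiner_rate}
print(implied_tc_avg)  # {'101': 40.0, '103': 40.0, '102': 40.0, '112': 40.0}
```

Under that assumption every statute implies the same ~40% Tech Center average estimate, which is at least internally consistent with the displayed deltas.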

Office Action

§103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

The Amendment filed October 31, 2025 has been entered and considered. Claims 1 and 13 have been amended. Claims 5, 11-12, 18, and 24-28 were canceled by way of previous amendment. In light of the amendment, the prior art rejections of claims 1 and 13 are withdrawn as moot. The new grounds of rejection set forth in the present action were necessitated by Applicant's claim amendments.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4, 6-7, 9, 13-17, 19, and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Yin et al. (NPL, published 3/16/2022, pdf attached) in view of Liu (NPL, previously cited), in view of Mouizina (US Patent Pub. No. 2022/0192346, previously cited).

[image: media_image1.png]

Regarding claim 1, Yin teaches a method in a computing system having a display device and a camera (Fig. 1 shows pictures being used and displayed), the method comprising: with the camera, capturing an audio/video sequence of a person speaking, the captured audio/video sequence comprising a sequence of frames and an audio track (Pg. 1, Col. 1, “One-shot talking face generation aims at synthesizing a high-quality talking face video from an arbitrary portrait image, driven by a video or an audio segment”; Pg. 1, Col. 2, “We design a video-based motion generation module and an audio-based one, which can be plugged into the framework either individually or jointly to drive the video generation”); causing presentation on the display of a plurality of images of the person (Fig. 1, reprinted below); receiving user input selecting one of a plurality of images, wherein the selected image shows a target environmental setting including at least one of particular hair, particular makeup, particular clothing, particular background, or particular lighting (Fig. 1, reprinted above; Pg. 6, Col. 1, “The goal of the video-driven motion generator is to generate dense flows with the driving video and the source image as inputs”, a user needs to select the source image for input); performing facial mapping for the frames in the captured audio/video sequence to produce a first facial mapping result (Pg. 6, Col. 1, “We use the 3DMM parameters 𝒑𝑡 from the driving frame 𝑑𝑡 as the motion representation. Specifically, these parameters are first mapped to a latent vector via a 3-layer MLP to aggregate the temporal information”, the driving frame is the captured audio/video sequence); performing facial mapping for the selected image to produce a second facial mapping result (Pg. 5, Col. 2, “Given a single source image, we first use the GAN inversion method [55] to get the latent style code and feature maps of the source image.”); for each frame of the captured audio/video sequence: spatially correlating the frame with the selected image using the first and second facial mapping results to produce spatial correlation results (Fig. 4, partly reprinted below; both facial mapping results are used to produce flow fields, i.e., spatial correlation results);

[image: media_image2.png]

and generating a target frame consistent with the target environment setting shown in the selected image, at least in part by warping and transferring one or more first regions of the frame of the captured audio/video sequence to one or more second regions of the selected image using the spatial correlation results while maintaining one or more third regions of the selected image in the generated target frame, wherein the one or more first regions warped and transferred correspond to one or more facial portions of the person shown in the frame and wherein the one or more third regions maintained correspond to one or more body portions of the person shown in the selected image; and combining the audio track with the generated target frames to obtain a resulting audio/video sequence (Figs. 9-10 showcase qualitative results of the method, displaying the warping and transferring of facial portions of the source image based on the driving frame while maintaining the body portions of the source image to generate a target frame).

Yin does not explicitly disclose performing facial mapping at least in part by detecting first facial points including corners of eyes, mouth, and nose from each of the frames, or performing spatial correlation based, at least in part, on the detected facial points. Yin also does not explicitly disclose, responsive to the presentation on the display, receiving user input selecting one of the plurality of images presented on the display. However, Yin does require the user to input the file path of the image into the code (Pg. 21).

Liu teaches performing facial mapping for the frames in the captured audio/video sequence to produce a facial mapping result, at least in part by detecting first facial points including corners of eyes, mouth, and nose from each of the frames (Fig. 1, reprinted below; [image: media_image3.png]; Pg. 2, Col. 1, “For the generation of target’s facial landmark, a GAN-based synthesizer is utilized to build the mapping from source’s FAUP to target’s facial landmark.”)... and for each frame of the captured video sequence: spatially correlating the frame with the selected image using the first and second facial mapping results to produce spatial correlation results based, at least in part, on the detected first and second facial points (Fig. 4, reprinted below; [image: media_image4.png]; shows spatial correlation results based on the detected facial points).

Mouizina teaches causing presentation on the display of a plurality of images of the person; responsive to the presentation on the display, receiving user input selecting one of the plurality of images presented on the display, wherein the selected image shows a target environmental setting including at least one of particular hair, particular makeup, particular clothing, particular background, or particular lighting (Para. 139, “As shown in FIG. 22C, the display can show a plurality of user-selectable options 210A-210H. Each of the user-selectable options 210A-210H are associated with a unique modification of a first characteristic related to the user. In this example, that characteristic is hair color, and each individual user-selectable option 210A-21H is associated with a unique proposed modification of that characteristic, e.g. different hair colors. Each of the user-selectable options 210A-210H can also include an image showing the different proposed modifications.” Para. 140, “Once the system 10 receives via an input device a selection from the user of one of the options 210A-210H, the display can show a modified image 208 that shows the user with the proposed modification of the characteristic corresponding to the elected option.”).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yin to incorporate the teachings of Liu and Mouizina to include facial mapping using key facial points as well as receiving user input, responsive to a display, selecting one of the plurality of images presented on the display. Yin performs facial mapping using GAN inversion, and a predicted flow field is used as the motion descriptor; however, this warping “introduces noticeable artifacts in the final output, especially around the eyes and teeth” (Pg. 2, Col. 2), as disclosed by Yin. One of ordinary skill in the art would recognize that the facial mapping of Yin could be improved, making it obvious to implement the similarly GAN-based method of Liu for more accurate and precise facial mapping and motion synthesis. The method of Yin requires a user to input images and videos, but Yin does not disclose a presentation on a display to perform these actions. One skilled in the art would recognize that allowing a user to specify the input image after it is shown on a display is an obvious enhancement that gives the user more control over the merging process. Since the user can see the environment and characteristics of the image used for the merge, the user can find “any other suitable or desired modification” (Para. 156), as disclosed by Mouizina.

Regarding claim 2, Yin as modified teaches all of the elements of claim 1, as stated above, as well as wherein the one or more first regions comprise at least one region containing the mouth of the person, and wherein the one or more third regions comprise at least one region containing one or more of hair of the person, clothing worn by the person, makeup applied to the person, and surroundings of the person (Fig. 1, reprinted above, shows the surroundings, as well as all aspects of the person, being captured and subjected to feature mapping and transferring).

Regarding claim 3, Yin as modified teaches all of the elements of claim 2, as stated above, as well as wherein the one or more first regions comprise at least one region collectively containing the eyes, cheeks, and mouth of the person (Fig. 1, reprinted above, shows the eyes, cheeks, and mouth being captured).

Regarding claim 4, Yin as modified teaches all of the elements of claim 1, as stated above, as well as, for each frame of the captured audio/video sequence: before the generating of the target frame, applying a geometric transformation to warp the selected image to more closely match a shape and size of the person's face in the frame using the spatial correlation results (Fig. 2, reprinted below; Pg. 2, Col. 2, “The predicted flow field is used to spatially warp the latent feature map”; Pg. 4, Col. 2, “To determine the proper layer for performing spatial transformation, we warp the feature map of each layer individually.”).

Regarding claim 6, Yin as modified teaches all of the elements of claim 1, as stated above, as well as causing presentation on the display of at least a portion of the resulting audio/video sequence (Fig. 1, reprinted above, shows the output being displayed).

Regarding claim 7, Yin as modified teaches all of the elements of claim 1, as stated above, as well as persistently storing the resulting audio/video sequence (Fig. 1, reprinted above, shows the resulting audio/video sequence, which would need to be stored).

[image: media_image5.png]

Regarding claim 9, Yin as modified teaches all of the elements of claim 1, as stated above, as well as wherein the generating of the target frame uses output of an identity module trained with video captured of the person (Fig. 5, reprinted above).

Regarding claim 13, the computer-readable media performs virtually the same function as the method of claim 1. It is rejected under the same analysis.

Regarding claim 14, the recited elements perform the same function as those of claim 2. It is rejected under the same analysis.

Regarding claim 15, the recited elements perform the same function as those of claim 4. It is rejected under the same analysis.

[image: media_image6.png]

Regarding claim 16, Yin as modified teaches all of the elements of claim 15, as stated above, as well as wherein the geometric transformation uses one or more of an affine transformation, a thin plate spline transformation, or both an affine transformation and a thin plate spline transformation (Fig. 2, reprinted above).

Regarding claim 17, Yin as modified teaches all of the elements of claim 13, as stated above, as well as wherein the spatial correlation uses a keypoint detection technique (Pg. 5, Col. 1, “Finally, when the complicated deformations, such as TPS operations, are applied to the feature map, the source image is also interpolated to match the randomly sampled target keypoints”).

Regarding claim 23, the recited elements perform the same function as those of claim 9. It is rejected under the same analysis.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DAVID A WAMBST, whose telephone number is (703) 756-1750. The examiner can normally be reached M-F 9-6:30 EST. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Gregory Morse, can be reached at (571) 272-3838. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DAVID ALEXANDER WAMBST/
Examiner, Art Unit 2663

/GREGORY A MORSE/
Supervisory Patent Examiner, Art Unit 2698
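Claims 16-17 turn on fitting a geometric transformation (affine or thin plate spline) over detected facial keypoints. As a toy sketch of the affine case only, with invented coordinates not taken from Yin or Liu, the transform mapping one set of landmarks onto another can be recovered by least squares:

```python
import numpy as np

# Toy illustration of the claim-16 idea: fit an affine transform that maps
# facial landmarks in the selected image onto the matching landmarks detected
# in a captured frame. Coordinates below are invented for the example.
src = np.array([[30.0, 40.0], [70.0, 40.0], [50.0, 60.0], [50.0, 85.0]])  # eye corners, nose, mouth
dst = src * 1.2 + np.array([5.0, -3.0])   # same landmarks in the captured frame (scaled + shifted)

# Solve dst ≈ [src | 1] @ A for the 3x2 affine matrix A by least squares.
src_h = np.hstack([src, np.ones((len(src), 1))])
A, *_ = np.linalg.lstsq(src_h, dst, rcond=None)
warped = src_h @ A                        # landmark positions after warping
assert np.allclose(warped, dst)           # exact here because dst is affine in src
```

With real data the fit is only approximate, which is why the references also use denser flow fields and thin plate splines for non-rigid deformation.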

Prosecution Timeline

Jun 06, 2024
Application Filed
Sep 25, 2024
Non-Final Rejection — §103
Oct 21, 2024
Interview Requested
Nov 07, 2024
Examiner Interview Summary
Nov 07, 2024
Applicant Interview (Telephonic)
Nov 15, 2024
Response Filed
Nov 27, 2024
Final Rejection — §103
Jan 28, 2025
Interview Requested
Feb 13, 2025
Applicant Interview (Telephonic)
Feb 13, 2025
Examiner Interview Summary
Feb 24, 2025
Response after Non-Final Action
Apr 17, 2025
Request for Continued Examination
Apr 21, 2025
Response after Non-Final Action
May 06, 2025
Non-Final Rejection — §103
Oct 31, 2025
Response Filed
Feb 20, 2026
Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12597278
IMAGE AUTHENTICITY DETECTION METHOD AND DEVICE, COMPUTER DEVICE, AND STORAGE MEDIUM
2y 5m to grant Granted Apr 07, 2026
Patent 12524892
SYSTEMS AND METHODS FOR IMAGE REGISTRATION
2y 5m to grant Granted Jan 13, 2026
Patent 12437437
DIFFUSION MODELS HAVING CONTINUOUS SCALING THROUGH PATCH-WISE IMAGE GENERATION
2y 5m to grant Granted Oct 07, 2025
Patent 12423783
DIFFERENTLY CORRECTING IMAGES FOR DIFFERENT EYES
2y 5m to grant Granted Sep 23, 2025
Patent 12380566
METHOD OF SEPARATING TERRAIN MODEL AND OBJECT MODEL FROM THREE-DIMENSIONAL INTEGRATED MODEL AND APPARATUS FOR PERFORMING THE SAME
2y 5m to grant Granted Aug 05, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 67%
With Interview: 99% (+47.4%)
Median Time to Grant: 2y 11m
PTA Risk: High
Based on 27 resolved cases by this examiner. Grant probability derived from career allow rate.
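How the headline numbers combine is not disclosed; one plausible reading is a base rate from the career data with a multiplicative interview lift. A hedged sketch of that reading, using only figures shown on this page:

```python
# Hypothetical reconstruction of the projections; the product's actual model
# is not published, so treat this as one plausible reading, not the method.
granted, resolved = 18, 27
base = granted / resolved                 # career allow rate, about 0.667 -> "67%"
interview_lift = 0.474                    # the "+47.4%" interview lift shown above
with_interview = min(1.0, base * (1 + interview_lift))
print(round(base * 100), with_interview)
```

Note the multiplicative reading lands near, but not exactly at, the displayed 99%, which suggests the with-interview figure may instead be computed directly from outcomes of this examiner's interviewed cases.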
