Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
The Amendment filed October 31, 2025 has been entered and considered. Claims 1 and 13 have been amended. Claims 5, 11-12, 18, and 24-28 were canceled by way of previous amendment. In light of the amendment, the prior art rejections of claims 1 and 13 are withdrawn as moot. The new grounds of rejection set forth in the present action were necessitated by Applicant's claim amendments.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 1-4, 6-7, 9, 13-17, 19, and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Yin et al. (NPL, published 3/16/2022, PDF attached) in view of Liu (NPL, previously cited), and further in view of Mouizina (US Patent Pub. No. 2022/0192346, previously cited).
[Yin, Fig. 1 (reprinted)]
Regarding claim 1, Yin teaches a method in a computing system having a display device and a camera (Fig. 1, shows pictures being used and displayed), the method comprising: with the camera, capturing an audio/video sequence of a person speaking, the captured audio/video sequence comprising a sequence of frames and an audio track (Pg. 1, Col. 1, “One-shot talking face generation aims at synthesizing a high-quality talking face video from an arbitrary portrait image, driven by a video or an audio segment”; Pg. 1, Col. 2, “We design a video-based motion generation module and an audio-based one, which can be plugged into the framework either individually or jointly to drive the video generation”); causing presentation on the display of a plurality of images of the person (Fig. 1, reprinted above); receiving user input selecting one of the plurality of images, wherein the selected image shows a target environmental setting including at least one of particular hair, particular makeup, particular clothing, particular background, or particular lighting (Fig. 1, reprinted above; Pg. 6, Col. 1, “The goal of the video-driven motion generator is to generate dense flows with the driving video and the source image as inputs”, a user needs to select the source image for input); performing facial mapping for the frames in the captured audio/video sequence to produce a first facial mapping result (Pg. 6, Col. 1, “We use the 3DMM parameters 𝒑𝑡 from the driving frame 𝑑𝑡 as the motion representation. Specifically, these parameters are first mapped to a latent vector via a 3-layer MLP to aggregate the temporal information”, the driving frame is the captured audio/video sequence); performing facial mapping for the selected image to produce a second facial mapping result (Pg. 5, Col. 2, “Given a single source image, we first use the GAN inversion method [55] to get the latent style code and feature maps of the source image.”); for each frame of the captured audio/video sequence: spatially correlating the frame with the selected image using the first and second facial mapping results to produce spatial correlation results (Fig. 4, partly reprinted below; both
[Yin, Fig. 4 (reprinted in part)]
mapping results are used to produce flow fields (spatial correlation results)); and generating a target frame consistent with the target environmental setting shown in the selected image, at least in part by warping and transferring one or more first regions of the frame of the captured audio/video sequence to one or more second regions of the selected image using the spatial correlation results while maintaining one or more third regions of the selected image in the generated target frame, wherein the one or more first regions warped and transferred correspond to one or more facial portions of the person shown in the frame and wherein the one or more third regions maintained correspond to one or more body portions of the person shown in the selected image; and combining the audio track with the generated target frames to obtain a resulting audio/video sequence (Figs. 9-10 showcase qualitative results of the method, displaying the warping and transferring of facial portions of the source image based on the driving frame while maintaining the body portions of the source image to generate a target frame).
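To illustrate the motion-representation step quoted above (3DMM parameters mapped to a latent vector via a 3-layer MLP), the following is a minimal PyTorch sketch. The parameter count (257) and latent width (512) are illustrative assumptions, not values taken from Yin.

    # Minimal sketch: a 3-layer MLP mapping per-frame 3DMM parameters to a
    # latent motion vector, per the Yin passage quoted above. Dimensions are
    # assumptions for illustration.
    import torch
    import torch.nn as nn

    class MotionEncoder(nn.Module):
        def __init__(self, num_3dmm_params=257, latent_dim=512):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(num_3dmm_params, latent_dim), nn.ReLU(),
                nn.Linear(latent_dim, latent_dim), nn.ReLU(),
                nn.Linear(latent_dim, latent_dim),
            )

        def forward(self, p_t):
            # p_t: (batch, num_3dmm_params) 3DMM coefficients of driving frame d_t
            return self.mlp(p_t)

    encoder = MotionEncoder()
    latent = encoder(torch.randn(1, 257))  # latent motion vector, shape (1, 512)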
Yin does not explicitly disclose performing facial mapping at least in part by detecting first facial points including corners of eyes, mouth, and nose from each of the frames, or performing spatial correlation based, at least in part, on the detected facial points. Yin also does not explicitly disclose receiving, responsive to the presentation on the display, user input selecting one of the plurality of images presented on the display. However, Yin does require the user to enter the file path of the image into the code (Pg. 21).
Liu teaches performing facial mapping for the frames in the captured audio/video sequence to produce a facial mapping result, at least in part by detecting first facial points including corners of eyes, mouth, and nose from each of the frames (Fig. 1, reprinted below;
[Liu, Fig. 1 (reprinted)]
Pg. 2, Col. 1, “For the generation of target’s facial landmark, a GAN-based synthesizer is utilized to build the mapping from source’s FAUP to target’s facial landmark.”)... and for each frame of the captured video sequence: spatially correlating the frame with the selected image using the first and second facial mapping results to produce spatial correlation results based, at least in part, on the detected first and second facial points (Fig. 4, reprinted below;
[Liu, Fig. 4 (reprinted)]
which shows spatial correlation results based on the detected facial points).
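For concreteness, the claimed facial points can be detected per frame with an off-the-shelf landmark model. The sketch below uses dlib's 68-point predictor as a stand-in for Liu's landmark stage; the model file path and the index groupings (standard in the 68-point scheme) are assumptions, not Liu's implementation.

    # Minimal sketch: detect corners of eyes, mouth, and the nose in a frame.
    import cv2
    import dlib

    detector = dlib.get_frontal_face_detector()
    # Hypothetical local path to dlib's standard 68-point landmark model
    predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

    # Standard 68-point indices: eye corners, mouth corners, nose tip
    KEYPOINTS = {"eye_corners": [36, 39, 42, 45],
                 "mouth_corners": [48, 54],
                 "nose": [33]}

    def facial_points(frame_bgr):
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        faces = detector(gray, 1)
        if not faces:
            return None  # no face found in this frame
        shape = predictor(gray, faces[0])
        return {name: [(shape.part(i).x, shape.part(i).y) for i in idxs]
                for name, idxs in KEYPOINTS.items()}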
Mouizina teaches causing presentation on the display of a plurality of images of the person; responsive to the presentation on the display, receiving user input selecting one of the plurality of images presented on the display, wherein the selected image shows a target environmental setting including at least one of particular hair, particular makeup, particular clothing, particular background, or particular lighting (Para. 139, “As shown in FIG. 22C, the display can show a plurality of user-selectable options 210A-210H. Each of the user-selectable options 210A-210H are associated with a unique modification of a first characteristic related to the user. In this example, that characteristic is hair color, and each individual user-selectable option 210A-21H is associated with a unique proposed modification of that characteristic, e.g. different hair colors. Each of the user-selectable options 210A-210H can also include an image showing the different proposed modifications.” Para. 140, “Once the system 10 receives via an input device a selection from the user of one of the options 210A-210H, the display can show a modified image 208 that shows the user with the proposed modification of the characteristic corresponding to the elected option.”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yin to incorporate the teachings of Liu and Mouizina to include facial mapping using key facial points as well as receiving user input, responsive to a display, selecting one of the plurality of images presented on the display. Yin performs facial mapping using GAN inversion, and a predicted flow field is used as the motion descriptor; however, this warping “introduces noticeable artifacts in the final output, especially around the eyes and teeth” (Pg. 2, Col. 2), as disclosed by Yin. One of ordinary skill in the art would recognize that the facial mapping of Yin could be improved, making it obvious to implement the similar GAN-based method of Liu for more accurate and precise facial mapping and motion synthesis.
The method of Yin requires a user to input images and videos, but Yin does not disclose a presentation on a display to perform these actions. One skilled in the art would recognize that allowing a user to specify the input image after it is shown on a display is an obvious enhancement that gives the user more control over the merging process. Since the user can see the environment and characteristics of the image used for the merge, the user can find “any other suitable or desired modification” (Para. 156), as disclosed by Mouizina.
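As a hedged illustration of the selection step this combination relies on, the sketch below presents a plurality of candidate source images and receives the user's choice. The file names are hypothetical; a production system would render thumbnails on the display device rather than listing paths in a console.

    # Minimal sketch: present candidate images, receive the user's selection.
    candidate_images = ["studio_lighting.png",     # hypothetical file names
                        "office_background.png",
                        "formal_clothing.png"]

    for i, path in enumerate(candidate_images):
        print(f"[{i}] {path}")  # in practice: draw each image on the display

    choice = int(input("Select a target environmental setting: "))
    selected_image = candidate_images[choice]  # source image driving the merge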
Regarding claim 2, Yin as modified teaches all of the elements of claim 1, as stated above, as well as wherein the one or more first regions comprise at least one region containing the mouth of the person, and wherein the one or more third regions comprise at least one region containing one or more of hair of the person, clothing worn by the person, makeup applied to the person, and surroundings of the person (Fig. 1, reprinted above, shows the surroundings, as well as all aspects of the person being captured and subjected to feature mapping and transferring).
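The region split recited in this claim can be pictured as a mask-based composite: warped facial regions replace the corresponding regions of the selected image, while the body regions of the selected image are kept. The sketch below is illustrative only; the binary face mask is a hypothetical input, since Yin composites in feature space rather than with an explicit mask.

    # Minimal sketch: keep body/background of the selected image, transfer the
    # warped facial regions. All arrays are hypothetical inputs.
    import numpy as np

    def composite(warped_face, selected_image, face_mask):
        # warped_face, selected_image: (H, W, 3) uint8; face_mask: (H, W) in [0, 1]
        m = face_mask[..., None].astype(np.float32)
        out = m * warped_face + (1.0 - m) * selected_image
        return out.astype(np.uint8)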
Regarding claim 3, Yin as modified teaches all of the elements of claim 2, as stated above, as well as wherein the one or more first regions comprise at least one region collectively containing the eyes, cheeks, and mouth of the person (Fig. 1, reprinted above, shows the eyes, cheeks, and mouth being captured).
Regarding claim 4, Yin as modified teaches all of the elements of claim 1, as stated above, as well as for each frame of the captured audio/video sequence: before the generating of the target frame, applying a geometric transformation to warp the selected image to more closely match a shape and size of the person's face in the frame using the spatial correlation results (Fig. 2, reprinted below; Pg. 2, Col. 2, “The predicted flow field is used to spatially warp the latent feature map”; Pg. 4, Col. 2, “To determine the proper layer for performing spatial transformation, we warp the feature map of each layer individually.”).
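As a hedged illustration of warping a feature map with a predicted flow field, per the Yin passage quoted above, the sketch below uses torch.nn.functional.grid_sample. The tensor shapes and the zero (identity) flow are assumptions for demonstration; in Yin the flow is predicted by the motion module.

    # Minimal sketch: spatially warp a latent feature map with a flow field.
    import torch
    import torch.nn.functional as F

    def warp_feature_map(feat, flow):
        # feat: (N, C, H, W); flow: (N, H, W, 2) offsets in normalized [-1, 1]
        # coordinates (zero offsets reproduce the input exactly)
        n, _, h, w = feat.shape
        gy, gx = torch.meshgrid(torch.linspace(-1, 1, h),
                                torch.linspace(-1, 1, w), indexing="ij")
        base = torch.stack((gx, gy), dim=-1).unsqueeze(0).expand(n, h, w, 2)
        return F.grid_sample(feat, base + flow, mode="bilinear",
                             align_corners=True)

    feat = torch.randn(1, 64, 32, 32)
    flow = torch.zeros(1, 32, 32, 2)       # a predicted flow would go here
    warped = warp_feature_map(feat, flow)  # equals feat for zero flow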
Regarding claim 6, Yin as modified teaches all of the elements of claim 1, as stated above, as well as causing presentation on the display of at least a portion of the resulting audio/video sequence (Fig. 1, reprinted above, shows the output being displayed).
Regarding claim 7, Yin as modified teaches all of the elements of claim 1, as stated above, as well as persistently storing the resulting audio/video sequence (Fig. 1, reprinted above, shows the resulting audio/video sequence, which would need to be stored).
[Yin, Fig. 5 (reprinted)]
Regarding claim 9, Yin as modified teaches all of the elements of claim 1, as stated above, as well as wherein the generating of the target frame uses output of an identity module trained with video captured of the person (Fig. 5, reprinted above).
Regarding claim 13, the computer-readable media perform substantially the same function as the method of claim 1. It is rejected under the same analysis.
Regarding claim 14, the recited elements perform the same function as that of claim 2. It is rejected under the same analysis.
Regarding claim 15, the recited elements perform the same function as that of claim 4. It is rejected under the same analysis.
[Yin, Fig. 2 (reprinted)]
Regarding claim 16, Yin as modified teaches all of the elements of claim 15, as stated above, as well as wherein the geometric transformation uses one or more of an affine transformation, a thin plate spline transformation, or both an affine transformation and a thin plate spline transformation (Fig. 2, reprinted above).
Regarding claim 17, Yin as modified teaches all of the elements of claim 13, as stated above, as well as wherein the spatial correlation uses a keypoint detection technique (Pg. 5, Col. 1, “Finally, when the complicated deformations, such as TPS operations, are applied to the feature map, the source image is also interpolated to match the randomly sampled target keypoints”).
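The two transformation options of claim 16, and the keypoint-driven thin plate spline of claim 17, can be illustrated with OpenCV (the TPS transformer requires the opencv-contrib build). The point correspondences below are illustrative assumptions, not values from the references.

    # Minimal sketch: an affine warp and a thin plate spline (TPS) warp.
    import cv2
    import numpy as np

    img = np.zeros((240, 320, 3), dtype=np.uint8)  # placeholder image

    # Affine: three point correspondences determine the transform exactly
    src = np.float32([[50, 50], [200, 50], [50, 200]])
    dst = np.float32([[60, 70], [210, 60], [70, 220]])
    M = cv2.getAffineTransform(src, dst)
    affine_warped = cv2.warpAffine(img, M, (320, 240))

    # TPS: a non-rigid warp fit to matched keypoints (e.g., detected facial
    # points); note OpenCV expects the target shape first when the source
    # image is warped afterward
    tps = cv2.createThinPlateSplineShapeTransformer()
    matches = [cv2.DMatch(i, i, 0) for i in range(3)]
    tps.estimateTransformation(dst.reshape(1, -1, 2), src.reshape(1, -1, 2),
                               matches)
    tps_warped = tps.warpImage(img)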
Regarding claim 23, the recited elements perform the same function as that of claim 9. It is rejected under the same analysis.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DAVID A WAMBST whose telephone number is (703)756-1750. The examiner can normally be reached M-F 9-6:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Gregory Morse can be reached at (571)272-3838. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/DAVID ALEXANDER WAMBST/ Examiner, Art Unit 2663
/GREGORY A MORSE/ Supervisory Patent Examiner, Art Unit 2698