Prosecution Insights
Last updated: April 19, 2026
Application No. 18/334,168

REAL-TIME AUGMENTATION OF A PLURALITY OF TARGET FACES

Status: Final Rejection (§103)
Filed: Jun 13, 2023
Examiner: CLOTHIER, MATTHEW MORRIS
Art Unit: 2614
Tech Center: 2600 (Communications)
Assignee: Deep Voodoo LLC
OA Round: 2 (Final)

Grant Probability: 100% (Favorable)
OA Rounds: 3-4
To Grant: 1y 11m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 100% (3 granted / 3 resolved; +38.0% vs TC avg) - above average
Interview Lift: +0.0% (minimal lift, with vs. without interview, among resolved cases with interview)
Avg Prosecution: 1y 11m (fast prosecutor)
Total Applications: 32 (29 currently pending; career history across all art units)

Statute-Specific Performance

§101: 6.1% (-33.9% vs TC avg)
§103: 65.2% (+25.2% vs TC avg)
§102: 21.2% (-18.8% vs TC avg)
§112: 6.1% (-33.9% vs TC avg)

Tech Center averages are estimates; figures based on career data from 3 resolved cases.
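The comparison figures above can be read as simple deltas against an estimated Tech Center baseline. A hypothetical reconstruction of the allow-rate lift (the 62% baseline is implied by the numbers shown, not stated anywhere in the source data):

```python
# Hypothetical reconstruction of the "+38.0% vs TC avg" figure above.
granted, resolved = 3, 3
career_allow_rate = granted / resolved        # 1.00 -> "100% Career Allow Rate"
tc_avg_estimate = 0.62                        # implied Tech Center 2600 baseline (assumption)
lift_vs_tc = career_allow_rate - tc_avg_estimate
print(f"{lift_vs_tc:+.1%} vs TC avg")         # prints "+38.0% vs TC avg"
```

With only 3 resolved cases, such a lift figure carries a very wide confidence interval, which is worth bearing in mind when reading the dashboard.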

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

1. This action is in response to the amendment filed on 11/14/2025. Claims 1, 4, 6, 9, 13, 16, 18, and 21-22 have been amended. Claims 1-6, 8-18, and 20-22 remain rejected in the application.

Response to Arguments

2. Applicant’s arguments with respect to claim 1, and similarly claims 13 and 22, filed on 11/14/2025, with respect to the rejection under 35 U.S.C. 102, arguing that the prior art does not teach the limitation(s) “obtain a set of user facial features corresponding to the first input user face”; “use at least the recorded video frame and the set of user facial features to generate a cropped image comprising the first input user face and a set of alignment information that describes at least a rotation of the first input user face and a translation of the first input user face in the cropped image relative to the recorded video frame”; “use a target face swap model associated with the first target face and the cropped image to generate a first representation of the first target face that matches a facial expression of the first input user face in the cropped image”; “transform the first representation of the first target face using the set of alignment information to match the at least the rotation of the first input user face and the translation of the first input user face in the recorded video frame”; and “overlay, using the first mapping, a transformed first representation of the first target face over the first input user face in the recorded video frame,” have been fully considered, but are moot because of the new grounds of rejection. Claim 1, and similarly claims 13 and 22, are now disclosed by Ganong and Helminger.

3.
Regarding the arguments with respect to claims 2-6, 8-12, 14-18, and 20-21: these claims depend from independent claims 1 and 13, respectively, and Applicant does not argue anything beyond independent claim 1 and, similarly, claims 13 and 22. The limitations of those claims, as taught by the combination, have previously been established and explained.

Claim Rejections - 35 USC § 103

4. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

5. Claims 1-6, 8-18, and 20-22 are rejected under 35 U.S.C. 103 as being unpatentable over Ganong et al. (US-2018/0046855-A1, hereinafter "Ganong") in view of Helminger et al. (US-11568524-B2, hereinafter "Helminger").

6. As per claim 1, Ganong discloses: A system, comprising: a memory; and a processor coupled to the memory and configured to: detect a first input user face and a second input user face in a recorded video frame; (Ganong, Claims 1, 5, “1. A method performed by at least one computer, the method comprising: detecting at least one face in at least one digital image; ... 5.
The method of claim 1 wherein the at least one digital image is from a frame of a video having a plurality of frames.” and page 2, ¶ [0020], “In a further aspect of the present invention, a system for recognizing one or more faces in a digital image is provided, the system characterized by: (a) one or more face coordinates corresponding to one or more candidate regions for one or more faces ...”) associate a first face identifier (ID) with the first input user face; store a first mapping between the first face ID and a first target face; and (Ganong, page 7, ¶ [0103], “The computer program may perform a face detection technique to detect the one or more faces in the image, which may result in the generation of one or more face signatures, each face signature corresponding to one of the faces. The computer program may then access a database, wherein the database links face signatures with a list of known persons, each known person being associated with one or more face signatures.” and page 15, ¶ [0209], “An identification of a person may be associated with each stored face portrait in a database stored at or linked to one or more computers.”) obtain a set of user facial features corresponding to the first input user face; (Ganong, page 7, ¶ [0103], “The computer program may perform a face detection technique to detect the one or more faces in the image, which may result in the generation of one or more face signatures, each face signature corresponding to one of the faces.” and page 18, ¶ [0229], “Another variation on this invention would be to look for people with similar facial features taken separately from the entire face—such as mouth, nose, and eyes.”) use at least the recorded video frame and the set of user facial features to generate a cropped image comprising the first input user face (Ganong, Claim 5, “5. 
The method of claim 1 wherein the at least one digital image is from a frame of a video having a plurality of frames.” and page 3, ¶ [0029], “The present invention, in a further aspect thereof, enables capturing portraits of people whose faces are located in an image. ... The computer program may be configured to adjust the size of the face region in order to capture and create a portrait (or thumbnail) of the person.” and page 15, ¶ [0207], “The record of the portrait/thumbnail image 23d may be recorded in the database as illustrated in FIG. 13 for future use.”) [[and a set of alignment information that describes at least a rotation of the first input user face and a translation of the first input user face in the cropped image relative to the recorded video frame;]] [[use a target face swap model associated with the first target face and the cropped image to generate a first representation of the first target face that matches a facial expression of the first input user face in the cropped image;]] [[transform the first representation of the first target face using the set of alignment information to match the at least the rotation of the first input user face and the translation of the first input user face in the recorded video frame; and]] overlay, using the first mapping, a [[transformed]] first representation of the first target face over the first input user face in the recorded video frame. (Ganong, Fig. 25 (see below); page 16, ¶ [0214], “FIG. 25 illustrates using a selected image to overlay on a digital image photo to cover the face of a subject, also known as face substitution. An application may be to hide negative memories.” and page 16, ¶ [0215], “FIG. 34 illustrates a workflow for face substitution. 
When a user wants to hide negative memories one aspect of the present invention may match faces in the face database 34a to be hidden in the photos from the photo database 34d with an image that is selected or provided by a user ...” and page 16, ¶ [0216], “Optionally, the masking may comprise overlaying a selected image over the area co-ordinate corresponding to the location of the at least one face to be suppressed.” and Claim 5, “5. The method of claim 1 wherein the at least one digital image is from a frame of a video having a plurality of frames.”)

[FIG. 25 of Ganong (greyscale image) reproduced in the original action.]

7. Ganong does not explicitly disclose, but Helminger discloses: [[use at least the recorded video frame and the set of user facial features to generate a cropped image comprising the first input user face]] and a set of alignment information that describes at least a rotation of the first input user face and a translation of the first input user face in the cropped image relative to the recorded video frame; (Helminger, column 4, lines 6-30, “As discussed in greater detail below, the encoder 152 takes as input a two-dimensional (2D) image that includes a face and has been normalized. For example, the image could be a high-definition (HD) resolution image, such as a one megapixel image, including a face that has been normalized. ... In embodiments, an image can be normalized in any technically-feasible manner, including using face alignment techniques to compute an affine transformation that rotates, scales, translates, etc. the image, and/or cropping the image. ... For example, in some embodiments, normalizing an image includes detecting a largest face in the image and determining the locations of facial landmarks using a modified deep alignment network (DAN). In such cases, the image is then rotated and scaled so that the eyes of the largest face lie on a predefined horizontal line and have a predefined ocular distance.
The image can then be cropped and resized to, e.g., 1024×1024 pixels.” and column 2, lines 8-12, “At least one technical advantage of the disclosed techniques relative to the prior art is that the disclosed techniques can be effectively utilized to change or modify facial identities in high-resolution (e.g., megapixel) images or frames of a video.”) use a target face swap model associated with the first target face and the cropped image to generate a first representation of the first target face that matches a facial expression of the first input user face in the cropped image; (Helminger, column 3, lines 55-58, “The model trainer 116 is configured to train machine learning models, including a machine learning (ML) model 150 that can be used to change the identities of faces in images.” and column 9, lines 35-40, “Although the facial identities in the images 400, 402, 404, 406, 408, 410, and 412 are different, the performance in the images 400 and 412, including the smiling facial expression and facial pose, has been preserved by the ML model 210 when generating the images 402, 404, 406, 408, and 410 that include interpolated facial identities.” and column 4, lines 14-21, “As a result, facial features such as the eyes, nose, etc. are at similar locations within normalized images that are input into the encoder 152, which can improve training of the ML model 150. In embodiments, an image can be normalized in any technically-feasible manner, including using face alignment techniques to compute an affine transformation that rotates, scales, translates, etc. the image, and/or cropping the image.” and column 4, lines 17-34, “In embodiments, an image can be normalized in any technically-feasible manner, including using face alignment techniques to compute an affine transformation that rotates, scales, translates, etc. the image, and/or cropping the image. 
… Given a normalized image that includes a face, the encoder 152 outputs an encoded representation of the normalized image, which is also referred to herein as a “latent representation” of the normalized image.”) transform the first representation of the first target face using the set of alignment information to match the at least the rotation of the first input user face and the translation of the first input user face in the recorded video frame; and [[overlay, using the first mapping, a]] transformed [[first representation of the first target face over the first input user face in the recorded video frame.]] (Helminger, column 4, lines 6-34, “As discussed in greater detail below, the encoder 152 takes as input a two-dimensional (2D) image that includes a face and has been normalized. For example, the image could be a high-definition (HD) resolution image, such as a one megapixel image, including a face that has been normalized. ... In embodiments, an image can be normalized in any technically-feasible manner, including using face alignment techniques to compute an affine transformation that rotates, scales, translates, etc. the image, and/or cropping the image. ... For example, in some embodiments, normalizing an image includes detecting a largest face in the image and determining the locations of facial landmarks using a modified deep alignment network (DAN). In such cases, the image is then rotated and scaled so that the eyes of the largest face lie on a predefined horizontal line and have a predefined ocular distance. The image can then be cropped and resized to, e.g., 1024×1024 pixels. 
Given a normalized image that includes a face, the encoder 152 outputs an encoded representation of the normalized image, which is also referred to herein as a “latent representation” of the normalized image.” and column 6, line 64-column 7, line 6, “To change the identity of a face appearing in the input image 200 to one of the facial identities 220 or 230, a user can select to use AdaIN coefficients output by one of the sets of dense layers 222 or 232, respectively, to control convolution layers in levels 212 2 to 212 n of the decoder 154. As described, AdaIN coefficients are coefficients that can be used to perform multiplications and/or additions on activations of convolution layers, which is similar to performing an affine transformation and can cause the decoder 154 to generate images including different facial identities.” and column 2, lines 8-12, “At least one technical advantage of the disclosed techniques relative to the prior art is that the disclosed techniques can be effectively utilized to change or modify facial identities in high-resolution (e.g., megapixel) images or frames of a video.”) 8. 
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the system of Ganong to include the disclosure of a set of alignment information that describes at least a rotation of the first input user face and a translation of the first input user face in the cropped image relative to the recorded video frame; use a target face swap model associated with the first target face and the cropped image to generate a first representation of the first target face that matches a facial expression of the first input user face in the cropped image; transform the first representation of the first target face using the set of alignment information to match the at least the rotation of the first input user face and the translation of the first input user face in the recorded video frame; and overlay, using the first mapping, a transformed first representation of the first target face over the first input user face in the recorded video frame, of Helminger. The motivation for this modification could have been to ensure that, when an image and the target face swap model associated with the face are ready for overlay, the final overlaid facial representation is properly oriented, fits the target face properly, and matches the target face expression. These steps help further the illusion that the person’s face is different from what it actually is.

9.
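The alignment limitation at issue can be made concrete with a short sketch. The following is a minimal illustration only, assuming eye landmarks as the alignment cue (in the spirit of the DAN-style normalization quoted from Helminger: eyes leveled on a horizontal line at a fixed ocular distance); all function names and default values are illustrative, not taken from either reference. It computes the rotation, scale, and translation relating frame coordinates to crop coordinates, and the inverse mapping used to place a generated face back into the frame:

```python
import numpy as np

def alignment_from_eyes(left_eye, right_eye, out_size=256, ocular_dist=80.0):
    """Estimate the rotation, scale, and translation that map a face in the
    full frame into a normalized crop (eye line horizontal, fixed
    inter-ocular distance). Illustrative sketch only."""
    left_eye = np.asarray(left_eye, dtype=float)
    right_eye = np.asarray(right_eye, dtype=float)
    delta = right_eye - left_eye
    angle = np.arctan2(delta[1], delta[0])        # rotation of the eye line
    scale = ocular_dist / np.linalg.norm(delta)   # scale to fixed ocular distance
    center = (left_eye + right_eye) / 2.0         # midpoint between the eyes
    # 2x2 rotation+scale matrix that levels the eye line
    c, s = np.cos(-angle), np.sin(-angle)
    M = scale * np.array([[c, -s], [s, c]])
    # translation placing the eye midpoint at the crop center
    t = np.array([out_size / 2.0, out_size / 2.0]) - M @ center
    return M, t, angle

def to_crop(pt, M, t):
    """Map a frame-coordinate point into crop coordinates."""
    return M @ np.asarray(pt, dtype=float) + t

def to_frame(pt, M, t):
    """Inverse mapping: crop coordinates back to frame coordinates, i.e.
    the 'transform ... using the set of alignment information' direction."""
    return np.linalg.solve(M, np.asarray(pt, dtype=float) - t)
```

A face generated in crop coordinates would be mapped back with the inverse transform, restoring the rotation and translation the face had in the recorded video frame before being overlaid.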
As per claim 2, Ganong in view of Helminger discloses: The system of claim 1, wherein to associate the first face ID with the first input user face comprises to: obtain previously generated facial signatures associated with known faces; (Ganong, page 7, ¶ [0103], “The computer program may then access a database, wherein the database links face signatures with a list of known persons, each known person being associated with one or more face signatures.”) generate new facial signatures corresponding to the first input user face and the second input user face; (Ganong, page 7, ¶ [0103], “The computer program may perform a face detection technique to detect the one or more faces in the image, which may result in the generation of one or more face signatures, each face signature corresponding to one of the faces.”) compare the previously generated facial signatures to the new facial signatures; and (Ganong, page 18, ¶ [0230], “The at least one computer may receive at least one unidentified portrait and compare a face signature of the at least one unidentified portrait against face signatures of portraits of identified persons known to the user.”) associate the first input user face with the first face ID that corresponds to a previously generated facial signature that matches a new facial signature associated with the first input user face. (Ganong, page 14, ¶ [0196]-[0197], “Finally, looping (161) may be applied to match the unknown face with a known person. Each face signature (represented as an array of numbers) may be mathematically compared to any other face signature using linear or non-linear classification logic to determine a distance value (163). ... To compare a face to all of the faces associated with a known person, all of the individual one-to-one comparisons may be made, and then either all of the results may be used in the next step or a set of best matches as determined by comparison to some threshold (165) may be used.”) 10. 
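The signature-matching logic Ganong describes in ¶ [0196]-[0197] (face signatures as arrays of numbers, compared pairwise by a distance value against a threshold) can be sketched briefly. This is an illustrative stand-in only: the Euclidean metric, the 0.6 threshold, and all names are assumptions, not details from the reference:

```python
import math

def signature_distance(sig_a, sig_b):
    # One-to-one comparison of two face signatures (arrays of numbers),
    # reduced here to a Euclidean distance value.
    return math.dist(sig_a, sig_b)

def match_face_id(new_sig, known_signatures, threshold=0.6):
    """Return the face ID whose stored signature best matches new_sig,
    or None when no stored signature falls within the threshold
    (the unidentified-face case)."""
    best_id, best_dist = None, threshold
    for face_id, signatures in known_signatures.items():
        for sig in signatures:
            d = signature_distance(new_sig, sig)
            if d < best_dist:
                best_id, best_dist = face_id, d
    return best_id
```

A matched ID would then be associated with the detected face (the "first face ID" of the claims); an unmatched signature would instead trigger enrollment of a new ID.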
As per claim 3, Ganong in view of Helminger discloses: The system of claim 2, wherein the previously generated facial signature comprises a first previously generated facial signature, the new facial signature comprises a first new facial signature, and wherein the processor is further configured to: associate the second input user face with a second face ID that corresponds to a second previously generated facial signature that matches a second new facial signature associated with the second input user face. (Ganong, page 7, ¶ [0103], “The computer program may perform a face detection technique to detect the one or more faces in the image, which may result in the generation of one or more face signatures, each face signature corresponding to one of the faces. The computer program may then access a database, wherein the database links face signatures with a list of known persons, each known person being associated with one or more face signatures. Each detected face signature may be provided to the individual as associated to the corresponding known person, or where the face signature is not associated with any known person, that information can be provided by the individual.” and page 14, ¶ [0196]-[0197], “Finally, looping (161) may be applied to match the unknown face with a known person. Each face signature (represented as an array of numbers) may be mathematically compared to any other face signature using linear or non-linear classification logic to determine a distance value (163). ... To compare a face to all of the faces associated with a known person, all of the individual one-to-one comparisons may be made, and then either all of the results may be used in the next step or a set of best matches as determined by comparison to some threshold (165) may be used.”) 11. 
As per claim 4, Ganong in view of Helminger discloses: The system of claim 1, wherein to associate the first face ID with the first input user face comprises to: determine that a set of reference images is available; (Ganong, page 4, ¶ [0039], “In accordance with an aspect of the present invention, there is provided a method performed by at least one computer, the at least one computer comprising or interfacing with a database, the database comprising a plurality of portraits, each portrait associated with an identified person shown in the respective portrait ...”) determine that the cropped image of the first input user face from the recorded video frame matches a first reference image; and (Ganong, page 3, ¶ [0029], “The present invention, in a further aspect thereof, enables capturing portraits of people whose faces are located in an image. ... The computer program may be configured to adjust the size of the face region in order to capture and create a portrait (or thumbnail) of the person.” and page 15, ¶ [0207], “The record of the portrait/thumbnail image 23d may be recorded in the database as illustrated in FIG. 13 for future use.” and page 14, ¶ [0196]-[0197], “Finally, looping (161) may be applied to match the unknown face with a known person. Each face signature (represented as an array of numbers) may be mathematically compared to any other face signature using linear or non-linear classification logic to determine a distance value (163). ... To compare a face to all of the faces associated with a known person, all of the individual one-to-one comparisons may be made, and then either all of the results may be used in the next step or a set of best matches as determined by comparison to some threshold (165) may be used.” and Claim 5, “5. 
The method of claim 1 wherein the at least one digital image is from a frame of a video having a plurality of frames.”) in response to the determination that the cropped image of the first input user face from the recorded video frame matches the first reference image, associate the first face ID of the first reference image with the first input user face. (Ganong, page 7, ¶ [0103], “The computer program may then access a database, wherein the database links face signatures with a list of known persons, each known person being associated with one or more face signatures. Each detected face signature may be provided to the individual as associated to the corresponding known person, or where the face signature is not associated with any known person, that information can be provided by the individual. The individual may be provided a means to confirm the association between a face signature and a known person.” and Claim 5, “5. The method of claim 1 wherein the at least one digital image is from a frame of a video having a plurality of frames.”) 12. As per claim 5, Ganong in view of Helminger discloses: The system of claim 4, wherein the cropped image comprises a first cropped image, and wherein the processor is further configured to: determine that a second cropped image of the second input user face from the recorded video frame matches a second reference image; and (Ganong, page 3, ¶ [0029], “The present invention, in a further aspect thereof, enables capturing portraits of people whose faces are located in an image. ... The computer program may be configured to adjust the size of the face region in order to capture and create a portrait (or thumbnail) of the person.” and page 15, ¶ [0207], “The record of the portrait/thumbnail image 23d may be recorded in the database as illustrated in FIG. 13 for future use.” and page 14, ¶ [0196]-[0197], “Finally, looping (161) may be applied to match the unknown face with a known person. 
Each face signature (represented as an array of numbers) may be mathematically compared to any other face signature using linear or non-linear classification logic to determine a distance value (163). ... To compare a face to all of the faces associated with a known person, all of the individual one-to-one comparisons may be made, and then either all of the results may be used in the next step or a set of best matches as determined by comparison to some threshold (165) may be used.” and Claim 5, “5. The method of claim 1 wherein the at least one digital image is from a frame of a video having a plurality of frames.”) in response to the determination that the second cropped image of the second input user face from the recorded video frame matches the second reference image, associate a second face ID of the second reference image with the second input user face. (Ganong, page 7, ¶ [0103], “The computer program may then access a database, wherein the database links face signatures with a list of known persons, each known person being associated with one or more face signatures. Each detected face signature may be provided to the individual as associated to the corresponding known person, or where the face signature is not associated with any known person, that information can be provided by the individual. The individual may be provided a means to confirm the association between a face signature and a known person.” and Claim 5, “5. The method of claim 1 wherein the at least one digital image is from a frame of a video having a plurality of frames.”) 13. As per claim 6, Ganong in view of Helminger discloses: The system of claim 1, wherein the cropped image comprises the first input user face comprises a first cropped image of the first input user face, (Helminger, column 4, lines 6-21, “As discussed in greater detail below, the encoder 152 takes as input a two-dimensional (2D) image that includes a face and has been normalized. 
For example, the image could be a high-definition (HD) resolution image, such as a one megapixel image, including a face that has been normalized. ... In embodiments, an image can be normalized in any technically-feasible manner, including using face alignment techniques to compute an affine transformation that rotates, scales, translates, etc. the image, and/or cropping the image.”) and wherein to associate the first face ID with the first input user face comprises to: receive a first operator submission of the first face ID with the first input user face; receive a second operator submission of a second face ID with the second input user face; (Ganong, page 7, ¶ [0103], “The present invention, in another aspect thereof, provides a computer program operable to enable each of the individuals to interface with the networked computer architecture herein provided for sharing information including images. The computer program enables the individuals to upload images including images having depictions of the faces of one or more persons. The computer program may perform a face detection technique to detect the one or more faces in the image, which may result in the generation of one or more face signatures, each face signature corresponding to one of the faces.”) store a first cropped image of the first input user face as a first reference image; and store a second cropped image of the second input user face as a second reference image. (Ganong, page 3, ¶ [0029], “The present invention, in a further aspect thereof, enables capturing portraits of people whose faces are located in an image. ... The computer program may be configured to adjust the size of the face region in order to capture and create a portrait (or thumbnail) of the person.” and page 15, ¶ [0207], “The record of the portrait/thumbnail image 23d may be recorded in the database as illustrated in FIG. 13 for future use.”) 14. 
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the system of claim 1 of Ganong to include the disclosure that the cropped image comprises a first cropped image of the first input user face, of Helminger. The motivation for this modification could have been to ensure that a proper cropped image related to the first input face is ready to be overlaid on top of the user face. Cropping ensures that the image is of the proper dimensions so that it is a good fit over the user face.

15. As per claim 8, Ganong in view of Helminger discloses: The system of claim 6, wherein the recorded video frame comprises a first recorded video frame, and wherein the processor is further configured to: receive a second recorded video frame; (Ganong, Claim 5, “5. The method of claim 1 wherein the at least one digital image is from a frame of a video having a plurality of frames.”) determine a third cropped image of a third input user face from the second recorded video frame; determine a fourth cropped image of a fourth input user face from the second recorded video frame; (Ganong, page 3, ¶ [0029], “The present invention, in a further aspect thereof, enables capturing portraits of people whose faces are located in an image. ... The computer program may be configured to adjust the size of the face region in order to capture and create a portrait (or thumbnail) of the person.” and page 15, ¶ [0207], “The record of the portrait/thumbnail image 23d may be recorded in the database as illustrated in FIG. 13 for future use.” and Claim 5, “5.
The method of claim 1 wherein the at least one digital image is from a frame of a video having a plurality of frames.”) in response to a determination that the third cropped image matches the first reference image, associate the third input user face with the first face ID; and in response to a determination that the fourth cropped image matches the second reference image, associate the fourth input user face with the second face ID. (Ganong, page 14, ¶ [0196]-[0197], “Finally, looping (161) may be applied to match the unknown face with a known person. Each face signature (represented as an array of numbers) may be mathematically compared to any other face signature using linear or non-linear classification logic to determine a distance value (163). ... To compare a face to all of the faces associated with a known person, all of the individual one-to-one comparisons may be made, and then either all of the results may be used in the next step or a set of best matches as determined by comparison to some threshold (165) may be used.” and Claim 5, “5. The method of claim 1 wherein the at least one digital image is from a frame of a video having a plurality of frames.” and page 7, ¶ [0103], “The computer program may then access a database, wherein the database links face signatures with a list of known persons, each known person being associated with one or more face signatures. Each detected face signature may be provided to the individual as associated to the corresponding known person, or where the face signature is not associated with any known person, that information can be provided by the individual. The individual may be provided a means to confirm the association between a face signature and a known person.”) 16. 
As per claim 9, Ganong in view of Helminger discloses: The system of claim 1, wherein to use the target face swap model associated with the first target face to generate the first representation of the first target face comprises to encode at least a portion of the cropped image into a plurality of user extrinsic features; and use the target face swap model and the plurality of user extrinsic features to generate the first representation of the first target face. (Helminger, column 3, lines 55-58, “The model trainer 116 is configured to train machine learning models, including a machine learning (ML) model 150 that can be used to change the identities of faces in images.” and column 4, lines 14-21, “As a result, facial features such as the eyes, nose, etc. are at similar locations within normalized images that are input into the encoder 152, which can improve training of the ML model 150. In embodiments, an image can be normalized in any technically-feasible manner, including using face alignment techniques to compute an affine transformation that rotates, scales, translates, etc. the image, and/or cropping the image.” and column 4, lines 31-34, “Given a normalized image that includes a face, the encoder 152 outputs an encoded representation of the normalized image, which is also referred to herein as a “latent representation” of the normalized image.”) 17. Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the system of claim 1 of Ganong to include the disclosure of using target face swap models to encode a plurality of extrinsic features from target faces and generate representations of those faces, of Helminger. The motivation for this modification could have been to allow the encoded target face swap models to be transformed into "latent representation" for use with machine learning models, such as for training. 
In addition, a "latent representation" provides a more optimally compressed version of encoded target face swap models.

18. As per claim 10, Ganong in view of Helminger discloses: The system of claim 9, wherein the set of user facial features comprises a first set of user facial features, wherein the cropped image comprises a first cropped image, wherein the target face swap model comprises a first target face swap model, and wherein the plurality of user extrinsic features comprises a first plurality of user extrinsic features, and wherein the processor is further configured to: obtain a second set of user facial features corresponding to the second input user face; (Ganong, page 7, ¶ [0103], “The computer program may perform a face detection technique to detect the one or more faces in the image, which may result in the generation of one or more face signatures, each face signature corresponding to one of the faces.” and page 18, ¶ [0229], “Another variation on this invention would be to look for people with similar facial features taken separately from the entire face—such as mouth, nose, and eyes.”)

use at least the recorded video frame and the second set of user facial features to generate a second cropped image comprising the second input user face; (Ganong, Claim 5, “5. The method of claim 1 wherein the at least one digital image is from a frame of a video having a plurality of frames.” and page 3, ¶ [0029], “The present invention, in a further aspect thereof, enables capturing portraits of people whose faces are located in an image. ... The computer program may be configured to adjust the size of the face region in order to capture and create a portrait (or thumbnail) of the person.” and page 15, ¶ [0207], “The record of the portrait/thumbnail image 23d may be recorded in the database as illustrated in FIG. 13 for future use.”)

use a second target face swap model associated with a second target face to encode at least a portion of the second cropped image into a second plurality of user extrinsic features; and use the second target face swap model and the second plurality of user extrinsic features to generate a second representation of the second target face. (Helminger, column 3, lines 55-58, “The model trainer 116 is configured to train machine learning models, including a machine learning (ML) model 150 that can be used to change the identities of faces in images.” and column 4, lines 14-21, “As a result, facial features such as the eyes, nose, etc. are at similar locations within normalized images that are input into the encoder 152, which can improve training of the ML model 150. In embodiments, an image can be normalized in any technically-feasible manner, including using face alignment techniques to compute an affine transformation that rotates, scales, translates, etc. the image, and/or cropping the image.” and column 4, lines 31-34, “Given a normalized image that includes a face, the encoder 152 outputs an encoded representation of the normalized image, which is also referred to herein as a “latent representation” of the normalized image.” and column 12, lines 50-54, “Although described with respect to a single image, the face changing application 146 can also receive a video including multiple frames that include faces and process each frame according to steps of the method 700.”)

19. Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the system of claim 9 of Ganong to include the disclosure of using target face swap models to encode a plurality of extrinsic features from target faces and generate representations of those faces, of Helminger.
The motivation for this modification could have been to allow the encoded target face swap models to be transformed into "latent representation" for use with machine learning models, such as for training. In addition, "latent representation" provides a more optimally compressed version of encoded target face swap models.

20. As per claim 11, Ganong in view of Helminger discloses: The system of claim 10, wherein the processor is further configured to overlay the second representation of the second target face on the recorded video frame. (Ganong, page 16, ¶ [0214], “FIG. 25 illustrates using a selected image to overlay on a digital image photo to cover the face of a subject, also known as face substitution. An application may be to hide negative memories.” and page 16, ¶ [0215], “FIG. 34 illustrates a workflow for face substitution. When a user wants to hide negative memories one aspect of the present invention may matche faces in the face database 34a to be hidden in the photos from the photo database 34d with an image that is selected or provided by a user which is stored in the negative memory image database 34b. ... The image retrieved from the negative memory image database 34b is resized 34c to match the size requirements of the faces to be hidden ...” and page 16, ¶ [0216], “Optionally, the masking may comprise overlaying a selected image over the area co-ordinate corresponding to the location of the at least one face to be suppressed.” and Claim 5, “5. The method of claim 1 wherein the at least one digital image is from a frame of a video having a plurality of frames.” and page 2, ¶ [0020], “In a further aspect of the present invention, a system for recognizing one or more faces in a digital image is provided, the system characterized by: (a) one or more face coordinates corresponding to one or more candidate regions for one or more faces ...”) 21.
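The face-substitution workflow quoted from Ganong (retrieve a replacement image, resize it to the face region, overlay it at the face's coordinates) amounts to a compositing step. Below is a minimal sketch with stand-in 2-D arrays; a plain copy stands in for real masking or blending, and all names, shapes, and values are hypothetical:

```python
def overlay(frame, patch, top_left):
    """Paste a (pre-resized) replacement-face patch over a face region.

    Mirrors the quoted workflow: the replacement image is resized to the
    face region, then overlaid at the face's coordinates. An in-place
    copy stands in for real masking/blending; everything is illustrative.
    """
    y, x = top_left
    for dy, row in enumerate(patch):
        for dx, value in enumerate(row):
            frame[y + dy][x + dx] = value
    return frame

frame = [[0] * 8 for _ in range(8)]  # stand-in for a recorded video frame
face_a = [[1, 1], [1, 1]]            # first target-face representation
face_b = [[2, 2], [2, 2]]            # second target-face representation
out = overlay(overlay(frame, face_a, (1, 1)), face_b, (5, 5))
print(out[1][1], out[5][5], out[0][0])  # 1 2 0
```

Chaining the two calls on one frame corresponds to claim 12's output of a single frame carrying both overlays.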
As per claim 12, Ganong in view of Helminger discloses: The system of claim 11, wherein the processor is configured to output the recorded video frame with both overlays of the first representation of the first target face and the second representation of the second target face at a display. (Ganong, page 16, ¶ [0214], “FIG. 25 illustrates using a selected image to overlay on a digital image photo to cover the face of a subject, also known as face substitution. An application may be to hide negative memories.” and page 16, ¶ [0215], “FIG. 34 illustrates a workflow for face substitution. When a user wants to hide negative memories one aspect of the present invention may matche faces in the face database 34a to be hidden in the photos from the photo database 34d with an image that is selected or provided by a user which is stored in the negative memory image database 34b. ... The image retrieved from the negative memory image database 34b is resized 34c to match the size requirements of the faces to be hidden ...” and page 16, ¶ [0216], “Optionally, the masking may comprise overlaying a selected image over the area co-ordinate corresponding to the location of the at least one face to be suppressed.” and Claim 5, “5. The method of claim 1 wherein the at least one digital image is from a frame of a video having a plurality of frames.” and page 2, ¶ [0020], “In a further aspect of the present invention, a system for recognizing one or more faces in a digital image is provided, the system characterized by: (a) one or more face coordinates corresponding to one or more candidate regions for one or more faces ...”)

22. Claim 13, which is similar in scope to claim 1, is thus rejected under the same rationale as described above.

23. Claim 14, which is similar in scope to claims 2 and 13, is thus rejected under the same rationale as described above.

24. Claim 15, which is similar in scope to claims 3 and 13, is thus rejected under the same rationale as described above. 25.
Claim 16, which is similar in scope to claims 4 and 13, is thus rejected under the same rationale as described above.

26. Claim 17, which is similar in scope to claims 5 and 13, is thus rejected under the same rationale as described above.

27. Claim 18, which is similar in scope to claims 6 and 13, is thus rejected under the same rationale as described above.

28. Claim 20, which is similar in scope to claims 8 and 13, is thus rejected under the same rationale as described above.

29. Claim 21, which is similar in scope to claims 9 and 13, is thus rejected under the same rationale as described above.

30. Claim 22 is similar in scope to claim 1 except for additional limitations that Ganong in view of Helminger discloses: A computer program product, the computer program product being embodied in a non-transitory computer readable storage medium and comprising computer instructions for: (Ganong, page 25, ¶ [0302], “In further aspects, the disclosure provides systems, devices, methods, and computer programming products, including non-transitory computer readable memory, or non-transient machine-readable instruction sets, for use in implementing such methods and enabling the functionality described previously.”) Claim 22 is also rejected under the same rationale as claim 1, described above.

Conclusion

31. The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. These are as follows: Weber et al. (US-2024/0078726-A1) and He et al. (US-2025/0054108-A1).

32. Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.
In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

33. Any inquiry concerning this communication or earlier communications from the examiner should be directed to MATTHEW CLOTHIER whose telephone number is (571)272-4667. The examiner can normally be reached Mon-Fri 8:00am-4:00pm.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kent Chang, can be reached at (571)272-7667. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MATTHEW CLOTHIER/
Examiner, Art Unit 2614

/KENT W CHANG/
Supervisory Patent Examiner, Art Unit 2614
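For reference, the Helminger normalization quoted in the rejection (an affine transformation that rotates, scales, and translates the image so facial features land at consistent locations before encoding) can be illustrated with a small sketch. This assumes two eye landmarks as the alignment cue; every name and value is hypothetical, not taken from the reference:

```python
import math

def level_eyes_transform(left_eye, right_eye):
    """Build a 2x3 affine transform rotating the face about the eye midpoint
    so the eye line becomes horizontal.

    Sketches the quoted normalization (rotate/translate so facial features
    land at consistent locations). Landmark choice and names are illustrative.
    """
    lx, ly = left_eye
    rx, ry = right_eye
    angle = math.atan2(ry - ly, rx - lx)   # in-frame rotation of the face
    cx, cy = (lx + rx) / 2, (ly + ry) / 2  # rotate about the eye midpoint
    c, s = math.cos(-angle), math.sin(-angle)
    # p' = R(p - center) + center, written as a 2x3 matrix [R | t]
    return [[c, -s, cx - c * cx + s * cy],
            [s,  c, cy - s * cx - c * cy]]

def apply_affine(m, point):
    x, y = point
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

M = level_eyes_transform((30.0, 60.0), (70.0, 40.0))
left2 = apply_affine(M, (30.0, 60.0))
right2 = apply_affine(M, (70.0, 40.0))
print(round(abs(left2[1] - right2[1]), 6))  # 0.0, eyes are level after alignment
```

Inverting such a transform is what lets a generated face be rotated and translated back into the original frame, as the claims recite.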

Prosecution Timeline

Jun 13, 2023
Application Filed
Aug 09, 2025
Non-Final Rejection — §103
Nov 04, 2025
Interview Requested
Nov 13, 2025
Applicant Interview (Telephonic)
Nov 14, 2025
Response Filed
Nov 17, 2025
Examiner Interview Summary
Mar 03, 2026
Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12530842
AIRBORNE LiDAR POINT CLOUD FILTERING METHOD DEVICE BASED ON SUPER-VOXEL GROUND SALIENCY
2y 5m to grant Granted Jan 20, 2026
Patent 12499800
IN-VEHICLE DISPLAY DEVICE
2y 5m to grant Granted Dec 16, 2025
Study what changed to get past this examiner. Based on 2 most recent grants.


Prosecution Projections

3-4
Expected OA Rounds
100%
Grant Probability
99%
With Interview (+0.0%)
1y 11m
Median Time to Grant
Moderate
PTA Risk
Based on 3 resolved cases by this examiner. Grant probability derived from career allow rate.
