Prosecution Insights
Last updated: April 19, 2026
Application No. 18/600,717

PASSIVE AND CONTINUOUS DEEP LEARNING METHODS AND SYSTEMS FOR THE GENERATION OF OBJECTS RELATIVE TO A FACE

Final Rejection — §103, §DP

Filed: Mar 10, 2024
Examiner: HE, WEIMING
Art Unit: 2611
Tech Center: 2600 — Communications
Assignee: BLINK O.G. LTD.
OA Round: 2 (Final)

Grant Probability: 46% (Moderate)
Expected OA Rounds: 3-4
Estimated Time to Grant: 3y 4m
Grant Probability With Interview: 60%

Examiner Intelligence

Career Allow Rate: 46% (190 granted / 410 resolved; -15.7% vs. Tech Center average)
Interview Lift: +13.8% (moderate; allowance among resolved cases with an interview vs. without)
Typical Timeline: 3y 4m average prosecution; 40 applications currently pending
Career History: 450 total applications across all art units
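
The headline figures hang together arithmetically. A minimal check in Python (the additive interview adjustment is an assumption about how the tool computes its "With Interview" number, not a documented formula; the projections panel below does note that grant probability is derived from the career allow rate):

```python
# Sanity-check the dashboard's headline figures from its own raw counts.
# The additive interview adjustment is an assumption, not a documented formula.
granted, resolved = 190, 410

allow_rate = granted / resolved                # 0.463... -> shown as 46%
interview_lift = 0.138                         # shown as "+13.8% Interview Lift"
with_interview = allow_rate + interview_lift   # 0.601... -> shown as 60%

print(f"career allow rate: {allow_rate:.1%}")      # 46.3%
print(f"with interview:    {with_interview:.1%}")  # 60.1%
```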

Statute-Specific Performance

§101: 7.4% (-32.6% vs. TC avg)
§103: 59.2% (+19.2% vs. TC avg)
§102: 12.4% (-27.6% vs. TC avg)
§112: 15.0% (-25.0% vs. TC avg)

Deltas are relative to the Tech Center average estimate (the black line in the source chart). Based on career data from 410 resolved cases.
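
A quick consistency check: every statute's displayed rate minus its delta backs out to the same Tech Center baseline, which suggests the chart compares against a single flat estimate. A minimal sketch (the 40% figure is inferred from the displayed numbers, not a documented value):

```python
# Back out the Tech Center baseline implied by each "vs TC avg" delta.
# Rates and deltas are the values shown in the panel above; the flat-40%
# conclusion is an inference from them, not a documented parameter.
rates = {
    "§101": (7.4, -32.6),
    "§103": (59.2, +19.2),
    "§102": (12.4, -27.6),
    "§112": (15.0, -25.0),
}

for statute, (rate, delta) in rates.items():
    tc_avg = rate - delta  # examiner rate = TC average + delta
    print(f"{statute}: implied TC average = {tc_avg:.1f}%")  # 40.0% for all four
```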

Office Action

Grounds: §103 (obviousness), §DP (nonstatutory double patenting)
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

The amendment filed on 2/26/26 has been entered and made of record. Claims 1, 8, 14 and 18 are amended. Claims 1-20 are pending.

Response to Arguments

Applicant’s arguments with respect to claims 1 and 18 have been fully considered but they are moot because the arguments do not apply to the references being used in the current rejection.

Double Patenting

The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory obviousness-type double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).

A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the conflicting application or patent either is shown to be commonly owned with this application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. Effective January 1, 1994, a registered attorney or agent of record may sign a terminal disclaimer. A terminal disclaimer signed by the assignee must fully comply with 37 CFR 3.73(b).

Claims 1-13, 15-17 and 19-20 are provisionally rejected on the ground of non-statutory obviousness-type double patenting as being unpatentable over claims 1-20 of copending US Application 18/600,721 in view of Lukac et al. (US 2023/0086807 A1). Although the conflicting claims are not identical, they are not patentably distinct from each other because the present claims have the same scope as the claims of the copending application. This is a provisional obviousness-type double patenting rejection because the conflicting claims have not in fact been patented.

Table 1 illustrates the conflicting claims:

Present Application 18/600,717:   1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20
Copending Application 18/600,721: 1  2  3  4  5  6  7  8  9  10  11  12  13  —   15  16  17  —   19  20

(Present claims 14 and 18 have no counterpart, consistent with their exclusion from this rejection.)

Table 2 provides a comparative mapping of the limitations of claim 1 of the present application against the limitations of claim 1 of Copending Application 18/600,721.
Present Application 18/600,717:

1. A computer-implemented method for generating augmented face images, the method comprising: training a deep neural network, wherein the training the deep neural network comprises: accepting face digital image data of one or more users, wherein the face digital image data includes at least head pose data, face and eye landmark data, eye localization data, and gaze direction data; accepting segmented eye region image data of the one or more users; accepting facial object information; using the face digital image data, the segmented eye region image data, and the facial object information to train a deep neural network to encode and decode at least one facial object characteristic, wherein the deep neural network is operable to reconstruct the at least one facial object characteristic post-training; and providing a head pose encoder, an eye pose encoder, a face encoder, a glasses frame encoder and a lens encoder; providing a shared eye decoder, wherein the shared eye decoder is shared by all of the head pose encoder, the eye pose encoder, the face encoder, the glasses frame encoder and the lens encoder; receiving, by a device for generating the augmented face images, an input image of a real face of a real individual viewing a digital display of the device; receiving, by the shared eye decoder, inference head pose data from the head pose encoder, inference segmented eye region image data from the eye pose encoder, and inference face digital image data from the face encoder based on the input image of the real face of the real individual; receiving, by the device for generating the augmented face images, a user selection of at least one inference facial object to be augmented on said input image; receiving, by the shared eye decoder, a combination of embeddings from the head pose encoder, the eye pose encoder, the face encoder, the glasses frame encoder and the lens encoder, wherein embeddings from the glasses frame encoder and the lens encoder are based on the user selection of the at least one inference facial object; decoding, by the shared eye decoder, the combination of embeddings from the head pose encoder, the eye pose encoder, the face encoder, the glasses frame encoder and the lens encoder to render a modification of the input image of the real face of the real individual with a virtual object corresponding to the user selected at least one inference facial object; and generating one or more images of the real face of the real individual with the inference facial object in place on the image, wherein the generating the one or more images of the real face comprises generating, by the device for generating the augmented face images, the modification of the input image of the real face of the real individual with the inference facial object in place on the image in real-time, wherein the one or more images of the individual with the inference facial object in place on the image includes an inference of facial object appearance derived from the deep neural network based on the user head pose data, segmented eye region image data, face digital image data, and the user selection of at least one inference facial object.

2. The computer-implemented method of claim 1, wherein the head pose data comprises at least one of pan data, tilt data, or pan-tilt data pairs.

3. The computer-implemented method of claim 1, wherein the eye landmark data comprises iris image data and outer region of the eye image data.

4. The computer-implemented method of claim 1, wherein the gaze direction data comprises at least one of gaze angle data or point-of-regard data.

5. The computer-implemented method of claim 1, wherein the face digital image data comprises at least one of eye region image data, eyeglasses lens region image data, or whole face image data.

6. The computer-implemented method of claim 1, wherein the accepting segmented eye region image data of the user comprises accepting segmented eyeglasses lens region image data of the user.

7. The computer-implemented method of claim 6, wherein the accepting segmented eyeglasses lens region image data of the user comprises accepting at least one estimated size of the lens region image data.

8. The computer-implemented method of claim 1, wherein the segmented eye region image data is received by a camera that is proximate to the display viewed by the user.

9. The computer-implemented method of claim 1, wherein the accepting facial object information comprises accepting at least one of eyeglasses information, facial hair information, face covering information, or plastic surgery information.

10. The computer-implemented method of claim 9, wherein the accepting eyeglasses information comprises accepting information about at least one of eyeglasses frame size, eyeglasses frame color, eyeglasses frame shape, eyeglasses lens size, eyeglasses lens color, eyeglasses lens coating type, eyeglasses lens shape, eyeglasses lens refraction properties, or eyeglasses lens opacity.

11. The computer-implemented method of claim 9, wherein the accepting face covering information comprises accepting information about at least one of mask size, mask color, mask texture, mask pattern, or mask composition.

12. The computer-implemented method of claim 9, wherein the accepting plastic surgery information comprises accepting information about at least one of nose appearance, lip appearance, jawline appearance, neck appearance, eye region appearance, face appearance, brow appearance, skin appearance, or forehead appearance.

13. The computer-implemented method of claim 1, wherein the at least one facial object characteristic comprises at least one eyeglasses characteristic.

15. The computer-implemented method of claim 1, wherein the at least one facial object characteristic comprises at least one of a facial hair characteristic, a face covering characteristic, or a plastic surgery characteristic.

16. The computer-implemented method of claim 1, wherein the receiving a user selection of an inference facial object comprises receiving a user selection of a pair of eyeglasses and at least one of a lens type or a lens coating type.

17. The computer-implemented method of claim 1, wherein the receiving a user selection of an inference facial object comprises receiving a user selection of at least one of a facial hair feature, a face covering, or a plastic surgery effect.

19. A system comprising one or more processors configured to carry out the operations of claim 1.

20. A computer program product comprising a non-transitory computer-readable medium having instructions that, when executed by a computer, cause the computer to perform the operations of claim 1.
Copending Application 18/600,721:

1. A computer-implemented method for removing an object from one or more images of a face, the method comprising: accepting face digital image data of a user, wherein the face digital image data includes at least head pose data, face and eye landmark data, eye position, and gaze direction data; accepting segmented eye region image data of the user; accepting facial object information; using the face digital image data, the segmented eye region image data, and the facial object information to train a deep neural network to encode and decode at least one face object region; wherein the deep neural network is operable to replace at least one face region image having a face object with a face region image without the face object post-training; receiving inference head pose data, inference segmented eye region image data, eye position data, gaze direction data, and inference face digital image data from an individual; receiving a user selection of at least one inference facial object; and generating one or more images of the individual without the inference facial object, wherein the one or more images of the individual without the inference facial object includes an inference of face appearance derived from the deep neural network based on the user head pose data, segmented eye region image data, face digital image data, and the user selection of at least one inference facial object.

2. The computer-implemented method of claim 1, wherein the head pose data comprises at least one of pan data, tilt data, or pan-tilt data pairs.

3. The computer-implemented method of claim 1, wherein the eye landmark data comprises iris image data and outer region of the eye image data.

4. The computer-implemented method of claim 1, wherein the gaze direction data comprises at least one of gaze angle data or point-of-regard data.

5. The computer-implemented method of claim 1, wherein the face digital image data comprises at least one of eye region image data, eyeglasses lens region image data, or whole face image data.

6. The computer-implemented method of claim 1, wherein the accepting segmented eye region image data of the user comprises accepting segmented eyeglasses lens region image data of the user.

7. The computer-implemented method of claim 6, wherein the accepting segmented eyeglasses lens region image data of the user comprises accepting at least one estimated size of the lens region image data.

8. The computer-implemented method of claim 1, wherein the segmented eye region image data is received by a camera that is proximate to the display viewed by the user.

9. The computer-implemented method of claim 1, wherein the accepting facial object information comprises accepting at least one of eyeglasses information, facial hair information, face covering information, or plastic surgery information.

10. The computer-implemented method of claim 9, wherein the accepting eyeglasses information comprises accepting information about at least one of eyeglasses frame size, eyeglasses frame color, eyeglasses frame shape, eyeglasses lens size, eyeglasses lens color, eyeglasses lens coating type, eyeglasses lens shape, eyeglasses lens refraction properties, or eyeglasses lens opacity.

11. The computer-implemented method of claim 9, wherein the accepting face covering information comprises accepting information about at least one of mask size, mask color, mask texture, mask pattern, or mask composition.

12. The computer-implemented method of claim 9, wherein the accepting plastic surgery information comprises accepting information about at least one of nose appearance, lip appearance, jawline appearance, neck appearance, eye region appearance, face appearance, brow appearance, skin appearance, or forehead appearance.

13. The computer-implemented method of claim 1, wherein the at least one facial object characteristic comprises at least one eyeglasses characteristic.

15. The computer-implemented method of claim 1, wherein the at least one face object region comprises at least one of a facial hair region, a face covering region, or a plastic surgery region.

16. The computer-implemented method of claim 1, wherein the receiving a user selection of an inference facial object comprises receiving a user selection of a pair of eyeglasses and at least one of a lens type or a lens coating type.

17. The computer-implemented method of claim 1, wherein the receiving a user selection of an inference facial object comprises receiving a user selection of at least one of a facial hair feature, a face covering, or a plastic surgery effect.

19. A system comprising one or more processors configured to carry out the operations of claim 1.

20. A computer program product comprising a non-transitory computer-readable medium having instructions that, when executed by a computer, cause the computer to perform the operations of claim 1.

As Table 2 illustrates, all the limitations of claim 1 of the present application are included in claim 1 of copending Application 18/600,721 except for the limitations shown in bold in the original action. Lukac further discloses “For example, an input image of a face may be divided into a mouth segment, and eyes segment, a nose segment, etc. A different latent code may then be optimized for each segment. As a result, when an image is reconstructed, each segment can be generated separately, and the final image composed from the generated segments and/or segments of the original input image. This reduces the number of constraints, which enables estimation of latent codes that produce a more accurate segment, as compared to a single latent code for an entire image. This also ensures precise localization, where changes in the latent code of one segment cannot affect other segments of the image since they are separately generated” in [0022]; “In some embodiments, each GAN 108-113 may be a clone of a single trained model (e.g., same parameters, weights, etc.), but may each optimize their own latent space for a particular segment of input images. For example, where the input images are of faces, a first GAN may generate a segment corresponding to a right eye, a second GAN may generate a segment corresponding to a left eye, a third GAN may generate a segment corresponding to hair, and so on. As each GAN is tasked with producing a realistic generated image for only a particular segment of the input image, the result is much more realistic than a single GAN generated a realistic version of the entire input image. The resulting generated segments are then provided to the segmentation layer 113, which can stitch the segments together into output image 114” in [0036]; “By manipulating those codes in a specific direction, one may alter the appearance of the input photo in a specific way while retaining the original visual features, e.g., adding more hair to a bald person while retaining its identity.” in [0032]; see also Fig 2-3 & 5-6.
Thus, claim 1 of the present application would have been obvious to one of ordinary skill in the art at the time of the invention, as anticipation of all limitations is tantamount to obviousness.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5, 8 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Kim et al. (US 2023/0343006 A1) in view of Vemulapalli (US 2023/0306789 A1) and Lukac et al. (US 2023/0086807 A1).

As to Claim 1, Kim teaches A computer-implemented method for generating augmented face images, the method comprising: accepting face digital image data of one or more users, wherein the face digital image data includes at least head pose data, face and eye landmark data, eye localization data, and gaze direction data; accepting segmented eye region image data of the one or more users; accepting facial object information (Kim discloses receiving source images of several persons and information on a facial expression, a mouth shape, an eye shape, etc. of a face in [0055]; “the device 100 for generating the virtual face may perform at least one extraction process selected from a group of face detection, face alignment, and face segmentation from an input source image.” in [0049]; see also [0010, 0012] and Fig 2 [embedded image of Kim, Fig. 2 omitted]. Here, Kim doesn’t explicitly teach head pose, eye landmark and gaze direction data, which would have been obvious in the context of facial recognition.
Vemulapalli further discloses “The captured images can be analyzed to identify head pose changes in parallel to eye gaze fixation on the optical target” in [0028]; “The deep learning based method may include face detection for finding faces, extracting facial landmarks from the detected faces using key points to locate the eye regions, and performing detection of eye state (open, closed) for the eyes in the detected eye regions… In some implementations, a head pose estimation is to provide input for a pretrained gaze estimation model that can indicate the gaze direction 208a, 208b.” in [0033]; see also [0035, 0042].);

using the face digital image data, the segmented eye region image data, and the facial object information to train a deep neural network to encode and decode at least one facial object characteristic, wherein the deep neural network is operable to reconstruct the at least one facial object characteristic post-training (Kim, Fig 2); and

generating one or more images of the real face of the real individual with the inference facial object in place on the image, wherein the one or more images of the individual with the inference facial object in place on the image includes an inference of facial object appearance derived from the deep neural network based on the user head pose data, segmented eye region image data, face digital image data, and the user selection of at least one inference facial object (Kim discloses “In addition, the images transmitted from the internetwork part 120 to the decoder 130 may generate a virtual face image that is a combination of an inferred source face and a feature of the background image through a decoding process of multiple deconvolution layers. The generated image does not have the same face as the source images, and a new virtual person that is a mix of the source images may be generated. In addition, the virtual person with the virtual face may be generated as not a person without a facial expression, but a person imitating the facial expression, the mouth shape, the eye shape, etc. of the background image.” in [0056].)

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Kim with the teaching of Vemulapalli so as to detect facial and eye landmarks, the location of eye regions, gaze direction and head pose from the captured image in a deep learning based method (Vemulapalli, [0033]).
In response to the new limitations:

training a deep neural network, wherein the training the deep neural network comprises: providing a head pose encoder, an eye pose encoder, a face encoder, a glasses frame encoder and a lens encoder; providing a shared eye decoder, wherein the shared eye decoder is shared by all of the head pose encoder, the eye pose encoder, the face encoder, the glasses frame encoder and the lens encoder; receiving, by a device for generating the augmented face images, an input image of a real face of a real individual viewing a digital display of the device; receiving, by the shared eye decoder, inference head pose data from the head pose encoder, inference segmented eye region image data from the eye pose encoder, and inference face digital image data from the face encoder based on the input image of the real face of the real individual; receiving, by the device for generating the augmented face images, a user selection of at least one inference facial object to be augmented on said input image; receiving, by the shared eye decoder, a combination of embeddings from the head pose encoder, the eye pose encoder, the face encoder, the glasses frame encoder and the lens encoder, wherein embeddings from the glasses frame encoder and the lens encoder are based on the user selection of the at least one inference facial object; decoding, by the shared eye decoder, the combination of embeddings from the head pose encoder, the eye pose encoder, the face encoder, the glasses frame encoder and the lens encoder to render a modification of the input image of the real face of the real individual with a virtual object corresponding to the user selected at least one inference facial object; and wherein the generating the one or more images of the real face comprises generating, by the device for generating the augmented face images, the modification of the input image of the real face of the real individual with the inference facial object in place on the image in real-time,

Kim discloses “The encoder 110 receives a background image of a face of a particular person…Through this, information on a facial expression, a mouth shape, an eye shape, etc. of a face may be obtained” in [0055]; “In addition, the face background data may include at least one selected from a group of a facial expression, an eye/nose/mouth shape, and eye blinking in a face” in [0010]; “After a virtual face generation model is generated through the learning part 200, the inference part 300 may receive one piece of face background data and generate virtual face data…” in [0042]; see also Fig 2 above.

Lukac further discloses “For example, an input image of a face may be divided into a mouth segment, and eyes segment, a nose segment, etc. A different latent code may then be optimized for each segment. As a result, when an image is reconstructed, each segment can be generated separately, and the final image composed from the generated segments and/or segments of the original input image. This reduces the number of constraints, which enables estimation of latent codes that produce a more accurate segment, as compared to a single latent code for an entire image. This also ensures precise localization, where changes in the latent code of one segment cannot affect other segments of the image since they are separately generated” in [0022]; “In some embodiments, each GAN 108-113 may be a clone of a single trained model (e.g., same parameters, weights, etc.), but may each optimize their own latent space for a particular segment of input images. For example, where the input images are of faces, a first GAN may generate a segment corresponding to a right eye, a second GAN may generate a segment corresponding to a left eye, a third GAN may generate a segment corresponding to hair, and so on. As each GAN is tasked with producing a realistic generated image for only a particular segment of the input image, the result is much more realistic than a single GAN generated a realistic version of the entire input image. The resulting generated segments are then provided to the segmentation layer 113, which can stitch the segments together into output image 114” in [0036]; “By manipulating those codes in a specific direction, one may alter the appearance of the input photo in a specific way while retaining the original visual features, e.g., adding more hair to a bald person while retaining its identity.” in [0032]; see also Fig 2-3 & 5-6.

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Kim and Vemulapalli with the teaching of Lukac so as to receive an input image and a segmentation mask, project, using a differentiable machine learning pipeline, a plurality of segments of the input image into a plurality of latent spaces associated with a plurality of generators to obtain a plurality of projected segments, and composite the plurality of projected segments into an output image (Lukac, Abstract).

As to Claim 2, Kim in view of Vemulapalli and Lukac teaches The computer-implemented method of claim 1, wherein the head pose data comprises at least one of pan data, tilt data, or pan-tilt data pairs (Vemulapalli discloses “The reference image can include one or more reference points or landmarks for extracting/cropping a region of interest. Using reference points (e.g., facial landmarks), the images can be cropped to extract the face of the subject illustrating different head poses (e.g., yaw, roll, or pitch change)” in [0042]; tracking head movement and eye gaze direction in [0020].)

As to Claim 3, Kim in view of Vemulapalli and Lukac teaches The computer-implemented method of claim 1, wherein the eye landmark data comprises iris image data and outer region of the eye image data (Kim discloses “That is, the face background data may include information capable of representing various facial shapes and movements of one person, for example, eyes, nose, mouth, eyebrows, etc.,” in [0041]. Vemulapalli also discloses “Eye landmarks are then used on the two selected images to crop the eye regions” in [0035]. It is obvious that the cropped eye regions may include iris image data.)
As to Claim 4, Kim in view of Vemulapalli and Lukac teaches The computer-implemented method of claim 1, wherein the gaze direction data comprises at least one of gaze angle data or point-of-regard data (Vemulapalli discloses “The captured images can be analyzed to identify head pose changes in parallel to eye gaze fixation on the optical target.” in [0028]; “the eyes 306a, 306b have eye gaze directions 328a, 328b focused on a point 326, spatially distanced 328a from the optical target 310 at an angle 328b” in [0028].)

As to Claim 5, Kim in view of Vemulapalli and Lukac teaches The computer-implemented method of claim 1, wherein the face digital image data comprises at least one of eye region image data, eyeglasses lens region image data, or whole face image data (Vemulapalli discloses “The head position 204 can be determined by using eye landmarks (e.g., intraocular distance) and facial landmarks (e.g., nose, mouth corners) relative to the axes XYZ of a Cartesian system of coordinates 214a, 214b, 214c… The deep learning based method may include face detection for finding faces, extracting facial landmarks from the detected faces using key points to locate the eye regions, and performing detection of eye state (open, closed) for the eyes in the detected eye regions” in [0033], see also [0035].)

As to Claim 8, Kim in view of Vemulapalli and Lukac teaches The computer-implemented method of claim 1, wherein the segmented eye region image data is received by a camera that is proximate to a display viewed by the user (Vemulapalli discloses “The mobile device can use the camera 116 to capture one or more images of the head and eyes of the subject 126. The captured images can be analyzed to identify head pose changes in parallel to eye gaze fixation on the optical target. The images captured by the user device 102 can be analyzed using an image analysis engine (e.g., image analysis engine 120 or 122).” in [0028].)

Claim 19 recites similar limitations as claim 1 but in a system form. Therefore, the same rationale used for claim 1 is applied. Claim 20 recites similar limitations as claim 1 but in a computer program product form. Therefore, the same rationale used for claim 1 is applied.

Claims 6-7, 9-10, 12-13 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Vemulapalli, Lukac and Zhang et al. (US 2019/0138854 A1).

As to Claim 6, Kim in view of Vemulapalli and Lukac teaches The computer-implemented method of claim 1. The combination of Zhang further teaches wherein the accepting segmented eye region image data of the user comprises accepting segmented eyeglasses lens region image data of the user (Kim discloses eye area in [0052] and “Through this, information on a facial expression, a mouth shape, an eye shape, etc. of a face may be obtained” in [0055]. Zhang further discloses “Specifically, as shown in FIG. 4, first, in step S41, the recognized merged area and lens area around the eyes are determined using affine transformation, based on keypoint information and lens information corresponding to the randomly changed glasses image as well as the recognized keypoint information near the eyes” in [0050]; “In step S42, an original image around the eyes and larger than the lens area is extracted from the face image in the second training data” in [0051].)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Kim, Vemulapalli and Lukac with the teaching of Zhang so as to extract the lens area around the eyes based on the keypoint information and lens information (Zhang, [0050]).

As to Claim 7, Kim in view of Vemulapalli, Lukac and Zhang teaches The computer-implemented method of claim 6, wherein the accepting segmented eyeglasses lens region image data of the user comprises accepting at least one estimated size of the lens region image data (Zhang discloses “Specifically, as shown in FIG. 4, first, in step S41, the recognized merged area and lens area around the eyes are determined using affine transformation, based on keypoint information and lens information corresponding to the randomly changed glasses image as well as the recognized keypoint information near the eyes” in [0050].)

As to Claim 9, Kim in view of Vemulapalli and Lukac teaches The computer-implemented method of claim 1, wherein the accepting facial object information comprises accepting at least one of eyeglasses information, facial hair information, face covering information, or plastic surgery information (Zhang discloses “Specifically, as shown in FIG. 4, first, in step S41, the recognized merged area and lens area around the eyes are determined using affine transformation, based on keypoint information and lens information corresponding to the randomly changed glasses image as well as the recognized keypoint information near the eyes” in [0050].) It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Kim, Vemulapalli and Lukac with the teaching of Zhang so as to extract the lens area around the eyes based on the keypoint information and lens information (Zhang, [0050]).

As to Claim 10, Kim in view of Vemulapalli, Lukac and Zhang teaches The computer-implemented method of claim 9, wherein the accepting eyeglasses information comprises accepting information about at least one of eyeglasses frame size, eyeglasses frame color, eyeglasses frame shape, eyeglasses lens size, eyeglasses lens color, eyeglasses lens coating type, eyeglasses lens shape, eyeglasses lens refraction properties, or eyeglasses lens opacity (Zhang discloses “In step S31, a glasses type is randomly selected from existing glasses types. That is, one type of glasses data, such as near-sighted glasses, certain frame and lens shapes, a certain frame thickness, a certain lens color and the like, is selected; corresponding keypoint information can represent a shape and a structure of the glasses…” in [0042].)

As to Claim 12, Kim in view of Vemulapalli, Lukac and Zhang teaches The computer-implemented method of claim 9, wherein the accepting plastic surgery information comprises accepting information about at least one of nose appearance, lip appearance, jawline appearance, neck appearance, eye region appearance, face appearance, brow appearance, skin appearance, or forehead appearance (Kim discloses “In addition, the face background data may include at least one selected from a group of a facial expression, an eye/nose/mouth shape, and eye blinking in a face” in [0010], see also [0055].)

As to Claim 13, Kim in view of Vemulapalli and Lukac teaches The computer-implemented method of claim 1.
The combination of Zhang further teaches wherein the at least one facial object characteristic comprises at least one eyeglasses characteristic (Zhang discloses “Glasses data is data prepared in advance, and includes various types of glasses such as plain glasses, near-sighted glasses, farsighted glasses and sunglasses, different frame and lens shapes, frame thicknesses, lens colors and the like. The glasses data comprises keypoint information, glasses images, and lens information of each type of glasses. The keypoint information is sufficient to represent a shape and a structure of the glasses…” in [0040].) It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Kim, Vemulapalli and Lukac with the teaching of Zhang so as to obtain the image data including the object properties (e.g., color, size, texture, etc.) (Zhang, [0050]).

As to Claim 16, Kim in view of Vemulapalli and Lukac teaches The computer-implemented method of claim 1, wherein the receiving a user selection of an inference facial object comprises receiving a user selection of a pair of eyeglasses and at least one of a lens type or a lens coating type (Zhang discloses “In step S31, a glasses type is randomly selected from existing glasses types. That is, one type of glasses data, such as near-sighted glasses, certain frame and lens shapes, a certain frame thickness, a certain lens color and the like, is selected; corresponding keypoint information can represent a shape and a structure of the glasses, for merging; a corresponding glass image reflects frame and lens shapes, a frame thickness, a lens color and the like;” in [0042].) It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Kim, Vemulapalli and Lukac with the teaching of Zhang so as to obtain the image data including the object properties (e.g., color, size, texture, etc.) (Zhang, [0050]).

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Vemulapalli and Lukac, further in view of Zhang and CHOUKROUN et al. (US 2018/0005448 A1).

As to Claim 11, Kim in view of Vemulapalli, Lukac and Zhang teaches The computer-implemented method of claim 9. The combination of CHOUKROUN further teaches wherein the accepting face covering information comprises accepting information about at least one of mask size, mask color, mask texture, mask pattern, or mask composition (Vemulapalli discloses 3D/2D masks in [0019]; “On the other hand, for certain PAIs such as 3D masks with fixed eyes…” in [0020]. CHOUKROUN further discloses “The mask is comprised of pixels covering a continuous or non-continuous area in the initial image. The mask may cover the entirety or a portion of the object. In the example of a pair of glasses, the mask may solely cover the frame of the pair of glasses, the frame and a part of the lenses, the frame and the lenses entirely, or only the lenses” in [0028]; “The modification of the appearance of the mask corresponds to a modification of the color and/or of the opacity of a part or the entirety of the pixels of the mask” in [0029]; “In a particular embodiment of the invention, the modification of the appearance of the mask comprises a step of substitution of the texture of all or part of the object in the final image” in [0030].)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Kim, Vemulapalli, Lukac and Zhang with the teaching of CHOUKROUN so as to identify the object properties (e.g., color, texture, and size) in the image.

Claims 15 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Vemulapalli, Lukac and Mallick et al. (US 2012/0075331 A1).

As to Claim 15, Kim in view of Vemulapalli and Lukac teaches The computer-implemented method of claim 1. The combination of Mallick further teaches wherein the at least one facial object characteristic comprises at least one of a facial hair characteristic, a face covering characteristic, or a plastic surgery characteristic (Vemulapalli discloses 3D/2D masks in [0019]; “On the other hand, for certain PAIs such as 3D masks with fixed eyes…” in [0020]. Mallick further discloses a user interface to select a facial hair characteristic for modification in Fig 3-10.) It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Kim, Vemulapalli and Lukac with the teaching of Mallick so as to allow a user to select a facial characteristic for modification.

As to Claim 17, Kim in view of Vemulapalli and Lukac teaches The computer-implemented method of claim 1. The combination of Mallick further teaches wherein the receiving a user selection of an inference facial object comprises receiving a user selection of at least one of a facial hair feature, a face covering, or a plastic surgery effect (Mallick further discloses a user interface to select a facial hair characteristic for modification in Fig 3-10.) It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Kim, Vemulapalli and Lukac with the teaching of Mallick so as to allow a user to select a facial characteristic for modification.

Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Vemulapalli and Lukac, further in view of Tai et al. (US 2020/0372243 A1). Claim 18 recites similar limitations as claim 1 except wherein the generating an image of the individual with the inference facial object in place comprises generating an image of the individual wearing a pair of virtual eyeglasses (Tai discloses “The image processing method includes: obtaining a target image comprising an object wearing glasses; inputting the target image to a glasses-removing model… and generating a glasses-removed image corresponding to the target image” in Abstract, see also Fig 3.) It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Kim, Vemulapalli and Lukac with the teaching of Tai so as to train a glasses-removing model to generate a target image without glasses (Tai, [0082]).

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Vemulapalli and Lukac, further in view of Tai and Zhang.
As to Claim 14, Kim in view of Vemulapalli and Lukac teaches The computer-implemented method of claim 18, wherein the at least one facial object characteristic comprises at least one eyeglasses characteristic comprising at least one of a frame color, a texture, or a reflection of at least one eyeglasses lens region, wherein the at least one inference facial object in place on the image comprises a pair of virtual eyeglasses and virtual lenses, and wherein the virtual lenses have realistic view-dependent reflective effects (Zhang discloses “In step S31, a glasses type is randomly selected from existing glasses types. That is, one type of glasses data, such as near-sighted glasses, certain frame and lens shapes, a certain frame thickness, a certain lens color and the like, is selected; corresponding keypoint information can represent a shape and a structure of the glasses, for merging; a corresponding glass image reflects frame and lens shapes, a frame thickness, a lens color and the like;” in [0042].) It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Kim, Vemulapalli, Lukac and Tai with the teaching of Zhang so as to obtain attributes associated with an object.

Conclusion

THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to WEIMING HE whose telephone number is (571) 270-1221. The examiner can normally be reached Monday-Friday, 8:30am-5:00pm. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tammy Goddard, can be reached on 571-272-7773. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Weiming He/ Primary Examiner, Art Unit 2611
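
The allowability dispute now turns on amended claim 1's multi-encoder, shared-decoder training arrangement, which the examiner reads onto Kim's encoder/decoder pipeline combined with Lukac's per-segment latent codes. To make the claim language concrete, below is a minimal PyTorch-style sketch of the recited structure: five encoders (head pose, eye pose, face, glasses frame, lens) whose embeddings are combined and decoded by a single shared eye decoder. All module names, input dimensions, and the use of concatenation as the "combination of embeddings" are hypothetical illustration, not the applicant's disclosed model.

```python
# Hypothetical sketch of the structure recited in amended claim 1:
# five encoders feeding one shared eye decoder. Dimensions and the
# concatenation scheme are illustrative assumptions only.
import torch
import torch.nn as nn


def encoder(in_dim: int, emb_dim: int = 64) -> nn.Module:
    """A toy MLP standing in for each claim-recited encoder."""
    return nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, emb_dim))


class SharedEyeDecoderModel(nn.Module):
    def __init__(self, emb_dim: int = 64, out_pixels: int = 32 * 32 * 3):
        super().__init__()
        # One encoder per claim limitation (input sizes are made up).
        self.head_pose = encoder(3, emb_dim)       # e.g., yaw/pitch/roll
        self.eye_pose = encoder(4, emb_dim)        # e.g., gaze angles per eye
        self.face = encoder(512, emb_dim)          # e.g., face image features
        self.glasses_frame = encoder(16, emb_dim)  # user-selected frame attributes
        self.lens = encoder(8, emb_dim)            # user-selected lens attributes
        # The single decoder shared by all five encoders.
        self.shared_eye_decoder = nn.Sequential(
            nn.Linear(5 * emb_dim, 256), nn.ReLU(), nn.Linear(256, out_pixels)
        )

    def forward(self, head, eyes, face, frame, lens):
        # The claim's "combination of embeddings", rendered here as concatenation.
        z = torch.cat([self.head_pose(head), self.eye_pose(eyes),
                       self.face(face), self.glasses_frame(frame),
                       self.lens(lens)], dim=-1)
        return self.shared_eye_decoder(z)  # decoded eye-region pixels


model = SharedEyeDecoderModel()
out = model(torch.randn(1, 3), torch.randn(1, 4), torch.randn(1, 512),
            torch.randn(1, 16), torch.randn(1, 8))
print(out.shape)  # torch.Size([1, 3072])
```

The sketch mirrors only the claim's topology; the claim further ties the glasses-frame and lens embeddings to a user's product selection and requires the modified image to be rendered in real time on the device the user is viewing.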

Prosecution Timeline

Mar 10, 2024 — Application Filed
Nov 03, 2025 — Non-Final Rejection — §103, §DP
Feb 26, 2026 — Response Filed
Mar 20, 2026 — Final Rejection — §103, §DP (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12567135 — MULTIMEDIA PLAYBACK MONITORING SYSTEM AND METHOD, AND ELECTRONIC APPARATUS
Granted Mar 03, 2026 (2y 5m to grant)

Patent 12561876 — System and method for an audio-visual avatar creation
Granted Feb 24, 2026 (2y 5m to grant)

Patent 12514672 — System, Method And Software Program For Aiding In Positioning Of Objects In A Surgical Environment
Granted Jan 06, 2026 (2y 5m to grant)

Patent 12494003 — AUTOMATIC LAYER FLATTENING WITH REAL-TIME VISUAL DEPICTION
Granted Dec 09, 2025 (2y 5m to grant)

Patent 12468949 — SYSTEMS AND METHODS FOR FEW-SHOT TRANSFER LEARNING
Granted Nov 11, 2025 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 46%
With Interview: 60% (+13.8%)
Median Time to Grant: 3y 4m
PTA Risk: Moderate

Based on 410 resolved cases by this examiner. Grant probability derived from career allow rate.
