Prosecution Insights
Last updated: April 19, 2026
Application No. 18/738,765

SECURE AUTHENTICATION OF DIGITAL HUMANS

Final Rejection §103
Filed: Jun 10, 2024
Examiner: ALGIBHAH, HAMZA N
Art Unit: 2441
Tech Center: 2400 (Computer Networks)
Assignee: Charter Communications Operating LLC
OA Round: 2 (Final)

Grant Probability: 79% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 2y 11m
With Interview: 82%

Examiner Intelligence

Career Allow Rate: 79% (above average; 566 granted / 713 resolved; +21.4% vs TC avg)
Interview Lift: +3.1% (minimal; measured across resolved cases with interview)
Avg Prosecution: 2y 11m (typical timeline); 31 applications currently pending
Total Applications: 744 across all art units (career history)

Statute-Specific Performance

§101: 12.1% (-27.9% vs TC avg)
§103: 50.2% (+10.2% vs TC avg)
§102: 20.0% (-20.0% vs TC avg)
§112: 10.4% (-29.6% vs TC avg)

Tech Center averages are estimates. Based on career data from 713 resolved cases.
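Notably, all four published deltas are consistent with a single estimated Tech Center baseline of 40.0%. A quick sketch showing how the card's numbers reconcile; the flat 40.0% baseline is inferred from the deltas themselves, not a published USPTO figure:

```python
# Hedged sketch of how the "vs TC avg" deltas on this card appear to be
# derived: statute-specific rate minus a Tech Center baseline estimate.
rates = {"§101": 12.1, "§103": 50.2, "§102": 20.0, "§112": 10.4}
TC_AVG_ESTIMATE = 40.0  # assumption inferred from the deltas, not a USPTO figure

for statute, rate in rates.items():
    delta = rate - TC_AVG_ESTIMATE
    print(f"{statute}: {rate:.1f}% ({delta:+.1f}% vs TC avg)")
# Reproduces the card: -27.9%, +10.2%, -20.0%, -29.6%
```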

Office Action

§103
Claims 1-20 are pending. Claims 1-20 are rejected.

Claim Rejections - 35 U.S.C. § 103

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

"A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made."

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Hicks et al. (Pub. No. US 2018/0232591 A1) in view of Astarabadi et al. (Pub. No. US 2021/0314526 A1).

As per claim 1, Hicks discloses a method comprising:

- receiving, by a computing device from a first participant computing device (Hicks, Fig. 3, Fig. 4 steps 402-404, paragraphs 0058-0059: "At 402, a video stream and an audio stream of a user speaking is received. For example, recognition module 124 receives a video stream 304 and an audio stream 306 of a user 301 speaking one or more words or phrases 302. At 404, the video stream is processed to identify a face of the user. For example, the recognition module 124 processes the video stream 304 to identify a face 303 of the user 301");
- accessing, by the computing device (Hicks, Fig. 4 step 408, paragraph 0061: "At 408, the dynamic face and voice signature of the user is compared to a stored dynamic face and voice signature of an authorized user. For example, the recognition module 124 compares the dynamic face and voice signature 314 of the user 301 to a stored dynamic face and voice signature 212 of an authorized user that is generated during the enrollment phase and stored in authentication database 218"; the stored dynamic face and/or voice signature can be the predetermined validation data), wherein the predetermined validation data comprises motion capture data generated by a motion capture sensor while the known individual is speaking (Hicks, paragraph 0033: "The signature generation component 208 generates a dynamic face and voice signature 212, which uniquely identifies the user by correlating facial motion of the face 203 of the user 201 in the processed video stream 204 to the audio stream 206 of the user speaking. For example, the changes in volume, pitch, and cadence of the user's voice in the audio stream 206 is correlated to the facial motions that the user makes when speaking in order to generate the dynamic face and voice signature 212");
- performing an analysis, by the computing device, of a segment of the video stream based on the predetermined validation data (Hicks, Fig. 4 steps 408-410, paragraphs 0061-0062: "At 408, the dynamic face and voice signature of the user is compared to a stored dynamic face and voice signature of an authorized user. At 410, the user is verified as the authorized user if the dynamic face and voice signature of the user matches the stored dynamic face and voice signature of the authorized user. For example, the recognition module 124 verifies the user as the authorized user if the dynamic face and voice signature 314 of the user matches the stored dynamic face and voice signature 212"); and
- providing, by the computing device based on the analysis, an output signal indicative of (Hicks, Fig. 4 steps 408-410, paragraphs 0061-0062, quoted above).

Hicks does not explicitly disclose that:

- the first participant computing device is participating in a video call with a second participant computing device;
- an individual identifier is also received that identifies a known individual who the individual participating in the video call via the second computing device purports to be;
- the accessing of the predetermined validation data is based on the individual identifier; and
- a confidence level.

However, Astarabadi discloses:

- the first participant computing device is participating in a video call with a second participant computing device (Astarabadi, Figs. 1 and 6, paragraph 0016: "As shown in FIG. 6, a similar variation of the method S100 includes, during a video call between a first device and a second device following the setup period: accessing a first frame captured by the first device in Block S110; deriving characteristics of a first set of facial features detected in the first frame in Block S120");
- receiving also an individual identifier that identifies a known individual who the individual participating in the video call via the second computing device purports to be (Astarabadi, Fig. 1A, wherein the identifier identifying the caller (MATT) or the callee (Mom) can be the individual identifier);
- the accessing of the predetermined validation data is based on the individual identifier (Astarabadi, Fig. 1A, paragraph 0012: "detecting a first face in a first region of the target image in Block S104; extracting a constellation of facial biometric values from the first region of the target image in Block S106; and, in response to the constellation of facial biometric values aligning with a first faceprint of a first user account, enabling the first user at the first device to select a first look model, from a first set of look models, associated with the first user account in Block S108. The method S100 also includes, during the video call: accessing a first video feed captured at the first device in Block S110; for a first frame, in the first video feed, captured at a first time, detecting a first constellation of facial landmarks in the first frame in Block S120; representing the first constellation of facial landmarks in a first facial landmark container in Block S122; extracting a first constellation of facial biometric values from the first frame in Block S124; and, in response to the constellation of facial biometric values aligning with the first faceprint of the first user account, transmitting the first facial landmark container and confirmation of the first user to the second device in Block S130"); and
- a confidence level (Astarabadi, paragraphs 0050-0051: "In another example, the device implements a structural similarity index (or 'SSIM') to quantify a baseline difference between the baseline synthetic face image and the authentic face image. In yet another example, the device: implements a facial recognition system to calculate a confidence that the face depicted in the synthetic face image is identical to the face depicted in the authentic face image; and characterizes a baseline difference between the synthetic face image and the authentic face image based on (e.g., inversely proportional to) this confidence").

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to incorporate Astarabadi's teachings into Hicks to achieve the claimed limitations, because this would have provided a way to authenticate and verify user identities during a video conference call and would have improved the flexibility of the system by allowing a controlled threshold for matching facial features, which improves the accuracy and usability of the system.
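The confidence-level limitation is mapped to Astarabadi's SSIM passage. Here is a minimal sketch of that idea: quantify the difference between a synthetic face image and an authentic face image with SSIM, and treat match confidence as inversely related to that difference. The function name, grayscale-crop assumption, and the trivial confidence mapping are illustrative, not taken from either reference.

```python
# Hedged sketch of the Astarabadi-style confidence level (paras 0050-0051):
# SSIM quantifies the baseline difference between a synthetic and an
# authentic face crop; confidence is inversely related to that difference.
import numpy as np
from skimage.metrics import structural_similarity

def match_confidence(synthetic_face: np.ndarray, authentic_face: np.ndarray) -> float:
    """Confidence in [0, 1] that two aligned grayscale face crops match."""
    ssim = structural_similarity(synthetic_face, authentic_face, data_range=1.0)
    baseline_difference = 1.0 - ssim   # small difference -> high confidence
    return 1.0 - baseline_difference   # here, confidence reduces to SSIM itself

# Toy usage with random 64x64 float crops standing in for aligned faces.
rng = np.random.default_rng(0)
crop = rng.random((64, 64))
assert match_confidence(crop, crop) > 0.99  # identical crops -> near-certain match
```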
As per claim 2, claim 1 is incorporated, and Hicks further discloses wherein the motion capture data quantifies facial expressions of the known individual while speaking (Hicks, paragraph 0033, quoted above).

As per claim 3, claim 2 is incorporated, and Hicks further discloses wherein performing the analysis, by the computing device, of the segment of the video stream based on the predetermined validation data comprises: generating, by the computing device, generated motion capture data that quantifies facial expressions of the individual depicted in the segment of the video stream; and comparing, by the computing device, the generated motion capture data to the motion capture data quantifying facial expressions of the known individual while speaking (Hicks, Fig. 4 steps 408-410, paragraphs 0061-0062, quoted above).
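The Hicks passages cited against claims 1-3 turn on a "dynamic face and voice signature" that correlates facial motion with the volume, pitch, and cadence of the voice, then verifies against an enrolled signature. A minimal sketch of that correlate-then-compare flow, assuming mouth-opening and per-frame audio-RMS time series as the features and a cosine threshold for verification (all illustrative choices, not from Hicks):

```python
# Hedged sketch of a Hicks-style dynamic face and voice signature
# (paras 0033, 0061-0062): correlate facial motion while speaking with
# the audio envelope, then compare to a stored enrollment signature.
import numpy as np

def dynamic_signature(mouth_opening: np.ndarray, audio_rms: np.ndarray) -> np.ndarray:
    """Signature = normalized cross-correlation of per-frame mouth opening
    against per-frame audio energy, evaluated at a few small lags."""
    m = (mouth_opening - mouth_opening.mean()) / (mouth_opening.std() + 1e-9)
    a = (audio_rms - audio_rms.mean()) / (audio_rms.std() + 1e-9)
    lags = range(-3, 4)  # tolerate a few frames of audio/video offset
    return np.array([np.mean(m * np.roll(a, k)) for k in lags])

def verify(live_sig: np.ndarray, stored_sig: np.ndarray, threshold: float = 0.8) -> bool:
    """Accept the user if the live signature matches the enrolled one."""
    cos = live_sig @ stored_sig / (
        np.linalg.norm(live_sig) * np.linalg.norm(stored_sig) + 1e-9)
    return cos >= threshold
```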
As per claim 4, claim 1 is incorporated, and Astarabadi further discloses wherein the predetermined validation data comprises imagery depicting facial muscles and/or facial wrinkles of the known individual (Astarabadi, paragraph 0071: "Furthermore, insertion of this face model and a different facial landmark container—such as extracted from a video frame captured by the device during a later video call—into the synthetic face generator produces a realistic approximation of: the face shape, skin tone, facial hair, makeup, freckles, wrinkles, eye color, hair color, hair style, and/or jewelry, etc. depicted in the set of frames; and the facial expression depicted in the video frame").

As per claim 5, claim 4 is incorporated, and Astarabadi further discloses wherein performing the analysis, by the computing device, of the segment of the video stream based on the predetermined validation data comprises: inputting, by the computing device, the segment of the video stream into a machine learned model that was trained with the imagery depicting the facial muscles and/or the facial wrinkles of the known individual (Astarabadi, paragraphs 0041-0042: "[I]n particular, the remote computer system can train the conditional generative adversarial network to output a synthetic face image based on a set of input conditions, including: a facial landmark container, which captures relative locations (and/or sizes, orientations) of facial landmarks that represent a facial expression; and a face model, which contains a (pseudo-) unique set of coefficients characterizing a unique human face and secondary physiognomic features (e.g., face shape, skin tone, facial hair, makeup, freckles, wrinkles, eye color, hair color, hair style, and/or jewelry). Therefore, the remote computer system can input values from a facial landmark container and coefficients from a face model into the conditional generative adversarial network to generate a synthetic face image that depicts a face—(uniquely) represented by coefficients in the face model—exhibiting a facial expression represented by the facial landmark container").

As per claim 6, claim 1 is incorporated, and Astarabadi further discloses wherein the predetermined validation data comprises imagery depicting hair of the known individual (Astarabadi, paragraph 0071, quoted above).

As per claim 7, claim 6 is incorporated, and Astarabadi further discloses wherein performing the analysis, by the computing device, of the segment of the video stream based on the predetermined validation data comprises: inputting, by the computing device, the segment of the video stream into a machine learned model that was trained with the imagery depicting hair of the known individual (Astarabadi, paragraphs 0041-0042, quoted above).

As per claim 8, claim 1 is incorporated, and Astarabadi further discloses wherein the predetermined validation data comprises imagery depicting skin of the known individual (Astarabadi, paragraph 0071, quoted above).

As per claim 9, claim 8 is incorporated, and Astarabadi further discloses wherein performing the analysis, by the computing device, of the segment of the video stream based on the predetermined validation data comprises: inputting, by the computing device, the segment of the video stream into a machine learned model that was trained with the imagery depicting the skin of the known individual (Astarabadi, paragraphs 0041-0042, quoted above).
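Claims 4-9 all map to Astarabadi's conditional generative adversarial network, whose generator is conditioned on a facial landmark container plus face-model coefficients and emits a synthetic face image. A sketch of that conditioning structure with an untrained toy generator; the 68-landmark layout, coefficient size, and layer shapes are assumptions, not Astarabadi's architecture:

```python
# Hedged sketch of a conditional generator in the style the examiner cites
# from Astarabadi (paras 0041-0042): conditions = landmark container +
# face-model coefficients; output = a synthetic face image. Untrained.
import torch
import torch.nn as nn

N_LANDMARKS, N_COEFFS = 68, 128  # assumed container / face-model sizes

class ConditionalFaceGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(N_LANDMARKS * 2 + N_COEFFS, 256 * 4 * 4)
        self.deconv = nn.Sequential(  # upsample 4x4 feature map to 32x32 RGB
            nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, landmarks, face_coeffs):
        # Concatenate both conditions, as in a conditional GAN generator.
        cond = torch.cat([landmarks.flatten(1), face_coeffs], dim=1)
        x = self.fc(cond).view(-1, 256, 4, 4)
        return self.deconv(x)  # (batch, 3, 32, 32) synthetic face

gen = ConditionalFaceGenerator()
img = gen(torch.randn(1, N_LANDMARKS, 2), torch.randn(1, N_COEFFS))
assert img.shape == (1, 3, 32, 32)
```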
As per claim 10, claim 1 is incorporated, and Hicks further discloses wherein the predetermined validation data comprises audio data generated from voice signals of the known individual (Hicks, Fig. 3, Fig. 4 steps 402-404, paragraphs 0058-0059, quoted above).

As per claim 11, claim 10 is incorporated, and Hicks further discloses wherein performing the analysis, by the computing device, of the segment of the video stream based on the predetermined validation data comprises: comparing, by the computing device, audio data contained in the segment of the video stream to the audio data generated from the voice signals of the known individual (Hicks, Fig. 4 steps 408-410, paragraphs 0061-0062, quoted above).

As per claim 12, claim 1 is incorporated, and Hicks further discloses wherein the video stream is a live video stream being streamed from a first computing device to a second computing device, and wherein the computing device receives the video stream from the second computing device, and wherein the computing device provides the output signal to the second computing device (Hicks, paragraphs 0042-0044: "The user then speaks one or more words or phrases 302, which in this example is 'unlock my computer'. The recognition module 124 triggers video capture device 114 to capture a video stream 304 of the user 301 as the user speaks the one or more words or phrases 302, and also triggers audio capture device 116 to capture an audio stream 306 of the user speaking the one or more words or phrases 302. Notably, the video stream 304 includes a face 303 of the user 301. The video stream 304 and the audio stream 306 are then passed to the recognition module 124"; Hicks, Fig. 5, paragraph 0030: "Although illustrated as part of computing device 102, functionality of authentication system 120 may also be implemented in a distributed environment, remotely via a network 126 (e.g., 'over the cloud') as further described in relation to FIG. 5, and so on. Although network 126 is illustrated as the Internet, the network may assume a wide variety of configurations. For example, network 126 may include a wide area network (WAN), a local area network (LAN), a wireless network, a public telephone network, an intranet, and so on").

As per claim 13, claim 1 is incorporated, and Hicks further discloses wherein the video stream is a live video stream being streamed from a first computing device to a second computing device, and wherein the computing device comprises the second computing device (Hicks, paragraphs 0042-0044 and Fig. 5, paragraph 0030, quoted above).

As per claim 14, claim 1 is incorporated, and Hicks further discloses wherein the predetermined validation data derived from the known individual comprises at least two of: motion capture data that includes motion capture data quantifying facial expressions of the known individual while speaking; imagery depicting facial muscles and/or facial wrinkles of the known individual; imagery depicting hair of the known individual; imagery depicting skin of the known individual; and audio data generated from voice signals of the known individual (Hicks, paragraph 0033, quoted above). In addition, Astarabadi further discloses wherein the predetermined validation data derived from the known individual comprises at least two of: imagery depicting facial muscles and/or facial wrinkles of the known individual; imagery depicting hair of the known individual; and imagery depicting skin of the known individual (Astarabadi, paragraph 0071, quoted above).
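Claims 10-11 concern comparing audio data contained in the video segment against audio derived from the known individual's voice. A minimal sketch under the assumption that an averaged log-magnitude spectrum plus cosine similarity stands in for a real speaker-verification embedding; the frame size, hop, and threshold are all illustrative:

```python
# Hedged sketch of the claim 10-11 audio comparison: reduce both the
# segment audio and the enrolled voice audio to an average log-magnitude
# spectrum and compare them with cosine similarity.
import numpy as np

def avg_log_spectrum(signal: np.ndarray, frame: int = 512, hop: int = 256) -> np.ndarray:
    """Average log-magnitude FFT spectrum over overlapping windowed frames."""
    window = np.hanning(frame)
    frames = [signal[i:i + frame] * window
              for i in range(0, len(signal) - frame, hop)]
    mags = np.abs(np.fft.rfft(np.stack(frames), axis=1))
    return np.log1p(mags).mean(axis=0)

def voices_match(segment_audio: np.ndarray, enrolled_audio: np.ndarray,
                 threshold: float = 0.95) -> bool:
    a, b = avg_log_spectrum(segment_audio), avg_log_spectrum(enrolled_audio)
    cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return cos >= threshold

# Toy usage: one second of 16 kHz noise standing in for enrollment audio.
rng = np.random.default_rng(1)
enrolled = rng.standard_normal(16000)
assert voices_match(enrolled, enrolled)  # identical audio trivially matches
```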
As per claim 15, claim 14 is incorporated, and Hicks further discloses wherein performing the analysis, by the computing device, of the segment of the video stream based on the predetermined validation data comprises at least two of: a) generating, by the computing device, generated motion capture data that quantifies facial expressions of the individual depicted in the segment of the video stream, and comparing, by the computing device, the generated motion capture data to the motion capture data quantifying facial expressions of the known individual while speaking; or e) comparing, by the computing device, audio data contained in the segment of the video stream to the audio data generated from the voice signals of the known individual (Hicks, Fig. 4 steps 408-410, paragraphs 0061-0062, quoted above). In addition, Astarabadi further discloses wherein performing the analysis, by the computing device, of the segment of the video stream based on the predetermined validation data comprises at least two of: b) inputting, by the computing device, the segment of the video stream into a first machine learned model that was trained with the imagery depicting the facial muscles and/or the facial wrinkles of the known individual; c) inputting, by the computing device, the segment of the video stream into a second machine learned model that was trained with the imagery depicting the hair of the known individual; or d) inputting, by the computing device, the segment of the video stream into a third machine learned model that was trained with the imagery depicting the skin of the known individual (Astarabadi, paragraphs 0041-0042, quoted above).

Claims 16-20 are rejected under the same rationale as claims 1-15.

Response to Arguments

Applicant's arguments filed on 01/22/2026 have been fully considered but are now moot in light of the new grounds of rejection.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.
Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to HAMZA N ALGIBHAH, whose telephone number is (571) 270-7212. The examiner can normally be reached 7:30 am - 3:30 pm.

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Wing Chan, can be reached at (571) 272-7493. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/HAMZA N ALGIBHAH/
Primary Examiner, Art Unit 2441

Prosecution Timeline

Jun 10, 2024: Application Filed
Oct 18, 2025: Non-Final Rejection (§103)
Jan 22, 2026: Response Filed
Feb 19, 2026: Final Rejection (§103, current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602224: NON-TERMINATING FIRMWARE UPDATE (2y 5m to grant; granted Apr 14, 2026)
Patent 12598111: ENABLING INTENT-BASED NETWORK MANAGEMENT WITH GENERATIVE AI AND DIGITAL TWINS (2y 5m to grant; granted Apr 07, 2026)
Patent 12598656: METHOD FOR EDGE COMPUTING (2y 5m to grant; granted Apr 07, 2026)
Patent 12598096: METHOD AND APPARATUS FOR ACCESSING VIRTUAL MACHINE, DEVICE AND STORAGE MEDIUM (2y 5m to grant; granted Apr 07, 2026)
Patent 12528442: SYSTEM, METHOD, AND APPARATUS FOR MANAGING VEHICLE DATA COLLECTION (2y 5m to grant; granted Jan 20, 2026)

Study what changed in these cases to get past this examiner. Based on the 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 79%
With Interview: 82% (+3.1%)
Median Time to Grant: 2y 11m
PTA Risk: Moderate

Based on 713 resolved cases by this examiner. Grant probability is derived from the career allow rate.
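A quick sketch of how the projection figures reconcile with the examiner's career counts (566 granted of 713 resolved, plus the +3.1% interview lift); the rounding behavior is an assumption about the dashboard, not documented:

```python
# Hedged sketch: grant probability = career allow rate; the with-interview
# figure simply adds the measured +3.1% lift.
granted, resolved = 566, 713
allow_rate = granted / resolved               # 0.7939... -> shown as 79%
interview_lift = 0.031
with_interview = allow_rate + interview_lift  # 0.8249... -> shown as 82%
print(f"{allow_rate:.0%} base, {with_interview:.0%} with interview")
```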
