Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
The response received on 3/2/2026 has been placed in the file and was considered by the examiner. An action on the merits follows.
Response to Amendment
The amendments filed on March 2, 2026 have been fully considered. A response to these amendments is provided below.
Summary of Amendments/Arguments and Examiner's Response:
The applicant has amended the independent claims to incorporate previous subject matter and to claim new subject matter. The applicant provides support for the amendments on pages 8-10 of the remarks. The applicant then argues, on page 11 of the remarks, that Kvochko does not disclose the new limitations, as well as further limitations that are not recited in the claims.
The examiner disagrees. Kvochko teaches most of the limitations, as explained in the rejection below. It is noted that any limitations that the applicant wants read into the claims should be recited directly in the language of the claims. Currently, the applicant does not claim generating golden data, as argued, and further does not provide language requiring that the steps be carried out in the particular order that the applicant argues.
On page 11, the applicant further states that Kvochko does not teach the newly claimed limitations of claim 12.
All arguments are moot in view of new grounds of rejection, below.
On pages 12-13, the applicant argues that Ramaswamy does not disclose the newly claimed features, and that Michaeli et al and Packzkowski et al likewise do not.
As can be seen in the rejection below, the combination of references teaches the claimed limitations. Much of what Ramaswamy was argued not to teach is taught by Kvochko et al, as explained in the rejection below.
Claim Objections
Applicant is advised that should claims 1, 2, or 15 be found allowable, claims 10, 8-9, and 20, respectively, will be objected to under 37 CFR 1.75 as being substantial duplicates thereof. When two claims in an application are duplicates or else are so close in content that they both cover the same thing, despite a slight difference in wording, it is proper after allowing one claim to object to the other as being a substantial duplicate of the allowed claim. See MPEP § 608.01(m).
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 2-9, 11 and 16-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 2, 18 and 19 recite the limitation "the high-resolution video" in the second to last line, line 2, and line 3, respectively. The applicant previously claims two instances of "high-resolution video." It is unclear which high-resolution video the applicant is referring to.
Claims 4, 5, 8, 16, 17 and 20 recite the limitation "the authentic speech characteristics" in lines 1-2, 1-2, 2, 2, 2 and 3, respectively. The applicant previously claims two instances of "authentic speech characteristics." It is unclear which authentic speech characteristics the applicant is referring to.
Claims 5, 8 and 20 recite the limitation "the authentic bodily movements" in lines 1-2, 2-3 and 4, respectively. The applicant previously claims two instances of "authentic bodily movements." It is unclear which authentic bodily movements the applicant is referring to.
Claim 11 recites the limitation "the percentage of accuracy". It is unclear which percentage of accuracy the applicant is referring to, because multiple percentages of accuracy are previously claimed.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 10, 11, 13-15 and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent No. 11482049 (Kvochko et al) in view of U.S. Patent Application Publication No. 20200293783 (Ramaswamy et al).
Regarding claim 1, Kvochko et al discloses a method (fig. 2) comprising the steps of: examining a questionable video (col. 7, line 66 - col. 8, line 40) of a prominent individual, i.e. a public figure (col. 7, lines 2-3), wherein the questionable video shows the prominent individual speaking (col. 7, lines 5-13 describes subjects speaking); detecting speech characteristics of the prominent individual from the questionable video (col. 8, lines 1-6, col. 7, lines 42-44); detecting bodily movements of the prominent individual from the questionable video (col. 8, lines 1-6, col. 7, lines 39-41) while the prominent individual is speaking, since the individual is speaking and hand gesturing, which accompanies speech (col. 3, lines 12-13); comparing the detected speech characteristics and detected bodily movements with reliable baseline characteristics (col. 8, lines 8-35, fig. 2, item 226) that are certified as authentic because they are confirmed to be real (col. 4, lines 27-28); and based on the comparing step, tagging the questionable video as fake or real, by tagging the video as fake for alert (fig. 2, item 124, col. 6, lines 13-14); wherein the reliable baseline characteristics are derived from a video recording of the prominent individual captured in a controlled setting (col. 7, lines 1-3, fig. 2, setting of 202a, 202b), the method further comprising: (i) detecting parallel correlations between the authentic speech characteristics (col. 7, lines 42-44) and the authentic bodily movements (col. 7, lines 39-41) in the video recording, since they are considered together along with other characteristics as a feature (col. 3, lines 12-14), (ii) using the parallel correlations as training data to create a machine-learning model defining the reliable baseline characteristics by using the features in the AI learning (col. 5, lines 60-62), and (iii) certifying the video recording as authentic (col. 3, lines 27-28); and wherein tagging the questionable video as fake or real is further based on determining a percentage of accuracy, i.e. a probability (col. 8, lines 36-40), of the detected speech characteristics and detected bodily movements with respect to the reliable baseline characteristics (col. 8, lines 4-7, col. 7, lines 39-44) and comparing the percentage of accuracy to a predetermined threshold (col. 9, lines 1-3, fig. 4, item 416).
Kvochko et al does not disclose expressly the video recording is high resolution, and creating a digital certificate for the high resolution video recording to certify the high-resolution video recording as authentic.
Ramaswamy et al discloses that video used for recognition corresponds to high-resolution video (page 2, paragraph 11) and that the video is labeled based on what the model is to be trained on, thereby creating a digital certificate (page 12, paragraph 116).
Kvochko et al and Ramaswamy et al are combinable because they are from the same field of endeavor, i.e. video recognition.
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to use a high-resolution video as the reference video and to label the video.
The suggestion/motivation for doing so would have been to provide a more accurate method by providing video with detail and keeping an accurate recording.
Therefore, it would have been obvious to combine the method of Kvochko et al with the high resolution video data of Ramaswamy et al to obtain the invention as specified in claim 1.
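As an illustrative sketch only, the tagging logic mapped above for claim 1 (determining a percentage of accuracy of detected features with respect to baseline characteristics and comparing it to a predetermined threshold) may be outlined as follows; the numeric feature representation, matching tolerance, and threshold value are hypothetical and are not taken from either reference:

```python
def tag_video(detected_features, baseline_features, threshold=0.8):
    """Tag a questionable video as real or fake by comparing detected
    speech/bodily-movement features against certified baseline features."""
    # Percentage of accuracy: the fraction of detected features that fall
    # within a tolerance of the reliable baseline (a simplistic stand-in
    # for the claimed probability determination).
    matches = sum(
        1 for d, b in zip(detected_features, baseline_features) if abs(d - b) < 0.1
    )
    accuracy = matches / len(baseline_features)
    # Compare the percentage of accuracy to the predetermined threshold.
    return "real" if accuracy >= threshold else "fake"
```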
Claim 14 is rejected for the same reasons as claim 1. Thus, arguments analogous to those presented above for claim 1 are equally applicable to claim 14. Claim 14 distinguishes from claim 1 only in that claim 14 is a non-transitory computer-readable medium storing a computer program having instructions that enable a processor to perform the steps of claim 1's method. Kvochko et al further teaches this feature, i.e. the processor and memory of fig. 5, items 502 and 504.
Regarding claim 10, Kvochko et al discloses the step of determining a percentage of accuracy, a probability (col. 8, lines 36-40) of the detected speech characteristics and detected bodily movements with respect to the reliable baseline characteristics (col. 8, lines 4-7, col. 7, lines 39-44).
Regarding claim 11, Kvochko et al discloses that, when the percentage of accuracy is below a predetermined threshold, the questionable video is tagged as fake, i.e. the score falls below the threshold value and the alert is provided (col. 9, lines 1-3, fig. 4, item 416, "no", items 418, 420), and that, when the percentage of accuracy is above the predetermined threshold, the questionable video is tagged as real, i.e. an indication that the video is real is provided (col. 11, lines 39-41, fig. 4, item 416, "yes", items 422, 420).
Regarding claim 13, Kvochko et al discloses the prominent individual is a celebrity, i.e. a public figure (col. 7, lines 2-3), a famous person, i.e. a public figure (col. 7, lines 2-3), a luminary, an actor, an actress, a musician, a politician, and/or an influencer who is important, well-known, or famous.
Regarding claim 15, Kvochko et al discloses the reliable baseline characteristics include authentic speech characteristics (col. 7, lines 42-44) and authentic bodily movements (col. 7 lines 39-41) derived from a video recording of the prominent individual in a controlled setting (col. 7, lines 1-3, fig. 2, setting of 202a, 202b). Ramaswamy et al discloses video used for recognition corresponds to high-resolution video (page 2, paragraph 11).
Regarding claim 17, Kvochko et al discloses the authentic speech characteristics include parameters related to one or more of pronunciation, enunciation, intonation, articulation, volume, pauses, pitch, rate (col. 5, lines 39-42), rhythm, clarity, intensity, timbre, overtones, resonance, breaths (col. 5, lines 22-23), throat clearing, coughs, and regularity of filler sounds and phrases (col. 5, lines 40-42).
Regarding claim 18, Ramaswamy et al discloses using high-resolution video (page 2, paragraph 11) and Kvochko et al discloses the prominent individual performing one or more actions (col. 7, lines 2-13) selected from the group consisting of a) reading a script, b) reading a script multiple times within a range of different emotions, c) answering questions, d) answering questions multiple times within a range of different emotions, e) responding to prompts, f) telling a story, g) telling a joke, h) describing something the prominent individual likes, i) describing something the prominent individual dislikes, j) laughing, k) singing, l) smiling (col. 7, line 40), and m) frowning.
Regarding claim 19, Kvochko et al discloses the step of certifying the video recording to certify the video recording as authentic (col. 4, lines 23-40) and training a model based on the authenticity (col. 5, lines 60-62), and Ramaswamy et al discloses using high-resolution video (page 2, paragraph 11) and that the video is labeled based on what the model is to be trained on, thereby creating a digital certificate (page 12, paragraph 116).
Regarding claim 20, Kvochko et al discloses the step of detecting parallel correlations between the authentic speech characteristics and the authentic bodily movements, since they are considered together along with other characteristics (col. 3, lines 12-14); and using the parallel correlations as training data to create a model defining the reliable baseline characteristics (col. 5, lines 60-62).
Claims 2 and 5-9 are rejected under 35 U.S.C. 103 as being unpatentable over Kvochko et al in view of Ramaswamy et al, as applied to claims 1 and 15 above, and further in view of U.S. Patent No. 12125256 (Demir et al) and U.S. Patent Application Publication No. 20220067570 (Kong et al).
Regarding claim 2, Kvochko et al (as modified by Ramaswamy et al) discloses all of the claimed elements as set forth above and incorporated herein by reference. Kvochko et al further discloses the reliable baseline characteristics include authentic speech characteristics (col. 7, lines 42-44) and authentic bodily movements (col. 7, lines 39-41) derived from a video recording of the prominent individual in a controlled setting (col. 7, lines 1-3, fig. 2, setting of 202a, 202b), and capturing audio and video of the prominent individual from one or more angles, i.e. the angle in which the baseline media of fig. 3, item 104 is captured. Ramaswamy et al discloses that video used for recognition corresponds to high-resolution video (page 2, paragraph 11) and further that video data is obtained in a setting that includes one or more lights (page 9, paragraph 78) and one or more cameras (page 9, paragraph 78).
Kvochko et al (as modified by Ramaswamy et al) does not disclose expressly that the training data (video data) is watermarked and encrypted to safeguard the certified baseline data.
Demir et al discloses the training data is watermarked (col. 18, lines 13-17).
Kvochko et al (as modified by Ramaswamy et al) and Demir et al are combinable because they are from the same field of endeavor, i.e. training deepfake detectors.
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to provide a watermark on the data.
The suggestion/motivation for doing so would have been to provide a more robust system by considering data with watermarks.
Kvochko et al (as modified by Ramaswamy et al and Demir et al) does not disclose expressly that the training data is encrypted to safeguard the certified baseline data.
Kong et al discloses training data is encrypted to safeguard the certified baseline data (page 9, paragraph 70).
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to encrypt the training data.
The suggestion/motivation for doing so would have been to provide a more secure system.
Therefore, it would have been obvious to combine Kvochko et al (as modified by Ramaswamy et al) with the watermarking of Demir et al and encrypting of Kong et al to obtain the invention as specified in claim 2.
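The watermarking applied to the training data above may be illustrated with a minimal least-significant-bit (LSB) sketch over raw bytes; this is a generic illustration of LSB embedding, not the particular scheme of any cited reference, and the function names are hypothetical:

```python
def embed_lsb(carrier: bytearray, bits: list[int]) -> bytearray:
    """Embed watermark bits into the least significant bit of each byte."""
    out = bytearray(carrier)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # clear the LSB, set it to the watermark bit
    return out

def extract_lsb(carrier: bytes, n: int) -> list[int]:
    """Recover the first n embedded watermark bits."""
    return [b & 1 for b in carrier[:n]]
```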
Claims 5-9 are rejected for the same reasons as claims 17-20 and 20, respectively. Thus, arguments analogous to those presented above for claims 17-20 and 20 are equally applicable to claims 5-9. Claims 5-9 distinguish from claims 17-20 and 20 only in that they have different dependencies, which have been previously rejected, and in that claim 20 is a combination of claims 8 and 9. Therefore, the prior art applies.
Claims 3 and 4 are rejected under 35 U.S.C. 103 as being unpatentable over Kvochko et al in view of Ramaswamy et al, Demir et al and Kong et al, as applied to claim 2 above, and further in view of U.S. Patent Application Publication No. 20240127630 (Michaeli et al).
Regarding claim 3, Kvochko et al (as modified by Ramaswamy et al, Demir et al and Kong et al) discloses all of the claimed elements as set forth above and incorporated herein by reference. Kvochko et al further discloses the authentic speech characteristics include voice parameters defining how specific phonemes, words (col. 7, lines 10-12, col. 1, lines 60-61, col. 3, lines 16-24), phrases, and/or sentences are spoken by the prominent individual.
Kvochko et al (as modified by Ramaswamy et al, Demir et al and Kong et al) does not disclose expressly the voice data is high-resolution voice data.
Michaeli et al discloses the voice data is high-resolution/ fine resolution audio data (page 2, paragraph 18).
Kvochko et al (as modified by Ramaswamy et al, Demir et al and Kong et al) and Michaeli et al are combinable because they are from the same field of endeavor, i.e. deepfake detection.
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to have high resolution data be audio-video data.
The suggestion/motivation for doing so would have been to provide a more robust system by having all data be at high resolutions.
Therefore, it would have been obvious to combine the method of Kvochko et al (as modified by Ramaswamy et al, Demir et al and Kong et al) with the high resolution audio-video data of Michaeli et al to obtain the invention as specified in claim 3.
Regarding claim 4, Kvochko et al discloses the authentic bodily movements include movements of one or more of mouth (col. 7, lines 48-50), face (col. 3, lines 25-41), eyes (col. 5, lines 17-20), eyebrows (col. 3, lines 12-14), head, shoulders (col. 3, lines 12-14), arms, hands (col. 3, lines 12-14), fingers, and chest of the prominent individual when the specific phonemes, words, phrases, and/or sentences are spoken (col. 3, lines 12-14).
Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Kvochko et al in view of Ramaswamy et al, as applied to claim 15 above, and further in view of Michaeli et al.
Claim 16 is rejected for the same reasons as claims 3 and 4. Thus, arguments analogous to those presented above for claims 3 and 4 are equally applicable to claim 16. Claim 16 distinguishes from claims 3 and 4 only in that it has different dependencies (claim 16 omits portions of claim 2, which have been previously rejected) and in that claim 16 is a combined version of claims 3 and 4. Therefore, the prior art applies.
Allowable Subject Matter
Claim 12 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Claim 12 contains allowable subject matter in that the method further comprises verifying a signature of the claimed JSON Web Token (in which the hidden watermark is embedded using LSB steganography and includes a hash of the questionable video) using a public key, and comparing the claimed hash in the JSON Web Token to a locally computed hash of the questionable video as an additional basis for tagging the questionable video as fake or real, the questionable video having one or more of a visible and/or hidden watermark applied in the claimed step of tagging the questionable video as fake or real.
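The hash-comparison aspect of this subject matter can be sketched as follows; this is an illustrative sketch only, assuming a SHA-256 hash and a payload field named video_hash (both hypothetical), with full public-key signature verification left to a JWT library such as PyJWT:

```python
import base64
import hashlib
import json

def compute_video_hash(video_bytes: bytes) -> str:
    # Locally computed hash of the questionable video (SHA-256 assumed).
    return hashlib.sha256(video_bytes).hexdigest()

def decode_jwt_payload(token: str) -> dict:
    # A JWT is header.payload.signature, each part base64url-encoded.
    # Verifying the signature against a public key would normally use a
    # JWT library (e.g., PyJWT); only payload decoding is shown here.
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

def verify_and_tag(token: str, video_bytes: bytes) -> str:
    # Compare the claimed hash in the token to a locally computed hash
    # of the questionable video, as an additional basis for tagging.
    claimed = decode_jwt_payload(token).get("video_hash")
    return "real" if claimed == compute_video_hash(video_bytes) else "fake"
```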
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KATHLEEN YUAN DULANEY whose telephone number is (571) 272-2902. The examiner can normally be reached M1: 9am-5pm, Th1: 9am-1pm, F1: 9am-3pm, M2: 9am-5pm, T2: 9am-5pm, Th2: 9am-5pm, F2: 9am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Emily Terrell, can be reached at 571-270-3717. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/KATHLEEN Y DULANEY/Primary Examiner, Art Unit 2666 3/17/2026