DETAILED ACTION
Preliminary Remarks
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
The preliminary amendment of 08/19/24 is noted.
Priority
This application is a 371 of PCT/KR2022/095118 filed 08/23/22, which claims the benefit of KR 10-2022-0102317 filed 08/16/22.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 9-16 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claims do not fall within at least one of the four categories of patent eligible subject matter because the language of the claims, when taken as a whole, raises questions as to whether the claims fall within any of the statutory categories of invention. Claim 9 refers to “generators” and a “unit,” which the Examiner deems may be interpreted as implemented solely in software due to the explicit definition of these terms in the specification (see paragraph 37 of Applicant’s specification, which explicitly states that each component of the invention may be implemented in software). Although claim 9’s preamble recites an “apparatus,” this element occurs solely within the preamble and, in combination with such a definition from Applicant’s specification, allows for the interpretation that the “generators” and the “unit” can be implemented solely in software. Therefore, such claimed elements are software per se, which fails to fall within a statutory category of invention and necessitates the rejection of claims 9-16.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-16 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
In reference to claims 1 and 9, these claims are seen as indefinite because their preambles recite “A method (apparatus) for providing a speech video…” (see line 1 of each preamble), yet the elements recited in the body of the claims cannot, explicitly or even implicitly, be construed as “providing a speech video.” In other words, the preambles set out that the claims “provide a speech video,” however no speech video is ever “provided.” The Examiner notes that the final limitation of each of claims 1 and 9 recites “generating a synthesized speech video…” (or the like for claim 9), however one of ordinary skill in the art cannot equate “generating” with “providing.” The Examiner questions whether this issue may be due to translation of the patent application from a foreign language into English. Regardless, the claims are indefinite since they fail to particularly point out and distinctly claim that which Applicant regards as the invention. Note, claims 2-8 and 10-16 depend upon claims 1 and 9 respectively and are therefore at least inherently included in this rejection.
In reference to claims 1, 5, 6 and 9, these claims recite the limitation “the video” (in lines 7-8 of claim 1, for example). There is insufficient antecedent basis for this limitation in the claims since a plurality of “videos” is discussed in the claims, and thus referring back solely to “the video” does not identify which video is meant. Note, claims 5 and 6 explicitly comprise such language in question while claims 2-4, 7-8 and 10-16 depend upon claims 1 and 9 respectively and are therefore at least inherently included in this rejection.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-16 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kim (KR 10-2022-00557754) (see machine translation via Espacenet).
In reference to claim 1, Kim discloses a method for providing a speech video performed by a computing device (see paragraphs 7-9, 12 wherein Kim discloses techniques for providing real-time artificial intelligence-based speech images. Kim discloses the invention implemented via a computing device comprising one or more processors that implement a standby state image generation module, a speech state image generation module, and an image reproduction module. Kim further discloses the invention generating one or more motion images for intended playback.), the method comprising:
sequentially playing back first sections of a plurality of standby state videos, wherein each standby state video includes the first section in which a person in the video is in a standby state and a second section for image interpolation between a last frame of the first section and a reference frame (see paragraphs 38-41 wherein Kim discloses a standby state image generation module that can generate an image having a preset playback time (e.g. 5 seconds to 30 seconds, and therefore a “standby video”) in which a person in the image is in a standby state or a “waiting to speak” or “listening” state showing the natural movements of the person. Kim discloses the standby video having multiple frames and being arranged for playback in an order from a first frame to a last frame and then back to the first frame. Kim further discloses the standby state image generation module generating a backmotion image for image interpolation between an arbitrary frame of the standby image and a preset reference frame of the standby image. Note, it is clear that the “first section” and “second section” claim limitations correspond to Kim’s “standby video” comprising images of the “standby state” and the “backmotion images” for interpolation, respectively.);
generating a plurality of speech state images in which the person in the video is in a speech state and a speech voice based on a source of speech contents (see paragraph 44 wherein Kim discloses that the speech state video generation module can generate speaking state video images. Kim further explicitly discloses utilizing source input speech content in the form of text or audio formats to produce the speaking state video.);
playing back, when the generating of the plurality of speech state images and the speech voice is completed, the second section of the standby state video being played back at the time of completion (see paragraphs 48 and 52 wherein Kim discloses a video playback module that can play the video generated by the standby video module. Kim explicitly discloses that the video generation state is in the completed status when playback is playing the standby video.); and
generating a synthesized speech video by synthesizing the plurality of speech state images and the speech voice with at least some of the plurality of standby state videos (see paragraphs 51-53 wherein Kim discloses synthesizing a standby state image with an initial speech state image part (e.g. the face part of the person, referred to in the foreign translation as an “ignition state image.”) using the synthesized voice part of the speech state image to generate synthetic speech images/video.).
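For illustration only, the ordering of the steps mapped above for claim 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not Kim's or Applicant's implementation: the names `StandbyVideo`, `playback_order` and `ready_at_tick` are hypothetical, frames are represented by string labels, each first section is assumed to be non-empty, and it is assumed that the first section being played at the time of completion finishes before its second section is played.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StandbyVideo:
    """Hypothetical container for one standby state video.

    first_section  -- frames in which the person is in a standby state
    second_section -- frames interpolating from the last frame of the
                      first section toward the reference frame
    """
    first_section: List[str]
    second_section: List[str]


def playback_order(videos: List[StandbyVideo],
                   speech_frames: List[str],
                   ready_at_tick: int) -> List[str]:
    """Return the sequence of frame labels shown to the viewer.

    ready_at_tick is a hypothetical frame count at which generation of
    the speech state images and speech voice is assumed to complete.
    """
    shown: List[str] = []
    index = 0
    while True:
        current = videos[index % len(videos)]
        # Play the first section of the current standby state video.
        shown.extend(current.first_section)
        if len(shown) >= ready_at_tick:
            # Generation completed during this video: play its second
            # section, then continue with the synthesized speech frames.
            shown.extend(current.second_section)
            shown.extend(speech_frames)
            return shown
        index += 1


videos = [
    StandbyVideo(["a1", "a2"], ["a_back"]),
    StandbyVideo(["b1", "b2"], ["b_back"]),
]
order = playback_order(videos, ["s1", "s2"], ready_at_tick=3)
```

In this sketch, generation completes while the second standby video is playing, so only that video's second section is played before the speech frames.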
In reference to claims 2 and 10, Kim discloses all of the claim limitations as applied to claims 1 and 9 respectively. Kim also discloses one embodiment wherein the generation of the ignition state image is completed while the image playback module is playing the standby state video, by returning the process to a first frame of the standby state image (see at least paragraph 52).
In reference to claims 3 and 11, Kim discloses all of the claim limitations as applied to claims 1 and 9 respectively. Kim explicitly discloses playback of the standby state video to include playing repeatedly from the first frame to the last frame and then back to the first frame from the last frame (see at least paragraph 40).
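As an illustration of the repeated playback described above, the following minimal Python sketch maps a running playback tick to a frame index. It assumes that “back to the first frame from the last frame” means wrapping directly from the last frame to the first; the function name `looped_frame_index` and its parameters are hypothetical and do not appear in Kim or in the application.

```python
def looped_frame_index(tick: int, frame_count: int) -> int:
    """Return the frame to show at a given tick when playback wraps
    from the last frame back to the first (assumed interpretation)."""
    if frame_count <= 0:
        raise ValueError("frame_count must be positive")
    return tick % frame_count


# Seven ticks over a three-frame standby video.
indices = [looped_frame_index(t, 3) for t in range(7)]
```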
In reference to claims 4 and 12, Kim discloses all of the claim limitations as applied to claims 1 and 9 respectively. Kim explicitly discloses that the standby state image generation module generates a backmotion image so that the first frame and an arbitrary frame are naturally connected (see paragraph 42).
In reference to claims 5, 6, 13 and 14, Kim discloses all of the claim limitations as applied to claims 1 and 9 respectively. Kim explicitly discloses the invention generating a video part for the face part of a person in the standby state video (see paragraph 47). Kim further explicitly discloses replacing the face part of the standby state image with the image part of the speech state image (see paragraph 51).
In reference to claims 7, 8, 15 and 16, Kim discloses all of the claim limitations as applied to claims 1 and 9 respectively. Kim explicitly discloses that the reference frame can be set to the first frame in the standby state video (see at least paragraph 54). Kim discloses synthesizing the first frame of the standby state image and the ignition state image (an initial speech state image part, e.g. the face part of the person, referred to in the foreign translation as an “ignition state image”) (see paragraphs 51-54).
In reference to claim 9, claim 9 is similar in scope to claim 1 above and is therefore rejected under like rationale. In addition to the rationale as applied in the rejection of claim 1 above, claim 9 further recites, “An apparatus for providing a speech video…” and “speech state generator,” “speech voice generator,” “playback unit” and “synthesized speech video generator.” Taking into consideration the 35 USC 101 rejection of claim 9 above and further that which has been disclosed in the rejection of claim 1 above by prior art Kim, the Examiner deems the disclosure of Kim to teach the “generators” and “unit” elements of the claim.
References Cited
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Strietzel et al. (U.S. Publication 2009/0132371)
Strietzel et al. discloses systems and methods for creating dynamic interactive advertisements using individualized 3D human models.
Waibel (U.S. Publication 2025/0315631)
Waibel discloses a neural end-to-end system for the face and voice preserving translation of videos.
Kim (U.S. Patent 11,967,336) (related application)
Kim discloses a video reproduction technique that generates synthesized speech video using a preset reference frame of a standby state video being reproduced and a speech state video.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Antonio Caschera whose telephone number is (571) 272-7781. The examiner can normally be reached Monday-Friday between 6:30 AM and 2:30 PM EST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Said Broome, can be reached at (571) 272-2931.
Any response to this action should be mailed to:
Mail Stop ____________
Commissioner for Patents
P.O. Box 1450
Alexandria, VA 22313-1450
or faxed to:
571-273-8300 (Central Fax)
See the listing of “Mail Stops” at http://www.uspto.gov/patents/mail.jsp and include the appropriate designation in the address above.
Any inquiry of a general nature or relating to the status of this application or proceeding should be directed to the Technology Center 2600 Customer Service Office whose telephone number is (571) 272-2600.
/Antonio A Caschera/
Primary Examiner, Art Unit 2612
2/11/26