DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-10 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Under Step 1, the claims are directed to a method (claims 1-5) and a device (claims 6-10). Independent claims 1 and 6 recite “receiving the text prompt by an analysis device; inputting the text prompt into the text-to-image generative model by the analysis device; calculating, by the analysis device, degrees of correlation between text elements in the text prompt and conditional latent vectors for each text element; and determining, by the analysis device, at least one text element among the text elements for generating the image if the degrees of correlation for the at least on text element is greater than a threshold value, wherein the conditional latent vectors are generated in a process of generating the image by the text-to-image generative model”.
The limitations of “receiving the text prompt…”, “inputting the text prompt…”, “calculating…”, “determining…”, and “generating image…”, as drafted, cover human organization of activities. More specifically, they cover a human receiving text and applying NLP processes and mathematical formulas. All of these activities can be performed by writing on a piece of paper with a pen or by using a generic machine, and the claim language therefore appears to recite merely an abstract idea, namely a mental process that can be performed by a person in their mind. While claims 1 and 6 use a generative model, the model is described at a high level of generality without any particular details regarding how the model “generates.” As such, a human equipped with pen and paper or a generic machine could follow a similar “model” to arrive at the claimed generated and identified data.
This judicial exception is not integrated into a practical application. In particular, claims 1 and 6 recite the additional element of an analysis device. The specification (page 9, lines 2-7) refers to the analysis device as a computer device, PC, or computer terminal with generic components such as a “storage device”, “memory”, “interface device”, “communication device”, and “output device”, which are well-known, generic components used conventionally in most generic computer devices. Also, it is known that a CRM or computer implementation of an abstract idea is not a factor that weighs in favor of patentability under subject matter eligibility. In addition, the “storage device”, “memory”, “interface device”, “communication device”, and “output device”, as noted, are generic elements and amount to no additional limits that may result in subject matter eligibility. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional element of using a computer is, as noted, a generic computer. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Furthermore, “receiving the text prompt by an analysis device; inputting the text prompt into the text-to-image generative model by the analysis device; calculating, by the analysis device, degrees of correlation between text elements in the text prompt and conditional latent vectors for each text element; and determining, by the analysis device, at least one text element among the text elements for generating the image if the degrees of correlation for the at least on text element is greater than a threshold value, wherein the conditional latent vectors are generated in a process of generating the image by the text-to-image generative model” are directed towards insignificant extra-solution activity, such as collecting data and then using the results/data, as supported by the MPEP, “Adding insignificant extra-solution activity to the judicial exception, e.g., mere data gathering in conjunction with a law of nature or abstract idea such as a step of obtaining information about credit card transactions so that the information can be analyzed by an abstract mental process, as discussed in CyberSource v. Retail Decisions, Inc., 654 F.3d 1366, 1375, 99 USPQ2d 1690, 1694 (Fed. Cir. 2011) (see MPEP § 2106.05(g))”.
Furthermore, “receiving the text prompt by an analysis device; inputting the text prompt into the text-to-image generative model by the analysis device; calculating, by the analysis device, degrees of correlation between text elements in the text prompt and conditional latent vectors for each text element; and determining, by the analysis device, at least one text element among the text elements for generating the image if the degrees of correlation for the at least on text element is greater than a threshold value, wherein the conditional latent vectors are generated in a process of generating the image by the text-to-image generative model” amounts to merely applying the mental process using a computer, which is not enough to qualify as significantly more under the MPEP, “Adding the words "apply it" (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, e.g., a limitation indicating that a particular function such as creating and maintaining electronic records is performed by a computer, as discussed in Alice Corp., 134 S. Ct. at 2360, 110 USPQ2d at 1984 (see MPEP § 2106.05(f))”. Therefore, the claims are not patent eligible under 35 U.S.C. 101.
Claims 2 and 7 depend on independent claims 1 and 6, respectively, and include all the limitations of claims 1 and 6. Claims 2 and 7 recite “wherein the text-to-image generative model is a Stable Diffusion model”, which, under the broadest reasonable interpretation, involves mathematical calculations. The claims therefore recite a judicial exception that falls within the “Mathematical concepts” grouping of abstract ideas. Claims 2 and 7 do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements, when considered both individually and as an ordered combination, do not amount to significantly more than the abstract idea.
Claims 3 and 8 depend on independent claims 1 and 6, respectively, and include all the limitations of claims 1 and 6. Claims 3 and 8 recite “wherein the analysis device calculates the degrees of correlation on the basis of attention maps generated by performing cross attention between the conditional latent vectors and the at least one text element”, which, under the broadest reasonable interpretation, involves mathematical calculations. The claims therefore recite a judicial exception that falls within the “Mathematical concepts” grouping of abstract ideas. Claims 3 and 8 do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements, when considered both individually and as an ordered combination, do not amount to significantly more than the abstract idea.
Claims 4 and 9 depend on independent claims 1 and 6, respectively, and include all the limitations of claims 1 and 6. Claims 4 and 9 recite “wherein the analysis device calculates the degrees of correlation on the basis of attention maps generated by performing cross attention between text embeddings generated by a text encoder receiving the text prompt in the process of generating the image and the conditional latent vectors generated conditional on the text embeddings”, which, under the broadest reasonable interpretation, involves mathematical calculations. The claims therefore recite a judicial exception that falls within the “Mathematical concepts” grouping of abstract ideas. Claims 4 and 9 do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements, when considered both individually and as an ordered combination, do not amount to significantly more than the abstract idea.
Claims 5 and 10 depend on claims 3 and 8, respectively, and include all the limitations of claims 3 and 8. Claims 5 and 10 recite “wherein the analysis device distinguishes or segments an area where the at least one text element is positioned in the image on the basis of the attention maps” (mental process – observation, evaluation, judgment, opinion). The claim language only further specifies the data used in the underlying mental process. Claims 5 and 10 do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements, when considered both individually and as an ordered combination, do not amount to significantly more than the abstract idea.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-10 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Non-Patent Literature “Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models” (May 31, 2023) to Chefer et al. (“Chefer”).
As to claims 1 and 6, Chefer discloses a method and an analysis device for analyzing a degree of correlation between an image and a text prompt, which are generated by a generative model [Chefer pages 1-24], the method comprising: receiving the text prompt by an analysis device [page 1:3, Fig. 3: “A lion with a crown”]; inputting the text prompt into the text-to-image generative model by the analysis device [page 1:3, Fig. 3, pages 1:3-1:5: sections 3 and 4]; calculating, by the analysis device, degrees of correlation between text elements in the text prompt and conditional latent vectors for each text element [Fig. 3, pages 1:3-1:4: “Text-Conditioning via Cross-Attention”, also see section 4 on pages 1:3-1:4 and Algorithm 1]; and determining, by the analysis device, at least one text element among the text elements for generating the image if the degrees of correlation for the at least one text element is greater than a threshold value [pages 1:3-1:4: “set of thresholds…”, also page 1:10, Appendix A.1: “performing the iterative latent refinement … until the specified threshold value is attained…”], wherein the conditional latent vectors are generated in a process of generating the image by the text-to-image generative model [Fig. 3, pages 1:3-1:5].
As to claims 2 and 7, Chefer discloses wherein the text-to-image generative model is a Stable Diffusion model [page 1:3, section 3: “we apply our method over the state-of-the-art Stable Diffusion model (SD)”].
As to claims 3 and 8, Chefer discloses wherein the analysis device calculates the degrees of correlation on the basis of attention maps generated by performing cross attention between the conditional latent vectors and the at least one text element [pages 1:3, 1:4: “Extracting the Cross-Attention Maps”].
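For illustration only, the kind of computation at issue in claims 3 and 8 can be sketched as follows. This is a minimal sketch under stated assumptions, not Chefer's implementation and not the applicant's: the function names, the toy two-dimensional vectors, the use of the maximum attention value as the per-token "degree of correlation", and the 0.5 threshold are all hypothetical choices made for the example.

```python
import math

def cross_attention_maps(queries, keys):
    """Return attn[p][t]: for each latent position p, a softmax over text
    tokens t of the scaled dot product between query p and key t."""
    d = len(keys[0])
    maps = []
    for q in queries:
        logits = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(x - m) for x in logits]
        total = sum(exps)
        maps.append([e / total for e in exps])
    return maps

def token_scores(attn):
    """Assumed scoring rule: a token's score is its maximum attention value
    over all latent positions."""
    n_tokens = len(attn[0])
    return [max(row[t] for row in attn) for t in range(n_tokens)]

def tokens_above(tokens, scores, threshold):
    """Keep the text elements whose score exceeds the threshold."""
    return [tok for tok, s in zip(tokens, scores) if s > threshold]

# Toy data (hypothetical): two text tokens and three latent positions.
tokens = ["lion", "crown"]
keys = [[1.0, 0.0], [0.0, 1.0]]                   # one key vector per token
queries = [[4.0, 0.0], [3.0, 0.0], [2.0, 1.0]]    # one query per latent position

attn = cross_attention_maps(queries, keys)
scores = token_scores(attn)
selected = tokens_above(tokens, scores, threshold=0.5)
print(selected)
```

With this toy data, every latent position attends more strongly to "lion" than to "crown", so only "lion" exceeds the assumed threshold.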
As to claims 4 and 9, Chefer discloses wherein the analysis device calculates the degrees of correlation on the basis of attention maps generated by performing cross attention between text embeddings generated by a text encoder receiving the text prompt in the process of generating the image and the conditional latent vectors generated conditional on the text embeddings [pages 1:3-1:5: sections 3 and 4, Figs. 3-4].
As to claims 5 and 10, Chefer discloses wherein the analysis device distinguishes or segments an area where the at least one text element is positioned in the image on the basis of the attention maps [pages 1:3-1:4, Fig. 4].
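For illustration only, distinguishing an area of the image from a single token's attention map, as recited in claims 5 and 10, can be sketched as a simple thresholding step. This is a minimal sketch under stated assumptions, not the method of Chefer or of the applicant: the function names, the min-max normalization, the 0.5 relative threshold, and the toy 3x4 map are hypothetical.

```python
def segment_region(attn_map, rel_threshold=0.5):
    """Return a boolean mask marking cells whose min-max normalized
    attention value exceeds rel_threshold (an assumed rule)."""
    lo = min(min(row) for row in attn_map)
    hi = max(max(row) for row in attn_map)
    span = hi - lo
    if span == 0:
        # A flat map carries no spatial information, so nothing is selected.
        return [[False for _ in row] for row in attn_map]
    return [[(v - lo) / span > rel_threshold for v in row] for row in attn_map]

def bounding_box(mask):
    """Return (top, left, bottom, right) of the selected cells, or None."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))

# Toy attention map (hypothetical) for one token over a 3x4 latent grid.
crown_map = [
    [0.1, 0.8, 0.9, 0.1],
    [0.1, 0.7, 0.6, 0.1],
    [0.0, 0.1, 0.1, 0.0],
]

mask = segment_region(crown_map)
box = bounding_box(mask)
print(box)
```

In this toy map, the four high-attention cells in the upper middle of the grid form the segmented area, and the bounding box encloses exactly those cells.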
Conclusion
The following prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
U.S. Patent Application Publication No. 20240320867 to Bean [See Figs. 2-7 and corresponding paragraphs].
U.S. Patent Application Publication No. 20240153153 to LIU et al. [See Figs. 7-11 and corresponding paragraphs].
U.S. Patent Application Publication No. 20250078327 to Bao et al. [See Figs. 2-7 and corresponding paragraphs].
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANTIM G SHAH whose telephone number is (571)270-5214. The examiner can normally be reached Mon-Fri 7:30am-4pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ahmad Matar, can be reached at 571-272-7488. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ANTIM G SHAH/
Primary Examiner, Art Unit 2693