Prosecution Insights
Last updated: April 19, 2026
Application No. 18/312,246

ABSTRACT BACKGROUND GENERATION

Status: Final Rejection (§103), OA Round 4 (Final)
Filed: May 04, 2023
Examiner: AHN, CHRISTINE YERA
Art Unit: 2615
Tech Center: 2600 (Communications)
Assignee: Adobe Inc.

Grant Probability: 69% (Favorable); 99% with interview
Expected OA Rounds: 5-6
Projected Time to Grant: 2y 7m

Examiner Intelligence

Career Allow Rate: 69% (11 granted / 16 resolved), +6.8% vs Tech Center average; above average
Interview Lift: +37.5% allow rate in resolved cases with an interview; a strong lift
Typical Timeline: 2y 7m average prosecution; 34 applications currently pending
Career History: 50 total applications across all art units

Statute-Specific Performance

§101:  5.2% (-34.8% vs TC avg)
§103: 49.6% (+9.6% vs TC avg)
§102: 21.9% (-18.1% vs TC avg)
§112: 20.1% (-19.9% vs TC avg)

Tech Center averages are estimates. Based on career data from 16 resolved cases.

Office Action

Final Rejection (§103), mailed Mar 26, 2026
DETAILED ACTION

Notice of Pre-AIA or AIA Status

1. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

2. The amendment filed February 9, 2026 has been entered. Claims 1-5, 7-15, and 17-20 remain pending in the application. Applicant's amendments to the Claims have overcome the 35 U.S.C. 112(a) and (b) rejections previously set forth in the Non-Final Office Action mailed on November 7, 2025. The objections previously set forth in that Office Action remain.

Response to Arguments

3. Applicant's arguments filed February 9, 2026 have been fully considered but they are not persuasive.

4. Applicant argues that Roich et al. ("Pivotal Tuning for Latent-based Editing of Real Images"), hereinafter referred to as Roich, does not teach using multimodal encoder embeddings as the PTI pivot. Examiner replies that although Roich alone does not teach using multimodal encoder embeddings as the PTI pivot, Roich combined with Guo et al. (U.S. Patent Application Publication No. 2023/0022550 A1), hereinafter referred to as Guo, does. Roich teaches in Section 3.1 deriving the fixed pivot from an input image x, which is inverted into the latent space. Guo teaches in Paragraphs 79-80 and Figure 6 using a CLIP model to create embeddings of images, which are then passed into a latent code mapper. Thus, Guo's CLIP model, which passes image embeddings to latent code mappers, can be combined with Roich to embed the input image x through the multimodal encoder as it is inverted into the latent space.

5. Applicant argues that there is no motivation to combine Guo and Roich. Applicant asserts that Guo teaches using text prompts to modify images whereas Roich teaches preserving the exact identity and appearance of real photographs, that Guo and Roich therefore teach contradictory approaches, and that replacing Roich's GAN-derived pivot with an externally generated multimodal embedding would fundamentally alter Roich's approach. Examiner replies that Roich teaches not only identity preservation but also modifying images and making edits such as changing age or expression, as seen in Figure 1. Guo likewise teaches not only modifying images but also preserving parts of an image, in Paragraph 94, by reducing the impact of an edit on parts of the image that are not intended to be edited. Furthermore, Guo's externally generated multimodal embedding would not fundamentally alter Roich's approach because Roich teaches in Section 3.1 generating the fixed pivot from an external input image x, which is then inverted into the latent space. Guo's CLIP model, which passes image embeddings to latent code mappers, can be combined with Roich to embed Roich's input image through the multimodal encoder as it is mapped into the latent space. Thus, Roich and Guo are combinable. There also exists a motivation to combine through predictable results in substitution: Roich teaches deriving the fixed pivot from an input image x that is inverted into the latent space, and Guo teaches encoding the input image through a multimodal encoder to pass into a latent space. A person having ordinary skill in the art before the effective filing date would have recognized that passing the input image embedding x into the latent space, as taught by Roich, can be substituted with passing in the multimodal encoder embedding of the input image, as taught by Guo, because both embeddings serve the purpose of passing the input image into the latent space. Furthermore, a person having ordinary skill in the art would have been able to carry out the substitution to achieve the predictable result of passing an input image into the latent space using the multimodal encoder embedding taught by Guo. Thus, it would have been obvious to a person having ordinary skill in the art before the effective filing date to combine Roich with Guo in order to map and invert an image into latent space and create an image that is consistent with contents described by text or an input, as taught by Guo in Paragraph 33. To conclude, there is motivation to combine the references, and Claims 1, 9, 15, and their dependents remain rejected.
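To make the disputed combination concrete, the following is a minimal PyTorch sketch of Pivotal Tuning Inversion with a pivot derived from a multimodal image embedding, i.e., the substitution the Examiner proposes. Nothing here is taken from the record: the module names, dimensions, and the L2-only tuning loss are simplifying assumptions (Roich's actual method also uses a perceptual LPIPS term).

```python
# Hypothetical sketch: PTI with a multimodal-encoder-derived pivot.
import torch
import torch.nn as nn

class ToyGenerator(nn.Module):
    """Stand-in for a StyleGAN-like synthesis network: latent w -> image."""
    def __init__(self, w_dim=512, img_pixels=3 * 64 * 64):
        super().__init__()
        self.synthesis = nn.Sequential(
            nn.Linear(w_dim, 1024), nn.ReLU(),
            nn.Linear(1024, img_pixels), nn.Tanh(),
        )

    def forward(self, w):
        return self.synthesis(w).view(-1, 3, 64, 64)

def pivotal_tuning(generator, w_pivot, target_image, steps=100, lr=3e-4):
    """Tune generator weights while the pivot latent stays fixed (cf. Roich Sec. 3.2)."""
    w_pivot = w_pivot.detach()  # the pivot is frozen; only G's weights move
    opt = torch.optim.Adam(generator.parameters(), lr=lr)
    for _ in range(steps):
        recon = generator(w_pivot)
        # Roich also adds a perceptual (LPIPS) term; omitted here for brevity.
        loss = nn.functional.mse_loss(recon, target_image)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return generator

# In the proposed combination, the pivot comes from a CLIP-style image embedding
# passed through a latent mapper (cf. Guo Fig. 6) rather than from GAN inversion.
clip_dim, w_dim = 512, 512
latent_mapper = nn.Linear(clip_dim, w_dim)   # hypothetical mapper
clip_embedding = torch.randn(1, clip_dim)    # placeholder for a real CLIP embedding
w_pivot = latent_mapper(clip_embedding)
target = torch.rand(1, 3, 64, 64) * 2 - 1    # target image scaled to [-1, 1]
G = pivotal_tuning(ToyGenerator(), w_pivot, target)
```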
6. Conclusion: The rejections set forth in the previous Office Action are shown to have been proper, and the claims are rejected below. New citations and parenthetical remarks can be considered new grounds of rejection, and such new grounds of rejection are necessitated by Applicant's amendments to the claims. Therefore, the present Office Action is made final.

Claim Objections

7. Claims 2 and 3 are objected to because of the following informalities: "the prompt" in line 2 should be "the input prompt". Appropriate correction is required. Claim 17 is objected to because of the following informalities: Claim 17 is dependent on cancelled claim 16. Examiner will examine the claim as being dependent on claim 15 instead. Appropriate correction is required.

Claim Rejections - 35 USC § 103

8. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

9. The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

10. Claims 1, 3-5, 7-12, 15, and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Tan et al. ("ArtGAN: Artwork Synthesis with Conditional Categorical GANs"), hereinafter referred to as Tan, in view of Guo et al. (U.S. Patent Application Publication No. 2023/0022550 A1), hereinafter referred to as Guo, and Roich et al. ("Pivotal Tuning for Latent-based Editing of Real Images"), hereinafter referred to as Roich.

11. Regarding claim 1, Tan teaches a method comprising: obtaining an input prompt indicating an abstract design (Section 2.3, Paragraph 1 mentions receiving an input image x, which can be considered an input prompt; Section 1, Paragraph 2 mentions that the artwork can include abstract paintings, which allows the input image to indicate an abstract design); encoding the input prompt (Section 2.3, Paragraph 1 mentions encoding the input image x through an encoder to produce an embedding y.
Figure 2 shows, in region D, an input image x being passed into an encoder to produce the embedding y; Section 1, Paragraph 2 mentions that the artwork can include abstract paintings, which allows the input image to indicate an abstract design); generating a latent vector representing the abstract design in a latent space based on the prompt embedding and a noise vector using a mapping network of a generative adversarial network (GAN) (Section 2.3, Paragraphs 1-2, and Figure 2 show that in region G, a noise vector ẑ and embedding vector ŷ are input and transformed into latent space using the zNet, which can be considered the mapping network; this can be considered transforming them into a latent vector. The latent vector represents the input prompt, which can be an abstract image per Section 1, Paragraph 2, which means it represents an abstract design); and generating an image depicting the abstract design based on the latent vector using the GAN (Section 2.3, Paragraph 1 and Figure 2 show that an image x̂ is generated using the ArtGAN process, which uses the latent vector generated in region G of Figure 2; Section 1, Paragraph 2 mentions artwork can include abstract paintings, which allows the input image to be abstract and thus also allows the output image to be abstract), wherein the GAN is trained to depict abstract designs (Section 1, Paragraph 2 mentions artwork can include abstract paintings, which allows the input image to be abstract and thus also allows the output image to be abstract; thus, the GAN is trained to output abstract images. Section 2.2 teaches that the ArtGAN is trained to grasp abstract concepts and artistic styles; Section 2.3 teaches training the GAN with reconstruction losses).

However, Tan fails to teach wherein the mapping network is separate from the multimodal encoder; encoding the input prompt using a multimodal encoder; and training the GAN using Pivotal Tuning Inversion (PTI) based on a fixed pivot of the latent space, wherein the fixed pivot is an embedding of a training image based on an output of the multimodal encoder.

Guo teaches wherein the mapping network is separate from the multimodal encoder (Figure 6 teaches the CLIP image encoder is separate from the latent code mapper, or mapping network); and encoding the input prompt using a multimodal encoder, and an embedding of a training image based on an output of the multimodal encoder (Paragraph 79 and Figure 6 explain a CLIP model that encodes information for both a text and an image input; the CLIP model teaches a multimodal encoder). Tan and Guo are considered analogous to the claimed invention because both are in the same field of using GANs to create an image. Thus, it would have been obvious to a person having ordinary skill in the art before the effective filing date to modify the encoder in Tan with the multimodal encoder taught in Guo in order to allow users to easily edit images using GANs (Guo Paragraph 30).

However, Tan and Guo fail to teach training the GAN using Pivotal Tuning Inversion (PTI) based on a fixed pivot of the latent space, wherein the fixed pivot is an embedding of a training image.

Roich teaches training the GAN using Pivotal Tuning Inversion (PTI) based on the latent space, wherein the fixed pivot is an embedding of a training image (Section 3, Paragraph 1 teaches using latent code w_p as a pivot to tune the generator, or GAN; Section 3.1 and Equation 1 teach that the latent code w_p incorporates an input image vector x. The input image vector x can be considered the training image embedding; thus, w_p can be considered an embedding of the training image. Section 3.2 explains using w_p as the fixed pivot to tune the generator, keeping w_p constant while tuning the generator and its parameters) based on an output of the multimodal encoder (taught through Guo above. Roich teaches the fixed pivot using an input image x; Guo teaches encoding the input image through a multimodal encoder to pass into a mapping network. A person having ordinary skill in the art before the effective filing date would have recognized that passing the input image embedding x into the latent space, as taught by Roich, can be substituted with passing in the multimodal encoder embedding of the input image, as taught by Guo, because both embeddings serve the purpose of passing the input image into the latent space. Furthermore, a person having ordinary skill in the art would have been able to carry out the substitution to achieve the predictable result of passing an input image into the latent space using the multimodal encoder embedding taught by Guo. Thus, it would have been obvious to a person having ordinary skill in the art before the effective filing date to combine Roich with Guo in order to create an image that is consistent with contents described by text or an input, as taught by Guo in Paragraph 33). Tan, Guo, and Roich are considered analogous to the claimed invention because all are in the same field of using GANs to create an image. Thus, it would have been obvious to a person having ordinary skill in the art before the effective filing date to modify the method of generating abstract designs through a GAN taught by Tan in view of Guo with the PTI taught by Roich in order to preserve the identity of an image despite some edits, which is useful in training the GAN to reproduce abstract images during training (Roich Abstract).
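The claim-1 mapping step lends itself to a short sketch. Below is a hypothetical mapping network, separate from any encoder, that turns a prompt embedding plus a noise vector into a latent vector, loosely analogous to ArtGAN's zNet consuming (ẑ, ŷ); every name and dimension is an assumption for illustration.

```python
# Hypothetical mapping network: (prompt embedding, noise) -> latent vector.
import torch
import torch.nn as nn

class MappingNetwork(nn.Module):
    def __init__(self, embed_dim=512, noise_dim=128, w_dim=512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim + noise_dim, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, w_dim),
        )

    def forward(self, prompt_embedding, noise):
        # Condition the latent on the prompt by concatenating embedding and noise.
        return self.mlp(torch.cat([prompt_embedding, noise], dim=-1))

mapper = MappingNetwork()
w = mapper(torch.randn(1, 512), torch.randn(1, 128))  # latent vector for the GAN
```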
12. Regarding claim 3, Tan in view of Guo and Roich teaches the limitations of claim 1. Tan further teaches the method wherein the prompt comprises an image depicting the abstract design (Section 2.3, Paragraph 1 mentions inputting an input image x into the GAN as the prompt; Section 1, Paragraph 2 mentions that the artwork can include abstract paintings, which allows the input image to be abstract).

13. Regarding claim 4, Tan in view of Guo and Roich teaches the limitations of claim 1. Tan further teaches the method wherein the GAN is trained to generate abstract images based on conditioning from abstract image embeddings (Section 1, Paragraph 2 mentions artwork can include abstract paintings, which allows the input image to be abstract and thus also allows the output image to be abstract; Section 2.3, Paragraph 1 and Figure 2 show receiving an input image x and encoding it through an encoder to produce embeddings z and y; region D in Figure 2 then processes the embeddings in order to generate the final output image x̂).

14. Regarding claim 5, Tan in view of Guo and Roich teaches the limitations of claim 4. However, Tan fails to teach the method wherein the abstract image embeddings and the prompt embedding are generated using a multimodal encoder. Guo teaches the method wherein the abstract image embeddings and the prompt embedding are generated using a multimodal encoder (Paragraph 79 and Figure 6 explain a CLIP model that encodes information for both a text and an image input.
The CLIP model can be considered a multimodal encoder). Tan, Guo, and Roich are considered analogous to the claimed invention because all are in the same field of using GANs to create an image. Thus, it would have been obvious to a person having ordinary skill in the art before the effective filing date to modify the encoder in Tan in view of Roich with the multimodal encoder taught in Guo in order to allow users to easily edit images using GANs (Guo Paragraph 30).

15. Regarding claim 7, Tan in view of Guo and Roich teaches the limitations of claim 1. Tan further teaches the method wherein the input prompt comprises an original image including the abstract design (Section 2.3, Paragraph 1 mentions inputting an input image x into the GAN as the prompt; Section 1, Paragraph 2 mentions that the artwork can include abstract paintings, which allows the input image to be abstract). However, Tan fails to teach the method wherein the input prompt comprises text describing a modification to the original image. Guo teaches the method wherein the input prompt comprises text describing a modification to the original image (Paragraph 35 mentions editing the picture based on the text description). Tan, Guo, and Roich are considered analogous to the claimed invention because all are in the same field of using GANs to create an image. Thus, it would have been obvious to a person having ordinary skill in the art before the effective filing date to modify the input prompt in Tan in view of Roich with the text describing a modification taught in Guo in order to allow users to easily edit images using GANs (Guo Paragraph 30).

16. Regarding claim 8, Tan in view of Guo and Roich teaches the limitations of claim 1. Tan further teaches the method wherein the input prompt comprises an original image (Section 2.3, Paragraph 1 mentions inputting an input image x into the GAN as the prompt; Section 1, Paragraph 2 mentions that the artwork can include abstract paintings, which allows the input image to be abstract). However, Tan fails to teach the method wherein the input prompt comprises text describing the abstract design. Guo teaches the method wherein the input prompt comprises an original image and text describing an abstract design (Paragraph 79 mentions receiving an image and a text description of the image; a text description of an image can describe an abstract design when the original image is abstract). Tan, Guo, and Roich are considered analogous to the claimed invention because all are in the same field of using GANs to create an image. Thus, it would have been obvious to a person having ordinary skill in the art before the effective filing date to modify the input prompt in Tan in view of Roich with the text describing an abstract design taught in Guo in order to allow users to easily edit images using GANs (Guo Paragraph 30).
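Since the rejection leans on Guo's CLIP model as the "multimodal encoder", a brief sketch may help. This assumes the Hugging Face transformers library and an off-the-shelf CLIP checkpoint; both are illustrative choices, not drawn from Guo.

```python
# Minimal sketch of a multimodal (CLIP-style) encoder embedding an image and a
# text prompt into the same space. Model choice is an assumption.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.new("RGB", (224, 224))  # placeholder for an abstract design
inputs = processor(text=["an abstract background"], images=image,
                   return_tensors="pt", padding=True)

image_embedding = model.get_image_features(pixel_values=inputs["pixel_values"])
text_embedding = model.get_text_features(input_ids=inputs["input_ids"],
                                         attention_mask=inputs["attention_mask"])
# Either embedding could feed a latent mapper, which is the substitution the
# rejection relies on for the "fixed pivot" limitation.
```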
17. Regarding claim 9, Tan teaches a method comprising: obtaining a training image depicting an abstract design (Section 2.3, Paragraph 1 mentions receiving an input image x, which can be considered an input prompt; Section 1, Paragraph 2 mentions that the artwork can include abstract paintings, which allows the input image to indicate an abstract design); encoding the training image (Section 2.3, Paragraph 1 mentions encoding the input image x through an encoder to produce an embedding y. Figure 2 shows, in region D, an input image x being passed into an encoder to produce the embedding y; Section 1, Paragraph 2 mentions that the artwork can include abstract paintings, which allows the input image to indicate an abstract design, and thus the embedding y also represents the abstract design); generating a latent vector in a latent space for a generative adversarial network (GAN) based on the image embedding using a mapping network (Section 2.3, Paragraphs 1-2, and Figure 2 show that in region G, a noise vector ẑ and embedding vector ŷ are input and transformed into latent space using the zNet, which can be considered the mapping network; this can be considered transforming them into a latent vector); and training the GAN to generate an output image depicting the abstract design based on the latent vector using a discriminator network (Section 2.3, Paragraph 1 and Figure 2 show that an image x̂ is generated using the ArtGAN process, which uses the latent vector; Section 2.3, Paragraph 2 mentions using and evaluating the generated image in a loss function and updating the discriminator network D to train the GAN; Section 1, Paragraph 2 mentions artwork can include abstract paintings, which allows the input image to be abstract and thus also allows the output image to be abstract).

However, Tan fails to teach wherein the mapping network is separate from the multimodal encoder; encoding the training image using a multimodal encoder; and wherein training the GAN to generate the output image includes performing Pivotal Tuning Inversion (PTI) using the image embedding as a fixed pivot of the latent space of the mapping network, wherein the image embedding is based on an output of the multimodal encoder.

Guo teaches wherein the mapping network is separate from the multimodal encoder (Figure 6 teaches the CLIP image encoder is separate from the latent code mapper, or mapping network); and encoding the training image using a multimodal encoder, wherein the image embedding is based on an output of the multimodal encoder (Paragraph 79 and Figure 6 explain a CLIP model that encodes information for both a text and an image input; the CLIP model can be considered a multimodal encoder). Tan and Guo are considered analogous to the claimed invention because both are in the same field of using GANs to create an image. Thus, it would have been obvious to a person having ordinary skill in the art before the effective filing date to modify the encoder in Tan with the multimodal encoder taught in Guo in order to allow users to easily edit images using GANs (Guo Paragraph 30).

However, Tan and Guo fail to teach wherein training the GAN to generate the output image includes performing Pivotal Tuning Inversion (PTI) using the image embedding as a fixed pivot of the latent space of the mapping network, wherein the image embedding is based on an output of the multimodal encoder.

Roich teaches wherein training the GAN to generate the output image includes performing Pivotal Tuning Inversion (PTI) using the image embedding as a fixed pivot of the latent space of the mapping network (Section 3, Paragraph 1 teaches using latent code w_p as a pivot to tune the generator, or GAN; Section 3.1 and Equation 1 teach that the latent code w_p incorporates an input image vector x.
The input image vector x can be considered the training image embedding and is taught to be inverted into the latent space; this inversion can be considered part of the mapping network that maps the training image into the latent space. Thus, w_p can be considered an embedding of the training image. Section 3.2 explains using w_p as the fixed pivot to tune the generator, keeping w_p constant while tuning the generator and its parameters), wherein the image embedding is based on an output of the multimodal encoder (taught through Guo above. Roich teaches the fixed pivot using an input image x; Guo teaches encoding the input image through a multimodal encoder to pass into a mapping network. A person having ordinary skill in the art before the effective filing date would have recognized that passing the input image embedding x into the latent space, as taught by Roich, can be substituted with passing in the multimodal encoder embedding of the input image, as taught by Guo, because both embeddings serve the purpose of passing the input image into the latent space. Furthermore, a person having ordinary skill in the art would have been able to carry out the substitution to achieve the predictable result of passing an input image into the latent space using the multimodal encoder embedding taught by Guo. Thus, it would have been obvious to a person having ordinary skill in the art before the effective filing date to combine Roich with Guo in order to create an image that is consistent with contents described by text or an input, as taught by Guo in Paragraph 33). Tan, Guo, and Roich are considered analogous to the claimed invention because all are in the same field of using GANs to create an image. Thus, it would have been obvious to a person having ordinary skill in the art before the effective filing date to modify the method of generating abstract designs through a GAN taught by Tan in view of Guo with the PTI taught by Roich in order to preserve the identity of an image despite some edits, which is useful in training the GAN to reproduce abstract images during training (Roich Abstract).

18. Regarding claim 10, Tan in view of Guo and Roich teaches the limitations of claim 9. Tan further teaches the method further comprising: obtaining training data including abstract background images, wherein the GAN is trained based on the abstract background images (Section 3.1 mentions using a WikiArt dataset containing artwork to use for training the GAN; Section 1, Paragraph 2 mentions artwork can include abstract paintings, which allows the WikiArt dataset to include abstract artwork).

19. Regarding claim 11, Tan in view of Guo and Roich teaches the limitations of claim 10. Tan further teaches the method further comprising: classifying the output image using the discriminator network (Section 2.3, Paragraph 1 and Figure 2 show the discriminator network can consume the output image x̂ and run it through the classifier clsNet); and computing a discriminator loss based on the classification, wherein the GAN is trained based on the discriminator loss (Section 2.3, Paragraph 2 shows computing a discriminator loss L_D through Equation 2, and mentions updating parameters in the discriminator D based on the discriminator loss).
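For claim 11's classify-then-compute-a-discriminator-loss limitation, here is a minimal sketch of a categorical discriminator in the spirit of ArtGAN's clsNet and L_D; the architecture, class count, and exact loss form are assumptions, not taken from Tan.

```python
# Hypothetical categorical discriminator: real classes plus one extra "fake" class.
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Flatten(),
        )
        self.cls_head = nn.Linear(128 * 16 * 16, num_classes + 1)  # + "fake" class

    def forward(self, img):
        return self.cls_head(self.features(img))

disc = Discriminator()
real = torch.randn(4, 3, 64, 64)              # real artwork batch (placeholder)
fake = torch.randn(4, 3, 64, 64)              # generator output (placeholder)
real_labels = torch.randint(0, 10, (4,))      # true artwork categories
fake_label = torch.full((4,), 10)             # index of the extra "fake" class
loss_d = nn.functional.cross_entropy(disc(real), real_labels) + \
         nn.functional.cross_entropy(disc(fake), fake_label)
loss_d.backward()                             # gradients for updating D
```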
20. Regarding claim 12, Tan in view of Guo and Roich teaches the limitations of claim 10. Tan further teaches the method further comprising: computing a reconstruction loss based on an abstract image from the abstract background images and a predicted image, wherein the GAN generates the predicted image and is trained based on the reconstruction loss (Section 2.3, Paragraph 3 shows computing a reconstruction loss based on the abstract image input x and the reconstructed image, represented as Dec(Enc(x_r)), which was generated by the GAN; Section 1, Paragraph 4 mentions using the reconstruction loss to train the GAN).
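Claim 12's reconstruction loss compares an input image with its encode-decode reconstruction, Dec(Enc(x_r)). A toy sketch of that computation follows; the encoder and decoder are stand-in modules, not Tan's actual networks.

```python
# Toy encode-decode reconstruction loss, Dec(Enc(x_r)) vs. x_r.
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 256))       # stand-in Enc
dec = nn.Sequential(nn.Linear(256, 3 * 64 * 64), nn.Tanh())          # stand-in Dec

x_r = torch.rand(1, 3, 64, 64) * 2 - 1            # abstract training image in [-1, 1]
recon = dec(enc(x_r)).view(1, 3, 64, 64)          # Dec(Enc(x_r))
recon_loss = nn.functional.mse_loss(recon, x_r)   # pixel-wise reconstruction term
recon_loss.backward()                             # gradients for Enc/Dec parameters
```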
21. Regarding claim 15, Tan teaches an apparatus wherein the GAN is trained to generate a latent vector in a latent space based on a prompt embedding representing an abstract design and a noise vector using a mapping network of the GAN, and to generate an output image based on the latent vector (Section 2.3, Paragraphs 1-2, and Figure 2 show that in region G, a noise vector ẑ and embedding vector ŷ are input and transformed into latent space using the zNet, which can be considered the mapping network; this can be considered transforming them into a latent vector. Section 1, Paragraph 2 mentions that the artwork can include abstract paintings, which allows the input image or prompt to indicate an abstract design, and thus the prompt embedding y also represents the abstract design. Section 2.3, Paragraph 1 and Figure 2 show that an image x̂ is generated using the ArtGAN process, which uses the latent vector generated in region G of Figure 2).

However, Tan fails to teach an apparatus comprising: at least one processor; at least one memory storing instructions executable by the processor; and a generative adversarial network (GAN) comprising parameters stored in the at least one memory; and using Pivotal Tuning Inversion (PTI) based on a fixed pivot of the latent space, wherein the fixed pivot is an embedding of a training image based on an output of a multimodal encoder of the GAN separate from the mapping network.

Guo teaches an apparatus comprising: at least one processor; at least one memory storing instructions executable by the processor (Paragraph 109 and Figure 9 show a memory 902 and a processor 901, which executes instructions from memory); and a generative adversarial network (GAN) comprising parameters stored in the at least one memory (Paragraph 111 mentions different components of the GAN are stored in memory, which would include the parameters); and an embedding of a training image based on an output of a multimodal encoder of the GAN (Paragraph 79 and Figure 6 teach a CLIP model that encodes information for both a text and an image input; the CLIP model teaches a multimodal encoder) separate from the mapping network (Figure 6 teaches the CLIP image encoder is separate from the latent code mapper, or mapping network). Tan and Guo are considered analogous to the claimed invention because both are in the same field of using GANs to create an image. Thus, it would have been obvious to a person having ordinary skill in the art before the effective filing date to combine the apparatus of creating images in Tan with the processor and memory taught in Guo in order to implement the image generation method on an electronic device (Guo Paragraph 108).

However, Tan and Guo fail to teach training the GAN using Pivotal Tuning Inversion (PTI) based on a fixed pivot of the latent space, wherein the fixed pivot is an embedding of a training image.

Roich teaches training the GAN using Pivotal Tuning Inversion (PTI) based on a fixed pivot of the latent space, wherein the fixed pivot is an embedding of a training image (Section 3, Paragraph 1 teaches using latent code w_p as a pivot to tune the generator, or GAN; Section 3.1 and Equation 1 teach that the latent code w_p incorporates an input image vector x. The input image vector x can be considered the training image embedding; thus, w_p can be considered an embedding of the training image. Section 3.2 explains using w_p as the fixed pivot to tune the generator, keeping w_p constant while tuning the generator and its parameters) based on an output of a multimodal encoder of the GAN separate from the mapping network (taught through Guo above. Roich teaches the fixed pivot using an input image x; Guo teaches encoding the input image through a multimodal encoder to pass into a mapping network. A person having ordinary skill in the art before the effective filing date would have recognized that passing the input image embedding x into the latent space, as taught by Roich, can be substituted with passing in the multimodal encoder embedding of the input image, as taught by Guo, because both embeddings serve the purpose of passing the input image into the latent space. Furthermore, a person having ordinary skill in the art would have been able to carry out the substitution to achieve the predictable result of passing an input image into the latent space using the multimodal encoder embedding taught by Guo. Thus, it would have been obvious to a person having ordinary skill in the art before the effective filing date to combine Roich with Guo in order to create an image that is consistent with contents described by text or an input, as taught by Guo in Paragraph 33). Tan, Guo, and Roich are considered analogous to the claimed invention because all are in the same field of using GANs to create an image. Thus, it would have been obvious to a person having ordinary skill in the art before the effective filing date to modify the apparatus of generating abstract designs through a GAN taught by Tan in view of Guo with the PTI taught by Roich in order to preserve the identity of an image despite some edits, which is useful in training the GAN to reproduce abstract images during training (Roich Abstract).

22. Regarding claim 17, Tan in view of Guo and Roich teaches the limitations of claim 16. However, Tan and Guo fail to teach the apparatus further comprising: an optimization component configured to tune the GAN via the PTI. Roich teaches the apparatus further comprising: an optimization component configured to tune the GAN via the PTI (Section 3.2 teaches using PTI to tune the GAN; the step of running the GAN using a pivot latent code can be considered the optimization component). Tan, Guo, and Roich are considered analogous to the claimed invention because all are in the same field of using GANs to create an image. Thus, it would have been obvious to a person having ordinary skill in the art before the effective filing date to modify the apparatus of generating abstract designs through a GAN taught by Tan in view of Guo with the PTI taught by Roich in order to preserve the identity of an image despite some edits, which is useful in training the GAN to reproduce abstract images during training (Roich Abstract).

23. Regarding claim 18, Tan in view of Guo and Roich teaches the limitations of claim 15.
However, Tan fails to teach the apparatus wherein the multimodal encoder is configured to generate the prompt embedding based on an input image or an input text. Guo teaches the apparatus wherein the multimodal encoder is configured to generate the prompt embedding based on an input image or an input text (Paragraph 79 and Figure 6 explain a CLIP model that encodes information for both a text and an image input; the CLIP model can be considered a multimodal encoder). Tan, Guo, and Roich are considered analogous to the claimed invention because all are in the same field of using GANs to create an image. Thus, it would have been obvious to a person having ordinary skill in the art before the effective filing date to modify the encoder in Tan in view of Roich with the multimodal encoder taught in Guo in order to allow users to easily edit images using GANs (Guo Paragraph 30).

24. Regarding claim 19, Tan in view of Guo and Roich teaches the limitations of claim 15. Tan further teaches the apparatus further comprising: a discriminator network configured to classify an output of the GAN (Section 2.3, Paragraph 1 and Figure 2 show the discriminator network can consume the output image x̂ and run it through the classifier clsNet).

25. Regarding claim 20, Tan in view of Guo and Roich teaches the limitations of claim 15. Tan further teaches the apparatus further comprising: a training component configured to train the GAN based on a reconstruction loss, wherein the reconstruction loss is computed based on the output image (Section 2.3, Paragraph 3 shows computing a reconstruction loss, in Equation 4, based on the abstract image input x and the reconstructed image, represented in the equation as Dec(Enc(x_r)), which was generated by the GAN).

26. Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Tan et al. ("ArtGAN: Artwork Synthesis with Conditional Categorical GANs"), hereinafter referred to as Tan, in view of Guo et al. (U.S. Patent Application Publication No. 2023/0022550 A1), hereinafter referred to as Guo, and Roich et al. ("Pivotal Tuning for Latent-based Editing of Real Images"), hereinafter referred to as Roich, as applied to claim 1 above, and further in view of Saharia et al. (U.S. Patent Application Publication No. 2023/0377226 A1), hereinafter referred to as Saharia.

Regarding claim 2, Tan in view of Guo and Roich teaches the limitations of claim 1. However, Tan, Guo, and Roich fail to teach the method wherein the prompt comprises a text prompt describing an abstract design. Saharia teaches the method wherein the prompt comprises a text prompt describing an abstract design (Paragraph 4 mentions inputting a text prompt that describes a scene to create an image through a generative neural network; Paragraph 77 mentions the generative neural network can include GANs; Paragraph 60 mentions the text prompt can describe abstract art). Tan, Guo, Roich, and Saharia are considered analogous to the claimed invention because all are in the same field of generating an image through a GAN with an input prompt. Thus, it would have been obvious to a person having ordinary skill in the art before the effective filing date to modify the input prompt in Tan in view of Guo and Roich with the text prompt in Saharia in order to generate a high-resolution image that is accurately described by the text prompt (Saharia Paragraph 26).

27. Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Tan et al.
("ArtGAN: Artwork Synthesis with Conditional Categorical GANs"), hereinafter referred to as Tan, in view of Guo et al. (U.S. Patent Application Publication No. 2023/0022550 A1), hereinafter referred to as Guo, and Roich et al. (“Pivotal Tuning for Latent-based Editing of Real Images”), hereinafter referred to as Roich, as applied to claim 12 above, and further in view of Ling et al. (United States Patent Application Publication No. 2022/0383570 A1). Regarding claim 13, Tan in view of Guo and Roich teaches the limitations of claim 12. Tan further teaches the method wherein: the reconstruction loss comprises a pixel-based loss term (Section 2.3, Paragraph 4, equation 4 shows computing a pixel-wise reconstruction loss which is represented by L L 2 ). However, Tan, Guo, and Roich to teach the method wherein: the reconstruction loss comprises a perceptual loss term. Ling teaches the method wherein: the reconstruction loss comprises a perceptual loss term (Paragraph 25 mentions when training the GAN, reconstruction losses can include both pixel-wise and perceptual reconstruction loss). Tan, Guo, Roich, and Ling are considered analogous to the claimed invention because all are in the same field of generating an output image using GANs. Thus, it would have been obvious to a person holding ordinary skill in the art before the effective filing date to modify the reconstruction loss in Tan in view of Guo and Roich with the addition of the perceptual loss term in Ling in order to ensure that specified parts of the image are edited and non-specified parts are left unchanged (Ling et al. Paragraph 7). 28. Claim(s) 14 is/are rejected under 35 U.S.C. 103 as being unpatentable over Tan et al. ("ArtGAN: Artwork Synthesis with Conditional Categorical GANs"), hereinafter referred to as Tan, in view of Guo et al. (U.S. Patent Application Publication No. 2023/0022550 A1), hereinafter referred to as Guo, and Roich et al. (“Pivotal Tuning for Latent-based Editing of Real Images”), hereinafter referred to as Roich, as applied to claim 12 above, and further in view of Johnson et al. (“Perceptual Losses for Real-Time Style Transfer and Super-Resolution”), hereinafter referred to as Johnson, and Xu et al. (U.S. Patent Application Publication No. 2021/0264236 A1), hereinafter referred to as Xu. Regarding claim 14, Tan in view of Guo and Roich teaches the limitations of claim 12. Tan further teaches the method further comprising: training the GAN during a first phase without the reconstruction loss; and training the GAN during a second phase using the reconstruction loss, (Section 1, Paragraph 4 mentions applying the reconstruction loss only to train the G, the generator, part of the GAN and not to the D, discriminator, part of the GAN. This can be interpreted as training the GAN without the reconstruction loss in the first stage, which is the discriminator, and with the reconstruction loss in the second stage, the generator. It also mentions the reconstruction loss trains the generator to improve the quality of the generated images which implies the first stage, the discriminator, does not train for high resolution). However, the Applicant has changed the scope of the claims to specify only the high and low resolution layers in the generator of the GAN is trained in those two phases. Tan does not teach only training the resolution layers in the generator in two phases. 
28. Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Tan et al. ("ArtGAN: Artwork Synthesis with Conditional Categorical GANs"), hereinafter referred to as Tan, in view of Guo et al. (U.S. Patent Application Publication No. 2023/0022550 A1), hereinafter referred to as Guo, and Roich et al. ("Pivotal Tuning for Latent-based Editing of Real Images"), hereinafter referred to as Roich, as applied to claim 12 above, and further in view of Johnson et al. ("Perceptual Losses for Real-Time Style Transfer and Super-Resolution"), hereinafter referred to as Johnson, and Xu et al. (U.S. Patent Application Publication No. 2021/0264236 A1), hereinafter referred to as Xu.

Regarding claim 14, Tan in view of Guo and Roich teaches the limitations of claim 12. Tan further teaches the method further comprising: training the GAN during a first phase without the reconstruction loss; and training the GAN during a second phase using the reconstruction loss (Section 1, Paragraph 4 mentions applying the reconstruction loss only to the generator G of the GAN and not to the discriminator D. This can be interpreted as training the GAN without the reconstruction loss in the first stage, the discriminator, and with the reconstruction loss in the second stage, the generator. It also mentions the reconstruction loss trains the generator to improve the quality of the generated images, which implies the first stage, the discriminator, does not train for high resolution). However, Applicant has changed the scope of the claims to specify that only the high- and low-resolution layers in the generator of the GAN are trained in those two phases, and Tan does not teach training only the resolution layers of the generator in two phases.

Thus, Tan, Guo, and Roich fail to teach the method further comprising: training the GAN during a first phase without the reconstruction loss; and training the GAN during a second phase using the reconstruction loss, wherein the second phase trains high-resolution layers of the generator of the GAN and the first phase trains low-resolution layers of the generator of the GAN. Xu teaches the method further comprising: training the GAN during a first phase (Paragraph 3 teaches training a GAN that has coarse and fine layers; the coarse layers are lower-resolution layers and the fine layers are high-resolution layers. Paragraphs 35-37 teach first training the coarse, lower-resolution layers and then proceeding to train the higher-resolution, fine layers; the training of the coarse and fine layers constitutes the first and second phases). Tan, Guo, Roich, and Xu are considered analogous to the claimed invention because all are in the same field of generating an output image using neural networks. Thus, it would have been obvious to a person having ordinary skill in the art before the effective filing date to modify the method of training the GAN taught by Tan in view of Guo and Roich with the two phases taught by Xu in order to allow for better control over the content and style of the generated images (Xu Paragraph 3).

However, Tan, Guo, Roich, and Xu fail to teach training the generator of the GAN during a first phase without the reconstruction loss and training the generator of the GAN during a second phase using the reconstruction loss. Johnson teaches training the GAN during a first phase without the reconstruction loss and training the GAN during a second phase using the reconstruction loss (Section 3.2, "Feature Reconstruction Loss" teaches that the feature reconstruction loss is computed only from a jth layer of the network onward in order to produce images that preserve image content and overall spatial structure without being completely identical to the input image; the jth layer can be considered part of the second phase with the higher-resolution layers). Tan, Guo, Roich, Xu, and Johnson are considered analogous to the claimed invention because all are in the same field of generating an output image using neural networks. Thus, it would have been obvious to a person having ordinary skill in the art before the effective filing date to modify the method of training the GAN taught by Tan in view of Guo, Roich, and Xu with the reconstruction loss applied only to the higher layers as taught by Johnson in order to output images that are not identical to the input but have the same style (Johnson Section 3.2, "Feature Reconstruction Loss").
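Claim 14's two-phase schedule can be sketched as follows: phase 1 updates only low-resolution generator layers without a reconstruction term, and phase 2 updates only high-resolution layers with it. The layer split, losses, and step counts below are hypothetical.

```python
# Hypothetical two-phase training: coarse layers first (no reconstruction loss),
# then fine layers with the reconstruction loss added.
import torch
import torch.nn as nn

low_res = nn.Sequential(nn.Linear(128, 256), nn.ReLU())            # coarse layers
high_res = nn.Sequential(nn.Linear(256, 3 * 64 * 64), nn.Tanh())   # fine layers
generator = nn.Sequential(low_res, high_res)

def adversarial_loss(img):
    # Stand-in for a real discriminator-based loss term.
    return img.pow(2).mean()

target = torch.rand(8, 3 * 64 * 64) * 2 - 1
z = torch.randn(8, 128)

# Phase 1: train only the low-resolution layers, no reconstruction term.
opt1 = torch.optim.Adam(low_res.parameters(), lr=1e-4)
for _ in range(10):
    loss = adversarial_loss(generator(z))
    opt1.zero_grad()
    loss.backward()
    opt1.step()

# Phase 2: train only the high-resolution layers, reconstruction loss added.
opt2 = torch.optim.Adam(high_res.parameters(), lr=1e-4)
for _ in range(10):
    out = generator(z)
    loss = adversarial_loss(out) + nn.functional.mse_loss(out, target)
    opt2.zero_grad()
    loss.backward()
    opt2.step()
```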
Conclusion

29. Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

30. Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHRISTINE Y. AHN, whose telephone number is (571) 272-0672. The examiner can normally be reached M-F, 9am-5pm. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Alicia Harrington, can be reached at (571) 272-2330. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/CHRISTINE YERA AHN/
Examiner, Art Unit 2615

/ALICIA M HARRINGTON/
Supervisory Patent Examiner, Art Unit 2615

Prosecution Timeline

May 04, 2023 - Application Filed
Mar 04, 2025 - Non-Final Rejection (§103)
Apr 30, 2025 - Interview Requested
May 06, 2025 - Applicant Interview (Telephonic)
May 06, 2025 - Examiner Interview Summary
Jun 05, 2025 - Response Filed
Jul 09, 2025 - Final Rejection (§103)
Aug 26, 2025 - Interview Requested
Sep 10, 2025 - Examiner Interview Summary
Sep 10, 2025 - Applicant Interview (Telephonic)
Sep 15, 2025 - Request for Continued Examination
Oct 01, 2025 - Response after Non-Final Action
Nov 04, 2025 - Non-Final Rejection (§103)
Jan 26, 2026 - Interview Requested
Feb 03, 2026 - Applicant Interview (Telephonic)
Feb 03, 2026 - Examiner Interview Summary
Feb 09, 2026 - Response Filed
Mar 26, 2026 - Final Rejection (§103, current)

Precedent Cases

Applications granted by this same examiner with similar technology:

Patent 12602877: BODY MODEL PROCESSING METHODS AND APPARATUSES, ELECTRONIC DEVICES AND STORAGE MEDIA (granted Apr 14, 2026; 2y 5m to grant)
Patent 12548187: INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND STORAGE MEDIUM (granted Feb 10, 2026; 2y 5m to grant)
Patent 12456274: FACIAL EXPRESSION AND POSE TRANSFER UTILIZING AN END-TO-END MACHINE LEARNING MODEL (granted Oct 28, 2025; 2y 5m to grant)
Patent 12450810: ANIMATED FACIAL EXPRESSION AND POSE TRANSFER UTILIZING AN END-TO-END MACHINE LEARNING MODEL (granted Oct 21, 2025; 2y 5m to grant)
Patent 12439025: APPARATUS, SYSTEM, METHOD, STORAGE MEDIUM, AND FILE FORMAT (granted Oct 07, 2025; 2y 5m to grant)

Study what changed to get past this examiner. Based on the 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 69% (99% with interview, a +37.5% lift)
Median Time to Grant: 2y 7m
PTA Risk: High

Based on 16 resolved cases by this examiner. Grant probability is derived from the career allow rate.
