Prosecution Insights
Last updated: April 19, 2026
Application No. 18/595,096

CUSTOM IMAGE AND CONCEPT COMBINER USING DIFFUSION MODELS

Non-Final OA §103
Filed: Mar 04, 2024
Examiner: LHYMN, SARAH
Art Unit: 2613
Tech Center: 2600 — Communications
Assignee: Adobe Inc.
OA Round: 1 (Non-Final)

Grant Probability: 65% (Favorable)
Expected OA Rounds: 1-2
Median Time to Grant: 2y 4m
Grant Probability with Interview: 81%

Examiner Intelligence

Career Allow Rate: 65%, above average (357 granted / 546 resolved; +3.4% vs TC avg)
Interview Lift: strong, +15.2% for resolved cases with interview
Typical Timeline: 2y 4m average prosecution; 30 applications currently pending
Career History: 576 total applications across all art units

Statute-Specific Performance

§101: 5.4% (-34.6% vs TC avg)
§103: 63.2% (+23.2% vs TC avg)
§102: 5.9% (-34.1% vs TC avg)
§112: 15.3% (-24.7% vs TC avg)

Tech Center averages are estimates. Based on career data from 546 resolved cases.

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Election/Restrictions

Claims 10-16 are withdrawn (now cancelled) from further consideration pursuant to 37 CFR 1.142(b) as being drawn to a nonelected invention of Group II, there being no allowable generic or linking claim. Election was made without traverse in the reply filed on 13 February 2026.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1, 3, 9, 17 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Lin (U.S. Patent App. Pub. No. 2022/0036127 A1) in view of Anandkumar (U.S. Patent App. Pub. No. 2024/0095534 A1).

Regarding claim 1: Lin teaches: a computer-implemented method (para. 5, methods executed by computers), comprising: receiving a plurality of input modalities comprising multiple images and a text input in a natural language (Fig. 2: receiving input image 110 and language-based editing instruction 115; these are two input modalities (image and language). As to "multiple images," Anandkumar teaches that it is known for neural network systems to receive multiple images as inputs (claim 2, para. 573), along with text inputs (see claim 3). Lin is also neural-network relevant (para. 33)); generating image embeddings for the multiple images (Lin, Fig. 3A: 330, embed image feature maps from the input image(s)) and a text embedding for the text input (Lin, Fig. 3A: 335, embedding the text input); and generating an output image based on the image embeddings and the text embedding by a machine learning model…. (Fig. 3A: 350, construct a new image including modified visual attributes of the input image).

Regarding the output image comprising portions of the multiple images, Anandkumar teaches that modifications or variations of one or more images based on text prompts are known (paras. 135, 202). Using portions of multiple images to generate an output image, based also on a text prompt, per both references, is an obvious modification taught by the prior art and within the purview of one of ordinary skill in the art. Modifying the applied references to include multiple input images, per Anandkumar, to generate the output image of Lin is taught and suggested by the prior art, and would have been obvious and predictable to one of ordinary skill in the art as of the effective filing date of the claimed invention. See MPEP §2143(A).

The prior art included each element recited in claim 1, although not necessarily in a single embodiment, the only difference between the claimed invention and the prior art being the lack of actual combination of certain elements in a single prior art embodiment, as described above. One of ordinary skill in the art could have combined the elements as claimed by known methods, and in that combination each element merely performs the same function as it does separately. One of ordinary skill in the art would also have recognized that the results of the combination were predictable as of the effective filing date of the claimed invention.
Regarding claim 3: Lin and/or Anandkumar teach: the method of claim 1, wherein the plurality of input modalities includes at least one of: one or more images, one or more text inputs, and any combination thereof (Lin, Fig. 2: receiving input image 110 and language-based editing instruction 115, i.e., two input modalities (image and language)) (Anandkumar, para. 620, text and images as inputs). It would have been obvious for one of ordinary skill in the art, as of the effective filing date of Applicant's claims, to have further modified the applied reference(s) in view of the same to have obtained the above, motivated to make use of known machine learning to receive and/or modify inputs of varied style or type.

Regarding claim 9: Anandkumar teaches: the method of claim 1, wherein the machine learning model includes at least one of: a diffusion machine learning model, a generative machine learning model, and any combination thereof (para. 586, generative model). It would have been obvious for one of ordinary skill in the art, as of the effective filing date of Applicant's claims, to have further modified the applied reference(s) in view of the same to have obtained the above, motivated to make use of known machine learning to receive and/or modify inputs.

Regarding claim 17: see also claim 1. Lin teaches: a non-transitory computer-readable medium storing executable instructions which, when executed by a processing device, cause the processing device to perform operations (claim 1). The operations of claim 17 correspond to the method of claim 1; the same rationale for rejection applies.

Regarding claim 20: see claim 9. These claims are similar; the same rationale for rejection applies.

Claim(s) 2 is/are rejected under 35 U.S.C. 103 as being unpatentable over Lin in view of Anandkumar, and further in view of Kouris (U.S. Patent App. Pub. No. 2023/0128637).

Regarding claim 2: It would have been obvious for one of ordinary skill in the art to have combined and modified the applied reference(s), in view of the same, to have obtained: the method of claim 1, wherein the machine learning model has been trained using a reference image and a plurality of portions of the reference image by semantically arranging the plurality of portions of the reference image in accordance with a structure of the reference image; the results of the modification would have been obvious and predictable to one of ordinary skill in the art as of the effective filing date of the claimed invention. See MPEP §2143(A).

Anandkumar teaches that it is known to compare training outputs to a set of expected or desired outputs, such as ground truth in supervised learning (paras. 140-42), and to use labeled data (paras. 142, 587). Ground truth data corresponds to a reference image. Regarding a reference image whose portions are semantically arranged in accordance with the reference image structure, Kouris teaches training a system for semantic image segmentation whereby image portions are arranged by image structure (e.g., Figs. 5A-6A). Modifying the applied references in view of Kouris, such that the reference image and training are done via semantically arranging image portions in accordance with the reference image, motivated to train a system for effective image segmentation, e.g., to better process input images (Lin, Fig. 6B, the cat or sink), and/or for image segmentation, classification, or object detection (Anandkumar, para. 145), is taught and suggested by the prior art, and would have been obvious and predictable to one of ordinary skill in the art as of the effective filing date of the claimed invention. See MPEP §2143(A).
The prior art included each element recited in claim 2, although not necessarily in a single embodiment, the only difference between the claimed invention and the prior art being the lack of actual combination of certain elements in a single prior art embodiment, as described above. One of ordinary skill in the art could have combined the elements as claimed by known methods, and in that combination each element merely performs the same function as it does separately. One of ordinary skill in the art would also have recognized that the results of the combination were predictable as of the effective filing date of the claimed invention.

Claim(s) 4-7 and 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Lin in view of Anandkumar, and further in view of Qu (U.S. Patent App. Pub. No. 2025/0104309 A1).

Regarding claim 4: It would have been obvious for one of ordinary skill in the art to have combined and modified the applied reference(s), in view of the same, to have obtained: the method of claim 3, wherein the image embeddings are generated based on the one or more images and one or more image portions embeddings generated based on one or more portions of the one or more images; the results of the modification would have been obvious and predictable to one of ordinary skill in the art as of the effective filing date of the claimed invention. See MPEP §2143(A). Qu teaches that it is known to generate image embeddings based on one or more images (e.g., Figs. 4, 5 and 7) and one or more image portions embeddings (Fig. 7, objects 1, 2…N from an input image, used to generate a feature map for said object; see also para. 16).

Modifying the applied references to include the above features of Qu in the systems of Lin and Anandkumar, so as to be able to receive and process inputs of various modalities (all three references teach this), is taught and suggested by the prior art, and would have been obvious and predictable to one of ordinary skill. The prior art included each element recited in claim 4, although not necessarily in a single embodiment, the only difference between the claimed invention and the prior art being the lack of actual combination of certain elements in a single prior art embodiment, as described above. One of ordinary skill in the art could have combined the elements as claimed by known methods, and in that combination each element merely performs the same function as it does separately. One of ordinary skill in the art would also have recognized that the results of the combination were predictable as of the effective filing date of the claimed invention.

Regarding claim 5: Anandkumar teaches: the method of claim 4, further comprising converting the image embeddings and the one or more image portions embeddings to a uniform dimension (paras. 74-75, OpenAI's CLIP model generates embeddings with the same number of dimensions). Modifying the applied references, in view of the same, to use CLIP for embeddings, known to perform said task, is taught and suggested by the prior art, and would have been obvious and predictable to one of ordinary skill in the art as of the effective filing date of the claimed invention. See MPEP §2143(A). One of ordinary skill in the art could have combined the elements as claimed by known methods, and in that combination each element merely performs the same function as it does separately. One of ordinary skill in the art would also have recognized that the results of the combination were predictable as of the effective filing date of the claimed invention.
Regarding claim 6: Anandkumar teaches: the method of claim 4, wherein the one or more portions of the one or more images include one or more randomly generated portions of the one or more images (paras. 65-68, randomly generated images as a pre-processing step are known; applying this to the portions of images, per the mapping in claim 4, is taught and suggested by the prior art, and would have been obvious and predictable to one of ordinary skill in the art as of the effective filing date of the claimed invention. See MPEP §2143(A)). One of ordinary skill in the art could have combined the elements as claimed by known methods, and in that combination each element merely performs the same function as it does separately. One of ordinary skill in the art would also have recognized that the results of the combination were predictable as of the effective filing date of the claimed invention.

Regarding claim 7: Anandkumar teaches: the method of claim 6, wherein a size of each of the one or more portions of the one or more images is randomly determined (paras. 65-68, random cropping, zooming, scaling, or otherwise adjusting as a pre-processing step for images is known; applying this to the portions of images, per the mapping in claim 4, is taught and suggested by the prior art, and would have been obvious and predictable to one of ordinary skill in the art as of the effective filing date of the claimed invention. See MPEP §2143(A)). One of ordinary skill in the art could have combined the elements as claimed by known methods, and in that combination each element merely performs the same function as it does separately. One of ordinary skill in the art would also have recognized that the results of the combination were predictable as of the effective filing date of the claimed invention.

Regarding claim 18: see also claims 3 and 4. Claim 18 is a combination of claims 3 and 4, with an additional feature below, mapped to either Qu or Lin:

the non-transitory computer-readable medium of claim 17, wherein the plurality of input modalities include at least one of: one or more images, one or more text inputs, and any combination thereof (claim 3); wherein the image embeddings are generated based on the one or more images and one or more image portions embeddings generated based on one or more portions of the one or more images (claim 4); and one or more text embeddings generated based on the text input (Qu, Fig. 3, or Lin, Fig. 2).

It would have been obvious for one of ordinary skill in the art to have further modified the applied reference(s), in view of the same, to have obtained the above, and the results of the modification would have been obvious and predictable to one of ordinary skill in the art as of the effective filing date of the claimed invention. See MPEP §2143(A). The prior art included each element recited in claim 18, although not necessarily in a single embodiment, the only difference between the claimed invention and the prior art being the lack of actual combination of certain elements in a single prior art embodiment, as described above. One of ordinary skill in the art could have combined the elements as claimed by known methods, and in that combination each element merely performs the same function as it does separately. One of ordinary skill in the art would also have recognized that the results of the combination were predictable as of the effective filing date of the claimed invention.

Claim(s) 8 and 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Lin in view of Anandkumar, and further in view of Huang (U.S. Patent No. 11,463,455 B1).
Regarding claim 8: It would have been obvious for one of ordinary skill in the art to have combined and modified the applied reference(s), in view of the same, to have obtained: the method of claim 1, further comprising assigning one or more weights to each of the image embeddings and the text embedding, each weight in the one or more weights being associated with a probability of not using at least one of the image embeddings and the text embedding; the results of the modification would have been obvious and predictable to one of ordinary skill in the art as of the effective filing date of the claimed invention. See MPEP §2143(A).

Huang teaches that it is known to assign weights to embeddings (e.g., a character embedding, per claim 3). Each weight is associated with a probability (example: claim 3, a probability of obfuscation). Huang also teaches using a Softmax function (C9, last partial paragraph, to C10), a mathematical function that converts a tuple of real numbers into a probability distribution over those numbers. See also Huang, C11, last partial paragraph, to C12. While the probability of interest in Huang relates to obfuscation, this is a non-limiting use of Softmax (which is not limited to any specific intended-use probability). Modifying the applied references to apply Huang's teaching of probabilities assigned to embeddings to the embeddings mapped in claim 1, including the probability of not using one of the embeddings, is taught and suggested by the prior art, and would have been obvious and predictable to one of ordinary skill. The choice of probability also would have been an obvious design choice for one of ordinary skill, based on the intended use or output design of the system and design preferences. Also note: for claim interpretation purposes, Applicant's specification describes "not using" probabilities in the context of training, not output image generation. See [0028] of the specification as filed. For weights assigned in training, see Anandkumar, paras. 126, 139.

The prior art included each element recited in claim 8, although not necessarily in a single embodiment, the only difference between the claimed invention and the prior art being the lack of actual combination of certain elements in a single prior art embodiment, as described above. One of ordinary skill in the art could have combined the elements as claimed by known methods, and in that combination each element merely performs the same function as it does separately. One of ordinary skill in the art would also have recognized that the results of the combination were predictable as of the effective filing date of the claimed invention.

Regarding claim 19: see claim 8. These claims are similar; the same rationale for rejection applies.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure, relevant to machine learning and image/text manipulations.

* * * * *

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Sarah Lhymn, whose telephone number is (571) 270-0632. The examiner can normally be reached M-F, 9:00 AM to 6:00 PM EST.

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Xiao Wu, can be reached at 571-272-7761. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Sarah Lhymn/
Primary Examiner, Art Unit 2613
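The claim 8 rejection describes the Softmax function as converting a tuple of real numbers into a probability distribution over those numbers. A minimal editorial sketch of that behavior (not part of the office action, and not Huang's actual implementation), assuming a plain NumPy formulation:

```python
import numpy as np

def softmax(x):
    """Convert a sequence of real numbers into a probability distribution.

    Subtracting the max before exponentiating is a standard numerical-stability
    trick; it does not change the result, since softmax is shift-invariant.
    """
    x = np.asarray(x, dtype=float)
    e = np.exp(x - x.max())
    return e / e.sum()

# Example: three arbitrary weights become non-negative values summing to 1,
# with the largest input receiving the largest probability.
probs = softmax((2.0, 1.0, 0.1))
```

This illustrates why Softmax output can be read as a probability over any set of scores, whatever the "intended use" of that probability is in a given reference.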

Prosecution Timeline

Mar 04, 2024 — Application Filed
Mar 18, 2026 — Non-Final Rejection, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602882
AUGMENTED REALITY DISPLAY DEVICE AND AUGMENTED REALITY DISPLAY SYSTEM
Granted Apr 14, 2026 (2y 5m to grant)

Patent 12602764
METHODS OF ARTIFICIAL INTELLIGENCE-ASSISTED INFRASTRUCTURE ASSESSMENT USING MIXED REALITY SYSTEMS
Granted Apr 14, 2026 (2y 5m to grant)

Patent 12602746
SYSTEM AND METHOD FOR BACKGROUND MODELLING FOR A VIDEO STREAM
Granted Apr 14, 2026 (2y 5m to grant)

Patent 12585888
AUTOMATICALLY GENERATING DESCRIPTIONS OF AUGMENTED REALITY EFFECTS
Granted Mar 24, 2026 (2y 5m to grant)

Patent 12586163
INTERACTIVELY REFINING A DIGITAL IMAGE DEPTH MAP FOR NON DESTRUCTIVE SYNTHETIC LENS BLUR
Granted Mar 24, 2026 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 65%
With Interview: 81% (+15.2%)
Median Time to Grant: 2y 4m
PTA Risk: Low
Based on 546 resolved cases by this examiner. Grant probability derived from career allow rate.
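The headline projections can be reproduced from the raw counts reported above. A minimal sketch, assuming the grant probability is simply granted/resolved and the with-interview figure adds the +15.2% lift as percentage points (the page does not state its exact formula):

```python
granted, resolved = 357, 546      # examiner's career counts, from above
interview_lift = 15.2             # reported lift, in percentage points

allow_rate = granted / resolved * 100        # career allow rate, percent

print(round(allow_rate))                     # 65 -> "Grant Probability: 65%"
print(round(allow_rate + interview_lift))    # 81 -> "With Interview: 81%"
```

Both rounded values match the dashboard's displayed figures, which supports the additive reading of the "+15.2%" label.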
