Prosecution Insights
Last updated: May 29, 2026
Application No. 18/485,358

GENERATING EDITABLE TEMPLATES FOR DESIGNS

Final Rejection §103
Filed: Oct 12, 2023
Examiner: AUGUSTINE, NICHOLAS
Art Unit: 2178
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Microsoft Technology Licensing, LLC
OA Round: 2 (Final)
Grant Probability: 73% (Favorable)
OA Rounds: 3-4
Est. Remaining: 1y 0m
With Interview: 99%

Examiner Intelligence

Grants 73% (above average)
Career Allowance Rate: 73% (599 granted / 818 resolved; +18.2% vs TC avg)
Interview Lift: +27.9% (strong; resolved cases with interview compared with those without)
Typical timeline: 3y 8m avg prosecution; 23 currently pending
Career history: 862 total applications across all art units
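
The career allowance rate above can be checked from the raw counts. The sketch below is only an arithmetic check; it assumes that "+18.2% vs TC avg" is a difference in percentage points (the page does not say), and the variable names are illustrative.

```python
# Figures taken from the Examiner Intelligence summary above.
granted = 599
resolved = 818

# Career allowance rate as a percentage of resolved cases.
allowance_rate = 100 * granted / resolved

# ASSUMPTION: "+18.2% vs TC avg" is read as percentage points, not a relative change.
delta_vs_tc_avg = 18.2
implied_tc_avg = allowance_rate - delta_vs_tc_avg

print(f"Allowance rate: {allowance_rate:.1f}%")
print(f"Implied Tech Center average: {implied_tc_avg:.1f}%")
```

Under that reading the Tech Center average would be about 55%; under a relative reading it would be different, so treat the second figure as an inference rather than a reported value.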

Statute-Specific Performance

§101: 0.6% (-39.4% vs TC avg)
§103: 45.4% (+5.4% vs TC avg)
§102: 53.4% (+13.4% vs TC avg)
§112: 0.2% (-39.8% vs TC avg)
Tech Center average is an estimate. Based on career data from 818 resolved cases.
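
As a consistency check on the statute figures above, the sketch below backs out the Tech Center baseline implied by each row. It assumes each "vs TC avg" value is a percentage-point difference; the page does not define what the per-statute percentages measure, so the code makes no claim about that.

```python
# (examiner figure, difference vs TC avg), both in percent, copied from the list above.
statute_rows = {
    "§101": (0.6, -39.4),
    "§103": (45.4, 5.4),
    "§102": (53.4, 13.4),
    "§112": (0.2, -39.8),
}

# ASSUMPTION: the difference is in percentage points, so baseline = figure - difference.
implied_baseline = {
    statute: round(figure - delta, 1)
    for statute, (figure, delta) in statute_rows.items()
}

for statute, baseline in implied_baseline.items():
    print(f"{statute}: implied TC average {baseline}%")
```

Every row yields the same baseline, which suggests the page compares each statute against a single flat Tech Center estimate rather than a statute-specific one.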

Office Action

§103
DETAILED ACTION

A. This action is in response to the following communications: Amendment filed 01/20/2026. This action is made Final.
B. Claims 1-20 remain pending.
C. 35 USC 101 and 102 withdrawn due to amendment and remarks.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Jindal, Nipun et al. (US Pub. 2025/0061610 A1), herein referred to as “Jindal” in view of Glenn E. Sugden, (US Pub. 2025/0095222 A1), herein referred to as “Sugden”.

As for claims 1, 12 and 20, Jindal teaches:

A data processing system and corresponding method of claim 12 increasing a design template library supporting a design recommendation feature in a productivity application comprising: a processor, and a memory storing executable instructions which, when executed by the processor, cause the processor alone or in combination with other processors to perform the following functions (par. 28 hardware and software environment for implementing text to image generation):

based on a list of design purposes, generate prompts requesting a Large Language Model (LLM) (par. 56 LLM examples) to produce corresponding prompts for input to a text-to-image model to generate a proposed design corresponding to each design purpose (par. 26 utilize both a character-level encoder and a prompt encoder. Embodiments extract scene text from a text prompt and encode the scene text at the character level, and combine this character-level encoding with the prompt encoding to condition image generation; par. 55 using Contrastive Language-Image Pre-Training (CLIP) which is a neural network that is trained to efficiently learn visual concepts from natural language supervision; thus concepts are equivalent to design purposes; par. 76 gives examples of building/creating/storing image features or “designs”; these features are saved in the model for lookup to find designs related to user text prompts in the future: “For example, during training, guided latent diffusion model 500 may take an original image 505 in a pixel space 510 as input and apply and image encoder 515 to convert original image 505 into original image features 520 in a latent space 525. Then, a forward diffusion process 530 gradually adds noise to the original image features 520 to obtain noisy features 535 (also in latent space 525) at various noise levels.”);

submit the prompts from the LLM to the text-to-image model (par. 67 image generation through text prompts);

receive the proposed designs from the text-to-image model (par. 67 one example of proposed design is to render scene text to an image; an image that includes the scene text based on the prompt embedding and the character-level embedding);

removing text generated by the text-to-image model from within the plurality of proposed designs in an image separation pipeline by: using an Optical Character Recognition (OCR) tool to identify the text in the plurality of proposed designs, using a Segment Anything Model (SAM) to identify a text mask for the text used to remove the text, and using an inpainting tool to fill in the plurality of proposed designs where the text was removed to produce a plurality of textless designs (par. 74 image inpainting to remove/add text to images; par. 79 use of segmentation map and/or mask to remove text or other content from an image using reverse diffusion process 540); and

return the plurality of editable designs via the UI to the user; and

determine, by a quality control review, a subset of the plurality of editable designs that will be added to a design template library (par. 55 adding to the CLIP model/“design template library”; A CLIP model can be applied to nearly arbitrary visual classification tasks so that the model may predict the likelihood of a text description being paired with a particular image, removing the need for users to design their own classifiers and the need for task-specific training data. For example, a CLIP model can be applied to a new task by inputting names of the task's visual concepts to the model's text encoder. The model can then output a linear classifier of CLIP's visual representations; Also par. 74 Diffusion models are a class of generative neural networks which can be trained to generate new data with features similar to features found in training data; the image features also serve the same function as the claim limitation “design”).
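
The text-removal limitation quoted above names three stages in order: OCR to locate text, a segmentation model to produce a text mask, and an inpainting tool to fill the masked region. The sketch below only illustrates that data flow on a toy grid of integers. It is not the applicant's implementation and not Jindal's: every function name, the TEXT marker value, and the toy stand-ins for the OCR, SAM, and inpainting stages are hypothetical.

```python
from collections import Counter
from typing import Callable, List, Tuple

Image = List[List[int]]          # toy image: rows of pixel values
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive
Mask = List[List[bool]]

TEXT = 9  # hypothetical pixel value standing in for rendered text


def toy_ocr(image: Image) -> List[Box]:
    """Stand-in for an OCR tool: one bounding box around all TEXT pixels."""
    hits = [(r, c) for r, row in enumerate(image)
            for c, v in enumerate(row) if v == TEXT]
    if not hits:
        return []
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return [(min(rows), min(cols), max(rows) + 1, max(cols) + 1)]


def toy_segmenter(image: Image, boxes: List[Box]) -> Mask:
    """Stand-in for a SAM-style model: refine each box to a per-pixel mask."""
    mask = [[False] * len(row) for row in image]
    for top, left, bottom, right in boxes:
        for r in range(top, bottom):
            for c in range(left, right):
                mask[r][c] = image[r][c] == TEXT
    return mask


def toy_inpaint(image: Image, mask: Mask) -> Image:
    """Stand-in for an inpainting tool: fill masked pixels with the most
    common unmasked value."""
    kept = [v for row, mrow in zip(image, mask)
            for v, m in zip(row, mrow) if not m]
    fill = Counter(kept).most_common(1)[0][0] if kept else 0
    return [[fill if m else v for v, m in zip(row, mrow)]
            for row, mrow in zip(image, mask)]


def remove_text(image: Image,
                ocr: Callable[[Image], List[Box]],
                segmenter: Callable[[Image, List[Box]], Mask],
                inpainter: Callable[[Image, Mask], Image]) -> Image:
    """Chain the three stages in the order the claim recites them."""
    boxes = ocr(image)
    mask = segmenter(image, boxes)
    return inpainter(image, mask)


design = [
    [1, 1, 1, 1],
    [1, 9, 9, 1],
    [1, 1, 9, 1],
]
textless = remove_text(design, toy_ocr, toy_segmenter, toy_inpaint)
```

Passing the stages in as callables keeps the ordering explicit, which is the point at issue in the rejection: the examiner maps a segmentation map plus reverse diffusion onto these stages, while the claim recites three distinct tools.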
Examiner notes that Jindal provides different examples of claim limitations and recommends an amendment that narrows the limitations specifically with functional language that is different than what Jindal teaches. Jindal teaches a version of compiling text in at least paragraph 72, which states that some examples further include identifying an additional scene text. Some examples further include generating an additional character-level embedding based on the additional scene text, wherein the image is generated based on the additional character-level embedding. For example, the additional scene text may be another text object that is extracted from the prompt. A text object is a set of text that may include multiple lines and is intended to be placed in a same region based on the prompt. For example, a prompt describing a scene text including text to be displayed on a handheld sign and text to be displayed on a neon billboard may include two text objects.

In an attempt to advance prosecution and in the same field of endeavor Sugden teaches:

compiling a list of design purposes received via a user interface (UI) to a design system either from a user or from user queries (par. 74 user inputs various design purposes, attributes that denote the final design of the image from the text prompt (e.g. abstract and green));

generate a prompt using a prompt generator instructing a Large Language Model (LLM) to produce a plurality of design prompts for input to a text-to-image model, the plurality of design prompts instructing the text-to-image model to generate a plurality of proposed designs corresponding to the list of design purposes (par. 75 the user may adjust and/or revise the image prompt in the image prompt reviewer 618. For example, the text box in which the image prompt is displayed may be an interactive text box. The user may add and/or remove text from the image prompt in the text box of the image prompt reviewer 618. This may allow the user to revise the prompt to fine-tune the resulting image based on the prompt);

submit the prompt from the prompt generator to the LLM; submit the plurality of design prompts generated by the LLM to the text-to-image model; receive the plurality of proposed designs from the text-to-image model (fig. 6 and 7 depict the user input into the prompt generator and passing the prompt to the LLM for image generation).

Jindal teaches: input the plurality of proposed designs to an image separation pipeline to produce a plurality of textless designs; input the plurality of textless designs to a text generation and placement model to produce a plurality of editable designs (par. 74 image inpainting to remove/add text to images; par. 79 segmentation and mask used to remove text).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Sugden into Jindal because Sugden suggests in paragraph 16 methods for generating consumable content, such as digital images (or simply “images”), using a generative artificial intelligence (AI) model. The generative AI model may generate images using a set of descriptors, such as input words. Using the descriptors, a prompt engine may generate an image prompt for the generative AI model.

As for claims 2 and 13. The system of claim 1, Jindal teaches, wherein the text-to-image model is a diffusion model (par. 76 diffusion model).

As for claims 3 and 13. The system of claim 1, Jindal teaches, wherein the LLM is a Generative Pretrained Transformer (GPT) model (par. 66, LLM GPT).

As for claims 4 and 14. The system of claim 1, Jindal teaches, wherein the instructions further cause the processor to remove text generated by the text-to-image model in the plurality of proposed designs (par. 72, generating an additional scene text and means to do so; par. 77 and 79 the reverse diffusion process gradually removes noise which includes text through OCR recognition which saves text as encoded data; another example is par. 55-57 where the text to be rendered on the image is removed/extracted from the prompt by text decomposer and then later rendered on the image as scene text as input to the character-level encoder). Examiner recommends clarification amendment as this limitation can be interpreted multiple ways as shown by the prior art.

As for claims 5 and 15. The system of claim 4, Jindal teaches, wherein removing the text generated by the text-to-image model comprises: using an Optical Character Recognition (OCR) tool to identify the text in the plurality of proposed designs (par. 53 and 72 using OCR component for text with images through training or generating images to create “scene text”); using a Segment Anything Model (SAM) to identify a text mask for the text used to remove the text (par. 79 segmentation map is used to guide reverse diffusion process 540; a Segment Anything Model as known in the art is a foundational model that generates segmentation masks similar to a segmentation map); and using an inpainting tool to fill in the proposed design where the text was removed (par. 74 image inpainting to remove/add text to images).

As for claims 6 and 16. The system of claim 4, Jindal teaches, wherein the executable instructions further cause the processor to use a text generation/placement model to add the text back to the plurality of proposed designs (par. 101 the system generates an image that includes the scene text based on the prompt embedding and the character-level embedding).

As for claims 7 and 17. The system of claim 6, Jindal teaches, wherein the text generation/placement model uses text attributes from the plurality of proposed designs as output by the text-to-image model (par. 92 user describes what the output image should include, which is scene text through guidance prompt).

As for claims 8 and 18. The system of claim 6, Jindal teaches, wherein the added text is in a text box that is editable (par. 101 support of iterative diffusion process allows the user to update/add to the image outputted in the GUI; par. 98 the system may display a prompt text field to a user via a GUI, and the user may input the prompt via the prompt text field).

As for claims 9 and 19. The system of claim 6, Jindal teaches, wherein the text generation/placement model corrects typographical or other errors from text in the plurality of proposed designs as generated by the text-to-image model (par. 72 and 115 using OCR to aid in the error checking for spelling and legibility).

As for claim 10. The system of claim 1, Jindal teaches, wherein the executable instructions further cause the processor to associate metadata with the plurality of proposed designs added to the template library to facilitate retrieval of the plurality of proposed designs based on a user query (par. 55 and 72 utilization of CLIP for classification wherein classification functions similar to metadata/tagging).

As for claim 11. The system of claim 1, Jindal teaches, wherein the executable instructions further cause the processor to complete a quality control review workflow on the plurality of proposed designs before a subset of the plurality of proposed designs is added to the template library (par. 72 and 77 the reverse diffusion process gradually removes noisy features (designs) at various noise levels in a latent space; comparing versions of images to train the model is equivalent to a corrective quality control, as it is stated that “the denoised image features 545 are compared to the original image features 520 at each of the various noise levels, and parameters of the reverse diffusion process 540 of the diffusion model are updated based on the comparison.”).
(Note:) It is noted that any citation to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. In re Heck, 699 F.2d 1331, 1332-33, 216 USPQ 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 USPQ 275, 277 (CCPA 1968)).

Response to Arguments

Applicant’s arguments with respect to claim(s) 1-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Inquiries

Any inquiry concerning this communication should be directed to NICHOLAS AUGUSTINE at telephone number (571)270-1056. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

/NICHOLAS AUGUSTINE/
Primary Examiner, Art Unit 2178
May 5, 2026
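
The reply periods stated in the Conclusion are all counted from the mailing date. As a rough illustration, the sketch below computes the two-month, three-month, and six-month marks, assuming the mailing date is May 8, 2026 as listed in the Prosecution Timeline on this page (the action itself is signed May 5, 2026). The add_months helper is hypothetical, and the sketch does not account for weekends, federal holidays, or the advisory-action exception, so it is not a docketing tool.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the same day-of-month `months` later, clamped to month length."""
    total = start.month - 1 + months
    year = start.year + total // 12
    month = total % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


# ASSUMPTION: mailing date taken from the Prosecution Timeline entry on this page.
mailing_date = date(2026, 5, 8)

two_month_mark = add_months(mailing_date, 2)       # first-reply window for the advisory-action rule
shortened_period_end = add_months(mailing_date, 3)  # shortened statutory period
statutory_maximum = add_months(mailing_date, 6)     # latest possible reply date

print("Two-month mark:      ", two_month_mark.isoformat())
print("Three-month deadline:", shortened_period_end.isoformat())
print("Six-month maximum:   ", statutory_maximum.isoformat())
```

Replies after the three-month date and before the six-month maximum would require an extension of time under 37 CFR 1.136(a), as the action notes.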

Prosecution Timeline

Aug 22, 2024: Response after Non-Final Action
Oct 21, 2025: Non-Final Rejection mailed (§103)
Dec 08, 2025: Interview Requested
Dec 15, 2025: Examiner Interview Summary
Dec 15, 2025: Applicant Interview (Telephonic)
Jan 20, 2026: Response Filed
May 08, 2026: Final Rejection mailed (§103)
May 26, 2026: Interview Requested

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12625521: VIRTUAL INPUT DEVICE, VIRTUAL INPUT SYSTEM, VIRTUAL INPUT METHOD, AND RECORDING MEDIUM (2y 4m to grant; granted May 12, 2026)
Patent 12614922: TESTING OF A DISTRIBUTED ENERGY RESOURCE (DER) MANAGEMENT SYSTEM (DERMS) (2y 10m to grant; granted Apr 28, 2026)
Patent 12598212: Cybersecurity Risk Analysis and Modeling of Risk Data on an Interactive Display (2y 8m to grant; granted Apr 07, 2026)
Patent 12584752: VISUAL VEHICLE-POSITIONING FUSION SYSTEM AND METHOD THEREOF (2y 3m to grant; granted Mar 24, 2026)
Patent 12586264: WORD EVALUATION VALUE ACQUISITION METHOD, APPARATUS AND PROGRAM (2y 3m to grant; granted Mar 24, 2026)
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 73%
With Interview: 99% (+27.9%)
Median Time to Grant: 3y 8m (~1y 0m remaining)
PTA Risk: Moderate
Based on 818 resolved cases by this examiner. Grant probability derived from career allowance rate.
