Prosecution Insights
Last updated: April 19, 2026
Application No. 18/677,087

SYSTEM FOR CONTEXTUAL IMAGE EDITING

Status: Non-Final Office Action (§102)
Filed: May 29, 2024
Examiner: LI, JAI WEI TOMMY
Art Unit: 2613
Tech Center: 2600 — Communications
Assignee: Microsoft Technology Licensing, LLC
OA Round: 1 (Non-Final)
Grant Probability: Favorable
Expected OA Rounds: 1-2
Estimated Time to Grant: 2y 9m

Examiner Intelligence

Career Allow Rate: 0% (grants only 0% of cases; 0 granted / 0 resolved; -62.0% vs TC avg)
Interview Lift: +0.0% (minimal lift, comparing resolved cases with vs. without an interview)
Avg Prosecution (typical timeline): 2y 9m
Total Applications (career history): 9 across all art units, with 9 currently pending
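
The tile figures above are simple ratios over the examiner's resolved cases. As an illustration only (the report does not state its exact methodology), here is a minimal Python sketch of how a career allow rate and an interview lift could be computed from hypothetical resolved-case records; the field names are assumptions, not part of the underlying data model.

    # Illustrative sketch: each resolved case is assumed to be a dict with
    # "granted" (bool) and "had_interview" (bool). These names are hypothetical.

    def allow_rate(cases):
        """Share of resolved cases that were granted (0.0 if none resolved)."""
        if not cases:
            return 0.0
        return sum(c["granted"] for c in cases) / len(cases)

    def interview_lift(cases):
        """Allow rate for cases with an interview minus allow rate without one."""
        with_iv = [c for c in cases if c["had_interview"]]
        without_iv = [c for c in cases if not c["had_interview"]]
        return allow_rate(with_iv) - allow_rate(without_iv)

    resolved_cases = []  # this examiner currently shows 0 resolved cases
    print(f"Career allow rate: {allow_rate(resolved_cases):.1%}")    # 0.0%
    print(f"Interview lift: {interview_lift(resolved_cases):+.1%}")  # +0.0%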

Statute-Specific Performance

§103: 46.2% (+6.2% vs TC avg)
§102: 53.9% (+13.9% vs TC avg)
Figures are compared against a Tech Center average estimate and are based on career data from 0 resolved cases.
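
The "vs TC avg" deltas are just the examiner's per-statute rate minus the Tech Center average estimate; working backwards from the displayed numbers, that estimate is about 40.0% for both statutes here (53.9% - 13.9% and 46.2% - 6.2%). A minimal Python sketch of the comparison follows, with the 40.0% baseline treated as an inferred assumption rather than a stated figure.

    # The 40.0% Tech Center baseline below is inferred from the displayed
    # deltas; it is not stated directly in the report.
    TC_AVG_ESTIMATE = 0.400

    examiner_rates = {"§102": 0.539, "§103": 0.462}

    for statute, rate in examiner_rates.items():
        delta = rate - TC_AVG_ESTIMATE
        print(f"{statute}: {rate:.1%} ({delta:+.1%} vs TC avg)")
    # §102: 53.9% (+13.9% vs TC avg)
    # §103: 46.2% (+6.2% vs TC avg)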

Office Action (Non-Final Rejection — §102)

DETAILED ACTION Notice of Pre-AIA or AIA Status The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA . Specification The disclosure is objected to because of the following informalities: missing labeling across multiple figures. Appropriate correction is required. Regarding paragraph 36, “Computing device 111” is not properly labeled on figure 1. Appropriate correction is required. Regarding paragraph 92, 99, and 102, “Computing device 901” is incorrectly being displayed as “COMPUTING SYSTEM 901” on figure 9. Appropriate correction is required. Claim Rejections - 35 USC § 102 In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action: A person shall be entitled to a patent unless – (a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention. Claim(s) 1-20 is/are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Singh et al. (U.S. Pub. No. 2024/0135511). Regarding claim 1, Singh discloses a computing apparatus (Fig. 82, 8200; also, paragraph 837, line(s) 4-5 "FIG. 82 shows the scene-based image editing system 106 implemented by a computing device 8200") comprising: one or more computer readable storage media; one or more processors operatively coupled with the one or more computer readable storage media (Fig. 82, also, paragraph 843, line(s) 3-5 "the components 8202-8216 include one or more instructions stored on a computer-readable storage medium and executable by processors"); and program instructions stored on the one or more computer readable storage media that, when executed by the one or more processors (Fig. 82, also, paragraph 843, line(s) 3-5 "the components 8202-8216 include one or more instructions stored on a computer-readable storage medium and executable by processors"), direct the computing apparatus to at least: receive an image generated by a generative artificial intelligence (AI) model in response to a prompt (paragraph 3, line(s) 3-10 "non-transitory computer-readable media that implement artificial intelligence models to facilitate flexible and efficient scene-based image editing. 
To illustrate, in one or more embodiments, a system utilizes one or more machine learning models to learn/identify characteristics of a digital image, anticipate potential edits to the digital image, and/or generate supplementary components that are usable in various edits"; also, paragraph 3, line(s) 24-26 "system utilizes various generative models and various instances of artificial intelligence to generate modified digital images or animations"; also, paragraph 393, line(s) "the scene-based image editing system 106 provides a prompt for entry of textual user input.") wherein the prompt includes a natural language request from a user (paragraph 393, line(s) 11-12 "the scene-based image editing system 106 provides a prompt for entry of textual user input"; also, paragraph 583, line(s) 7-9 "106 provides an option to the user of the client device to query a portion to be selected (via a selection or a natural language input)."; also, paragraph 584, line(s) 5-7 "the scene-based image editing system 106 can utilize the segmentation machine learning model 4404 to analyze user inputs (e.g., clicks, drawings, outlines, or natural language)"); display an image canvas of the image comprising multiple segments of the image identified by a segmentation model (Fig. 3, 318-324, 326; also, paragraph 8, line(s) 1 "FIG. 3 illustrates a segmentation neural network utilized by the scene-based image editing system to generate object masks for objects in accordance with one or more embodiments"; also, paragraph 159, line(s) 8-9 "To illustrate, FIG. 3 shows a label 318 for woman, a label 320 for bird, and a label 322 for man"); and display a context menu of options to edit the image based on a selection of a segment of the image by the user (Fig. 8B, 814; also, paragraph 246, line(s) 1-12 "As further illustrated, based on detecting the user interaction for selecting the object 808d, the scene-based image editing system 106 provides an option menu 814 for display via the graphical user interface 802. The option menu 814 shown in FIG. 8B provides a plurality of options, though the option menu includes various numbers of options in various embodiments. For instance, in some implementations, the option menu 814 includes one or more curated options, such as options determined to be popular or used with the most frequency. For example, as shown in FIG. 8B, the option menu 814 includes an option 816 to delete the object 808d"). Regarding claim 2, Singh discloses a computing apparatus of claim 1, wherein the program instructions further direct the computing apparatus to display a second context menu of options to edit the image based on a selection of a canvas editing button by the user (Fig. 8B, 814; also, paragraph 246, line(s) 1-12 "As further illustrated, based on detecting the user interaction for selecting the object 808d, the scene-based image editing system 106 provides an option menu 814 for display via the graphical user interface 802. The option menu 814 shown in FIG. 8B provides a plurality of options, though the option menu includes various numbers of options in various embodiments. For instance, in some implementations, the option menu 814 includes one or more curated options, such as options determined to be popular or used with the most frequency. For example, as shown in FIG. 8B, the option menu 814 includes an option 816 to delete the object 808d"). 
Regarding claim 3, Singh discloses a computing apparatus of claim 1, wherein the context menu comprises options based on a semantic understanding of the segment identified by the segmentation model (Fig. 3, 318-324, 326; also, paragraph 8, line(s) 1 "FIG. 3 illustrates a segmentation neural network utilized by the scene-based image editing system to generate object masks for objects in accordance with one or more embodiments"; also, paragraph 159, line(s) 8-9 "To illustrate, FIG. 3 shows a label 318 for woman, a label 320 for bird, and a label 322 for man"; also, Fig. 8B, 814' also, paragraph 246, line(s) 1-12 "As further illustrated, based on detecting the user interaction for selecting the object 808d, the scene-based image editing system 106 provides an option menu 814 for display via the graphical user interface 802. The option menu 814 shown in FIG. 8B provides a plurality of options, though the option menu includes various numbers of options in various embodiments. For instance, in some implementations, the option menu 814 includes one or more curated options, such as options determined to be popular or used with the most frequency. For example, as shown in FIG. 8B, the option menu 814 includes an option 816 to delete the object 808d"). Regarding claim 4, Singh discloses a computing apparatus of claim 3, wherein the program instructions further direct the computing apparatus to receive a selection of a second segment of the image by the user (Fig. 8A-8D; also, paragraph 12, line(s) 1-4 "FIG. 7 illustrates a diagram for generating object masks and content fills to facilitate object-aware modifications to a digital image in accordance with one or more embodiments;"; also, Fig. 3, 318-324, 326; also, paragraph 8, line(s) 1-4 "FIG. 3 illustrates a segmentation neural network utilized by the scene-based image editing system to generate object masks for objects in accordance with one or more embodiments"; also, paragraph 164, line(s) 4-9 "the scene-based image editing system 106 receives input from a user indicating a selection of one of the detected objects. To illustrate, in the implementation shown, the scene-based image editing system 106 receives user input 312 of the user selecting bounding boxes 321 and 323") and to display a second context menu of options to edit the image, wherein the second context menu includes at least one option to edit the image that is different from the options of the context menu (paragraph 236, line(s) 7-11 "it should be understood that the graphical user interface 802 displays at least some menus, options, or other visual elements in various embodiments—at least when the digital image 806 is initially displayed."; also, Fig. 20A-20C, 2002, 2004, 2008, 2010, 2012a-2012c; also, paragraph 397, 1-12 line(s) "FIGS. 20A-20C illustrate another graphical user interface implemented by the scene-based image editing system 106 to facilitate modifying object attributes of objects portrayed in a digital image in accordance with one or more embodiments. As shown in FIG. 20A, the scene-based image editing system 106 provides a digital image 2006 portraying an object 2008 for display within a graphical user interface 2002 of a client device 2004. Further, upon detecting a user interaction with the object 2008, the scene-based image editing system 106 provides an attribute menu 2010 having attribute object indicators 2012a-2012c listing object attributes of the object 2008."; also, Fig. 21A-C, 2102, 2104, 2108, 2110, 2112a-2112c; also, paragraph 401, line(s) 1-12 "FIGS. 
21A-21C illustrate another graphical user interface implemented by the scene-based image editing system 106 to facilitate modifying object attributes of objects portrayed in a digital image in accordance with one or more embodiments. As shown in FIG. 21A, the scene-based image editing system 106 provides a digital image 2106 portraying an object 2108 for display within a graphical user interface 2102 of a client device 2104. Further, upon detecting a user interaction with the object 2108, the scene-based image editing system 106 provides an attribute menu 2110 having attribute object indicators 2112a-2012c listing object attributes of the object 2108."). Regarding claim 5, Singh discloses a computing apparatus of claim 3, wherein the program instructions further direct the computing apparatus to prompt the generative AI model to regenerate the image based on a selection by the user of an option in the context menu (Fig. 45A-52B; also, paragraph 531, line(s) 1-23 "the scene-based image editing system 106 provides options to generate the modified digital image 4012 within a user interface. For instance, the scene-based image editing system 106 provides client devices with more flexibility and controllability during editing. The user interface has various options to indicate the type of modification (e.g., expanding the digital image or removing an object) and the scene-based image editing system 106 further provides customizable options for modifying the digital image 4002. For example, the scene-based image editing system 106 can generate a diverse pool of possible solutions for user selection. Moreover, the scene-based image editing system 106 can allow a client device to provide user input of strokes, style guides, or color patches at a desired region to guide the image generation. Indeed, by including a style input within a region of digital image 4002, the scene-based image editing system 106 can utilize a machine learning model to expand/apply the style to the entire semantic region in generating the modified digital image 4012. Similarly, by considering user input/modification of semantic regions, the scene-based image editing system 106 can flexibly guide generation of completed digital images to accurately reflect desired features."; also, paragraph 533, line(s) 18-24 "the scene-based image editing system 106 can receive a variety of user inputs, such as user input of a mask identifying an object or region to replace, selection of an object to remove, an area or region to expand, or a style/color patch to expand within an input region."). Regarding claim 6, Singh discloses a computing apparatus of claim 1, wherein to display the context menu of options, the program instructions further direct the computing apparatus to identify an option for display based on contextual information about the segment received from the segmentation model (Fig. 25A-25D; also, paragraph 30, line(s) 1-5 "FIGS. 25A-25D illustrate a graphical user interface implemented by the scene-based image editing system to add objects to a selection for modification based on classification relationships in accordance with one or more embodiments"; also, paragraph 441, line(s) 1-6 "FIG. 25A further illustrates semantic scene graph components 2510a-2510c from a semantic scene graph of the digital image 2506. 
Indeed, the semantic scene graph components 2510a-2510c include portions of a semantic scene graph providing a hierarchy of object classifications for each of the objects 2508a-2508g."; also, paragraph 442, line(s) 1-23 "As shown in FIG. 25A, the semantic scene graph component 2510a includes a node 2512 representing a clothing class, a node 2514 representing an accessory class, and a node 2516 representing a shoe class. As further shown, the accessory class is a subclass of the clothing class, and the shoe class is a subclass of the accessory class. Similarly, the semantic scene graph component 2510b includes a node 2518 representing the clothing class, a node 2520 representing the accessory class, and a node 2522 representing a glasses class, which is a subclass of the accessory class. Further, the semantic scene graph component 2510c includes a node 2524 representing the clothing class and a node 2526 representing a coat class, which is another subclass of the clothing class. Thus, the semantic scene graph components 2510a-2510c provide various classifications that apply to each of the objects 2508a-2508g. In particular, the semantic scene graph component 2510a provides a hierarchy of object classifications associated with the shoes presented in the digital image 2506, the semantic scene graph component 2510b provides a hierarchy of object classifications associated with the pairs of glasses, and the semantic scene graph component 2510c provides a hierarchy of object classifications associated with the coat."; also, paragraph 443, 1-8 line(s) "As shown in FIG. 25B, the scene-based image editing system 106 detects a user interaction selecting the object 2508e. Further, the scene-based image editing system 106 detects a user interaction selecting the object 2508b. As further shown, in response to detecting the selection of the object 2508b and the object 2508e, the scene-based image editing system 106 provides a text box 2528 suggesting all shoes in the digital image 2506 be added to the selection."). Regarding claim 7, Singh discloses a computing apparatus of claim 1, wherein the program instructions further direct the computing apparatus to identify a generative AI model (paragraph 133, line(s) 12-16 "the scene-based image editing system 106 utilizes one or more machine learning models, such as the neural network(s) 114 to perform the pre-processing operations."; also, paragraph 145, line(s) 4-6 "the scene-based image editing system 106 utilizes a machine learning model, such as a segmentation neural network"; also, paragraph 150, line(s) 7-13 "in some embodiments to both detect objects in a digital image and generate object masks for those objects. Indeed, FIG. 3 illustrates a detection-masking neural network 300 that comprises both an object detection machine learning model 308 (in the form of an object detection neural network) and an object segmentation machine learning model 310") or editing tool for each option of the context menu of options to edit the segment (paragraph 101, line(s) 9-15 "Accordingly, in one or more embodiments, the scene-based image editing system pre-processes the digital image in preparation for an object-aware modification, such as a move operation or a delete operation, by pre-generating object masks and/or content fills before receiving user input for such a modification"; also, Fig. 8A-8D; also, paragraph 12, line(s) "FIG. 
7 illustrates a diagram for generating object masks and content fills to facilitate object-aware modifications to a digital image in accordance with one or more embodiments"). Regarding claim 8, Singh discloses a computing apparatus of claim 1, wherein to display the image canvas, the program instructions further direct the computing apparatus to visually differentiate the segment selected (Fig. 3, 316, 318-324, 326; also, paragraph 8, line(s) 1 "FIG. 3 illustrates a segmentation neural network utilized by the scene-based image editing system to generate object masks for objects in accordance with one or more embodiments"; also, paragraph 159, line(s) 8-9 "To illustrate, FIG. 3 shows a label 318 for woman, a label 320 for bird, and a label 322 for man"; also, paragraph 243, line(s) 1-4 "based on detecting the user interaction for selecting the object 808d, the scene-based image editing system 106 provides a visual indication 812") by the user from other segments of the multiple segments of the image (Fig. 8B-8C, 808d; also, paragraph 243, line(s) 1-4 "based on detecting the user interaction for selecting the object 808d, the scene-based image editing system 106 provides a visual indication"). Regarding claim 9, Singh discloses the method of method of operating a computing device (Fig. 1; also, Fig. 2, 200; also, paragraph 133, line(s) "the scene-based image editing system 106 operates on a computing device 200")comprising: receiving an image generated by a generative artificial intelligence (AI) model in response to a prompt (paragraph 3, line(s) 24-26 "system utilizes various generative models and various instances of artificial intelligence to generate modified digital images or animations"; also, paragraph 393, line(s) 11-12 "the scene-based image editing system 106 provides a prompt for entry of textual user input."), wherein the prompt includes a natural language request from a user (paragraph 393, line(s) 11-12 "the scene-based image editing system 106 provides a prompt for entry of textual user input"; also, paragraph 583, line(s) 7-9 "106 provides an option to the user of the client device to query a portion to be selected (via a selection or a natural language input)."); displaying an image canvas of the image comprising multiple segments of the image identified by a segmentation model (Fig. 3, 318-324, 326; also, paragraph 8, line(s) 1 "FIG. 3 illustrates a segmentation neural network utilized by the scene-based image editing system to generate object masks for objects in accordance with one or more embodiments"; also, paragraph 159, line(s) 8-9 "To illustrate, FIG. 3 shows a label 318 for woman, a label 320 for bird, and a label 322 for man"); and displaying a context menu of options to edit the image based on a selection of a segment of the image by the user (Fig. 8B, 814' also, paragraph 246, line(s) 1-12 "As further illustrated, based on detecting the user interaction for selecting the object 808d, the scene-based image editing system 106 provides an option menu 814 for display via the graphical user interface 802. The option menu 814 shown in FIG. 8B provides a plurality of options, though the option menu includes various numbers of options in various embodiments. For instance, in some implementations, the option menu 814 includes one or more curated options, such as options determined to be popular or used with the most frequency. For example, as shown in FIG. 8B, the option menu 814 includes an option 816 to delete the object 808d"). 
Regarding claim 10, Singh discloses a method of claim 9, wherein the context menu comprises options based on a semantic understanding of the segment identified by the segmentation model (Fig. 3, 318-324, 326; also, paragraph 8, line(s) 1 "FIG. 3 illustrates a segmentation neural network utilized by the scene-based image editing system to generate object masks for objects in accordance with one or more embodiments"; also, paragraph 159, line(s) 8-9 "To illustrate, FIG. 3 shows a label 318 for woman, a label 320 for bird, and a label 322 for man"; also, Fig. 8B, 814' also, paragraph 246, line(s) 1-12 "As further illustrated, based on detecting the user interaction for selecting the object 808d, the scene-based image editing system 106 provides an option menu 814 for display via the graphical user interface 802. The option menu 814 shown in FIG. 8B provides a plurality of options, though the option menu includes various numbers of options in various embodiments. For instance, in some implementations, the option menu 814 includes one or more curated options, such as options determined to be popular or used with the most frequency. For example, as shown in FIG. 8B, the option menu 814 includes an option 816 to delete the object 808d"). Regarding claim 11, Singh discloses a method of claim 10, further comprising receiving a selection of a second segment of the image by the user (Fig. 8A-8D; also, paragraph 12, line(s) "FIG. 7 illustrates a diagram for generating object masks and content fills to facilitate object-aware modifications to a digital image in accordance with one or more embodiments;"; also, Fig. 3, 318-324, 326; also, paragraph 8, line(s) 1-4 "FIG. 3 illustrates a segmentation neural network utilized by the scene-based image editing system to generate object masks for objects in accordance with one or more embodiments"; also, paragraph 164, line(s) 4-9 "the scene-based image editing system 106 receives input from a user indicating a selection of one of the detected objects. To illustrate, in the implementation shown, the scene-based image editing system 106 receives user input 312 of the user selecting bounding boxes 321 and 323") and displaying a second context menu of options to edit the image, wherein the second context menu comprises an option to edit the image that is different from the options of the context menu (paragraph 236, line(s) 7-11 "it should be understood that the graphical user interface 802 displays at least some menus, options, or other visual elements in various embodiments—at least when the digital image 806 is initially displayed."; also, Fig. 20A-20C, 2002, 2004, 2008, 2010, 2012a-2012c; also, paragraph 397, line(s) 1-12 "FIGS. 20A-20C illustrate another graphical user interface implemented by the scene-based image editing system 106 to facilitate modifying object attributes of objects portrayed in a digital image in accordance with one or more embodiments. As shown in FIG. 20A, the scene-based image editing system 106 provides a digital image 2006 portraying an object 2008 for display within a graphical user interface 2002 of a client device 2004. Further, upon detecting a user interaction with the object 2008, the scene-based image editing system 106 provides an attribute menu 2010 having attribute object indicators 2012a-2012c listing object attributes of the object 2008."; also, Fig. 21A-C, 2102, 2104, 2108, 2110, 2112a-2112c; also, paragraph 401, line(s) 1-12 "FIGS. 
21A-21C illustrate another graphical user interface implemented by the scene-based image editing system 106 to facilitate modifying object attributes of objects portrayed in a digital image in accordance with one or more embodiments. As shown in FIG. 21A, the scene-based image editing system 106 provides a digital image 2106 portraying an object 2108 for display within a graphical user interface 2102 of a client device 2104. Further, upon detecting a user interaction with the object 2108, the scene-based image editing system 106 provides an attribute menu 2110 having attribute object indicators 2112a-2012c listing object attributes of the object 2108."). Regarding claim 12, Singh discloses a method of claim 11, further comprising prompting the generative AI model to regenerate the image based on a selection by the user of an option in the context menu (Fig. 45A-52B; also, paragraph 531, line(s) 1-23 "the scene-based image editing system 106 provides options to generate the modified digital image 4012 within a user interface. For instance, the scene-based image editing system 106 provides client devices with more flexibility and controllability during editing. The user interface has various options to indicate the type of modification (e.g., expanding the digital image or removing an object) and the scene-based image editing system 106 further provides customizable options for modifying the digital image 4002. For example, the scene-based image editing system 106 can generate a diverse pool of possible solutions for user selection. Moreover, the scene-based image editing system 106 can allow a client device to provide user input of strokes, style guides, or color patches at a desired region to guide the image generation. Indeed, by including a style input within a region of digital image 4002, the scene-based image editing system 106 can utilize a machine learning model to expand/apply the style to the entire semantic region in generating the modified digital image 4012. Similarly, by considering user input/modification of semantic regions, the scene-based image editing system 106 can flexibly guide generation of completed digital images to accurately reflect desired features."; also, paragraph 533, line(s) 18-24 "the scene-based image editing system 106 can receive a variety of user inputs, such as user input of a mask identifying an object or region to replace, selection of an object to remove, an area or region to expand, or a style/color patch to expand within an input region."). Regarding claim 13, Singh discloses a method of claim 12, wherein displaying the context menu of options comprises identifying an option for display based on contextual information about the segment received from the segmentation model (Fig. 25A-25D; also, paragraph 30, line(s) 1-5 "FIGS. 25A-25D illustrate a graphical user interface implemented by the scene-based image editing system to add objects to a selection for modification based on classification relationships in accordance with one or more embodiments"; also, paragraph 441, line(s) 1-6 "FIG. 25A further illustrates semantic scene graph components 2510a-2510c from a semantic scene graph of the digital image 2506. Indeed, the semantic scene graph components 2510a-2510c include portions of a semantic scene graph providing a hierarchy of object classifications for each of the objects 2508a-2508g."; also, paragraph 442, line(s) 1-23 "As shown in FIG. 
25A, the semantic scene graph component 2510a includes a node 2512 representing a clothing class, a node 2514 representing an accessory class, and a node 2516 representing a shoe class. As further shown, the accessory class is a subclass of the clothing class, and the shoe class is a subclass of the accessory class. Similarly, the semantic scene graph component 2510b includes a node 2518 representing the clothing class, a node 2520 representing the accessory class, and a node 2522 representing a glasses class, which is a subclass of the accessory class. Further, the semantic scene graph component 2510c includes a node 2524 representing the clothing class and a node 2526 representing a coat class, which is another subclass of the clothing class. Thus, the semantic scene graph components 2510a-2510c provide various classifications that apply to each of the objects 2508a-2508g. In particular, the semantic scene graph component 2510a provides a hierarchy of object classifications associated with the shoes presented in the digital image 2506, the semantic scene graph component 2510b provides a hierarchy of object classifications associated with the pairs of glasses, and the semantic scene graph component 2510c provides a hierarchy of object classifications associated with the coat."; also, paragraph 443, 1-8 line(s) "As shown in FIG. 25B, the scene-based image editing system 106 detects a user interaction selecting the object 2508e. Further, the scene-based image editing system 106 detects a user interaction selecting the object 2508b. As further shown, in response to detecting the selection of the object 2508b and the object 2508e, the scene-based image editing system 106 provides a text box 2528 suggesting all shoes in the digital image 2506 be added to the selection."). Regarding claim 14, Singh discloses a method of claim 9, further comprising identifying a generative AI model (paragraph 133, line(s) 12-16 "the scene-based image editing system 106 utilizes one or more machine learning models, such as the neural network(s) 114 to perform the pre-processing operations."; also, paragraph 145, line(s) 4-6 "the scene-based image editing system 106 utilizes a machine learning model, such as a segmentation neural network"; also, paragraph 150, line(s) 7-13 "in some embodiments to both detect objects in a digital image and generate object masks for those objects. Indeed, FIG. 3 illustrates a detection-masking neural network 300 that comprises both an object detection machine learning model 308 (in the form of an object detection neural network) and an object segmentation machine learning model 310") or image editing tool for each option of the context menu of options to edit the segment (paragraph 101, line(s) 9-15 "Accordingly, in one or more embodiments, the scene-based image editing system pre-processes the digital image in preparation for an object-aware modification, such as a move operation or a delete operation, by pre-generating object masks and/or content fills before receiving user input for such a modification"; also, Fig. 8A-8D; also, paragraph 12, line(s) "FIG. 7 illustrates a diagram for generating object masks and content fills to facilitate object-aware modifications to a digital image in accordance with one or more embodiments;"). Regarding claim 15, Singh discloses a method of claim 12, wherein displaying the image canvas comprises visually differentiating the segment selected by the user from other segments of the multiple segments of the image (Fig. 
25A-25D; also, paragraph 30, line(s) 1-5 "FIGS. 25A-25D illustrate a graphical user interface implemented by the scene-based image editing system to add objects to a selection for modification based on classification relationships in accordance with one or more embodiments"; also, paragraph 441, line(s) 1-6 "FIG. 25A further illustrates semantic scene graph components 2510a-2510c from a semantic scene graph of the digital image 2506. Indeed, the semantic scene graph components 2510a-2510c include portions of a semantic scene graph providing a hierarchy of object classifications for each of the objects 2508a-2508g."; also, paragraph 442, line(s) 1-23 "As shown in FIG. 25A, the semantic scene graph component 2510a includes a node 2512 representing a clothing class, a node 2514 representing an accessory class, and a node 2516 representing a shoe class. As further shown, the accessory class is a subclass of the clothing class, and the shoe class is a subclass of the accessory class. Similarly, the semantic scene graph component 2510b includes a node 2518 representing the clothing class, a node 2520 representing the accessory class, and a node 2522 representing a glasses class, which is a subclass of the accessory class. Further, the semantic scene graph component 2510c includes a node 2524 representing the clothing class and a node 2526 representing a coat class, which is another subclass of the clothing class. Thus, the semantic scene graph components 2510a-2510c provide various classifications that apply to each of the objects 2508a-2508g. In particular, the semantic scene graph component 2510a provides a hierarchy of object classifications associated with the shoes presented in the digital image 2506, the semantic scene graph component 2510b provides a hierarchy of object classifications associated with the pairs of glasses, and the semantic scene graph component 2510c provides a hierarchy of object classifications associated with the coat."; also, paragraph 443, 1-8 line(s) "As shown in FIG. 25B, the scene-based image editing system 106 detects a user interaction selecting the object 2508e. Further, the scene-based image editing system 106 detects a user interaction selecting the object 2508b. As further shown, in response to detecting the selection of the object 2508b and the object 2508e, the scene-based image editing system 106 provides a text box 2528 suggesting all shoes in the digital image 2506 be added to the selection."). Regarding claim 16, Singh discloses one or more computer readable storage media having program instructions stored thereon that, when executed by one or more processors, direct a computing apparatus (Fig. 82, also, paragraph 843, line(s) 3-5 "the components 8202-8216 include one or more instructions stored on a computer-readable storage medium and executable by processors") to at least: receive an image generated by a generative artificial intelligence (AI) model in response to a prompt (paragraph 3, line(s) 3-10 "non-transitory computer-readable media that implement artificial intelligence models to facilitate flexible and efficient scene-based image editing. 
To illustrate, in one or more embodiments, a system utilizes one or more machine learning models to learn/identify characteristics of a digital image, anticipate potential edits to the digital image, and/or generate supplementary components that are usable in various edits"; also, paragraph 3, line(s) 24-26 "system utilizes various generative models and various instances of artificial intelligence to generate modified digital images or animations"; also, paragraph 393, line(s) 11-12 "the scene-based image editing system 106 provides a prompt for entry of textual user input."), wherein the prompt includes a natural language request from a user (paragraph 393, line(s) 11-12 "the scene-based image editing system 106 provides a prompt for entry of textual user input"; also, paragraph 583, line(s) 7-9 "106 provides an option to the user of the client device to query a portion to be selected (via a selection or a natural language input)."; also, paragraph 584, line(s) 5-7 "the scene-based image editing system 106 can utilize the segmentation machine learning model 4404 to analyze user inputs (e.g., clicks, drawings, outlines, or natural language)"); display an image canvas of the image comprising multiple segments of the image identified by a segmentation model (Fig. 3, 318-324, 326; also, paragraph 8, line(s) 1 "FIG. 3 illustrates a segmentation neural network utilized by the scene-based image editing system to generate object masks for objects in accordance with one or more embodiments"; also, paragraph 159, line(s) 8-9 "To illustrate, FIG. 3 shows a label 318 for woman, a label 320 for bird, and a label 322 for man"); and display a context menu of options to edit the image based on a selection of a segment of the image by the user (Fig. 8B, 814' also, paragraph 246, line(s) 1-12 "As further illustrated, based on detecting the user interaction for selecting the object 808d, the scene-based image editing system 106 provides an option menu 814 for display via the graphical user interface 802. The option menu 814 shown in FIG. 8B provides a plurality of options, though the option menu includes various numbers of options in various embodiments. For instance, in some implementations, the option menu 814 includes one or more curated options, such as options determined to be popular or used with the most frequency. For example, as shown in FIG. 8B, the option menu 814 includes an option 816 to delete the object 808d"). Regarding claim 17, Singh discloses one or more computer readable storage media of claim 16, wherein the context menu comprises options based on a semantic understanding of the segment identified by the segmentation model (Fig. 3, 318-324, 326; also, paragraph 8, line(s) 1 "FIG. 3 illustrates a segmentation neural network utilized by the scene-based image editing system to generate object masks for objects in accordance with one or more embodiments"; also, paragraph 159, line(s) 8-9 "To illustrate, FIG. 3 shows a label 318 for woman, a label 320 for bird, and a label 322 for man"; also, Fig. 8B, 814' also, paragraph 246, line(s) 1-12 "As further illustrated, based on detecting the user interaction for selecting the object 808d, the scene-based image editing system 106 provides an option menu 814 for display via the graphical user interface 802. The option menu 814 shown in FIG. 8B provides a plurality of options, though the option menu includes various numbers of options in various embodiments. 
For instance, in some implementations, the option menu 814 includes one or more curated options, such as options determined to be popular or used with the most frequency. For example, as shown in FIG. 8B, the option menu 814 includes an option 816 to delete the object 808d"). Regarding claim 18, Singh discloses one or more computer readable storage media of claim 17, further comprising receiving a selection of a second segment of the image by the user (Fig. 8A-8D; also, paragraph 12, line(s) "FIG. 7 illustrates a diagram for generating object masks and content fills to facilitate object-aware modifications to a digital image in accordance with one or more embodiments;"; also, Fig. 3, 318-324, 326; also, paragraph 8, line(s) 1-4 "FIG. 3 illustrates a segmentation neural network utilized by the scene-based image editing system to generate object masks for objects in accordance with one or more embodiments"; also, paragraph 164, line(s) 4-9 "the scene-based image editing system 106 receives input from a user indicating a selection of one of the detected objects. To illustrate, in the implementation shown, the scene-based image editing system 106 receives user input 312 of the user selecting bounding boxes 321 and 323") and displaying a second context menu of options to edit the image, wherein the second context menu comprises an option to edit the image that is different from the options of the context menu (paragraph 236, line(s) 7-11 "it should be understood that the graphical user interface 802 displays at least some menus, options, or other visual elements in various embodiments—at least when the digital image 806 is initially displayed."; also, Fig. 20A-20C, 2002, 2004, 2008, 2010, 2012a-2012c; also, paragraph 397, line(s) 1-12 "FIGS. 20A-20C illustrate another graphical user interface implemented by the scene-based image editing system 106 to facilitate modifying object attributes of objects portrayed in a digital image in accordance with one or more embodiments. As shown in FIG. 20A, the scene-based image editing system 106 provides a digital image 2006 portraying an object 2008 for display within a graphical user interface 2002 of a client device 2004. Further, upon detecting a user interaction with the object 2008, the scene-based image editing system 106 provides an attribute menu 2010 having attribute object indicators 2012a-2012c listing object attributes of the object 2008."; also, Fig. 21A-C, 2102, 2104, 2108, 2110, 2112a-2112c; also, paragraph 401, line(s) 1-12 "FIGS. 21A-21C illustrate another graphical user interface implemented by the scene-based image editing system 106 to facilitate modifying object attributes of objects portrayed in a digital image in accordance with one or more embodiments. As shown in FIG. 21A, the scene-based image editing system 106 provides a digital image 2106 portraying an object 2108 for display within a graphical user interface 2102 of a client device 2104. Further, upon detecting a user interaction with the object 2108, the scene-based image editing system 106 provides an attribute menu 2110 having attribute object indicators 2112a-2012c listing object attributes of the object 2108."). Regarding claim 19, Singh discloses one or more computer readable storage media of claim 18, wherein the program instructions further direct the computing apparatus to prompt the generative AI model to regenerate the image based on a selection by the user of an option in the context menu (Fig. 45A-52B; also, Fig. 
82, also, paragraph 843, line(s) 3-5 "the components 8202-8216 include one or more instructions stored on a computer-readable storage medium and executable by processors"; also, paragraph 531, line(s) 1-23 "the scene-based image editing system 106 provides options to generate the modified digital image 4012 within a user interface. For instance, the scene-based image editing system 106 provides client devices with more flexibility and controllability during editing. The user interface has various options to indicate the type of modification (e.g., expanding the digital image or removing an object) and the scene-based image editing system 106 further provides customizable options for modifying the digital image 4002. For example, the scene-based image editing system 106 can generate a diverse pool of possible solutions for user selection. Moreover, the scene-based image editing system 106 can allow a client device to provide user input of strokes, style guides, or color patches at a desired region to guide the image generation. Indeed, by including a style input within a region of digital image 4002, the scene-based image editing system 106 can utilize a machine learning model to expand/apply the style to the entire semantic region in generating the modified digital image 4012. Similarly, by considering user input/modification of semantic regions, the scene-based image editing system 106 can flexibly guide generation of completed digital images to accurately reflect desired features."; also, paragraph 533, line(s) 18-24 "the scene-based image editing system 106 can receive a variety of user inputs, such as user input of a mask identifying an object or region to replace, selection of an object to remove, an area or region to expand, or a style/color patch to expand within an input region."). Regarding claim 20, Singh discloses one or more computer readable storage media of claim 16, wherein the program instructions further direct the computing apparatus to identify an option for editing the selected segment for display in the context menu based on contextual information about the segment received from the segmentation model (Fig. 25A-25D; also, paragraph 30, line(s) 1-5 "FIGS. 25A-25D illustrate a graphical user interface implemented by the scene-based image editing system to add objects to a selection for modification based on classification relationships in accordance with one or more embodiments"; also, paragraph 441, line(s) 1-6 "FIG. 25A further illustrates semantic scene graph components 2510a-2510c from a semantic scene graph of the digital image 2506. Indeed, the semantic scene graph components 2510a-2510c include portions of a semantic scene graph providing a hierarchy of object classifications for each of the objects 2508a-2508g."; also, paragraph 442, line(s) 1-23 "As shown in FIG. 25A, the semantic scene graph component 2510a includes a node 2512 representing a clothing class, a node 2514 representing an accessory class, and a node 2516 representing a shoe class. As further shown, the accessory class is a subclass of the clothing class, and the shoe class is a subclass of the accessory class. Similarly, the semantic scene graph component 2510b includes a node 2518 representing the clothing class, a node 2520 representing the accessory class, and a node 2522 representing a glasses class, which is a subclass of the accessory class. 
Further, the semantic scene graph component 2510c includes a node 2524 representing the clothing class and a node 2526 representing a coat class, which is another subclass of the clothing class. Thus, the semantic scene graph components 2510a-2510c provide various classifications that apply to each of the objects 2508a-2508g. In particular, the semantic scene graph component 2510a provides a hierarchy of object classifications associated with the shoes presented in the digital image 2506, the semantic scene graph component 2510b provides a hierarchy of object classifications associated with the pairs of glasses, and the semantic scene graph component 2510c provides a hierarchy of object classifications associated with the coat."; also, paragraph 443, 1-8 line(s) "As shown in FIG. 25B, the scene-based image editing system 106 detects a user interaction selecting the object 2508e. Further, the scene-based image editing system 106 detects a user interaction selecting the object 2508b. As further shown, in response to detecting the selection of the object 2508b and the object 2508e, the scene-based image editing system 106 provides a text box 2528 suggesting all shoes in the digital image 2506 be added to the selection."). Conclusion Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAI WEI TOMMY LI whose telephone number is (571)272-1170. The examiner can normally be reached 6:00AM-4:00PM EST. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Xiao Wu can be reached at (571) 272-7761. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /JAI LI/Junior Examiner, Art Unit 2613 /XIAO M WU/Supervisory Patent Examiner, Art Unit 2613

Prosecution Timeline

May 29, 2024: Application Filed
Jan 21, 2026: Non-Final Rejection — §102
Mar 17, 2026: Examiner Interview Summary
Mar 17, 2026: Applicant Interview (Telephonic)

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.

Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: Favorable
Median Time to Grant: 2y 9m
PTA Risk: Low
Based on 0 resolved cases by this examiner; grant probability derived from career allow rate.
