DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
Applicant’s submission filed 11/24/2025 has been entered. Claims 1, 10 and 19 have been amended. Claims 9 and 18 have been cancelled. Claim 21 has been newly added. Claims 1-8, 10-17, and 19-21 are pending in the current application.
Response to Arguments
Applicant’s arguments with respect to the amended claim 1 and similar claims have been considered but are moot in view of the new ground(s) of rejection based on the newly cited references.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 10, 19 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Lin et al. US-PGPUB No. 2024/0127410 (hereinafter Lin) in view of Wu et al. US-PGPUB No. 2023/0217097 (hereinafter Wu) and Brandt et al. US-PGPUB No. 2024/0169624 (hereinafter Brandt).
Re Claim 1:
Lin teaches an interactive method, comprising:
obtaining to-be-modified display content in a display content, the display content being generated based on a first input content input by a user (Lin teaches at FIG. 7C and Paragraph 0114 obtaining to-be-modified display content (a selected mountain region) as indicated by the cursor, the display content 710 being generated based on a first input content input by the user of FIG. 7B);
obtaining semantic information of the to-be-modified display content (Lin teaches at FIG. 7C and Paragraph 0114 obtaining semantic information “Mountain” of the to-be-modified display content (the mountain region) indicated by the cursor);
displaying the semantic information of the to-be-modified display content as one or more modifiable annotation of the to-be-modified display element (Lin teaches at FIG. 7C and Paragraph 0114 displaying “Mountain” of the selected mountain region as one or more modifiable annotation of the selected mountain region);
obtaining a second input content input by the user, the second input content comprising candidate semantic information obtained by receiving direct user modification of the one more modifiable annotations of the to-be-modified display content, and based on the candidate semantic information, replacing the semantic information of the to-be-modified display content with the candidate semantic information to obtain modified semantic information (Lin teaches at FIG. 7C and Paragraph 0114-0115 obtaining a second input content input by the user using the panoptic segment brushes 718 where the panoptic inpainting system 102 provides the panoptic segment brushes 718 together with a digital image 704 so that the user can paint portions directly onto a designated area of the digital image with desired panoptic labels for inpainting. Lin shows at FIG. 7C that the second input content comprising “Sky” obtained by receiving direct user modifications of the modifiable annotation “Mountain” of the selected mountain region, and based on the candidate semantic information “Sky”, replacing the semantic information “Mountain” of the selected mountain region with the candidate semantic information “Sky”); and
based on the modified semantic information, modifying the display content to obtain modified display content (Lin teaches at FIG. 7C and Paragraph 0114-0115 based on the modified semantic information “Sky”, modifying the selected mountain region to obtain the modified mountain region as pixels of the sky region).
Wu teaches an interactive method, comprising:
obtaining to-be-modified display content in a display content, the display content being generated based on a first input content input by a user (Wu teaches at FIG. 6A and Paragraph 0171-0173 obtaining the stick display content in the preview picture 324, the preview picture 324 is generated based on a first input content. Wu teaches at FIGS. 5B-5C and Paragraph 0138 that a user uses the terminal 100 to shoot a picture and/or to select the selfie stick for removal);
obtaining semantic information of the to-be-modified display content (Wu teaches at FIG. 6A and Paragraph 0171-0173 obtaining semantic information 621/622 “Selfie stick” of the stick display content);
displaying the semantic information of the to-be-modified display content as one or more modifiable annotation of the to-be-modified display element (Wu teaches at FIG. 6A and Paragraph 0171-0173 displaying the semantic information 621/622 “Selfie stick” of the stick display content);
obtaining a second input content input by the user, the second input content comprising candidate semantic information obtained by receiving direct user modification of the one more modifiable annotations of the to-be-modified display content, and based on the candidate semantic information, replacing the semantic information of the to-be-modified display content with the candidate semantic information to obtain modified semantic information (Wu teaches at FIG. 6A and Paragraph 0171-0173 obtaining a second input content input by a user (e.g., receiving a tap input) and a removal control 632 may be configured to trigger the terminal 100 to remove the background person from the preview picture 324. The removal control 622 may be configured to trigger the terminal 100 to remove the selfie stick from the preview picture 324. Wu teaches that the second input content comprises candidate semantic information of FIG. 6B to include the cancellation control 623 of the selfie stick by receiving direct user modification of the annotation 621/622 and based on the candidate semantic information 621/623 along with 631/632, replacing the semantic information 621/622 along with 631/632 of the stick region with the candidate semantic information 621/623 along with 631/632 to obtain modified semantic information); and
based on the modified semantic information, modifying the display content to obtain modified display content (Wu teaches at FIG. 6B and Paragraph 0171-0173 that, in response to the second input content (e.g., a tap input on the removal control 622, which may be configured to trigger the terminal 100 to remove the selfie stick from the preview picture 324), the terminal 100 displays the modified semantic information 621/623 and modifies the preview picture 324 to obtain modified display content 328).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to have incorporated Wu’s system and method, in which a first/second/third text input is provided repeatedly to modify the semantic information in one or more portions of an image in relation to the first/second/third text input, into Lin’s modification of the image based on the multiple stages of the user inputs on a graphical user interface. One of ordinary skill in the art would have modified the semantic information in one or more portions of an image based on the additional user inputs.
Brandt teaches an interactive method, comprising:
obtaining to-be-modified display content in a display content, the display content being generated based on a first input content input by a user (Brandt teaches at FIGS. 20A-20C and Paragraph 0355 obtaining the display content 2008 in a display content, the display content being generated based on a first input content via 2012a);
obtaining semantic information of the to-be-modified display content (Brandt teaches at FIG. 20A obtaining the semantic information 2012a “cube” of the cube display content);
displaying the semantic information of the to-be-modified display content as one or more modifiable annotation of the to-be-modified display element (Brandt teaches displaying the semantic information 2012a “cube” of the cube display content);
obtaining a second input content input by the user, the second input content comprising candidate semantic information obtained by receiving direct user modification of the one more modifiable annotations of the to-be-modified display content, and based on the candidate semantic information, replacing the semantic information of the to-be-modified display content with the candidate semantic information to obtain modified semantic information (Brandt teaches at FIGS. 20A-20C obtaining a second input content “Sphere” object, the second input content comprising the “Sphere” obtained by receiving direct user modification via the menu 2012a-2012c including the interface 2014 to have specified the candidate semantic information “Sphere”. Brandt teaches replacing the semantic information “Cube” of the cube object with the candidate semantic information “Sphere” to obtain modified semantic information “Sphere” in FIG. 20C); and
based on the modified semantic information, modifying the display content to obtain modified display content (Brandt teaches at FIG. 20C based on the modified semantic information “Sphere”, modifying the display content 2008 to obtain modified display content).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to have incorporated Brandt’s system and method, in which a first/second/third text input is provided repeatedly to modify the semantic information in one or more portions of an image in relation to the first/second/third text input, into Lin’s modification of the image based on the multiple stages of the user inputs on a graphical user interface. One of ordinary skill in the art would have modified the semantic information in one or more portions of an image based on the additional user inputs.
Re Claim 10:
The claim 10 is in parallel with the claim 1 in the form of an apparatus claim. The claim 10 is subject to the same rationale of rejection as the claim 1.
Lin further teaches an interactive device comprising a memory storing program instructions and a processor coupled to the memory and configured to execute the program instructions to [perform the method of the claim 1] (Lin teaches at Paragraph 0150 that the components of the panoptic inpainting system 102 include software, hardware, or both. For example, the components of the panoptic inpainting system 102 include one or more instructions stored on a computer-readable storage medium and executable by processors of one or more computing devices (e.g., the computing device 1500). When executed by the one or more processors, the computer-executable instructions of the panoptic inpainting system 102 cause the computing device 1500 to perform the methods described herein. Alternatively, the components of the panoptic inpainting system 102 comprise hardware, such as a special purpose processing device to perform a certain function or group of functions. Additionally, or alternatively, the components of the panoptic inpainting system 102 include a combination of computer-executable instructions and hardware).
Re Claim 19:
The claim 19 is in parallel with the claim 1 in the form of a computer program product. The claim 19 is subject to the same rationale of rejection as the claim 1.
Lin further teaches a non-transitory computer readable storage medium storing program instructions, when being executed by one or more processors, the program instructions causing the one or more processors to [perform the method of the claim 1] (Lin teaches at Paragraph 0150 that the components of the panoptic inpainting system 102 include software, hardware, or both. For example, the components of the panoptic inpainting system 102 include one or more instructions stored on a computer-readable storage medium and executable by processors of one or more computing devices (e.g., the computing device 1500). When executed by the one or more processors, the computer-executable instructions of the panoptic inpainting system 102 cause the computing device 1500 to perform the methods described herein. Alternatively, the components of the panoptic inpainting system 102 comprise hardware, such as a special purpose processing device to perform a certain function or group of functions. Additionally, or alternatively, the components of the panoptic inpainting system 102 include a combination of computer-executable instructions and hardware).
Re Claim 21:
The claim 21 encompasses the same scope of invention as that of the claim 1 except for the additional claim limitation that the display content comprises a text content, the semantic information of the to-be-modified display content is obtained based on a machine learning model trained using sample texts, and the semantic information comprises semantic information of content corresponding to each text line in the text content.
Brandt teaches the claim limitation that the display content comprises a text content, the semantic information of the to-be-modified display content is obtained based on a machine learning model trained using sample texts, and the semantic information comprises semantic information of content corresponding to each text line in the text content (Brandt shows at FIG. 3 and FIGS. 20A-20C that the display content comprises text content (e.g., the text content 318/320/322 of FIG. 3) and the semantic information (text content) of an object display content is obtained based on a machine learning model (e.g., the neural network 300 of FIG. 3) trained using labels of the objects and the semantic information comprises semantic information of each object corresponding to each text line in the text content.
Brandt teaches at FIGS. 16-18 and 26-29 and Paragraph 0123 that a neural network is utilized by the scene-based image editing system to facilitate modifying object attributes of objects portrayed in a digital image, and that the scene-based image editing system 106 selects the object segmentation machine learning model 310 based on the object labels of the object identified by the object detection machine learning model 308. Brandt teaches at Paragraph 0132 that the scene-based image editing system 106 utilizes the cascaded modulation inpainting neural network 420 to generate replacement pixels for the replacement region 404. Brandt teaches at Paragraph 0116 that the detection-masking neural network 300 annotates the bounding boxes with the previously mentioned object labels such as the name of the detected object).
Claims 2-4, 11-13 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Lin et al. US-PGPUB No. 2024/0127410 (hereinafter Lin) in view of Wu et al. US-PGPUB No. 2023/0217097 (hereinafter Wu), Brandt et al. US-PGPUB No. 2024/0169624 (hereinafter Brandt), Kumari et al. US-PGPUB No. 2024/0185588 (hereinafter Kumari) and Denison US-PGPUB No. 2024/0193821 (hereinafter Denison).
Re Claim 2:
The claim 2 encompasses the same scope of invention as that of the claim 1 except additional claim limitation that obtaining the to-be-modified display content in the display content comprises:
obtaining semantic information of the display content;
extracting key semantic information from the semantic information of the display content; and
determining the display content corresponding to the key semantic information in the display content as the to-be-modified display content;
or in response to a third input content input by the user, selecting a content specified by the user from the display content as the to-be-modified display content.
Denison and Kumari teach the claim limitation that obtaining the to-be-modified display content in the display content comprises:
obtaining semantic information of the display content;
extracting key semantic information from the semantic information of the display content; and
determining the display content corresponding to the key semantic information in the display content as the to-be-modified display content;
or in response to a third input content input by the user, selecting a content specified by the user from the display content as the to-be-modified display content
(Denison teaches at FIGS. 4C-1 to 4C-2 and Paragraph 0064 obtaining the second input 401’ “Remove clouds and make sky clear” and modifying the semantic information “the sky including clouds 415” to generate a modified semantic information 415’ of the generated image to generate the adjusted image. Moreover, the belt of the batman can be adjusted to exhibit different style 411’ wherein the different style could be specified using another image or using tuning comments provided via text input at the text field.
Kumari teaches at FIG. 4 Step 415 obtaining semantic information “Moon Gate” of the synthetic image as the to-be-modified display content. The Moon gate constitutes the key semantic information of the synthetic image. Kumari teaches at FIG. 4 Step 415 displaying the key semantic information “Moon Gate” in the synthetic image. Kumari teaches at Paragraph 0022 and Paragraph 0065-0068 that the diffusion model learns the concept represented in the input text and the diffusion model learns what a photo of “Moon Gate” and the diffusion model may be fine-tuned to learn about the concept of a moon gate. The diffusion model learns about the moon gate in the synthetic image by generating images of a gate without circular opening.
Kumari teaches at FIG. 4 and Paragraph 0068 obtaining a text prompt “A PHOTO of a MOONGATE” input by the user, and based on the text prompt at Step 430, modifying the semantic information of the synthetic image to produce a new synthetic image with the accurate depiction of a moon-gate as semantic information. Accurate depiction of a moon-gate is a modified depiction of an inaccurate moon-gate generated for the synthetic image at operation 415 where the image of a gate does not include a circular opening.
Kumari teaches at FIG. 4 and Paragraph 0068 based on the accurate depiction of a moon gate, modifying the synthetic image to obtain the new synthetic image with accurate depiction of a moon gate).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to have incorporated Denison/Kumari’s system and method, in which a first/second/third text input is provided repeatedly to modify the semantic information in one or more portions of an image in relation to the first/second/third text input, into Lin’s modification of the image based on the multiple stages of the user inputs on a graphical user interface. One of ordinary skill in the art would have modified the semantic information in one or more portions of an image based on the additional user inputs.
Re Claim 11:
The claim 11 encompasses the same scope of invention as that of the claim 10 except additional claim limitation that, when obtaining the to-be-modified display content in the display content, the processor is further configured to: obtain semantic information of the display content; extract key semantic information from the semantic information of the display content; and determine the display content corresponding to the key semantic information in the display content as the to-be-modified display content; or
in response to a third input content input by the user, select a content specified by the user from the display content as the to-be-modified display content.
The claim 11 is in parallel with the claim 2 in the form of an apparatus claim. The claim 11 is subject to the same rationale of rejection as the claim 2.
Re Claim 20:
The claim 20 encompasses the same scope of invention as that of the claim 19 except additional claim limitation that when obtaining the to-be-modified display content in the display content, the processor is further configured to: obtain semantic information of the display content; extract key semantic information from the semantic information of the display content; and determine the display content corresponding to the key semantic information in the display content as the to-be-modified display content; or in response to a third input content input by the user, select a content specified by the user from the display content as the to-be-modified display content.
The claim 20 is in parallel with the claim 2 in the form of a computer program product. The claim 20 is subject to the same rationale of rejection as the claim 2.
Re Claim 3:
The claim 3 encompasses the same scope of invention as that of the claim 1 except additional claim limitation that obtaining the second input content input by the user, and based on the second input content, modifying the semantic information of the to-be-modified display content to obtain the modified semantic information comprises:
obtaining change content input by the user corresponding to the semantic information of the to-be-modified display content, and based on the change content, modifying the semantic information of the to-be-modified display content to obtain the modified semantic information; or
obtaining candidate semantic information input by the user corresponding to the to-be-modified display content, and replacing the semantic information of the to-be-modified display content with the candidate semantic information to obtain the candidate semantic information of the to-be-modified display content.
Denison and Kumari teach the claim limitation that obtaining the second input content input by the user, and based on the second input content, modifying the semantic information of the to-be-modified display content to obtain the modified semantic information comprises:
obtaining change content input by the user corresponding to the semantic information of the to-be-modified display content, and based on the change content, modifying the semantic information of the to-be-modified display content to obtain the modified semantic information; or
obtaining candidate semantic information input by the user corresponding to the to-be-modified display content, and replacing the semantic information of the to-be-modified display content with the candidate semantic information to obtain the candidate semantic information of the to-be-modified display content
(Denison teaches at FIGS. 4C-1 to 4C-2 and Paragraph 0064 obtaining the second input 401’ “Remove clouds and make sky clear” and modifying the semantic information “the sky including clouds 415” to generate a modified semantic information 415’ of the generated image to generate the adjusted image. Moreover, the belt of the batman can be adjusted to exhibit different style 411’ wherein the different style could be specified using another image or using tuning comments provided via text input at the text field.
Kumari teaches at Paragraph 0024 that the diffusion model learns additional concepts and at Paragraph 0025 that a diffusion model may be fine-tuned to learn about the concept of a moon gate. Kumari teaches at Paragraph 0066-0068 and FIG. 4 that the relatively inaccurate representation of the moon-gate in the synthetic image is replaced with the accurate depiction of the moon-gate to generate the new synthetic image.
Kumari teaches at FIG. 4 Step 415 obtaining semantic information “Moon Gate” of the synthetic image as the to-be-modified display content. The Moon gate constitutes the key semantic information of the synthetic image. Kumari teaches at FIG. 4 Step 415 displaying the key semantic information “Moon Gate” in the synthetic image. Kumari teaches at Paragraph 0022 and Paragraph 0065-0068 that the diffusion model learns the concept represented in the input text and the diffusion model learns what a photo of “Moon Gate” and the diffusion model may be fine-tuned to learn about the concept of a moon gate. The diffusion model learns about the moon gate in the synthetic image by generating images of a gate without circular opening.
Kumari teaches at FIG. 4 and Paragraph 0068 obtaining a text prompt “A PHOTO of a MOONGATE” input by the user, and based on the text prompt at Step 430, modifying the semantic information of the synthetic image to produce a new synthetic image with the accurate depiction of a moon-gate as semantic information. Accurate depiction of a moon-gate is a modified depiction of an inaccurate moon-gate generated for the synthetic image at operation 415 where the image of a gate does not include a circular opening.
Kumari teaches at FIG. 4 and Paragraph 0068 based on the accurate depiction of a moon gate, modifying the synthetic image to obtain the new synthetic image with accurate depiction of a moon gate).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to have incorporated Denison/Kumari’s system and method, in which a first/second/third text input is provided repeatedly to modify the semantic information in one or more portions of an image in relation to the first/second/third text input, into Lin’s modification of the image based on the multiple stages of the user inputs on a graphical user interface. One of ordinary skill in the art would have modified the semantic information in one or more portions of an image based on the additional user inputs.
Re Claim 4:
The claim 4 encompasses the same scope of invention as that of the claim 1 except additional claim limitation that obtaining the second input content input by the user, and based on the second input content, modifying the semantic information of the to-be-modified display content to obtain the modified semantic information comprises:
obtaining the second input content input by the user, and modifying the semantic information of the to-be-modified display content to obtain modified semantic information; and/or
obtaining the second input content input by the user, and expanding the semantic information of the to-be-modified display content to obtain expanded semantic information.
Denison and Kumari teach the claim limitation that obtaining the second input content input by the user, and based on the second input content, modifying the semantic information of the to-be-modified display content to obtain the modified semantic information comprises:
obtaining the second input content input by the user, and modifying the semantic information of the to-be-modified display content to obtain modified semantic information; and/or
obtaining the second input content input by the user, and expanding the semantic information of the to-be-modified display content to obtain expanded semantic information
(Denison teaches at FIGS. 4C-1 to 4C-2 and Paragraph 0064 obtaining the second input 401’ “Remove clouds and make sky clear” and modifying the semantic information “the sky including clouds 415” to generate a modified semantic information 415’ of the generated image to generate the adjusted image. Moreover, the belt of the batman can be adjusted to exhibit different style 411’ wherein the different style could be specified using another image or using tuning comments provided via text input at the text field.
Kumari teaches at Paragraph 0024 that the diffusion model learns additional concepts and at Paragraph 0025 that a diffusion model may be fine-tuned to learn about the concept of a moon gate. Kumari teaches at Paragraph 0066-0068 and FIG. 4 that the relatively inaccurate representation of the moon-gate in the synthetic image is replaced with the accurate depiction of the moon-gate to generate the new synthetic image.
Kumari teaches at FIG. 4 Step 415 obtaining semantic information “Moon Gate” of the synthetic image as the to-be-modified display content. The Moon gate constitutes the key semantic information of the synthetic image. Kumari teaches at FIG. 4 Step 415 displaying the key semantic information “Moon Gate” in the synthetic image. Kumari teaches at Paragraph 0022 and Paragraph 0065-0068 that the diffusion model learns the concept represented in the input text and the diffusion model learns what a photo of “Moon Gate” and the diffusion model may be fine-tuned to learn about the concept of a moon gate. The diffusion model learns about the moon gate in the synthetic image by generating images of a gate without circular opening.
Kumari teaches at FIG. 4 and Paragraph 0068 obtaining a text prompt “A PHOTO of a MOONGATE” input by the user, and based on the text prompt at Step 430, modifying the semantic information of the synthetic image to produce a new synthetic image with the accurate depiction of a moon-gate as semantic information. Accurate depiction of a moon-gate is a modified depiction of an inaccurate moon-gate generated for the synthetic image at operation 415 where the image of a gate does not include a circular opening.
Kumari teaches at FIG. 4 and Paragraph 0068 based on the accurate depiction of a moon gate, modifying the synthetic image to obtain the new synthetic image with accurate depiction of a moon gate).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to have incorporated Denison/Kumari’s system and method, in which a first/second/third text input is provided repeatedly to modify the semantic information in one or more portions of an image in relation to the first/second/third text input, into Lin’s modification of the image based on the multiple stages of the user inputs on a graphical user interface. One of ordinary skill in the art would have modified the semantic information in one or more portions of an image based on the additional user inputs.
Re Claim 12:
The claim 12 encompasses the same scope of invention as that of the claim 10 except additional claim limitation that, when obtaining the second input content input by the user, and based on the second input content, modifying the semantic information of the to-be-modified display content to obtain the modified semantic information, the processor is further configured to:
obtain change content input by the user corresponding to the semantic information of the to-be-modified display content, and based on the change content, modify the semantic information of the to-be-modified display content to obtain the modified semantic information; or
obtain candidate semantic information input by the user corresponding to the to-be-modified display content, and replace the semantic information of the to-be-modified display content with the candidate semantic information to obtain the candidate semantic information of the to-be-modified display content.
The claim 12 is in parallel with the claim 3 in the form of an apparatus claim. The claim 12 is subject to the same rationale of rejection as the claim 3.
Re Claim 13:
The claim 13 encompasses the same scope of invention as that of the claim 10 except additional claim limitation that, when obtaining the second input content input by the user, and based on the second input content, modifying the semantic information of the to-be-modified display content to obtain the modified semantic information, the processor is further configured to:
obtain the second input content input by the user, and modify the semantic information of the to-be-modified display content to obtain modified semantic information; and/or
obtain the second input content input by the user, and expand the semantic information of the to-be-modified display content to obtain expanded semantic information.
Claim 13 parallels claim 4 in the form of an apparatus claim and is subject to the same rationale of rejection as claim 4.
Claims 5 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Lin et al. US-PGPUB No. 2024/0127410 (hereinafter Lin) in view of Wu et al. US-PGPUB No. 2023/0217097 (hereinafter Wu), Brandt et al. US-PGPUB No. 2024/0169624 (hereinafter Brandt), Lee et al. US-PGPUB No. 2021/0390700 (hereinafter Lee), and Denison US-PGPUB No. 2024/0193821 (hereinafter Denison).
Re Claim 5:
Claim 5 encompasses the same scope of invention as claim 4 except for the additional claim limitation that modifying the semantic information of the to-be-modified display content to obtain the modified semantic information comprises:
through changing a category of the to-be-modified display content, modifying the semantic information of the to-be-modified display content to obtain the modified semantic information.
Denison and Lee teach the claim limitation that modifying the semantic information of the to-be-modified display content to obtain the modified semantic information comprises:
through changing a category of the to-be-modified display content, modifying the semantic information of the to-be-modified display content to obtain the modified semantic information (
Denison teaches at FIGS. 4C-1 to 4C-2 and Paragraph 0064 obtaining the second input 401’ “Remove clouds and make sky clear” and modifying the semantic information “the sky including clouds 415” to generate modified semantic information 415’ of the generated image, thereby producing the adjusted image. Moreover, the belt of the Batman can be adjusted to exhibit a different style 411’, wherein the different style could be specified using another image or using tuning comments provided via text input at the text field.
Lee teaches at FIG. 3 and Paragraph 0053 through changing a category (changing from “a rightmost couch” to “a couch with arms”), modifying the semantic annotation information of the to-be-modified display content to obtain the modified semantic annotation information).
It would have been obvious to one of ordinary skill in the art before the filing date of the instant application to have incorporated Denison/Lee’s system and method, in which first/second/third text inputs are provided repeatedly to modify the semantic information in one or more portions of an image in relation to each text input, into Lin’s modification of the image based on multiple stages of user inputs on a graphical user interface. One of ordinary skill in the art would have modified the semantic information in one or more portions of an image based on the additional user inputs.
Re Claim 14:
Claim 14 encompasses the same scope of invention as claim 13 except for the additional claim limitation that, when modifying the semantic information of the to-be-modified display content to obtain the modified semantic information, the processor is further configured to:
through changing a category of the to-be-modified display content, modify the semantic information of the to-be-modified display content to obtain the modified semantic information.
Claim 14 parallels claim 5 in the form of an apparatus claim and is subject to the same rationale of rejection as claim 5.
Claims 6-7 and 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over Lin et al. US-PGPUB No. 2024/0127410 (hereinafter Lin) in view of Wu et al. US-PGPUB No. 2023/0217097 (hereinafter Wu), Brandt et al. US-PGPUB No. 2024/0169624 (hereinafter Brandt), Aggarwal et al. US-PGPUB No. 2022/0343561 (hereinafter Aggarwal), and Denison US-PGPUB No. 2024/0193821 (hereinafter Denison).
Re Claim 6:
Claim 6 encompasses the same scope of invention as claim 1 except for the additional claim limitation that at least two to-be-modified display contents need to be modified, obtaining the second input content input by the user, and based on the second input content, modifying the semantic information of the to-be-modified display content to obtain the modified semantic information comprises:
obtaining the second input content input by the user corresponding to the at least two to-be-modified display contents, and based on the second input content, modifying the semantic information of the at least two to-be-modified display contents to obtain the modified semantic information.
Denison and Aggarwal further teach the claim limitation that at least two to-be-modified display contents need to be modified, obtaining the second input content input by the user, and based on the second input content, modifying the semantic information of the to-be-modified display content to obtain the modified semantic information comprises:
obtaining the second input content input by the user corresponding to the at least two to-be-modified display contents, and based on the second input content, modifying the semantic information of the at least two to-be-modified display contents to obtain the modified semantic information (Denison teaches at FIGS. 4C-1 to 4C-2 and Paragraph 0064 obtaining the second input 401’ “Remove clouds and make sky clear” and modifying the semantic information “the sky including clouds 415” to generate modified semantic information 415’ of the generated image, thereby producing the adjusted image. Moreover, the belt of the Batman can be adjusted to exhibit a different style 411’, wherein the different style could be specified using another image or using tuning comments provided via text input at the text field.
Aggarwal teaches at Paragraph 0026 that the user may input the image and say a phrase “convert grey to blue”, and at Paragraph 0040 that a user may use speech to increase saturation and lightness of the replaced color and provide semantic segmentation areas. Aggarwal teaches at Paragraph 0082 that the system generates a target color embedding corresponding to a target color by applying a color text embedding network to a target color text input; at Paragraph 0083 that the system replaces the source color with the target color in the image based on the color segmentation and the target color embeddings; at Paragraph 0113 that the user may use a slider to vary the lightness and saturation values and that the user provides ink blue as a target color; at Paragraph 0114 that the process may be repeated for a different color; at Paragraph 0115 that, if a user wants to convert blue to red, the colors blue and red are recognized by the tool; at Paragraph 0117 that models such as semantic-based segmentation may be used to get pre-segmented regions where a user gets color-based segmented portions; and at Paragraph 0034 that speech may be used to increase saturation and lightness of the replaced color and the size of the color regions to segment, and to provide semantic segmentation areas).
It would have been obvious to one of ordinary skill in the art before the filing date of the instant application to have incorporated Denison/Aggarwal’s system and method, in which first/second/third text inputs are provided repeatedly to modify the semantic information in one or more portions of an image in relation to each text input, into Lin’s modification of the image based on multiple stages of user inputs on a graphical user interface. One of ordinary skill in the art would have modified the semantic information in one or more portions of an image based on the additional user inputs.
Re Claim 7:
Claim 7 encompasses the same scope of invention as claim 1 except for the additional claim limitation that based on the modified semantic information, modifying the display content to obtain the modified display content comprises:
based on the modified semantic information, obtaining a candidate display content; and
replacing the to-be-modified display content in the display content with the candidate display content to obtain the modified display content.
Denison and Aggarwal further teach the claim limitation that based on the modified semantic information, modifying the display content to obtain the modified display content comprises:
based on the modified semantic information, obtaining a candidate display content; and
replacing the to-be-modified display content in the display content with the candidate display content to obtain the modified display content (
Denison teaches at FIGS. 4C-1 to 4C-2 and Paragraph 0064 obtaining the second input 401’ “Remove clouds and make sky clear” and modifying the semantic information “the sky including clouds 415” to generate modified semantic information 415’ of the generated image, thereby producing the adjusted image. Moreover, the belt of the Batman can be adjusted to exhibit a different style 411’, wherein the different style could be specified using another image or using tuning comments provided via text input at the text field.
Aggarwal teaches at Paragraph 0026 that the user may input the image and say a phrase “convert grey to blue”, and at Paragraph 0040 that a user may use speech to increase saturation and lightness of the replaced color and provide semantic segmentation areas. Aggarwal teaches at Paragraph 0082 that the system generates a target color embedding corresponding to a target color by applying a color text embedding network to a target color text input; at Paragraph 0083 that the system replaces the source color with the target color in the image based on the color segmentation and the target color embeddings; at Paragraph 0113 that the user may use a slider to vary the lightness and saturation values and that the user provides ink blue as a target color; at Paragraph 0114 that the process may be repeated for a different color; at Paragraph 0115 that, if a user wants to convert blue to red, the colors blue and red are recognized by the tool; at Paragraph 0117 that models such as semantic-based segmentation may be used to get pre-segmented regions where a user gets color-based segmented portions; and at Paragraph 0034 that speech may be used to increase saturation and lightness of the replaced color and the size of the color regions to segment, and to provide semantic segmentation areas).
It would have been obvious to one of ordinary skill in the art before the filing date of the instant application to have incorporated Denison/Aggarwal’s system and method, in which first/second/third text inputs are provided repeatedly to modify the semantic information in one or more portions of an image in relation to each text input, into Lin’s modification of the image based on multiple stages of user inputs on a graphical user interface. One of ordinary skill in the art would have modified the semantic information in one or more portions of an image based on the additional user inputs.
Re Claim 15:
Claim 15 encompasses the same scope of invention as claim 10 except for the additional claim limitation that at least two to-be-modified display contents need to be modified, and when obtaining the second input content input by the user, and based on the second input content, modifying the semantic information of the to-be-modified display content to obtain the modified semantic information, the processor is further configured to:
obtain the second input content input by the user corresponding to the at least two to-be-modified display contents, and based on the second input content, modify the semantic information of the at least two to-be-modified display contents to obtain the modified semantic information.
Claim 15 parallels claim 6 in the form of an apparatus claim and is subject to the same rationale of rejection as claim 6.
Re Claim 16:
Claim 16 encompasses the same scope of invention as claim 10 except for the additional claim limitation that when based on the modified semantic information, modifying the display content to obtain the modified display content, the processor is further configured to:
based on the modified semantic information, obtain a candidate display content; and
replace the to-be-modified display content in the display content with the candidate display content to obtain the modified display content.
Claim 16 parallels claim 7 in the form of an apparatus claim and is subject to the same rationale of rejection as claim 7.
Claims 8 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Lin et al. US-PGPUB No. 2024/0127410 (hereinafter Lin) in view of Wu et al. US-PGPUB No. 2023/0217097 (hereinafter Wu), Brandt et al. US-PGPUB No. 2024/0169624 (hereinafter Brandt), Aggarwal et al. US-PGPUB No. 2022/0343561 (hereinafter Aggarwal), Lee et al. US-PGPUB No. 2021/0390700 (hereinafter Lee), and Denison US-PGPUB No. 2024/0193821 (hereinafter Denison).
Re Claim 8:
Claim 8 encompasses the same scope of invention as claim 1 except for the additional claim limitation that obtaining the semantic information of the to-be-modified display content comprises:
highlighting the to-be-modified display content; and
in response to a triggering instruction of the to-be-modified display content, obtaining the semantic information of the to-be-modified display content.
Denison, Lee and Aggarwal further teach the claim limitation that obtaining the semantic information of the to-be-modified display content comprises:
highlighting the to-be-modified display content; and
in response to a triggering instruction of the to-be-modified display content, obtaining the semantic information of the to-be-modified display content (
Denison teaches at FIGS. 4C-1 to 4C-2 and Paragraph 0064 obtaining the second input 401’ “Remove clouds and make sky clear” and modifying the semantic information “the sky including clouds 415” to generate modified semantic information 415’ of the generated image, thereby producing the adjusted image. Moreover, the belt of the Batman can be adjusted to exhibit a different style 411’, wherein the different style could be specified using another image or using tuning comments provided via text input at the text field.
Lee teaches at FIG. 3 and Paragraphs 0052-0053 highlighting the to-be-modified display content in response to the first query 320 by providing an image mask outlining the first object 305, and, in response to the first query 320, obtaining the semantic annotation information outlining the first object 305.
Aggarwal teaches at FIG. 2 and at Paragraphs 0040-0045 that the to-be-modified input image has a segmentation area highlighted with a shaded color, where the background is crosshatched to denote a single color to be replaced. The segmented image 305 is segmented into two regions, a light region and a dark region. The dark region will be replaced with a source color, the color-replaced image 310 is the final image produced by the color replacement system, and the segmented background of the image is replaced by the target color.
Aggarwal teaches at Paragraph 0026 that the user may input the image and say a phrase “convert grey to blue”, and at Paragraph 0040 that a user may use speech to increase saturation and lightness of the replaced color and provide semantic segmentation areas. Aggarwal teaches at Paragraph 0082 that the system generates a target color embedding corresponding to a target color by applying a color text embedding network to a target color text input; at Paragraph 0083 that the system replaces the source color with the target color in the image based on the color segmentation and the target color embeddings; at Paragraph 0113 that the user may use a slider to vary the lightness and saturation values and that the user provides ink blue as a target color; at Paragraph 0114 that the process may be repeated for a different color; at Paragraph 0115 that, if a user wants to convert blue to red, the colors blue and red are recognized by the tool; at Paragraph 0117 that models such as semantic-based segmentation may be used to get pre-segmented regions where a user gets color-based segmented portions; and at Paragraph 0034 that speech may be used to increase saturation and lightness of the replaced color and the size of the color regions to segment, and to provide semantic segmentation areas).
It would have been obvious to one of ordinary skill in the art before the filing date of the instant application to have incorporated Denison/Lee/Aggarwal’s system and method, in which first/second/third text inputs are provided repeatedly to modify the semantic information in one or more portions of an image in relation to each text input, into Lin’s modification of the image based on multiple stages of user inputs on a graphical user interface. One of ordinary skill in the art would have modified the semantic information in one or more portions of an image based on the additional user inputs.
Re Claim 17:
Claim 17 encompasses the same scope of invention as claim 10 except for the additional claim limitation that when obtaining the semantic information of the to-be-modified display content, the processor is further configured to:
highlight the to-be-modified display content; and
in response to a triggering instruction of the to-be-modified display content, obtain the semantic information of the to-be-modified display content.
Claim 17 parallels claim 8 in the form of an apparatus claim and is subject to the same rationale of rejection as claim 8.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JIN CHENG WANG whose telephone number is (571)272-7665. The examiner can normally be reached Mon-Fri 8:00-5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, King Poon, can be reached at 571-270-0728. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JIN CHENG WANG/Primary Examiner, Art Unit 2617