DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Specification
The title of the invention is not descriptive. A new title is required that is clearly indicative of the invention to which the claims are directed. The Examiner suggests changing the title to: “METHOD FOR PROCESSING COMMENTS TO GENERATE IMAGES AND ELECTRONIC DEVICE.”
Compact Prosecution
With respect to Claim Interpretation, the Examiner has provided notes regarding “[BRI on the record]” throughout the Office Action, so that the record is clear about the scope of the claimed invention and about the basis for the Examiner’s analyses. A clear record of the claim interpretation could expedite prosecution by allowing examination to focus on Applicant’s inventive concept and its comparison with related prior art.
If there are disagreements, Applicant may present an alternative interpretation based on MPEP 2111. The Examiner will adopt Applicant’s interpretation on the record, if Applicant’s interpretation is reasonable and/or arguments are persuasive.
Applicant may amend claims relying on the Examiner’s claim interpretation provided on the record.
Objections
Claims 1-5, 10, 14, 16-18, and 20 are objected to because of the following informalities: the claims recite “in a case” or “in the case” and their scope is unclear. Appropriate correction or clarification on the interpretation is required.
For the art rejections provided in this Office Action, the Examiner did not treat any of the limitations containing “in a case” as optional.
However, the Examiner requests that Applicant clarify on the record whether “in a case” creates contingent limitations that “require[] only those steps that must be performed and does not include steps that are not required to be performed because the condition(s) precedent are not met.” MPEP 2111.04.
If they are contingent limitations, are they different from other contingent limitations, e.g., “if”? Please clarify what happens when the condition(s) precedent after “in a case” is not met. Does it make the steps not required? Is the interpretation different for device or CRM claims?
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-3, 8, 15-17, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Ackerman et al. (US 20240242428 A1) in view of Marek (US 20150033109 A1).
Regarding Claim 1, Ackerman teaches A method for processing comments (“The text content is then processed using an artificial intelligence algorithm configured to perform text-to-image processing to produce image content.” Ackerman Abstract. Text comments, a type of text, can be processed by the same text-to-image processing. The Examiner’s secondary reference will explicitly disclose processing comments.), comprising:
displaying a text-to-image-generation-operation region (Ackerman Fig. 3E 330), wherein the text-to-image-generation-operation region comprises an image generation component (Ackerman Fig. 3E’s
[image: media_image1.png]
) (
[BRI on the record]
With respect to “post-operation,” the Examiner is reading it to mean: an operation of publishing/posting, over an electronic network, comment(s) about item(s). The interpretation is made based on the context within the claim, the plain meaning of the terms, and the specification.
[0002]The present disclosure relates to the field of Internet technologies, and in particular, to a method for processing comments and an electronic device.
[0031]When viewing videos or images on a website or application, users may post comments on the videos or images. Typically, comments reflect the viewers' opinions on the videos or images and serve as a means to share information with other viewers. As the volume of content increases, users expect comments to provide more capabilities so as to increase the efficiency of information sharing, conserve computing resources, and improve platform service efficiency.
Spec. ¶¶ 2, 31.
With respect to “current resource,” the Examiner is reading it to mean: an item being currently commented on. Resource is differentiated from an event. The interpretation is made based on the context within the claim and the plain meaning of the terms.
With respect to “post-operation region is configured to post comment information,” the Examiner is reading the limitation to mean a computer graphical interface that is configured to post comment information. This interpretation is made in light of the specification.
[0002]The present disclosure relates to the field of Internet technologies, and in particular, to a method for processing comments and an electronic device.
[0035]In some embodiments, the electronic device 012 includes but is not limited to a smart phone, a desktop computer, a tablet computer, a notebook computer, a smart speaker, a digital assistant, an Augmented Reality (AR)/Virtual Reality (VR) device, and a smart wearable device. The electronic device 012 can run software, such as applications and applets. In some embodiments, operating systems running on the electronic device 012 include but are not limited to Android, iOS, Linux, windows, and Unix.
[0036]In some embodiments, the electronic device displays a post-operation region corresponding to a current resource, the post-operation region is configured to post comment information, and the post-operation region includes an image generation component. In the case that the post-operation region includes source comment information, an image set is displayed in the post-operation region in response to triggering the image generation component. The image set is generated based on the source comment information.
Spec. ¶¶ 2, 35-36.
With respect to “image generation component,” the Examiner is reading the limitation to be a software component or computer graphical interface component related to image generation.
[Mapping Analysis]
[image: media_image2.png]
Ackerman Fig. 3E’s posted text: “A futuristic cityscape with flying cars and towering skyscrapers,” and Ackerman Fig. 3E 356, 354, 352, and 350 are generated based on the input text. See Ackerman Abstract.); and
displaying an image set (Ackerman Fig. 3E 356, 354, 352, and 350) in the text-to-image-generation-operation region (Ackerman Fig. 3E 300) in response to triggering the image generation component in a case that the text-to-image-generation-operation region (Ackerman Fig. 3E 300) comprises source comment information (Ackerman Fig. 3E’s posted text: “A futuristic cityscape with flying cars and towering skyscrapers”), wherein the image set is generated based on the source comment information (“As explained above, as changes to the prompt 332 and/or the modified prompt 336 are made, one or more new pieces of image content may be generated, shown in FIG. 3E as image content 350, 352, 354, 356. Each of the pieces of image content may correspond to a particular prompt version. For example, the initial image 350 may be generated at a time t=0 (i.e., when the initial prompt is received), and each of the image 352-356 may correspond to an image generated based on changes to the prompt made using the functionality of the interface 330.” Ackerman ¶ 65.).
Ackerman does not explicitly disclose the image-generation-operation region is part of a post-operation region corresponding to a current resource, wherein the post-operation region is configured to post comment information.
Marek teaches the image-generation-operation region is part of a post-operation region corresponding to a current resource, wherein the post-operation region is configured to post comment information (
“In addition to a text comment or alternative to a text comment, a user may insert an image, a video, an audio comment, or make a drawing in the annotation field 220. To insert an image, a user would activate the ‘Image’ button and then would be provided with pre-selected image options or allowed to upload an image to be inserted. In an embodiment, a user may be presented with an option to take a picture with a camera of the end user device 116 to be uploaded and inserted. To insert a video, a user would activate the ‘Video’ button and then would be provided with pre-selected video options or allowed to upload a video to be inserted. In an embodiment, a user may be presented with an option to take video with a video camera of the end user device 116 to be uploaded and inserted.” Marek ¶ 36. “The multimedia annotation device, system, and method described herein enables a user to add annotations, e.g. text comments, images, illustrations, audio comments, video, etc., to streaming multimedia objects for presentation to a user viewing the multimedia object without editing the multimedia object to include the annotations and while streaming the multimedia from its source host.” Marek ¶ 5. Marek Figs. 2-7.
Here, selected/generated image(s) could be included in a comment/annotation to be posted/added.
After the combination of Ackerman and Marek, Ackerman’s image generated based on a text input could be used as the selected/generated image(s) to be included in a comment. The image-generation-operation region also becomes part of a post-operation region, because the image generated could be used as part of a comment to be posted.
There are many options for a person of ordinary skill in the art to combine interface features from Ackerman and Marek. The following example, which combines Ackerman’s Fig. 3E and Marek’s Fig. 5, is provided for illustration only.
[image: media_image3.png]
“Various embodiments provide techniques for annotating multimedia objects without editing the original source file of the multimedia object and for sharing the multimedia object and annotations on the internet.” Marek ¶ 36.
The current resource is mapped to the disclosed multimedia objects.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Marek’s comment/annotation creation with primary reference Ackerman. One of ordinary skill in the art would be motivated to allow a user to create comments with various types of information. This would allow a user to be more effective in expressing the user’s comment or opinion. “The multimedia annotation device, system, and method described herein enables a user to add annotations, e.g. text comments, images, illustrations, audio comments, video, etc., to streaming multimedia objects for presentation to a user viewing the multimedia object without editing the multimedia object to include the annotations and while streaming the multimedia from its source host.” Marek ¶ 5.
Claims 16 and 20 are substantially similar to Claim 1. The rejection analyses of Claim 1 based on Ackerman in view of Marek are applied to Claims 16 and 20. In addition, Claim 16 recites “An electronic device, comprising: a processor; and a memory configured to store instructions that, when executed by the processor, cause the processor to: . . .” (Ackerman Fig. 1 130; Ackerman ¶¶ 29, 103). In addition, Claim 20 recites “A non-transitory computer-readable storage medium storing instructions that, when executed by a processor of an electronic device, cause the electronic device to: . . .” (Ackerman Fig. 1 130; Ackerman ¶¶ 29, 103).
Regarding Claim 2, Ackerman in view of Marek teaches The method according to claim 1,
wherein said displaying the image set in the post-operation region in response to triggering the image generation component in the case that the post-operation region comprises the source comment information (addressed in Claim 1) comprises:
changing the image generation component from a non-triggerable state to a triggerable state in a case that the source comment information meets a preset condition (
“As explained above, as changes to the prompt 332 and/or the modified prompt 336 are made, one or more new pieces of image content may be generated, shown in FIG. 3E as image content 350, 352, 354, 356. Each of the pieces of image content may correspond to a particular prompt version. For example, the initial image 350 may be generated at a time t=0 (i.e., when the initial prompt is received), and each of the image 352-356 may correspond to an image generated based on changes to the prompt made using the functionality of the interface 330.” Ackerman ¶ 65
The preset condition is mapped to the condition whether the prompt 332 has been entered or changed.); and
displaying the image set in the post-operation region in response to triggering the image generation component in the triggerable state (
Ackerman teaches triggering the image generation after preset condition, e.g., changes to the prompt 332, is satisfied, “As explained above, as changes to the prompt 332 and/or the modified prompt 336 are made, one or more new pieces of image content may be generated, shown in FIG. 3E as image content 350, 352, 354, 356. Each of the pieces of image content may correspond to a particular prompt version. For example, the initial image 350 may be generated at a time t=0 (i.e., when the initial prompt is received), and each of the image 352-356 may correspond to an image generated based on changes to the prompt made using the functionality of the interface 330.” Ackerman ¶ 65.).
Claim 17 is substantially similar to Claim 2. The rejection analyses of Claim 2 based on Ackerman in view of Marek are applied to Claim 17.
Regarding Claim 3, Ackerman in view of Marek teaches The method according to claim 2,
wherein said changing the image generation component from the non-triggerable state to the triggerable state in the case that the source comment information meets the preset condition (addressed in parent claim) comprises one of:
changing the image generation component from the non-triggerable state to the triggerable state in a case that the source comment information is character comment information and a number of characters in the character comment information meets a preset number of characters;
changing the image generation component from the non-triggerable state to the triggerable state in a case that the source comment information is image comment information (“In addition to a text comment or alternative to a text comment, a user may insert an image, a video, an audio comment, or make a drawing in the annotation field 220. To insert an image, a user would activate the ‘Image’ button and then would be provided with pre-selected image options or allowed to upload an image to be inserted. In an embodiment, a user may be presented with an option to take a picture with a camera of the end user device 116 to be uploaded and inserted. To insert a video, a user would activate the ‘Video’ button and then would be provided with pre-selected video options or allowed to upload a video to be inserted. In an embodiment, a user may be presented with an option to take video with a video camera of the end user device 116 to be uploaded and inserted.” Marek ¶ 36.
The image comment information is mapped to image selected as part of the comment, e.g., through clicking on the “image” button to insert an image into a comment/annotation.); or
changing the image generation component from the non-triggerable state to the triggerable state in a case that the source comment information comprises image comment information and character comment information.
Regarding Claim 8, Ackerman in view of Marek teaches The method according to claim 1,
wherein the post-operation region (There are many options for a person of ordinary skill in the art to combine interface features from Ackerman and Marek. The following is an example, which combines Ackerman’s Fig. 3E and Marek’s Fig. 5.
[image: media_image3.png]
) comprises an information input region (
[image: media_image4.png]
) and an input operation region (includes Ackerman’s Fig. 3E
[image: media_image2.png]
), and said displaying the image set in the post-operation region further comprises:
displaying the image set in the input operation region (Ackerman Fig. 3E’s
[image: media_image1.png]
);
displaying a target image in the information input region (
[image: media_image4.png]
, which shows a generated image that has been selected) in response to a select operation on the target image in the image set, wherein the information input region comprises a posting component (
“This process may continue until a time t=n, which correspond to a time when a stop criterion is satisfied. As a non-limiting example, the stop criterion may be when the image content output based on the prompt sufficiently matches the user's vision of what the prompt is describing.” Ackerman ¶ 65.
[image: media_image5.png]
“Alternatively yet, an annotation may be an image selected from a pre-defined set of images that are presented to the user for selection or an image that is uploaded by the user, representatively illustrated in FIG. 5. Similar to the image, an annotation may be a video that when displayed would begin playing.” Marek ¶ 42.
“For example, the annotation and its related data are received by the application server 114 when a user activates the ‘OK’ button in the annotation interface 209.” Marek ¶ 57.); and
displaying at least the target image on a comment display page in response to triggering the posting component (
[image: media_image6.png]
“Annotations, such as text comment 240 shown in FIG. 6 or illustration 242 shown in FIG. 7, are presented to the viewer in sync with the play back of the video 122. The annotations, when presented to the viewer, are located within the transparent overlay 201 and over top of the media player 202. Preferably, the annotations are located within the transparent overlay 201 and over top the video portion 214 of the media. Each annotation is presented to the view at the time and duration determined by the user who created the annotation. Multiple separate annotations may be presented concurrently or in an overlapping time period.” Marek ¶ 44.
Marek teaches an annotation/comment may also include a selected/generated image, stating “In addition to a text comment or alternative to a text comment, a user may insert an image, a video, an audio comment, or make a drawing in the annotation field 220. To insert an image, a user would activate the ‘Image’ button and then would be provided with pre-selected image options or allowed to upload an image to be inserted.” Marek ¶ 36.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Marek’s comment/annotation creation with primary reference Ackerman. One of ordinary skill in the art would be motivated to allow a user to create comments with various types of information. This would allow a user to be more effective in expressing the user’s comment or opinion. “The multimedia annotation device, system, and method described herein enables a user to add annotations, e.g. text comments, images, illustrations, audio comments, video, etc., to streaming multimedia objects for presentation to a user viewing the multimedia object without editing the multimedia object to include the annotations and while streaming the multimedia from its source host.” Marek ¶ 5. The “OK” button would allow the system to receive confirmation from a user, so that the system would react precisely according to the user’s preference. Displaying the selected/generated image would allow the user to preview the image and/or allow others to view the shared annotations.
Claim 19 is substantially similar to Claim 8. The rejection analyses of Claim 8 based on Ackerman in view of Marek are applied to Claim 19.
Regarding Claim 15, Ackerman in view of Marek teaches The method according to claim 1, wherein the image set is generated based on the source comment information by an image generation model corresponding to the image generation component (“The text content is then processed using an artificial intelligence algorithm configured to perform text-to-image processing to produce image content.” Ackerman Abstract. The image generation model is mapped to the disclosed text-to-image model.).
Claims 4 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Ackerman in view of Marek as applied to Claims 1 and 16, in further view of Hicks (US 20140306899 A1).
Regarding Claim 4, Ackerman in view of Marek teaches The method according to claim 1, wherein said displaying the post-operation region corresponding to the current resource (addressed in Claim 1) comprises:
displaying an information input region (regions in the integrated interface related to input) and an input operation region (regions in the integrated interface related to changing font size, text color …) corresponding to the current resource (Ackerman in view of Marek could teach an integrated interface like the one below:
[image: media_image3.png]
); and
displaying the image generation component in the information input region (
[image: media_image4.png]
, which shows a generated image that has been selected); and
said displaying the image set in the post-operation region in response to triggering the image generation component in the case that the post-operation region comprises the source comment information (Ackerman teaches triggering the image generation after preset condition, e.g., changes to the prompt 332, is satisfied, “As explained above, as changes to the prompt 332 and/or the modified prompt 336 are made, one or more new pieces of image content may be generated, shown in FIG. 3E as image content 350, 352, 354, 356. Each of the pieces of image content may correspond to a particular prompt version. For example, the initial image 350 may be generated at a time t=0 (i.e., when the initial prompt is received), and each of the image 352-356 may correspond to an image generated based on changes to the prompt made using the functionality of the interface 330.” Ackerman ¶ 65.) comprises:
displaying the image set in the input operation region in response to triggering the image generation component in a case that the information input region displays the source comment information (“As explained above, as changes to the prompt 332 and/or the modified prompt 336 are made, one or more new pieces of image content may be generated, shown in FIG. 3E as image content 350, 352, 354, 356. Each of the pieces of image content may correspond to a particular prompt version. For example, the initial image 350 may be generated at a time t=0 (i.e., when the initial prompt is received), and each of the image 352-356 may correspond to an image generated based on changes to the prompt made using the functionality of the interface 330.” Ackerman ¶ 65.
Here, the Examiner has explained that the regions in the integrated interface related to input are mapped to the information input region, including the input field for prompt 332.).
However, Ackerman in view of Marek does not explicitly disclose displaying a virtual keyboard panel in the input operation region.
Hicks teaches displaying a virtual keyboard panel in the input operation region (
“The user interface for these touch sensitive computing devices typically include a virtual keyboard (also referred to as a soft keyboard) for entering text and other characters. The virtual keyboard is typically displayed when a user is interacting with a text entry box or other various text input fields.” Hicks ¶ 2.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Hicks’ virtual keyboard with Ackerman in view of Marek. One of ordinary skill in the art would be motivated to provide a more convenient way to enter text. If a display were not connected to a physical keyboard, a virtual keyboard would be even more helpful to the user.
Claim 18 is substantially similar to some of Claim 4’s limitations. The rejection analyses of Claim 4 based on Ackerman in view of Marek and Hicks are applied to Claim 18.
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Ackerman in view of Marek and Hicks as applied to Claim 4, in further view of Cole et al. (US 20160019370 A1).
Regarding Claim 5, Ackerman in view of Marek and Hicks teaches The method according to claim 4,
wherein said displaying the image set in the input operation region (Ackerman Fig. 3E’s
[image: media_image1.png]
) in response to triggering the image generation component in the case that the information input region (regions in the integrated interface related to input) displays the source comment information (
Ackerman teaches triggering the image generation after preset condition, e.g., changes to the prompt 332, is satisfied, “As explained above, as changes to the prompt 332 and/or the modified prompt 336 are made, one or more new pieces of image content may be generated, shown in FIG. 3E as image content 350, 352, 354, 356. Each of the pieces of image content may correspond to a particular prompt version. For example, the initial image 350 may be generated at a time t=0 (i.e., when the initial prompt is received), and each of the image 352-356 may correspond to an image generated based on changes to the prompt made using the functionality of the interface 330.” Ackerman ¶ 65.) comprises:
displaying (Ackerman Fig. 3E:
[image: media_image7.png]
Ackerman Fig. 3F:
[image: media_image8.png]
) in response to triggering the image generation component in a case that the source comment information displayed in the information input region comprises image comment information (Ackerman Fig. 3F:
[image: media_image9.png]
) and character comment information (
[image: media_image10.png]
),
wherein the image panel comprises first generation information and second generation information (The first generation information and second generation information are mapped to parameters, e.g., weights, words, style, etc., used to generate images.); and
displaying on the image panel the image set generated based on the image comment information in response to triggering the first generation information (“As explained above, as changes to the prompt 332 and/or the modified prompt 336 are made, one or more new pieces of image content may be generated, shown in FIG. 3E as image content 350, 352, 354, 356. Each of the pieces of image content may correspond to a particular prompt version. For example, the initial image 350 may be generated at a time t=0 (i.e., when the initial prompt is received), and each of the image 352-356 may correspond to an image generated based on changes to the prompt made using the functionality of the interface 330.” Ackerman ¶ 65. “To execute the term modification, the interface 370 may include interactive elements 376, shown as dropdown menus, that enable the user to select which term for an object depicted in the image the user would like to change, and selection of one of the interactive elements 376 may display a list of terms that are selected by the user to change the term used to identify one or more objects in the image, such as a list of synonyms for the particular object the user would like to change. As the user makes changes via the interactive elements 374 and 376, a modified image may be displayed to the user.” Ackerman ¶ 66.); or
displaying on the image panel the image set generated based on the character comment information in response to triggering the second generation information (The same analyses for the previous limitation related to the “first generation information” applies here.).
Ackerman in view of Marek and Hicks does not explicitly disclose displaying an image panel in response to triggering the image generation component.
Cole teaches displaying an image panel in response to triggering the image generation component (“. . . these screens may be implemented as pop-up windows or may be shown by the processing circuitry in response to user input, such as pressing a button.” Cole ¶ 84.
After Ackerman in view of Marek and Hicks is combined with Cole, the image generation component may correspond to or include a button. The image panel is mapped to a pop-up window.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Cole’s popup windows with Ackerman in view of Marek and Hicks. One of ordinary skill in the art would be motivated to make it easier for a user to interact with the interface for certain scenarios. It may increase interface real estate to display controls and options. The user would be led to focus on the options within the pop-up windows.
Claims 6-7 are rejected under 35 U.S.C. 103 as being unpatentable over Ackerman in view of Marek as applied to Claim 1, in further view of Kim et al. (US 20190294287 A1).
Regarding Claim 6, Ackerman in view of Marek teaches The method according to claim 1.
Ackerman in view of Marek does not explicitly disclose further comprising: displaying a first image as enlarged on the post-operation region in response to a preview instruction on the first image in the image set.
Kim teaches further comprising: displaying a first image as enlarged on the post-operation region in response to a preview instruction on the first image in the image set (“Referring to FIG. 15, a first user interface 1510 may provide a preview image 1515 based on a force input. For example, when the force input is detected while a touch input is being held, the processor 120 may display the preview image 1515 regarding an image corresponding to the touch input. For example, the preview image 1515 may be an image that is selected from a plurality of images provided in a gallery application and is slightly enlarged. The preview image 1515 may be different from an image that is provided based on a touch input for selecting any one of the plurality of images. When the image is selected (for example, a touch input), the processor 120 may display the selected image on a substantially entire region of the display (for example, the display 430). However, when the force input is detected while the touch input is being held, the processor 120 may display the preview image 1515 which is a slightly larger image of the image corresponding to the touch input.” Kim ¶ 174.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Kim’s image preview technique with Ackerman in view of Marek. One of ordinary skill in the art would be motivated to make it easier to see the details of an image. Consequently, the user would make better-informed decisions related to the images.
Regarding Claim 7, Ackerman in view of Marek and Kim teaches The method according to claim 6,
wherein said displaying the first image as enlarged on the post-operation region (addressed in the analyses for the previous claim) comprises:
displaying the first image as enlarged (Kim:
[Image: media_image11.png]
) and operation information for the first image on the post-operation region (Ackerman Fig. 3E:
[Image: media_image7.png]
), wherein the operation information comprises at least one of
regeneration information (see analysis below) or “save to a preset storage region” information (see analysis below); and
generating, based on the source comment information, a second image for replacing the first image in response to triggering the regeneration information (“As explained above, as changes to the prompt 332 and/or the modified prompt 336 are made, one or more new pieces of image content may be generated, shown in FIG. 3E as image content 350, 352, 354, 356. Each of the pieces of image content may correspond to a particular prompt version. For example, the initial image 350 may be generated at a time t=0 (i.e., when the initial prompt is received), and each of the image 352-356 may correspond to an image generated based on changes to the prompt made using the functionality of the interface 330.” Ackerman ¶ 65.); or
saving the first image to a preset storage region in response to triggering the “save to a preset storage region” information (
“Step 914 records the received annotation and its related data within an annotation database 130 for later retrieval when the multimedia object is replayed with the annotations. Non-text annotations, including audio, images, or video are saved to storage so that they can later be retrieved during playing of the associated multimedia object. For example, the non-text annotations may be saved to storage 124 or storage 128.” Marek ¶ 58.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Kim’s image preview technique with Ackerman in view of Marek. One of ordinary skill in the art would be motivated to make it easier to see the details of an image and to reuse/reload data. Consequently, the user would make better-informed decisions related to the images.
Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Ackerman in view of Marek as applied to Claim 8, in further view of Tambi et al. (US 20240095275 A1).
Regarding Claim 9, Ackerman in view of Marek teaches The method according to claim 8,
wherein said displaying at least the target image on the comment display page in response to triggering the posting component (addressed in the analyses for the parent claim) comprises:
displaying the target image
“This process may continue until a time t=n, which correspond to a time when a stop criterion is satisfied. As a non-limiting example, the stop criterion may be when the image content output based on the prompt sufficiently matches the user's vision of what the prompt is describing.” Ackerman ¶ 65.
[Image: media_image5.png]
“For example, the annotation and its related data are received by the application server 114 when a user activates the ‘OK’ button in the annotation interface 209.” Marek ¶ 57.
[Image: media_image6.png]
“Annotations, such as text comment 240 shown in FIG. 6 or illustration 242 shown in FIG. 7, are presented to the viewer in sync with the play back of the video 122. The annotations, when presented to the viewer, are located within the transparent overlay 201 and over top of the media player 202. Preferably, the annotations are located within the transparent overlay 201 and over top the video portion 214 of the media. Each annotation is presented to the view at the time and duration determined by the user who created the annotation. Multiple separate annotations may be presented concurrently or in an overlapping time period.” Marek ¶ 44.
Marek teaches an annotation/comment may also include a selected/generated image, stating “In addition to a text comment or alternative to a text comment, a user may insert an image, a video, an audio comment, or make a drawing in the annotation field 220. To insert an image, a user would activate the ‘Image’ button and then would be provided with pre-selected image options or allowed to upload an image to be inserted.” Marek ¶ 36.).
Ackerman in view of Marek does not explicitly disclose displaying the source comment information with the target image.
Tambi teaches displaying the source comment information (Tambi Fig. 10 1040) with the target image (
[Image: media_image12.png]
).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Tambi’s display of source comment information with Ackerman in view of Marek. One of ordinary skill in the art would be motivated to inform a user of the context of a generated image, like a label.
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Ackerman in view of Marek and Tambi as applied to Claim 9, in further view of Perrodin et al. (US 20130239049 A1).
Regarding Claim 10, Ackerman in view of Marek and Tambi teaches The method according to claim 9,
wherein said displaying the target image and the source comment information on the comment display page in response to triggering the posting component (addressed in the analyses for the parent claim) comprises:
in a case that the source comment information comprises image comment information, displaying the target image and the image comment information
[BRI on the record]
With respect to “spliced form,” the Examiner is reading the limitation to require reduced width and height of an image’s form. This interpretation is made in light of the specification.
[0104] In some embodiments, as shown in FIG. 10, the target image and the image comment information are displayed in a spliced way, and the display size of the image comment information and the target image is compressed, such as by reducing the width of the images. When the spliced image is detected to be tapped, an image display window is displayed on the comment display page, and the target image and image comment information with the normal sizes are displayed in the image display window.
Spec. ¶ 104.
[Mapping Analysis]
“This process may continue until a time t=n, which correspond to a time when a stop criterion is satisfied. As a non-limiting example, the stop criterion may be when the image content output based on the prompt sufficiently matches the user's vision of what the prompt is describing.” Ackerman ¶ 65.
[Image: media_image5.png]
“For example, the annotation and its related data are received by the application server 114 when a user activates the ‘OK’ button in the annotation interface 209.” Marek ¶ 57.
“In addition to a text comment or alternative to a text comment, a user may insert an image, a video, an audio comment, or make a drawing in the annotation field 220. To insert an image, a user would activate the ‘Image’ button and then would be provided with pre-selected image options or allowed to upload an image to be inserted. In an embodiment, a user may be presented with an option to take a picture with a camera of the end user device 116 to be uploaded and inserted.” Marek ¶ 36.
“Various embodiments provide techniques for annotating multimedia objects without editing the original source file of the multimedia object and for sharing the multimedia object and annotations on the internet.” Marek ¶ 36.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Marek’s comment/annotation creation with primary reference Ackerman. One of ordinary skill in the art would be motivated to allow a user to create comments with various types of information. This would allow a user to be more effective in expressing the user’s comment or opinion. “The multimedia annotation device, system, and method described herein enables a user to add annotations, e.g. text comments, images, illustrations, audio comments, video, etc., to streaming multimedia objects for presentation to a user viewing the multimedia object without editing the multimedia object to include the annotations and while streaming the multimedia from its source host.” Marek ¶ 5.
Ackerman in view of Marek and Tambi does not explicitly disclose displaying the target image (image A) and the image comment information (image B) in a spliced form.
Perrodin teaches displaying a plurality of images in a spliced form (
[Image: media_image13.png]
“In the second stage 4010, the user selects the crop tool 4045. As shown in the third stage 4015, the user then selects a portion of the image to crop. The fourth stage 4020 illustrates the image display area displaying the cropped image 4050. The user selects a back button to return to the journal. As shown in the fifth stage 4025, the cropped version of the image 4035 is overlaid on the journal.” Perrodin ¶ 277. “The crop tool 4055 activates a cropping tool that allows the user to align cropped images and remove unwanted portions of an image.” Perrodin ¶ 276.
Here, the spliced form is mapped to cropped form, and the same process could be applied to multiple images.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Perrodin’s cropped form with Ackerman in view of Marek and Tambi. One of ordinary skill in the art would be motivated to save screen space to show more images and/or to emphasize the image features the user would like to highlight.
Regarding Claim 11, Ackerman in view of Marek and Tambi teaches The method according to claim 9.
Ackerman in view of Marek and Tambi does not explicitly disclose further comprising:
displaying a function application link of the image generation component in a preset region corresponding to the target image on the comment display page,
wherein the function application link is configured to reserve an application of the image generation component or to apply the image generation component.
Perrodin teaches further comprising:
displaying a function application link of the image generation component in a preset region corresponding to the target image on the comment display page (
Perrodin Fig. 40:
[Image: media_image14.png]
“In the first stage 4005, the user has selected the image 4035 on the journal. The selection results in the application displaying the menu item 4040 for editing the image to appear. As shown, the user selects this item 4040 to edit the image.” Perrodin ¶ 275.
“As shown in the second stage 4010, the selection of the menu item 4040 causes the application to display the selected image on the image display area 110. In addition, the GUI 100 includes a tool bar 4045. In the example illustrated in FIG. 40, the tool bar includes several editing tools. The crop tool 4055 activates a cropping tool that allows the user to align cropped images and remove unwanted portions of an image.” Perrodin ¶ 276.
The function application link is mapped to any of the buttons as shown in Perrodin’s Fig. 40, e.g., buttons for editing tools.),
wherein the function application link is configured to reserve an application of the image generation component or to apply the image generation component (
After Ackerman in view of Marek and Tambi is combined with Perrodin, a button similar to those in Perrodin Fig. 40 could be added or modified to be linked to the image generation component to edit the image.
“As explained above, as changes to the prompt 332 and/or the modified prompt 336 are made, one or more new pieces of image content may be generated, shown in FIG. 3E as image content 350, 352, 354, 356. Each of the pieces of image content may correspond to a particular prompt version. For example, the initial image 350 may be generated at a time t=0 (i.e., when the initial prompt is received), and each of the image 352-356 may correspond to an image generated based on changes to the prompt made using the functionality of the interface 330.” Ackerman ¶ 65.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Perrodin’s cropped form and various application controls with Ackerman in view of Marek and Tambi. One of ordinary skill in the art would be motivated to provide a user the flexibility to further edit comment images.
Claims 12-14 are rejected under 35 U.S.C. 103 as being unpatentable over Ackerman in view of Marek as applied to Claim 1, in further view of Cho et al (US 20160203841 A1).
Regarding Claim 12, Ackerman in view of Marek teaches The method according to claim 1,
wherein said displaying the image set in the post-operation region in response to triggering the image generation component in the case that the post-operation region comprises the source comment information (addressed in Claim 1 rejection analyses) comprises:
displaying the image set (Ackerman Fig. 3E 356, 354, 352, and 350; “As explained above, as changes to the prompt 332 and/or the modified prompt 336 are made, one or more new pieces of image content may be generated, shown in FIG. 3E as image content 350, 352, 354, 356.” Ackerman ¶ 65), wherein the image set comprises images of at least two styles (
“For example, the user may be able to change one or more styles for the image via interactive elements 380 and may also be able to apply weights to multiple styles via interactive elements 382, shown as slider controls. As an illustrative example, the styles shown in FIG. 3F includes ‘Sketch’ and ‘Cartoon’. If the user desires a more cartoony style, the user can increase the weight applied to the cartoon style and decrease the weight of the sketch style via the interactive elements 382. Furthermore, the user may select a different style, such as a “Realistic” style to change the image content.” Ackerman ¶ 66.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Ackerman’s style adjusting for generated images with Ackerman’s features related to Fig. 3E. One of ordinary skill in the art would be motivated to allow a user more options for generated images.
After the features from Ackerman abovementioned embodiments are combined, images of at least two styles are generated and included in the image set.).
However, Ackerman in view of Marek does not explicitly disclose displaying the image set on a full-screen image page.
Cho teaches displaying the image set on a full-screen image page (“The electronic device 100 may display an image 1042 other than the image 1041 on a main area (e.g., an area where the image 1041 is displayed) or in full screen in response to a scroll input, a touch input, or the like. If an object 1043 for returning back to the edited video is selected in this state, the electronic device 100 may return back to a previous state in which the edited video is played.” Cho ¶ 87.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Cho’s full-screen displaying with Ackerman in view of Marek. One of ordinary skill in the art would be motivated to allow the user to see the image set better by displaying the image set in full screen. In addition, the user would not be distracted by the other controls on a display.
Regarding Claim 13, Ackerman in view of Marek and Cho teaches The method according to claim 12, wherein
the source comment information comprises image comment information (
“In addition to a text comment or alternative to a text comment, a user may insert an image, a video, an audio comment, or make a drawing in the annotation field 220. To insert an image, a user would activate the ‘Image’ button and then would be provided with pre-selected image options or allowed to upload an image to be inserted.” Marek ¶ 36. “The multimedia annotation device, system, and method described herein enables a user to add annotations, e.g. text comments, images, illustrations, audio comments, video, etc., to streaming multimedia objects for presentation to a user viewing the multimedia object without editing the multimedia object to include the annotations and while streaming the multimedia from its source host.” Marek ¶ 5. Marek Figs. 2-7.
The image comment information is mapped to information related to a selected image/animation to be inserted.),
the image set comprises the image comment information and images generated based on the image comment information (
“In addition to a text comment or alternative to a text comment, a user may insert an image, a video, an audio comment, or make a drawing in the annotation field 220. To insert an image, a user would activate the ‘Image’ button and then would be provided with pre-selected image options or allowed to upload an image to be inserted.” Marek ¶ 36.
After the combination of Ackerman in view of Marek, the generated images could also include images generated through a text-to-image model.
Ackerman:
[Image: media_image15.png]
, which teaches generating images based on an input image, which could be the image comment.), and
said displaying the image set on the full-screen image page (“The electronic device 100 may display an image 1042 other than the image 1041 on a main area (e.g., an area where the image 1041 is displayed) or in full screen in response to a scroll input, a touch input, or the like. If an object 1043 for returning back to the edited video is selected in this state, the electronic device 100 may return back to a previous state in which the edited video is played.” Cho ¶ 87.) comprises:
displaying the images in the image set on the image page uniformly; or
displaying any image in the image set in a primary display region and displaying in a secondary display region remaining images other than the image displayed in the primary display region, wherein the image page comprises the primary display region and the secondary display region, and a size of the primary display region is larger than a size of the secondary display region (
Fig. 10:
[Image: media_image16.png]
, which shows a primary display region larger than a secondary display region.
“The electronic device 100 may display an image 1042 other than the image 1041 on a main area (e.g., an area where the image 1041 is displayed) or in full screen in response to a scroll input, a touch input, or the like. If an object 1043 for returning back to the edited video is selected in this state, the electronic device 100 may return back to a previous state in which the edited video is played.” Cho ¶ 87.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Marek’s comment/annotation creation with primary reference Ackerman. One of ordinary skill in the art would be motivated to allow a user to create comments with various types of information. This would allow a user to be more effective in expressing the user’s comment or opinion. “The multimedia annotation device, system, and method described herein enables a user to add annotations, e.g. text comments, images, illustrations, audio comments, video, etc., to streaming multimedia objects for presentation to a user viewing the multimedia object without editing the multimedia object to include the annotations and while streaming the multimedia from its source host.” Marek ¶ 5.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Cho’s multiple display regions of full-screen displaying with Ackerman in view of Marek. One of ordinary skill in the art would be motivated to allow the user to see the image set better by enlarged displaying for some images compared to others.
Regarding Claim 14, Ackerman in view of Marek and Cho teaches The method according to claim 1,
wherein said displaying the image set in the post-operation region in response to triggering the image generation component in the case that the post-operation region comprises the source comment information (addressed in Claim 1 rejection analyses) comprises:
displaying a full-screen input page in response to detecting an input operation in the post-operation region (“. . . in full screen in response to a scroll input, a touch input, or the like. If an object 1043 for returning back to the edited video is selected in this state, the electronic device 100 may return back to a previous state in which the edited video is played.” Cho ¶ 87.), wherein the input page comprises an input box (
[Image: media_image17.png]
), the image generation component (Ackerman Fig. 3E’s
[Image: media_image1.png]
), and a plurality of text sets corresponding to a plurality of types, and the plurality of types comprise at least two of a theme, an environment, or an action (
[Image: media_image3.png]
“The interface 330 also includes interactive controls to manipulate and change the initial prompt. For example, the interface 330 may include selectable buttons or icons that enable a user to select the types of alternative text content the user would like to use to modify the prompt, such as synonyms, antonyms, hypernyms, hyponyms, random words, etc. The user may select one of the interactive elements to identify a particular type of language modification to be made to the prompt, such as to select synonyms to indicate the user desires to replace one or more words of the prompt 332 with a synonym. To execute the text content modification, the interface 330 may include interactive elements 335, shown as dropdown menus, that enable the user to select which word in the prompt the user would like to change, and selection of one of the interactive elements 335 may display a list of words that may be selected by the user to change the prompt, such as a list of synonyms for the particular word the user would like to change. As noted in FIG. 3E, other words besides synonyms may also be utilized to populate the suggested words, such as random words, antonyms, hypernyms, hyponyms, and the like. As the user makes changes to the prompt via the interactive elements 334, 335, the modified prompt may be displayed to the user, as shown at prompt 336. In addition to manipulation of the words included in the prompt 332, the interface 330 may also provide interactive elements, shown as slider controls 338, to apply a weight to each of the words.” Ackerman ¶ 64.
Ackerman Fig. 3E’s posted text: “A futuristic cityscape with flying cars and towering skyscrapers.” Here, for example, “flying” is an action, and one could change the action/word for “flying.” In addition, “cityscape” is an environment, and one could change the environment/word for “cityscape.”);
displaying target text in the input box in response to a determining instruction on the target text, wherein the target text is text in a text set of the plurality of text sets corresponding to any one of the plurality of types (“As noted in FIG. 3E, other words besides synonyms may also be utilized to populate the suggested words, such as random words, antonyms, hypernyms, hyponyms, and the like. As the user makes changes to the prompt via the interactive elements 334, 335, the modified prompt may be displayed to the user, as shown at prompt 336. In addition to manipulation of the words included in the prompt 332, the interface 330 may also provide interactive elements, shown as slider controls 338, to apply a weight to each of the words.” Ackerman ¶ 64.); and
displaying on an image page the image set generated based on the target text in response to triggering the image generation component (
“As explained above, as changes to the prompt 332 and/or the modified prompt 336 are made, one or more new pieces of image content may be generated, shown in FIG. 3E as image content 350, 352, 354, 356.” Ackerman ¶ 65.
Ackerman Fig. 3E’s
[Image: media_image1.png]
Cho Fig. 10:
[Image: media_image16.png]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Cho’s multiple display regions of full-screen displaying with Ackerman in view of Marek. One of ordinary skill in the art would be motivated to allow the user to see the image set better by enlarged displaying for some images compared to others. The use of the full screen display would also allow an interface to be less crowded and a user would find it easier to use the interface.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Ye (US 20240221122 A1) teaches spliced form:
[Image: media_image18.png]
Lu et al. (US 20220377403 A1) teaches adding annotations/comments to a video:
[Image: media_image19.png]
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ZHENGXI LIU whose telephone number is (571)270-7509. The examiner can normally be reached M-F 9 AM - 5 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kee Tung can be reached at (571)272-7794. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ZHENGXI LIU/Primary Examiner, Art Unit 2611