Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Specification
The title of the invention is not descriptive. A new title is required that is clearly indicative of the invention to which the claims are directed.
The following title is suggested: Content Generation Method, Device, and Storage Medium for Generation of Multimedia Content From a Text Prompt.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Tao (US 20190325626 A1).
Regarding claim 1, Tao teaches a content generation method, comprising:
acquiring a target text, wherein the target text comprises description content for describing a target role and/or a target scenario (par. 0179: “One example of a targeting parameter is a user-specified purpose of the design content, such as whether the design content is intended to convey information (e.g., a “tell” purpose), present an aesthetically desirable scene (e.g., a “show” purpose), or convey information in an aesthetically desirable manner (e.g. a “show and tell” purpose).”);
generating a prompt word based on the target text, wherein the prompt word comprises: a role prompt word corresponding to the target role and/or a scenario prompt word corresponding to the target scenario (1. Introduction, par. 3: “During inference, the finetuned model can generate identity-preserved scene variations of the input concept using prompts containing the learned textual identifier.”);
inputting the prompt word into a content generation model, and generating at least one frame of preview image corresponding to the target text, wherein prompt words associated with different preview images are at least partially different (par. 0153: “The role-selection section 910 can also include one or more previews 914 having visualizations that depict the use of a given color in a given role. For instance, each of the previews 914 depicts an example of text set against a respective background color identified using the one or more of the control elements 913. Additional control elements 916 can be used to select, as a primary text color or accent text color, another one of the colors identified in the visualizations 912.”);
in response to a first modification operation on a prompt word associated with a first preview image, inputting a modified prompt word into the content generation model, and generating a new preview image corresponding to the first preview image, wherein the first preview image is any one frame of the at least one frame of preview image (par. 0108: “The design engine 108 can receive, via the content-creation interface, an edit input identifying a modification to the branded design content 130. The design engine 108 can constrain, augment, or reject the modification based on a constraint indicated by the brand profile, a quality requirement assessed by a design-quality model (described in more detail below), or both.”); and
generating target multimedia content corresponding to the target text based on the new preview image (par. 0046: “Each of the brand engine 104 and the design engine 108 includes program instructions for displaying and editing design content, such as text, images or other graphics, videos, or some combination thereof. Examples of these program instructions include program instructions for rendering content for display, program instructions for creating one or more instances of event listeners or other suitable objects for receiving input from input devices (e.g., a mouse, a touchscreen, etc.), program instructions for overlaying different graphics in a multilayer design, program instructions for automatically generating HTML code, program instructions for formatting content in different file formats (e.g., JPG, PDF, etc.).”).
Regarding claim 2, Tao teaches the content generation method according to claim 1, wherein the generating a prompt word based on the target text, comprises:
splitting the target text to obtain a plurality of text segments, wherein any text segment comprises: at least part of a first description content for the target role, and/or at least part of a second description content for the target scenario (par. 0180: “Assigning multiple elements to a particular content group can cause the design engine 108 to position those elements adjacent to one another in subsequent phases of a content-creation process.”); and
for each text segment of the plurality of text segments, performing semantic analysis on each text segment to obtain a prompt word corresponding to each text segment (par. 0121: “Prior brand example analysis could also include textual analysis of a brand book, digital images, web site, or web search results to determine one or more characteristics of the brand. For example, this could include textual analysis of a brand book, website, or search results to extract color codes, text styles, keywords or phrases associated with the brand, or textual analysis of a web site or search results to determine brand personality, such as identifying descriptions or social media conversations that may suggest the brand is modern, funny, edgy, formal, aggressive, or other characteristics.”).
Regarding claim 3, Tao teaches the content generation method according to claim 2, wherein the inputting the prompt word into a content generation model, and generating at least one frame of preview image corresponding to the target text, comprises:
inputting prompt words respectively corresponding to the plurality of text segments into the content generation model, to obtain a preview image corresponding to each text segment (par. 0153: “The role-selection section 910 can also include one or more previews 914 having visualizations that depict the use of a given color in a given role. For instance, each of the previews 914 depicts an example of text set against a respective background color identified using the one or more of the control elements 913.”).
Regarding claim 4, Tao teaches the content generation method according to claim 3, wherein the in response to a first modification operation on a prompt word associated with a first preview image, inputting a modified prompt word into the content generation model, and generating a new preview image corresponding to the first preview image, comprises:
in response to a first modification operation on a prompt word associated with any text segment, inputting a modified prompt word corresponding to the any text segment into the content generation model, and generating a new preview image corresponding to the any text segment (par. 0062: “An automated analysis could include identifying, for a given brand attribute, different values of the brand attribute found within the brand exemplar and presenting some or all of the identified values in a profile-development interface 106 for selection, exclusion, and/or modification via further inputs received via the user device 126.”).
Regarding claim 5, Tao teaches the content generation method according to claim 4, further comprising: determining an associated preview image from other preview images other than the first preview image based on the modified prompt word corresponding to the any text segment; and
modifying the associated preview image based on the modified prompt word corresponding to the any text segment, to obtain a new preview image corresponding to the associated preview image (par. 0138: “For example, the design engine 108 can detect a particular edit event via an event listener of the content-creation interface 110. Subsequent to detecting the edit event, and prior to implementing a modification specified by the edit (e.g., updating a preview of the design content within the content-creation interface 110), the design engine 108 can assess a design modification requested by the edit.”).
Regarding claim 6, Tao teaches the content generation method according to claim 1, wherein before the generating target multimedia content corresponding to the target text based on the new preview image, the content generation method further comprises:
generating caption information corresponding to the target text, and/or determining a target timbre corresponding to the target text (par. 0081: “A presentation device 212 can include any device or group of devices suitable for providing visual, auditory, or other suitable sensory output.” It would be apparent to one of ordinary skill in the art that generating auditory output as described would necessitate the ability to influence timbre.); and
the generating target multimedia content corresponding to the target text based on the new preview image, comprises:
generating the target multimedia content corresponding to the target text based on the new preview image and at least one selected from a group consisting of the caption information and the target timbre (par. 0081, as above).
Regarding claim 7, Tao teaches the content generation method according to claim 6, wherein the determining a target timbre corresponding to the target text, comprises:
determining a sound feature of the target role based on the target text, and matching a corresponding target timbre for the target role based on the sound feature (par. 0118: “In some aspects, the digital graphic design computing system 100 may receive manual brand input from the user device, which could be in the form of data submitted based upon text entered into input boxes, options selected from menu boxes, button clicks or radio button selections, color palette selections from a color grid, uploaded logos, photographs, and other images, audio, video, or other information relating to a brand.”); or,
receiving a target timbre determined by a user from a plurality of candidate timbres (par. 0118, as above).
Regarding claim 8, Tao teaches the content generation method according to claim 1, further comprising:
acquiring painting style information and/or image ratio information of the preview image (par. 0140: “the design engine 108 may have translation logic that links, for example, a 500×500 template to a similarly styled 728×90 template. This allows the content of the 500×500 graphic design to be readily mapped to the 728×90 template while substantially preserving the overall style, aesthetic, and visual appearance that caused the user to select the preferred graphic design in the first place.”); and
the inputting the prompt word into a content generation model, and generating at least one frame of preview image corresponding to the target text, comprises:
inputting the prompt word and at least one selected from the group consisting of the painting style information and the image ratio information into the content generation model, and generating the at least one frame of preview image corresponding to the target text (par. 0141: “If a selected graphic design has been edited as desired, and related designs have been produced, selected, and potentially also edited, the design engine 108 may finalize the designs and thereby generate output branded design content 130, as depicted at block 720.”).
Regarding claim 9, Tao teaches the content generation method according to claim 1, wherein before the generating at least one frame of preview image corresponding to the target text, the method further comprises:
obtaining appearance feature information corresponding to the target role, wherein the appearance feature information is obtained by performing role feature analysis on the target text, and/or receiving the appearance feature information corresponding to the target role input by a user (par. 0125: “The digital graphic design computing system 100 may then receive inputs from the user device 126, such as receiving a brand profile selection that will be used to determine the appearance of and restrictions on the provisional graphic designs.”);
inputting the appearance feature information into the content generation model, to obtain a role image of the target role (par. 0125: “Another input could provide image content for the graphic design (e.g., selecting from photographs configured for the brand or uploading new photographs). Another input could include an indication of whether or not a configured brand logo should be included in the provisional graphic designs.”); and
the inputting the prompt word into a content generation model, and generating at least one frame of preview image corresponding to the target text, comprising:
inputting the prompt word and the role image into the content generation model to generate the at least one frame of preview image corresponding to the target text (par. 0131: “The design engine 108 may accomplish placement by, for example, programmatically generating image data (e.g., as a JPG, BMP, or other image format) based upon the particular template and inputs, may render or draw the graphic design within an application (e.g., rendering the design with objects from an object oriented language, drawing on an HTML canvas), or may simulate the graphic design by creating and organizing a number of HTML components to appear as the graphic design.”).
Regarding claim 10, Tao teaches the content generation method according to claim 9, further comprising:
in response to a second modification operation on the appearance feature information corresponding to the target role, generating a new role image of the target role based on a modified appearance feature information (par. 0116: “At block 506, user inputs that edit one or more of the designs may be received, and the client application 128 and/or the digital design application 102 can edit one or more of the designs. Examples of edits could include moving or resizing partitions, changing colors, logos, photos, and text. Each manual change indicated by inputs from a user device 126 may be compared to the brand profile to determine whether the manual change is an allowable change, or whether the manual change is prohibited based upon the brand profile. Once changes have been rejected or accepted based upon the brand profile, the design is finalized and ready for publication.”);
determining a preview image corresponding to the target role from a plurality of preview images (par. 0153: “The role-selection section 910 can also include one or more previews 914 having visualizations that depict the use of a given color in a given role.”); and
modifying the target role in the preview image corresponding to the target role based on the new role image, to obtain a second preview image (par. 0159: “The profile-development interface 1500 can also include a control section 1504 having a control element configured for accepting or rejecting a particular type of background for the particular logo variant depicted in the preview section 1502. For instance, the tool depicted in FIG. 15 can be used for updating, based on input to the tool (i.e., logo-configuration interface 1502), a logo attribute of the brand profile to identify the modified color specified with the tool; updating, based on input to the tool, a logo attribute of the brand profile to prevent a modified color specified with the tool from being displayed adjacent to the logo element; or some combination thereof.”);
the generating target multimedia content corresponding to the target text based on the new preview image, comprising:
generating the target multimedia content corresponding to the target text based on the second preview image and the new preview image (par. 0161: “FIG. 17 depicts an example of a profile-development interface 1700 for configuring one or more logo attributes controlling whether the branding engine can automatically generate a logo variant. In this example, the profile-development interface 1700 can include a preview section 1702 configured for presenting a visualization of a particular logo variant generated by the brand engine 104.”).
Claim 11 is substantially similar to claim 1, and differs only in that it recites a device rather than a method. As such, it is rejected on a similar basis to claim 1.
Claim 12 is substantially similar to claim 2, and differs only in that it depends from claim 11 rather than claim 1. As such, it is rejected on a similar basis to claim 2.
Claim 13 is substantially similar to claim 3, and differs only in that it depends from claim 12 rather than claim 2. As such, it is rejected on a similar basis to claim 3.
Claim 14 is substantially similar to claim 4, and differs only in that it depends from claim 13 rather than claim 3. As such, it is rejected on a similar basis to claim 4.
Claim 15 is substantially similar to claim 5, and differs only in that it depends from claim 14 rather than claim 4. As such, it is rejected on a similar basis to claim 5.
Claim 16 is substantially similar to claim 6, and differs only in that it depends from claim 11 rather than claim 1. As such, it is rejected on a similar basis to claim 6.
Claim 17 is substantially similar to claim 7, and differs only in that it depends from claim 16 rather than claim 6. As such, it is rejected on a similar basis to claim 7.
Claim 18 is substantially similar to claim 8, and differs only in that it depends from claim 11 rather than claim 1. As such, it is rejected on a similar basis to claim 8.
Claim 19 is substantially similar to claim 9, and differs only in that it depends from claim 11 rather than claim 1. As such, it is rejected on a similar basis to claim 9.
Claim 20 is substantially similar to claim 1, and differs only in that it recites a storage medium rather than a method. As such, it is rejected on a similar basis to claim 1.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RYAN A BARHAM whose telephone number is (571)272-4338. The examiner can normally be reached Mon-Fri, 8:30am-5pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Xiao Wu, can be reached at (571) 272-7761. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/RYAN ALLEN BARHAM/Examiner, Art Unit 2613
/XIAO M WU/Supervisory Patent Examiner, Art Unit 2613