DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119(a)-(d). The certified copy of foreign Application No. CN202311118761.2, filed on August 31, 2023, has been received.
Should applicant desire to obtain the benefit of foreign priority under 35 U.S.C. 119(a)-(d) prior to declaration of an interference, a certified English translation of the foreign application must be submitted in reply to this action. 37 CFR 41.154(b) and 41.202(e).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-3, 7-13, and 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over Price et al. (US 9106812 B1, hereinafter Price) in view of Lupascu et al. (US 20230267652 A1, hereinafter Lupascu).
Regarding Claim 11, Price teaches a computer device (Price, Column 4, Lines 8-11, the shared computing resources 208), comprising at least one processor and a memory (Price, Column 4, Lines 8-11, the shared computing resources 208 comprise one or more processors 210 and one or more forms of computer-readable media 212 <read on memory>); wherein the memory stores machine-readable instructions executed by the at least one processor (Price, Column 4, Lines 20-26, the computer-readable media 212 may also contain an operating system for controlling software modules stored in the computer-readable media 212 and for controlling hardware associated with the shared computing resources 208. The automated storyboard producer 114 is a network-based system as discussed above), the at least one processor is configured to execute the machine-readable instructions stored in the memory (Price, Column 4, Lines 25-29, the automated storyboard producer 114 is a network-based system as discussed above; it is also possible for the automated storyboard producer 114 to be implemented as a local application running on one of the computing devices 204), and when the machine-readable instructions are executed by the at least one processor, the at least one processor executes a method of generating a comic image (Price, Column 3, Lines 3-5, the automated storyboard producer 114 may use a series of rules, logic, artificial intelligence, and machine learning to generate the storyboard 112 <read on comic image> from the screenplay 100), the method of generating the comic image comprises acquiring a target novel to be used to generate a comic (Price, Column 4, Lines 34-46, the automated storyboard producer 114 may manage one or more projects 214 that each contains content which may be used to create storyboards for the respective projects...the project 214 may begin with a single screenplay 100 <read on target novel>); determining keyword information corresponding to comic storyboards that correspond to the target novel in a plurality of comic image generation dimensions according to a content of the target novel (Price, Column 14, Lines 49-58, the analysis may also include identifying a slugline 102 that includes a scene description...the character descriptions may include gender, physical characteristics, and facial expressions. For example, a description that Jack sees a tall, smiling woman provides information about one of the other characters in the scene with Jack. Furthermore, the scene description may describe a background image 118 and/or a camera technique); [[ with respect to any comic storyboard of the comic storyboards, determining target model input information corresponding to the comic storyboard according to a mapping relationship library between dimension keywords and model input information, and the keyword information of the comic storyboards in the comic image generation dimensions, wherein the dimension keywords comprise a keyword that has been determined in any comic image generation dimension]]; and using an artificial intelligence model to generate a comic image corresponding to the comic storyboard according to [[the target model input information]] (Price, Column 3, Lines 3-5, the automated storyboard producer 114 may use a series of rules, logic, artificial intelligence, and machine learning to generate the storyboard 112 <read on comic image> from the screenplay 100).
Price does not explicitly disclose but Lupascu teaches with respect to any comic storyboard of the comic storyboards, determining target model input information corresponding to the comic storyboard according to a mapping relationship library between dimension keywords and model input information, and the keyword information of the comic storyboards in the comic image generation dimensions, wherein the dimension keywords comprise a keyword that has been determined in any comic image generation dimension (Lupascu, Paragraph [0005], the disclosed systems receive style parameters as input (e.g., via a list of style images, a list of text prompts <read on keyword information>, or a combination of style images and text prompts); Paragraph [0067], the artistic content generation system 106 utilizes the multi-domain style encoder 308 to determine style encodings 320 <read on target model input information> from the style parameters <read on keyword information> associated with the style digital image 316 and the style text prompt 318; Paragraph [0069], the artistic content generation system 106 utilizes, as the multi-domain style encoder 308 <read on mapping relationship library>, an encoder that includes the cross-lingual-multimodal-embedding model and the image-embedding model...the artistic content generation system 106 utilizes, as the multi-domain style encoder 308 <read on mapping relationship library>, the Contrastive Language-Image Pre-training (CLIP) model); using an artificial intelligence model to generate a comic image corresponding to the comic storyboard according to the target model input information (Lupascu, Paragraph [0072], the artistic content generation system 106 utilizes the decoder 304 of the artistic generative neural network <read on artificial intelligence model> to generate an initialized artistic digital image 324 <read on comic image> based on the learnable tensor 306 <read on target model input information>; Paragraph [0071], the artistic content generation system 106 utilizes the artistic image neural network 300 to generate an artistic digital image <read on comic image> by iteratively generating an intermediate artistic digital image from the learnable tensor 306 <read on target model input information>, comparing the intermediate artistic digital image to the style parameters, and updating the learnable tensor 306 based on the comparison).
Lupascu and Price are analogous art since both are directed to automated generation of visual content by analyzing textual narrative content and utilizing computing systems to generate images based on parameters extracted from the narrative. Price provided a way of automatically analyzing screenplays to extract storyboard content (including character descriptions, scene descriptions, and dialogue) and generating storyboard images using libraries of visual assets accessed based on the extracted content. Lupascu provided a way of mapping text-based style parameters (text prompts) to neural network model input encodings using a multi-domain style encoder that projects keywords from different domains (text and images) into a common encoding space, and then using an artificial intelligence generative neural network to generate artistic visual content based on these mapped encodings. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the multi-domain style encoder mapping technique taught by Lupascu into the invention of Price such that, when generating storyboard images from a screenplay (novel), the system would use a mapping relationship library (multi-domain encoder) to convert the extracted keyword information from various dimensions (character, scene, props, etc.) into precise model input information (style encodings) for the artificial intelligence model. The motivation is to improve the accuracy and consistency of automatically generated visual content by utilizing a more sophisticated mapping mechanism that can handle multiple types of textual keyword information from different dimensions and convert them into optimized neural network inputs, thereby enabling the AI model to generate comic images that more accurately reflect the fine-grained details specified in the keyword information across the plurality of comic image generation dimensions.
Regarding Claim 12, the combination of Price and Lupascu teaches the invention in Claim 11.
The combination further teaches wherein the keyword information comprises at least one target keyword (Lupascu, Paragraph [0005], the disclosed systems receive style parameters as input (e.g., via a list of style images, a list of text prompts <read on keyword information, target keyword>); the mapping relationship library is used for storing different mapping relationships between the dimension keywords and the model input information (Lupascu, Paragraph [0067], the artistic content generation system 106 utilizes the multi-domain style encoder 308 to determine style encodings 320 <read on model input information> from the style parameters <read on dimension keywords> associated with the style digital image 316 and the style text prompt 318; Paragraph [0069], the artistic content generation system 106 utilizes, as the multi-domain style encoder 308 <read on mapping relationship library>, an encoder that includes the cross-lingual-multimodal-embedding model and the image-embedding model...the artistic content generation system 106 utilizes, as the multi-domain style encoder 308, the Contrastive Language-Image Pre-training (CLIP) model); the determining target model input information corresponding to the comic storyboard according to a mapping relationship library between dimension keywords and model input information, and the keyword information of the comic storyboards in the comic image generation dimensions, comprises: when a target mapping relationship that matches the target keyword comprised in the keyword information is searched from the mapping relationships included in the mapping relationship library, taking model input information indicated by the target mapping relationship as the target model input information (Lupascu, Paragraph [0067], the artistic content generation system 106 utilizes the multi-domain style encoder 308 to determine style encodings 320 <read on target model input information> from the style parameters <read on keyword information> associated with the style digital image 316 and the style text prompt 318; Paragraph [0077], based on the comparison, the artistic content generation system 106 updates the learnable tensor 306 <read on selecting target model input information> and generates an artistic digital image 212 utilizing the artistic image neural network 300).
As explained in the rejection of Claim 11, the rationale for combining the multi-domain style encoder mapping technique of Lupascu into the automated storyboard generation system of Price is provided above.
Regarding Claim 13, the combination of Price and Lupascu teaches the invention in Claim 11.
The combination further teaches wherein the target mapping relationship is searched as following steps: taking a mapping relationship corresponding to a dimension keyword that is consistent with the target keyword in the mapping relationship library as the target mapping relationship (Lupascu, Paragraph [0067], the artistic content generation system 106 utilizes the multi-domain style encoder 308 to determine style encodings 320 from the style parameters associated with the style digital image 316 and the style text prompt 318, the artistic content generation system 106 utilizes the neural network image encoder 310 to project the style digital image 316 into the multi-domain encoding space and utilizes the neural network text encoder 312 to project the style text prompt 318 into the multi-domain encoding space <read on taking a mapping relationship corresponding to a dimension keyword that is consistent with the target keyword>); or, determining the target mapping relationship according to correlation degrees between keyword semantics of the dimension keywords of the mapping relationships in the mapping relationship library and a target semantic of the target keyword (Lupascu, Paragraph [0069], the artistic content generation system 106 utilizes, as the multi-domain style encoder 308, the Contrastive Language-Image Pre-training (CLIP) model described by Alec Radford et al., Learning Transferable Visual Models from Natural Language Supervision, ICML, 2021...the artistic content generation system 106 utilizes the multi-domain style encoder 308 to project the style digital image 316 and the style text prompt 318 into a multi-domain encoding space <read on determining the target mapping relationship according to correlation degrees between keyword semantics and a target semantic>).
As explained in the rejection of Claim 11, the rationale for combining the multi-domain style encoder mapping technique of Lupascu into the automated storyboard generation system of Price is provided above.
Regarding Claim 16, the combination of Price and Lupascu teaches the invention in Claim 11.
The combination further teaches wherein the determining keyword information corresponding to comic storyboards that correspond to the target novel in a plurality of comic image generation dimensions according to a content of the target novel, comprises: determining occurrence numbers of each storyboard scene corresponding to the target novel according to the content of the target novel (Price, Column 9, Lines 47-54, a user 202(1) that provided a screenplay 100 <read on target novel> may have used the automated storyboard producer 114 multiple times in the past with other screenplays 100 having similar scene descriptions <read on storyboard scene> (e.g., hillside, hill, knoll, mound, etc.) and the previous selections by the user 202(1) may indicate which hillside image from the background image library 220 he or she picks most often <read on occurrence numbers of each storyboard scene>; Column 9, Lines 40-43, past behavior of users 202 interacting with the shared computing resources 208 may be analyzed to identify the most frequently selected <read on occurrence numbers> background image for a hillside); and determining the keyword information separately corresponding to the comic storyboards that corresponds to the target novel in the plurality of the comic image generation dimensions according to the occurrence numbers, wherein an information amount of the keyword information is positively correlated with the occurrence number (Price, Column 7, Lines 63-66, the text analytics module 304 may include a set of linguistic, statistical, and machine learning techniques that model and structure the information content of textual content from the screenplay 100; Column 8, Lines 1-4, specific textual analysis techniques may include lexical analysis to study word frequency distributions <read on information amount positively correlated with occurrence number>, pattern recognition, tagging/annotation, information extraction, data mining techniques; Column 14, Lines 10-20, the descriptions of scenes in the sluglines 102, the action 104, and the dialogue 106 provide a basis from which the screenplay 100 is used to create a storyboard 112...the analysis may leverage the formatting and style conventions of scripts to enhance automatic recognition of script elements and use text analytics to assign meanings to words in the script <read on determining keyword information in multiple dimensions according to frequency/occurrence>).
Regarding Claim 17, the combination of Price and Lupascu teaches the invention in Claim 11.
The combination further teaches wherein after generating the comic images corresponding to the comic storyboards, the method further comprises: determining storyboard frames corresponding to the comic storyboards according to the keyword information corresponding to each of the comic storyboards (Price, Column 3, Lines 52-60, FIG. 1 description, one frame of the storyboard 112 may include character images 116 for each of the characters mentioned in the corresponding scene from the screenplay 100...the background 118 may be selected based on analysis of the slugline 102, the action 104, and/or the dialogue 106 from the screenplay 100 <read on determining storyboard frames according to keyword information>); filling the comic images corresponding to each of the comic storyboards in the storyboard frames to obtain storyboard images corresponding to each of the comic storyboards (Price, Column 3, Lines 60-67, the character images 116 may be placed in front of a background 118...the finished draft of the storyboard 112 may include multiple frames <read on filling the comic images in storyboard frames to obtain storyboard images>); and according to a target number of episodes to which each of the comic storyboards belongs in a strip comic and a storyboard order of each of the comic storyboards in the target number of the episodes, typesetting the storyboard images corresponding to the comic storyboards to obtain a target strip comic corresponding to the target novel (Price, Column 6, Lines 1-9, the project 214 may begin with a single screenplay 100...the finished draft of the storyboard 112 may include multiple frames...the scene on the hill is preceded by an earlier scene 120 and followed by a later scene 122 <read on ordering storyboard images in sequence>; Column 2, Lines 24-30, storyboards are graphic organizers that show a series of illustrations or images displayed in sequence for the purpose of pre-visualizing some type of video media <read on typesetting storyboard images to obtain a strip comic>).
Regarding Claim 18, the combination of Price and Lupascu teaches the invention in Claim 11.
The combination further teaches wherein the comic storyboards corresponding to the target novel is determined as following steps: splitting the target novel according to information semantics of text information corresponding to the target novel, and obtaining novel segments corresponding to the target novel (Price, Column 15, Lines 1-10, a screenplay analysis module 302 may analyze a screenplay 100 to extract information both from the content of the text and the layout and formatting of the text...the text analytics module 304 may include a set of linguistic, statistical, and machine learning techniques that model and structure the information content of textual content from the screenplay 100); and determining the comic storyboards corresponding to the target novel according to segmented texts of the novel segments and the comic image generation dimensions (Price, Column 3, Lines 64-67, a storyboard or animatic may be created automatically from a script based on automatic analysis of the script; Column 15, Lines 10-18, the screenplay analysis module 302 identifies sections of dialogue 106, sections of the action 104, and relative locations of the sections of dialogue and the sections of action on the pages of the screenplay 100).
Regarding Claim 19, the combination of Price and Lupascu teaches the invention in Claim 11.
The combination further teaches wherein the comic image generation dimensions are determined as following steps: determining a novel genre of the target novel (Price, Column 10, Lines 38-44, the preproduction module 318 may receive an indication of a genre for the screenplay 100 <read on novel genre of the target novel> that is used by the automated storyboard producer 114 to select elements of the storyboard 112 (e.g., a soundtrack); Column 6, Lines 41-44, the songs available in the music library 222 may be grouped by song genre (e.g., rock n roll, blues, etc.), corresponding movie genre <read on novel genre> (e.g., Western, film noir, etc.)); and determining the plurality of the comic image generation dimensions corresponding to the target novel from a plurality of preset image generation dimensions according to the novel genre (Price, Column 7, Lines 4-11, the camera technique library 226 may include multiple templates for a given type of camera technique...the templates may be organized by director, genre, or other category <read on determining plurality of dimensions according to novel genre>. For example, there may be a template that includes camera techniques <read on comic image generation dimensions> which are suggestive of a television soap opera or a movie directed by Woody Allen <read on selecting dimensions based on genre>; Column 10, Lines 38-44, a genre for the screenplay 100 that is used by the automated storyboard producer 114 to select elements of the storyboard 112 <read on determining the plurality of dimensions from preset dimensions according to genre>).
Regarding Claim 1, it recites limitations similar in scope to the limitations of Claim 11 but as a method, and the combination of Price and Lupascu teaches all the limitations of Claim 11. Therefore, it is rejected under the same rationale.
Regarding Claim 2, it recites limitations similar in scope to the limitations of Claim 12 and therefore is rejected under the same rationale.
Regarding Claim 3, it recites limitations similar in scope to the limitations of Claim 13 and therefore is rejected under the same rationale.
Regarding Claim 7, it recites limitations similar in scope to the limitations of Claim 16 and therefore is rejected under the same rationale.
Regarding Claim 8, it recites limitations similar in scope to the limitations of Claim 17 and therefore is rejected under the same rationale.
Regarding Claim 9, it recites limitations similar in scope to the limitations of Claim 18 and therefore is rejected under the same rationale.
Regarding Claim 10, it recites limitations similar in scope to the limitations of Claim 19 and therefore is rejected under the same rationale.
Regarding Claim 20, it recites limitations similar in scope to the limitations of Claim 11, and the combination of Price and Lupascu teaches all the limitations of Claim 11. Price further discloses that these features can be implemented on a computer readable storage medium (Price, Column 4, Lines 13-20, The computer-readable media 212 may include… any other non-transitory computer-readable medium which can be used to store information and which can be accessed by a processor; Column 14, Lines 9-14, In the context of software, the blocks represent computer-executable instructions stored on one or more computer-readable storage media that, when executed by one or more processors, perform the recited operations).
Claims 4, 5, and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Price et al. (US 9106812 B1, hereinafter Price) in view of Lupascu et al. (US 20230267652 A1, hereinafter Lupascu) as applied to Claims 1 and 11 above, respectively, and further in view of Hibbert et al. (US 20110102424 A1, hereinafter Hibbert).
Regarding Claim 14, the combination of Price and Lupascu teaches the invention in Claim 11.
The combination does not explicitly disclose but Hibbert teaches wherein the determining target model input information corresponding to the comic storyboard according to a mapping relationship library between dimension keywords and model input information, and the keyword information of the comic storyboards in the comic image generation dimensions, comprises: when the mapping relationship library does not store a target mapping relationship that matches a target keyword comprised in the keyword information, determining pieces of candidate input information corresponding to the target keyword (Hibbert, Paragraph [0096], the system allows multiple 3D sets to be saved in a library corresponding to each storyboard project and accessible by the storyboard artist...the library provides a real-time visual preview of the set in question (utilising a default camera position) so the correct set can be easily chosen; Paragraph [0098], the system allows multiple 3D objects to be saved in a library corresponding to each storyboard project...the 3D object library provides a real-time visual preview of the model or prop in question...so the storyboard artist can easily choose the correct model or prop for any particular panel <read on when no exact match, choosing from multiple candidate options>); inputting each candidate input information into the artificial intelligence model separately to obtain target images corresponding to each candidate input information (Hibbert, Paragraph [0064], at step 13 the 3D image data is rendered from the viewpoint of the virtual camera to form a 2D background image; Paragraph [0009], retrieving three-dimensional image data defining at least one three-dimensional object, rendering the three-dimensional image data from a predefined viewpoint to generate two-dimensional background image data <read on obtaining target images from each candidate>); and determining the target model input information from the candidate input information according to matching degrees between the target images and the keyword information
(Hibbert, Paragraph [0096], the library provides a real-time visual preview of the set in question utilising a default camera position so the correct set can be easily chosen <read on determining target model input based on matching>; Paragraph [0098], the storyboard artist can easily choose the correct model or prop for any particular panel).
Hibbert and Price are analogous art since both are directed to automated generation of visual content for storytelling purposes by analyzing textual or parametric input and utilizing computing systems to generate images based on extracted or specified parameters.
Price provided a way of automatically analyzing screenplays to extract storyboard content and generating storyboard images using libraries of visual assets accessed based on the extracted content. Hibbert provided a way of storing multiple candidate 3D sets and objects in libraries, rendering each candidate as a preview image from a specified viewpoint, and enabling selection of the correct visual element by comparing the rendered preview images to determine which candidate best matches the storyboard requirements. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the library-based candidate selection with preview rendering technique taught by Hibbert into the invention of Price as modified by Lupascu such that, when the mapping relationship library does not store an exact match for keyword information, the system would determine multiple pieces of candidate input information from the library, input each candidate separately into the artificial intelligence model to render corresponding target images, and determine the target model input information by comparing the matching degrees between the rendered target images and the keyword information requirements, yielding the predictable result of enabling selection among multiple candidate visual options when an exact match is not immediately available.
Regarding Claim 4, it recites limitations similar in scope to the limitations of Claim 14 and therefore is rejected under the same rationale.
Regarding Claim 5, the combination of Price, Lupascu, and Hibbert teaches the invention in Claim 4.
The combination further teaches wherein after the determining the target model input information from the candidate input information, the method further comprises: establishing a mapping relationship between the target model input information and the
target keyword (Hibbert, Paragraph [0170], the system contains a user-managed look-up table that lists all the low-polygon 3D sets and 3D objects (proxies), used in any particular storyboard project. This lookup table links the low-polygon proxies (used in the system) to the location and filename of the corresponding final high-polygon model <read on establishing a mapping relationship between target model input information and target keyword>; Paragraph [0092], the first file is a data file containing...a unique model ID defining the model used as the
background set <read on mapping relationship between keyword/identifier and model data>); and storing the mapping relationship in the mapping relationship library (Hibbert, Paragraph [0093], the system scans the storyboard folder and parses the filenames of each individual file in order to rebuild a storyboard in memory; Paragraph [0095], the timestamps in the files for each panel within that folder are checked and compared with a database in the system memory to find out which panel(s) is(are) affected. The updated data can then be retrieved from the affected panels; Paragraph [0170]-[0171], This lookup table links the low-polygon proxies (used in the system) to the location and filename of the corresponding final high-polygon model...by using the lookup table as a reference).
As explained in the rejection of Claim 4, the rationale for combining the library selection and rendering preview system of Hibbert with the automated storyboard generation system of Price, as enhanced by the multi-domain style encoder of Lupascu, is provided above.
Claims 6 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Price et al. (US 9106812 B1, hereinafter Price) in view of Lupascu et al. (US 20230267652 A1, hereinafter Lupascu) as applied to Claims 1 and 11 above, respectively, and further in view of Sturlaugson et al. (US 20160358099 A1, hereinafter Sturlaugson).
Regarding Claim 6, the combination of Price and Lupascu teaches the invention in Claim 1.
The combination does not explicitly disclose but Sturlaugson teaches wherein the determining target model input information corresponding to the comic storyboard according to a mapping relationship library between dimension keywords and model input information, and the keyword information of the comic storyboards in the comic image generation dimensions, comprises: when the mapping relationship library does not store a target mapping relationship that is related to a target keyword comprised in the keyword information, determining pieces of candidate input information corresponding to the target keyword (Sturlaugson, Paragraph [0104], a data input module configured to receive an input dataset and a selection of machine learning models); with respect to any one of the candidate input information, using a plurality of text-and-image conversion models separately to generate corresponding target images according to the candidate input information, wherein different text-and-image conversion models are deployed with different text-and-image conversion algorithms (Sturlaugson, Paragraph [0103], a machine learning algorithm library that includes a plurality of machine learning algorithms <read on plurality of text-and-image conversion models with different algorithms> configured to be tested with a common interface; Paragraph [0003], a broad array of machine learning algorithms are available ... artificial neural networks, learned decision trees, and support vector machines are different classes of algorithms; Paragraph [0040], Experiment module 30 is configured to train each of the machine learning models 32 ... to produce a trained model for each machine learning model); and determining an image generation effect of the candidate input information according to the target images, and determining the target model input information corresponding to the comic storyboard according to the image generation effect of the candidate input information respectively (Sturlaugson, Paragraph [0040], Experiment module 30 is configured to evaluate and/or to validate each trained model to produce a performance result for each machine learning model <read on determining an image generation effect according to target images>; Paragraph [0106], an aggregation module configured to aggregate the performance results for all of the machine learning models to form performance comparison statistics).
Sturlaugson and Price are analogous art since both are directed to automated generation of output content by processing input data using computing systems that select and apply appropriate algorithms or models based on input parameters.
Price provides a way of automatically analyzing screenplays to extract storyboard content (including character descriptions, scene descriptions, and dialogue) and generating storyboard images using libraries of visual assets accessed based on the extracted content. Sturlaugson provides a way of maintaining a machine learning algorithm library that includes a plurality of different machine learning algorithms, receiving a selection of machine learning models, training and evaluating each machine learning model separately to produce a performance result for each model, and aggregating the performance results to form comparison statistics for selecting the best-performing model. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the multi-model testing and comparison technique taught by Sturlaugson into the invention of Price, such that when the mapping relationship library does not store an exact match for the keyword information, the system would determine multiple pieces of candidate input information, use a plurality of text-and-image conversion models deploying different algorithms to separately generate corresponding target images for each candidate, evaluate the image generation effect of each candidate by analyzing the performance results of the generated target images, and determine the target model input information by selecting the candidate that produces the best image generation effect according to the performance comparison statistics. The motivation for doing so would have been to enable objective comparison of the outputs of multiple models so that the best-performing result is selected (Sturlaugson, Paragraph [0106]).
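For clarity of the record, the combined flow described above (generating a target image per model for each candidate input, scoring an image generation effect per candidate, and selecting the best-scoring candidate) can be sketched as follows. All function names, the stand-in "models," and the scoring heuristic are hypothetical illustrations only; they are not taken from the applicant's disclosure or from the Price or Sturlaugson references.

```python
# Hypothetical sketch of the claimed candidate-selection flow: each piece of
# candidate input information is passed to several text-and-image conversion
# "models" (stand-in functions here), a per-candidate image generation effect
# is computed from the resulting images, and the best-scoring candidate is
# returned as the target model input information.

def model_a(text):
    # Stand-in for one text-and-image conversion algorithm.
    return {"source": "A", "detail": len(text)}

def model_b(text):
    # Stand-in for a different text-and-image conversion algorithm.
    return {"source": "B", "detail": len(set(text))}

MODELS = [model_a, model_b]

def generation_effect(images):
    # Stand-in "image generation effect": average of a per-image quality proxy.
    return sum(img["detail"] for img in images) / len(images)

def select_target_input(candidates):
    # Score every candidate across all models; keep the best-scoring one.
    scored = []
    for cand in candidates:
        images = [model(cand) for model in MODELS]  # one image per model
        scored.append((generation_effect(images), cand))
    return max(scored)[1]

candidates = ["knight under moonlight", "knight, moonlit castle, ink style"]
best = select_target_input(candidates)
print(best)
```

Under these stand-in scoring rules the longer, more descriptive candidate scores higher and is selected; the actual claimed system would substitute real conversion models and a real image-quality evaluation in place of these placeholders.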
Regarding Claim 15, it recites limitations similar in scope to those of Claim 6 and is therefore rejected under the same rationale.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US 20020122039 A1 Electronic comic viewing apparatus and method and recording medium
US 20100110080 A1 System and method for comic creation and editing
US 20190026958 A1 Employing three-dimensional (3D) data predicted from two-dimensional (2D) images using neural networks for 3D modeling applications and other applications
US 20200174796 A1 Faster sparse flush recovery
US 20210310408 A1 Gas turbine engine with efficient thrust generation
US 20220404771 A1 Natural escapement for a horological movement and horological movement comprising such an escapement
US 20230126286 A1 Multiplexed testing of lymphocytes for antigen specificity
Any inquiry concerning this communication or earlier communications from the examiner should be directed to YUJANG TSWEI whose telephone number is (571)272-6669. The examiner can normally be reached 8:30am-5:30pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kent Chang can be reached on (571) 272-7667. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/YuJang Tswei/Primary Examiner, Art Unit 2614