DETAILED ACTION
This communication is in response to the amendment filed 12/22/25 in which claims 1-2, 5-6, 8-9, 12-13, 15, and 19 were amended, claim 7 was canceled, and claim 21 was newly presented. Claims 1-6 and 8-21 are pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments with respect to claims 1, 8, and 15 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or non-obviousness.
Claims 1-4 and 8-11 are rejected under 35 U.S.C. 103 as being unpatentable over McCarty (US 2022/0229865 A1; published Jul. 21, 2022) in view of Reed, Scott, et al., "Generative adversarial text to image synthesis," International Conference on Machine Learning, PMLR, 2016 ("Reed") and Bennion (US 2019/0114661 A1; published Apr. 18, 2019).
Regarding claim 1, McCarty discloses [a] method comprising:
receiving, from a first user, a description of content to be generated for more than one target user associated with respective target user profiles using a generative model, wherein the generative model is trained to receive text and output an image, wherein the received description of content is associated with a first user profile; (¶ 18 (“In such a case, a request for story synthesis may thus begin with a user [first user] of a computing device asking a hypothetical question. Examples of questions which may be presented in connection with a request for story synthesis include, without limitation, “what would my life be like if I took the Senior Director job in Golden, Colo.,” [e.g., text] “what if I went to the University of Oregon instead of Oregon State University,” and “what would my family life be like if we [more than one target user] bought this house.”), ¶ 41 (“In either such case, the input may, in some embodiments, be associated with an electronic persona 340 [“first user profile”]. The electronic persona 340 represents a collection of information, which may, for example, be presented in the form of a profile or other record.”), ¶ 63 (“For example, where the user [“first user”] of the electronic device 305 asks for a story to be synthesized based on the question “what would life be like for my spouse and I [“more than one target user”] if we became social activists in Washington, D.C.,” a crawler may search through the social media account page of the user of the electronic device 305 to identify the social media account page of his or her spouse [“respective target user profiles”]. The crawler may then proceed to search through the social media account of the spouse, such as for photographs or other information which may be used to identify one or more of the other content items.”), ¶ 84 (“A machine learning model 600 [generative model] may be trained using various input associated with requests for story synthesis and the processing of such requests. 
The machine learning model 600 may further be used for inference of further requests for story synthesis, such as to identify, select, determine, or otherwise generate aspects associated with the processing of such requests. The machine learning model 600 may be or include one or more of a neural network (e.g., a convolutional neural network, recurrent neural network, or other neural network), decision tree, vector machine, Bayesian network, genetic algorithm, deep learning system separate from a neural network, or other machine learning model. The machine learning model applies intelligence to identify complex patterns in the input and to leverage those patterns to produce output and refine systemic understanding of how to process the input to produce the output.”)).
McCarty does not expressly disclose wherein the generative model is trained to receive text and output an image (but see Reed Section 1 Introduction, last paragraph (“Our main contribution in this work is to develop a simple and effective GAN architecture and training strategy that enables compelling text to image synthesis of bird and flower images from human-written descriptions. We mainly use the Caltech-UCSD Birds dataset and the Oxford-102 Flowers dataset along with five text descriptions per image we collected as our evaluation setting. Our model is trained on a subset of training categories, and we demonstrate its performance both on the training set categories and on the testing set, i.e. “zero-shot” text to image synthesis. In addition to birds and flowers, we apply our model to more general images and text descriptions in the MS COCO dataset (Lin et al., 2014).”)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified McCarty to incorporate the teachings of Reed to use a trained GAN model to synthesize images from human-written text, at least because doing so would provide the user a richer story development experience. See McCarty ¶ 82 (“The particular manner by which the synthesized story is output may depend upon the type or types of media comprising the synthesized story (e.g., text, audio, image, video, etc.), the capabilities of the electronic device 500, or both.”).
McCarty further discloses:
determining a semantic term based on the description of content; (¶ 19 (“Because the questions presented in connection with a request for story synthesis are personalized, the system personalizes the generation and delivery of the story. In particular, the system may use personalized information input in connection with a request for story synthesis to find other information…”), ¶ 44 (“The request received at the server device 310 from the electronic device 305 includes specified content items and asks for a story, such as by framing a “what if” question based on the specified content items. The specified content items [semantic term] are initial seeds of information which serves as the foundation for the story synthesis. The request processing module 325 processes the request received from the electronic device 305 to identify the specified content items which were included in the request.”))
generating a user-specific template including the semantic term, a user preference associated with the first user profile, a user preference associated with a first target user profile, and a user preference associated with a second target user profile; (¶ 44 (“…using the one or more specified content items to identify a story template [user-specific template] to use for story synthesis, and using the story template to identify one or more other content items to use to synthesize the story.”), ¶ 63 (“For example, where the user of the electronic device 305 asks for a story to be synthesized based on the question “what would life be like for my spouse and I if we became social activists in Washington, D.C.,” a crawler may search through the social media account page of the user of the electronic device 305 to identify the social media account page of his [first target user profile] or her spouse [“second target user profile”]. The crawler may then proceed to search through the social media account of the spouse, such as for photographs or other information which may be used to identify one or more of the other content items [user preference associated with a first/second target user profile].”))
generating first image content for a first target user using the generative model based on the user-specific template; and (¶ 68 (“The story synthesis module 335 is used to synthesize the story associated with the request for story synthesis using the specified content items included in the request and the other content items retrieved or generated based on the request. In some embodiments, the story synthesis module 335 can include functionality for performing the following operations: receiving the specified content items, the other content items, and the story template; determining an order for arranging the specified content items and the other content items according to the story template; and synthesizing a story by combining at least some of the specified content items and at least some of the other content items according to the order.”), ¶ 84 (“A machine learning model 600 [generative model] may be trained using various input associated with requests for story synthesis and the processing of such requests. The machine learning model 600 may further be used for inference of further requests for story synthesis, such as to identify, select, determine, or otherwise generate aspects associated with the processing of such requests. The machine learning model 600 may be or include one or more of a neural network (e.g., a convolutional neural network, recurrent neural network, or other neural network), decision tree, vector machine, Bayesian network, genetic algorithm, deep learning system separate from a neural network, or other machine learning model. The machine learning model applies intelligence to identify complex patterns in the input and to leverage those patterns to produce output and refine systemic understanding of how to process the input to produce the output.”); see also ¶¶ 88-90 (describing how the machine learning model processes requests for story synthesis)).
McCarty does not expressly teach that the output of the machine learning model is an image. However, as discussed above, Reed teaches a text to image generative model and is combinable with McCarty for the same reasons as set forth above.
Although McCarty teaches generating a content relevant to more than one user (e.g., a husband and his wife) based on profile information associated with each user (e.g., social media information), McCarty does not expressly disclose generating second image content for a second target user using the generative model based on the user-specific template, where the first image content for the first target user is different from the second image content for the second target user; displaying the first image content on a first target user device associated with the first target user; and displaying the second image content on a second target user device associated with the second target user (but see Bennion ¶ 17 (“An advertiser, rather than trying to generate specific different advertisements for an advertising campaign, can provide to the ad server system content items (e.g., video, images, audio, haptic) associated with the advertising campaign and associate each content item with one or more tokens. When an advertisement for the advertising campaign is to be provided, a template may be used by the ad server system to dynamically generate the advertisement by selecting content items from the provided content items for each of the tokens of the ad template, based on the content item and token associations provided by the advertiser. In some implementations, various other factors may also be considered when selecting the ad template [user-specific template] and/or content items [first/second content]. For example, any one or more of the following factors may be utilized in selecting an advertising template and/or content items: a user profile of a user that is executing the application in which the advertisement is to be presented, a placement within the application, a user device profile, environmental factors (e.g., time of day, day of week, location of the user device), user models that have been developed by a machine learning system, etc. 
Utilizing the described implementations, advertisements for an advertisement campaign may be dynamically generated for the user [first/second target user] that will receive the advertisement, thereby increasing a probability that the user will engage with the advertisement when presented to the user in the application.”), ¶ 25 (“The ad server system 100 provides advertising services for advertisers 184 to user devices 102 [first target user device], 104 [second target user device], and 106 (e.g., source device, client device, mobile phone, tablet device, laptop, computer, connected or hybrid television (TV), IPTV, Internet TV, Web TV, smart TV, satellite device, satellite TV, automobile, airplane, etc.).”)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified McCarty to incorporate the teachings of Bennion to use the profile information associated with the user’s wife to generate different content for her using a template selected based on her information, at least because doing so would increase the probability that she will engage with the generated content. See Bennion ¶ 17.
McCarty does not expressly teach that the output of the machine learning model is an image. However, as discussed above, Reed teaches a text to image generative model and is combinable with McCarty for the same reasons as set forth above.
Claim 8 is an apparatus claim corresponding to claim 1 and, therefore, is similarly rejected.
Regarding claim 2, McCarty, in view of Reed and Bennion, discloses the invention of claim 1 as discussed above. McCarty further discloses receiving a user refinement modifying the semantic term of the user-specific template, wherein the user refinement is at least one of a modification of the semantic term, an addition of one or more semantic terms, a deletion of one or more semantic terms, a modification of the description of content, a modification of a control loop, or a modification of a constraint; and (¶ 51 (“Determining that a word included in the request is a specified content item includes comparing the word to the story metrics. The request processing module 325 may use a library of known words to match words in the request with story metrics. For example, the library may indicate that the words “work” and “job” correspond to the “career” story metric, whereas the words “Colorado” and “Oregon” correspond to the “places” story metric. The request processing module 325 can determine that each of the specified content items may correspond to one or more story metrics. In some embodiments, if the request processing module 325 is unable to match a word included in the request to a story metric, the request processing module 325 may transmit a signal to the electronic device 305 to cause the electronic device 305 to prompt the user thereof for further input specifying a story metric for that unmatched word.”))
generating the first image content or the second image content using the generative model based on a modified user-specific template including the user refinement (¶ 68 (“The story synthesis module 335 is used to synthesize the story associated with the request for story synthesis using the specified content items included in the request and the other content items retrieved or generated based on the request. In some embodiments, the story synthesis module 335 can include functionality for performing the following operations: receiving the specified content items, the other content items, and the story template; determining an order for arranging the specified content items and the other content items according to the story template; and synthesizing a story by combining at least some of the specified content items and at least some of the other content items according to the order.”), ¶ 90 (“In some embodiments, using the machine learning model 600 to output the identification of other content items 650 may include the machine learning model 600 generating customized content items as the other content items.”)).
McCarty does not expressly teach that the output of the machine learning model is an image. However, as discussed above, Reed teaches a text to image generative model and is combinable with McCarty for the same reasons as set forth above.
Claim 9 is an apparatus claim corresponding to claim 2 and, therefore, is similarly rejected.
Regarding claim 3, McCarty, in view of Reed and Bennion, discloses the invention of claim 1 as discussed above. McCarty further discloses obtaining a target user profile information from a server, application, or database (¶ 63 (“For example, where the user of the electronic device 305 asks for a story to be synthesized based on the question “what would life be like for my spouse and I if we became social activists in Washington, D.C.,” a crawler may search through the social media account page [“server, application or database”] of the user of the electronic device 305 to identify the social media account page of his or her spouse. The crawler may then proceed to search through the social media account of the spouse [“target profile information”], such as for photographs or other information which may be used to identify one or more of the other content items.”)).
Claim 10 is an apparatus claim corresponding to claim 3 and, therefore, is similarly rejected.
Regarding claim 4, McCarty, in view of Reed and Bennion, discloses the invention of claim 1 as discussed above. McCarty further discloses obtaining a target user profile information as a user input (¶ 43 (“In some embodiments, the electronic device 305 may request information from one or more other sources (e.g., a database at the server device 310) to select, retrieve, or otherwise identify the electronic persona 340.”)).
Claim 11 is an apparatus claim corresponding to claim 4 and, therefore, is similarly rejected.
Claims 5 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over McCarty, Reed, and Bennion as applied to claims 1 and 8 above, and further in view of Weel (US 2008/0208379 A1; published Aug. 28, 2008).
Regarding claim 5, McCarty, in view of Reed and Bennion, discloses the invention of claim 1 as discussed above. McCarty teaches processing information in an electronic persona of a user to identify specified content items, a story template, and/or other content items (e.g., the current location of the user who asked the question) to guide the processing of the request for story synthesis. Yet, McCarty does not expressly disclose wherein generating the first image content or the second image content using the generative model is further based on profile information of users similar to the first target user or the second target user (but see Weel ¶¶ 31-35 (“The user profile is preferably compared to user profiles of others to determine a match. Then, a playlist of the other person for which a match was determined is communicated to the user and used to at least partially define a playlist for the user.”)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have further modified McCarty to incorporate the teachings of Weel to guide the story synthesis based on the electronic persona of users similar to the user, at least because doing so would enhance the likelihood of synthesizing a story that contains selections enjoyed by another person living in the same city as the user and having the same ethnicity, for example. See Weel ¶ 32.
Claim 12 is an apparatus claim corresponding to claim 5 and, therefore, is similarly rejected.
Claims 6, 13, and 14 are rejected under 35 U.S.C. 103 as being unpatentable over McCarty, Reed, and Bennion as applied to claims 1 and 8 above, and further in view of Eskandar, George, et al., "USIS: Unsupervised Semantic Image Synthesis," arXiv preprint arXiv:2109.14715 (2021) ("Eskandar").
Regarding claim 6, McCarty, in view of Reed and Bennion, discloses the invention of claim 1 as discussed above. McCarty, Reed, and Bennion do not expressly disclose obtaining a semantically related image from the first image content or the second image content (but see Eskandar Abstract (“Semantic Image Synthesis (SIS) is a subclass of image-to-image translation where a photorealistic image is synthesized from a segmentation mask. SIS has mostly been addressed as a supervised problem. However, state-of-the-art methods depend on a huge amount of labeled data and cannot be applied in an unpaired setting. On the other hand, generic unpaired image-to-image translation frameworks underperform in comparison, because they color-code semantic layouts and feed them to traditional convolutional networks, which then learn correspondences in appearance instead of semantic content. In this initial work, we propose a new Unsupervised paradigm for Semantic Image Synthesis (USIS) as a first step towards closing the performance gap between paired and unpaired settings. Notably, the framework deploys a SPADE generator that learns to output images with visually separable semantic classes using a self-supervised segmentation loss.”)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have further modified McCarty to incorporate the teachings of Eskandar to use unsupervised semantic image synthesis to synthesize a new image from the existing images, at least because doing so would enable a range of applications such as content creation and semantic manipulation by editing, adding, removing or changing the appearance of an object. See Eskandar Section 1.
Claim 13 is an apparatus claim corresponding to claim 6 and, therefore, is similarly rejected.
Regarding claim 14, McCarty, in view of Reed, Bennion, and Eskander, discloses the invention of claim 13 as discussed above. McCarty further discloses outputting the semantically related image for display on the respective first target user device or the second target user device (¶ 82 (“For example, in some cases, outputting the synthesized story at 580 may include outputting the synthesized story for display at a display of the electronic device 500.”)).
Claims 15-19 are rejected under 35 U.S.C. 103 as being unpatentable over McCarty, in view of Yangwoo (KR 20210025856A; published Mar. 10, 2021), Reed, and Bennion.
Regarding claim 15, McCarty discloses [a] non-transitory computer-readable medium storing executable instructions, which when executed by a processing device, cause the processing device to perform operations comprising: (see ¶ 103)
during a first period of time, creating a file including a template populated with a user preference associated with a first user profile, a user preference associated with a first target user profile, a user preference associated with a second target user profile, and a semantic term determined based on a description of content; (¶ 51 (“Determining that a word included in the request is a specified content item includes comparing the word to the story metrics. The request processing module 325 may use a library of known words to match words in the request with story metrics. For example, the library may indicate that the words “work” and “job” correspond to the “career” story metric, whereas the words “Colorado” and “Oregon” correspond to the “places” story metric. The request processing module 325 can determine that each of the specified content items may correspond to one or more story metrics.”), ¶ 52 (“The request processing module 325 further processes the request by identifying a story template [“template”] to use to process the request for story synthesis based on the one or more specified content items. The story template is a sequence, model, or guide for synthesizing a story. The story template includes a number of content spaces. Some of the content spaces will be filled by the specified content items [“semantic term”].”), ¶ 59 (“Once the story template is selected, the other content items to include in the story are identified and retrieved. First, a determination is made as to the particular types of other content items to be retrieved based on the story template and the specified content items. The content spaces to be used for the other content items within the story template may each correspond to one or more story metrics. Based on those story metrics, the types of other content items can be identified. 
Other content items of such types may then be searched for and retrieved, such as from one or more content sources.”), ¶ 60 (“The data crawling module 330 is used to retrieve or generate the other content items.”), ¶ 63 (“The particular manner in which a crawler searches through a content source may depend or otherwise vary based on the content source, based on the other content items to retrieve [“user preference associated with a first user profile”], or both. For example, a crawler searching through one of the social media platforms 345 may begin searching a social media account of the user [“first user profile”] of the electronic device 305…The crawler may then search through the social media account page of the user of the electronic device 305 to identify other pages through which to search for other content items. For example, where the user of the electronic device 305 asks for a story to be synthesized based on the question “what would life be like for my spouse and I if we became social activists in Washington, D.C.,” a crawler may search through the social media account page of the user of the electronic device 305 to identify the social media account page of his or her spouse [“a first/second target user profile”]. The crawler may then proceed to search through the social media account of the spouse, such as for photographs or other information [“user preference associated with a first/second target user profile”] which may be used to identify one or more of the other content items.”)).
McCarty does not expressly disclose creating a file including a template (but see Yangwoo ¶ 25 (“The automatic video production system creates a plurality of templates, which can be predefined to cover different story topics, video categories, and video lengths.”), ¶ 36 (“The meta information generated by the template generation unit 101 may include a story theme of the template, a category, the number and order of scenes to be included, and a role of each scene and a definition of a requirement of a source space.”)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified McCarty to incorporate the teachings of Yangwoo to create the story template with meta information such as a story theme, at least because doing so would enable automatic video production. See Yangwoo ¶ 1.
McCarty further discloses receiving the file during a second period of time; (¶ 38 (“The server device 310 is a computing device which processes information, instructions, commands, signals, or other communications received from the electronic device 305. In particular, the server device 310 includes functionality recognized in hardware, software, or a combination thereof for synthesizing a story for output at the electronic device 305 based on a request received from the electronic device 305. The functionality of the server device 310 used for story synthesis includes a request processing module 325, a data crawling module 330, and a story synthesis module 335.”), ¶ 52 (“The request processing module 325 further processes the request by identifying a story template to use to process the request for story synthesis based on the one or more specified content items. The story template is a sequence, model, or guide for synthesizing a story. The story template includes a number of content spaces. Some of the content spaces will be filled by the specified content items.”))
generating first image content for a first target user using a generative model based on the template of the file, wherein the generative model is trained to receive text and output an image; and (¶ 68 (“The story synthesis module 335 is used to synthesize the story associated with the request for story synthesis using the specified content items included in the request and the other content items retrieved or generated based on the request. In some embodiments, the story synthesis module 335 can include functionality for performing the following operations: receiving the specified content items, the other content items, and the story template; determining an order for arranging the specified content items and the other content items according to the story template; and synthesizing a story by combining at least some of the specified content items and at least some of the other content items according to the order.”), ¶ 90 (“In some embodiments, using the machine learning model 600 to output the identification of other content items 650 may include the machine learning model 600 generating customized content items as the other content items.”), ¶ 63 (“For example, where the user of the electronic device 305 asks for a story to be synthesized based on the question “what would life be like for my spouse and I if we became social activists in Washington, D.C.,” a crawler may search through the social media account page of the user of the electronic device 305 to identify the social media account page of his [“first target user”] or her spouse. The crawler may then proceed to search through the social media account of the spouse, such as for photographs or other information which may be used to identify one or more of the other content items.”)).
McCarty does not expressly disclose generating image content wherein the generative model is trained to receive text and output an image (but see Reed Section 1 Introduction, last paragraph (“Our main contribution in this work is to develop a simple and effective GAN architecture and training strategy that enables compelling text to image synthesis of bird and flower images from human-written descriptions. We mainly use the Caltech-UCSD Birds dataset and the Oxford-102 Flowers dataset along with five text descriptions per image we collected as our evaluation setting. Our model is trained on a subset of training categories, and we demonstrate its performance both on the training set categories and on the testing set, i.e. “zero-shot” text to image synthesis. In addition to birds and flowers, we apply our model to more general images and text descriptions in the MS COCO dataset (Lin et al., 2014).”)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified McCarty to incorporate the teachings of Reed to use a trained GAN model to synthesize images from human-written text, at least because doing so would provide the user a richer story development experience. See McCarty ¶ 82 (“The particular manner by which the synthesized story is output may depend upon the type or types of media comprising the synthesized story (e.g., text, audio, image, video, etc.), the capabilities of the electronic device 500, or both.”).
Although McCarty teaches generating content relevant to more than one user (e.g., a husband and his wife) based on profile information associated with each user (e.g., social media information), McCarty does not expressly disclose generating second image content for a second target user using the generative model based on the template of the file, where the first image content for the first target user is different from the second image content for the second target user; displaying the first image content on a first target user device associated with the first target user; and displaying the second image content on a second target user device associated with the second target user (but see Bennion ¶ 17 (“An advertiser, rather than trying to generate specific different advertisements for an advertising campaign, can provide to the ad server system content items (e.g., video, images, audio, haptic) associated with the advertising campaign and associate each content item with one or more tokens. When an advertisement for the advertising campaign is to be provided, a template may be used by the ad server system to dynamically generate the advertisement by selecting content items from the provided content items for each of the tokens of the ad template, based on the content item and token associations provided by the advertiser. In some implementations, various other factors may also be considered when selecting the ad template [template] and/or content items [first/second image content]. For example, any one or more of the following factors may be utilized in selecting an advertising template and/or content items: a user profile of a user that is executing the application in which the advertisement is to be presented, a placement within the application, a user device profile, environmental factors (e.g., time of day, day of week, location of the user device), user models that have been developed by a machine learning system, etc.
Utilizing the described implementations, advertisements for an advertisement campaign may be dynamically generated for the user [first/second target user] that will receive the advertisement, thereby increasing a probability that the user will engage with the advertisement when presented to the user in the application.”), ¶ 25 (“The ad server system 100 provides advertising services for advertisers 184 to user devices 102 [first target user device], 104 [second target user device], and 106 (e.g., source device, client device, mobile phone, tablet device, laptop, computer, connected or hybrid television (TV), IPTV, Internet TV, Web TV, smart TV, satellite device, satellite TV, automobile, airplane, etc.).”)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified McCarty to incorporate the teachings of Bennion to use the profile information associated with the user’s wife to generate different content for her using a template selected based on her information, at least because doing so would increase the probability that she will engage with the generated content. See Bennion ¶ 17.
McCarty does not expressly teach that the output of the machine learning model is an image. However, as discussed above, Reed teaches a text-to-image generative model and is combinable with McCarty for the same reasons as set forth above.
Regarding claim 16, McCarty, in view of Yangwoo, Reed, and Bennion, discloses the invention of claim 15 as discussed above. McCarty further discloses wherein the template of the file further includes a sub-prompt associated with the user preference associated with the first user profile (¶ 94 (“Act one 720 includes several pieces of text information associated with various story metrics. For example, as shown, act one 720 includes text information relating to a career change/relocation, recreational activities, familial status, and life fulfillment.”)).
Regarding claim 17, McCarty, in view of Yangwoo, Reed, and Bennion, discloses the invention of claim 15 as discussed above. McCarty further discloses wherein the template of the file further includes a sub-prompt associated with a preference of a group of users similar to a target user (¶ 94 (“Act one 720 includes several pieces of text information associated with various story metrics. For example, as shown, act one 720 includes text information relating to a career change/relocation, recreational activities, familial status, and life fulfillment.”)).
Regarding claim 18, McCarty, in view of Yangwoo, Reed, and Bennion, discloses the invention of claim 15 as discussed above. McCarty further discloses wherein the user preference associated with the first target user profile or the user preference associated with the second target user profile was obtained from a server, application or database (¶ 63 (“For example, where the user of the electronic device 305 asks for a story to be synthesized based on the question “what would life be like for my spouse and I if we became social activists in Washington, D.C.,” a crawler may search through the social media account page [“server, application or database”] of the user of the electronic device 305 to identify the social media account page of his or her spouse. The crawler may then proceed to search through the social media account of the spouse [“target profile information”], such as for photographs or other information which may be used to identify one or more of the other content items.”)).
Regarding claim 19, McCarty, in view of Yangwoo, Reed, and Bennion, discloses the invention of claim 15 as discussed above. McCarty further discloses receiving a user refinement associated with the semantic term during the second period of time; and (¶ 51 (“Determining that a word included in the request is a specified content item includes comparing the word to the story metrics. The request processing module 325 may use a library of known words to match words in the request with story metrics. For example, the library may indicate that the words “work” and “job” correspond to the “career” story metric, whereas the words “Colorado” and “Oregon” correspond to the “places” story metric. The request processing module 325 can determine that each of the specified content items may correspond to one or more story metrics. In some embodiments, if the request processing module 325 is unable to match a word included in the request to a story metric, the request processing module 325 may transmit a signal to the electronic device 305 to cause the electronic device 305 to prompt the user thereof for further input specifying a story metric for that unmatched word.”))
generating the first image content or the second image content using the generative model based on the template in the file and further based on the user refinement (¶ 68 (“The story synthesis module 335 is used to synthesize the story associated with the request for story synthesis using the specified content items included in the request and the other content items retrieved or generated based on the request. In some embodiments, the story synthesis module 335 can include functionality for performing the following operations: receiving the specified content items, the other content items, and the story template; determining an order for arranging the specified content items and the other content items according to the story template; and synthesizing a story by combining at least some of the specified content items and at least some of the other content items according to the order.”), ¶ 90 (“In some embodiments, using the machine learning model 600 to output the identification of other content items 650 may include the machine learning model 600 generating customized content items as the other content items.”)).
Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over McCarty, Yangwoo, Reed, and Bennion as applied to claim 19 above, and further in view of Johnson (US 2008/0304807 A1; published Dec. 11, 2008).
Regarding claim 20, McCarty, in view of Yangwoo, Reed, and Bennion, discloses the invention of claim 19 as discussed above. McCarty teaches “if the request processing module 325 is unable to match a word included in the request to a story metric, the request processing module 325 may transmit a signal to the electronic device 305 to cause the electronic device 305 to prompt the user thereof for further input specifying a story metric for that unmatched word.” Yet, McCarty does not expressly disclose updating the template to include the received user refinement during the second period of time (but see Johnson ¶ 44 (“The templates may be further refined to more precisely identify events. In the above example, the user may modify the template of "goal" to require that the ball stop for a second against the net, or that the ball moves near the net and appear against the back drop of the net (even if it does not stop).”)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified McCarty to incorporate the teachings of Johnson to modify the story template based on the user input, at least because doing so would facilitate the creation of a story template for a new scenario.
Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over McCarty, Reed, and Bennion as applied to claim 1 above, and further in view of Li, Bowen, et al. "Image-to-image translation with text guidance." arXiv preprint arXiv:2002.05235 (Feb. 12, 2020) (“Li”).
Regarding claim 21, McCarty, in view of Reed and Bennion, discloses the invention of claim 1 as discussed above. McCarty does not expressly disclose wherein “the generative model is trained to receive text and output an image” further comprises:
training the generative model to generate a predicted image using an input image and a natural language label corresponding to the input image (but see Li Abstract (“The goal of this paper is to embed controllable factors, i.e., natural language descriptions, into image-to-image translation with generative adversarial networks, which allows text descriptions to determine the visual attributes of synthetic images.”), Section 3.6 (“To train the model, we follow the Control GAN (Li et al., 2019a) and add extra unconditional adversarial losses (LZDi, LZGi) in Eqs.3 and 4 and the structure loss (LSi) in Eq. 5 at each stage. Generators and discriminators are optimized alternatively by minimising their objective functions.”)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have further modified McCarty to incorporate the teachings of Li to generate synthetic images from existing images using natural language descriptions as controllable factors, at least because doing so would enable generating desired images without fine-grained pixel-labelled semantic maps to decide what to generate. See Li Introduction.
Conclusion
Applicant’s amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHAHID KHAN whose telephone number is (571)270-0419. The examiner can normally be reached M-F, 9-5 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Usmaan Saeed can be reached at (571)272-4046. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SHAHID K KHAN/Primary Examiner, Art Unit 2146