Prosecution Insights
Last updated: April 19, 2026
Application No. 19/042,319

METHOD, APPARATUS, DEVICE AND STORAGE MEDIUM FOR GENERATING MEDIA CONTENT

Status: Non-Final OA (§103)
Filed: Jan 31, 2025
Examiner: CHOKSHI, PINKAL R
Art Unit: 2425
Tech Center: 2400 (Computer Networks)
Assignee: BEIJING YOUZHUJU NETWORK TECHNOLOGY CO., LTD.
OA Round: 1 (Non-Final)
Grant Probability: 60% (Moderate)
Expected OA Rounds: 1-2
Time to Grant: 3y 5m
With Interview: 90%

Examiner Intelligence

Career Allow Rate: 60% (305 granted / 505 resolved; +2.4% vs TC avg)
Interview Lift: +29.6% (strong lift in allow rate for resolved cases with an interview)
Avg Prosecution: 3y 5m (typical timeline)
Currently Pending: 29 applications
Total Applications: 534 (career history, across all art units)
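
The headline numbers in this panel are simple ratios over resolved cases. Below is a minimal sketch in Python, assuming the interview lift is measured as the with-interview allow rate minus the overall career allow rate; that assumption is not documented here, but it reproduces the displayed +29.6%.

```python
# Illustrative arithmetic behind the examiner metrics above.
# Counts (305 granted / 505 resolved) come from the report; the 90%
# with-interview rate comes from the "With Interview" projection.
# The lift definition is an assumption, not a documented formula.

granted, resolved = 305, 505
career_allow_rate = granted / resolved                     # 0.604 -> shown as 60%

with_interview_rate = 0.90                                 # allow rate with an interview
interview_lift = with_interview_rate - career_allow_rate   # 0.296

print(f"Career allow rate: {career_allow_rate:.1%}")       # 60.4%
print(f"Interview lift:    {interview_lift:+.1%}")         # +29.6%
```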

Statute-Specific Performance

§101: 4.6% (-35.4% vs TC avg)
§103: 59.6% (+19.6% vs TC avg)
§102: 12.3% (-27.7% vs TC avg)
§112: 13.4% (-26.6% vs TC avg)
Deltas are measured against an estimated Tech Center average; based on career data from 505 resolved cases.
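
Each delta above is just the examiner's statute-level rate minus the Tech Center estimate. Notably, all four published deltas are consistent with a single flat 40% estimate, so a sketch along the following lines reproduces the table; the 40% figure is inferred from the deltas themselves, not taken from USPTO data.

```python
# Reproducing the statute-level deltas above. The 40% Tech Center
# average is inferred (each published rate plus its delta lands on
# exactly 0.40); treat it as an assumption, not USPTO data.

TC_AVG_ESTIMATE = 0.40

examiner_rates = {"§101": 0.046, "§103": 0.596, "§102": 0.123, "§112": 0.134}

for statute, rate in examiner_rates.items():
    delta = rate - TC_AVG_ESTIMATE
    print(f"{statute}: {rate:.1%} ({delta:+.1%} vs TC avg)")
```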

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5, 8-16, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over US PG Pub 2024/0273796 to Edson ("Edson") in view of US PG Pub 2014/0075335 to Hicks ("Hicks").

Regarding claim 1, "A method for generating media content" reads on the technique for generating an animated image file using an image and text instructions (abstract) disclosed by Edson and represented in Fig. 3. As to "comprising: presenting a configuration interface based on a selection of first media content, the configuration interface comprising an input control for inputting a prompt item," Edson discloses (¶0029, ¶0036, ¶0062) that the AIG module receives an image selection on the user interface as represented in Fig. 4D; (¶0063-¶0064) in response to the user selecting the image file, the user interface displays a text prompt field that enables the user to input natural-language instructions that may be used to effect edits to the selected image as represented in Fig. 4F. As to "acquiring a target prompt item via the input control," Edson discloses (¶0063-¶0064) that the user inputs a text prompt instructing the model to make the changes to the selected image file. As to "generating second media content, the second media content comprising…a second segment, … and the second segment generated based on a target image in the first media content and the target prompt item," Edson discloses (¶0066-¶0068, ¶0070, claim 1) that the system generates the animated image file based on the selected image file and the user-inputted text prompt as represented in Figs. 4F-H. Edson meets all the limitations of the claim except "generating second media content, the second media content comprising a first segment and a second segment, … the first segment corresponding to the first media content." However, Hicks discloses (¶0087-¶0089) that the system generates and displays the edited image along with the original image as represented in Fig. 4E. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify Edson's system by generating second media content comprising a first/original segment and a second/edited segment as taught by Hicks in order to visually see the difference between the original image and the edited image on the screen (Hicks - ¶0020).
Regarding claim 2, "The method according to claim 1, wherein presenting the configuration interface comprises: receiving a selection of a target generation mode; and presenting the configuration interface corresponding to the target generation mode," Edson discloses (¶0060-¶0061) that the user selects to generate an animated image using one of the selection options on the screen as represented in Figs. 4B-4D.

Regarding claim 3, "The method according to claim 1, wherein the configuration interface corresponds to a first generation mode, the configuration interface further comprises an image control, and the method further comprises: acquiring at least one reference image via the image control; and generating the second segment based on the target image, the target prompt item and the at least one reference image," Edson discloses (¶0060-¶0061) that the user selects to generate an animated image using one of the selection options on the screen; (¶0029, ¶0036, ¶0062) the AIG module receives an image selection on the user interface as represented in Fig. 4D; (¶0063-¶0064) in response to the user selecting the image file, the user interface displays a text prompt field that enables the user to input natural-language instructions that may be used to effect edits to the selected image; (¶0066-¶0068, ¶0070, claim 1) the system generates the animated image file based on the selected image file and the user-inputted text prompt as represented in Figs. 4F-H.

Regarding claim 4, "The method according to claim 3, wherein the second segment comprises at least one video frame corresponding to the at least one reference image," Edson discloses (¶0066-¶0068, ¶0070, claim 1) that the system generates the animated image file based on the selected image file and the user-inputted text prompt; (¶0076-¶0079) each of the frames extracted from an animated image file may be compiled and rendered into a single image and utilized as input for a fine-tuned model.

Regarding claim 5, "The method according to claim 4, wherein a location of the at least one video frame in the second segment is determined based on a configuration operation of a user," Edson discloses (¶0047, ¶0050) that the generative AI model may apply one or more computer vision algorithms to identify positions or angles of the features of each layer; the generative AI model outputs a natural-language description of the identified feature positions and angles based on the user's input text instructions.

Regarding claim 8, "The method according to claim 1, wherein the configuration interface further comprises a target input component, and the method further comprises: acquiring at least one media parameter via the target input component, such that the second segment is generated further based on the at least one media parameter," Edson discloses (¶0050) that the user may input text instructions to change an angle or position of a described feature. The approved or modified natural-language text output of the generative AI model may be re-used by the same generative AI model or inputted into another model to generate the animated image file.

Regarding claim 9, "The method according to claim 8, wherein the at least one media parameter comprises at least one of: a first media parameter, used to indicate a motion amplitude of the segment to be generated; a second media parameter, used to indicate lens information of the segment to be generated; and a third media parameter, used to indicate proportional information of the segment to be generated," Edson discloses (¶0088, ¶0078-¶0079) that the user subsequently instructs the model to modify the image using a natural-language text prompt, such as "Move the cat's left paw to X degree" as represented in Fig. 5.

Regarding claim 10, "The method according to claim 1, further comprising: displaying a candidate prompt item corresponding to generating the first media content at the input control for inputting the prompt item in the configuration interface; and determining the modified candidate prompt item as the target prompt item in response to the candidate prompt item being modified," Edson discloses (¶0040) that the AIG module generates a storyboard based on the text instructions. The storyboard may represent text descriptions or questions of features, e.g., visual elements, actions, changes, events, or the like, of the animated image described in the text instructions. In one embodiment, the storyboard may be generated by the generative AI model using an input prompt that includes the text instructions of the user and instructions to the generative AI model to generate questions about the inputted text instructions of the user.

Regarding claim 11, "The method according to claim 1, further comprising: presenting at least one piece of media content generated based on a group of parameters; and receiving the selection of the first media content from the at least one piece of media content," Edson discloses (¶0061-¶0063) that the user is provided with multiple options to generate an animated image based on the selection of an image file as represented in Fig. 4.

Regarding claim 12, see the rejection similar to claim 1. Furthermore, Edson discloses (¶0005, ¶0097-¶0099) that the CRM device stores instructions executed by the processor to perform the above-mentioned method. Regarding claim 13, see the rejection similar to claim 2. Regarding claim 14, see the rejection similar to claim 3. Regarding claim 15, see the rejection similar to claim 4. Regarding claim 16, see the rejection similar to claim 5. Regarding claim 19, see the rejection similar to claim 8. Regarding claim 20, see the rejection similar to claim 1. Furthermore, Edson discloses (¶0005, ¶0097-¶0099) that the CRM device stores instructions executed by the processor to perform the above-mentioned method.

Claims 6-7 and 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Edson in view of Hicks as applied to claims 1-3 and 12 above, and further in view of US PG Pub 2023/0368534 to Vachhani ("Vachhani").

Regarding claim 6, "The method according to claim 3, wherein generating the second segment based on the target image, the target prompt item and the at least one reference image comprises…generating the second segment…," Edson discloses (¶0060-¶0061) that the user selects to generate an animated image using one of the selection options on the screen; (¶0029, ¶0036, ¶0062) the AIG module receives an image selection on the user interface as represented in Fig. 4D; (¶0063-¶0064) in response to the user selecting the image file, the user interface displays a text prompt field that enables the user to input natural-language instructions that may be used to effect edits to the selected image; (¶0066-¶0068, ¶0070, claim 1) the system generates the animated image file based on the selected image file and the user-inputted text prompt as represented in Figs. 4F-H. However, the combination of Edson and Hicks does not explicitly teach "determining a reference start frame of the segment to be generated based on the target image; determining a reference end frame of the segment to be generated based on the at least one reference image; and generating the second segment based on the reference start frame, the reference end frame and the target prompt item." Vachhani discloses (abstract) a method for generating a segment of a video by the device by identifying a context associated with the video; (¶0075) the device analyzes the parameter in every frame of the video, where the parameter includes a subject, an environment, an action of the subject, an object, etc., by determining in the frame an occurrence of a change in the parameter, and generates the segment of the video comprising the frame at which there is a change in the parameter as a temporal boundary of the at least one segment. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify Edson's and Hicks' systems by determining reference start/end frames to generate the second segment as taught by Vachhani in order to identify and generate intelligent temporal segments for improving temporal consistency (Vachhani - ¶0008).

Regarding claim 7, "The method according to claim 1, wherein the configuration interface corresponds to a second generation mode, and the method further comprises: determining a reference start frame of the segment to be generated based on the target image; and generating the second segment based on the reference start frame and the target prompt item," the combination of Edson and Vachhani teaches this limitation, where Edson discloses (¶0060-¶0061) that the user selects to generate an animated image using one of the selection options on the screen; (¶0029, ¶0036, ¶0062) the AIG module receives an image selection on the user interface as represented in Fig. 4D; (¶0063-¶0064) in response to the user selecting the image file, the user interface displays a text prompt field that enables the user to input natural-language instructions that may be used to effect edits to the selected image; (¶0066-¶0068, ¶0070, claim 1) the system generates the animated image file based on the selected image file and the user-inputted text prompt as represented in Figs. 4F-H, and Vachhani discloses (abstract) a method for generating a segment of a video by the device by identifying a context associated with the video; (¶0075) the device analyzes the parameter in every frame of the video, where the parameter includes a subject, an environment, an action of the subject, an object, etc., by determining in the frame an occurrence of a change in the parameter, and generates the segment of the video comprising the frame at which there is a change in the parameter as a temporal boundary of the at least one segment.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify Edson's and Hicks' systems by determining reference start/end frames to generate the second segment as taught by Vachhani in order to identify and generate intelligent temporal segments for improving temporal consistency (Vachhani - ¶0008).

Regarding claim 17, see the rejection similar to claim 6. Regarding claim 18, see the rejection similar to claim 7.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: US 2025/0259362 to Brooks.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to PINKAL R CHOKSHI, whose telephone number is (571) 270-3317. The examiner can normally be reached Monday - Friday, 8am-5pm.

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, BRIAN T PENDLETON, can be reached at (571) 272-7527. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/PINKAL R CHOKSHI/
Primary Examiner, Art Unit 2425

Prosecution Timeline

Jan 31, 2025
Application Filed
Jan 29, 2026
Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12598332
PROCESSING OF MULTI-VIEW VIDEO
Granted Apr 07, 2026 (2y 5m to grant)

Patent 12593114
APPARATUS AND A METHOD FOR SIGNALING INFORMATION IN A CONTAINER FILE FORMAT
Granted Mar 31, 2026 (2y 5m to grant)

Patent 12593084
VIDEO STREAMING SYSTEM AND VIDEO STREAMING METHOD
Granted Mar 31, 2026 (2y 5m to grant)

Patent 12581144
A METHOD OF PROVIDING A TIME-SYNCHRONIZED MULTI-STREAM DATA TRANSMISSION
Granted Mar 17, 2026 (2y 5m to grant)

Patent 12574599
METHOD AND SYSTEM FOR REDACTING UNDESIRABLE DIGITAL CONTENT
Granted Mar 10, 2026 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 60%
With Interview: 90% (+29.6%)
Median Time to Grant: 3y 5m
PTA Risk: Low
Based on 505 resolved cases by this examiner. Grant probability is derived from the career allow rate.
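
Per the panel's own footnote, these projections compose figures shown earlier: the grant probability is the career allow rate, and the with-interview figure adds the observed lift. A hedged reconstruction follows; the composition rule is assumed from the footnote, not a published model.

```python
# Reconstructing the projection figures above from the examiner metrics.
# Inputs are the report's displayed values; treating "with interview"
# as career allow rate + interview lift is an assumption.

career_allow_rate = 305 / 505          # -> 60% grant probability
interview_lift = 0.296                 # from Examiner Intelligence

with_interview = career_allow_rate + interview_lift   # -> 0.900

print(f"Grant probability: {career_allow_rate:.0%}")  # 60%
print(f"With interview:    {with_interview:.0%}")     # 90%
```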
