DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments with respect to claims 1-19 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claims 1, 4-10 and 13-19 have been amended; the amendments to claim 19 overcome the previous rejection under 35 U.S.C. § 101.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 1, 10 and 19 recite the limitation "receiving image data and input sound data…wherein the image data and input sound data" and later recite “the sound data”. There is insufficient antecedent basis for “the sound data” in these claims. The Examiner suggests keeping the claim language consistent.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-19 are rejected under 35 U.S.C. 103 as being unpatentable over LI (US 2025/0061649) in view of Beith (US 2024/0078731) and further in view of Ayoub (US 2013/0201397).
Regarding claim 1, LI discloses a method for multimedia content generation, comprising:
receiving image data and input sound data from an object (from face recorder and voice recorder via the user interface; see at least Fig. 1 and paragraphs 0019-0020);
performing, with subsequent input sound data being obtained, a conversion operation on at least a portion of the input sound data to obtain converted sound data (converting voice into sound; see at least paragraph 0036);
generating audio data based at least on the converted sound data corresponding to the input sound data (converting voice into sound; see at least paragraph 0036);
aligning the audio data with the image data in time (fusing the image data and the voice to create a virtual avatar; see at least paragraph 0037); and
generating multimedia content associated with the object based on the aligned audio data and image data (generating a virtual avatar that incorporates the image and the voice; see at least paragraph 0037).
LI discloses the converted sound data, but does not clearly disclose that the image data and input sound data are captured concurrently in real time, that the conversion is performed in a streaming manner, or that the aligning is based on a time delay associated with the sound data.
Beith discloses that image data and input sound data are captured concurrently in real time and that the conversion is performed in a streaming manner: image data is received from cameras and speech of the user is received from microphones, and an avatar is generated based on the captured image data and sound data (see at least paragraphs 0077, 0085-0089 and 0091-0092), in a streaming environment (see at least paragraphs 0220-0221, 0233 and 0243).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify LI by the teachings of Beith to include the above limitations for the purpose of generating a representation of an avatar; see at least the Abstract.
LI in view of Beith does not clearly disclose that the aligning is based on a time delay associated with the converted sound data.
Ayoub discloses the above missing limitation: audio data associated with video data is received from a data source, the audio data is processed and converted, and the video data is delayed to compensate for the audio processing of the audio data, wherein the delay is set to a value that correlates to the audio processing so as to synchronize the audio generated at an audio rendering device with the video generated at a video rendering device; see at least paragraphs 0015, 0026, 0030 and 0037.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify LI in view of Beith by the teachings of Ayoub to include the above limitations in order to compensate for additional audio data processing; see at least paragraph 0026.
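For clarity on the delay-based alignment rationale discussed above, the following is a minimal sketch of the general technique of delaying video presentation to compensate for an audio-processing delay. All names and values (align_video_to_audio, the 120 ms delay, the 30 fps frame rate) are hypothetical illustrations and are not drawn from LI, Beith, or Ayoub.

# Illustrative sketch only (hypothetical names and values): delaying video
# output to compensate for a known audio-processing/conversion delay, in the
# spirit of the delay-based synchronization discussed above.

from collections import deque

AUDIO_PROCESSING_DELAY_S = 0.120   # assumed conversion delay (hypothetical)
VIDEO_FPS = 30                     # assumed video frame rate (hypothetical)

def align_video_to_audio(video_frames, delay_s=AUDIO_PROCESSING_DELAY_S, fps=VIDEO_FPS):
    """Delay video frames by delay_s so they line up with the converted audio."""
    delay_frames = round(delay_s * fps)      # number of frames of delay to introduce
    buffer = deque()
    for frame in video_frames:
        buffer.append(frame)
        if len(buffer) > delay_frames:
            yield buffer.popleft()           # each frame is emitted delay_frames late
    while buffer:
        yield buffer.popleft()               # flush the remaining buffered frames

# Example: with a 120 ms delay at 30 fps, frames are emitted about 4 frames late.
aligned_frames = list(align_video_to_audio(range(10)))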
Regarding claim 2, LI in view of Beith and further in view of Ayoub disclose the method of claim 1, wherein generating the audio data comprises:
obtaining background sound data played concurrently with capturing the input sound data and the image data;
aligning the converted sound data with the background sound data in time based on the time delay; and
generating the audio data based on the aligned converted sound data and background sound data (the claim does not distinguish between the background sound data and the sound data in claim 1; see at least the rejection of claim 1).
Regarding claim 3, LI in view of Beith and further in view of Ayoub disclose the method of claim 2, wherein aligning the converted sound data with the background sound data in time comprises:
shifting a start position of the background sound data backwards in time based on the time delay (Beith; see at least the rejection of claim 1).
Regarding claim 4, LI in view of Beith and further in view of Ayoub disclose the method of claim 1, wherein aligning the audio data with the image data in time comprises:
determining a data amount based on the time delay and a code rate; and
aligning the audio data with the image data in time by removing the data amount of the audio data from a start position of the audio data (the aligning of Ayoub; see at least the rejection of claim 1).
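To make the computation recited in claim 4 concrete, a minimal worked sketch follows; the 0.2 s delay and 128 kbps code rate are assumed, hypothetical numbers used only to show that the data amount equals the time delay multiplied by the code rate and that this amount is removed from the start of the audio data.

# Illustrative sketch only (assumed values): compute a data amount from the
# time delay and the code rate, then remove that amount from the start of the
# audio data to realign it with the image data.

def trim_audio_by_delay(audio_bytes: bytes, delay_s: float, bitrate_bps: int) -> bytes:
    """Remove delay_s worth of audio, measured at bitrate_bps, from the start."""
    bytes_to_remove = int(delay_s * bitrate_bps / 8)   # data amount = delay x code rate
    return audio_bytes[bytes_to_remove:]

# Example with assumed numbers: a 0.2 s delay at 128 kbps removes 3200 bytes.
audio = bytes(10_000)
aligned = trim_audio_by_delay(audio, delay_s=0.2, bitrate_bps=128_000)
assert len(aligned) == 10_000 - 3200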
Regarding claim 5, LI in view of Beith and further in view of Ayoub disclose the method of claim 1, wherein the time delay comprises at least one of:
a conversion time delay caused by obtaining the converted sound data based on the input sound data, or a capture time delay caused by obtaining the input sound data with a microphone (the processing and time delay of Ayoub; see at least the rejection of claim 1).
Regarding claim 6, LI in view of Beith and further in view of Ayoub disclose the method of claim 5, further comprising:
transmitting, to a remote device, the input sound data and a request for performing the conversion operation on at least the portion of the input sound data (Beith; see at least paragraphs 0082-0083, 0233 and 0237);
receiving the converted sound data from the remote device (Beith; see at least paragraphs 0082-0083, 0233 and 0237); and
obtaining the conversion time delay based on the transmission of the input sound data and the reception of the converted sound data (the combination of Beith’s transmitting and Ayoub’s processing and delay; see at least the rejection of claim 1).
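As an illustration of the measurement recited in claim 6, the sketch below timestamps the transmission of the input sound data and the reception of the converted sound data to obtain the conversion time delay; convert_remote is a hypothetical placeholder for a remote conversion call, not an API of any cited reference.

# Illustrative sketch only: estimate the conversion time delay by timing the
# round trip from transmitting the input sound data to receiving the converted
# sound data. convert_remote is a hypothetical placeholder.

import time

def measure_conversion_delay(input_sound: bytes, convert_remote) -> tuple[bytes, float]:
    """Send sound data for remote conversion; return (converted_data, delay_seconds)."""
    t_sent = time.monotonic()                 # time the request is transmitted
    converted = convert_remote(input_sound)   # blocks until converted sound is received
    t_received = time.monotonic()             # time the converted data is received
    return converted, t_received - t_sent     # conversion time delay

# Example with a stand-in converter that simply echoes the input back.
converted, delay = measure_conversion_delay(b"\x00" * 1024, lambda data: data)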
Regarding claim 7, LI in view of Beith and further in view of Ayoub disclose the method of claim 1, wherein the input sound data comprises a voice of the object, and the conversion operation comprises converting the voice into a user-specified timbre (LI; see at least paragraph 0036).
Regarding claim 8, LI in view of Beith and further in view of Ayoub disclose the method of claim 7, wherein the image data comprises a facial image of the object (LI; face recorder; see at least Fig. 1).
Regarding claim 9, LI in view of Beith and further in view of Ayoub disclose the method of claim 1, further comprising:
receiving a user input indicating content shooting; and
in response to the user input, triggering concurrent capture of the image data and the input sound data (LI; a user interface and face/voice recorder; see at least Fig. 1 and paragraphs 0018-0019 and the camera of Beith; see at least the rejection of claim 1).
Claim 10 is rejected on the same grounds as claim 1.
Claim 11 is rejected on the same grounds as claim 2.
Claim 12 is rejected on the same grounds as claim 3.
Claim 13 is rejected on the same grounds as claim 4.
Claim 14 is rejected on the same grounds as claim 5.
Claim 15 is rejected on the same grounds as claim 6.
Claim 16 is rejected on the same grounds as claim 7.
Claim 17 is rejected on the same grounds as claim 8.
Claim 18 is rejected on the same grounds as claim 9.
Claim 19 is rejected on the same grounds as claim 1.
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to YASSIN ALATA whose telephone number is (571)270-5683. The examiner can normally be reached Mon-Fri 7-4 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Nasser Goodarzi can be reached at 571-272-4195. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/YASSIN ALATA/Primary Examiner, Art Unit 2426