Last updated: April 19, 2026

Application No. 18/725,372

MUSIC GENERATION METHOD, APPARATUS AND SYSTEM, AND STORAGE MEDIUM

Non-Final OA §102 §103

Filed

Jun 28, 2024

Examiner

GODBOLD, DOUGLAS

Art Unit

2655

Tech Center

2600 — Communications

Assignee

Lemon Inc.

OA Round

1 (Non-Final)

Interview Optional

+10.5% interview lift. This examiner has a relatively high allow rate; a written response may suffice.

Based on 1079 resolved cases, 2023–2026

Examiner Intelligence

GODBOLD, DOUGLAS

Grants 83% — above average

Career Allow Rate

898 granted / 1079 resolved

+21.2% vs TC avg

+10.5% (moderate)

Interview Lift

Typical timeline

2y 10m

Avg Prosecution

25 currently pending

Career history

1104

Total Applications

across all art units

Statute-Specific Performance

§101

15.0%

-25.0% vs TC avg

§103

46.3%

+6.3% vs TC avg

§102

19.6%

-20.4% vs TC avg

§112

8.6%

-31.4% vs TC avg

Based on career data from 1079 resolved cases.

Office Action

§102 §103

DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This Office Action is in response to correspondence filed 28 June 2024, which has been accepted and considered. Claims 1-8 and 10-21 are pending and have been examined.

Response to Amendment
The preliminary amendment filed 28 June 2024 has been accepted and considered in this office action. Claims 8, 10, and 11 have been amended, claim 9 has been cancelled, and claims 12-21 have been newly added.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claim(s) 1-3, 7, 10-13, 17, and 18 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Ackerman et al. (US PAP 2021/0312897).

Consider claim 1, Ackerman teaches A music generation method (abstract), comprising: 
obtaining text information (0059, user may input lyrics), and performing voice synthesis on the text information, so as to obtain a voice audio corresponding to the text information (0066, 0091, generating singing voice via speech synthesis using the lyrics); 
obtaining an initial music audio, the initial music audio including a music key point, and music characteristics of the initial music audio having a sudden change at the position of an audio key point (0053-54 selecting a background track, which may include musical transitions such as verses, chorus, bridge, etc.); and 
synthesizing the voice audio and the initial music audio based on the position of the music key point to obtain a target music audio; in the target music audio, the voice audio appears at the position of the music key point of the initial audio music (0065-66, 0049, 0091, synthesizing singing voice according to lyrics and melodies generated for different sections of the music, such as chorus and verses).
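
As an illustration only, the final limitation of claim 1 (voice audio appearing at the position of a music key point) can be sketched in Python. This sketch is not taken from the application or from Ackerman; the function name `overlay_at_key_points`, the representation of audio as lists of float samples, and the plain additive mix are all assumptions made for the example.

```python
from typing import List


def overlay_at_key_points(
    music: List[float], voice: List[float], key_points: List[int]
) -> List[float]:
    """Return a copy of `music` with `voice` added in at each key point.

    Each key point is a non-negative sample index (an assumption of this
    sketch). Voice samples that would run past the end of the music are dropped.
    """
    mixed = list(music)  # copy so the caller's track is left untouched
    for start in key_points:
        for offset, sample in enumerate(voice):
            position = start + offset
            if position >= len(mixed):
                break  # truncate the voice clip at the end of the track
            mixed[position] += sample
    return mixed


# Hypothetical data: a silent 8-sample track and a 2-sample voice clip.
music = [0.0] * 8
voice = [0.5, 0.25]
mixed = overlay_at_key_points(music, voice, [2, 6])
print(mixed)
```

A real system would work on sampled audio buffers and would likely normalize or limit the summed signal; the sketch only shows the positional relationship the claim language describes.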

Consider claim 2, Ackerman teaches the method according to claim 1, wherein the performing voice synthesis on the text information to obtain the voice audio corresponding to the text information comprises: 
converting the text information into a corresponding voice using a text-to-speech method (0049, 0091, synthesizing singing voice using a speech synthesis); 
in response to an operation of selecting a timbre, selecting a target timbre from a plurality of preset timbres (0049, user may select from preset voices such as male and female which vary in timbre); and 
based on the target timbre, converting the voice corresponding to the text information into a voice audio (0049, synthesizing singing based on selected voice).

Consider claim 3, Ackerman teaches The method according to claim 1, wherein the obtaining an initial music audio comprises: 
in response to an operation of selecting a music category, selecting a target music category from a plurality of preset music categories (0052, 0069, figure 5B, selection of genres of background music); and 
selecting one music audio as the initial music audio from a plurality of music audios corresponding to the target music category (0052, 0069, figure 5B, selection of genres of background music, and then selecting from multiple tracks corresponding to genre).

Consider claim 7, Ackerman teaches The method according to claim 1, wherein the synthesizing the voice audio and the initial music audio based on the position of the music key point to obtain a target music audio comprises: 
matching the voice audio with at least one music key point according to a preset strategy, and different voice audios being matched with different music key points (0061-65, matching lyrics to sections of the songs); and 
injecting the voice audio into the initial music audio at a matched music key point based on the result of matching according to the preset strategy, and synthesizing the injected voice audio and the initial music audio into the target music audio (0065-66, 0049, 0091, synthesizing singing voice according to lyrics and melodies generated for different sections of the music, such as chorus and verses).

Consider claim 10, Ackerman teaches A system (abstract, figure 1) comprising at least one computing apparatus and at least one storage apparatus for storing instructions, wherein the instructions, when executed by the at least one computing apparatus, cause the at least one computing apparatus to perform steps (0033, CPU, memory, application for song generation) of a music generation method, comprising: 
obtaining text information (0059, user may input lyrics), and performing voice synthesis on the text information, so as to obtain a voice audio corresponding to the text information (0066, 0091, generating singing voice via speech synthesis using the lyrics); 
obtaining an initial music audio, the initial music audio including a music key point, and music characteristics of the initial music audio having a sudden change at the position of an audio key point (0053-54 selecting a background track, which may include musical transitions such as verses, chorus, bridge, etc.); and 
synthesizing the voice audio and the initial music audio based on the position of the music key point to obtain a target music audio; in the target music audio, the voice audio appears at the position of the music key point of the initial audio music (0065-66, 0049, 0091, synthesizing singing voice according to lyrics and melodies generated for different sections of the music, such as chorus and verses).

Consider claim 11, Ackerman teaches A non-transitory computer-readable storage medium, wherein the computer-readable storage medium stores a program or instructions (0106, computer readable media and instructions), which, when executed by at least one computing apparatus, cause the at least one computing apparatus to perform steps of a music generation method, comprising: 
obtaining text information (0059, user may input lyrics), and performing voice synthesis on the text information, so as to obtain a voice audio corresponding to the text information (0066, 0091, generating singing voice via speech synthesis using the lyrics); 
obtaining an initial music audio, the initial music audio including a music key point, and music characteristics of the initial music audio having a sudden change at the position of an audio key point (0053-54 selecting a background track, which may include musical transitions such as verses, chorus, bridge, etc.); and 
synthesizing the voice audio and the initial music audio based on the position of the music key point to obtain a target music audio; in the target music audio, the voice audio appears at the position of the music key point of the initial audio music (0065-66, 0049, 0091, synthesizing singing voice according to lyrics and melodies generated for different sections of the music, such as chorus and verses).

Claim 12 contains similar limitations as claim 2 and is therefore rejected for the same reasons.

Claim 13 contains similar limitations as claim 3 and is therefore rejected for the same reasons.

Claim 17 contains similar limitations as claim 2 and is therefore rejected for the same reasons.

Claim 18 contains similar limitations as claim 3 and is therefore rejected for the same reasons.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 4, 5, 14, 15, 19, and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Ackerman in view of Lemons (US PAP 2008/0264241).

Consider claim 4, Ackerman teaches the method according to claim 3, wherein the selecting one music audio as the initial music audio from a plurality of music audios corresponding to the target music category comprises: 
obtaining a plurality of music style templates corresponding to the target music category, the music style templates being audio templates for generating music (0052, 0069, figure 5B, selection of genres of background music, and then selecting from multiple tracks corresponding to genre); and 
in response to an operation of selecting a music style template, performing one of: selecting a target music style template from the plurality of music style templates as the initial music audio (0052, 0069, figure 5B, selection of genres of background music, and then selecting from multiple tracks corresponding to genre); randomly selecting a music style template from the plurality of music style templates as the initial music audio (OPTIONAL LIMITATION).
Ackerman does not specifically teach the music style templates created based on melody, chord progression and orchestration.
In the same field of music composition, Lemons teaches the music style templates created based on melody, chord progression and orchestration (0071-73, system may supply templates that include chord progressions, melody, and instrument selection based on the chosen genre).
It would have been obvious to one of ordinary skill in the art at the time of effective filing to use templates including melody, chord progression and orchestration as taught by Lemons in the system of Ackerman in order to provide more guidance to creating music to those users lacking experience (Lemons 0071).

Consider claim 5, Ackerman teaches the method according to claim 4, wherein the audio key point is located at any of a plurality of preset positions in the music style template, and wherein the plurality of preset positions include at least one of: a preset position before a chorus in the music style template (0053-54 selecting a background track, which may include musical transitions such as verses, chorus, bridge, etc.), a position in the music style template where its beat intensity is greater than or equal to a preset threshold, a preset position before or after a phrase in the music style templates (OPTIONAL LIMITATIONS).
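
One of the alternative positions recited in claim 5 is a point where beat intensity is greater than or equal to a preset threshold. A minimal Python sketch of that selection rule follows. It is illustrative only: the function name `key_points_by_intensity`, the per-beat intensity list, and the threshold value are hypothetical and do not come from the application or the cited references.

```python
from typing import List


def key_points_by_intensity(beat_intensity: List[float], threshold: float) -> List[int]:
    """Return the indices of beats whose intensity is >= the preset threshold."""
    return [
        index
        for index, intensity in enumerate(beat_intensity)
        if intensity >= threshold
    ]


# Hypothetical per-beat intensities for a short template.
intensities = [0.2, 0.9, 0.4, 0.8, 0.8]
points = key_points_by_intensity(intensities, threshold=0.8)
print(points)  # [1, 3, 4]
```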

Claim 14 contains similar limitations as claim 4 and is therefore rejected for the same reasons.

Claim 15 contains similar limitations as claim 5 and is therefore rejected for the same reasons.

Claim 19 contains similar limitations as claim 4 and is therefore rejected for the same reasons.

Claim 20 contains similar limitations as claim 5 and is therefore rejected for the same reasons.

Claim(s) 6, 8, 16, and 21 is/are rejected under 35 U.S.C. 103 as being unpatentable over Ackerman in view of Zhou et al. (US PAP 2022/0223125).

Consider claim 6, Ackerman teaches The method according to claim 1, but does not specifically teach wherein the synthesizing the voice audio and the initial music audio based on the position of the music key point to obtain a target music audio comprises: 
randomly matching the voice audio with at least one music key point, and different voice audios being matched with different music key points; and 
injecting the voice audio into the initial music audio at a matched music key point based on a result of the randomly matching, and synthesizing the injected voice audio and the initial music audio into the target music audio.
In the same field of musical generation, Zhou teaches wherein the synthesizing the voice audio and the initial music audio based on the position of the music key point to obtain a target music audio comprises: 
randomly matching the voice audio with at least one music key point, and different voice audios being matched with different music key points (0069-70, input lyric text may be randomly matched with rhythm bars and chords from template based on musical style); and 
injecting the voice audio into the initial music audio at a matched music key point based on a result of the randomly matching, and synthesizing the injected voice audio and the initial music audio into the target music audio (0095, generating singing track, matching with Ackerman at 0065-66, 0049, 0091, synthesizing singing voice according to lyrics and melodies generated for different sections of the music, such as chorus and verses).
It would have been obvious to one of ordinary skill in the art at the time of effective filing to randomly match lyric segments as taught by Zhou in the system of Ackerman in order to create variety in the generated music.
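
The random matching recited in claim 6, in which different voice audios are matched with different music key points, can be sketched as sampling without replacement. The Python below is a hedged illustration, not Zhou's or the applicant's method; `random_match`, the clip identifiers, and the key point values are hypothetical, and key points are assumed to be unique sample offsets.

```python
import random
from typing import Dict, List, Optional


def random_match(
    voice_ids: List[str], key_points: List[int], seed: Optional[int] = None
) -> Dict[str, int]:
    """Randomly assign each voice clip its own key point.

    Sampling without replacement guarantees that no two clips share a key
    point, provided the key points themselves are unique.
    """
    if len(voice_ids) > len(key_points):
        raise ValueError("need at least one key point per voice clip")
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    chosen = rng.sample(key_points, k=len(voice_ids))
    return dict(zip(voice_ids, chosen))


# Hypothetical clips and key point offsets (in samples).
matches = random_match(["clip_a", "clip_b"], [1000, 5000, 9000], seed=0)
print(matches)
```

Sampling without replacement is one simple way to satisfy the requirement that different clips land on different key points; other strategies would also fit the claim language.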

Consider claim 8, Ackerman teaches The method according to claim 6, wherein the synthesizing the injected voice audio and the initial music audio into the target music audio comprises:
performing at least one of reverberation processing, delay processing, compression processing and volume processing on the injected voice audio and the initial music audio to obtain the target music audio (0096, signal enhancements such as reverb may be added).
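
Two of the processing options listed in claim 8, delay processing and volume processing, can be illustrated with a short Python sketch. This is an assumption-laden example rather than anything disclosed in the application or in Ackerman: the helper names `apply_delay` and `apply_gain`, the single-echo model, and the parameter values are all hypothetical.

```python
from typing import List


def apply_gain(samples: List[float], gain: float) -> List[float]:
    """Volume processing: scale every sample by a constant gain."""
    return [sample * gain for sample in samples]


def apply_delay(samples: List[float], delay_samples: int, echo_gain: float) -> List[float]:
    """Delay processing: add one attenuated echo `delay_samples` later.

    Implements out[n] = x[n] + echo_gain * x[n - delay_samples]; the output
    is lengthened by `delay_samples` so the echo tail is not cut off.
    """
    out = list(samples) + [0.0] * delay_samples
    for index, sample in enumerate(samples):
        out[index + delay_samples] += sample * echo_gain
    return out


# Hypothetical two-sample signal: one impulse followed by silence.
echoed = apply_delay([1.0, 0.0], delay_samples=1, echo_gain=0.5)
processed = apply_gain(echoed, gain=0.5)
print(echoed)     # [1.0, 0.5, 0.0]
print(processed)  # [0.5, 0.25, 0.0]
```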

Claim 16 contains similar limitations as claim 6 and is therefore rejected for the same reasons.

Claim 21 contains similar limitations as claim 6 and is therefore rejected for the same reasons.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Cui et al. (US PAP 2020/0372896) and Yang (US PAP 2019/0385578) teach similar methods of generating singing voices along with backing music.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DOUGLAS C GODBOLD whose telephone number is (571)270-1451. The examiner can normally be reached 6:30am-5pm Monday-Thursday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders, can be reached at (571)272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

DOUGLAS GODBOLD
Examiner
Art Unit 2655



/DOUGLAS GODBOLD/ Primary Examiner, Art Unit 2655


Prosecution Timeline

Jun 28, 2024

Application Filed

Feb 11, 2026

Non-Final Rejection — §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/338,075

Patent 12585879

ARTIFICIAL INTELLIGENCE ASSISTED NETWORK OPERATIONS REPORTING AND MANAGEMENT

2y 5m to grant, granted Mar 24, 2026

18/327,780

Patent 12579371

USING MACHINE LEARNING TO GENERATE SEGMENTS FROM UNSTRUCTURED TEXT AND IDENTIFY SENTIMENTS FOR EACH SEGMENT

2y 5m to grant, granted Mar 17, 2026

18/489,671

Patent 12579372

KEY PHRASE TOPIC ASSIGNMENT

2y 5m to grant, granted Mar 17, 2026

18/492,524

Patent 12579383

VERIFYING TRANSLATIONS OF SOURCE TEXT IN A SOURCE LANGUAGE TO TARGET TEXT IN A TARGET LANGUAGE

2y 5m to grant, granted Mar 17, 2026

18/232,485

Patent 12572749

Compressing Information Provided to a Machine-Trained Model Using Abstract Tokens

2y 5m to grant, granted Mar 10, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

1-2

Expected OA Rounds

83%

Grant Probability

94%

With Interview (+10.5%)

2y 10m

Median Time to Grant

Low

PTA Risk

Based on 1079 resolved cases by this examiner. Grant probability derived from career allow rate.
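
The headline projections appear to follow from simple arithmetic on the examiner's career figures. The Python sketch below reproduces them under one assumption: that the interview lift is added, in percentage points, to the career allow rate. The page does not state its formula, so the additive model and the variable names are illustrative only.

```python
# Career figures reported on this page.
granted = 898
resolved = 1079

# Career allow rate, used here as the baseline grant probability.
allow_rate = granted / resolved

# Assumption: the +10.5% interview lift is additive in percentage points.
interview_lift = 0.105
with_interview = min(allow_rate + interview_lift, 1.0)

print(f"Grant probability: {allow_rate:.0%}")     # 83%
print(f"With interview:    {with_interview:.0%}")  # 94%
```

Both rounded values match the figures shown in the projections, which is consistent with, though not proof of, the additive reading.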
