Prosecution Insights
Last updated: April 19, 2026
Application No. 18/620,342

SYSTEM AND METHOD FOR AI/XI BASED AUTOMATIC SONG FINDING AND ADAPTATION METHOD FOR VIDEOS

Non-Final OA — §103, §112
Filed: Mar 28, 2024
Examiner: WOO, STELLA L
Art Unit: 2693
Tech Center: 2600 — Communications
Assignee: BELLEVUE INVESTMENTS GmbH & Co. KGaA
OA Round: 1 (Non-Final)
Grant Probability: 80% (Favorable)
OA Rounds: 1-2
To Grant: 2y 9m
With Interview: 93%

Examiner Intelligence

Career Allow Rate: 80% (801 granted / 1007 resolved; +17.5% vs TC avg — above average)
Interview Lift: +13.2% (moderate lift, measured across resolved cases with vs. without an interview)
Avg Prosecution: 2y 9m (typical timeline; 21 applications currently pending)
Total Applications: 1028 (career history, across all art units)
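The headline percentages above appear to follow from the career counts by simple arithmetic. The sketch below assumes the interview lift is an additive percentage-point adjustment to the career allow rate, which the rounded figures shown (80% and 93%) are consistent with; this is an illustration, not the dashboard's actual formula.

```python
# Recover the displayed figures from the examiner's career counts.
granted, resolved = 801, 1007
career_allow = granted / resolved               # 0.7954... -> shown as 80%
interview_lift = 0.132                          # +13.2 percentage points
with_interview = career_allow + interview_lift  # 0.9274... -> shown as 93%

print(f"{career_allow:.1%}")    # 79.5%
print(f"{with_interview:.1%}")  # 92.7%
```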

Statute-Specific Performance

§101: 3.3% (-36.7% vs TC avg)
§103: 42.4% (+2.4% vs TC avg)
§102: 27.9% (-12.1% vs TC avg)
§112: 11.4% (-28.6% vs TC avg)
Tech Center averages are estimates; based on career data from 1007 resolved cases.
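The per-statute deltas above can be checked against the examiner's rates to recover the implied Tech Center baseline (examiner rate minus delta). Notably, all four statutes imply the same ~40% baseline, suggesting a single estimated TC average rather than per-statute baselines; this is an inference from the displayed numbers, not a documented methodology.

```python
# Recover the implied Tech Center average for each statute:
# implied TC avg = examiner rate - stated delta.
stats = {  # statute: (examiner_rate_pct, delta_vs_tc_pct)
    "101": (3.3, -36.7),
    "103": (42.4, +2.4),
    "102": (27.9, -12.1),
    "112": (11.4, -28.6),
}
for statute, (rate, delta) in stats.items():
    print(f"\u00a7{statute}: implied TC avg = {rate - delta:.1f}%")
# each statute's implied TC average works out to 40.0%
```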

Office Action

DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-4 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Claim 1 recites the limitation "said intrinsic video" in line 4. There is insufficient antecedent basis for this limitation in the claim. It is suggested to replace "video" with --audio--.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2 are rejected under 35 U.S.C. 103 as being unpatentable over Chinta et al. (US 2023/0031056 A1, "Chinta") in view of Herberger et al. (US 2006/0122842 A1, "Herberger").

As to claim 1, Chinta discloses a method of automatically matching a song to a video work (automatically recommending audio files, such as songs, background music, based on text and video content, para. 0017, using a matching operation of an AI engine, para. 0022), said video work having a video component and an intrinsic audio component associated therewith (video content may be a movie, a short video, a television show, video game content, a feature film, etc., para. 0018-0019), comprising the steps of:

(a) extracting said intrinsic video from said video work (textual information 116 from script, dialogues, monologues, narration, etc. of the video content, para. 0021, 0032);

(b) identifying all sections of said intrinsic audio that contain speech (determines video content features including dialogue, para. 0092, 0121, 0136);

(c) identifying a speech start time and a speech stop time of each of said sections of said intrinsic audio that contain speech, thereby obtaining at least one speech start time and speech stop time associated with said intrinsic audio (start position and stop position to insert audio files during non-verbal, para. 0050);

(d) obtaining an AI-selected song suitable for use with said video work (AI engine 108 automatically recommends best matching audio files, e.g. songs or background music, to be inserted at determined positions related to selected scenes, para. 0022, 0066);

(e) using each of said at least one speech start times and speech said stop times to identify a corresponding a music start time and a music stop time in said AI-selected song (audio files are inserted based on start and stop positions, para. 0050);

(f) generating an adaptation setting using each of said at least one music start times and music stop times, thereby obtaining at least one adaptation setting;

(g) applying each of said at least one adaptation settings to said AI-selected song, thereby obtaining an adapted song;

(h) adding said adapted song to said video work to form a combined video work, thereby matching said adapted song to said video work (recommended audio file is inserted at recommended positions of the video content, para. 0019); and

(i) performing at least a portion of said combined video work for a user (media content comprising the video content and inserted audio file is output, para. 0027).

Chinta differs from claim 1 in that it does not disclose the above underlined limitations, i.e. an adaptation setting. Herberger teaches automatically creating an emotionally controlled soundtrack that matches in overall emotion or mood to video scenes (Abstract) in which a user will be able to adapt the inserted audio file by raising or minimizing the volume level, adjusting the fade-in/fade-out, etc. (para. 0051). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chinta with the above teaching of Herberger in order to smooth the transition between a music-accompanied section of video and one with no soundtrack (para. 0076).

As to claim 2, Chinta in view of Herberger teaches: wherein the step of applying each of said at least one adaptation settings to said AI-selected song, said AI-selected having an original volume, comprises the steps of: (g1) for each of said at least one music start times and music stop times, either (i) reducing said original volume of said AI-selected song between said music start time and said music stop time by a predetermined amount (Herberger: added music soundtrack volume may be lowered to a subtle/background level for smooth transitions, para. 0048), or (ii) reducing an energy level of said AI-selected song between said music start time and said music stop time (Herberger: generated soundtrack may need to be adjusted in tempo, pitch, etc., para. 0050), thereby obtaining said adapted song (Herberger: after processing, and the user is satisfied with the result, the soundtrack is stored and made an integral part of the video content, para. 0071).

Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Chinta in view of Herberger, as applied to claim 2 above, and further in view of Singer (US 2012/0201518 A1). Chinta in view of Herberger differs from claim 3 in that it does not specifically teach: wherein said predetermined amount is between -5 dB and -15 dB. Singer teaches adding a music track to video, in which the volume of the music track is automatically reduced 15-45 dB (para. 0057). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chinta in view of Herberger with the above teaching of Singer in order to automatically reduce the volume sufficiently to hear the music track but not cover up the audio of the original (Singer: para. 0057).

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Chinta in view of Herberger, as applied to claim 2 above, and further in view of Zhao (US 2018/0349493 A1). Chinta in view of Herberger differs from claim 4 in that although it teaches inserting fade-ins and fade-outs before and after music loops to smooth the audio transitions (Herberger: para. 0024, 0048, 0052, 0080-0081), it does not specifically teach: wherein the step of reducing said original volume of said AI-selected song between said music start time and said music stop time by a predetermined amount, further comprises: (i1) beginning at least one second before said music start time, ramping down said original volume of said AI-selected song by said predetermined amount, (i2) beginning one second after said music stop time ramping up said volume of said AI-selected song to said original volume. Zhao teaches performing a fade-out effect one second before a specific time period (para. 0080) and a fade-in effect one second after a specific time period (para. 0082). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chinta in view of Herberger with the above teaching of Zhao in order to quickly apply fade-in/fade-out effects.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Balabhadrapatruni et al. (US 2017/0294208 A1) teach fade-in of background music 5 seconds before the end of a scene (para. 0061). Harel (US 9343053 B2) teaches adding audio sound effects to movies. Brochu (US 10,276,189 B1) teaches suggesting audio tracks for moods identified in video content. Ponochevnyi (US 2023/0260548 A1) teaches selecting music for video based on mood, genre, energy level, etc. Venti et al. (US 11,481,434 B1) teach generating a ranked list of audio files to "soundtrack" a video file based on emotion, style, subject matter, etc. Patterson et al. (US 2021/0272599 A1) teach automatically selecting at least one soundtrack for a user sourced video.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Stella L Woo, whose telephone number is (571) 272-7512. The examiner can normally be reached Monday - Friday, 8 a.m. to 5 p.m.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Ahmad Matar, can be reached at 571-272-7488. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Stella L. Woo/
Primary Examiner, Art Unit 2693
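The volume-ducking behavior at issue in claims 2-4 (reduce the song by a predetermined amount between each music start and stop time, with roughly one-second ramps around speech) can be sketched as a gain envelope. This is a hypothetical illustration, not the applicant's or any cited reference's implementation; the function and parameter names are invented, and the ramp timing is simplified to symmetric one-second ramps rather than claim 4's exact "(i1)/(i2)" language.

```python
def duck_gain_db(t, speech_segments, duck_db=-10.0, ramp_s=1.0):
    """Gain in dB to apply to the song at time t (seconds), given
    (start, stop) speech segments from the intrinsic audio."""
    gain = 0.0
    for start, stop in speech_segments:
        if start - ramp_s <= t < start:      # ramp down before speech starts
            frac = (t - (start - ramp_s)) / ramp_s
            gain = min(gain, duck_db * frac)
        elif start <= t <= stop:             # fully ducked during speech
            gain = min(gain, duck_db)
        elif stop < t <= stop + ramp_s:      # ramp back up after speech ends
            frac = 1.0 - (t - stop) / ramp_s
            gain = min(gain, duck_db * frac)
    return gain

# One speech section from t=5s to t=9s, ducked by -10 dB:
segments = [(5.0, 9.0)]
print(duck_gain_db(4.0, segments))  # 0.0   (before the ramp)
print(duck_gain_db(4.5, segments))  # -5.0  (halfway down the ramp)
print(duck_gain_db(7.0, segments))  # -10.0 (during speech)
print(duck_gain_db(9.5, segments))  # -5.0  (halfway back up)
```

A duck_db between -5 and -15 dB would fall within the range claim 3 recites; Singer's cited 15-45 dB reduction is a deeper cut, which is one place the claimed range could be argued to differ in effect.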

Prosecution Timeline

Mar 28, 2024
Application Filed
Oct 31, 2025
Non-Final Rejection — §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602416 — HYBRID ARTIFICIAL INTELLIGENCE SYSTEM FOR SEMI-AUTOMATIC PATENT CLAIMS ANALYSIS (granted Apr 14, 2026; 2y 5m to grant)
Patent 12587613 — System and method for documenting and controlling meetings with labels and automated operations (granted Mar 24, 2026; 2y 5m to grant)
Patent 12585681 — Methods for Converting Electronic Presentations Into Autonomous Information Collection and Feedback Systems (granted Mar 24, 2026; 2y 5m to grant)
Patent 12581038 — AUDIO PROCESSING IN VIDEO CONFERENCING SYSTEM USING MULTIMODAL FEATURES (granted Mar 17, 2026; 2y 5m to grant)
Patent 12568170 — PRIORITIZING EMERGENCY CALLS BASED ON CALLER RESPONSE TO AUTOMATED QUERY (granted Mar 03, 2026; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 80%
With Interview: 93% (+13.2%)
Median Time to Grant: 2y 9m
PTA Risk: Low
Based on 1007 resolved cases by this examiner. Grant probability derived from career allow rate.
