DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is in response to the Amendment filed on 12/09/2025.
Claims 1 and 4-7 are pending. Claims 1, 4, and 5 have been amended. Claims 2-3 have been cancelled. Claims 6-7 are newly added.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 1, 5 is/are rejected under 35 U.S.C. 103 as being unpatentable over Takahashi et al. (US 20200007759 A1, hereinafter Takahashi) in view of Katz et al. (US 20210269045 A1, hereinafter Katz).
Regarding Claim 1, Takahashi teaches a moving image generation system (Takahashi, Paragraph [0006], an album generation apparatus <read on moving image generation system> … a generation unit configured to generate an album by using the selected image) comprising a server configured to (Takahashi, Paragraph [0035], “apparatus 130 processes a large amount of data, and is configured, for instance, with a computer for a server”): receive shooting data (Takahashi, Paragraph [0032], “The input apparatus 113 is an apparatus for receiving an input”) including a moving image (Takahashi, Paragraph [0029], “The image may be a still image or a moving image”) and sound, analyze the sound of the shooting data to extract a period in which music is being reproduced (Takahashi, Paragraph [0049], “the image capturing apparatus 120 may store the image in the storage apparatus… album generation apparatus 130, or in a case where a certain time (for instance, one hour) elapses” [0050], “The processing in the album viewing apparatus 140 is performed by reading a program stored in the storage apparatus” [0053], “the album generation apparatus 130 generates the album in line with this edit policy <read on analyzing the input data to extract a period>”), set a clipping range (Takahashi, Paragraph [0053], “The generation of the album includes clipping a scene from a moving image”) based on [[ an analysis result of ]] the shooting data to correspond to the extracted period in which [[ music ]] is being reproduced (Takahashi, Paragraph [0047], “The image may be a still image or a moving image… The image capturing apparatus 120b installed on…the vehicle 101 <read on drive recorder>”), and automatically generate a moving image that is obtained by clipping the shooting data according to the clipping range (Takahashi, Paragraph [0053], “uses the image selected… to generate the album… The generation of the album includes clipping a scene from a moving image… inserting BGM… the album generation apparatus 130 determines the parameters by itself <read on automatically generates>”).
But Takahashi does not explicitly disclose that the input data includes sound, analyzing the sound of the shooting data to extract a period in which music is being reproduced, or setting the clipping range based on an analysis result of the shooting data to correspond to the extracted period in which music is being reproduced.
However, Katz teaches that the input data includes sound and analyzing the sound of the shooting data to extract a period in which music is being reproduced (Katz, Paragraph [0024], “further processing that took place further to capture the image, illumination condition during capturing images, features <read on shooting data> extracted from a digital image by sensor 130, or any other information associated with sensor data sensed by sensor… data received … may include … sound data <read on music>…sensor data may include metrics obtained by analyzing combinations of data from two or more sensors”), and setting the clipping range based on an analysis result of the shooting data to correspond to the extracted period in which music is being reproduced (Katz, Paragraph [0037], “Deep neural networks may be used for predicting various human traits, behavior and actions from input sensor data such as still images, videos, sound and speech” [0040], “Machine learning components can be used to detect or predict actions including talking, shouting, singing, driving”; [0024], “sensor data may include metrics obtained by analyzing combinations of data from two or more sensors”).
Katz and Takahashi are analogous art since both deal with in-vehicle media capture and system-driven generation of visual content. Takahashi provides a way to process video captured from a camera and clip scenes for automatic generation of an album. Katz provides a way to analyze the captured video/audio and generate analysis results for other semantic events. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the in-vehicle expression/utterance analysis taught by Katz into the modified invention of Takahashi such that the clipping range is based on analysis results of the captured video, which allows the system to generate albums suited to user preferences.
Regarding Claim 5, the combination of Takahashi and Katz teaches the invention in Claim 1.
The combination further teaches wherein the server is further configured to set a range in which a designated person is included in the image (Takahashi, Paragraph [0035], “apparatus 130 processes a large amount of data, and is configured, for instance, with a computer for a server.” [0052], “select an image, of one or more images stored in the storage apparatus 132, including the combination of the driver <read on designated person> 103 and the vehicle 101”), [[ a range that starts at a period in which a person in the image performs a particular gesture ]], as the clipping range (Takahashi, Paragraph [0053], “The generation of the album includes clipping a scene from a moving image”).
Takahashi does not explicitly disclose, but Katz teaches, a range that starts at a period in which a person in the image performs a particular gesture (Katz, Paragraph [0102], “Body gestures relate to any gesture performed by the driver by one or more body part, including gestures performed by hands, head, or eyes” [0129], “facial actions including: talking, yawning, blinking”).
Katz and Takahashi are analogous art since both deal with in-vehicle media capture and system-driven generation of visual content. Takahashi provides a way to select scenes based on user presence. Katz provides a way to analyze in-vehicle media based on gestures and expressions. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the gesture detection taught by Katz into the modified invention of Takahashi such that the system can initiate clip start points based on gesture detection, capture meaningful human moments during the drive, and generate albums suited to user preferences.
Claim(s) 4 is/are rejected under 35 U.S.C. 103 as being unpatentable over Takahashi et al. (US 20200007759 A1, hereinafter Takahashi) in view of Katz et al. (US 20210269045 A1, hereinafter Katz) as applied to Claim 1 above and further in view of Mortensen et al. (WO2020011203A1, hereinafter Mortensen).
Regarding Claim 4, the combination of Takahashi and Katz teaches the invention in Claim 1.
The combination does not explicitly disclose, but Mortensen teaches, wherein the server is further configured to use a musical piece that was played in a vehicle during shooting of the shooting data (Mortensen, Abstract, “The in vehicle karaoke can record, with a microphone…interior of the vehicle, audio of a performance of the song by the user and overlay the recording with the selected song over a speaker… interior of the vehicle”; [0032], “The karaoke experience can include playing a selected song and scrolling associated lyrics… The entertainment system then overlays signals received through microphone 138 over the song as it is playing. For some embodiments, a video output can also be provided over the displays”; [0019], “can include one or more other types of computers <read on server>”) for background music of the generated moving image (Mortensen, Paragraph [0032], “The karaoke experience can include playing a selected song…The entertainment system then overlays signals received through microphone 138 over the song as it is playing. For some embodiments, a video output can also be provided over the displays” [0033], “the entertainment system 150 may record the performance of a user as well as the song that is played. Accordingly, the entertainment system 150 can then share a single recording of the user's performance as well as the audio of the song”).
Mortensen and Takahashi are analogous art since both deal with in-vehicle media systems that generate video output with music. Takahashi provides a moving image generation system that clips scenes and inserts BGM. Mortensen provides a way to use a song being played in the vehicle, overlaying and recording it together with the performance as video output. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the use of the in-vehicle song as the video soundtrack, as taught by Mortensen, into the modified invention of Takahashi such that the musical piece that is playing can be used as background music for the moving image, which preserves the authentic in-cabin experience and provides a personalized soundtrack for the generated video.
Claim(s) 6 is/are rejected under 35 U.S.C. 103 as being unpatentable over Takahashi et al. (US 20200007759 A1, hereinafter Takahashi) in view of Katz et al. (US 20210269045 A1, hereinafter Katz) and further in view of Mortensen et al. (WO2020011203A1, hereinafter Mortensen) as applied to Claim 4 above, and further in view of Wang et al. (“Scene-Aware Background Music Synthesis”, 20201012, ACM, hereinafter Wang).
Regarding Claim 6, the combination of Takahashi, Katz, and Mortensen teaches the invention in Claim 4.
The combination further teaches the musical piece that was played in the vehicle during shooting of the shooting data (Mortensen, Abstract, “The in vehicle karaoke can record, with a microphone…interior of the vehicle, audio of a performance of the song by the user and overlay the recording with the selected song over a speaker… interior of the vehicle”; [0032], “The karaoke experience can include playing a selected song and scrolling associated lyrics… The entertainment system then overlays signals received through microphone 138 over the song as it is playing. For some embodiments, a video output can also be provided over the displays”; [0019], “can include one or more other types of computers <read on server>”).
But the combination does not explicitly disclose wherein the server is further configured to synchronize the background music.
However, Wang teaches wherein the server is further configured to synchronize the background music with the musical piece that was played (Wang, Page 1165, Section 4.1 Optimization Formulation, “We synthesize scene-aware background music by optimizing against the approximated total cost function… which are segments of time in music corresponding to a specific number of beats…denote the currently playing music track”).
Wang and Takahashi are analogous art since both deal with media systems that generate video output with music. Takahashi provides a moving image generation system that clips scenes and inserts BGM. Wang provides a way to generate the music video output by synthesizing the background music with the music that is currently playing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the music synthesis taught by Wang into the modified invention of Takahashi such that the background music for the moving image can be synchronized with the musical piece that is playing, which makes video generation in the vehicle smoother and allows the user to enjoy the personalized soundtrack more easily.
Claim(s) 7 is/are rejected under 35 U.S.C. 103 as being unpatentable over Takahashi et al. (US 20200007759 A1, hereinafter Takahashi) in view of Katz et al. (US 20210269045 A1, hereinafter Katz) as applied to Claim 1 above and further in view of Ahmad et al. (US 20030085913 A1, hereinafter Ahmad).
Regarding Claim 7, the combination of Takahashi and Katz teaches the invention in Claim 1.
The combination further teaches wherein the server is further configured to set the clipping range (Takahashi, Paragraph [0053], “The generation of the album includes clipping a scene from a moving image”);
But the combination does not explicitly disclose setting the clipping range based on a volume, a tone, or a song title of the musical piece that is being reproduced in the shooting data.
However, Ahmad teaches setting the clipping range based on a volume, [[ a tone ]], or a song title of the musical piece that is being reproduced in the shooting data (Ahmad, Paragraph [0041], “audio volume during the audio content display can be automatically determined and used to determine the duration of each slideshow image (i.e., when to transition from one slideshow image to a next)”; [0044], “The system can also be implemented to enable (and prompt for) user input of some metadata (e.g., titles for musical content, such as album and song titles)”).
Ahmad and Takahashi are analogous art since both deal with automatically selecting/defining portions of visual content in dependence on audio/music content. Takahashi provides a way to clip a scene from a moving image by setting the clipping range. Ahmad provides a way to automatically determine audio volume and other attributes and use those attributes to determine the duration of the clipping range. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the automatic determination of duration based on music attributes taught by Ahmad into the modified invention of Takahashi such that the determination of how long, or which, portions of a captured moving image are selected is guided by properties of the accompanying music, which improves synchronization and coherence between audio and visual content.
Response to Arguments
Applicant’s arguments with respect to claim 1, filed on 12/09/2025, regarding the rejection under 35 USC § 103, namely that the prior art does not teach the limitations “analyze the sound of the shooting data to extract a period in which music is being reproduced” and “set a clipping range based on an analysis result of the shooting data to correspond to the extracted period in which music is being reproduced”, have been considered but are not persuasive.
In response to the argument, prior art Takahashi teaches in Paragraph [0029] that the image the system handles may be either a still image or a moving image, and further teaches in Paragraphs [0049], [0050], and [0053] that the album generation is based on the sound analyzed within a certain duration, which reads on the limitation “analyze the sound of the shooting data to extract a period in which music is being reproduced”.
Prior art Takahashi further teaches in Paragraphs [0047] and [0053] that the clipped scene of the moving image is determined based on the analysis result from the parameters defined to process the input data. Although Takahashi does not explicitly disclose that the data is music data being reproduced, prior art Katz teaches in Paragraph [0037] that the result is generated from analysis of the sound data (i.e., music) that is playing. Hence, in combining Takahashi and Katz, the prior art combination fully teaches the limitation “set a clipping range based on an analysis result of the shooting data to correspond to the extracted period in which music is being reproduced”. Accordingly, the combination of the prior art teaches all the limitations of Claim 1.
In regard to Claims 4 and 5, they depend directly or indirectly on independent Claim 1. Applicant presents no arguments other than those directed to independent Claim 1. The limitations of those claims are addressed by the combination previously established, as explained above.
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to YUJANG TSWEI whose telephone number is (571)272-6669. The examiner can normally be reached 8:30am-5:30pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kent Chang can be reached at (571)272-7667. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/YuJang Tswei/Primary Examiner, Art Unit 2614