Last updated: May 29, 2026
Application No. 18/402,336
ELECTRIC VEHICLE DATA BASED VIDEO COMPOSITION AND CONTENT AUGMENTATION

Final Rejection §101 §103
Filed
Jan 02, 2024
Examiner
ZHANG, WAYNE
Art Unit
2672
Tech Center
2600 — Communications
Assignee
Rivian Ip Holdings LLC
OA Round
2 (Final)
Interview Optional

An interview has already been conducted in this application's prosecution history. This examiner has a 59% grant rate with a +31.6% interview lift. Because an interview has already been tried, a written response with narrowed claims, based on precedent claim evolution patterns, is recommended.
Based on 22 resolved cases, 2023–2026
Examiner Intelligence

ZHANG, WAYNE
Grants 59% of resolved cases
Career Allowance Rate
13 granted / 22 resolved
-2.9% vs TC avg
Strong +32% interview lift
Interview lift: +31.6% (resolved cases with an interview vs. without)
Typical timeline
2y 11m
Avg Prosecution
12 currently pending
Career history
Total Applications
across all art units
Statute-Specific Performance

§103
91.4%
+51.4% vs TC avg
§112
8.6%
-31.4% vs TC avg
Black line = Tech Center average estimate • Based on career data from 22 resolved cases
Office Action

§101 §103
DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1, 3-5, 11, 13-15, 20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Claim 1 recites:
“identify a plurality of videos taken from a vehicle, each video of the plurality of videos captured between a first time and a second time” which can be reasonably interpreted as a human observer mentally identifying videos captured from a vehicle.
“identify, for the plurality of videos, a plurality of video fragments, each video fragment of the plurality of video fragments corresponding to data of the vehicle at a time interval of a plurality of time intervals between the first time and the second time for each video of the plurality of videos;” which can be reasonably interpreted as a human observer mentally identifying clips throughout certain videos.
“determine, based on the plurality of video fragments input into a model trained on a data of a plurality of scenes, a type of scene for each video fragment of the plurality of video fragments;” which can be reasonably interpreted as a human observer mentally determining the type of scene of each video fragment. The model is recited at a high level of generality such that it does not impose any meaningful limitation on practicing the abstract idea.
“select, a set of video fragments based on the respective data and the respective type of scene of a plurality of sets of video fragments;” which can be reasonably interpreted as a human observer mentally selecting a set of video fragments.
“and generate a composite video using the set of video fragments.” is a well-understood, routine, and conventional insignificant extra-solution activity of data gathering.
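For orientation, the five steps recited in claim 1 can be sketched in a few lines of Python. This is only a hypothetical illustration of the claim language, not the applicant's implementation and not anything disclosed in the cited references. The names Fragment, split_into_fragments, classify_scene, select_fragments, and compose, the use of speed as the "data of the vehicle", and the threshold rule standing in for the trained model are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    camera: str        # which video the fragment was cut from (assumed field)
    interval: int      # index of the time interval between first and second time
    speed_kph: float   # assumed stand-in for "data of the vehicle"


def split_into_fragments(cameras, first_time, second_time, step, vehicle_data):
    """Steps 1-2: one fragment per camera per time interval."""
    n_intervals = (second_time - first_time) // step
    return [
        Fragment(cam, i, vehicle_data[i])
        for cam in cameras
        for i in range(n_intervals)
    ]


def classify_scene(fragment):
    """Step 3: placeholder for the trained model (a simple threshold here)."""
    return "highway" if fragment.speed_kph >= 80 else "city"


def select_fragments(fragments, wanted_scene):
    """Step 4: group fragments into per-interval sets, keep one match per set."""
    by_interval = {}
    for frag in fragments:
        by_interval.setdefault(frag.interval, []).append(frag)
    selected = []
    for interval in sorted(by_interval):
        matches = [f for f in by_interval[interval]
                   if classify_scene(f) == wanted_scene]
        if matches:
            selected.append(matches[0])
    return selected


def compose(selected):
    """Step 5: the 'composite video' is modelled as an ordered edit list."""
    return [(f.camera, f.interval) for f in selected]


fragments = split_into_fragments(["front", "rear"], 0, 30, 10, [40.0, 90.0, 100.0])
composite = compose(select_fragments(fragments, "highway"))
```

Whether steps such as these amount to a mental process or to a technical improvement is the legal question the rejection addresses; the sketch takes no position on it.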

Claim 3 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea of determining a driving session is complete and in response to this completion, identifying videos related to the driving session. A person can mentally determine when a driving session is complete and mentally identify clips from the driving session when it is complete.

Claim 4 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea of generating a score related to a video fragment and selecting a video fragment based on the score. A person can mentally determine a score for a video fragment, such as the quality, and selecting a video fragment is a well-understood, routine, and conventional insignificant extra-solution activity of data gathering.

Claim 5 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea of selecting a set of video fragments that correspond to a time interval and generating a composite video based on these fragments. A person can mentally select a set of video fragments based on a time interval and generating the composite video is a well-understood, routine, and conventional insignificant extra-solution activity of data gathering.

Claims 11 and 13-15 correspond to claims 1 and 3-5, respectively. Thus, they are rejected for the same reasons, as being directed to an abstract idea without significantly more.

Claim 20 corresponds to claim 1, additionally reciting non-transitory computer-readable media. This additional element amounts to adding the words “apply it” (or an equivalent) to the judicial exception, mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea. Thus, claim 20 is rejected for the same reasons, as being directed to an abstract idea without significantly more.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-2, 4-5, 11-12, 14-15, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Golov (US 20200250901 A1) in view of Coimbra De Andrade (herein referred to as Andrade) (US 20240395081 A1) and Aguilar (US 20180174616 A1).
	Regarding claim 1, Golov discloses a data processing system (Golov, paragraph [0044], “FIG. 1 shows a system in which a vehicle is configured with a data recorder to collect sensor data for improving its advanced driver assistance systems (ADAS) and for accident review.”), comprising:
	one or more processors coupled with memory (Golov, paragraph [0059], “The computer system (131) of the vehicle (111) includes one or more processors (133), a data recorder (101), and memory (135) storing firmware (or software) (147), including the computer instructions and data models for ADAS (105)”), to
identify a plurality of videos taken from a vehicle, each video of the plurality of videos captured between a first time and a second time (Golov, paragraph [0021], "For example, the data record can receive a data stream containing live video data from various cameras mounted on the vehicle and various output signals from the other sensors, such as radar, lidar, ultrasound sonar, etc. The sensor data can be raw data from the sensors or compressed image/video data"),
	identify, for the plurality of videos, a plurality of video fragments, each video fragment of the plurality of video fragments corresponding to data of the vehicle at a time interval of a plurality of time intervals between the first time and the second time for each video of the plurality of videos (Golov, paragraphs [0075]-[0076], "At block 203, the data recorder (101) buffers in the first cyclic buffer (161) a first sensor data stream of a first time duration (e.g., 30 seconds). At block 213, the data recorder (101) buffers in the second cyclic buffer (163) a second sensor data stream of a second time duration (e.g., 3 minutes).").
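The cyclic buffering quoted from Golov can be pictured with a short Python sketch. This is a minimal illustration under stated assumptions, not Golov's implementation: the one-sample-per-second rate, the helper name make_cyclic_buffer, and the use of integers as sensor samples are all hypothetical. Only the two durations (30 seconds and 3 minutes) come from the quoted passage.

```python
from collections import deque

SAMPLE_RATE_HZ = 1  # assumption: one sensor sample per second


def make_cyclic_buffer(duration_s, rate_hz=SAMPLE_RATE_HZ):
    """A fixed-capacity buffer that silently drops its oldest sample when full."""
    return deque(maxlen=duration_s * rate_hz)


first_buffer = make_cyclic_buffer(30)    # first time duration, e.g. 30 seconds
second_buffer = make_cyclic_buffer(180)  # second time duration, e.g. 3 minutes

# Feed five minutes of samples; each buffer retains only its most recent window.
for t in range(300):
    first_buffer.append(t)
    second_buffer.append(t)
```

After the loop, each buffer holds only the trailing window of the stream, which is the behavior the examiner reads onto fragments corresponding to time intervals.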
	Golov does not teach “determine, based on the plurality of video fragments input into a model trained on a data of a plurality of scenes, a type of scene for each video fragment of the plurality of video fragments”.
	However, Andrade teaches determine, based on the plurality of video fragments input into a model trained on a data of a plurality of scenes, a type of scene for each video fragment of the plurality of video fragments (Coimbra de Andrade, paragraph [0021], "As an example, the classifier machine learning model may assign the following risk-related labels based on an analysis of the video data (e.g., where a “0” indicates a low risk, a “1” indicates a mild violation (a mild risk), a “2” indicates a severe violation (a high risk), and a “3” indicates a collision): a tailgating severity label (e.g., 0, 1, or 2), a stop sign violation severity label (e.g., 0, 1, 2, or 3), a minor severity confidence label (e.g., from 0 to 1), a moderate severity confidence label (e.g., from 0 to 1), a major severity confidence label (e.g., from 0 to 1), a critical severity confidence label (e.g., from 0 to 4), a presence of a vulnerable road user (VRU) label (e.g., 0, 1, or 2), and/or the like").
	It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention of the instant application to assign a severity to each video of Golov, as taught by Andrade.
The suggestion/motivation for doing so would have been to provide a more accurate diagnosis of incidents and enhance safety as a result.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. 
	Golov in view of Andrade does not teach “select, a set of video fragments based on the respective data and the respective type of scene of a plurality of sets of video fragments; and generate a composite video using the set of video fragments”.
	However, Aguilar teaches select, a set of video fragments based on the respective data and the respective type of scene of a plurality of sets of video fragments; and generate a composite video using the set of video fragments (Aguilar, paragraph [0028], "Video segments can be selected from the set of one or more source video clips, and the video segments can be combined to create a compiled video. Video segments may be selected so as to create a compiled video of a fixed duration. For example, if a user wishes to create a compiled video having a duration of one minute, video segments totaling one minute in duration may be selected, or video segments may be selected and edited so that the compiled video is one minute long. Video segments can be selected based on various video segment selection criteria. In certain embodiments, source video clips can be analyzed to automatically determine a common theme, and video segments can be selected based on the common theme. In various embodiments, themes may be automatically determined using machine learning techniques, such as object recognition and/or facial recognition. In other embodiments, a user may provide a user-specified theme, and video segments can be selected based on the user-specified theme. Once a plurality of video segments are selected, they can be combined into a compiled video").  
	It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention of the instant application to create a video montage using the assigned severities of Golov’s (in view of Andrade) videos, as taught by Aguilar.
The suggestion/motivation for doing so would have been to allow viewers to be more knowledgeable about accidents and enhance precaution and safety.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. 
Therefore, it would have been obvious to combine Golov in view of Andrade and Aguilar to obtain the invention as specified in claim 1.

Regarding claim 2, Golov in view of Andrade and Aguilar discloses the system of claim 1.
Golov in view of Andrade and Aguilar does not explicitly teach “wherein each video of the plurality of videos is captured by a camera of a plurality of cameras of the vehicle, each of the plurality of cameras turned to a direction different from a direction of each other of the plurality of cameras”.
However, Andrade additionally teaches wherein each video of the plurality of videos is captured by a camera of a plurality of cameras of the vehicle, each of the plurality of cameras turned to a direction different from a direction of each other of the plurality of cameras (Coimbra De Andrade, paragraph [0001], "A video system may utilize machine learning models to classify driving events (e.g., tailgating, a collision, distraction, drowsiness, and/or the like) triggered by accelerometers, front facing cameras, driver facing cameras, and/or the like").
It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention of the instant application to additionally mount cameras on different sides of Golov’s (in view of Andrade and Aguilar) vehicle, as additionally taught by Andrade.
The suggestion/motivation for doing so would have been to allow a driver to see from all sides, enhancing safety and awareness.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. 
Therefore, it would have been obvious to combine Golov in view of Andrade, Aguilar, and the additional teachings of Andrade to obtain the invention as specified in claim 2.

Regarding claim 4, Golov in view of Andrade and Aguilar discloses the system of claim 1, wherein the one or more processors are configured to: generate, for each video fragment of each set of video fragments of the plurality of sets of video fragments, a score determined according to the data and the type of scene of the respective video fragment (Coimbra de Andrade, paragraph [0021], "As an example, the classifier machine learning model may assign the following risk-related labels based on an analysis of the video data (e.g., where a “0” indicates a low risk, a “1” indicates a mild violation (a mild risk), a “2” indicates a severe violation (a high risk), and a “3” indicates a collision): a tailgating severity label (e.g., 0, 1, or 2), a stop sign violation severity label (e.g., 0, 1, 2, or 3), a minor severity confidence label (e.g., from 0 to 1), a moderate severity confidence label (e.g., from 0 to 1), a major severity confidence label (e.g., from 0 to 1), a critical severity confidence label (e.g., from 0 to 4), a presence of a vulnerable road user (VRU) label (e.g., 0, 1, or 2), and/or the like", severity is a score),
select, for each respective set of video fragments, a selected video fragment of the respective set according to the score of the selected video fragment (Aguilar, paragraph [0028], "Video segments can be selected from the set of one or more source video clips, and the video segments can be combined to create a compiled video. Video segments may be selected so as to create a compiled video of a fixed duration. For example, if a user wishes to create a compiled video having a duration of one minute, video segments totaling one minute in duration may be selected, or video segments may be selected and edited so that the compiled video is one minute long. Video segments can be selected based on various video segment selection criteria. In certain embodiments, source video clips can be analyzed to automatically determine a common theme, and video segments can be selected based on the common theme. In various embodiments, themes may be automatically determined using machine learning techniques, such as object recognition and/or facial recognition. In other embodiments, a user may provide a user-specified theme, and video segments can be selected based on the user-specified theme. Once a plurality of video segments are selected, they can be combined into a compiled video", based on the severity, a video compilation is generated, as shown in claim 1’s modification).  
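The two limitations of claim 4 (score each fragment from its vehicle data and scene type, then select one fragment per set according to the score) can be sketched as follows. This is a hedged illustration only: the scoring formula, the scene weights, the dictionary layout, and the choice of the highest score as the selection rule are assumptions, since the claim says only that selection is made "according to the score".

```python
def score(vehicle_value, scene_type, scene_weights):
    """Hypothetical score: vehicle data scaled by a per-scene weight."""
    return vehicle_value * scene_weights.get(scene_type, 0.0)


def pick_per_set(sets_of_fragments, scene_weights):
    """For each set of fragments, return the id of the highest-scoring one."""
    picks = []
    for fragment_set in sets_of_fragments:
        best = max(
            fragment_set,
            key=lambda f: score(f["data"], f["scene"], scene_weights),
        )
        picks.append(best["id"])
    return picks


weights = {"scenic": 2.0, "city": 1.0}  # assumed weights
sets_of_fragments = [
    [{"id": "a", "data": 1.0, "scene": "city"},
     {"id": "b", "data": 0.8, "scene": "scenic"}],
    [{"id": "c", "data": 0.5, "scene": "scenic"},
     {"id": "d", "data": 1.5, "scene": "city"}],
]
picks = pick_per_set(sets_of_fragments, weights)
```

Note that the score here depends on both the data and the scene type, as the claim recites, whereas the severity labels quoted from Andrade are derived from an analysis of the video data.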
	Regarding claim 5, Golov in view of Andrade and Aguilar discloses the system of claim 1.
Golov in view of Andrade and Aguilar does not teach “wherein the one or more processors are configured to: select for a plurality of sets of video fragments corresponding to at least a subset of the plurality of time intervals; and generate the composite video using the set of video fragments corresponding to the at least a subset of the plurality of time intervals”.
However, Aguilar additionally teaches wherein the one or more processors are configured to: select for a plurality of sets of video fragments corresponding to at least a subset of the plurality of time intervals; and generate the composite video using the set of video fragments corresponding to the at least a subset of the plurality of time intervals (Aguilar, paragraph [0028], "For example, if a user specifies that he or she would like a compiled video having a duration of 30-seconds, and there are six source video clips, a five-second video segment can be selected from each source video clip").
It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention of the instant application to create a video montage based on clips with similar lengths from Golov’s (in view of Andrade and Aguilar) livestream, as additionally taught by Aguilar.
The suggestion/motivation for doing so would have been to keep the video montage format organized and consistent with a certain video length.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. 
Therefore, it would have been obvious to combine Golov in view of Andrade and Aguilar with the additional teachings of Aguilar to obtain the invention as specified in claim 5.
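The arithmetic in the Aguilar example relied on for claim 5 (a 30-second compiled video from six source clips, five seconds from each) can be checked with a small sketch. The function name plan_segments, the choice to take each segment from the start of its clip, and the clip lengths used below are assumptions for illustration; Aguilar's quoted passage specifies only the 30-second total, the six clips, and the resulting five-second segments.

```python
def plan_segments(total_duration_s, clip_lengths_s):
    """Split a target duration evenly across clips; return (clip, start, end)."""
    if not clip_lengths_s:
        raise ValueError("at least one source clip is required")
    per_clip = total_duration_s / len(clip_lengths_s)
    plan = []
    for index, clip_len in enumerate(clip_lengths_s):
        if clip_len < per_clip:
            raise ValueError(f"clip {index} is shorter than {per_clip} s")
        # Assumption: take the segment from the start of each clip.
        plan.append((index, 0.0, per_clip))
    return plan


# Six hypothetical source clips, target compiled length of 30 seconds.
plan = plan_segments(30, [60, 45, 20, 90, 12, 30])
total = sum(end - start for _, start, end in plan)
```

Each of the six entries covers five seconds, and the segments sum to the requested 30 seconds.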

	Claims 11-12 and 14-15 correspond to claims 1-2 and 4-5, respectively. Thus, claims 11-12 and 14-15 are rejected for the same reasons of obviousness as claims 1-2 and 4-5, respectively.

	Claim 20 corresponds to claim 1, additionally reciting a non-transitory computer-readable media having processor readable instructions (Golov, paragraph [0122], “Examples of computer-readable media include but are not limited to non-transitory, recordable and non-recordable type media such as volatile and non-volatile memory devices, read only memory (ROM), random access memory (RAM), flash memory devices, floppy and other removable disks, magnetic disk storage media, optical storage media (e.g., Compact Disk Read-Only Memory (CD ROM), Digital Versatile Disks (DVDs), etc.), among others”). Thus, claim 20 is rejected for the same reasons of obviousness as claim 1.

Claim(s) 3 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Golov (US 20200250901 A1) in view of Coimbra De Andrade (herein referred to as Andrade) (US 20240395081 A1), Aguilar (US 20180174616 A1), and in further view of Simmons (US 20190333409 A1).
Regarding claim 3, Golov in view of Andrade and Aguilar discloses the system of claim 1.
Golov in view of Andrade and Aguilar does not teach “wherein the one or more processors are configured to: determine that a drive session is complete; and identify, responsive to the determination that the drive session is complete, the plurality of videos of the drive session captured by a plurality of cameras of the vehicle”.
However, Simmons teaches wherein the one or more processors are configured to: determine that a drive session is complete; and identify, responsive to the determination that the drive session is complete, the plurality of videos of the drive session captured by a plurality of cameras of the vehicle (Simmons, paragraph [0039], "In some embodiments, the SDMS app can automatically activate any of the cameras at the start of a driving session, deactivate the cameras at the end of the driving session, and upload the recorded video log to the SDMS 100.").
It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention of the instant application to transmit videos when a drive session of Golov (in view of Andrade and Aguilar) is complete, as taught by Simmons.
The suggestion/motivation for doing so would have been to automate the process of transmission, reducing the manual need for a person to upload the videos.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. 
Therefore, it would have been obvious to combine Golov in view of Andrade, Aguilar, and in further view of Simmons to obtain the invention as specified in claim 3.

Claim 13 corresponds to claim 3. Thus, claim 13 is rejected for the same reasons of obviousness as claim 3.

Claim(s) 6 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Golov (US 20200250901 A1) in view of Coimbra De Andrade (herein referred to as Andrade) (US 20240395081 A1), Aguilar (US 20180174616 A1), and in further view of CfTesla on Youtube (https://www.youtube.com/watch?v=TF1qdUcLCCM).
Regarding claim 6, Golov in view of Andrade and Aguilar discloses the system of claim 1.
Golov in view of Andrade and Aguilar does not teach “wherein each of the plurality of sets of video fragments includes a plurality of video fragments from the plurality of videos captured by a plurality of cameras, each of the plurality of sets of video fragments corresponding to a different time interval of the plurality of time intervals between the first time and the second time”.
However, CfTesla teaches wherein each of the plurality of sets of video fragments includes a plurality of video fragments from the plurality of videos captured by a plurality of cameras, each of the plurality of sets of video fragments corresponding to a different time interval of the plurality of time intervals between the first time and the second time (CfTesla, 4:36, screenshot below).

[Screenshot of the CfTesla video at 4:36: media_image1.png]

It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention of the instant application to capture Golov’s (in view of Andrade and Aguilar) videos of different time intervals, as taught by CfTesla.
The suggestion/motivation for doing so would have been to have non-overlapping clips, resulting in better editing and compiling of the videos.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. 
Therefore, it would have been obvious to combine Golov in view of Andrade, Aguilar, and in further view of CfTesla to obtain the invention as specified in claim 6.

Claim 16 corresponds to claim 6. Thus, claim 16 is rejected for the same reasons of obviousness as claim 6.

Claim(s) 7, 10, 17 are rejected under 35 U.S.C. 103 as being unpatentable over Golov (US 20200250901 A1) in view of Coimbra De Andrade (herein referred to as Andrade) (US 20240395081 A1), Aguilar (US 20180174616 A1), and in further view of van Welzen (US 20220108727 A1).
Regarding claim 7, Golov in view of Andrade and Aguilar discloses the system of claim 1.
Golov in view of Andrade and Aguilar does not teach “wherein the one or more processors are configured to: identify a feature of the composite video; determine, based at least on the composite video input into a model trained using machine learning on a data comprising a plurality of features in the plurality of scenes, a scene of the composite video corresponding to the feature; select, based on the scene, content to insert into the composite video, and provide, for display, the composite video including the content”.
However, van Welzen teaches wherein the one or more processors are configured to: identify a feature of the composite video (van Welzen, paragraph [0032], “In some embodiments, video represented by the game data 202, video uploaded by a user, stored videos from prior game sessions, etc. may be analyzed by the event detector 114 using machine learning models, neural networks, and/or other artificial intelligence techniques to identify events from the video or image data”),
determine, based at least on the composite video input into a model trained using machine learning on a data comprising a plurality of features in the plurality of scenes, a scene of the composite video corresponding to the feature (van Welzen, paragraph [0031], “With reference to FIG. 2, the event detector 114 may analyze game data 202 (e.g., live game data from live game sessions, pre-recorded game data from previously played game sessions, video of game sessions, etc.) to create event logs 204 corresponding to particular event types the event detector 114 is programmed or trained to detect and/or to trigger the highlight generator 116 to trigger the recorder 118 to generate a highlight (e.g., a game video(s) 206) from the game data”),
select, based on the scene, content to insert into the composite video (van Welzen, paragraph [0014], "The recipe may be configured such that, when executed, one or more input videos and event metadata corresponding to the input video (e.g., event logs corresponding to events of particular types—such as kills or deaths in first person shooter (FPS) style games, or goals, home runs, touchdowns, or other scoring plays in sports style games) may be used to generate a video montage script").
and provide, for display, the composite video including the content (van Welzen, paragraph [0022], "The client devices 104 may include a game application 106, a display 108, a graphical user interface (GUI) 110, and/or an input device(s) 112").
It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention of the instant application to implement van Welzen’s method of finding a feature in Golov’s (in view of Andrade and Aguilar) video, using a model to acquire a scene, finding content to insert into Golov’s (in view of Andrade and Aguilar) video, and displaying it.
The suggestion/motivation for doing so would have been to acquire more scenes for video compilation, which further gives the viewer more clips to watch and further their understanding of vehicle safety.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. 
Therefore, it would have been obvious to combine Golov in view of Andrade, Aguilar, and in further view of van Welzen to obtain the invention as specified in claim 7.

Regarding claim 10, Golov in view of Andrade, Aguilar, and van Welzen discloses the system of claim 7, wherein the one or more processors are configured to: generate, based at least on the scene input into a second model trained using machine learning on data comprising a plurality of contents, the content to insert into the composite video  (van Welzen, paragraph [0014], "The recipe may be configured such that, when executed, one or more input videos and event metadata corresponding to the input video (e.g., event logs corresponding to events of particular types—such as kills or deaths in first person shooter (FPS) style games, or goals, home runs, touchdowns, or other scoring plays in sports style games) may be used to generate a video montage script").
and select the content responsive to the generating (van Welzen, paragraph [0049], "For example, once a recipe, parameters, data sources, and/or other information for the video montage 214 are selected, the video preview may be generated for the user and displayed with the GUI 110").

Claim 17 corresponds to claim 7. Thus, claim 17 is rejected for the same reasons of obviousness as claim 7.

Claim(s) 8-9, 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Golov (US 20200250901 A1) in view of Coimbra De Andrade (herein referred to as Andrade) (US 20240395081 A1), Aguilar (US 20180174616 A1), van Welzen (US 20220108727 A1), and in further view of Muramatsu (US 20130070096 A1).
Regarding claim 8, Golov in view of Andrade, Aguilar, and van Welzen discloses the system of claim 7.
Golov in view of Andrade, Aguilar, and van Welzen does not teach “wherein the one or more processors are configured to: identify data of the vehicle corresponding to a fragment of the composite video; select, based on the data and the scene, the content to insert into the composite video”.
However, Muramatsu teaches wherein the one or more processors are configured to: 
identify data of the vehicle corresponding to a fragment of the composite video; select, based on the data and the scene, the content to insert into the composite video (Muramatsu, paragraph [0011], "An aspect of the present invention is an object detection device detecting an object near a vehicle from an input video image, the input video image being a video image of surroundings of the vehicle shot from the vehicle. This object detection device is provided with: a video image converting section converting the input video image to a characteristics video image into which image characteristics have been extracted from the input video image; a video images-classified-by-distance extracting/composing section extracting areas which differ according to distances from the characteristics video image as video images classified by distance, on the basis of the distance from the vehicle, and composing a composite video image using the video images classified by distance;").
	It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention of the instant application to perform object detection on Golov’s (in view of Andrade, Aguilar, and van Welzen) video and to insert the detected objects into Golov’s (in view of Andrade, Aguilar, and van Welzen) montage video, as taught by Muramatsu.
The suggestion/motivation for doing so would have been to allow viewers to understand where certain objects/vehicles are during a crash, which furthers their understanding of the situation they are watching.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. 
Therefore, it would have been obvious to combine Golov in view of Andrade, Aguilar, van Welzen, and in further view of Muramatsu to obtain the invention as specified in claim 8.

Regarding claim 9, Golov in view of Andrade, Aguilar, and van Welzen discloses the system of claim 7.
Golov in view of Andrade, Aguilar, and van Welzen does not teach “wherein the one or more processors are configured to: identify data of the vehicle corresponding to a fragment of the composite video; select, based on the data and the scene, the content to insert into the composite video”.
However, Muramatsu teaches wherein the one or more processors are configured to: identify data of the vehicle corresponding to a fragment of the composite video; select, based on the data and the scene, the content to insert into the composite video (Muramatsu, paragraph [0011], "An aspect of the present invention is an object detection device detecting an object near a vehicle from an input video image, the input video image being a video image of surroundings of the vehicle shot from the vehicle. This object detection device is provided with: a video image converting section converting the input video image to a characteristics video image into which image characteristics have been extracted from the input video image; a video images-classified-by-distance extracting/composing section extracting areas which differ according to distances from the characteristics video image as video images classified by distance, on the basis of the distance from the vehicle, and composing a composite video image using the video images classified by distance;", object detection shows a feature location).
	It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention of the instant application to perform object detection on Golov’s (in view of Andrade, Aguilar, and van Welzen) video and to insert the detection results into Golov’s (in view of Andrade, Aguilar, and van Welzen) montage video, as taught by Muramatsu.
The suggestion/motivation for doing so would have been to allow viewers to understand where certain objects or vehicles are during a crash, which furthers their understanding of the situation they are watching.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. 
Therefore, it would have been obvious to combine Golov in view of Andrade, Aguilar, van Welzen, and in further view of Muramatsu to obtain the invention as specified in claim 9.

Claim 18 corresponds to claim 8. Therefore, claim 18 is rejected for the same reasons of obviousness as claim 8.

Regarding claim 19, Golov in view of Andrade, Aguilar, and van Welzen discloses the method of claim 17, comprising:
generate, based at least on the scene input into a second model trained using machine learning on data comprising a plurality of contents, the content to insert into the composite video (van Welzen, paragraph [0014], "The recipe may be configured such that, when executed, one or more input videos and event metadata corresponding to the input video (e.g., event logs corresponding to events of particular types—such as kills or deaths in first person shooter (FPS) style games, or goals, home runs, touchdowns, or other scoring plays in sports style games) may be used to generate a video montage script"),
and select the content responsive to the generating (van Welzen, paragraph [0049], "For example, once a recipe, parameters, data sources, and/or other information for the video montage 214 are selected, the video preview may be generated for the user and displayed with the GUI 110").
	Golov in view of Andrade, Aguilar, and van Welzen does not teach “identifying, by the one or more processors a location of the feature in a frame of the composite video; inserting, by the one or more processors, the content into the frame of the composite video according to the location of the feature”.
	However, Muramatsu teaches identify a location of the feature in a frame of the composite video; and insert the content into the frame of the composite video according to the location of the feature (Muramatsu, paragraph [0011], "An aspect of the present invention is an object detection device detecting an object near a vehicle from an input video image, the input video image being a video image of surroundings of the vehicle shot from the vehicle. This object detection device is provided with: a video image converting section converting the input video image to a characteristics video image into which image characteristics have been extracted from the input video image; a video images-classified-by-distance extracting/composing section extracting areas which differ according to distances from the characteristics video image as video images classified by distance, on the basis of the distance from the vehicle, and composing a composite video image using the video images classified by distance;").
	It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention of the instant application to perform object detection on Golov’s (in view of Andrade, Aguilar, and van Welzen) video and to insert the detection results into Golov’s (in view of Andrade, Aguilar, and van Welzen) montage video, as taught by Muramatsu.
The suggestion/motivation for doing so would have been to allow viewers to understand where certain objects or vehicles are during a crash, which furthers their understanding of the situation they are watching.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. 
Therefore, it would have been obvious to combine Golov in view of Andrade, Aguilar, van Welzen, and in further view of Muramatsu to obtain the invention as specified in claim 19.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WAYNE ZHANG whose telephone number is (571) 272-0245. The examiner can normally be reached Monday-Friday 10:00-6:00 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ms. Sumati Lefkowitz can be reached on (571) 272-3638. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/WAYNE ZHANG/Examiner, Art Unit 2672



/SUMATI LEFKOWITZ/Supervisory Patent Examiner, Art Unit 2672
Prosecution Timeline

Jan 02, 2024: Application Filed
Dec 16, 2025: Non-Final Rejection mailed (§101, §103)
Mar 01, 2026: Interview Requested
Mar 06, 2026: Applicant Interview (Telephonic)
Mar 12, 2026: Response Filed
Mar 13, 2026: Examiner Interview Summary
May 27, 2026: Final Rejection mailed (§101, §103; current)
Precedent Cases

Applications granted by this same examiner with similar technology

17/995,033, Patent 12591990: METHOD AND APPARATUS FOR GENERATING SPATIAL GEOMETRIC INFORMATION ESTIMATION MODEL (3y 6m to grant, granted Mar 31, 2026)
18/185,102, Patent 12591958: INFRA-RED CONTRAST ENHANCEMENT FILTER (3y 0m to grant, granted Mar 31, 2026)
17/919,905, Patent 12561843: METHOD FOR MANAGING IMAGE DATA, AND VEHICLE LIGHTING SYSTEM (3y 4m to grant, granted Feb 24, 2026)
17/923,329, Patent 12536629: Image Processing Method and Electronic Device (3y 2m to grant, granted Jan 27, 2026)
17/945,100, Patent 12536667: METHOD AND FACILITY FOR SEGMENTATION OF HIGH-CONTRAST OBJECTS IN X-RAY IMAGES (3y 4m to grant, granted Jan 27, 2026)
Based on the 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 59%
With Interview (+31.6%): 91%
Median Time to Grant: 2y 11m (~6m remaining)
PTA Risk: Moderate
Based on 22 resolved cases by this examiner. Grant probability derived from career allowance rate.