Prosecution Insights
Last updated: April 19, 2026
Application No. 18/585,272

METHOD, DEVICE, STORAGE MEDIUM AND PROGRAM PRODUCT FOR VIDEO RECORDING

Status: Non-Final OA (§103)
Filed: Feb 23, 2024
Examiner: DANG, HUNG Q
Art Unit: 2484
Tech Center: 2400 — Computer Networks
Assignee: BEIJING ZITIAO NETWORK TECHNOLOGY CO., LTD.
OA Round: 5 (Non-Final)

Grant Probability: 68% (Favorable)
OA Rounds: 5-6
To Grant: 3y 1m
With Interview: 87%

Examiner Intelligence

Career Allow Rate: 68% — above average (1257 granted / 1841 resolved; +10.3% vs TC avg)
Interview Lift: +18.3% among resolved cases with interview — a strong lift
Typical Timeline: 3y 1m average prosecution; 95 applications currently pending
Career History: 1936 total applications across all art units
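As a quick sanity check, the headline numbers can be reproduced from the card's own counts. A minimal Python sketch, assuming the dashboard computes allow rate as grants divided by resolved cases and applies the interview lift additively (the page does not state its exact formula, and displayed values are rounded):

    # Reproduces the card's figures from its own counts. The additive
    # interview adjustment is an assumption, not a documented formula.
    granted = 1257         # from "1257 granted / 1841 resolved"
    resolved = 1841
    interview_lift = 18.3  # percentage points, from "Interview Lift"

    allow_rate = 100 * granted / resolved
    print(f"Career allow rate: {allow_rate:.1f}%")     # 68.3%, shown as 68%

    with_interview = allow_rate + interview_lift
    print(f"With interview:    {with_interview:.1f}%") # 86.6%, shown as 87%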

Statute-Specific Performance

§101: 4.2% (-35.8% vs TC avg)
§102: 23.6% (-16.4% vs TC avg)
§103: 54.1% (+14.1% vs TC avg)
§112: 11.6% (-28.4% vs TC avg)

Deltas are measured against a Tech Center average estimate. Based on career data from 1841 resolved cases.
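Because each delta is stated against the Tech Center average, the implied baseline can be back-computed as rate minus delta. A quick sketch (interpreting these rates as the share of this examiner's rejections under each statute is an assumption; the card does not define them):

    # Back-compute the implied Tech Center baseline from each statute's
    # rate and its "vs TC avg" delta: rate - delta = implied TC average.
    stats = {"§101": (4.2, -35.8), "§102": (23.6, -16.4),
             "§103": (54.1, 14.1), "§112": (11.6, -28.4)}

    for statute, (rate, delta) in stats.items():
        print(f"{statute}: examiner {rate}%, implied TC avg {rate - delta:.1f}%")
    # Every row implies the same 40.0% baseline, consistent with a single
    # Tech Center average estimate rather than per-statute averages.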

Office Action (§103)
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 08/15/2025 has been entered.

Response to Arguments

Applicant's arguments filed 07/16/2025 have been fully considered but they are not persuasive.

On pages 9-10, Applicant argues that:

The Office Action asserts that Chen at paragraphs [0089]-[0093], [0097], [0101], [0128], [0147]-[0151], [0168] and [0225] and Figs. 1 and 4 discloses most features of the combination recited in Claim 1, except for "the synthesizing comprises: transmitting, by the second video track, the video frame image in the second video track to the first video track; and synthesizing, by the first video track, the video frame image in the first video track and the received video frame image transmitted from the second video track to obtain the video frame image of the recorded video." (hereinafter referred to as "distinguishing feature (1)" for convenience). However, the Office Action also asserts that these distinguishing features have been disclosed by Iwase at paragraph [0410] and Fig. 35. Applicant respectfully disagrees with these assertions, and respectfully submits that Iwase also fails to disclose or suggest the above distinguishing feature as recited in Claim 1. Nevertheless, in the interest of advancing prosecution, Claims 1, 9 and 17 have been amended to explicitly recite, inter alia, "placing a video frame image collected by a camera into a first video track frame by frame," "placing a video frame image of the second video into a second video track frame by frame" and "the first video track being configured to record real-time images collected by the camera." As indicated in Chen, e.g., specification paras. [0002] and [0100], Chen aims to represent the user's feelings, comments, or viewing reactions through a user video to the original video. Specifically, Chen only discloses, at e.g., specification paras. [0089]-[0093], that "Step S110, receiving a video capture trigger operation from a user via a video playing interface for an original video. ... Step S120, superimposing a video capture window on the video playing interface, in response to the video capture trigger operation", and discloses at e.g., specification paras. [0149]-[0150], that "... when the original video and the user video are combined to obtain a combined video, the video frame image of the user video and the corresponding video frame image of the original video are combined ... and the video frame image of the user video in the synthesized one frame image is located in the video frame image of the original video." However, in addition to the above distinguishing feature (1), Chen is also silent about "first video track" or "second video track" for recording image frames recited in the subject matter, and especially, fails to disclose or suggest at least "in response to a shooting initiation instruction, placing a video frame image collected by a camera into a first video track frame by frame, and the first video track is configured to record real-time images collected by the camera" (hereinafter referred to as "distinguishing feature (2)"), "in response to a playing instruction for a second video placing a video frame image of the second video into a second video track frame by frame" (hereinafter referred to as "distinguishing feature (3)"). (original emphasis)

In response, Examiner respectfully disagrees and submits that Chen teaches a camera is used to capture the user video frame by frame; each frame of the user video is then placed into a first track one at a time to correspond to a corresponding frame of the original video. Specifically, at least in [0149], Chen teaches:

[0149] It should be noted that, in the method for capturing video provided by the embodiment of the present disclosure, when the original video and the user video are combined to obtain a combined video, the video frame image of the user video and the corresponding video frame image of the original video are combined, the audio information corresponding to the video frame image of the user video and the audio information corresponding to the corresponding video frame image of the original video are combined, and then the combined video frame image and the corresponding audio information are combined, to obtain a combined video. Wherein, optionally, when the video frame image and the video frame image are combined, it means that the corresponding two video frame images are combined into one frame image, and the video frame image of the user video in the synthesized one frame image is located in the video frame image of the original video. When the original video and the user video are combined to obtain a combined video, the size of the video frame image of the user video is smaller than the size of the video frame image of the original video. (emphasis added)

The emphasized text in the quoted paragraph [0149] clearly describes that the first video and the second video are placed into corresponding tracks frame by frame so that the frames in each pair are combined to create a combined video. Examiner interprets a combined video as a sequence of frames, each of which is synthesized by combining two frames from the first and the second videos. Examiner respectfully submits that a video track can be reasonably interpreted as a processing path comprising various connecting and/or processing components in which or along which video data placed in the track is maintained until at least a specific processing is applied to the video. For example, in Chen, both the captured video and the original video are clearly maintained in a corresponding track to be displayed in corresponding window areas until an operation is performed to combine the videos. Further, they must be placed in the corresponding track frame by frame so that such a pair of frames can be synthesized into a combined video frame.
In addition, in at least [0100], [0147], and [0206], Chen also discloses the first video is captured and displayed in real-time, i.e., capturing the user's images showing his or her reactions to the currently displayed original video, so that a combined video can be generated in which each frame includes a corresponding frame of the original video and a frame capturing the user image showing his or her reaction while viewing the original video. In other words, Chen teaches a user viewing an original video frame, capturing video of his or her reactions to the original video frame currently played, and displaying the frames of the captured video together with corresponding frames of the original video frame by frame.

On page 10, Applicant further argues that:

Further, Applicant respectfully submits that Iwase is also silent about "video track" for recording image frames, and fails to disclose at least the above distinguishing features (1) to (3) recited in amended Claim 1. Specifically, as recited in the subject matter of amended Claim 1, real-time images, i.e., video frame images of a first video collected by a camera are placed into a first video track when shooting, and image frames, i.e., video frame images, of a prepared second video are placed into a second video track frame by frame when playing the second video […] (original emphasis)

In response, Examiner respectfully submits that these arguments are moot in view of the discussion of Chen above, i.e., the Office Action does not rely on Iwase to teach these features.

On pages 10-11, Applicant further argues that:

[…] Then, the second video track transmits the video frame image in the second video track to the first video track, and at the first video track, the video frame image in the first video track and the received video frame image transmitted from the second video track are synthesized to obtain a video frame image of the recorded video. While, as indicated in Iwase, e.g., specification paras. [0296], [0300], [0302], and [0309]-[0310], "the buffer 52 buffers the main-Clip data, the buffer 53 buffers the sub-Clip data," "stream data read from the buffer 52, which is a main-Clip read buffer, is output to the PID (packet ID) filter 55 ... That is, the PID filter 55 supplies a video stream to the PID filter 60 for supplying it to one of the first video decoder 72-1 and the second video decoder 72-2 ...," "Stream data read from the buffer 53, which is a sub-Clip read buffer, is output to the PID (packet ID) filter 56 ... the PID filter 56 supplies the supplied video stream to the PID filter 60 for supplying it to one of the first video decoder 72-1 and the second video decoder 72-2 ...," and "The first video decoder 72-1 or the second video decoder 72-2 decodes the supplied video stream and outputs the decoded video data to the video-plane generator 92. ... When the video-plane generator 92 receives the video data from the first video decoder 72-1 and the second video decoder 72-2, the video-plane generator 92 combines the supplied video data ... The video-plane generator 92 then generates a video plane including a main display screen 1 and a sub display screen 2." As such, Iwase teaches that the buffer 52 and the buffer 53 transmit data to corresponding video decoders 72 respectively, and then the video decoders 72 respectively output the decoded video data to the video-plane generator 92, to combine the supplied video data at the video-plane generator 92 for finally generating a video plane. That is, Iwase provides a completely different strategy and working principle from the subject matter of amended Claim 1. (original emphasis)

In response, Examiner respectfully disagrees for the same reason as discussed in the Office Action dated 05/16/2025. Specifically, Examiner submits that a video track can be reasonably interpreted as a processing path comprising various connecting and/or processing components. As such, Iwase teaches a first video track comprising buffer 52, video decoders 72-1 and 72-2, and video plane generator 92, and a second video track comprising at least the buffer 53. The second track transmits a video frame image to one of the first and the second video decoders of the first track via PID filter 56 (see at least [0302]); the video plane generator 92 of the first track then synthesizes the received video frame image from the second track with the video frame image of the first track from the other one of the video decoders into a combined video.

On pages 11-12, Applicant does not present new arguments; thus the arguments are moot in view of the discussion of Chen and Iwase above. As such, Applicant's arguments as a whole are not persuasive.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5-6, 8-9, 13-14, and 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (US 2021/0014431 A1 – hereinafter Chen) and Iwase et al. (US 2008/0267588 A1 – hereinafter Iwase).

Regarding claim 1, Chen discloses a method of video recording, comprising: in response to a shooting initiation instruction, placing a video frame image collected by a camera into a first video track frame by frame, and displaying the video frame image in the first video track frame by frame (Fig. 1 – steps S110-S120; [0089]-[0093] – in response to receiving a video capture trigger operation, placing a video frame image into at least a video playing interface, which is implemented as a software layer as described at least in [0225] to at least abstract the frame image, for displaying the video frame image onto a video capture window within the video playing interface), the first video track being configured to record real-time images collected by the camera ([0095]; [0100]; [0147]; [0206] – the first video track recording real-time images of the viewer reacting to the original video when the original video is being played); in response to a playing instruction for a second video, placing a video frame image of the second video into a second video track frame by frame ([0093] – upon playing the original video, placing a video frame image of the second video into an abstraction of the frame image as an image of a certain video frame being retained as described at least in [0097] and [0128] at a software layer as described at least in [0225]), and displaying the video frame image in the second video track (Fig. 4; [0151] – displaying the video frame image of the original video into region "image a"); in response to a pausing playing instruction for the second video, pausing placing the video frame image of the second video into the second video track, so that the video frame image placed before pausing play of the second video is retained in the second video track ([0097]; [0128] – in response to a pausing playing instruction issued either automatically or by the user, pausing placing the video frame image so that an image of a certain video frame is retained); and in response to a recording instruction, synthesizing the video frame image in the first video track with the video frame image in the second video track to obtain a video frame image of a recorded video ([0147]; [0149]-[0150] – in response to a capture trigger operation as described at least in [0101], synthesizing the video frame images in the first and second video tracks to obtain a video frame image of a recorded video as described at least at [0168]).

Chen also discloses the synthesizing the video frame image in the first video track with the video frame image in the second video track to obtain a video frame image of a recorded video, comprising: synthesizing the video frame image in the first video track as a main track and a video frame image from the second video track as a PIP track to obtain the video frame image of the recorded video ([0147]; [0149]-[0150] – the first video track is in the main video part b of the screen and the second video track is in a PIP video part a of the screen).

Chen does not disclose the synthesizing comprises: transmitting, by the second video track, the video frame image in the second video track to the first video track; and synthesizing, by the first video track, the video frame image in the first video track and a received video frame image transmitted from the second video track to obtain the video frame image of the recorded video.

Iwase discloses a video synthesizing process comprising: transmitting, by a second video track, a video frame image in the second video track to a first video track; and synthesizing, by the first video track, the video frame image in the first video track and a received video frame image transmitted from the second video track to obtain the video frame image of a combined video (Fig. 35; [0310] – a second track as a PIP video track comprising buffer 53 transmitting the video images to be synthesized as PIP video frames to a first track, which comprises buffer 52, video decoders 72-1, 72-2, and video plane generator 92; the video plane generator 92 of the first track synthesizes the video images in the first track and the second track into a combined video).

One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to incorporate the teachings of Iwase into synthesizing the first main video track and the second PIP video track in the method of Chen in order to make the input video streams steady and smooth, thus minimizing interruptions for the synthesizing process.
Regarding claim 5, Chen also discloses the synthesizing, by the first video track, the video frame image in the first video track and the received video frame image transmitted from the second video track to obtain the video frame image of the recorded video, comprising: arranging, by the first video track, the video frame image in the first video track at a first position in a preset picture, and arranging, by the first video track, the video frame image received from the second video track at a second position in the preset picture, to obtain the video frame image of the recorded video ([0150] – synthesizing the first and the second video frames sequentially according to a positional order).

Regarding claim 6, Chen also discloses obtaining recorded audio ([0164] – obtaining audio in respective audio information for combining); and writing the video frame image of the recorded video and the recorded audio into a synthesized video file ([0168] – writing the combined video information and the combined audio information into a combined video file for storage).

Regarding claim 8, Chen also discloses before responding to the shooting initiation instruction, the method further comprising: in response to a co-shooting instruction for the second video, displaying a co-shooting interface ([0089] – in response to an instruction, displaying a video playing interface for an original video).

Claim 9 is rejected for the same reason as discussed in claim 1 above, in view of Chen also disclosing an electronic device ([0070] – an electronic device) comprising: at least one processor ([0070] – the electronic device including a processor) and a memory ([0070] – the electronic device including a memory) configured to store computer executable instructions ([0071] – the memory configured to store computer operation instructions), wherein the at least one processor is configured to execute computer executable instructions stored in the memory to cause the at least one processor to perform the recited acts ([0072] – the processor is configured to execute the method by invoking the computer operation instructions).

Claim 13 is rejected for the same reason as discussed in claim 5 above. Claim 14 is rejected for the same reason as discussed in claim 6 above. Claim 16 is rejected for the same reason as discussed in claim 8 above.

Claim 17 is rejected for the same reason as discussed in claim 1 above, in view of Chen also disclosing a non-transitory computer-readable storage medium storing computer executable instructions ([0071] – a memory configured to store computer operation instructions), wherein, when a processor executes the computer executable instructions, the computer executable instructions implement the recited acts ([0072] – a processor is configured to execute the method by invoking the computer operation instructions).

Claims 2, 10, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Chen and Iwase as applied to claims 1, 5-6, 8-9, 13-14, and 16-17 above, and further in view of Hicks et al. (US 2017/0332089 A1 – hereinafter Hicks).

Regarding claim 2, see the teachings of Chen and Iwase as discussed in claim 1 above. However, Chen and Iwase do not disclose the placing a video frame image collected by a camera into a first video track, comprising: transmitting image data collected by the camera to the first video track, so that the first video track encodes the image data to obtain the video frame image collected by the camera.

Hicks discloses placing a video frame image collected by a camera into a first video track, comprising: transmitting image data collected by the camera to the first video track, so that the first video track encodes the image data to obtain the video frame image collected by the camera ([0078] – video frames captured by a camera are transmitted to an encoder that encodes the frames for display; the encoder and the display are interpreted together as a first video track).

One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to incorporate the teachings of Hicks into the method taught by Chen and Iwase in order to prepare the video frame image for display correctly.

Claim 10 is rejected for the same reason as discussed in claim 2 above. Claim 18 is rejected for the same reason as discussed in claim 2 above.

Claims 4, 7, 12, 15, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Chen and Iwase as applied to claims 1, 5-6, 8-9, 13-14, and 16-17 above, and further in view of Lee et al. (US 2018/0227506 A1 – hereinafter Lee).

Regarding claim 4, see the teachings of Chen and Iwase as discussed in claim 1 above. However, Chen and Iwase do not disclose in response to an instruction for pausing recording, stopping synthesizing the video frame image in the first video track with the video frame image in the second video track.

Lee discloses, in response to an instruction for pausing recording, stopping synthesizing a video frame image in a first video track with a video frame image in a second video track ([0096]; [0098]; [0100]; [0136] – in response to a photographing termination instruction to terminate recording of a combined video image, capturing of a video frame image into a first video track is stopped, thus stopping synthesizing of the video frame image in the first video track onto the combined video frame image).

One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to incorporate the teachings of Lee into the method taught by Chen and Iwase to allow the user to pause or stop the recording process if needed, thus enhancing the user interface of the method.

Regarding claim 7, see the teachings of Chen and Iwase as discussed in claim 6 above. However, Chen and Iwase do not explicitly disclose the writing the video frame image of the recorded video and the recorded audio into a synthesized video file, comprising: encoding the recorded audio and the video frame image of the recorded video to obtain encoded data, and writing the encoded data into the synthesized video file.

Lee discloses writing a video frame image of a recorded video and recorded audio into a synthesized video file, comprising: encoding the recorded audio and the video frame image of the recorded video to obtain encoded data, and writing the encoded data into the synthesized video file ([0098]; [0100] – the combined video file comprising the audio information and video information which are arranged in the file according to a certain format for storage, e.g. mpeg, mpg, mp4, avi, mov, and mkv).

One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to incorporate the teachings of Lee into the method taught by Chen and Iwase to make the synthesized video file compatible with existing encoding standards so that it can be played back on any compatible player.

Claim 12 is rejected for the same reason as discussed in claim 4 above. Claim 15 is rejected for the same reason as discussed in claim 7 above. Claim 20 is rejected for the same reason as discussed in claim 4 above.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to HUNG Q DANG, whose telephone number is (571) 270-1116. The examiner can normally be reached IFT.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Thai Q Tran, can be reached at 571-272-7382. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/HUNG Q DANG/
Primary Examiner, Art Unit 2484
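To make the disputed limitation easier to follow, below is a hypothetical Python sketch of the two-track, frame-by-frame flow recited in amended claim 1. All names are invented for illustration; this is not the applicant's implementation, nor the internal structure of Chen or Iwase.

    # Hypothetical sketch of amended claim 1's two-track flow. All names
    # are invented; this is not the applicant's actual implementation.

    class VideoTrack:
        """Holds the most recently placed frame. Pausing a source simply
        stops placements, so the last frame is retained (the claimed
        pause behavior for the second video track)."""
        def __init__(self):
            self.current_frame = None

        def place(self, frame):
            # Frames arrive one at a time ("frame by frame").
            self.current_frame = frame

    class FirstVideoTrack(VideoTrack):
        """Main track, configured to record real-time camera images."""
        def synthesize(self, received_frame):
            # Combine this track's frame (main) with the frame received
            # from the second track (PIP) into one recorded-video frame.
            return {"main": self.current_frame, "pip": received_frame}

    class SecondVideoTrack(VideoTrack):
        def transmit_to(self, first_track):
            # The second track transmits its frame to the first track,
            # which performs the synthesis.
            return first_track.synthesize(self.current_frame)

    # Dummy sources standing in for the camera and the played second video.
    camera_frames = [f"cam-{i}" for i in range(3)]
    second_video_frames = [f"vid-{i}" for i in range(3)]

    first, second = FirstVideoTrack(), SecondVideoTrack()
    recorded = []
    for cam_frame, vid_frame in zip(camera_frames, second_video_frames):
        first.place(cam_frame)   # shooting: camera frames, frame by frame
        second.place(vid_frame)  # playing: second video, frame by frame
        recorded.append(second.transmit_to(first))  # recording: synthesize

    print(recorded)

Under this reading, pausing the second video simply stops calls to place() on the second track, so its last frame is retained and continues to be synthesized into each recorded frame, mirroring the claimed pause behavior.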

Prosecution Timeline

Feb 23, 2024: Application Filed
Apr 07, 2024: Non-Final Rejection — §103
Jul 11, 2024: Response Filed
Jul 29, 2024: Final Rejection — §103
Sep 30, 2024: Response after Non-Final Action
Oct 09, 2024: Response after Non-Final Action
Oct 09, 2024: Examiner Interview (Telephonic)
Oct 30, 2024: Request for Continued Examination
Nov 04, 2024: Response after Non-Final Action
Jan 30, 2025: Non-Final Rejection — §103
May 05, 2025: Response Filed
May 13, 2025: Final Rejection — §103
Jul 16, 2025: Response after Non-Final Action
Aug 15, 2025: Request for Continued Examination
Aug 28, 2025: Response after Non-Final Action
Feb 09, 2026: Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12594460 — MANAGING BLOBS FOR TRACKING OF SPORTS PROJECTILES (granted Apr 07, 2026; 2y 5m to grant)
Patent 12588818 — DETECTION OF A MOVABLE OBJECT WHEN 3D SCANNING A RIGID OBJECT (granted Mar 31, 2026; 2y 5m to grant)
Patent 12592258 — METHOD AND APPARATUS FOR INTERACTIVE VIDEO EDITING PLATFORM TO CREATE OVERLAY VIDEOS TO ENHANCE ENTERTAINMENT VIDEO GAMES WITH EDUCATIONAL CONTENT (granted Mar 31, 2026; 2y 5m to grant)
Patent 12587693 — ARTIFICIALLY INTELLIGENT AD-BREAK PREDICTION (granted Mar 24, 2026; 2y 5m to grant)
Patent 12574649 — ENCODING AND DECODING METHOD, ELECTRONIC DEVICE, COMMUNICATION SYSTEM, AND STORAGE MEDIUM (granted Mar 10, 2026; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 68%
With Interview: 87% (+18.3%)
Median Time to Grant: 3y 1m
PTA Risk: High
Based on 1841 resolved cases by this examiner. Grant probability derived from career allow rate.
