DETAILED ACTION
Claims 1-18 are pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-3, 5-9, 11-15, and 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Kang (US 2015/0195608) in view of Bauchot et al. (US 2009/0060458).
Regarding claim 1, Kang teaches a computer-implemented method for synchronizing audio and video using pause gap analysis, the method comprising:
splitting a video into an audio stream and a video stream (p. 0062);
identifying time points at which there is no sound in the audio stream and deriving pause gaps in the audio stream (i.e. audio silence detector) (p. 0091, 0164);
applying a binary classifier to predict sound presence or absence in frames of the video stream and deriving pause gaps in the video stream (i.e. detecting scene changes) (p. 0192).
Kang does not explicitly teach a computer-implemented method for synchronizing audio and video using pause gap analysis, the method comprising:
identifying desynchronization between the pause gaps in the video stream and the pause gaps in the audio stream; and
aligning the pause gaps in the video stream with the pause gaps in the audio stream, based on metadata of the pause gaps in the video stream.
Bauchot teaches a computer-implemented method for synchronizing audio and video using pause gap analysis, the method comprising:
identifying desynchronization (i.e. synchronization mark) between the pause gaps (i.e. silence detection) in the video stream and the pause gaps in the audio stream (p. 0031-0033); and
aligning the pause gaps (i.e. increasing or decreasing silence gaps) in the video stream with the pause gaps in the audio stream, based on metadata of the pause gaps in the video stream (i.e. syncing data flows by extending silences and matching silences) (p. 0031-0033, 0042).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have provided synchronization of streams as taught by Bauchot to the system of Kang in order to sync data flows with different silence gaps (p. 0033).
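For illustration only, the pause-gap analysis recited in claim 1 may be sketched as follows. This is a hypothetical Python sketch prepared to summarize the claimed steps; the function names, flag values, and offset estimate are illustrative assumptions and are not drawn from Kang or Bauchot:

```python
# Illustrative sketch (hypothetical): derive pause gaps from binary sound
# flags in each stream, estimate the offset between corresponding gaps,
# and shift the video gaps to align them with the audio gaps.
import statistics

def find_gaps(flags):
    """Return (start, end) index runs where the binary flag is 0 (no sound)."""
    gaps, start = [], None
    for i, f in enumerate(flags):
        if f == 0 and start is None:
            start = i
        elif f == 1 and start is not None:
            gaps.append((start, i))
            start = None
    if start is not None:
        gaps.append((start, len(flags)))
    return gaps

def estimate_offset(audio_gaps, video_gaps):
    """Median difference between the start times of corresponding gaps."""
    return statistics.median(a[0] - v[0] for a, v in zip(audio_gaps, video_gaps))

audio_flags = [1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1]  # 0 = silence in the audio stream
video_flags = [1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1]  # 0 = classifier predicts no sound

audio_gaps = find_gaps(audio_flags)               # [(2, 4), (7, 10)]
video_gaps = find_gaps(video_flags)               # [(1, 3), (6, 9)]
offset = estimate_offset(audio_gaps, video_gaps)  # video leads audio by one frame
aligned = [(s + offset, e + offset) for s, e in video_gaps]
```

In this sketch the aligned video gaps coincide with the audio gaps, which corresponds to the claimed identification of desynchronization followed by alignment of the pause gaps.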
Regarding claim 2, Kang teaches the computer-implemented method of claim 1, further comprising:
splitting a training video (i.e. audio or video) into a training audio stream and a training video stream (p. 0062);
identifying time points at which there is no sound in the training audio stream and deriving pause gaps in the training audio stream (i.e. audio silence detector) (p. 0091, 0164).
Kang does not explicitly teach the computer-implemented method of claim 1, further comprising:
converting the training audio stream into a binary stream with sound flags identifying the time points; and
using the sound flags and frames in the training video stream to train the binary classifier.
Bauchot teaches the computer-implemented method of claim 1, further comprising:
converting the training audio stream into a binary stream with sound flags identifying the time points (i.e. silence detection) (p. 0031-0033); and
using the sound flags (i.e. scene changes) and frames in the training video stream to train the binary classifier (i.e. syncing data flows by extending silences and matching silences) (p. 0031-0033, 0042).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have provided synchronization of streams as taught by Bauchot to the system of Kang in order to sync data flows with different silence gaps (p. 0033).
Regarding claim 3, Kang is silent regarding the computer-implemented method of claim 2, wherein the training video is a normal video which has no desynchronization of the training audio stream and the training video stream.
Bauchot teaches the computer-implemented method of claim 2, wherein the training video is a normal video which has no desynchronization of the training audio stream and the training video stream (i.e. buffered data flow) (p. 0031-0033).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have provided synchronization of streams as taught by Bauchot to the system of Kang in order to sync data flows with different silence gaps (p. 0033).
Regarding claim 5, Kang teaches the computer-implemented method of claim 1, further comprising:
feeding the video stream into the binary classifier (i.e. scene change detection) to obtain binary values which indicate sound presence or absence in the frames of the video stream (i.e. using black frames) (p. 0192).
Regarding claim 6, Kang is silent regarding the computer-implemented method of claim 1, wherein, by aligning the pause gaps in the video stream with the pause gaps in the audio stream, the video stream and the audio stream are synchronized.
Bauchot teaches the computer-implemented method of claim 1, wherein, by aligning the pause gaps in the video stream with the pause gaps in the audio stream, the video stream and the audio stream are synchronized (i.e. syncing data flows by extending silences and matching silences) (p. 0031-0033, 0042).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have provided synchronization of streams as taught by Bauchot to the system of Kang in order to sync data flows with different silence gaps (p. 0033).
Claim 7 recites “A computer program product for synchronizing audio and video using pause gap analysis, the computer program product comprising a computer readable storage medium having program instructions stored therewith” for performing the steps of claim 1.
Kang inherently discloses "A computer program product for synchronizing audio and video using pause gap analysis, the computer program product comprising a computer readable storage medium having program instructions stored therewith" for performing the steps of claim 1.
Claim 8 recites “A computer program product for synchronizing audio and video using pause gap analysis, the computer program product comprising a computer readable storage medium having program instructions stored therewith” for performing the steps of claim 2.
Kang inherently discloses "A computer program product for synchronizing audio and video using pause gap analysis, the computer program product comprising a computer readable storage medium having program instructions stored therewith" for performing the steps of claim 2.
Claim 9 recites “A computer program product for synchronizing audio and video using pause gap analysis, the computer program product comprising a computer readable storage medium having program instructions stored therewith” for performing the steps of claim 3.
Kang inherently discloses "A computer program product for synchronizing audio and video using pause gap analysis, the computer program product comprising a computer readable storage medium having program instructions stored therewith" for performing the steps of claim 3.
Claim 11 recites “A computer program product for synchronizing audio and video using pause gap analysis, the computer program product comprising a computer readable storage medium having program instructions stored therewith” for performing the steps of claim 5.
Kang inherently discloses "A computer program product for synchronizing audio and video using pause gap analysis, the computer program product comprising a computer readable storage medium having program instructions stored therewith" for performing the steps of claim 5.
Claim 12 recites “A computer program product for synchronizing audio and video using pause gap analysis, the computer program product comprising a computer readable storage medium having program instructions stored therewith” for performing the steps of claim 6.
Kang inherently discloses "A computer program product for synchronizing audio and video using pause gap analysis, the computer program product comprising a computer readable storage medium having program instructions stored therewith" for performing the steps of claim 6.
Claim 13 is analyzed and interpreted as the apparatus claim corresponding to claim 1.
Claim 14 is analyzed and interpreted as the apparatus claim corresponding to claim 2.
Claim 15 is analyzed and interpreted as the apparatus claim corresponding to claim 3.
Claim 17 is analyzed and interpreted as the apparatus claim corresponding to claim 5.
Claim 18 is analyzed and interpreted as the apparatus claim corresponding to claim 6.
Claims 4, 10, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Kang (US 2015/0195608) in view of Bauchot et al. (US 2009/0060458), and further in view of Kumar et al. (US 2024/0098346).
Regarding claim 4, Kang is silent regarding the computer-implemented method of claim 2, wherein training the binary classifier is through supervised machine learning.
Kumar teaches the computer-implemented method of claim 2, wherein training the binary classifier is through supervised machine learning (p. 0036).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have provided machine learning for scene detection as taught by Kumar to the system of Kang in order to allow boundary locations to be identified (p. 0036).
Claim 10 recites “A computer program product for synchronizing audio and video using pause gap analysis, the computer program product comprising a computer readable storage medium having program instructions stored therewith” for performing the steps of claim 4.
Kang inherently discloses "A computer program product for synchronizing audio and video using pause gap analysis, the computer program product comprising a computer readable storage medium having program instructions stored therewith" for performing the steps of claim 4.
Claim 16 is analyzed and interpreted as the apparatus claim corresponding to claim 4.
Response to Arguments
Applicant's arguments filed 12/9/2025 have been fully considered but they are not persuasive.
Regarding claim 1, Applicant submits that the cited combination of references fails to teach or suggest each limitation of the invention as set forth in the claim. Specifically, whereas the Office Action provides that the limitation "applying a binary classifier to predict sound presence or absence in frames of the video stream and deriving pause gaps in the video stream" is taught or suggested by Kang, paragraph [0192]: "As still another case, the advertising program may be detected using detection information on black frame and scene change from the video data and detection information on audio silence from the audio data for the advertisement detection period set based on the electronic program guide information of data for data broadcasting within the broadcast signal including the broadcasting program," Applicant submits that the cited paragraph provides for detecting the presence of an advertising program using either black frame and scene change from the video data or detection information on audio silence from the audio data, and not the claimed prediction of the presence or absence of sound in the frames of the video stream. Applicant submits that the cited portions of the reference fail to teach or suggest predicting sound presence or absence in frames of the video stream, focusing instead upon the presence of black frames and scene changes in the video stream to detect advertising and not to predict the presence or absence of sound in the advertising program.
In response:
The Examiner respectfully disagrees. Kang clearly discloses a system that detects black frames and/or silence within the program, which, according to one of ordinary skill in the art, is used to "predict" the presence of an advertisement. In functional language there is no difference between "detect" and "predict" absent language that specifies how a prediction is performed, and the claims are silent regarding how a prediction is performed. Therefore, because a detection is never 100% accurate all the time, Kang's detecting of black frames and silence in the program is reasonably interpreted as also being a prediction.
Applicant further argues that Bauchot, paragraph [0033] provides: "In an embodiment, the data flows buffer (200) buffers a first incoming data flow. As soon as the synchronization marks receiver (200) receives a synchronization mark involving the first data flow, the audio silence detector (200) starts analyzing and detecting audio silence periods. Meanwhile, the data flows buffer (200) listens for the pending necessary second data flow, as determined by the synchronization mark. Buffered data is modified in the data flows modification unit (200). Audio silence periods durations are increased or decreased, according to the interaction with the network controller (208). When both the second data of the second data flow to be synchronized with the first data of the first data flow and the first data of the first data flow are received, buffered, and synchronized, the data quit the buffer running positions for playing back in the media player (160)."
As to the limitation "identifying desynchronization between the pause gaps in the video stream and the pause gaps in the audio stream," the Office Action points only to the use of a genericized synchronization mark as teaching the limitation. Applicant submits that the cited portion of the reference lacks any specificity regarding the claimed desynchronization between the pause gaps in the video stream and the pause gaps in the audio stream. The reference speaks of analyzing and detecting audio silence periods in the first data flow but fails to mention the claimed pause gaps in the video stream.
In response:
The Examiner respectfully disagrees. Reading the claims in the broadest reasonable sense, the claims require identifying a desynchronization between the gaps in the audio and video streams, and then aligning the gaps. Bauchot discloses audio and video gaps (fig. 7, 702, "v1"). When an audio gap is increased, as taught by Bauchot, a desynchronization will occur, and the system of Bauchot will synchronize or align the streams by adding frames to the video gap (710, 712). Therefore, Bauchot discloses the claimed features of identifying a desynchronization between the gaps in the audio and video streams and then aligning the gaps.
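The gap-extension mechanism relied upon above (Bauchot, fig. 7, 710, 712) may be illustrated with the following hypothetical sketch. The function name, frame labels, and gap indices are the Examiner's illustrative assumptions, not Bauchot's implementation:

```python
# Hypothetical illustration of aligning streams by adjusting a pause gap:
# when the audio silence period is lengthened, pad the corresponding video
# pause gap with repeated frames so that both streams remain in sync.
def extend_video_gap(frames, gap_end, extra):
    """Repeat the frame just before gap_end `extra` times to lengthen the gap."""
    return frames[:gap_end] + [frames[gap_end - 1]] * extra + frames[gap_end:]

video = ["f0", "f1", "gap", "gap", "f4"]   # a pause gap spans indices 2-3
extended = extend_video_gap(video, 4, 2)   # lengthen the gap by two frames
# extended == ["f0", "f1", "gap", "gap", "gap", "gap", "f4"]
```

The sketch shows the video pause gap growing by the same amount as the lengthened audio silence period, which is the alignment behavior the rejection attributes to Bauchot.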
Conclusion
Claims 1-18 are rejected.
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Inquiries
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MUSHFIKH I ALAM whose telephone number is (571)270-1710. The examiner can normally be reached from 1:00 PM to 9:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Nasser Goodarzi can be reached at 571-272-4195. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
MUSHFIKH I. ALAM
Primary Examiner
Art Unit 2426
/MUSHFIKH I ALAM/Primary Examiner, Art Unit 2426 3/25/2026