DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments filed 03/03/2026 have been fully considered but they are not persuasive.
Applicant argues on pages 2+ of the 03/03/2026 Remarks that Matsuyama fails to teach “performing, in response to the instruction to rewind, a second context driven rewind analysis including at least one of performing face detection…” as recited in claim 21 and similarly recited in claims 28 and 35.
In response to applicant’s argument, Examiner respectfully disagrees. Examiner’s position is that this argument is not persuasive because it rests on an unduly narrow interpretation of the claim language. Under the broadest reasonable interpretation (BRI), the limitation “performing face detection…in response to the instruction to rewind” does not require that face detection be performed only after the rewind instruction, nor that the detection be initiated exclusively by the rewind instruction. Rather, the limitation reasonably encompasses performing face detection for the purpose of enabling rewind functionality, including embodiments where detection is performed prior to rewind and utilized when rewind is invoked.
Matsuyama teaches performing face detection on video frames and storing face-related information in association with the video, which is later used during playback operations (including rewind/navigation). Thus, when a rewind instruction is received, the system uses the face detection results to control or influence playback, thereby satisfying the functional relationship required by the claim. The fact that detection occurs earlier in time does not negate that it is functionally responsive to rewind operations. The claim does not explicitly recite “initiating face detection after receiving the rewind instruction” or “real-time detection triggered by rewind.” Because this language is absent, the claim reads on systems where detection is performed in advance but supports rewind functionality when invoked. Therefore, Matsuyama teaches, or at least suggests, the disputed limitation of claim 21.
In response to applicant’s argument regarding a “change in principle of operation,” in which Applicant asserts that combining Hong with Matsuyama would require a change in principle of operation, Examiner respectfully disagrees. This argument is not supported by sufficient technical evidence. Matsuyama’s principle of operation is to use detected face information to enhance video navigation, while Hong’s teaching is to apply detection in connection with playback control. Both references operate within the same general field of analyzing video content to improve user interaction with playback. Modifying Matsuyama to perform face detection at a different time (e.g., closer to rewind or on-demand) would merely constitute an obvious design variation and a predictable use of prior art elements according to their established functions, consistent with KSR.
Applicant argues on pages 3+ of the 03/03/2026 Remarks that the alleged motivations to combine the references are unsupported, that the references do not explicitly suggest improving accuracy, and that there is no reason to combine Hong with Matsuyama.
In response to Applicant’s argument, Examiner respectfully disagrees. Under KSR Int’l Co. v. Teleflex Inc., an explicit teaching, suggestion, or motivation in the references is not required; the Examiner may rely on the common knowledge of one of ordinary skill in the art. Furthermore, a person of ordinary skill in the art would have been motivated to combine the references because improving playback accuracy, usability, and responsiveness is a well-known design goal. Applying the known detection technique of Hong to the known playback/navigation system of Matsuyama yields predictable results with no unexpected technical difficulties. Applicant’s argument improperly requires an express statement in the references, which is inconsistent with controlling precedent.
For the reasons above, the rejection under 35 U.S.C. § 103 is maintained.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 21, 22, 28, 29, 35 and 36 are rejected under 35 U.S.C. 103 as being unpatentable over Hong (U.S. Pub. No. 2015/0154982) in view of Matsuyama (U.S. Pub. No. 2010/0014835).
Regarding claim 21, Hong discloses a method to dynamically rewind digital video content based on detected dialog comprising:
detecting an instruction by a user to rewind the digital video content (see paragraphs 0026-0028, fig. 3, fig. 4 (408); receiving rewind control signals);
performing, in response to the instruction to rewind, a first context driven rewind analysis including voice activity detection (see paragraphs 0020, 0046 fig. 4 (402); voice activity detection (VAD) and speech activity detection (SAD));
calculating a starting point of a last segment of detected dialog based on the first context driven rewind analysis (see paragraphs 0027-0028, fig. 2b, fig. 2c, fig. 3; calculating starting points of speech segments); and
rewinding the digital video content to the calculated starting point (see paragraphs 0027-0028, fig. 3; rewinding to starting point of speech segment).
However, Hong is silent as to performing, in response to the instruction to rewind, a second context driven rewind analysis including at least one of: performing face detection, performing lip detection, verifying subtitle data, and verifying scene change data.
In an analogous art, Matsuyama discloses performing, in response to the instruction to rewind, a second context driven rewind analysis including at least one of: performing face detection (see paragraphs 0025, 0038-0041, 0049-0051, fig. 2, fig. 3a-3d; face detection and management of appearance/disappearance points), performing lip detection, verifying subtitle data, and verifying scene change data;
calculating a starting point of a last segment of detected dialog based on the second context driven rewind analysis (see paragraphs 0035, 0041, fig. 2, figs. 3a-3d; calculating appearance points of faces); and
rewinding the digital video content to the calculated starting point (see paragraphs 0049-0051, fig. 5; rewinding to appearance point of face).
It would have been obvious to a skilled artisan before the effective filing date of the claimed invention to modify the dialog/VAD system of Hong with the teachings of Matsuyama’s face detection rewind analysis, the motivation being to improve contextual rewind accuracy.
Regarding claim 28, claim 28 is rejected for the same reason set forth in the rejection of claim 21.
Regarding claim 35, claim 35 is rejected for the same reason set forth in the rejection of claim 21.
Regarding claims 22, 29 and 36, Hong and Matsuyama disclose everything claimed as applied above (see claims 21, 28 and 35). Matsuyama discloses wherein performing the second context driven rewind analysis is performing face detection (see paragraphs 0025, 0038-0041, fig. 2, figs. 3a-3d; face detection and management of appearance/disappearance points; see also paragraphs 0049-0051).
Claims 23, 30 and 37 are rejected under 35 U.S.C. 103 as being unpatentable over Hong and Matsuyama as applied to claims 21, 28 and 35 above, and further in view of Kim et al. (U.S. Pub. No. 2011/0071830).
Regarding claims 23, 30 and 37, Hong and Matsuyama disclose everything claimed as applied above (see claims 21, 28 and 35). However, Hong and Matsuyama are silent as to wherein performing the second context driven rewind analysis is performing lip detection.
Kim et al. discloses wherein performing the second context driven rewind analysis is performing lip detection (see paragraphs 0012-0013, 0042-0043 and fig. 2).
It would have been obvious to a skilled artisan before the effective filing date of the claimed invention to modify the dialog/VAD system of Hong with the teachings of Matsuyama’s face detection rewind analysis and Kim’s lip detection techniques, the motivation being to provide additional modality for identifying dialog segments.
Claims 24, 27, 31, 34 and 38 are rejected under 35 U.S.C. 103 as being unpatentable over Hong and Matsuyama as applied to claims 21, 28 and 35 above, and further in view of Yoo et al. (U.S. Pub. No. 2005/0196149).
Regarding claims 24, 31 and 38, Hong and Matsuyama disclose everything claimed as applied above (see claims 21, 28 and 35). However, Hong and Matsuyama are silent as to wherein performing the second context driven rewind analysis is verifying subtitle data.
Yoo et al. discloses wherein performing the second context driven rewind analysis is verifying subtitle data (see paragraphs 0010-0011, 0025-0033, fig. 4, fig. 9; subtitle verification and synchronization with AV playback).
It would have been obvious to a skilled artisan before the effective filing date of the claimed invention to modify the dialog/VAD system of Hong with the teachings of Matsuyama’s face detection rewind analysis and Yoo’s subtitle verification, the motivation being to improve contextual rewind accuracy.
Regarding claims 27 and 34, Hong and Matsuyama disclose everything claimed as applied above (see claims 24 and 31).
However, Hong and Matsuyama are silent as to wherein the subtitle data comprises timing information for dialog in the digital content, and wherein calculating the starting point of the last segment of detected dialog based on the first and second context driven rewind analyses comprises calculating the starting point of the last segment of detected dialog based on the first and the timing information.
Yoo discloses wherein the subtitle data comprises timing information for dialog in the digital content, and wherein calculating the starting point of the last segment of detected dialog based on the first and second context driven rewind analyses comprises calculating the starting point of the last segment of detected dialog based on the first and the timing information (see paragraphs 0028-0029, 0032-0033, fig. 4, fig. 9 (showing checking of PSR registers (PSR2, PSR17, PSR30) to determine subtitle capability and timing before playback); subtitle data comprising timing information).
It would have been obvious to a skilled artisan before the effective filing date of the claimed invention to modify the dialog/VAD system of Hong with the teachings of Matsuyama’s face detection rewind analysis and Yoo’s subtitle verification, the motivation being to improve contextual rewind accuracy.
Claims 25, 32 and 39 are rejected under 35 U.S.C. 103 as being unpatentable over Hong and Matsuyama as applied to claims 21, 28 and 35 above, and further in view of Murahashi (U.S. Pub. No. 2014/0240602).
Regarding claims 25, 32 and 39, Hong and Matsuyama disclose everything claimed as applied above (see claims 21, 28 and 35). However, Hong and Matsuyama are silent as to wherein performing the second context driven rewind analysis is verifying scene change data.
Murahashi discloses wherein performing the second context driven rewind analysis is verifying scene change data (see abstract, paragraphs 0009-0016, fig. 10, fig. 17, fig. 21; verification of scene change occurrence).
It would have been obvious to a skilled artisan before the effective filing date of the claimed invention to modify the dialog/VAD system of Hong with the teachings of Matsuyama’s face detection rewind analysis and Murahashi’s scene change verification, the motivation being to improve contextual rewind accuracy.
Claims 26, 33 and 40 are rejected under 35 U.S.C. 103 as being unpatentable over Hong and Matsuyama as applied to claims 21, 28 and 35 above, and further in view of Kim et al., Yoo and Murahashi.
Regarding claims 26, 33 and 40, Hong and Matsuyama disclose everything claimed as applied above (see claims 21, 28 and 35). Matsuyama discloses wherein performing the second context driven rewind analysis comprises: performing face detection (see paragraphs 0025, 0038-0041, 0049-0051, fig. 2, figs. 3a-3d; face detection and management of appearance/disappearance points).
However, Hong and Matsuyama are silent as to wherein performing the second context driven rewind analysis comprises: performing lip detection, verifying subtitle data, and verifying scene change data.
Kim discloses wherein performing the second context driven rewind analysis is performing lip detection (see paragraphs 0012-0013, 0042-0043 and fig. 2).
It would have been obvious to a skilled artisan before the effective filing date of the claimed invention to modify the dialog/VAD system of Hong with the teachings of Matsuyama’s face detection rewind analysis and Kim’s lip detection techniques, the motivation being to provide additional modality for identifying dialog segments.
However, Hong, Matsuyama and Kim are silent as to wherein performing the second context driven rewind analysis comprises: verifying subtitle data, and verifying scene change data.
Yoo et al. discloses wherein performing the second context driven rewind analysis is verifying subtitle data (see paragraphs 0010-0011, 0025-0033, fig. 4, fig. 9; subtitle verification and synchronization with AV playback).
It would have been obvious to a skilled artisan before the effective filing date of the claimed invention to modify the dialog/VAD system of Hong with the teachings of Matsuyama’s face detection rewind analysis, Kim’s lip detection, and Yoo’s subtitle verification, the motivation being to improve contextual rewind accuracy.
However, Hong, Matsuyama, Kim and Yoo are silent as to wherein performing the second context driven rewind analysis comprises: verifying scene change data.
Murahashi discloses wherein performing the second context driven rewind analysis is verifying scene change data (see abstract, paragraphs 0009-0016, fig. 10, fig. 17, fig. 21; verification of scene change occurrence).
It would have been obvious to a skilled artisan before the effective filing date of the claimed invention to modify the dialog/VAD system of Hong with the teachings of Matsuyama’s face detection rewind analysis, Kim’s lip detection, Yoo’s subtitle verification, and Murahashi’s scene change verification, the motivation being to improve contextual rewind accuracy.
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NNENNA NGOZI EKPO whose telephone number is (571)270-1663. The examiner can normally be reached M-W 10:00am - 6:30pm, TH-F 8:00am - 4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Brian Pendleton can be reached at 571-272-7527. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
NNENNA EKPO
Primary Examiner
Art Unit 2425
/NNENNA N EKPO/Primary Examiner, Art Unit 2425 March 17, 2026.