DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 2/17/2026 has been entered.
Response to Amendment
This Office action is in response to communications filed 2/1/2026. Claims 1, 4-7, 10-11, 15 and 18-20 are pending in this action.
Response to Arguments
Applicant’s arguments with respect to claims 1-20 have been considered but are not persuasive. In response to Applicant’s arguments on page 10 that “Tandon, however, does not teach or suggest "identifying, for each clip of interest included in the target video, at least one interest frame from the clip of interest based on attention information of each video frame from a plurality of video frames in the clip of interest”, the Examiner respectfully disagrees. Tandon teaches, at least at [0021], that “a segmentation application operating on a computing device uses a trained predictive model to provide aesthetic scores for each video frame. The segmentation application determines, from the aesthetic scores, that a quality threshold has been met between the first and second video frames and that a duration threshold has been met between the first and second video frames. The segmentation application segments the video accordingly”,
at least at [0043] that “aesthetics score can be based on one or more components. For example, an aesthetic score can be computed based on color harmony, the balance of elements in a frame, whether content is interesting, depth of field, whether the light in the scene is interesting, which object is the emphasis of the scene, whether there is repetition, rule of thirds, vivid colors, or symmetry”, and
at least at [0078] that “aesthetic-based video segmentation including segmentation in conjunction with face detection or scene detection can be used for a variety of additional applications. For example, computing system 101 can generate a video summary. A video summary can be a shorter version of the original video 102, with only the most interesting segments. For example, computing system 101 can rank the determined video segments 180a-m according to a criterion such as an aesthetic score. Segmentation application 110 can use those segments with the highest aesthetic score in a video summary. Similarly, segmentation application 110 can create a slideshow of key images by using key frames from the highest-ranked video segments”.
Therefore, the components and/or aesthetic scores of the video frames are reasonably “attention information of each video frame from a plurality of video frames in the clip of interest”, given the limitation’s broadest reasonable interpretation. Further, as key frames are determined and used to create the video summary based on the aesthetic-based segmentation, this reasonably teaches “based on attention information of each video frame from a plurality of video frames in the clip of interest” given the limitation’s broadest reasonable interpretation, and the rejection of record is maintained.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 15 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Mahyar (of record) in view of Tandon (of record).
Regarding claims 1 and 20, Mahyar discloses a video summarization method (see Mahyar, at least at col 2, lines 17-51, col 8, lines 19-27, Fig. 9, and other related text) comprising:
obtaining an attention coding parameter of a user based on behavior data of the user (see Mahyar, at least at col 4, lines 19-34, col 11, lines 34-39, and other related text);
determining, for each clip included in a target video, whether the clip is a clip of interest to the user, based on the attention coding parameter of the user (see Mahyar, at least at col 2, lines 18-41, col 4, lines 16-43, and other related text);
based on determining that at least one clip included in the target video is a clip of interest, obtaining a video summary of the target video (see Mahyar, at least at col 2, lines 36-54, col 8, lines 47-56, and other related text).
Mahyar does not specifically disclose based on determining that at least one clip included in the target video is the clip of interest, identifying, for each clip of interest included in the target video, at least one interest frame from the clip of interest based on attention information of each video frame from a plurality of video frames in the clip of interest; and
obtaining a video summary of the target video by combining the at least one interest frame from each clip of interest.
In an analogous art related to a system for analyzing video, Tandon discloses based on determining that at least one clip included in a target video is a clip of interest (see Tandon, at least at [0078], and other related text), identifying, for each clip of interest included in the target video (see Tandon, at least at [0078], and other related text), at least one interest frame from the clip of interest (see Tandon, at least at [0078], and other related text) based on attention information of each video frame from a plurality of video frames in the clip of interest (see Tandon, at least at [0021], [0043], [0078], and other related text); and
obtaining a video summary of the target video by combining the at least one interest frame from each clip of interest (see Tandon, at least at [0078], and other related text).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Mahyar to include the limitations as taught by Tandon for the advantage of more efficiently providing accurate video analysis and optimizing system resources.
Regarding claim 15, Mahyar in view of Tandon discloses a video summarization apparatus (see Mahyar, at least at Fig. 1 and related text, and see Tandon, at least at Fig. 7 and related text) comprising:
a user attention parameter generation unit (processor and code of processor that performs the specific functions, see Mahyar, at least at col 4, lines 19-34, col 11, lines 34-39, and other related text);
an interest frame extraction unit (processor and code of processor that performs the specific functions, see Mahyar, at least at col 2, lines 18-41, col 4, lines 16-43, and other related text, and see Tandon, at least at [0078]-[0080], Fig. 7, and other related text);
a combining unit (processor and code of processor that performs the specific functions, see Tandon, at least at [0078]-[0080], Fig. 7, and other related text);
a memory storing at least one instruction (see Mahyar, at least at Fig. 1 and related text); and
at least one processor configured to execute the at least one instruction (see Mahyar, at least at Fig. 1 and related text) to:
obtain, through the user attention parameter generation unit, an attention coding parameter of a user based on behavior data of the user (processor and code of processor that performs the specific function, see Mahyar, at least at col 4, lines 19-34, col 11, lines 34-39, and other related text);
determine, through the interest frame extraction unit, for each clip included in a target video, whether the clip is a clip of interest to the user, based on the attention coding parameter of the user (processor and code of processor that performs the specific function, see Mahyar, at least at col 2, lines 18-41, col 4, lines 16-43, and other related text);
identify, through the interest frame extraction unit, at least one interest frame from at least one clip of interest included in the target video (processor and code of processor that performs the specific function, see Tandon, at least at [0078]-[0080], Fig. 7, and other related text) based on attention information of each video frame from a plurality of video frames in the clip of interest (see Tandon, at least at [0021], [0043], [0078], and other related text); and
obtain, through the combining unit, a video summary of the target video by combining the at least one interest frame from the at least one clip of interest (processor and code of processor that performs the specific function, see Tandon, at least at [0078]-[0080], Fig. 7, and other related text).
Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Mahyar (of record) in view of Tandon (of record), as applied to claim 1 above, and further in view of Gopalan (of record).
Regarding claim 2, Mahyar in view of Tandon does not specifically disclose wherein the behavior data comprises input-related information and a viewing behavior record of the user within a statistical window, the input-related information comprising at least one of input content information, a time when an input operation is performed, or a place where the input operation is performed.
In an analogous art relating to a system for analyzing video, Gopalan discloses behavior data comprising input-related information and a viewing behavior record of a user within a statistical window, the input-related information comprising at least one of input content information, a time when an input operation is performed, or a place where the input operation is performed (see Gopalan, at least at [0025], and other related text).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Mahyar in view of Tandon to include the limitations as taught by Gopalan for the advantage of providing a more efficient system that allows for more diverse data to be analyzed.
Claims 3 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Mahyar (of record) in view of Tandon (of record), as applied to claim 1 above, and further in view of Qiu (of record).
Regarding claims 3 and 16, Mahyar in view of Tandon does not specifically disclose wherein the obtaining the attention coding parameter of the user comprises:
obtaining a vector representation of the behavior data by coding the behavior data; and obtaining the attention coding parameter of the user by inputting the vector representation into a preset first self-attention calculation model to perform self-attention processing.
In an analogous art relating to a system for analyzing data, Qiu discloses obtaining an attention coding parameter of a user comprises: obtaining a vector representation of behavior data by coding the behavior data (see Qiu, at least at [0061], [0094], [0199]-[0201], and other related text); and
obtaining the attention coding parameter of the user by inputting the vector representation into a preset first self-attention calculation model to perform self-attention processing (see Qiu, at least at [0061], [0094], [0199]-[0201], and other related text).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Mahyar in view of Tandon to include the limitations as taught by Qiu for the advantage of providing a more efficient system that allows for more accurate results of analysis.
Claims 4 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Mahyar (of record) in view of Tandon (of record), as applied to claim 1 above, and further in view of Rufenacht (of record).
Regarding claims 4 and 17, Mahyar in view of Tandon does not specifically disclose wherein the determining, for each clip included in the target video, whether the clip is the clip of interest comprises:
obtaining video frame vector representations of each video frame in the clip by coding each video frame in the clip; and
determining whether the clip is the clip of interest based on the video frame vector representations.
In an analogous art relating to a system for video analysis, Rufenacht discloses determining, for each clip included in a target video, whether the clip is a clip of interest (see Rufenacht, at least at page 3, lines 17-31, page 4, lines 10-32, page 5, lines 4-12, page 8, lines 27-32, page 9, lines 10-33, and other related text) comprising:
obtaining video frame vector representations of each video frame in the clip by coding each video frame in the clip (see Rufenacht, at least at page 3, lines 17-31, page 4, lines 10-32, page 5, lines 4-12, page 8, lines 27-32, page 9, lines 10-33, and other related text); and
determining whether the clip is a clip of interest based on the video frame vector representations (see Rufenacht, at least at page 3, lines 17-31, page 4, lines 10-32, page 5, lines 4-12, page 8, lines 27-32, page 9, lines 10-33, and other related text).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the system of Mahyar in view of Tandon to include the limitations as taught by Rufenacht for the advantage of providing a more efficient system that allows for more accurate results of analysis.
Allowable Subject Matter
Claims 5-14 and 18-19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHENEA DAVIS whose telephone number is (571)272-9524 and whose email address is CHENEA.SMITH@USPTO.GOV. The examiner can normally be reached M-F: 8:00 am - 4:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Nathan Flynn can be reached at 571-272-1915. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CHENEA DAVIS/Primary Examiner, Art Unit 2421