Prosecution Insights
Last updated: April 19, 2026
Application No. 17/980,322

SMART SCENE BASED IMAGE CROPPING

Final Rejection — §103
Filed: Nov 03, 2022
Examiner: SALEH, ZAID MUHAMMAD
Art Unit: 2668
Tech Center: 2600 — Communications
Assignee: Black Sesame Technologies Inc.
OA Round: 4 (Final)

Grant Probability: 65% (Favorable)
Expected OA Rounds: 5-6
Time to Grant: 3y 1m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 65%, above average (28 granted / 43 resolved; +3.1% vs TC avg)
Interview Lift: +48.4% (allowance rate among resolved cases with vs. without an interview)
Typical Timeline: 3y 1m avg prosecution; 30 currently pending
Career History: 73 total applications across all art units

Statute-Specific Performance

§101: 5.7% (-34.3% vs TC avg)
§103: 58.5% (+18.5% vs TC avg)
§102: 28.0% (-12.0% vs TC avg)
§112: 4.4% (-35.6% vs TC avg)

Based on career data from 43 resolved cases. Tech Center averages are estimates.

Office Action — §103

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first-inventor-to-file provisions of the AIA.

Status of Claims

Claims 1–16 remain pending. Claims 1, 14, 15 and 16 are amended.

Response to Arguments

Applicant's arguments filed December 31, 2025 with respect to claims 1–16 have been considered but are moot because the new grounds of rejection do not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections — 35 U.S.C. § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1–16 are rejected under 35 U.S.C. § 103 as being unpatentable over Lin et al., U.S. Patent No. US-7760956-B2 (hereinafter Lin), in view of Kansara, U.S. Patent Application Publication No. US-20210021900-A1 (hereinafter Kansara), and Zavesky, U.S. Patent Application Publication No. US-20160092751-A1 (hereinafter Zavesky), and further in view of Chen, U.S. Patent Application Publication No. US-20180286199-A1 (hereinafter Chen).

Regarding claim 1, Lin discloses a video cropping system (Lin in [Column 1, Lines 57-59] discloses, "automatically crop the enhanced key frame images to produce cropped images; and compose one or more pages comprising the cropped images"), comprising:

a segmentation module configured to segment a video into a plurality of frames (in [Column 6, Lines 54-58] Lin discloses "extracting a set of key frames from a video stream", where extracting the key frames corresponds to the segmentation module), comprising:

a cluster generator configured to generate a cluster of frames from the plurality of frames (in [Column 6, Lines 65-67 and Column 7, Lines 1-7] Lin discloses, "the candidate key frames from step 100 are arranged into a set of clusters"); and

a scene generator configured to generate a plurality of scenes from the cluster of frames (in [Column 6, Lines 65-67 and Column 7, Lines 1-7] Lin discloses, "one of the candidate key frames from each cluster is selected as a key frame for the video stream"); and

a feature processing module configured to analyze the scene segment and an associated frame to extract at least one feature (in [Column 6, Lines 65-67 and Column 7, Lines 1-7] Lin discloses, "A relative importance of a candidate key frame can be based on an overall level of meaningful content in the candidate key frame". Identifying the meaningful content of the key frame corresponds to extracting the feature).

Lin does not disclose the following limitations as further recited in the claim, as noted in strike-through above.

Kansara discloses merging the plurality of scenes to form a scene segment (in [0004] Kansara discloses identifying multiple objects, each object identified from an individual scene in a respective image of the video content, i.e., a sequence of images. Thus, the cropped scene including various "objects" determined to be included in said cropped scene implies merging these objects from their respective scenes into one scene in the cropped image), the feature is stacked on the associated frame to form a stacked feature frame (Kansara in [0009] discloses, "relative importance value may include an indication of which of the identified objects are to be included in a cropped version of the video scene". The identified objects within the video scene and their relative importance are determined first; these identified objects, based on their relative importance, are then determined to be included in the cropped video scene. A "video scene" is not a scene in a single image; a video scene is a particular scene within a series of images. Therefore, determining the importance of multiple images of an object and including those images in the cropped scene corresponds to stacking a feature frame);

wherein a scoring module configured to assign a score to the stacked feature frame (in [0067] the importance determining module equates to the scoring module, and the importance value equates to the score); and

a cropping module, comprising:

a region localizer configured to scan the stacked feature frame to detect a test area based on the score (in [0070] Kansara discloses, "More 'important' people or objects may be grouped together so that the group is more likely to appear in a cropped version of the video scene ... module may use the relative importance value 412 for each object or group of objects to determine where the crop should be placed". Moreover, in [0078] of the present application it is disclosed, "stacking the number of feature on the image or scene segments to form a stacked feature frame". In Kansara, a video scene includes various identified objects within a scene in that video sequence, in other words, objects within the same scene across a series of images in that video sequence (a stacked feature frame). Thus, identifying the objects within the video scene and giving them relative importance metrics is akin to assigning a score within that scene); and

a cropper configured to crop the test area to generate an image of interest (in [0070] Kansara discloses identifying the important people or objects (the image of interest) and making them appear in the cropped version: "'important' people or objects may be grouped together so that the group is more likely to appear in a cropped version of the video scene").

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to integrate the technique of Kansara into the teaching of Lin, because having a scoring module would allow the system to focus on the critical content while ignoring irrelevant or less significant regions of the frame.

Lin and Kansara in combination do not disclose the following limitations as further recited in the claim.

Zavesky discloses neighboring scenes are merged based on a similarity of clusters, wherein the similarity is based on a cosine distance between a UV histogram of neighboring semantic sections (Zavesky in [0076] discloses, "Cluster-based dictionaries may combine similar image patches into a fixed set of words by simple k-means or hierarchical k-means ... After each approach, the representation may be a posterior vector (probabilistic or discrete)—or a mostly-zero vector with a few bins set to non-zero where there was a high correlation for quantization. Utilizing this vector representation, histogram-based distance metrics and kernels, such as the cosine and χ², may be utilized").
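The Zavesky mapping turns on a histogram-based similarity test between neighboring scenes. As an illustrative sketch only (the YUV frame format, bin count, and merge threshold below are assumptions, not taken from the record), merging adjacent scenes by cosine distance over normalized UV chrominance histograms could look like:

```python
import numpy as np

def uv_histogram(frame_yuv: np.ndarray, bins: int = 16) -> np.ndarray:
    """2-D histogram over the U and V (chrominance) channels, flattened
    and normalized to a distribution."""
    u, v = frame_yuv[..., 1].ravel(), frame_yuv[..., 2].ravel()
    hist, _, _ = np.histogram2d(u, v, bins=bins, range=[[0, 256], [0, 256]])
    hist = hist.ravel()
    return hist / max(hist.sum(), 1.0)

def cosine_distance(h1: np.ndarray, h2: np.ndarray) -> float:
    denom = np.linalg.norm(h1) * np.linalg.norm(h2)
    return 1.0 - float(h1 @ h2 / denom) if denom else 1.0

def merge_neighbors(scenes, threshold=0.2):
    """Greedily merge adjacent scenes (each a list of frames) whose
    boundary frames have similar UV histograms."""
    merged = [list(scenes[0])]
    for scene in scenes[1:]:
        if cosine_distance(uv_histogram(merged[-1][-1]),
                           uv_histogram(scene[0])) < threshold:
            merged[-1] = merged[-1] + list(scene)  # fold into prior scene
        else:
            merged.append(list(scene))
    return merged
```

Two neighboring scenes with near-identical chroma content collapse into one segment, while a scene with a different color distribution starts a new segment, which is the grouping behavior the examiner attributes to the combination.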
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to integrate the technique of Zavesky into the teaching of Lin in view of Kansara, because it would allow the system to group frames that are visually alike based on color and content, resulting in higher accuracy and stability of subsequent feature detection and cropping.

Lin, Kansara and Zavesky in combination do not disclose the following limitations as further recited in the claim.

Chen discloses a deformer configured to resize the test area (Chen in [0009] discloses, "Adjusting the shape of the first blob tracker includes shifting at least one boundary of a bounding region of the first blob tracker based on the shape of the merged blob"), wherein the deformer covers the instances by expanding or shrinking toward a minimal (Chen in [0015] discloses, "a first shape adjustment by shifting at least one boundary pair of the first blob tracker towards a center of the bounding region of the first blob tracker ... the at least one boundary pair is shifted until one or more boundaries of the at least one boundary pair intersect with a foreground pixel of the merged blob", wherein shifting boundaries inward and outward equates to shrinking and expanding. Additionally, Chen in [0187] discloses, "Once a shifted boundary of a boundary pair intersects a foreground pixel, the initial shape adaptation process can stop for the boundary pair") and maximal border of selected detections in both horizontal and vertical directions (Chen in [0293] discloses, "the at least one boundary pair of the first blob tracker includes at least one or more of a left boundary and a right boundary of the bounding region of the first blob tracker or a top boundary and a bottom boundary of the bounding region of the first blob tracker").
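The Chen-mapped limitation amounts to resizing a crop rectangle edge by edge until it matches the extremes of the selected detections. A minimal sketch under assumed conventions (boxes as (x0, y0, x1, y1) tuples; not drawn from Chen's actual blob-tracker implementation):

```python
def deform_to_detections(test_area, detections):
    """Expand or shrink each edge of `test_area` toward the minimal and
    maximal borders of the selected detections, in both the horizontal
    and vertical directions. Boxes are (x0, y0, x1, y1) tuples."""
    xs0, ys0, xs1, ys1 = zip(*detections)
    # Left/top edges move to the minimal detection border; right/bottom
    # edges move to the maximal one. The starting test_area only decides
    # whether each edge's move is an expansion (detection spills past it)
    # or a shrink (the crop overshoots every detection); either way each
    # edge converges to the tight bound of the detections.
    return (min(xs0), min(ys0), max(xs1), max(ys1))
```

For example, a crop of (10, 10, 50, 50) around detections (5, 12, 30, 40) and (20, 8, 60, 45) expands left, top and right but shrinks the bottom, yielding (5, 8, 60, 45).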
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to integrate the technique of Chen into the teaching of Lin in view of Kansara and Zavesky, because it would allow the system to ensure complete preservation of meaningful content and reduce wrong or incomplete crops.

Summary of Citations (Chen)

Paragraph [0009]: "Adjusting the shape of the first blob tracker includes shifting at least one boundary of a bounding region of the first blob tracker based on the shape of the merged blob".

Paragraph [0015]: "a first shape adjustment by shifting at least one boundary pair of the first blob tracker towards a center of the bounding region of the first blob tracker ... the at least one boundary pair is shifted until one or more boundaries of the at least one boundary pair intersect with a foreground pixel of the merged blob".

Paragraph [0187]: "Once a shifted boundary of a boundary pair intersects a foreground pixel, the initial shape adaptation process can stop for the boundary pair".

Paragraph [0293]: "the at least one boundary pair of the first blob tracker includes at least one or more of a left boundary and a right boundary of the bounding region of the first blob tracker or a top boundary and a bottom boundary of the bounding region of the first blob tracker".

Summary of Citations (Lin)

[Column 1, Lines 57-59]: "automatically crop the enhanced key frame images to produce cropped images; and compose one or more pages comprising the cropped images".

[Column 6, Lines 54-58]: "FIG. 1 shows an embodiment of a method for extracting a set of key frames from a video stream according to exemplary embodiments. At step 100, a set of candidate key frames is selected from among a series of video frames in the video stream".

[Column 6, Lines 65-67 and Column 7, Lines 1-7]: "At step 102, the candidate key frames from step 100 are arranged into a set of clusters. The number of clusters can be fixed or can vary in response to the complexity of the content of the video. At step 104, one of the candidate key frames from each cluster is selected as a key frame for the video stream. The candidate key frames can be selected in response to a relative importance of each candidate key frame. A relative importance of a candidate key frame can be based on an overall level of meaningful content in the candidate key frame".

Summary of Citations (Kansara)

Paragraph [0004]: "The method may further include scanning the video scenes to identify objects within the video scene. The method may also include determining a relative importance value for the identified objects. The relative importance value may include an indication of which objects are to be included in a cropped version of the video scene".

Paragraph [0009]: "The physical processor may scan at least one of the video scenes to identify objects within the video scene and determine a relative importance value for the identified objects within the video scene. The relative importance value may include an indication of which of the identified objects are to be included in a cropped version of the video scene".

Paragraph [0067]: "As noted above, when determining the relative importance value 412 of each identified object in a scene, the importance determining module 411 may use various machine learning algorithms or different types of neural networks or other algorithms to determine the relative importance of the objects in a scene".

Paragraph [0070]: "More 'important' people or objects may be grouped together so that the group is more likely to appear in a cropped version of the video scene. When the video crop generating module 413 is determining how to generate a video crop 414 for a given scene, that module may use the relative importance value 412 for each object or group of objects to determine where the crop should be placed".

Paragraph [0076]: "the video crop generating module 413 may generate circular video crops that include the most important objects 410, oval-shaped crops that include the most important objects, rectangular, square, portrait, landscape, or other differently sized or differently shaped video crops that include those objects that are most likely to want to be viewed by viewers of the scene".

Paragraph [0077]: "the video crop generating module 413 may optimize the video crop 414 for each video scene 407 within the confines of the crop shape, size, and aspect ratio".

Summary of Citations (Zavesky)

Paragraph [0076]: "Cluster-based dictionaries may combine similar image patches into a fixed set of words by simple k-means or hierarchical k-means ... After each approach, the representation may be a posterior vector (probabilistic or discrete)—or a mostly-zero vector with a few bins set to non-zero where there was a high correlation for quantization. Utilizing this vector representation, histogram-based distance metrics and kernels, such as the cosine and χ², may be utilized".

Regarding claims 2–13, the grounds of rejection from the last Office Action with respect to Lin in view of Kansara, Zavesky and further in view of Chen apply here.

Regarding claim 14, claim 14 is represented in claims 1, 5, 11 and 13. Therefore, the rejection analysis and motivation to combine in claims 1, 5, 11 and 13 are applied to claim 14.

Regarding claim 15, method claim 15 corresponds to apparatus claim 1. Therefore, the rejection analysis of claim 1 is applicable to claim 15.

Regarding claim 16, method claim 16 corresponds to apparatus claim 14. Therefore, the rejection analysis of claim 14 is applicable to claim 16.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Contact

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ZAID MUHAMMAD SALEH, whose telephone number is (703) 756-1684. The examiner can normally be reached M-F, 8 am - 5 pm ET.

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Vu Le, can be reached at (571) 272-7332. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (in USA or Canada) or 571-272-1000.

/ZAID MUHAMMAD SALEH/
Examiner, Art Unit 2668
02/02/2026

/VU LE/
Supervisory Patent Examiner, Art Unit 2668
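For orientation, the "stacked feature frame" and scoring module recited in claim 1, as characterized in the mapping above, can be pictured with a minimal sketch. The channel layout (features appended after the RGB planes) and the mean-activation score are illustrative assumptions, not taken from the application or the cited art:

```python
import numpy as np

def stack_feature_frame(frame, features):
    """Stack per-pixel feature maps onto a frame as extra channels,
    yielding a stacked feature frame of shape H x W x (3 + len(features))."""
    return np.concatenate([frame] + [f[..., None] for f in features], axis=-1)

def score_region(stacked, box):
    """Score a candidate crop region by its mean feature activation;
    channels beyond the first three are the stacked feature maps."""
    x0, y0, x1, y1 = box
    return float(stacked[y0:y1, x0:x1, 3:].mean())
```

A region localizer would then scan candidate boxes and keep the highest-scoring test area for the cropper, which matches the role the rejection assigns to Kansara's importance-determining module.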

Prosecution Timeline

Nov 03, 2022 — Application Filed
Nov 21, 2024 — Non-Final Rejection — §103
Feb 28, 2025 — Response Filed
May 02, 2025 — Final Rejection — §103
Aug 18, 2025 — Request for Continued Examination
Aug 27, 2025 — Response after Non-Final Action
Sep 10, 2025 — Non-Final Rejection — §103
Dec 31, 2025 — Response Filed
Feb 02, 2026 — Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602944: AUTHENTICATION OF DENDRITIC STRUCTURES
Granted Apr 14, 2026 (2y 5m to grant)

Patent 12586501: DISPLAY DEVICE, DISPLAY METHOD, AND STORAGE MEDIUM
Granted Mar 24, 2026 (2y 5m to grant)

Patent 12586396: INFORMATION PROCESSING APPARATUS AND SYSTEM
Granted Mar 24, 2026 (2y 5m to grant)

Patent 12562535: METHOD FOR DETECTING UNDESIRED CONNECTION ON PRINTED CIRCUIT BOARD
Granted Feb 24, 2026 (2y 5m to grant)

Patent 12555344: METHOD AND APPARATUS FOR IMPROVING VIDEO TARGET DETECTION PERFORMANCE IN SURVEILLANCE EDGE COMPUTING
Granted Feb 17, 2026 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 65%
With Interview: 99% (+48.4%)
Median Time to Grant: 3y 1m
PTA Risk: High

Based on 43 resolved cases by this examiner. Grant probability derived from career allow rate.
