DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Status
Claims 1-20 were pending for examination in Application No. 18/209,374, filed June 13th, 2023. In the remarks and amendments received on January 12th, 2026, claims 1-3, 5-7, 13, and 19-20 were amended, no claims were cancelled, and no claims were added. Accordingly, claims 1-20 are currently pending for examination in the application.
Response to Amendment
Applicant’s amendments filed January 12th, 2026, have overcome the 35 U.S.C. 112(b) rejections previously set forth in the Non-Final Office Action mailed September 11th, 2025. Accordingly, the rejections are withdrawn.
Response to Arguments
Applicant’s arguments filed January 12th, 2026, with respect to the rejection of claims 1 and 19-20 have been fully considered but are moot because the arguments do not apply to the new combination of references, necessitated by Applicant’s amendments, being used in the current rejection.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1, 4-5, 7-9, and 19-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Castellani et al. (US-20170200280-A1).
Regarding claim 1, Castellani teaches:
In a computer system, a method comprising:
reading a given frame of a video sequence (“An initial frame 202a is processed using full-frame feature extraction,” Para [0031]);
determining whether an object detection condition is satisfied for the given frame (“identify locations in that frame of features of one or more objects to be tracked,” Para [0031]), including determining whether the given frame depicts a shot transition (“Yet another example event is raised when a scene change analysis detects that the current frame is different enough from a previous frame that a scene change is detected,” Para [0030]), wherein the object detection condition for the given frame depends at least in part on whether the given frame depicts the shot transition, the object detection condition being satisfied if the given frame depicts the shot transition (“Based on some event, such as… detecting a scene change as between frame 202d and 202e, the processing of the stream returns to full frame feature-based detection for frame 202e,” Para [0032]);
and tracking an object in the given frame, including determining feature information for the object in the given frame (“a traditional full frame feature-based algorithm is used for feature extraction to detect interesting/relevant features, i.e. points or areas of interest,” Para [0027]), wherein the determining the feature information for the object in the given frame includes selecting between different tracking operations to determine at least some of the feature information for the object in the given frame depending on a result of the determining whether the object detection condition is satisfied for the given frame (“Based on some event, such as… detecting a scene change as between frame 202d and 202e, the processing of the stream returns to full frame feature-based detection for frame 202e,” Para [0032] for when a shot transition is detected, or simply “motion estimation” for non-scene change frames, see Para [0027]).
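For illustration only, the event-gated processing Castellani describes (full-frame feature detection when the detection condition is satisfied, such as at a scene change or for the initial frame, and motion-based tracking otherwise) can be sketched as follows. All function names are hypothetical and do not appear in the cited reference.

```python
def track_sequence(frames, is_scene_change, detect_full_frame, track_by_motion):
    """Yield per-frame feature information, switching between tracking
    operations depending on whether the detection condition is satisfied."""
    features = None
    for i, frame in enumerate(frames):
        if i == 0 or is_scene_change(frame):
            # Detection condition satisfied: expensive full-frame path.
            features = detect_full_frame(frame)
        else:
            # Otherwise: cheaper motion-estimation path reusing prior features.
            features = track_by_motion(frame, features)
        yield features
```

The callables stand in for the reference's full-frame feature extraction and motion estimation; only the gating logic is being illustrated.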
Regarding claim 4, the rejection of claim 1 is incorporated herein. Castellani teaches the method of claim 1, and further teaches:
wherein the shot transition is a viewpoint change in a scene, an abrupt scene change, a gradual scene change, a zoom-in, a zoom-out, a fade-in, a fade-out, or a wipe (“Sudden camera movements, entrance of new objects into field of view, and camera cutaways are just some examples of a scene change,” Para [0030]).
Regarding claim 5, the rejection of claim 1 is incorporated herein. Castellani teaches the method of claim 1, and further teaches:
wherein the determining whether the object detection condition is satisfied for the given frame further includes determining whether a frame counter has reached a threshold (“Based on some event, such as meeting/exceeding a threshold number of frames processed… the processing of the stream returns to full frame feature-based detection,” Para [0032]), and wherein the object detection condition for the given frame further depends at least in part on whether the frame counter has reached the threshold, the object detection condition being satisfied if the frame counter has reached the threshold (“Based on some event, such as meeting/exceeding a threshold number of frames processed,… or detecting a scene change as between frame 202d and 202e, the processing of the stream returns to full frame feature-based detection,” Para [0032]).
Regarding claim 7, the rejection of claim 1 is incorporated herein. Castellani teaches the method of claim 1, and further teaches:
wherein the feature information for the object in the given frame includes spatial information for the object in the given frame (“A list of coordinates corresponding to the points/areas of interest is established,” Para [0027]) and visual information for the object in the given frame (“a traditional full frame feature-based algorithm is used for feature extraction to detect interesting/relevant features, i.e. points or areas of interest,” Para [0027]).
Regarding claim 8, the rejection of claim 7 is incorporated herein. Castellani teaches the method of claim 7, and further teaches:
wherein, for the object in the given frame: the spatial information indicates location of the object in the given frame (“A list of coordinates corresponding to the points/areas of interest is established,” Para [0027]);
and the visual information indicates aspects of appearance of the object in the given frame (“a traditional full frame feature-based algorithm is used for feature extraction to detect interesting/relevant features, i.e. points or areas of interest,” Para [0027]).
Regarding claim 9, the rejection of claim 7 is incorporated herein. Castellani teaches the method of claim 7, and further teaches:
wherein the object detection condition is satisfied (“Based on some event, such as… detecting a scene change,” Para [0032]), and wherein the determining the feature information for the object in the given frame (“the processing of the stream returns to full frame feature-based detection,” Para [0032]) includes:
getting results of object detection operations to determine the spatial information for the object in the given frame (“A list of coordinates corresponding to the points/areas of interest is established,” Para [0027]);
and performing feature extraction operations to determine the visual information for the object in the given frame (“a traditional full frame feature-based algorithm is used for feature extraction to detect interesting/relevant features, i.e. points or areas of interest,” Para [0027]).
Claims 19 and 20 are non-transitory computer-readable medium and system claims that correspond to method claim 1. Implementation of method claim 1 would necessarily encompass the non-transitory computer-readable medium and system claims 19 and 20. Therefore, the rejection of method claim 1 applies to claims 19 and 20.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 2 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Castellani et al. (US-20170200280-A1) as applied to claim 1 above, and further in view of Xu et al., "A shot boundary detection method for news video based on object segmentation and tracking," 2008 International Conference on Machine Learning and Cybernetics, Kunming, China, 2008, pp. 2470-2475, doi: 10.1109/ICMLC.2008.462082, hereinafter referred to as Xu.
Regarding claim 2, the rejection of claim 1 is incorporated herein. Castellani teaches the method of claim 1, but fails to teach the following limitations as further claimed. However, Xu teaches wherein the determining whether the given frame depicts a shot transition (“shot boundary”) includes evaluating a result of shot transition detection for the given frame, the result of shot transition detection for the given frame having been determined by shot transition detection operations comprising:
calculating a given frame histogram (“histogram”) using sample values (“color”) of the given frame (“first, video frame is divided into N sub-blocks; then, we compute the color histogram of each sub-blocks,” Pg. 2470, Section 2);
and measuring differences between the given frame histogram and a previous frame histogram (“Compare the histogram difference between the frame and the immediate preceding frame,” Pg. 2472, Section 4) the previous frame histogram having been calculated using sample values of a previous frame of the video sequence (Pg. 2471, Section 2, Equation 1, where H(i-1)(n) is the “color histogram” of the previous frame).
[Image: media_image1.png (greyscale)]
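For illustration only, the block-wise color-histogram comparison Xu describes (divide each frame into sub-blocks, compute a color histogram per block, and compare the histogram difference against the preceding frame) can be sketched as follows. The function names and the simplified one-dimensional sample values are hypothetical and are not drawn from Xu.

```python
def block_histograms(frame, blocks=4, bins=8, max_val=256):
    """frame: flat list of sample values in [0, max_val).
    Returns one histogram per sub-block."""
    size = len(frame) // blocks
    hists = []
    for b in range(blocks):
        hist = [0] * bins
        for v in frame[b * size:(b + 1) * size]:
            hist[v * bins // max_val] += 1
        hists.append(hist)
    return hists

def histogram_difference(hists_prev, hists_cur):
    """Summed bin-wise absolute difference over all sub-blocks."""
    return sum(abs(a - b)
               for hp, hc in zip(hists_prev, hists_cur)
               for a, b in zip(hp, hc))

def is_shot_boundary(prev_frame, cur_frame, threshold):
    """Flag a candidate shot boundary when the block-histogram
    difference between consecutive frames exceeds the threshold."""
    return histogram_difference(block_histograms(prev_frame),
                                block_histograms(cur_frame)) > threshold
```

Xu uses this histogram stage only as a first, strict-threshold filter before object-level verification; the sketch covers just that first stage.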
Xu is considered to be analogous to the claimed invention because both are in the same field of shot boundary detection dependent on motion or the lack thereof. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Xu into Castellani for the benefit of improved object tracking.
Regarding claim 10, the rejection of claim 9 is incorporated herein. Castellani teaches the method of claim 9, but fails to teach the following limitations as further claimed. Xu, however, further teaches wherein, for the object (“discrete objects”) in the given frame:
the object detection operations produce a bounding box around the object as the spatial information for the object (“First, the discrete objects are extracted from the segmentation results of the video frames, and their bounding boxes and centroids are obtained,” Pg. 2472, Section 3.2);
and the feature extraction operations produce an embedding vector as the visual information for the object (“the pixel value in object mask map is binary data (either 1 or 2),” Pg. 2472, Section 4).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Xu into Castellani for the benefit of improved object tracking.
Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Castellani et al. (US-20170200280-A1) as applied to claim 1 above, and further in view of Xu et al., "A shot boundary detection method for news video based on object segmentation and tracking," 2008 International Conference on Machine Learning and Cybernetics, Kunming, China, 2008, pp. 2470-2475, doi: 10.1109/ICMLC.2008.462082, hereinafter referred to as Xu, Hameed, "Video shot detection by motion estimation and compensation," 2009 International Conference on Emerging Technologies, Islamabad, Pakistan, 2009, pp. 241-246, doi: 10.1109/ICET.2009.5353168, hereinafter referred to as Hameed, and Hassanien et al., “Large-scale, Fast and Accurate Shot Boundary Detection through Spatio-temporal Convolutional Neural Networks,” arXiv:1705.03281 v2, hereinafter referred to as Hassanien.
Regarding claim 3, the rejection of claim 1 is incorporated herein. Castellani teaches the method of claim 1, but fails to teach the following limitations as further claimed.
Xu, however, further teaches:
wherein the determining whether the given frame depicts a shot transition includes evaluating a result of shot transition detection for the given frame (“partitioned color histogram comparison method as the first filter and uses strict threshold in order not to miss any possible shot boundary,” Pg. 2470, Section 2), the result of shot transition detection for the given frame having been determined by shot transition detection operations comprising:
analyzing statistical properties of sample values of the given frame (“Compare the histogram difference between the frame and the immediate preceding frame,” Pg. 2472, Section 4) compared to statistical properties of another frame of the video sequence (“histogram… [of] the immediate preceding frame,” Pg. 2472, Section 4).
Xu fails to teach the following limitations as further claimed.
Hameed, however, teaches analyzing encoded data (“blocks,” Pg. 243, Section III) for the given frame, including analyzing one or more of statistical properties of motion vectors for the units of the given frame (“correlation coefficient gives a measure of the degree of similarity between two regions in different video frames,” Pg. 243, Section III);
analyzing results of block matching or other motion estimation (calculating “The correlation coefficient ȡij between the two blocks i and j in consecutive frames,” Pg. 243, Section III) between blocks of the given frame and a previous frame of the video sequence (“consecutive frames”).
Hameed fails to teach the following limitations as further claimed.
Hassanien, however, teaches using spatio-temporal convolutional neural network (“spatio-temporal Convolutional Neural Networks”) or other neural network to detect boundaries between different shots (“we present an SBD technique based on spatio-temporal Convolutional Neural Networks (CNN),” Abstract, where SBD is “shot boundary detection”).
Xu is considered to be analogous to the claimed invention because both are in the same field of shot boundary detection dependent on motion or the lack thereof. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Xu into Castellani for the benefit of more accurate shot boundary detection.
Hameed is considered to be analogous to the claimed invention because both are in the same field of detecting shot transitions. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Hameed into Castellani and Xu for the benefit of more accurate shot transition detection.
Hassanien is considered to be analogous to the claimed invention because both are in the same field of shot boundary detection. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Hassanien into Hameed, Xu, and Castellani for the benefit of more accurate shot transition detection.
Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Castellani et al. (US-20170200280-A1), as applied to claim 1 above, and further in view of Huang et al., "Shot Change Detection via Local Keypoint Matching," in IEEE Transactions on Multimedia, vol. 10, no. 6, pp. 1097-1108, Oct. 2008, doi: 10.1109/TMM.2008.2001374, hereinafter referred to as Huang.
Regarding claim 6, the rejection of claim 1 is incorporated herein. Castellani teaches the method of claim 1, but fails to teach the following limitations as further claimed. However, Huang teaches wherein the determining whether the object detection condition is satisfied for the given frame further includes:
determining whether an in-interval shot transition occurs (“candidate shot changes”) anywhere in an interval between two end-point frames on opposite sides of the given frame (“we match the frames before and after the intervals of candidate shot changes,” Pg. 1100, Section III), wherein the object detection condition for the given frame further depends at least in part on whether the in-interval shot transition (“detected transition”) occurs anywhere in the interval between the two end-point frames (“the first and the last frames of each interval”; “After finding the intervals of transitions in the initial step, discussed in Section II, the first and the last frames of each interval are matched again. If there are still many matching keypoints, the two frames are considered similar because the detected transition is probably a false alarm,” Pg. 1100, Section III).
Huang is considered to be analogous to the claimed invention because both are in the same field of shot transition detection. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Huang into Castellani for the benefit of another shot transition “check,” which makes the overall shot transition detection system more accurate.
Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Castellani et al. (US-20170200280-A1), as applied to claim 1 above, and further in view of Hameed, "Video shot detection by motion estimation and compensation," 2009 International Conference on Emerging Technologies, Islamabad, Pakistan, 2009, pp. 241-246, doi: 10.1109/ICET.2009.5353168, hereinafter referred to as Hameed.
Regarding claim 11, the rejection of claim 7 is incorporated herein. Castellani teaches the method of claim 7, but fails to teach the following limitations as further claimed. However, Hameed further teaches wherein the object detection condition is not satisfied (“Inaccurate motion estimation is observed for the block if a high pass filter was not applied,” Pg. 244, Section IV), and wherein the determining the feature information for the object in the given frame includes:
performing interpolation operations (“high pass filter”) to determine the spatial information (related to motion) for the object in the given frame (“motion of a block is projected more accurately which have been passed through high pass filter,” Pg. 244, Section IV);
and performing feature extraction operations (“correlation metric,” Pg. 245, Section V) to determine the visual information for the object in the given frame (“matches image features corresponding to high frequency occurrence viz. edges, corners and certain types of textures,” Pg. 245, Section V).
Hameed is considered to be analogous to the claimed invention because both are in the same field of detecting shot transitions. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Hameed into Castellani for the benefit of more accurately detecting and tracking objects when determining a shot transition.
Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Castellani et al. (US-20170200280-A1), as applied to claim 1 above, and further in view of Wang et al. (US-20200126241-A1), hereinafter referred to as Wang ‘241.
Regarding claim 13, the rejection of claim 1 is incorporated herein. Castellani teaches the method of claim 1, but fails to teach the following limitations as further claimed. Wang ‘241, however, teaches wherein the tracking the object in the given frame further includes:
[Image: media_image2.png (greyscale)]
using at least some of the feature information (Fig. 2, 203, “feature map”), determining affinities for the object in the given frame (“concatenating bounding boxes coordinates and corresponding velocities to each first feature map into corresponding first dimensional vectors”); and
using at least some of the affinities, associating the object in the given frame with corresponding objects (Fig. 2, 202, “each object in the second sequence of tracklets”) in other frames of the video sequence as part of updating tracking information (Fig. 3, Frame t vs Frame t-1).
[Image: media_image3.png (greyscale)]
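For illustration only, the affinity-based association attributed to Wang ‘241 above (compute affinities from per-object feature information, then associate objects in the given frame with corresponding objects in other frames) can be sketched as a greedy matching. The toy distance-based affinity and all names are hypothetical; Wang ‘241 derives its affinities from network feature maps, not from this measure.

```python
def affinity(feat_a, feat_b):
    """Toy affinity: negative sum of absolute feature differences,
    so that more similar feature vectors score higher."""
    return -sum(abs(a - b) for a, b in zip(feat_a, feat_b))

def associate(current, tracks):
    """current, tracks: dicts mapping object id -> feature vector.
    Greedily pair each current object with its best unused track."""
    pairs, used = [], set()
    for cid, cfeat in current.items():
        best = max((tid for tid in tracks if tid not in used),
                   key=lambda tid: affinity(cfeat, tracks[tid]),
                   default=None)
        if best is not None:
            used.add(best)
            pairs.append((cid, best))
    return pairs
```

A production tracker would typically solve this assignment globally (e.g., Hungarian algorithm) rather than greedily; the sketch only shows the affinity-then-associate structure.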
Wang ‘241 is considered to be analogous to the claimed invention because both are in the same field of multi-object tracking and association. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Wang ‘241 into Castellani for the benefit of more accurate object tracking and shot detection.
Claims 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Castellani et al. (US-20170200280-A1), as applied to claim 1 above, and further in view of Wang (JP-2015050591-A), hereinafter referred to as Wang ‘591.
Regarding claim 17, the rejection of claim 1 is incorporated herein. Castellani teaches the method of claim 1, but fails to teach the following limitations as further claimed. Wang ‘591, however, teaches determining that a queue (“queues”) is not full; and after the reading the given frame (“packet”), storing the given frame in the queue (“queues… that store packets classified into the video category (AC_VI),” Para [0045]).
Wang ‘591 is considered to be analogous to the claimed invention because both are in the same field of video processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Wang ‘591 into Castellani for the benefit of better video processing, as there will be less or no buffering in the system.
Regarding claim 18, the rejection of claim 17 is incorporated herein. Castellani in view of Wang ‘591 teaches the method of claim 17, and Wang ‘591 further teaches wherein the queue has a maximum queue size (“sets the number of stages in each queue buffer (queues 111 to 113) that stores packets classified in the video category (AC_VI),” Para [0089]), the method further comprising:
selectively adjusting the maximum queue size (“accept a size change”) depending on whether a queue condition is satisfied (“The queue buffer unit 22 can accept a size change from the change means 109 of the video queue size change unit 25,” Para [0089]; in other words, the size or amount of buffer stages in the queue can change depending on the processing needs of the user).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Wang ‘591 into Castellani for the benefit of optimized video processing performance.
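For illustration only, the queue behavior mapped to claims 17-18 above (store a frame only when the queue is not full, and selectively adjust the maximum queue size when a queue condition is satisfied) can be sketched as follows. The class and method names are hypothetical and do not come from Wang ‘591.

```python
from collections import deque

class FrameQueue:
    """Bounded frame queue with an adjustable maximum size."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._frames = deque()

    def is_full(self):
        return len(self._frames) >= self.max_size

    def push(self, frame):
        """Store the frame only if the queue is not full;
        return whether the frame was stored."""
        if self.is_full():
            return False
        self._frames.append(frame)
        return True

    def resize(self, new_max, condition_met):
        """Selectively adjust the maximum queue size, but only
        when the queue condition is satisfied."""
        if condition_met:
            self.max_size = new_max
```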
Allowable Subject Matter
Claims 12 and 14-16 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. It is suggested that the allowable dependent claims be moved into the independent claim as limitations. The claims are potentially allowable because the prior art fails to anticipate or render obvious the limitations of the allowable subject matter. The primary reason for allowance of claim 12 is the implementation of “as a condition for the performing the interpolation operations, determining that visual information for the object matches in the two end-point frames or that an identifier for the object matches in the two end-point frames; determining spatial information for the object in two end-point frames on opposite sides of the given frame, the object detection condition being satisfied for each of the two end-point frames; and interpolating between the spatial information for the object in the two end-point frames.” Although interpolation is not new in the field of shot detection or object tracking, it is in combination with, and in the context of, all claim limitations that claim 12 recites novel matter. The primary reason for allowance of claim 14 is the implementation of selectively updating information relating to the target/detected object based on whether a shot transition is detected. This, in combination with the claims it depends on, is novel. Claim 15 is novel because of the specific filtering of the tracking information, and claim 16 is novel because of the specific models adapted for each type of object. All of these claims are novel in combination with the claims from which they depend.
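For illustration only, the interpolation recited in claim 12 (determine spatial information for a matched object at two end-point frames on opposite sides of the given frame, then interpolate between them) can be sketched as a linear bounding-box interpolation. The (x, y, w, h) box representation and the function names are hypothetical choices for this sketch.

```python
def interpolate_box(box_a, box_b, t):
    """Linearly interpolate two (x, y, w, h) boxes; t in [0, 1]."""
    return tuple(a + t * (b - a) for a, b in zip(box_a, box_b))

def interpolated_spatial_info(frame_idx, idx_a, box_a, idx_b, box_b):
    """Spatial information for the given frame, interpolated from the
    object's boxes at the two end-point frames idx_a < frame_idx < idx_b."""
    t = (frame_idx - idx_a) / (idx_b - idx_a)
    return interpolate_box(box_a, box_b, t)
```

The claim's precondition (matching visual information or object identifier at the two end-point frames) would be checked before this interpolation is performed; only the interpolation step itself is sketched here.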
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RACHEL A OMETZ whose telephone number is (571)272-2535. The examiner can normally be reached 6:45am-4:00pm ET Monday-Thursday, 6:45am-1:00pm ET every other Friday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vu Le can be reached at 571-272-7332. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Rachel Anne Ometz/ Examiner, Art Unit 2668 2/12/26
/VU LE/ Supervisory Patent Examiner, Art Unit 2668