Prosecution Insights
Last updated: April 19, 2026
Application No. 19/045,228

REAL-TIME SEGMENTATION AND TRACKING OF OBJECTS

Status: Non-Final OA (§102, §103)
Filed: Feb 04, 2025
Examiner: DANG, HUNG Q
Art Unit: 2484
Tech Center: 2400 (Computer Networks)
Assignee: FD IP & Licensing LLC
OA Round: 1 (Non-Final)
Grant Probability: 68% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 3y 1m
With Interview: 87%

Examiner Intelligence

Career Allow Rate: 68%, above average (+10.3% vs TC avg); 1257 granted / 1841 resolved
Interview Lift: +18.3% across resolved cases with an interview
Avg Prosecution: 3y 1m; 95 applications currently pending
Career History: 1936 total applications across all art units

Statute-Specific Performance

§101: 4.2% (-35.8% vs TC avg)
§102: 23.6% (-16.4% vs TC avg)
§103: 54.1% (+14.1% vs TC avg)
§112: 11.6% (-28.4% vs TC avg)
Tech Center averages are estimates. Based on career data from 1841 resolved cases.

Office Action

Rejections under §102 and §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections

Claims 2 and 7 are objected to because of the following informalities: Claim 2 recites “the the object” in line 4. Claim 7 recites “the the first and second bound boxes” in line 2. The duplicate “the” in each of the claims should be deleted. Appropriate correction is required.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action: A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-9 and 12-17 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kansara (US 2023/0064431 A1 – hereinafter Kansara).

Regarding claim 1, Kansara discloses a computer-implemented method comprising: receiving frames of a scene, the frames including an object (Fig. 2; [0040] – accessing frames of a video depicting a scene with an object); generating a first bounding box so that the object in a first frame of the frames is within the first bounding box (Fig. 2; [0050]-[0052] – generating an initial spline, which is a bounding box, for the object in a key frame); providing a first prompt based on the first bounding box (Fig. 2; [0054]-[0055] – providing a first prompt as user input to adjust a corresponding spline); segmenting the object from the first frame based on the first prompt to provide a first mask (Fig. 2; [0054]-[0055] – segmenting the object to generate an adjusted spline to provide a mask for the frame as shown in Fig. 5 and further described at least in [0067]); generating a second bounding box so that the object in a second frame of the frames is within the second bounding box based on the first mask (Fig. 5; [0067]; [0071] – generating a spline based on the mask); providing a second prompt based on the second bounding box (Fig. 2; [0062] – interpolating the adjusted spline in a sequentially proximate frame and providing user’s input to repeat adjustment of the spline); and segmenting the object from the second frame based on the second prompt to provide a second mask ([0067] – generating a mask on each frame of the video).

Regarding claim 2, Kansara also discloses the computer-implemented method of claim 1, wherein generating the second bounding box comprises: overlaying the first mask over the first frame to overlay the object (Figs. 4-5 – overlaying the mask 510 over the first frame to overlay the object 310); identifying a contour of the object in the first frame based on pixel values of the first mask representing the object (Fig. 5; [0071] – identifying a contour of the person 310); and creating the second bounding box based on the contour of the object (Fig. 8; [0071] – creating a spline based on the mask 510).

Regarding claim 3, Kansara also discloses the computer-implemented method of claim 2, wherein creating the second bounding box comprises overlaying the second bounding box over the second frame so that the object is within the second bounding box, the second prompt being generated based on a location of the second bounding box when the object is located therein (Figs. 5, 8; [0067] – overlaying the second spline as shown in Fig. 8, the second prompt generated based on a location of the second spline to adjust the second spline as further described at least in [0072]-[0073] and shown in Fig. 9).

Regarding claim 4, Kansara also discloses the computer-implemented method of claim 3, wherein overlaying comprises adjusting the second bounding box until the object is located therein (Figs. 8-9; [0072]-[0073]).

Regarding claim 5, Kansara also discloses the computer-implemented method of claim 1, wherein the second bounding box is provided in response to determining that the second bounding box is valid (Fig. 8; [0071]).

Regarding claim 6, Kansara also discloses the computer-implemented method of claim 5, further comprising determining that the second bounding box is valid based on an evaluation of the first and second bounding boxes using an intersection or overlap condition ([0068]-[0069] – matching specific points of person 310 in position 610 with corresponding points of person 310 in position 620, wherein positions 610 and 620 represent positions of person 310 in consecutive frames of the video using an intersection or overlap condition as a difference of position as further described at least in [0005]).

Regarding claim 7, Kansara also discloses the computer-implemented method of claim 6, wherein the intersection or overlap condition specifies an amount of intersection or overlap between the first and second bounding boxes ([0068]-[0069] – matching specific points of person 310 in position 610 with corresponding points of person 310 in position 620, wherein positions 610 and 620 represent positions of person 310 in consecutive frames of the video, specifying an amount of intersection or overlap condition as a difference of position as further described at least in [0005]).

Regarding claim 8, Kansara also discloses the computer-implemented method of claim 1, generating augmented video comprising a video that has been augmented with a digital asset based on at least the first and second masks ([0028]; [0064]-[0067] – generating an augmented video with masks in each frame of the video and a spline imported from other tools).

Regarding claim 9, Kansara also discloses the computer-implemented method of claim 8, wherein the augmented video is generated during film production of a scene, the scene including the object ([0064]-[0067] – video editing is a part of film production).

Regarding claim 12, Kansara also discloses the computer-implemented method of claim 1, wherein the first bounding box is generated using an object detection model or based on user input at an input device (Fig. 2; [0050]-[0052] – generating the initial spline, which is the first bounding box, using an object detection model) and the second bounding box is not ([0071] – generating the second spline, i.e. the second bounding box, using a mask, thus neither using an object detection model nor a user input at an input device).

Regarding claim 13, Kansara also discloses the computer-implemented method of claim 12, further comprising generating a graphical user interface (GUI) with the first frame and graphical elements that are manipulated by a user to generate the first bounding box around the object in the first frame (Figs. 3-5; [0066]-[0067]).

Regarding claim 14, Kansara also discloses the computer-implemented method of claim 13, further comprising outputting the GUI on a display (Figs. 3-5).
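For readers mapping these limitations to an implementation, the claim-2 contour-to-box step and the claims 5-7 overlap check can be sketched in a few lines. This is a minimal sketch assuming OpenCV and NumPy; the function names and the 0.3 threshold are hypothetical and appear in neither the application nor Kansara.

```python
import cv2
import numpy as np

def box_from_mask(mask: np.ndarray) -> tuple[int, int, int, int]:
    """Claim-2 style step: identify the object's contour from the binary
    mask's pixel values, then create a bounding box around that contour."""
    contours, _ = cv2.findContours(
        mask.astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
    )
    if not contours:
        raise ValueError("mask contains no object pixels")
    largest = max(contours, key=cv2.contourArea)  # assume one object of interest
    x, y, w, h = cv2.boundingRect(largest)
    return x, y, x + w, y + h  # (x1, y1, x2, y2)

def iou(a, b) -> float:
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def second_box_is_valid(first_box, second_box, min_overlap=0.3) -> bool:
    """Claims 5-7 style check: accept the second box only if it overlaps
    the first by at least a specified amount (0.3 is a made-up threshold)."""
    return iou(first_box, second_box) >= min_overlap
```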
Regarding claim 15, Kansara discloses a computer-implemented method comprising: receiving a video of a scene or frames of the video of the scene (Fig. 2; [0040] – accessing frames of a video depicting a scene with an object); generating a first mask for an object in a first frame of the frames, the first mask being generated based on a first segmentation of the object from the first frame using a first prompt (Fig. 2; [0054]-[0055] – segmenting the object to generate an adjusted spline using a first prompt from the user to provide a mask for the frame as shown in Fig. 5 and further described at least in [0067]); generating a second mask for the object in a second frame of the frames, the second mask being generated based on a second segmentation of the object from the second frame ([0067] – generating a mask on each frame of the video) using a second prompt, the second prompt being generated based on the first mask (Fig. 2; [0062] – using the adjusted spline in a sequentially proximate frame and providing user’s input to repeat adjustment of the spline); and generating augmented video based on the video, the first and second mask, and a digital asset ([0028]; [0064]-[0067] – generating an augmented video with masks in each frame of the video and imported splines).

Regarding claim 16, Kansara also discloses the computer-implemented method of claim 15, wherein the digital asset is animated, and the augmented video is provided with the digital asset animated therein ([0081] – the digital asset as a spline imported from other editing tools as further described at least in [0028] is animated with movements of objects, and the augmented video is provided with the digital asset animated with the movements of the objects).

Regarding claim 17, Kansara also discloses the computer-implemented method of claim 16, wherein the augmented video is generated during film production of a scene, the scene including the object ([0064]-[0067] – video editing is a part of film production).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 10-11 and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Kansara as applied to claims 1-9 and 12-17 above, and further in view of Wang et al. (CN 117541843 A – hereinafter Wang; references to machine-translated copy attached).

Regarding claim 10, see the teachings of Kansara as discussed in claim 1 above. However, Kansara does not disclose the object is segmented from the first and second frames using a segment anything model (SAM). Wang discloses an object is segmented from frames using a segment anything model (SAM) ([0079] – the image to be processed is input into the FastSAM segmentation model for image segmentation). One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to incorporate the teachings of Wang into the method taught by Kansara to segment the object in the first and second frames because such a model provides high segmentation accuracy while maintaining low GPU memory usage and fast inference times.
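Claim 10 turns on performing the segmentation with a segment anything model (SAM). A minimal sketch of the claim-1/claim-15 loop under that reading, assuming Meta's segment-anything package and its SamPredictor box-prompt API; the checkpoint path, model size, and the track_object helper are placeholders, not anything disclosed in the application or the cited art:

```python
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

# Placeholder checkpoint; any SAM variant exposing a box-prompt interface would do.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")
predictor = SamPredictor(sam)

def track_object(frames: list[np.ndarray], first_box: np.ndarray) -> list[np.ndarray]:
    """Box prompt -> mask -> next box prompt, one object across all frames."""
    masks, box = [], first_box
    for frame in frames:
        predictor.set_image(frame)  # expects an RGB HxWx3 uint8 array
        mask, _, _ = predictor.predict(box=box, multimask_output=False)
        masks.append(mask[0])
        # Propagate: derive the next frame's box prompt from this frame's mask
        # (box_from_mask is the helper sketched earlier).
        box = np.array(box_from_mask(mask[0]))
    return masks
```

The design point this illustrates is the one claim 12 recites: only the first box needs an object detection model or user input; every later box is derived from the previous frame's mask.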
Regarding claim 11, see the teachings of Kansara and Wang as discussed in claim 10 above, in which Wang also discloses the SAM is a fast SAM (FastSAM), a mobile SAM (MobileSAM), or a high-quality SAM (HQ-SAM) ([0079] – the image to be processed is input into the FastSAM segmentation model for image segmentation). The motivation for incorporating the teachings of Wang into the method has been discussed in claim 10 above.

Regarding claim 18, see the teachings of Kansara as discussed in claim 15 above. However, Kansara does not disclose the object is segmented from the first and second frames using a segment anything model (SAM), the SAM comprising one of a fast SAM (FastSAM), a mobile SAM (MobileSAM), and a high-quality SAM (HQ-SAM). Wang discloses an object is segmented from frames using a segment anything model (SAM), the SAM comprising one of a fast SAM (FastSAM), a mobile SAM (MobileSAM), and a high-quality SAM (HQ-SAM) ([0079] – the image to be processed is input into the FastSAM segmentation model for image segmentation). One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to incorporate the teachings of Wang into the method taught by Kansara to segment the object in the first and second frames because such a model provides high segmentation accuracy while maintaining low GPU memory usage and fast inference times.

Regarding claim 19, Kansara discloses a system (Fig. 1 – system 101) comprising: one or more computing platforms (Fig. 1 – system 101 comprising one or more computing platforms such as processor 130 and various modules 104, 106, 108, 110, and 112) configured to: receive frames of a scene, the frames including an object (Fig. 2; [0040] – accessing frames of a video depicting a scene with an object); generate a first bounding box so that the object in a first frame of the frames is within the first bounding box (Fig. 2; [0050]-[0052] – generating an initial spline, which is a bounding box, for the object in a key frame); provide a first prompt based on the first bounding box (Fig. 2; [0054]-[0055] – providing a first prompt as user input to adjust a corresponding spline); segment the object from the first frame based on the first prompt to provide a first mask (Fig. 2; [0054]-[0055] – segmenting the object to generate an adjusted spline to provide a mask for the frame); generate a second bounding box so that the object in a second frame of the frames is within the second bounding box based on the first mask (Fig. 5; [0067]; [0071] – generating a spline based on the mask); provide a second prompt based on the second bounding box (Fig. 2; [0062] – interpolating the adjusted spline in a sequentially proximate frame and providing user’s input to repeat adjustment of the spline); and segment the object from the second frame based on the second prompt to provide a second mask ([0067] – generating a mask on each frame of the video). However, Kansara does not disclose the object is segmented from the first and second frames using a segment anything model (SAM). Wang discloses an object is segmented from frames using a segment anything model (SAM) ([0079] – the image to be processed is input into the FastSAM segmentation model for image segmentation). One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to incorporate the teachings of Wang into the method taught by Kansara to segment the object in the first and second frames because such a model provides high segmentation accuracy while maintaining low GPU memory usage and fast inference times.
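Claims 11 and 18 name FastSAM, MobileSAM, and HQ-SAM as acceptable SAM variants. For the FastSAM case cited from Wang, a usage sketch assuming the Ultralytics FastSAM wrapper; the weights filename, frame path, and box coordinates are placeholders:

```python
from ultralytics import FastSAM

model = FastSAM("FastSAM-s.pt")  # placeholder weights file

# Prompt with a bounding box in (x1, y1, x2, y2) pixel coordinates,
# matching the box-prompt step of the claimed method.
results = model("frame_0001.jpg", bboxes=[120, 80, 360, 420])
mask = results[0].masks.data[0].cpu().numpy()  # binary mask for the prompted region
```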
Regarding claim 20, Kansara in view of Wang also discloses the system of claim 19, the one or more computing platforms configured to generate augmented video comprising a video that has been augmented with a digital asset based on at least the first and second masks ([0028]; [0064]-[0067] – generating an augmented video with masks in each frame of the video and imported splines), the augmented video being generated during film production of a scene, the scene including the object ([0064]-[0067] – video editing is a part of film production).

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to HUNG Q DANG whose telephone number is (571) 270-1116. The examiner can normally be reached IFT. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Thai Q Tran, can be reached at 571-272-7382. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/HUNG Q DANG/
Primary Examiner, Art Unit 2484

Prosecution Timeline

Feb 04, 2025: Application Filed
Jan 29, 2026: Non-Final Rejection under §102 and §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12594460: MANAGING BLOBS FOR TRACKING OF SPORTS PROJECTILES
Granted Apr 07, 2026 (2y 5m to grant)

Patent 12588818: DETECTION OF A MOVABLE OBJECT WHEN 3D SCANNING A RIGID OBJECT
Granted Mar 31, 2026 (2y 5m to grant)

Patent 12592258: METHOD AND APPARATUS FOR INTERACTIVE VIDEO EDITING PLATFORM TO CREATE OVERLAY VIDEOS TO ENHANCE ENTERTAINMENT VIDEO GAMES WITH EDUCATIONAL CONTENT
Granted Mar 31, 2026 (2y 5m to grant)

Patent 12587693: ARTIFICIALLY INTELLIGENT AD-BREAK PREDICTION
Granted Mar 24, 2026 (2y 5m to grant)

Patent 12574649: ENCODING AND DECODING METHOD, ELECTRONIC DEVICE, COMMUNICATION SYSTEM, AND STORAGE MEDIUM
Granted Mar 10, 2026 (2y 5m to grant)
Study what changed to get past this examiner, based on the 5 most recent grants.

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.

Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 68%
With Interview: 87% (+18.3%)
Median Time to Grant: 3y 1m
PTA Risk: Low
Based on 1841 resolved cases by this examiner; grant probability is derived from the career allow rate.
