DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-13 are rejected under 35 U.S.C. 103 as being unpatentable over Huang et al. (US 2021/0192182 A1), hereinafter "Huang," in view of Park (KR 20220070624 A, "Real user detection method and system in the observation scene"), and further in view of Lin et al. (TW 202223737 A, "Object matching and identification method and system thereof"), hereinafter "Lin."
Regarding claim 1, Huang discloses a processing system; and memory comprising computer executable instructions (Please note paragraph 0007. As indicated, an apparatus for object detection is described. The apparatus may include a processor, memory coupled with the processor, and instructions stored in the memory. The instructions may be executable by the processor to cause the apparatus to receive an image.) that, when executed, perform operations comprising: detecting an object in a current frame of image content, wherein detecting the object comprises creating multiple candidate bounding boxes for the object, each candidate bounding box comprising at least a portion of the object (Please note figure 2, in correlation to paragraph 0064. As indicated, the devices 105 may output a classification score 220-b, a bounding box location 225-b, a number of object landmarks 230-a, and an up-right determination 235-a. For example, the devices 105, using the second network 210, may regress (e.g., refine) the confidence score of the candidate objects 240 (e.g., the candidate object region) and the bounding box locations 225-b, detect the object landmarks 230-a, and determine whether the candidate objects 240 are up-right.).
Huang does not expressly disclose comparing the multiple candidate bounding boxes to a predicted bounding box for the object, wherein the predicted bounding box is generated based on a current track for the object, the current track including a first previous frame comprising the object.
Park discloses comparing the multiple candidate bounding boxes to a predicted bounding box for the object, wherein the predicted bounding box is generated based on a current track for the object, the current track including a first previous frame comprising the object (Please note paragraph 0065. As indicated, when an object is detected, "object tracking" is performed using bounding box information. "Object tracking" calculates the predicted position of the object, compares it with the input bounding box position, and allows the object to be continuously tracked and distinguished through matching between positions.).
Huang and Park are combinable because they are from the same field of endeavor.
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to incorporate into Huang's invention Park's teaching of comparing the multiple candidate bounding boxes to a predicted bounding box for the object, wherein the predicted bounding box is generated based on a current track for the object, the current track including a first previous frame comprising the object.
The suggestion/motivation for doing so would have been, as indicated in paragraph 0065, "In this process, new objects appearing in the scene can also be identified."
Huang and Park do not expressly disclose, based on the comparing, filtering the multiple candidate bounding boxes, wherein the filtering identifies a representative bounding box from the multiple candidate bounding boxes, the representative bounding box being a closest match to the predicted bounding box; and adding the current frame comprising the representative bounding box to the current track.
Lin discloses, based on the comparing, filtering the multiple candidate bounding boxes, wherein the filtering identifies a representative bounding box from the multiple candidate bounding boxes, the representative bounding box being a closest match to the predicted bounding box; and adding the current frame comprising the representative bounding box to the current track (Please note paragraph 72. As indicated, a second bounding box is generated to mark the position and range of the object in the second image MG2, and the second bounding box is input into the bounding box association module 150 (refer to steps S215 and S216); the bounding box association module 150 determines the similarity between the predicted bounding box and the second bounding box, and based on whether the matching result indicates the same object or different objects, the number of objects in the matching area 111 can be obtained. The sum of the number of objects that do not appear in the matching area 111 and the number of objects in the matching area 111 is then calculated to obtain the total number of objects (refer to steps S217, S218, and S219).).
Huang, Park, and Lin are combinable because they are from the same field of endeavor.
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to incorporate Lin's teaching of filtering the multiple candidate bounding boxes based on the comparing, wherein the filtering identifies a representative bounding box from the multiple candidate bounding boxes, the representative bounding box being a closest match to the predicted bounding box, and adding the current frame comprising the representative bounding box to the current track, into the combined invention of Huang and Park.
The suggestion/motivation for doing so would have been, as indicated in paragraph 72, "The above-mentioned object matching and identification method can avoid the problem that the number of objects in the matching area is repeatedly counted because it is determined to be different objects."
Therefore, it would have been obvious to combine Huang and Park with Lin to obtain the invention as specified in claim 1.
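Solely for the applicant's convenience, the following is an illustrative sketch, in Python, of the general technique addressed by the limitation discussed above: comparing multiple candidate bounding boxes to a predicted bounding box and keeping the closest match as the representative box. The sketch is not drawn from Huang, Park, or Lin; the similarity measure (box-center distance), function names, and example values are hypothetical.

# Illustrative sketch only; not drawn from Huang, Park, or Lin.
# Each box is an (x_min, y_min, x_max, y_max) tuple.

def box_center(box):
    # Center point of an axis-aligned bounding box.
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)

def center_distance(box_a, box_b):
    # Euclidean distance between box centers (one possible, hypothetical match measure).
    (ax, ay), (bx, by) = box_center(box_a), box_center(box_b)
    return ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5

def select_representative_box(candidate_boxes, predicted_box):
    # Filter the candidates down to the single box closest to the predicted box.
    return min(candidate_boxes, key=lambda box: center_distance(box, predicted_box))

# Hypothetical usage: the selected box (and the current frame) would then be
# added to the object's current track.
candidates = [(10, 10, 50, 50), (12, 11, 52, 49), (200, 200, 240, 240)]
predicted = (12, 10, 52, 50)
representative = select_representative_box(candidates, predicted)  # (12, 11, 52, 49)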
Regarding claim 2, Park discloses, prior to detecting the object in the current frame, receiving the image content at an image tracking system trained to detect objects of interest and track movement of the objects of interest over a time period (Please note paragraph 56. As indicated, objects classified as "users" are tracked every frame, and their location and identity are updated. Tracking continues until the object leaves the observation scene through the set entry/exit area.), wherein the image tracking system is stored in the memory of the system (Please note paragraph 70. As indicated, the process of "creating user object information" stores information on objects classified as "users" and associates face object information with person object information.).
Regarding claim 3, Park discloses wherein the objects of interest are predefined to include specified classes of objects and an object detector of the image tracking system is trained to detect the specified classes of objects (Please note paragraph 69. As indicated, when the object classified as "user" is a face object, "face authentication" is performed on the face object bounding box area. "Face authentication" checks whether a face object matches the previously registered user's face information, and matches the identity information of the face with the face object.).
Regarding claim 4, Park discloses at least one of: modifying color formatting of the frame; modifying an aspect ratio of the frame; or performing filtering or segmentation of the frame (Please note paragraph 80. As indicated, a method for identifying a real user in a screen is presented using artificial intelligence technologies such as semantic object segmentation, object detection, and object tracking.).
Regarding claim 5, Huang discloses assigning a confidence score to each candidate bounding box of the multiple candidate bounding boxes (Please note paragraph 0010. As indicated, some examples of the method, apparatuses, and non-transitory computer-readable medium described herein for determining the confidence score may include operations, features, means, or instructions for determining the confidence score during the first pass.).
Regarding claim 6, Huang discloses wherein the confidence score represents a probability that the object belongs to a particular object class (Please note paragraph 0015. As indicated, some examples of the method, apparatuses, and non-transitory computer-readable medium described herein may further include operations, features, means, or instructions for determining a classification score of the object recognition information including one or more of the candidate object in the image or the candidate bounding box associated with the candidate object in the image, and determining an additional classification score of the additional object recognition information including one or more of the additional candidate objects in the image or the additional candidate bounding boxes associated with the additional candidate objects in the image, where processing the image may be based on one or more of the classification score or the additional classification score.).
Regarding claim 7, Park discloses determining a predicted location of the object in the current frame based on a first location of the object in the first previous frame (Please note paragraph 0065. As indicated, when an object is detected, "object tracking" is performed using bounding box information. "Object tracking" calculates the predicted position of the object, compares it with the input bounding box position, and allows the object to be continuously tracked and distinguished through matching between positions.).
Regarding claim 8, Huang discloses wherein the current track further includes a second previous frame comprising the object, the second previous frame being sequentially prior to the first previous frame, and determining the predicted location of the object in the current frame comprises determining a motion path of the object based on the first location of the object in the first previous frame and a second location of the object in the second previous frame (Please note paragraph 0106. As indicated, the devices 105, using the output network 500, may process the input image region by merging the number of candidate objects 240 based on the detection branch 310 the devices 105 may have used to detect the candidate objects 240 (e.g., within the detection branch groups as determined by the devices 105 at the second stage). In some examples, the devices 105 may merge a group of candidate objects 240 by performing NMS on a group of candidate objects 240 detected by one detection branch 310. In some examples, after merging candidate objects 240 within each detection branch group, the devices 105 (e.g., using the output network 500) may merge (e.g., may perform a final NMS) for all of the predicted bounding boxes.).
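Solely for illustration, the constant-velocity prediction described in claims 7 and 8 (predicting the object's location in the current frame from its locations in two previous frames of the track) might be sketched as follows; the sketch is not drawn from the cited references, and the function name and example coordinates are hypothetical.

# Illustrative sketch only; not drawn from the cited references.
# A simple constant-velocity prediction from two prior frames of a track.

def predict_box(prev_box, prev_prev_box):
    # Extrapolate the next box by assuming the per-frame displacement
    # between the two previous frames repeats (constant-velocity motion).
    delta = [a - b for a, b in zip(prev_box, prev_prev_box)]
    return tuple(a + d for a, d in zip(prev_box, delta))

# Hypothetical usage: the object moved +5 px in x between the two prior
# frames, so the predicted box is shifted another +5 px in x.
predicted = predict_box((15, 10, 55, 50), (10, 10, 50, 50))
# predicted == (20, 10, 60, 50)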
Regarding claim 9, Lin discloses determining an Intersection over Union (IoU) score for each of the multiple candidate bounding boxes, the IoU score indicating an amount of similarity between the predicted bounding box and a respective candidate bounding box (Please note paragraph 62. As indicated, the bounding box association module 150, for example, uses the Intersection over Union (IoU) association method to perform matching, and calculates the coverage ratio of the predicted bounding box of the candidate object within the bounding boxes of other cameras to complete matching tasks.).
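Solely for illustration, a minimal Intersection over Union (IoU) computation of the kind referenced in claim 9 is sketched below; it is not drawn from Lin, and the function name and example boxes are hypothetical.

# Illustrative sketch only; not drawn from Lin.
# Intersection over Union (IoU) of two axis-aligned boxes
# given as (x_min, y_min, x_max, y_max) tuples.

def iou(box_a, box_b):
    # Intersection rectangle, clipped to zero when the boxes do not overlap.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Hypothetical usage: identical boxes score 1.0; disjoint boxes score 0.0.
score = iou((0, 0, 10, 10), (5, 0, 15, 10))  # == 1/3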
Regarding claim 10, Huang discloses wherein filtering the multiple candidate bounding boxes using a filtering algorithm to remove bounding boxes from the multiple candidate bounding boxes until one bounding box for the object remains, the one bounding box being the representative bounding box (Please note paragraph 0041. As indicated, the multimedia manager 135 may also be configured to provide multimedia enhancements, multimedia restoration, multimedia analysis, multimedia compression, multimedia streaming, and multimedia synthesis, among other functionality. For example, the multimedia manager 135 may perform white balancing, cropping, scaling (e.g., multimedia compression), adjusting a resolution, multimedia stitching, color processing, multimedia filtering, spatial multimedia filtering, artifact removal, frame rate adjustments, multimedia encoding, multimedia decoding, and multimedia filtering. By further example, the multimedia manager 135 may process multimedia data to support a two-pass omni-directional object detection, according to the techniques described herein. For example, the multimedia manager 135 may employ the machine learning component 140 to process content of the application 130.).
Regarding claim 11, Huang discloses wherein the filtering algorithm is a Non-Maximum Suppression (NMS) algorithm (Please note paragraph 0061. As indicated, for each detection branch, the devices 105 (e.g., using the first network 205) may merge the candidate objects 240 (e.g., using non-max suppression (NMS)).).
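Solely for illustration, a minimal greedy Non-Maximum Suppression routine of the kind referenced in claims 10 and 11 is sketched below; it is not drawn from Huang, and the function names, threshold value, and box format are hypothetical.

# Illustrative sketch only; not drawn from Huang.
# Greedy Non-Maximum Suppression: repeatedly keep the highest-scoring box and
# discard remaining boxes that overlap it above a threshold.

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    # boxes: list of (x_min, y_min, x_max, y_max) tuples; scores: parallel list.
    def iou(a, b):
        ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
        iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
        inter = ix * iy
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union if union > 0 else 0.0

    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    while order:
        best = order.pop(0)          # keep the highest-scoring remaining box
        kept.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return [boxes[i] for i in kept]

# Hypothetical usage: two heavily overlapping boxes collapse to the higher-scoring one.
kept = non_max_suppression([(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)],
                           [0.9, 0.8, 0.7])
# kept == [(0, 0, 10, 10), (50, 50, 60, 60)]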
Regarding claim 12, Huang discloses wherein the current track represents a motion or a position of the object through multiple frames of the image content (Please note paragraph 0042. As indicated, the machine learning component 140 may perform learning-based object recognition processing on content (e.g., multimedia content, such as image frames or video frames) of the application 130 to support a two-pass omni-directional object detection according to the techniques described herein.).
Regarding claim 13, Huang discloses at least one of: adding a confidence score for the representative bounding box to the current track; or adding metadata for the representative bounding box to the current track (Please note paragraph 0018. As indicated, determining a second confidence score associated with one or more of the candidate object in the image, the candidate bounding box associated with the candidate object in the image, the one or more object features of the candidate object in the image, or an orientation of the candidate object in the image, or a combination thereof during the second pass, comparing the confidence score and the second confidence score, and outputting a representation of one or more of the confidence score or the second confidence score based on the comparing, where processing the image includes outputting one or more of the confidence score or the second confidence score.).
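Solely for illustration, a minimal track record of the kind referenced in claims 12 and 13 (to which each frame's representative bounding box, confidence score, and metadata can be added) is sketched below; it is not drawn from Huang, and all names and example values are hypothetical.

# Illustrative sketch only; not drawn from Huang.
# A minimal track record to which each frame's representative bounding box,
# its confidence score, and any metadata can be added.

from dataclasses import dataclass, field

@dataclass
class Track:
    object_id: int
    entries: list = field(default_factory=list)  # one entry per frame of the track

    def add_frame(self, frame_index, box, confidence=None, metadata=None):
        # Append the current frame's representative box to the current track.
        self.entries.append({
            "frame": frame_index,
            "box": box,                   # (x_min, y_min, x_max, y_max)
            "confidence": confidence,     # optional score for the representative box
            "metadata": metadata or {},   # optional additional information
        })

# Hypothetical usage:
track = Track(object_id=1)
track.add_frame(0, (10, 10, 50, 50), confidence=0.92, metadata={"class": "person"})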
Examiner’s Note
The examiner cites particular figures, paragraphs, columns and line numbers in the references as applied to the claims for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claims, other passages and figures may apply as well.
It is respectfully requested that, in preparing responses, the applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the examiner.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AMIR ALAVI whose telephone number is (571)272-7386. The examiner can normally be reached on M-F from 8:00-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vu Le, can be reached at (571)272-7332. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format.
For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/AMIR ALAVI/Primary Examiner, Art Unit 2668 Thursday, December 18, 2025