Prosecution Insights
Last updated: April 19, 2026
Application No. 18/505,732

MACHINE LEARNING MODEL FOR MULTI-CAMERA MULTI-PERSON TRACKING

Non-Final OA: §101, §103
Filed
Nov 09, 2023
Examiner
LIN, JESSICA YIFANG
Art Unit
2668
Tech Center
2600 — Communications
Assignee
NEC Laboratories America Inc.
OA Round
2 (Non-Final)
75%
Grant Probability
Favorable
2-3
OA Rounds
2y 3m
To Grant
99%
With Interview

Examiner Intelligence

Grants 75% — above average
75%
Career Allow Rate
3 granted / 4 resolved
+13.0% vs TC avg
Strong +33% interview lift
+33.3%
Interview Lift
in resolved cases with interview
Typical timeline
2y 3m
Avg Prosecution
29 currently pending
Career history
33
Total Applications
across all art units

Statute-Specific Performance

§101
7.9%
-32.1% vs TC avg
§103
53.5%
+13.5% vs TC avg
§102
32.7%
-7.3% vs TC avg
§112
4.0%
-36.0% vs TC avg
Black line = Tech Center average estimate • Based on career data from 4 resolved cases

Office Action

§101, §103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement

The information disclosure statement (IDS) submitted on November 9, 2023 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. § 101 because the claimed invention is directed to a judicial exception without significantly more. Claim 1 is directed to a method for tracking movement. Under Step 2A Prong One, the claim recites abstract ideas including: (1) Mathematical concepts: "generating scores for pairs of detection images" (mathematical calculations), "generating a pairwise detection graph using the detection images as nodes and the scores as weighted edges" (mathematical relationships and graph theory), and "constrained answer set programming problem" (mathematical algorithms and optimization); (2) Mental processes: "performing person detection" (observation), "combining visual and location information" (evaluation and comparison that can be performed in the human mind or with pen and paper), and "tracking movement of an individual based on ... logical assumptions" (mental process of observation and logical reasoning); (3) Certain methods of organizing human activity: "tracking movement of an individual" (managing information about human behavior and movement).

Under Step 2A Prong Two, the claim does not integrate the judicial exception into a practical application.
The additional elements amount to: (1) generic data sources ("multiple video streams," "frames"), (2) insignificant extra-solution activity ("performing an action responsive to the tracked movement"), and (3) mere instructions to apply the abstract idea in the field of video surveillance (field of use). These elements do not improve the functioning of a computer or other technology, do not effect a particular transformation, and do not apply the abstract idea with a particular machine in a meaningful way.

Under Step 2B, the additional elements do not provide an inventive concept. The elements represent well-understood, routine, conventional activity in the art, including: using conventional person detection techniques, performing routine data combinations, using standard graph data structures, and applying known optimization algorithms. The specification acknowledges using known techniques such as re-identification models (ResNet, ViT networks) and standard camera projection mathematics.

Claims 2-9 depend from Claim 1 and add limitations that are also well-understood, routine, conventional activities that do not amount to significantly more than the judicial exception. Claim 10 recites the same abstract ideas as Claim 1 in system format. The recitation of generic computer components ("hardware processor," "memory") does not integrate the judicial exception into a practical application or provide significantly more. Claims 11-18 depend from Claim 10 and are ineligible for the same reasons as Claims 2-9. Claim 19 recites the same abstract ideas as Claim 1 with the addition of "in a healthcare facility" (field of use limitation) and "generating a report for a healthcare professional for decision making related to a patient's treatment" (insignificant post-solution activity of applying the result). These limitations do not integrate the abstract idea into a practical application or provide significantly more.
Claim 20 depends on Claim 19 and is ineligible for the same reasons.

To overcome this rejection, applicant could:
1. Add specific technical improvements to camera systems or video processing
2. Recite unconventional implementation details of the answer set programming approach
3. Claim specific technical features that provide more than routine data processing
4. Provide evidence that the combination produces unexpected results

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-5, 8-14, and 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Wu (Chinese Patent CN 112131904 A) in view of Suchan, Jakob, et al., "Visual explanation by high-level abduction: On answer-set programming driven reasoning about moving objects," Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 32, No. 1, 2018; Shen, Yantao, et al., "Person re-identification with deep similarity-guided graph neural network," Proceedings of the European Conference on Computer Vision (ECCV), 2018; and Buehler et al. (European Patent EP-1563686-B1).

Regarding claim 1, Wu discloses a method for tracking movement, comprising: performing person detection in frames from multiple video streams to identify detection images (Wu, Abstract). However, Wu fails to teach combining visual and location information from the detection images to generate scores for pairs of detection images across the multiple video streams and across frames of respective video streams; generating a pairwise detection graph using the detection images as nodes and the scores as weighted edges; tracking movement of an individual based on a constrained answer set programming problem, with constraints determined based on matching scores and logical assumptions; and performing an action responsive to the tracked movement.
Shen teaches combining visual and location information from the detection images to generate scores for pairs of detection images across the multiple video streams and across frames of respective video streams (Shen, 3.1 Graph Formulation and Node Features, by inputting an image pair into a Siamese-CNN for pairwise relation feature encoding), and generating a pairwise detection graph using the detection images as nodes and the scores as weighted edges (Shen, 3.2 Similarity-Guided Graph Neural Network, establishing edges E on graph G where the node features are updated as a weighted addition fusion).

Suchan teaches tracking movement of an individual based on a constrained answer set programming problem (Suchan, A Hybrid Architecture for Visual Explanation based on the integration of high-level abductive reasoning within Answer Set Programming (ASP)), with constraints determined based on matching scores and logical assumptions (Suchan, Figure 2, People Movement, Beliefs as (Spatial) constraints).

Buehler et al. teaches performing an action responsive to the tracked movement (Buehler et al., [0104]).

Shen is analogous to the claimed invention because it is reasonably pertinent to the problem of person re-identification, which aims at finding the person of interest in a set of images across different cameras without error in intelligent surveillance systems. Suchan is analogous to the claimed invention because it addresses the answer set programming problem for visual perception and object tracking of moving people in the setting of movies. Buehler et al. is analogous to the claimed invention because it pertains to surveillance, security, public safety, and tracking human movement.
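To make the recited graph construction concrete, the following is a minimal, purely illustrative sketch. The detections, score function, weights, and threshold are all invented for illustration and reflect neither the application's nor any cited reference's actual implementation; only the overall shape (detection images as nodes, fused pairwise scores as weighted edges) tracks the claim language.

```python
from itertools import combinations

def build_detection_graph(detections, score_fn, threshold=0.5):
    """Nodes are detection images; weighted edges are pairwise
    match scores that clear a threshold."""
    nodes = list(detections)
    edges = {}
    for a, b in combinations(nodes, 2):
        s = score_fn(detections[a], detections[b])
        if s >= threshold:
            edges[(a, b)] = s  # weighted edge between two detections
    return nodes, edges

def score(d1, d2):
    # Hypothetical fusion of visual similarity and location proximity
    visual = 1.0 - abs(d1["feat"] - d2["feat"])
    dist_sq = sum((p - q) ** 2 for p, q in zip(d1["pos"], d2["pos"]))
    location = 1.0 / (1.0 + dist_sq)
    return 0.5 * visual + 0.5 * location

# Toy detections keyed by camera/frame/detection index, each with a
# scalar appearance feature and a ground-plane position
detections = {
    "c1_f1_d0": {"feat": 0.9, "pos": (0.0, 0.0)},
    "c2_f1_d0": {"feat": 0.8, "pos": (0.5, 0.0)},
    "c1_f2_d0": {"feat": 0.1, "pos": (9.0, 9.0)},
}

nodes, edges = build_detection_graph(detections, score)
# The two nearby, visually similar detections are linked; the distant,
# dissimilar detection remains unconnected.
```

In a real multi-camera tracker the appearance feature would be a high-dimensional re-identification embedding rather than a scalar, but the graph-building step is structurally the same.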
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the method for video surveillance of Wu to incorporate the teachings of Shen, Suchan, and Buehler et al. so that the solution of the claimed invention is fully addressed and to have a more robust and accurate person image feature representation.

Regarding claim 2, the combination of Wu, Suchan, Shen, and Buehler et al. discloses the method of claim 1. Buehler et al. further teaches synchronizing the multiple video streams to identify temporal correspondences between frames of the multiple video streams (Buehler et al., Figures 1, 5). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have included the teachings of Buehler et al. with the teachings of Wu, Suchan, and Shen so that the multiple video streams are observed for a substantially long period of recorded time.

Regarding claim 3, the combination of Wu, Shen, Suchan, and Buehler et al. discloses the method of claim 1. Buehler et al. further teaches extracting the visual information based on a visual similarity between detection images (Buehler et al., Figures 6A, 6B, 7A, 7B). It is important to the claimed invention to have identified a target of visual similarity that defines the human of interest. Thus, it would have been obvious to one skilled in the art prior to the effective filing date of the claimed invention to have included the teachings of Buehler et al. with the combination of Wu, Shen, and Suchan so that there is an objective target identified.

Regarding claim 4, the combination of Wu, Shen, Suchan, and Buehler et al. discloses the method of claim 1.
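The synchronization limitation discussed for claim 2 amounts to aligning frames across streams by time. As a toy sketch (the timestamps, tolerance, and nearest-neighbor pairing rule are hypothetical and not drawn from Buehler or the application):

```python
def synchronize(stream_a, stream_b, tolerance=0.05):
    """Pair each timestamp in stream_a with the closest timestamp
    in stream_b, keeping only pairs within the tolerance (seconds)."""
    pairs = []
    for ta in stream_a:
        tb = min(stream_b, key=lambda t: abs(t - ta))
        if abs(tb - ta) <= tolerance:
            pairs.append((ta, tb))
    return pairs

# Two 10 fps cameras with slightly offset clocks; the last cam1
# frame has no counterpart within tolerance and is left unpaired.
cam1 = [0.00, 0.10, 0.20, 0.30]
cam2 = [0.02, 0.12, 0.22, 0.55]
correspondences = synchronize(cam1, cam2)
```

The resulting temporal correspondences are what allow cross-camera edges to be drawn between detections "at corresponding times," as recited in claim 5.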
Suchan further teaches extracting the location information based on a projection of two-dimensional coordinates into a three-dimensional environment for the detection images and determining a distance between the projected coordinates (Suchan, Ontology: Space, Time, Objects, Events, where the tracks of the objects are measured in basic spatial 2D and 3D space coordinates). It is important to the claimed invention to have quantified distance between 2D coordinates in a 3D environment. Thus, it would have been obvious to one skilled in the art prior to the effective filing date of the claimed invention to have included the teachings of Suchan with the teachings of Wu, Shen, and Buehler et al. so that the distance between two defined coordinates between people in a scene is measured and recorded.

Regarding claim 5, the combination of Wu, Suchan, Shen, and Buehler et al. discloses the method of claim 1. Buehler et al. further teaches wherein generating the pairwise detection graph includes determining edges between detection images from different frames of a same video stream and determining edges between detection images from different video streams at corresponding times (Buehler et al., Figures 6A, 6B, 7A, 7B). It is critical to the claimed invention to produce a graph from pairwise detection results so that the identification of people can be visualized. Thus, it would have been obvious to one skilled in the art prior to the effective filing date of the claimed invention to have included the teachings of Buehler et al. so that human surveillance solutions can be fully utilized.

Regarding claim 10, the rejection analysis of claim 1 is incorporated herein. The combination of Wu, Suchan, Shen, and Buehler et al. also teaches a system for tracking movement, comprising: a hardware processor (Buehler et al., [0042]); and a memory that stores a computer program which, when executed by the hardware processor, causes the hardware processor to (Buehler et al., [0047], lines 47-48): perform person detection in frames from multiple video streams to identify detection images (Wu, Abstract); combine visual and location information from the detection images to generate scores for pairs of detection images across the multiple video streams and across frames of respective video streams (Shen, 3.1 Graph Formulation and Node Features, by inputting an image pair into a Siamese-CNN for pairwise relation feature encoding); generate a pairwise detection graph using the detection images as nodes and the scores as weighted edges (Shen, 3.2 Similarity-Guided Graph Neural Network, establishing edges E on graph G where the node features are updated as a weighted addition fusion); track movement of an individual based on a constrained answer set programming problem (Suchan, A Hybrid Architecture for Visual Explanation based on the integration of high-level abductive reasoning within Answer Set Programming (ASP)), with constraints determined based on matching scores and logical assumptions (Suchan, Figure 2, People Movement, Beliefs as (Spatial) constraints); and perform an action responsive to the tracked movement (Buehler et al., [0104]).

Shen is analogous to the claimed invention because it is reasonably pertinent to the problem of person re-identification, which aims at finding the person of interest in a set of images across different cameras without error in intelligent surveillance systems. Suchan is analogous to the claimed invention because it addresses the answer set programming problem for visual perception and object tracking of moving people in the setting of movies. Buehler et al. is analogous to the claimed invention because it pertains to surveillance, security, public safety, and tracking human movement.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the method for video surveillance of Wu to incorporate the teachings of Shen, Suchan, and Buehler et al. so that the solution of the claimed invention is fully addressed and to have a more robust and accurate person image feature representation.

Regarding claim 11, the combination of Wu, Suchan, Shen, and Buehler et al. teaches the system of claim 10, wherein the computer program further causes the hardware processor to synchronize the multiple video streams to identify temporal correspondences between frames of the multiple video streams (Buehler et al., Figures 7A, 7B). It is important to the claimed invention to have the complete computer program and hardware necessary to carry out the tasks disclosed. Thus, it would have been obvious to one skilled in the art prior to the effective filing date of the claimed invention to have included the teachings of Buehler et al., which also include a system of CCTV surveillance for security purposes.

Regarding claim 12, the combination of Wu, Suchan, Shen, and Buehler et al. teaches the system of claim 10, wherein the computer program further causes the hardware processor to extract the visual information based on a visual similarity between detection images (Buehler et al., [0089]). It is important to the claimed invention to have the complete computer program and hardware necessary to carry out the tasks disclosed. Thus, it would have been obvious to one skilled in the art prior to the effective filing date of the claimed invention to have included the teachings of Buehler et al., which also include a system of CCTV surveillance for security purposes.

Regarding claim 13, the combination of Wu, Suchan, Shen, and Buehler et al. further discloses the system of claim 10, wherein the computer program further causes the hardware processor to extract the location information based on a projection of two-dimensional coordinates into a three-dimensional environment for the detection images and determining a distance between the projected coordinates (Suchan, Ontology: Space, Time, Objects, Events, where the tracks of the objects are measured in basic spatial 2D and 3D space coordinates). It is important to the claimed invention to have quantified distance between 2D coordinates in a 3D environment. Thus, it would have been obvious to one skilled in the art prior to the effective filing date of the claimed invention to have included the teachings of Suchan with the teachings of Wu, Shen, and Buehler et al. so that the distance between two defined coordinates between people in a scene is measured and recorded.

Regarding claim 14, the combination of Wu, Suchan, Shen, and Buehler et al. further teaches the system of claim 10, wherein the computer program further causes the hardware processor to determine edges between detection images from different frames of a same video stream and to determine edges between detection images from different video streams at corresponding times (Buehler, Figure 4). The determination of edges between detection images is an important solution to the claimed invention because it reduces error in overlapping video streams with multiple people. Thus, it would have been obvious to one skilled in the art prior to the effective filing date to have included the teachings of Buehler et al. with the teachings of Wu, Shen, and Suchan to include the solution for edge determination across multiple video frames.

Regarding claim 17, the combination of Wu, Suchan, Shen, and Buehler et al.
further teaches the system of claim 10, wherein the computer program further causes the hardware processor to add an output from a visual branch to an output of a location branch to combine the visual and location information (Buehler et al., Figures 1, 4, 15). This is an important aspect for person identification based on time and location, especially related to personal security. Thus, it would have been obvious to one skilled in the art prior to the effective filing date to have included the teachings of Buehler et al. so that the information collected is matched to the correct person of interest.

Regarding claim 8, the combination of Wu, Shen, Suchan, and Buehler et al. discloses the method of claim 1. Shen further teaches wherein combining the visual and location information includes adding an output from a visual branch to an output of a location branch (Shen, 2.2 (Graph for Machine Learning): "After the message propagation among different nodes (samples), the mapping function will output the classification or regression results of each node"). Shen is considered analogous to the claimed invention because it is reasonably pertinent to the problem of person re-identification, which aims at finding the person of interest in a set of images across different cameras without error in intelligent surveillance systems. It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the method for video surveillance using CAS with greater accuracy of Buehler et al. to incorporate the teachings of Shen by including the re-identification solution of having a more robust and accurate person image feature representation through fusion weights for updating the nodes' features (Shen, 2.2 (Graph for machine learning)).
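The additive fusion recited in claims 8 and 17 can be sketched in a few lines. The branch outputs and optional weights below are hypothetical values for illustration; the claim itself recites simply adding the visual-branch output to the location-branch output.

```python
def fuse_scores(visual_score, location_score, w_visual=1.0, w_location=1.0):
    """Combine the visual-branch output with the location-branch
    output by (optionally weighted) addition."""
    return w_visual * visual_score + w_location * location_score

visual = 0.72    # e.g., re-identification embedding similarity for a pair
location = 0.31  # e.g., proximity of the pair's projected ground positions
combined = fuse_scores(visual, location)            # plain addition, as claimed
weighted = fuse_scores(visual, location, 0.7, 0.3)  # hypothetical weighting
```

Unit addition as the fusion rule keeps the combined score interpretable, since either branch alone can dominate only when its own evidence is strong.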
Regarding claim 9, Shen in the combination also teaches the method of claim 8, wherein the visual branch includes processing the detection images with a re-identification model (Shen, Abstract).

Regarding claim 18, the rejection of claim 9 is incorporated herein.

Claims 6-7, 15-16, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Wu (Chinese Patent CN 112131904 A) in view of Suchan, Jakob, et al., "Visual explanation by high-level abduction: On answer-set programming driven reasoning about moving objects," Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 32, No. 1, 2018; Shen, Yantao, et al., "Person re-identification with deep similarity-guided graph neural network," Proceedings of the European Conference on Computer Vision (ECCV), 2018; and Buehler et al. (European Patent EP-1563686-B1), as applied to claims 1 and 10 above, and further in view of Johnson et al. (United States Patent US 12,106,654 B2) and Ross (United States Patent Application Publication US 2022/0022006 A1).

Regarding claim 19, which overlaps substantially in scope with claim 1 except for "tracking movement in a healthcare facility" and "generating a report for a healthcare professional for decision-making related to a patient's treatment, based on the tracked movement," the rejection of claim 1 based on the combination of Wu, Shen, Suchan, and Buehler is incorporated herein. Wu, Shen, Suchan, and Buehler do not teach the limitation "tracking movement in a healthcare facility" as further recited. However, Johnson et al. teaches a method for tracking movement in a healthcare facility, comprising performing person detection in frames from multiple video streams in a healthcare facility to identify detection images (Johnson et al., Abstract, claim 9), and generating a report for a healthcare professional for decision-making related to a patient's treatment, based on the tracked movement (Johnson et al., Col. 7, lines 62-67; Col. 8, lines 5-23). Johnson is analogous to the claimed invention because it is pertinent to the problem of patient monitoring in a healthcare facility for fall reduction. One of ordinary skill in the art before the effective filing date of the claimed invention would have found it obvious to incorporate the teaching of Johnson for the benefit of fall mitigation of tracked patients in a healthcare facility.

The combination of Wu, Shen, Suchan, Buehler, and Johnson fails to teach "generating a report for a healthcare professional for decision-making related to a patient's treatment, based on the tracked movement" as further recited. However, Ross teaches wherein the action includes generating a report for a healthcare professional for decision-making related to a patient's treatment, based on tracked movement of the patient (Ross, Figs. 1, 3, Abstract). Ross is considered analogous to the claimed invention because it is reasonably pertinent to the problem of providing safety and security for patients receiving treatment and will allow healthcare professionals to make an informed treatment decision based on recorded movement data.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Ross for the benefit of generating a report for a healthcare professional for decision-making related to a patient's treatment, based on tracked movement of the patient (Ross, Figs. 1, 3, Abstract).

Regarding claim 20, Buehler et al. in the combination further teaches the method of claim 19, wherein generating the pairwise detection graph includes determining edges between detection images from different frames of a same video stream (Buehler et al., Figs. 4, 5) and determining edges between detection images from different video streams at corresponding times (Buehler et al., Fig. 4, [0057]).

Regarding claims 6, 7, 15, and 16, the rejection of claims 19-20 above is fully incorporated herein.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JESSICA YIFANG LIN, whose telephone number is (571) 272-6435. The examiner can normally be reached M-F 7:00am-6:15pm, with optional day off. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Vu Le, can be reached at (571) 272-7332. The fax phone number for the organization where this application or proceeding is assigned is (571) 273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov.
Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JESSICA YIFANG LIN/
Examiner, Art Unit 2668
February 26, 2026

/VU LE/
Supervisory Patent Examiner, Art Unit 2668

Prosecution Timeline

Nov 09, 2023
Application Filed
Nov 06, 2025
Non-Final Rejection — §101, §103
Feb 09, 2026
Response Filed
Feb 26, 2026
Non-Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12597139
CONTROLLING AN ALERT SIGNAL FOR SPECTRAL COMPUTED TOMOGRAPHY IMAGING
2y 5m to grant • Granted Apr 07, 2026
Study what changed to get past this examiner. Based on the most recent grant.


Prosecution Projections

2-3
Expected OA Rounds
75%
Grant Probability
99%
With Interview (+33.3%)
2y 3m
Median Time to Grant
Moderate
PTA Risk
Based on 4 resolved cases by this examiner. Grant probability derived from career allow rate.
