DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Interpretation
The claims will be read under the broadest reasonable interpretation standard outlined in MPEP § 2111.01.
In line with paragraph 73 of the claimed invention’s specification, the examiner interprets “geometry features” as recited by claim 3 to include sizes and locations of bounding boxes.
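For illustration only, the interpretation above can be made concrete as a simple data structure; the field names and units below are hypothetical and are not drawn from the specification or the record:

```python
from dataclasses import dataclass

# Illustrative sketch only: one plausible encoding of "geometry features"
# under the interpretation above -- the location and size of a detection's
# 3D bounding box. Field names and units are hypothetical.
@dataclass
class GeometryFeatures:
    center_x: float  # bounding-box location (e.g., meters)
    center_y: float
    center_z: float
    length: float    # bounding-box size
    width: float
    height: float
```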
Claim Objections
Claims 2 and 12 are objected to because of the following informalities: the phrase “3D objection detection model” should read “3D object detection model”.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1 and 3-9 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Hung et al., “SoDA: Multi-Object Tracking with Soft Data Association” (hereinafter “Hung”; cited by Applicant).
As to claim 1, Hung discloses a method comprising:
[Reproduced excerpt from Hung: media_image1.png]
receiving, at a current time step, a set of new object detections, each new object detection being data characterizing features of a respective object that has been detected in an environment at the current time step ([3.1]):
[Reproduced excerpts from Hung: media_image2.png, media_image3.png]
maintaining data that identifies one or more object tracks ([3.2]):
[Reproduced excerpt from Hung: media_image4.png]
for each object track, selecting a subset of the new object detections as candidate object detections for the object track ([3.2]; [1]):
[Reproduced excerpts from Hung: media_image5.png, media_image6.png]
for each object track, processing an input derived from the candidate object detections for the object track and a track query feature representation for the object track using a track-detection interaction neural network to generate a respective association score for each candidate object detection; and ([3]; [A]):
[Reproduced excerpts from Hung: media_image7.png through media_image12.png]
determining, for each of the one or more object tracks, whether to associate any of the new object detections with the object track based on the respective association scores for the candidate object detections for the object tracks ([3]):
[Reproduced excerpts from Hung: media_image13.png, media_image12.png]
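For illustration only, the claimed data flow of claim 1 (per-candidate association scores generated from a track query feature and candidate embeddings, followed by an association decision) might be sketched as follows; the architecture, dimensions, and threshold are hypothetical and represent neither Hung’s nor Applicant’s actual implementation:

```python
import torch
import torch.nn as nn

class TrackDetectionInteraction(nn.Module):
    """Hypothetical track-detection interaction network: cross-attends the
    candidate detection embeddings against a track query feature and emits
    one association score per candidate."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.score = nn.Linear(dim, 1)

    def forward(self, track_query, candidates):
        # track_query: (1, 1, dim); candidates: (1, K, dim)
        fused, _ = self.attn(candidates, track_query, track_query)
        return self.score(fused).squeeze(-1)  # (1, K) association scores

net = TrackDetectionInteraction()
scores = net(torch.randn(1, 1, 128), torch.randn(1, 5, 128))
associate = bool(scores.max() > 0.5)  # illustrative association decision
```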
As to claim 3, Hung discloses the method of claim 1, wherein the features comprise geometry and appearance features of the respective object ([B]; [3.1]):
[Reproduced excerpts from Hung: media_image14.png, media_image15.png]
As to claim 4, Hung discloses the method of claim 1, wherein determining, for each of the one or more object tracks, whether to associate any of the new object detections with the object track based on the respective association scores for the candidate object detections for the object tracks comprises:
[Reproduced excerpts from Hung: media_image16.png, media_image17.png]
applying a Hungarian algorithm to the respective association scores for the candidate object detections for the object tracks to assign each new object detection to one of the object tracks or to a new object track ([3.2]; [3.3]).
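For reference, the recited Hungarian assignment over association scores is conventionally computed with an off-the-shelf solver; a minimal sketch follows, with illustrative scores and an illustrative gating threshold:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# scores[i, j]: association score between track i and new detection j
scores = np.array([[0.9, 0.1, 0.3],
                   [0.2, 0.8, 0.1]])

# The Hungarian algorithm minimizes total cost, so negate the scores.
rows, cols = linear_sum_assignment(-scores)

MIN_SCORE = 0.5  # illustrative gating threshold
assigned = {j: i for i, j in zip(rows, cols) if scores[i, j] >= MIN_SCORE}
new_tracks = [j for j in range(scores.shape[1]) if j not in assigned]
print(assigned)    # {0: 0, 1: 1}: detections matched to existing tracks
print(new_tracks)  # [2]: unmatched detection seeds a new track
```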
As to claim 5, Hung discloses the method of claim 4, further comprising:
in response to determining to assign a given new object detection to a new object track, adding the new object track to the maintained data ([3.3]; [3.2]):
[Reproduced excerpts from Hung: media_image3.png, media_image18.png]
As to claim 6, Hung discloses the method of claim 1, further comprising:
processing each new object detection using a detection encoder to generate an embedding of the new object detection ([3.1]):
[Reproduced excerpts from Hung: media_image11.png, media_image19.png, media_image20.png, media_image21.png, media_image14.png, media_image22.png, media_image23.png]
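For illustration only, a detection encoder of the kind recited in claim 6 might be sketched as a small embedding network; the input layout, dimensions, and architecture below are hypothetical:

```python
import torch
import torch.nn as nn

class DetectionEncoder(nn.Module):
    """Hypothetical detection encoder: maps each raw detection feature
    vector (e.g., box geometry plus appearance features) to a fixed-size
    embedding usable by the downstream association step."""
    def __init__(self, in_dim: int = 10, embed_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
        )

    def forward(self, detections: torch.Tensor) -> torch.Tensor:
        return self.net(detections)  # (N, in_dim) -> (N, embed_dim)
```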
As to claim 7, Hung discloses the method of claim 6, wherein the input derived from the candidate object detections for the object track and the track query feature representation for the object track comprises the embeddings of the candidate object detections for the object track and the track query feature representation for the object track (Fig. 6; [3]):
[Reproduced excerpts from Hung: media_image24.png, media_image25.png, media_image26.png]
(As stated in paragraph 56 of the claimed invention’s specification, the track query representation includes a numerical representation of the object’s position.)
As to claim 8, Hung discloses the method of claim 6, further comprising:
generating a new query feature for each object track by processing an input comprising respective embeddings of detections that have been associated with the object track using a temporal fusion neural network ([2]; [1]; [3]; Fig. 3; Fig. 6):
[Reproduced excerpts from Hung: media_image27.png, media_image28.png, media_image29.png, media_image30.png, media_image31.png]
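For illustration only, a temporal fusion network of the kind recited in claim 8 might be sketched as follows; the choice of a Transformer encoder and the pooling step are hypothetical assumptions, not Hung’s or Applicant’s actual design:

```python
import torch
import torch.nn as nn

class TemporalFusion(nn.Module):
    """Hypothetical temporal fusion network: pools the embeddings of the
    detections previously associated with a track into a new track query
    feature for the next time step."""
    def __init__(self, dim: int = 128):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, history: torch.Tensor) -> torch.Tensor:
        # history: (1, T, dim) embeddings of the track's past detections
        fused = self.encoder(history)
        return fused[:, -1]  # (1, dim): new track query feature
```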
As to claim 9, Hung discloses the method of claim 1, further comprising:
processing the feature representation of each object track using a track state decoder neural network to generate a predicted state of the object track at the current time point ([3.1]; [3.2]; Fig. 3):
[Reproduced excerpts from Hung: media_image23.png, media_image32.png, media_image33.png, media_image34.png]
(This claim is read in line with paragraph 61 of the claimed invention’s specification, which states that a state decoder can be a feed-forward neural network trained by optimizing a prediction loss for the initial state predictions.)
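Consistent with that reading, a feed-forward state decoder might be sketched as follows for illustration only; the state layout (an assumed 3D box center plus 3D velocity) and dimensions are hypothetical:

```python
import torch
import torch.nn as nn

class TrackStateDecoder(nn.Module):
    """Hypothetical feed-forward state decoder: predicts a track state
    (here, an assumed 3D box center plus 3D velocity) from the track's
    feature representation."""
    def __init__(self, dim: int = 128, state_dim: int = 6):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(),
            nn.Linear(dim, state_dim),
        )

    def forward(self, track_feature: torch.Tensor) -> torch.Tensor:
        return self.net(track_feature)  # (N, dim) -> (N, state_dim)
```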
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Hung in view of Caesar et al., “nuScenes: A multimodal dataset for autonomous driving” (hereinafter “Caesar”; cited by Applicant).
With respect to claim 2, Hung teaches the method of claim 1, upon which claim 2 depends. Hung does not explicitly teach the method of claim 1, further comprising: receiving a laser sensor spin at the current time step; and applying a 3D object detection model to the laser sensor spin to generate the set of one or more new object detections.
However, Caesar, in the same field of endeavor of object detection in autonomous vehicles, teaches the same ([2.1]; [3]; [4.1]):
[Reproduced excerpts from Caesar: media_image35.png, media_image36.png, media_image37.png, media_image38.png]
(Bottom: PointPillars model as incorporated by Caesar).
It would have been obvious to one of ordinary skill in the art, as of the effective filing date of the claimed invention, to modify Hung to include the lidar and 3D object detection modeling elements of Caesar. The motivation for doing so would be to implement a means to collect data in real time and construct a 3D model of surrounding objects in space for further analysis. Lidar and 3D object detection modeling are readily integrable into the system of Hung with predictable success, as Hung requires an undisclosed source of input data for its analysis.
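For illustration only, the combined teaching (applying a 3D object detection model to a lidar spin to produce new detections) reduces to a simple pipeline; the wrapper function and the model’s call interface below are hypothetical assumptions and do not reflect Caesar’s actual code:

```python
import numpy as np

def detect_objects(lidar_spin: np.ndarray, model):
    """Hypothetical wrapper only: apply a trained 3D object detection model
    (e.g., a PointPillars-style network, as used by Caesar) to one lidar
    spin. `model` and its call interface are assumptions, not Caesar's API."""
    points = lidar_spin.reshape(-1, 4)   # one (x, y, z, intensity) row per return
    return model(points)                 # expected: list of 3D boxes with scores
```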
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Hung in view of Sun et al., “TransTrack: Multiple Object Tracking with Transformer” (hereinafter “Sun”; cited by Applicant).
With respect to claim 10, Hung teaches the elements of claims 1 and 9 upon which claim 10 depends.
It is the examiner’s position that Hung also discloses the additional elements of claim 10, namely the method of claim 9, further comprising:
for each object track, using the predicted state of the object track to select the candidate detections for the object track ([3.1]):
[Reproduced excerpts from Hung: media_image39.png, media_image40.png]
Notwithstanding the above, Sun, in the same field of endeavor of object tracking, also teaches the further limitation of claim 10 ([3.1]):
[Reproduced excerpt from Sun: media_image41.png]
It would have been obvious to one of ordinary skill in the art, as of the effective filing date of the claimed invention, to modify Hung to include the object prediction elements taught by Sun. Such a combination would allow for predictive updating of the detection boxes using information already collected by Hung, with the advantage of stabilizing output results across frames. Like Hung, Sun features a similar system of detection boxing and encoding/decoding, further facilitating the transfer of teachings.
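For illustration only, using a predicted track state to gate candidate detections might be sketched as a simple distance check; the data layout and radius are hypothetical and do not appear in the references:

```python
import numpy as np

def select_candidates(predicted_center, detections, radius=2.0):
    """Hypothetical gating step: keep only new detections whose box centers
    lie within `radius` meters of a track's predicted position."""
    centers = np.array([d["center"] for d in detections])
    dists = np.linalg.norm(centers - predicted_center, axis=1)
    return [d for d, dist in zip(detections, dists) if dist <= radius]

predicted = np.array([1.0, 2.0, 0.0])            # predicted (x, y, z)
dets = [{"center": [1.2, 2.1, 0.0]}, {"center": [9.0, 9.0, 0.0]}]
print(select_candidates(predicted, dets))        # keeps only the nearby detection
```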
Claims 11-20 are rejected under 35 U.S.C. 103 as being unpatentable over Hung in view of Narayanan et al. (US 20200086879 A1) (hereinafter “Narayanan”).
With respect to claim 11, it is functionally identical to claim 1, with the exception that the method is claimed as “a system comprising:
one or more computers; and
one or more storage devices storing instructions that, when executed by the one or more computers, cause the one or more computers to perform operations comprising:”
These limitations are not expressly taught by Hung. However, Narayanan, in the same field of endeavor of object detection and autonomous driving, provides for the same ([0181]; [0043]).
It would have been obvious to one of ordinary skill in the art, as of the effective filing date of the claimed invention, to modify Hung to include the elements of computer implementation as taught by Narayanan. While not explicitly disclosed, Hung implies that a computer is used in the implementation of its methods ([4.1.1]):
[Reproduced excerpt from Hung: media_image42.png]
A person of ordinary skill in the art would understand that computers and code are capable of performing the method of Hung, and would be motivated to combine Hung with Narayanan for increased execution efficiency.
With respect to claims 12-19, they are functionally identical to claims 2-9, with the exception of the underlying computer system of claim 11 (from which they depend). For the reasons discussed in the rejection of claim 11, it would have been obvious to combine Narayanan with Hung. The additional limitations do not counsel against such a combination.
With respect to claim 20, it is functionally identical to claim 1, with the exception that the method is claimed as “one or more computer-readable storage media storing instructions that when executed by one or more computers cause the one or more computers to perform operations comprising:”
These limitations are not expressly taught by Hung. However, Narayanan, in the same field of endeavor of object detection and autonomous driving, provides for the same ([0043]).
It would have been obvious to one of ordinary skill in the art, as of the effective filing date of the claimed invention, to modify Hung to include the elements of code taught by Narayanan. While not explicitly disclosed, Hung implies that a computer and code are used in the implementation of its methods ([4.1.1]).
A person of ordinary skill in the art would understand that computers and code are capable of performing the method of Hung, and would be motivated to combine Hung with Narayanan for increased execution efficiency.
Additional References
Additionally cited references (see attached PTO-892) otherwise not relied upon above have been made of record in view of the manner in which they evidence the general state of the art.
Inquiry
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NOAH WILLIAM BOYAR whose telephone number is (571)272-8392. The examiner can normally be reached 8:30 – 5:00 EST, Monday – Friday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chan Park can be reached at 571-272-7409. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/NOAH W BOYAR/Examiner, Art Unit 2669
/CHAN S PARK/Supervisory Patent Examiner, Art Unit 2669