DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 04/20/2026 has been entered.
Status of Claims
Claims 1-20 are pending. Claims 1, 6-8, 13-14 and 19-20 have been amended.
Response to Arguments
Applicant’s amendment of independent claims 1, 8 and 14 has altered the scope of the claims of the instant application and has necessitated the new ground(s) of rejection presented in this Office action. Accordingly, because Applicant’s arguments are directed solely to the amended portions of the claims, new analyses are presented below, which render Applicant’s arguments moot.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1, 2, 4, 6-9, 11, 13-15, 17, 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (US 2021/0302992 A1), in view of Lee et al. (US 2021/0358296 A1) and in further view of Awasthi et al. (US 2023/0224582 A1).
Regarding claim 1, Chen teaches, A system for (Chen, ¶0019: “a control system for controlling a motion of a vehicle”) detecting and tracking objects, (Chen, ¶0062: “detected bounding boxes are then fed into an object tracker to identify the one or more objects”) the system comprising: a processor; (Chen, ¶0039: “control system 100 includes an image processor”) and a memory storing machine-readable instructions (Chen, ¶0042: “The control system 100 includes a memory 108 that stores instructions”) that, when executed by the processor, cause the processor to: (Chen, ¶0042: “controller 104 may be configured to execute the stored instructions in order to control operations”) extract first features from time-sequential perceptual sensor data to generate (Chen, ¶0094: “multi-head neural network 110 executes feature extraction operation”) a first set of bird’s-eye-view (BEV) feature images; (Chen, ¶0007: “environmental state is detected based on a bird's eye view (BEV) map”) extract second features from the first set of BEV feature images (Chen, ¶0055: “extract the spatio-temporal features from the BEV maps”) using a three-dimensional (3D) detection backbone (Chen, ¶0017: “system includes three parts: (1) data representation from raw 3D point clouds to BEV maps; (2) spatio-temporal pyramid network as a backbone”) to generate a second set of BEV feature images, (Chen, ¶0104: “generates the extended BEV image”) wherein each BEV feature image in the second set of BEV feature images corresponds to a distinct time step in the time-sequential perceptual sensor data; (Chen, ¶0104: “extended BEV image includes… a position of the pixel in the extended BEV image at a current time step… a time sequence of future positions of the pixel in subsequent time steps”) and consume the second set of BEV feature images (Chen, ¶0018: “the outputs of the three heads are provided to a motion planner”) using a neural-network 3D detection head that is trained (Chen, ¶0013: “The entire multi-head neural network is trained in an end-to-end manner”) or generating automatically labeled perception data to train one or more of an online perception model, an online prediction model, or an online planning model used to control an autonomous robot. (Chen, ¶0019: “a motion planner configured to produce a motion trajectory of a vehicle using the extended BEV image; and a controller configured to control an actuator of the vehicle based on the motion trajectory”). However, Chen does not explicitly teach, trained with a similarity objective based on comparisons between feature vectors corresponding to pairs of object instances in a same time step or in different time steps to support an object tracker.
In an analogous field of endeavor, Lee teaches, trained with a similarity objective (Lee, ¶0107: “two birds-eye view images 531, 532 are aggregated to train classifiers for the features. Aggregator 552 may be configured to group similar features (in the form of pillars) together and represent them as a single feature”) based on comparisons between feature vectors (Lee, ¶0084: “features can be compared across all images”; ¶0109: “An input vector of aggregator 552 may include some or all features directly from two birds-eye view images”).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Chen using the teachings of Lee to introduce training with a similarity objective. A person skilled in the art would be motivated to combine the known elements as described above and achieve the predictable result of automatically classifying similar features into the same group. Therefore, it would have been obvious to combine the analogous arts Chen and Lee to obtain the above-described limitations in claim 1. However, the combination of Chen and Lee does not explicitly teach, feature vectors corresponding to pairs of object instances in a same time step or in different time steps to support an object tracker.
In an analogous field of endeavor, Awasthi teaches, feature vectors corresponding to pairs of object instances (Awasthi, ¶0023: “a plurality of features pertaining to the reconstructed plurality of objects… match feature vectors for each pair of features”) in a same time step or in different time steps (Awasthi, ¶0023: “determining a trajectory for each of the vectors of the estimated similarity matrices across the plurality of video frames”) to support an object tracker. (Awasthi, ¶0023: “estimating a motion trajectory of the reconstructed plurality of objects”).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Chen in view of Lee using the teachings of Awasthi to introduce representing pairs of objects using feature vectors. A person skilled in the art would be motivated to combine the known elements as described above and achieve the predictable result of automatically tracking objects across video frames. Therefore, it would have been obvious to combine the analogous arts Chen, Lee and Awasthi to obtain the invention in claim 1.
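For illustration only, the general idea of comparing feature vectors for pairs of object instances across two time steps, so that a tracker can associate them, may be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the use of cosine similarity as the comparison, the greedy association, and the 0.5 threshold are all hypothetical and are not taken from Chen, Lee, Awasthi, or the claims.

```python
import math

def cosine(u, v, eps=1e-8):
    """Cosine similarity between two feature vectors (assumed comparison)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / max(norm, eps)

def similarity_matrix(feats_a, feats_b):
    """Entry [i][j] compares object i at one time step with object j at another."""
    return [[cosine(u, v) for v in feats_b] for u in feats_a]

def greedy_match(sim, threshold=0.5):
    """Associate objects one-to-one, taking the most similar pairs first."""
    pairs = sorted(
        ((s, i, j) for i, row in enumerate(sim) for j, s in enumerate(row)),
        reverse=True,
    )
    used_i, used_j, matches = set(), set(), {}
    for s, i, j in pairs:
        if s < threshold:
            break  # remaining pairs are too dissimilar to associate
        if i in used_i or j in used_j:
            continue
        matches[i] = j
        used_i.add(i)
        used_j.add(j)
    return matches

# Hypothetical feature vectors for two objects at time t and time t+1.
frame_t = [[1.0, 0.0], [0.0, 1.0]]
frame_t1 = [[0.1, 0.9], [0.9, 0.1]]
matches = greedy_match(similarity_matrix(frame_t, frame_t1))
```

In this toy example, object 0 at time t is associated with object 1 at time t+1 and vice versa, because those pairs have the highest similarity.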
Regarding claim 2, Chen in view of Lee and in further view of Awasthi teaches, The system of claim 1, wherein, in connection with generating the automatically labeled perception data, the 3D detection backbone, in processing the first set of BEV feature images in an offline processing environment, (Chen, ¶0053: “image processor 106 generates the BEV maps from the 3D point cloud frames by executing conventional image processing operations such as PointNet, and the like”) performs feature-level temporal aggregation (Chen, ¶0054: “points for static background are aggregated at a time of determining clues on motions of moving objects in the environment”) that includes both forward recurrence and backward recurrence to generate the second set of BEV feature images (Lee, ¶0119: “system can be configured to perform a check of the consistency between the forward and backward flows”) and each BEV feature image in the second set of BEV feature images incorporates information (Lee, ¶0109: “aggregator 552 may include some or all features directly from two birds-eye view images 531, 532 having BeV embeddings such that pillar features from the two (or more) BeV images are aggregated”) from all time steps in the time-sequential perceptual sensor data. (Lee, ¶0105: “two birds-eye view images 531, 532… one representing the first point cloud (e.g., the point cloud at time t−1) and one representing the second point cloud (e.g., the point cloud at time t)”).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Chen in view of Lee and in further view of Awasthi using the additional teachings of Lee to introduce feature-level temporal aggregation that includes both forward recurrence and backward recurrence. A person skilled in the art would be motivated to combine the known elements as described above and achieve the predictable result of tracking through a sequence of images. Therefore, it would have been obvious to combine the analogous arts Chen, Lee and Awasthi to obtain the invention in claim 2.
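For illustration only, the distinction between forward recurrence alone and combined forward and backward recurrence may be sketched as follows. This is a minimal sketch under stated assumptions: each time step is reduced to a single scalar feature, the recurrence is a simple exponential blend with a hypothetical weight `alpha`, and the two passes are averaged. None of these choices is taken from the cited references or the claims.

```python
def forward_pass(frames, alpha=0.5):
    """Forward recurrence: the output at step t depends only on steps 0..t."""
    state, out = None, []
    for x in frames:
        state = x if state is None else alpha * x + (1 - alpha) * state
        out.append(state)
    return out

def backward_pass(frames, alpha=0.5):
    """Backward recurrence: the output at step t depends only on steps t..end."""
    return forward_pass(frames[::-1], alpha)[::-1]

def bidirectional(frames, alpha=0.5):
    """Average both passes so every output incorporates all time steps."""
    fwd = forward_pass(frames, alpha)
    bwd = backward_pass(frames, alpha)
    return [0.5 * (f + b) for f, b in zip(fwd, bwd)]

# Hypothetical scalar features for three time steps.
frames = [1.0, 2.0, 4.0]
fwd = forward_pass(frames)
both = bidirectional(frames)
```

In this sketch, changing the last frame leaves the first forward-only output unchanged but alters the first bidirectional output, which is the sense in which a bidirectional output reflects all time steps.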
Regarding claim 4, Chen in view of Lee and in further view of Awasthi teaches, The system of claim 1, wherein, in connection with controlling the autonomous robot in an online processing environment of the autonomous robot, (Chen, ¶0038: “Through the network 126, either wirelessly or through wires, the control system 100 may receive the input data”) the 3D detection backbone, in processing the first set of BEV feature images, performs feature-level temporal aggregation that includes forward recurrence to generate the second set of BEV feature images. (Chen, ¶0040: “a pixel is associated with a time sequence of future positions of the pixel in subsequent time steps representing a prediction of a future motion of the object. The image processor 106 determines the time sequence of future positions of at least some pixel based on the outputs from the motion prediction head”).
Regarding claim 6, Chen in view of Lee and in further view of Awasthi teaches, The system of claim 1, wherein the time-sequential perceptual sensor data includes one or more of camera images, Light Detection and Ranging (LIDAR) data, radar data, sonar data, map data, or audio data. (Chen, ¶0050: “a plurality of sensors on the vehicle 116 such as a light detection and ranging (LiDAR) sensor, a radio detection and ranging (RADAR) sensor, a camera, and the like”).
Regarding claim 7, Chen in view of Lee and in further view of Awasthi teaches, The system of claim 1, wherein the autonomous robot is one of an autonomous vehicle, a search and rescue robot, a delivery robot, an aerial drone, or an indoor robot. (Chen, ¶0097: “the vehicle 500 can be an autonomous vehicle or a semi-autonomous vehicle”).
Regarding claim 8, it recites a computer-readable medium including instructions corresponding to the elements of the system recited in claim 1. Therefore, the recited instructions of the computer-readable medium of claim 8 are mapped to the proposed combination in the same manner as the corresponding elements of the system claim 1. Additionally, the rationale and motivation to combine Chen, Lee and Awasthi presented in rejection of claim 1, apply to this claim. In addition, Chen teaches, A non-transitory computer-readable medium for detecting and tracking objects and storing instructions that, when executed by a processor, cause the processor to: (Chen, ¶0109: “the program code or code segments to perform the necessary tasks may be stored in a machine readable medium. A processor(s) may perform the necessary tasks”).
Regarding claim 9, it recites a computer-readable medium including instructions corresponding to the elements of the system recited in claim 2. Therefore, the recited instructions of the computer-readable medium of claim 9 are mapped to the proposed combination in the same manner as the corresponding elements of the system claim 2. Additionally, the rationale and motivation to combine Chen, Lee and Awasthi presented in rejection of claim 2, apply to this claim.
Regarding claim 11, it recites a computer-readable medium including instructions corresponding to the elements of the system recited in claim 4. Therefore, the recited instructions of the computer-readable medium of claim 11 are mapped to the proposed combination in the same manner as the corresponding elements of the system claim 4. Additionally, the rationale and motivation to combine Chen, Lee and Awasthi presented in rejection of claim 1, apply to this claim.
Regarding claim 13, it recites a computer-readable medium including instructions corresponding to the elements of the system recited in claim 7. Therefore, the recited instructions of the computer-readable medium of claim 13 are mapped to the proposed combination in the same manner as the corresponding elements of the system claim 7. Additionally, the rationale and motivation to combine Chen, Lee and Awasthi presented in rejection of claim 1, apply to this claim.
Regarding claim 14, it recites a method with steps corresponding to the elements of the system recited in claim 1. Therefore, the recited steps of the method claim 14 are mapped to the proposed combination in the same manner as the corresponding elements of the system claim 1. Additionally, the rationale and motivation to combine Chen, Lee and Awasthi presented in rejection of claim 1, apply to this claim. In addition, Chen teaches, A method (Chen, ¶0111: “Embodiments of the present disclosure may be embodied as a method”).
Regarding claim 15, it recites a method with steps corresponding to the elements of the system recited in claim 2. Therefore, the recited steps of the method claim 15 are mapped to the proposed combination in the same manner as the corresponding elements of the system claim 2. Additionally, the rationale and motivation to combine Chen, Lee and Awasthi presented in rejection of claim 2, apply to this claim.
Regarding claim 17, it recites a method with steps corresponding to the elements of the system recited in claim 4. Therefore, the recited steps of the method claim 17 are mapped to the proposed combination in the same manner as the corresponding elements of the system claim 4. Additionally, the rationale and motivation to combine Chen, Lee and Awasthi presented in rejection of claim 1, apply to this claim.
Regarding claim 19, it recites a method with steps corresponding to the elements of the system recited in claim 6. Therefore, the recited steps of the method claim 19 are mapped to the proposed combination in the same manner as the corresponding elements of the system claim 6. Additionally, the rationale and motivation to combine Chen, Lee and Awasthi presented in rejection of claim 1, apply to this claim.
Regarding claim 20, it recites a method with steps corresponding to the elements of the system recited in claim 7. Therefore, the recited steps of the method claim 20 are mapped to the proposed combination in the same manner as the corresponding elements of the system claim 7. Additionally, the rationale and motivation to combine Chen, Lee and Awasthi presented in rejection of claim 1, apply to this claim.
Claims 3, 10 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (US 2021/0302992 A1), in view of Lee et al. (US 2021/0358296 A1), in further view of Awasthi et al. (US 2023/0224582 A1), and in still further view of Park et al. (US 2024/0020953 A1).
Regarding claim 3, Chen in view of Lee and in further view of Awasthi teaches, The system of claim 2, wherein the machine-readable instructions include further instructions that, when executed by the processor, cause the processor to. However, the combination of Chen, Lee and Awasthi does not explicitly teach, improve robustness of the object tracker by applying global association to object comparisons output by the 3D detector head.
In an analogous field of endeavor, Park teaches, improve robustness of the object tracker by applying global association to object comparisons output by the 3D detector head. (Park, ¶0021: “The MLP network encodes global contextual information with respect to the region—providing for accurate transformation when objects appear at different heights in the view”).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Chen in view of Lee and in further view of Awasthi using the teachings of Park to introduce global contextual information. A person skilled in the art would be motivated to combine the known elements as described above and achieve the predictable result of improving the robustness of the object tracker. Therefore, it would have been obvious to combine the analogous arts Chen, Lee, Awasthi and Park to obtain the invention in claim 3.
Regarding claim 10, it recites a computer-readable medium including instructions corresponding to the elements of the system recited in claim 3. Therefore, the recited instructions of the computer-readable medium of claim 10 are mapped to the proposed combination in the same manner as the corresponding elements of the system claim 3. Additionally, the rationale and motivation to combine Chen, Lee, Awasthi and Park presented in rejection of claim 3, apply to this claim.
Regarding claim 16, it recites a method with steps corresponding to the elements of the system recited in claim 3. Therefore, the recited steps of the method claim 16 are mapped to the proposed combination in the same manner as the corresponding elements of the system claim 3. Additionally, the rationale and motivation to combine Chen, Lee, Awasthi and Park presented in rejection of claim 3, apply to this claim.
Claims 5, 12 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (US 2021/0302992 A1), in view of Lee et al. (US 2021/0358296 A1), in further view of Awasthi et al. (US 2023/0224582 A1), and in still further view of Ji et al. (US 2025/0200751 A1).
Regarding claim 5, Chen in view of Lee and in further view of Awasthi teaches, The system of claim 1. However, the combination of Chen, Lee and Awasthi does not explicitly teach, wherein the similarity objective includes a cosine-similarity loss.
In an analogous field of endeavor, Ji teaches, wherein the similarity objective includes a cosine-similarity loss. (Ji, ¶0065: “training system trains the cloud point processing neural network to optimize a loss. The loss can be a cosine similarity loss that measures the similarity between the target pointwise features 318 and the pointwise features 322”).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Chen in view of Lee and in further view of Awasthi using the teachings of Ji to introduce a cosine-similarity loss. A person skilled in the art would be motivated to combine the known elements as described above and achieve the predictable result of improving the detection accuracy of the object tracker. Therefore, it would have been obvious to combine the analogous arts Chen, Lee, Awasthi and Ji to obtain the invention in claim 5.
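For illustration only, a cosine-similarity loss of the general kind named in the cited passage of Ji may be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the "one minus cosine similarity" form, and the averaging over feature pairs are hypothetical and are not taken from Ji or from the claims.

```python
import math

def cosine_similarity(u, v, eps=1e-8):
    """Cosine of the angle between two feature vectors, in [-1, 1]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / max(norm, eps)

def cosine_similarity_loss(preds, targets):
    """Mean of (1 - cosine similarity) over paired feature vectors.

    The loss is 0 when every pair points in the same direction and
    grows toward 2 as pairs point in opposite directions.
    """
    losses = [1.0 - cosine_similarity(p, t) for p, t in zip(preds, targets)]
    return sum(losses) / len(losses)

# Hypothetical pointwise features: one aligned pair and one orthogonal pair.
loss = cosine_similarity_loss([[1.0, 0.0], [1.0, 0.0]], [[2.0, 0.0], [0.0, 3.0]])
```

Because the loss depends only on direction, scaling a feature vector (as in the first pair above) does not change its contribution.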
Regarding claim 12, it recites a computer-readable medium including instructions corresponding to the elements of the system recited in claim 5. Therefore, the recited instructions of the computer-readable medium of claim 12 are mapped to the proposed combination in the same manner as the corresponding elements of the system claim 5. Additionally, the rationale and motivation to combine Chen, Lee, Awasthi and Ji presented in rejection of claim 5, apply to this claim.
Regarding claim 18, it recites a method with steps corresponding to the elements of the system recited in claim 5. Therefore, the recited steps of the method claim 18 are mapped to the proposed combination in the same manner as the corresponding elements of the system claim 5. Additionally, the rationale and motivation to combine Chen, Lee, Awasthi and Ji presented in rejection of claim 5, apply to this claim.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MEHRAZUL ISLAM whose telephone number is (571)270-0489. The examiner can normally be reached Monday-Friday: 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Amandeep Saini, can be reached on (571) 272-3382. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MEHRAZUL ISLAM/Examiner, Art Unit 2662
/AMANDEEP SAINI/Supervisory Patent Examiner, Art Unit 2662