DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of Claims
Claims 1-20 of U.S. Application No. 18/468,509 filed on 09/15/2023 were examined. Examiner filed a non-final office action on 05/16/2025.
Applicant filed remarks and amendments on 07/30/2025. Claims 1, 2, 5-7, 11, 14, and 16-19 have been amended, and claim 13 has been canceled. Claims 1-12 and 14-20 are presently pending examination.
Response to Arguments
Regarding the claim rejections under 35 USC 101: applicant’s arguments filed 07/30/2025 (hereinafter referred to as the “Remarks”) have been fully considered and they are persuasive. The previously given claim rejections under 35 USC 101 are withdrawn.
Regarding the claim rejections under 35 USC 102 and 103: Applicant's arguments filed 07/30/2025 with respect to Ramezani et al. (US 20220076432 A1) in view of Baig et al. (US 20210035442 A1) have been fully considered but they are not persuasive.
Regarding claims 1, 5, 8, 12, 15, and 19, which have been amended, applicant argues that Ramezani fails to disclose a neural network that “receives only sensor data from a single frame” and “expressly excludes the use of time-series or multi-frame data for inference.”
However, the examiner respectfully disagrees, and this argument is not persuasive. Ramezani discloses a neural network that receives and processes sensor data (e.g., images or LiDAR frames) from a single frame for object detection and movability inference, independent of temporal context: “Each of the layers 52A-C includes feature nodes corresponding to the features the object detector 12 identified in the corresponding image…” (Ramezani, ¶ [0035]).
The object detector (12) processes each frame independently to generate feature vectors, which serve as inputs to the neural network units (e.g., GRUs/LSTMs) at feature nodes. This enables movability classification (e.g., distinguishing dynamic/static objects via association probabilities) solely from single-frame data: “the object detector 12 and the multi-object tracker 14 can be implemented as components of a suitable self-driving software architecture… the object detector 12… output[s] a feature vector for each detection…” (Ramezani, ¶ [0034]).
Ramezani’s architecture expressly allows exclusion of time-series data for inference by fixing earlier layers after processing, limiting operations to the current frame’s layer when the rolling window advances: “Upon creating a new layer L_i using new observations… the system advances the window to operate on layers L_{i+1}, L_{i+2}, and L_{i+3}. The system then can fix the layer neural network parameters of layer L_i, and exclude any further change in it as the rolling window no longer includes that layer.” (Ramezani, ¶ [0031]).
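For illustration only, the rolling-window behavior quoted from Ramezani ¶ [0031] can be sketched as follows. This is a minimal sketch under stated assumptions and is not Ramezani's implementation: the names Layer, RollingWindow, and add_layer are hypothetical, and the window size of three is inferred from the three layers named in the quotation.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Layer:
    """One per-frame layer of feature nodes (hypothetical structure)."""
    index: int
    features: list
    frozen: bool = False  # True once the layer's parameters are fixed


@dataclass
class RollingWindow:
    """Keeps the newest `size` layers active; older layers are fixed."""
    size: int = 3  # assumption: the quotation names three layers in the window
    active: deque = field(default_factory=deque)
    fixed: list = field(default_factory=list)

    def add_layer(self, layer: Layer) -> None:
        """Append a new layer and fix any layer the window no longer covers."""
        self.active.append(layer)
        while len(self.active) > self.size:
            oldest = self.active.popleft()
            oldest.frozen = True  # no further change once outside the window
            self.fixed.append(oldest)


window = RollingWindow(size=3)
for i in range(4):
    window.add_layer(Layer(index=i, features=[]))
```

After the fourth layer is added, the sketch holds layers 1 through 3 in the window and marks layer 0 as fixed, which mirrors the quoted sequence of advancing the window and then fixing the layer that has left it.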
Applicant further argues that Ramezani does not teach a “real-time, closed-loop control system” using single-frame movability inference to alter vehicle trajectory.
However, the examiner respectfully disagrees, and this argument is not persuasive. Ramezani teaches integration of single-frame movability outputs into a real-time, closed-loop control system for trajectory alteration: “the object detector 12 and the multi-object tracker 14 can operate as components of the tracking module 114 [within a self-driving software architecture that includes a planning module to generate vehicle trajectories based on perception outputs].” (Ramezani, ¶ [0034]).
The fixed single-frame feature probabilities directly feed the planning module, forming a closed loop where movability inferences (e.g., high-probability dynamic object detection) trigger immediate trajectory adjustments: “the multi-object tracker 14 controls the growth of the neural network… by the ‘rolling’ or ‘sliding time’ window… that prunes older edge and feature nodes [enabling real-time updates to vehicle control].” (Ramezani, ¶ [0037]).
Claim Objections
Claim 13 is objected to because of the following informalities: according to the remarks filed on 07/30/2025, claim 13 has been canceled and needs to be removed from the presented list of claims. Appropriate correction is required.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-5, 8-12 and 15-19 are rejected under 35 U.S.C. 103 as being unpatentable over Ramezani et al. (US 20220076432 A1) in view of Baig et al. (US 20210035442 A1), hereinafter referred to as Ramezani and Baig respectively.
Regarding claims 1, 8 and 15, Ramezani discloses A system comprising:
a memory (“The vehicle controller 322 may include one or more CPUs, GPUs, and a non-transitory memory with persistent components (e.g., flash memory, an optical disk) and/or non-persistent components (e.g., RAM).” [0096]);
and one or more processors coupled to the memory (“For example, the laser 310 may include a controller or processor that receives data from each of the sensor heads 312 (e.g., via a corresponding electrical link 320) and processes the received data to construct a point cloud covering a 360-degree horizontal view around a vehicle or to determine distances to one or more targets. The point cloud or information from the point cloud may be provided to a vehicle controller 322 via a corresponding electrical, optical, or radio link 320. The vehicle controller 322 may include one or more CPUs, GPUs, and a non-transitory memory with persistent components (e.g., flash memory, an optical disk) and/or non-persistent components (e.g., RAM).” [0096]),
the one or more processors being configured to:
receive raw sensor data, wherein the raw sensor data is collected by one or more sensors of an autonomous vehicle of a scene, the sensors being configured to capture multiple sensor data frames in a time series, and the raw sensor data includes depictions of an unclassified object in the scene proximal to the autonomous vehicle (“The multi-object tracking architecture is configured to receive a sequence of images generated at respective times by one or more sensors configured to sense an environment through which objects are moving relative to the one or more sensors.” [0008] and “(e.g., based on raw sensor data from one or more of the sensors 102).” [0087]);
provide the sensor data to a neural network, wherein the neural network is configured to detect the unclassified object in a single frame of the multiple sensor data frames and calculate a movability probability of the detected unclassified object in the single frame (“The sensor control architecture 100 also includes a prediction component 120, which processes the perception signals 106 to generate prediction signals 122 descriptive of one or more predicted future states of the vehicle's environment. For a given object, for example, the prediction component 120 may analyze the type/class of the object (as determined by the classification module 112) along with the recent tracked movement of the object (as determined by the tracking module 114) to predict one or more future positions of the object… The prediction component 120 may inherently account for such behaviors by utilizing a neural network or other suitable machine learning model, for example.” [0083]);
predict whether the detected unclassified object may move in the scene based on the movability probability, generated by the neural network, of the detected unclassified object in the scene (“As a relatively simple example, the prediction component 120 may assume that any moving objects will continue to travel with no change to their current direction and speed, possibly taking into account first- or higher-order derivatives to better track objects that have continuously changing directions, objects that are accelerating, and so on… In some embodiments, the prediction component 120 may be omitted from the sensor control architecture 100 (e.g., if the vehicle does not perform any prediction of future environment states, or if the vehicle does perform prediction but predicted environment states are not used to control any sensors).” [0083] and “The segmentation module 110 may determine which points belong to the same object using any suitable rules, algorithms or models. Once the objects 396 are identified, the classification module 112 of FIG. 3 may attempt to classify the objects, and the tracking module 114 of FIG. 3 may attempt to track the classified objects (and, in some embodiments/scenarios, unclassified objects) across future point clouds similar to point cloud 390 (i.e., across multiple point cloud frames).” [0104]; see also [0043]).
Ramezani does not explicitly teach in response to determining that the unclassified object may move in the scene, adjust a trajectory of the autonomous vehicle in real time based on the prediction of the movability of the unclassified object, wherein the adjustment comprises generating and executing a control command to steer, accelerate, or decelerate the autonomous vehicle to avoid or accommodate the predicted movement of the unclassified object.
However, Baig does teach in response to determining that the unclassified object may move in the scene, adjust a trajectory of the autonomous vehicle in real time based on the prediction of the movability of the unclassified object, wherein the adjustment comprises generating and executing a control command to steer, accelerate, or decelerate the autonomous vehicle to avoid or accommodate the predicted movement of the unclassified object (“The AV 1952 can associate a likelihood with each of the first and the second hypotheses. In an example, a likelihood of 0.3 can be assigned to the first hypothesis and a likelihood of 0.7 can be assigned to the second hypothesis. As such, there is a 30% chance that the vehicle 1954 stays in the lane 1953B and a 70% chance that the vehicle will move to the lane 1953A. As further explained below, the likelihood of each hypothesis can be determined based on state information.” [0276]). Both Ramezani and Baig teach methods for tracking and predicting the movement of unclassified objects. However, Baig explicitly teaches in response to determining that the unclassified object may move in the scene, adjust a trajectory of the autonomous vehicle in real time based on the prediction of the movability of the unclassified object, wherein the adjustment comprises generating and executing a control command to steer, accelerate, or decelerate the autonomous vehicle to avoid or accommodate the predicted movement of the unclassified object.
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the object tracking monitoring method of Ramezani to also include in response to determining that the unclassified object may move in the scene, adjust a trajectory of the autonomous vehicle in real time based on the prediction of the movability of the unclassified object, wherein the adjustment comprises generating and executing a control command to steer, accelerate, or decelerate the autonomous vehicle to avoid or accommodate the predicted movement of the unclassified object, as taught by Baig, with a reasonable expectation of success. Doing so improves safety for operating autonomous vehicles (With regard to this reasoning, see at least [Baig, 0274-0276]).
Regarding claims 2, 9 and 16, Ramezani discloses The system of claim 1,
Ramezani does not explicitly teach wherein the prediction of the movability of the neural network comprises a probability that the unclassified object moves in the scene
However, Baig does teach wherein the prediction of the movability of the neural network comprises a probability that the unclassified object moves in the scene (“The world model module 402 fuses sensor information, tracks objects, maintains lists of hypotheses for at least some of the dynamic objects (e.g., an object A might be going straight, turning right, or turning left), creates and maintains predicted trajectories for each hypothesis, and maintains likelihood estimates of each hypothesis (e.g., object A is going straight with probability 90% considering the object pose/velocity and the trajectory poses/velocities).” [0109]). Both Ramezani and Baig teach methods for tracking and predicting the movement of unclassified objects. However, Baig explicitly teaches wherein the prediction of the movability of the neural network comprises a probability that the unclassified object moves in the scene.
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the object tracking monitoring method of Ramezani to also include wherein the prediction of the movability of the neural network comprises a probability that the unclassified object moves in the scene, as taught by Baig, with a reasonable expectation of success. Doing so improves safety for operating autonomous vehicles (With regard to this reasoning, see at least [Baig, 0002 and 0005]).
Regarding claims 3, 10 and 17, Ramezani discloses The system of claim 1,
Ramezani does not explicitly teach wherein the one or more processors are configured to: in response to determining that the unclassified object may move in the scene, provide information associated with the movability of the unclassified object to a tracker
However, Baig does teach wherein the one or more processors are configured to: in response to determining that the unclassified object may move in the scene, provide information associated with the movability of the unclassified object to a tracker, which is configured to track a path of the unclassified object in the scene (“Given observations from sensor data, the world model module, according to implementations of this disclosure, tracks and estimates the states of observed objects (i.e., real-world objects) and predicts the future states of the real-world objects with multiple hypotheses in a probabilistic manner. That is, the world model module can provide for improved tracking of objects in the real world. The world model module predicts multiple hypotheses for possible trajectories of real-world objects.” [0050]). Both Ramezani and Baig teach methods for tracking and predicting the movement of unclassified objects. However, Baig explicitly teaches wherein the one or more processors are configured to: in response to determining that the unclassified object may move in the scene, provide information associated with the movability of the unclassified object to a tracker, which is configured to track a path of the unclassified object in the scene.
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the object tracking monitoring method of Ramezani to also include wherein the one or more processors are configured to: in response to determining that the unclassified object may move in the scene, provide information associated with the movability of the unclassified object to a tracker, which is configured to track a path of the unclassified object in the scene, as taught by Baig, with a reasonable expectation of success. Doing so improves safety for operating autonomous vehicles (With regard to this reasoning, see at least [Baig, 0002 and 0005]).
Regarding claims 4, 11 and 18, Ramezani discloses The system of claim 1,
Ramezani does not explicitly teach wherein the one or more processors are configured to: in response to determining that the unclassified object may move in the scene, provide information associated with the movability of the unclassified object to a prediction stack, which is configured to predict a path of the unclassified object
However, Baig does teach wherein the one or more processors are configured to: in response to determining that the unclassified object may move in the scene, provide information associated with the movability of the unclassified object to a prediction stack, which is configured to predict a path of the unclassified object (“In the situation 350, the tracking component of the AV 302 detects an oncoming vehicle 352, a first parked vehicle 356, and a second parked vehicle 357. The prediction component of the AV 302 determines that the oncoming vehicle 352 is following a trajectory 354.” [0104]). Both Ramezani and Baig teach methods for tracking and predicting the movement of unclassified objects. However, Baig explicitly teaches wherein the one or more processors are configured to: in response to determining that the unclassified object may move in the scene, provide information associated with the movability of the unclassified object to a prediction stack, which is configured to predict a path of the unclassified object.
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the object tracking monitoring method of Ramezani to also include wherein the one or more processors are configured to: in response to determining that the unclassified object may move in the scene, provide information associated with the movability of the unclassified object to a prediction stack, which is configured to predict a path of the unclassified object, as taught by Baig, with a reasonable expectation of success. Doing so improves safety for operating autonomous vehicles (With regard to this reasoning, see at least [Baig, 0002 and 0005]).
Regarding claims 5, 12 and 19, Ramezani discloses The system of claim 1, wherein the one or more processors are configured to: train the neural network to predict the movability of the unclassified object in the scene (“The multi-object tracker can interconnect the features disposed in different layers via edges to define hypotheses regarding possible tracks of features across the sequence of images. For example, the computer system can interconnect feature F.sub.1 in layer L.sub.2 with feature F.sub.1′ in the adjacent, later-in-time layer L.sub.2 as well with feature F.sub.2′ in layer L.sub.2, feature F.sub.3′ in layer L.sub.2, etc. After the system trains a neural network supported by the graph and generates inferences using the techniques of this disclosure, the system generate tracks for the various features, so as to ascertain for example that feature F.sub.1 in layer L.sub.1 is probably associated with feature F.sub.2′ in layer L.sub.2, which in turn is probably associated with feature F.sub.5′ in layer L.sub.3, etc.” [0027]),
wherein the training of the neural network comprises: providing multiple sensor data frames associated with the unclassified object, wherein the multiple sensor data frames are captured in a time series (“The tracking module 114 is generally configured to track distinct objects over time (e.g., across multiple lidar point cloud or camera image frames). The tracked objects are generally objects that have been identified by the segmentation module 110, but may or may not be objects that were classified by the classification module 112, depending on the embodiment and/or scenario. The segmentation module 110 may assign identifiers to identified objects, and the tracking module 114 may associate existing identifiers with specific objects where appropriate (e.g., for lidar data, by associating the same identifier with different clusters of points, at different locations, in successive point cloud frames). Like the segmentation module 110 and the classification module 112, the tracking module 114 may perform separate object tracking based on different sets of the sensor data (e.g., the tracking module 114 may include a number of modules operating in parallel), or may track objects based on a fusion of data from multiple sensors.” [0081]);
generating a motion signal for the unclassified object based on the multiple sensor data frames (“The vehicle controller 322 can include a perception module 352 that receives input from the components 300 and uses a perception machine learning (ML) model 354 to provide indications of detected objects, road markings, etc. to a motion planner 356, which generates commands for the components 330 to maneuver the vehicle 300. Referring back to FIG. 1, the components 352-356 in one embodiment implement the components 102-107, in any suitable configuration” [0099]);
correlating the motion signal with the unclassified object in the sensor data captured within the single frame to create a supervisory label for movability (“If the sensor control component 130 only controls “Sensor 1,” for example, the dynamic object detector 134 may identify dynamic objects using perception signals 106 generated based only on data from “Sensor 1,” using perception signals 106 based only on data from any one or more of “Sensor 2” through “Sensor N,” or using perception signals 106 based on both data from “Sensor 1” and data from any one or more of “Sensor 2” through “Sensor N.” Thus, for example, a camera with a wide-angle view of the environment may be used to determine a narrower area of focus for a lidar device, or a lidar device may initially be set to have a relatively large field of regard, and later be set to focus on (e.g., center a smaller field of regard upon) a dynamic object detected in a specific portion of the larger field of regard, etc.” [0092] see also [0034 & 0037]);
and using the supervisory label to train the neural network to predict movability from only the single frame (“Similarly, the prediction signals 122 may include, for each such grid generated by the perception component 104, one or more “future occupancy grids” that indicate predicted object positions, boundaries and/or orientations at one or more future times (e.g., 1, 2 and 5 seconds ahead). In other embodiments, the sensor control architecture 100 does not generate or utilize occupancy grids.” [0084] see also [0085] see also [0034 & 0037]).
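For illustration only, the training limitations recited above (a motion signal derived from multiple frames, correlated with the object in a single frame to create a supervisory label) can be sketched as follows. This is a hypothetical sketch and is not taken from the application, Ramezani, or Baig: the names Observation, motion_signal, and label_single_frames, the use of path length as the motion signal, and the 0.5 threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Observation:
    """Position of one object in one sensor frame (hypothetical structure)."""
    frame: int
    x: float
    y: float


def motion_signal(track):
    """Path length of one object across a time series of frames."""
    return sum(hypot(b.x - a.x, b.y - a.y) for a, b in zip(track, track[1:]))


def label_single_frames(track, threshold=0.5):
    """Attach the multi-frame motion result to each single-frame observation.

    Each returned (observation, label) pair is one training example whose
    input is a single frame and whose label came from the whole time series.
    The threshold is an assumed value chosen for the example.
    """
    movable = motion_signal(track) > threshold
    return [(obs, movable) for obs in track]


moving = [Observation(0, 0.0, 0.0), Observation(1, 1.0, 0.0), Observation(2, 2.0, 0.0)]
parked = [Observation(0, 5.0, 5.0), Observation(1, 5.0, 5.0), Observation(2, 5.0, 5.0)]
examples = label_single_frames(moving) + label_single_frames(parked)
```

In this sketch the label is computed once per track and then copied to every frame of that track, so a model trained on the resulting pairs would see only single-frame inputs at inference time.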
Claims 6-7, 14 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Ramezani in view of Baig and in further view of Ebrahim Afrouzi et al. (US 20220066456 A1), hereinafter referred to as Ebrahim Afrouzi.
Regarding claims 6 and 20, Ramezani in view of Baig discloses The system of claim 5,
Ramezani in view of Baig does not explicitly teach wherein the training of the neural network comprises: separating the unclassified object from a background in each of the multiple sensor data frames and determining the movability of the unclassified object based on the separation of the unclassified object in the multiple sensor data frames.
However, Ebrahim Afrouzi does teach wherein the training of the neural network comprises: separating the unclassified object from a background in each of the multiple sensor data frames (“In some embodiments, the processor of the robot may perform segmentation wherein an object captured in an image is separated from other objects and the background of the image.” [0382]);
and determining the movability of the unclassified object based on the separation of the unclassified object in the multiple sensor data frames (“The features that get blocked depend on the FOV of a camera of the robot and its angle relative to the features that represent the background. In embodiments, the processor may extract such background features due to a lack of a straight line of sight. Some embodiments may track objects separately from the background environment and may form decisions based on a combination of both.” [0404]). Both Ramezani in view of Baig and Ebrahim Afrouzi teach methods for tracking and predicting the movement of unclassified objects. However, Ebrahim Afrouzi explicitly teaches wherein the training of the neural network comprises: separating the unclassified object from a background in each of the multiple sensor data frames and determining the movability of the unclassified object based on the separation of the unclassified object in the multiple sensor data frames.
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the object tracking monitoring method of Ramezani in view of Baig to also include wherein the training of the neural network comprises: separating the unclassified object from a background in each of the multiple sensor data frames and determining the movability of the unclassified object based on the separation of the unclassified object in the multiple sensor data frames, as taught by Ebrahim Afrouzi, with a reasonable expectation of success. Doing so improves safety for operating autonomous vehicles (With regard to this reasoning, see at least [Ebrahim Afrouzi, 0006]).
Regarding claims 7 and 14, Ramezani in view of Baig discloses The system of claim 5,
Ramezani in view of Baig does not explicitly teach wherein the training of the neural network comprises: correlating a motion signal of the unclassified object, which is determined based on the multiple sensor data frames with the unclassified object in the sensor data captured within the single frame
However, Ebrahim Afrouzi does teach wherein the training of the neural network comprises: correlating a motion signal of the unclassified object, which is determined based on the multiple sensor data frames with the unclassified object in the sensor data captured within the single frame (“In this scenario, for every step the robot takes, there is a ground truth distance measured by the LIDAR that correlates with the movement of pixels captured by the camera. There is also additional information that correlates such as encoder from wheels (odometry), gyroscope data, accelerometer data, compass data, optical tracking sensor data, etc.” [0682]). Both Ramezani in view of Baig and Ebrahim Afrouzi teach methods for tracking and predicting the movement of unclassified objects. However, Ebrahim Afrouzi explicitly teaches wherein the training of the neural network comprises: correlating a motion signal of the unclassified object, which is determined based on the multiple sensor data frames with the unclassified object in the sensor data captured within the single frame.
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the object tracking monitoring method of Ramezani in view of Baig to also include wherein the training of the neural network comprises: correlating a motion signal of the unclassified object, which is determined based on the multiple sensor data frames with the unclassified object in the sensor data captured within the single frame, as taught by Ebrahim Afrouzi, with a reasonable expectation of success. Doing so improves safety for operating autonomous vehicles (With regard to this reasoning, see at least [Ebrahim Afrouzi, 0006]).
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AHMED ALKIRSH whose telephone number is (703) 756-4503. The examiner can normally be reached M-F 9:00 am-5:00 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, FADEY JABR, can be reached at (571) 272-1516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
AHMED ALKIRSH
Examiner, Art Unit 3668
/Fadey S. Jabr/
Supervisory Patent Examiner, Art Unit 3668