Prosecution Insights
Last updated: April 19, 2026
Application No. 18/308,622

FRAMEWORK FOR EVALUATION OF MACHINE LEARNING BASED MODEL USED FOR AUTONOMOUS VEHICLE

Final Rejection §103
Filed: Apr 27, 2023
Examiner: HAUSMANN, MICHELLE M
Art Unit: 2671
Tech Center: 2600 — Communications
Assignee: Perceptive Automata Inc.
OA Round: 2 (Final)

Grant Probability: 76% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 3y 1m
Grant Probability With Interview: 98%

Examiner Intelligence

Career Allow Rate: 76%, above average (658 granted / 863 resolved; +14.2% vs TC avg)
Interview Lift: +21.6% among resolved cases with an interview (a strong lift)
Typical Timeline: 3y 1m average prosecution; 23 applications currently pending
Career History: 886 total applications across all art units
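The headline probabilities appear to follow directly from the career counts above; a minimal sketch, assuming (as the Prosecution Projections footnote suggests) that grant probability is the career allow rate and that the "with interview" figure simply adds the interview lift:

```python
# Reproduce the dashboard's headline figures from the raw counts shown above.
# Assumption: "With Interview" = career allow rate + interview lift.
granted, resolved = 658, 863               # career totals from the card above
allow_rate = granted / resolved            # 0.762 -> displayed as 76%
interview_lift = 0.216                     # +21.6% lift with an interview
with_interview = allow_rate + interview_lift

print(f"Career allow rate:   {allow_rate:.1%}")      # 76.2%
print(f"With interview est.: {with_interview:.1%}")  # 97.8% -> displayed as 98%
```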

Statute-Specific Performance

Statute   Examiner Rate   vs TC Avg
§101      14.6%           -25.4%
§103      61.2%           +21.2%
§102       5.7%           -34.3%
§112      10.1%           -29.9%

Tech Center averages are estimates. Based on career data from 863 resolved cases.
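Reading each "vs TC avg" delta as the examiner's rate minus the Tech Center average, the implied TC averages can be backed out of the table; a small sketch (the subtraction convention is an assumption):

```python
# Back out the Tech Center average implied by each row of the table above,
# assuming delta = examiner rate - TC average (percentage points).
rows = {
    "§101": (14.6, -25.4),
    "§103": (61.2, +21.2),
    "§102": (5.7, -34.3),
    "§112": (10.1, -29.9),
}
for statute, (rate, delta) in rows.items():
    print(f"{statute}: examiner {rate}% vs implied TC avg {rate - delta:.1f}%")
# Every row implies a TC average of 40.0%, so the deltas are internally consistent.
```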

Office Action (§103)
DETAILED ACTION

Response to Amendment

Claims 1-20 are pending. Claims 1-20 are amended directly or by dependency on an amended claim.

Response to Arguments

Applicant’s arguments, see pages 13-15, filed 29 January 2026, with respect to the 35 USC 101 rejections of claims 1-20, along with accompanying amendments on the same date, have been fully considered and are persuasive. The 35 USC 101 rejections of claims 1-20 have been withdrawn.

Applicant’s arguments, see page 15, with respect to the objection to the specification, along with accompanying amendments on the same date, have been fully considered and are persuasive. The objection to the specification has been withdrawn.

Applicant’s arguments with respect to the 35 USC 103 rejections of claims 1-20, along with accompanying amendments on the same date, have been considered but are moot because the new ground of rejection does not rely on the combination of references applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3-8, 10-15, and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Agarwal et al. (US 20220144303 A1) in view of Donderici (US 20230252280 A1), further in view of Lin et al. (US 20190072960 A1).

Regarding claims 1, 8, and 15, Agarwal et al. disclose a method comprising, a non-transitory computer readable storage medium storing instructions that when executed by one or more computer processors cause the one or more computer processors to perform steps comprising, and a computer system comprising a computer processor and a non-transitory computer readable storage medium storing instructions that when executed by one or more computer processors cause the one or more computer processors to perform steps comprising:

sending a first set of video frames to a first set of users, each video frame showing a traffic scenario comprising one or more traffic entities (a scene representation generator generating a scene representation based on the input stream of images and a graph neural network (GNN), a situation predictor generating a prediction of a situation based on the scene representation and the intention of the ego vehicle, [0003]; transmitted signals, [0017]; the dataset may include short video clips or image sequences, [0030]; the image sensor 110 may be an image capture device, such as a video camera, and may receive an input stream of images of an environment including one or more objects (e.g., pedestrians, road users, etc.) within the environment, [0043]);

receiving a first set of annotations based on video frames of the first set of video frames, wherein each annotation of the first set of annotations is for a video frame from the first set of video frames and describes a state of mind of a traffic entity shown in the video frame (pedestrian awareness (e.g., looking or not looking) using face annotations when pedestrians are present, [0028], [0030], [0031]; “With regard to pedestrian attentiveness, the dataset may focus on annotations relating to the attention of pedestrians while the ego-vehicle is approaching (e.g., within a threshold distance, etc.). In other words, for the pedestrian attentiveness, the system 100 may select a subset of scenes from the dataset used for the risk object identification, so the subset includes scenes that the driver is influenced by pedestrians. Further, the pedestrian attentiveness portion of the dataset may include pedestrian attentiveness labels, (i.e., looking, not looking, and not sure) and may further include mutual awareness labels relating to driver monitoring and gaze information (e.g., whether the driver and the pedestrian and likely to be aware of one another)”, [0040]);

training a machine learning based model using the first set of annotations of the first set of video frames, the machine learning based model configured to receive an input video frame and predict a state of mind of a traffic entity displayed in the video frame (a scene representation generator generating a scene representation based on the input stream of images and a graph neural network (GNN), a situation predictor generating a prediction of a situation based on the scene representation and the intention of the ego vehicle, [0003]; the method for driver behavior risk assessment and pedestrian awareness may include generating the influenced or non-influenced action determination based on passing the prediction of the situation through a multilayer perceptron (MLP) and the scene representation, [0007]; “With regard to pedestrian attentiveness, the annotated dataset may provide a bounding box around both the faces and body of pedestrians and the system 100 or 200 may use these annotations to train one or more of the ROI 120, the intention estimator 130, the scene representation generator 140, the situation predictor 150, or the driver response determiner 160 from a classification perspective and a detection perspective”, [0057]);

sending a second set of video frames to a second set of users, each video frame showing a traffic scenario comprising one or more traffic entities (a scene representation generator generating a scene representation based on the input stream of images and a graph neural network (GNN), a situation predictor generating a prediction of a situation based on the scene representation and the intention of the ego vehicle, [0003]; transmitted signals, [0017]; the dataset may include short video clips or image sequences, [0030]; the image sensor 110 may be an image capture device, such as a video camera, and may receive an input stream of images of an environment including one or more objects (e.g., pedestrians, road users, etc.) within the environment, [0043]);

receiving a second set of annotations based on video frames of the second set of video frames, wherein each annotation is for a video frame from the second set of video frames and describes a driving recommendation for the traffic scenario shown in the video frame being annotated (a data set having annotations of driver intention (e.g., go straight), scenarios (e.g., a jay-walker is crossing the street), and decision of driver maneuver (e.g., slow down); a risk object identification framework is provided that explicitly models the causal relationship of driver intention, scenario, and decision of driver maneuver, [0028]; “A framework that explicitly models the causal relationship between the driver intention, scenario, and decision of driver maneuver is provided herein. The system for driver behavior risk assessment and pedestrian awareness may examine the problem of risk perception and introduce a new dataset to facilitate research in this domain. The dataset may include short video clips or image sequences that include annotations of driver intent, road network topology, situation (e.g., crossing pedestrian), driver response,” [0030]; comprehensive dataset with a diverse set of situations and annotations to enable research for risk object identification, [0032]); and

determining a measure of driving quality of an autonomous vehicle based on a comparison of driving actions determined based on predictions of the machine learning based model and driving recommendations received from annotators (see the grayscale figure media_image1.png reproduced in the office action).

Agarwal et al. further indicate storing a ground truth mapping from traffic scenarios to driving recommendations (“Further, the annotated dataset described above may be stored on the database or the storage device 106 or may be stored in a remote third party server. In any event, the annotated dataset may be utilized to train one or more of the ROI 120, the intention estimator 130, the scene representation generator 140, the situation predictor 150, or the driver response determiner 160”, [0042]).

Agarwal et al. do not explicitly disclose identifying additional training data for training the machine learning based model based on the measure of driving quality and training the machine learning based model based on the additional training data. Agarwal et al. also do not explicitly disclose that identifying additional training data comprises: comparing, for each traffic scenario, driving actions determined based on predictions of the machine learning based model with corresponding driving recommendations from the ground truth mapping; and selecting traffic scenarios where the driving actions differ from the driving recommendations by more than a threshold as the additional training data; and sending control signals to controls of the autonomous vehicle to navigate the autonomous vehicle, wherein the control signals are determined based on an output obtained by executing the trained machine learning based model to process sensor data collected by the autonomous vehicle.
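Before the secondary references are applied, the claimed two-stream annotation setup and the driving-quality comparison are easiest to see as data structures plus one comparison step. A minimal sketch of the claim language as characterized above; all names are hypothetical and do not come from any cited reference:

```python
from dataclasses import dataclass

@dataclass
class StateOfMindAnnotation:      # first set of annotations (claims 1/8/15)
    frame_id: str
    entity_id: str                # e.g., a tracked pedestrian
    label: str                    # e.g., "looking", "not_looking", "not_sure"

@dataclass
class DrivingRecommendation:      # second set of annotations
    frame_id: str
    action: str                   # e.g., "slow_down", "go_straight", "stop"

def driving_quality(model_actions: dict, recommendations: dict) -> float:
    """Fraction of traffic scenarios where the driving action derived from the
    model's state-of-mind predictions matches the annotator recommendation."""
    shared = set(model_actions) & set(recommendations)
    if not shared:
        return 0.0
    agree = sum(model_actions[f] == recommendations[f] for f in shared)
    return agree / len(shared)
```

An agreement fraction is only one plausible reading of the claimed "measure of driving quality"; the claim itself does not fix the metric.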
Donderici teaches determining a measure of driving quality of an autonomous vehicle based on a comparison of driving actions determined based on predictions of the machine learning based model and driving recommendations received from annotators (the CLM 200 includes an error mining 220 service to mine for errors and uses active learning to automatically identify error cases and scenarios having a significant difference between prediction and reality, which are added to a dataset of error instances 230, [0037]; a new model (e.g., a neural network) is trained based on the error augmented training data 250, and the new model is tested extensively using various techniques to ensure that the new model exceeds the performance of the previous model and generalizes well to the nearly infinite variety of scenarios found in the various datasets; the model can also be simulated in a virtual environment and analyzed for performance; once the new model has been accurately tested, the new model can be deployed in an AV to record driving data 210; the CLM 200 is a continual feedback loop that provides continued growth and learning to provide accurate models for an AV to implement, [0047]; in some embodiments, the safety self-test can include checks verifying the difference between the outputs between the test against predetermined test sets and the most recent sensor data, [0053]), and identifying additional training data for training the machine learning based model based on the measure of driving quality and training the machine learning based model based on the additional training data (“For example, the planning stack 118 can receive the location, speed, and direction of the AV 102, geospatial data, data regarding objects sharing the road with the AV 102 (e.g., pedestrians, bicycles, vehicles, ambulances, buses, cable cars, trains, traffic lights, lanes, road markings, etc.) or certain events occurring during a trip (e.g., emergency vehicle blaring a siren, intersections, occluded areas, street closures for construction or street repairs, double-parked cars, etc.), traffic rules and other safety standards or practices for the road, user input, and other relevant data for directing the AV 102 from one point to another and outputs from the perception stack 112, localization stack 114, and prediction stack 116. The planning stack 118 can determine multiple sets of one or more mechanical operations that the AV 102 can perform (e.g., go straight at a specified rate of acceleration, including maintaining the same speed or decelerating; turn on the left blinker, decelerate if the AV is above a threshold range for turning, and turn left; turn on the right blinker, accelerate if the AV is stopped or below the threshold range for turning, and turn right; decelerate until completely stopped and reverse; etc.), and select the best one to meet changing road conditions and events”, [0023]; “A new model (e.g., a neural network) is trained based on the error augmented training data 250, and the new model is tested extensively using various techniques to ensure that the new model exceeds the performance of the previous model and generalizes well to the nearly infinite variety of scenarios found in the various datasets. The model can also be simulated in a virtual environment and analyzed for performance. Once the new model has been accurately tested, the new model can be deployed in an AV to record driving data 210. The CLM 200 is a continual feedback loop that provides continued growth and learning to provide accurate models for an AV to implement”, [0039]; checks that test the accuracy of perception, predictions, tracking, planning with the deep-learning kernels for predetermined test scenarios, [0053]; “In some embodiments, generating 520 an update may comprise re-training the deep neural network based additional training data which are chosen from scenarios where accuracy of the perception or predictions are low”, [0059]).

Donderici also teaches sending sensor data ([0029]), which can include video ([0030]), and sending learning updates ([0055], [0056]).

Agarwal et al. and Donderici are in the same art of UAVs (Agarwal et al., [0025]; Donderici, abstract). The combination of Donderici with Agarwal et al. will enable identifying additional training data for training the machine learning based model based on the measure of driving quality and training the machine learning based model based on the additional training data. It would have been obvious to combine the additional training data of Donderici with the invention of Agarwal et al., as this was known at the time of filing, the combination would have predictable results, and as Donderici indicates, “Thus, the present technology addresses the need for an efficient process for continuous learning by each autonomous vehicle in the fleet of autonomous vehicles. More specifically, the present technology can learn over a network by utilizing an on-board continuous learning model and providing an update to the autonomous vehicle. The learnings can be limited to ensure safety of the autonomous vehicle after the update” ([0013]), thereby suggesting an efficiency and safety benefit to the combination of inventions.

Agarwal et al. and Donderici do not explicitly disclose that identifying additional training data comprises: comparing, for each traffic scenario, driving actions determined based on predictions of the machine learning based model with corresponding driving recommendations from the ground truth mapping; and selecting traffic scenarios where the driving actions differ from the driving recommendations by more than a threshold as the additional training data; and sending control signals to controls of the autonomous vehicle to navigate the autonomous vehicle, wherein the control signals are determined based on an output obtained by executing the trained machine learning based model to process sensor data collected by the autonomous vehicle.

Lin et al. teach comparing, for each traffic scenario, driving actions determined based on predictions of the machine learning based model with corresponding driving recommendations from the ground truth mapping (deviation of the behavior of the vehicle in its current state from the corresponding desired human driving behavior; difference between the speed at the corresponding time step between the actual vehicle and the human driving model, [0054]; deviation of the trajectory learned from a first iteration of the reinforcement learning process as compared to the desired human driving trajectory, [0055]); and selecting traffic scenarios where the driving actions differ from the driving recommendations by more than a threshold as the additional training data (“[T]he differences of the speeds (or other vehicle operational data) at the corresponding time steps between the actual vehicle and the human driving model module 175 are calculated and compared to the acceptable error or variance threshold. These differences represent the deviation of the behavior of the vehicle in its current state from the corresponding desired human driving behavior. This deviation can be denoted the reward or penalty corresponding to the current behavior of the vehicle. If the difference between the speed at a corresponding time step between the actual vehicle and the human driving model module 175 is above or greater than the acceptable error or variance threshold (e.g., a penalty condition), corresponding parameters within the human driving model module 175 are updated or trained to reduce the error or variance and cause the difference between the speed at the corresponding time step between the actual vehicle and the human driving model module 175 to be below, less than, or equal to the acceptable error or variance threshold (e.g., a reward condition). The updated vehicle control parameters can be provided as an output to the human driving model module 175 and another iteration of the simulation or actual on-the-road training can be performed”, [0054]; “As shown in FIG. 6, a modeled vehicle is initially on a trajectory learned from a first iteration of the reinforcement learning process as described above. In this first iteration, the deviation of the trajectory learned from a first iteration of the reinforcement learning process as compared to the desired human driving trajectory (shown by example in FIG. 6) is determined and used to update the parameters in the human driving model module 175 as described above. A next (second) iteration of the reinforcement learning process is performed. As shown in FIG. 6, the trajectory learned from the second iteration of the reinforcement learning process produces a deviation from the desired human driving trajectory that is less than the deviation from the first iteration”, [0055]) [updating parameters interpreted as “additional training data”]; and sending control signals to controls of the autonomous vehicle to navigate the autonomous vehicle, wherein the control signals are determined based on an output obtained by executing the trained machine learning based model to process sensor data collected by the autonomous vehicle (“The trajectory or motion control command can be used by an autonomous vehicle control subsystem, as another one of the subsystems of vehicle subsystems 140. In an example embodiment, the in-vehicle control system 150 can generate a vehicle control command signal, which can be used by a subsystem of vehicle subsystems 140 to cause the vehicle to traverse the generated trajectory or move in a manner corresponding to the motion control command. The autonomous vehicle control subsystem, for example, can use the real-time generated trajectory and vehicle motion control command signal to safely and efficiently navigate the vehicle 105 through a real world driving environment while avoiding obstacles and safely controlling the vehicle.”, [0016]; “In some embodiments, the autonomous control unit may be configured to incorporate data from the vehicle speed control module 200, the GPS transceiver, the RADAR, the LIDAR, the cameras, and other vehicle subsystems to determine the driving path or trajectory for the vehicle 10”, [0029]; the state of a vehicle in simulation or actual operation can be determined by capturing vehicle operational data, e.g., the vehicle’s position, velocity, acceleration, incremental speed, target speed, and a variety of other vehicle operational data; “The updated vehicle control parameters can be provided as an output to the human driving model module 175 and another iteration of the simulation or actual on-the-road training can be performed”, [0054]).

Agarwal et al., Donderici, and Lin et al. are in the same art of UAVs (Agarwal et al., [0025]; Donderici, abstract; Lin et al., abstract). The combination of Lin et al. with Agarwal et al. and Donderici will enable updating training data. It would have been obvious to combine the training update of Lin et al. with the invention of Agarwal et al. and Donderici, as this was known at the time of filing, the combination would have predictable results, and as Lin et al. indicate, “The example embodiments enable the autonomous vehicle to perform acceleration and deceleration modeled like a corresponding natural human driving behavior by use of a reinforcement learning framework to learn how to tune these parameters in a manner corresponding to the natural human driving behavior” ([0004]), which will improve the comfort and safety of the autonomous driving experience.
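The limitation the examiner maps to Lin, selecting scenarios whose model-derived action deviates from the ground-truth recommendation by more than a threshold, reduces to a short mining loop. A hedged sketch of that limitation only; the function names and the numeric action encoding (e.g., target speed) are assumptions, since none of the cited references publishes code:

```python
def select_additional_training_data(scenarios, predict_action, ground_truth,
                                    threshold=0.5):
    """Keep traffic scenarios where the model-derived driving action deviates
    from the stored ground-truth recommendation by more than `threshold`
    (here both are assumed to be numeric, e.g., target speeds in m/s)."""
    selected = []
    for scenario in scenarios:
        predicted = predict_action(scenario)        # from the trained model
        recommended = ground_truth[scenario["id"]]  # annotator recommendation
        if abs(predicted - recommended) > threshold:
            selected.append(scenario)               # becomes new training data
    return selected
```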
Regarding claims 3, 10, and 17, Agarwal et al., Donderici, and Lin et al. disclose the method, non-transitory CRM, and computer system of claims 1, 8, and 15 above. Donderici further indicates determining the measure of driving quality for each of a plurality of traffic scenarios, and identifying one or more traffic scenarios having the measure of driving quality below a threshold value, wherein the additional training data corresponds to the one or more traffic scenarios (“For example, the planning stack 118 can receive the location, speed, and direction of the AV 102, geospatial data, data regarding objects sharing the road with the AV 102 (e.g., pedestrians, bicycles, vehicles, ambulances, buses, cable cars, trains, traffic lights, lanes, road markings, etc.) or certain events occurring during a trip (e.g., emergency vehicle blaring a siren, intersections, occluded areas, street closures for construction or street repairs, double-parked cars, etc.), traffic rules and other safety standards or practices for the road, user input, and other relevant data for directing the AV 102 from one point to another and outputs from the perception stack 112, localization stack 114, and prediction stack 116. The planning stack 118 can determine multiple sets of one or more mechanical operations that the AV 102 can perform (e.g., go straight at a specified rate of acceleration, including maintaining the same speed or decelerating; turn on the left blinker, decelerate if the AV is above a threshold range for turning, and turn left; turn on the right blinker, accelerate if the AV is stopped or below the threshold range for turning, and turn right; decelerate until completely stopped and reverse; etc.), and select the best one to meet changing road conditions and events”, [0023]; “A new model (e.g., a neural network) is trained based on the error augmented training data 250, and the new model is tested extensively using various techniques to ensure that the new model exceeds the performance of the previous model and generalizes well to the nearly infinite variety of scenarios found in the various datasets. The model can also be simulated in a virtual environment and analyzed for performance. Once the new model has been accurately tested, the new model can be deployed in an AV to record driving data 210. The CLM 200 is a continual feedback loop that provides continued growth and learning to provide accurate models for an AV to implement”, [0039]; checks that test the accuracy of perception, predictions, tracking, planning with the deep-learning kernels for predetermined test scenarios, [0053]; “In some embodiments, generating 520 an update may comprise re-training the deep neural network based additional training data which are chosen from scenarios where accuracy of the perception or predictions are low”, [0059]).

Regarding claims 4, 11, and 18, Agarwal et al., Donderici, and Lin et al. disclose the method, non-transitory CRM, and computer system of claims 1, 8, and 15 above. Agarwal et al. and Donderici further indicate a particular traffic scenario corresponding to a video frame is associated with a filtering criteria based on one or more attributes associated with the autonomous vehicle when the video frame was captured (Agarwal et al., with regard to pedestrian attentiveness, the dataset may focus on annotations relating to the attention of pedestrians while the ego-vehicle is approaching (e.g., within a threshold distance, etc.); in other words, for the pedestrian attentiveness, the system 100 may select a subset of scenes from the dataset used for the risk object identification, so the subset includes scenes that the driver is influenced by pedestrians, [0040]; Donderici, the planning stack 118 can determine multiple sets of one or more mechanical operations that the AV 102 can perform (e.g., go straight at a specified rate of acceleration, including maintaining the same speed or decelerating; turn on the left blinker, decelerate if the AV is above a threshold range for turning, and turn left; turn on the right blinker, accelerate if the AV is stopped or below the threshold range for turning, and turn right; decelerate until completely stopped and reverse; etc.), and select the best one to meet changing road conditions and events; if something unexpected happens, the planning stack 118 can select from multiple backup plans to carry out; for example, while preparing to change lanes to turn right at an intersection, another vehicle may aggressively cut into the destination lane, making the lane change unsafe; the planning stack 118 could have already determined an alternative plan for such an event; upon its occurrence, it could help direct the AV 102 to go around the block instead of blocking a current lane while waiting for an opening to change lanes, [0023]).

Regarding claims 5, 12, and 19, Agarwal et al., Donderici, and Lin et al. disclose the method, non-transitory CRM, and computer system of claims 4, 11, and 18 above. Agarwal et al. and Donderici further indicate an attribute used in the filtering criteria for the particular traffic scenario describes a movement of the autonomous vehicle when the video frame was captured by a camera mounted on the autonomous vehicle (Agarwal et al., vehicle system including cameras, [0026], [0033]; autonomous driving system, camera system, [0034]; Donderici, generating an update for a continuous deep learning neural network on-board the autonomous vehicle, abstract; autonomous vehicle (AV) management system, [0015]; environment captured by cameras, [0020]; “The planning stack 118 can determine how to maneuver or operate the AV 102 safely and efficiently in its environment. For example, the planning stack 118 can receive the location, speed, and direction of the AV 102,” and “The planning stack 118 can determine multiple sets of one or more mechanical operations that the AV 102 can perform (e.g., go straight at a specified rate of acceleration, including maintaining the same speed or decelerating; turn on the left blinker, decelerate if the AV is above a threshold range for turning, and turn left; turn on the right blinker, accelerate if the AV is stopped or below the threshold range for turning, and turn right; decelerate until completely stopped and reverse; etc.), and select the best one to meet changing road conditions and events”, [0023]) [“mounted” implied, as the camera is part of the vehicle system capturing the environment].

Regarding claims 6 and 13, Agarwal et al., Donderici, and Lin et al. disclose the method and non-transitory CRM of claims 4 and 11 above. Agarwal et al. further indicate an attribute used in the filtering criteria for the particular traffic scenario describes a traffic entity displayed in the video frame (Agarwal et al., with regard to pedestrian attentiveness, the dataset may focus on annotations relating to the attention of pedestrians while the ego-vehicle is approaching (e.g., within a threshold distance, etc.); in other words, for the pedestrian attentiveness, the system 100 may select a subset of scenes from the dataset used for the risk object identification, so the subset includes scenes that the driver is influenced by pedestrians, [0040]) [pedestrian = entity].

Regarding claims 7, 14, and 20, Agarwal et al., Donderici, and Lin et al. disclose the method, non-transitory CRM, and computer system of claims 4, 11, and 18 above. Agarwal et al. and Donderici further indicate the autonomous vehicle was at a location on a road when the video frame was captured by a camera mounted on the autonomous vehicle (Agarwal et al., vehicle system including cameras, [0026], [0033]; autonomous driving system, camera system, [0034]; Donderici, generating an update for a continuous deep learning neural network on-board the autonomous vehicle, abstract; autonomous vehicle (AV) management system, [0015]; environment captured by cameras, [0020]) [“mounted” implied, as the camera is part of the vehicle system capturing the environment], wherein an attribute used in the filtering criteria for the particular traffic scenario describes a configuration of the road near the location (Agarwal et al., the environment may include a straight topology, a three-way intersection topology, or a four-way intersection topology; the situation may include a stop sign, a traffic light, a crossing pedestrian, a crossing vehicle, a vehicle blocking ego lane, a congestion, a jaywalking, a vehicle backing into parking space, a vehicle on shoulder open door, or a cut-in, [0004]; identify road agents that influence the driver in risky situations; data set having road topology of the scene, e.g., a 4-way intersection, [0028]; pedestrian awareness data may be focused on intersection scenarios where diverse interactions between drivers and pedestrians are present, [0033]; the dataset may include data which is categorized or manually categorized, and include a driver intention, a road topology, a situation, a decision of a driver, and a pedestrian awareness for each clip of data; according to one aspect, automatic situation localization in untrimmed videos may be explored using the proposed dataset, [0035]; Donderici, planning stack, certain events occurring during a trip, e.g., intersections; “For example, while preparing to change lanes to turn right at an intersection, another vehicle may aggressively cut into the destination lane, making the lane change unsafe”, [0023]; the intersections layer can include geospatial information of intersections (e.g., crosswalks, stop lines, turning lane centerlines and/or boundaries, etc.) and related attributes (e.g., permissive, protected/permissive, or protected only left turn lanes; legal or illegal u-turn lanes; permissive or protected only right turn lanes; etc.); the traffic controls lane can include geospatial information of traffic signal lights, traffic signs, and other road objects and related attributes, [0026]).

Claims 2, 9, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Agarwal et al. (US 20220144303 A1), Donderici (US 20230252280 A1), and Lin et al. (US 20190072960 A1) as applied to claims 1, 8, and 15 above, and further in view of Maat et al. (US 20200249677 A1).

Regarding claims 2, 9, and 16, Agarwal et al., Donderici, and Lin et al. disclose the method, non-transitory CRM, and computer system of claims 1, 8, and 15 above. Agarwal et al., Donderici, and Lin et al. do not disclose that training the machine learning based model comprises: generating statistical information describing the first set of annotations; and training the machine learning based model based on the first set of video frames and corresponding statistical information, wherein the machine learning based model predicts statistical information describing a state of mind of a traffic entity shown in an input video frame.

Maat et al. teach training the machine learning based model comprises: generating statistical information describing the first set of annotations; and training the machine learning based model based on the first set of video frames and corresponding statistical information, wherein the machine learning based model predicts statistical information describing a state of mind of a traffic entity shown in an input video frame (“As an example, the system requests user responses from users observing images of traffic entities, wherein each user response is one of a plurality of values. Each value from the plurality of values corresponds to a rating that the user provides to a hidden context of the traffic entity. For example, the user response may be a value between 1 to 5 (e.g., each user response is one value selected from the values 1, 2, 3, 4, and 5). The user response indicates, what the user believes is the value of the hidden context attribute, for example, on a scale from 1-5, a number indicating how likely the user believes, a pedestrian is likely to cross the street or how likely a bicyclist is aware of a vehicle”, [0046]; “In an embodiment, the neural network 120 is a probabilistic neural network that may generate different outputs for the same input if the neural network is executed repeatedly. However, the outputs generated have a particular statistical distribution, for example, mean and standard deviation. The training process adjusts the parameters of the neural network so that the statistical distribution of the predicted output matches the statistical distribution of the labels in the training dataset. The statistical distribution is determined by parameters of the neural network that can be adjusted to generate different statistical distributions. In an embodiment, the feature extraction component generates features such that each feature value is associate with a statistical distribution, for example, mean and standard deviation values”, [0047]).

Agarwal et al., Donderici, and Maat et al. are in the same art of UAVs (Agarwal et al., [0025]; Donderici, abstract; Maat et al., abstract). The combination of Maat et al. with Agarwal et al., Donderici, and Lin et al. will enable generating statistical information describing the first set of annotations. It would have been obvious to combine the statistical information of Maat et al. with the invention of Agarwal et al., Donderici, and Lin et al., as this was known at the time of filing, the combination would have predictable results, and as Maat et al. indicate, “Hidden context includes factors that affect the behavior of such traffic entities, for example, a state of mind of a pedestrian, a degree of awareness of the existence of the autonomous vehicle in the vicinity (for example, whether a bicyclist is aware of the existence of the autonomous vehicle in the proximity of the bicyclist), and so on. The system uses the hidden context to predict behavior of people near a vehicle in a way that more closely resembles how human drivers would judge the behavior” ([0027]), thus providing the benefit of making the autonomous vehicle behave more like a human driver, which will theoretically improve the safety of the invention.

Conclusion

Applicant’s amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHELLE M ENTEZARI HAUSMANN, whose telephone number is (571) 270-5084. The examiner can normally be reached 10-7 M-F. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vincent M Rudolph, can be reached at (571) 272-8243. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MICHELLE M ENTEZARI HAUSMANN/
Primary Examiner, Art Unit 2671
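For claims 2, 9, and 16, the Maat-style "statistical information" limitation amounts to aggregating per-frame annotator ratings into distribution parameters used as training labels. A minimal sketch under that reading; the 1-5 rating scale comes from Maat's [0046], while the function name and the example data are illustrative assumptions:

```python
from statistics import mean, stdev

def annotation_statistics(ratings_by_frame):
    """Summarize per-frame user ratings (e.g., 1-5 'likely to cross' scores)
    as mean/standard-deviation labels for training a probabilistic model."""
    return {
        frame: {"mean": mean(r), "std": stdev(r) if len(r) > 1 else 0.0}
        for frame, r in ratings_by_frame.items()
    }

# Three users rate how likely a pedestrian is to cross, on Maat's 1-5 scale.
print(annotation_statistics({"frame_001": [4, 5, 4]}))
# -> mean ~4.33, std ~0.58 (sample standard deviation)
```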

Prosecution Timeline

Apr 27, 2023: Application Filed
Jul 26, 2025: Non-Final Rejection — §103
Jan 29, 2026: Response Filed
Mar 07, 2026: Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602775: INTERPOLATION OF MEDICAL IMAGES (granted Apr 14, 2026; 2y 5m to grant)
Patent 12602793: Systems and Methods for Predicting Object Location Within Images and for Analyzing the Images in the Predicted Location for Object Tracking (granted Apr 14, 2026; 2y 5m to grant)
Patent 12602949: SYSTEM AND METHOD FOR DETECTING HUMAN PRESENCE BASED ON DEPTH SENSING AND INERTIAL MEASUREMENT (granted Apr 14, 2026; 2y 5m to grant)
Patent 12597261: OBJECT MOVEMENT BEHAVIOR LEARNING (granted Apr 07, 2026; 2y 5m to grant)
Patent 12597244: METHOD AND DEVICE FOR IMPROVING OBJECT RECOGNITION RATE OF SELF-DRIVING CAR (granted Apr 07, 2026; 2y 5m to grant)
Based on this examiner's 5 most recent grants; study what changed in each case to get past this examiner.

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.

Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 76%
Grant Probability With Interview: 98% (+21.6%)
Median Time to Grant: 3y 1m
PTA Risk: Moderate
Based on 863 resolved cases by this examiner. Grant probability derived from career allow rate.
