DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of Claims
Claims 1-20 are pending in this application.
Claims 1, 6, and 17 are amended.
Claims 1-20 are presented for examination.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 25 September 2025 and 29 August 2025 are being considered by the examiner.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Ramamoorthy et al. (US Publication 2021/0370980 A1) in view of Crossman et al. (US Publication 2024/0194076 A1).
Regarding claim 1, Ramamoorthy teaches a method for controlling a vehicle, comprising: during a first time period: sampling a first set of measurements of a set of environmental objects in an environment of the vehicle (Ramamoorthy: Para. 60; object tracking may be applied to the sensor inputs, in order to track at least one external actor in the encountered driving scenario, and thereby determine an observed trace of the external actor over a time interval); and based on the first set of measurements, determining a set of risks associated with the set of environmental objects (Ramamoorthy: Para. 47, 62; observed trace may be used to predict a current maneuver and/or a future maneuver of the external actor); and during a second time period: based on the set of risks, determining a plurality of actions (Ramamoorthy: Para. 34, 48; a sequence of multiple maneuvers may be determined for at least one goal); determining a first combination of actions from the plurality of actions (Ramamoorthy: Para. 185; the AV planner selects one of the expanded paths determined to be most promising, and generates control signals for controlling the AV to execute the corresponding sequence of manoeuvres in the real-world driving scenario it has encountered); determining a second combination of actions from the plurality of actions, the first combination of actions and the second combination of actions differing by at least one action (Ramamoorthy: Para. 161; if the vehicle is at a particular location relative to a T-junction (corresponding to the parent state), there may be three possible manoeuvres, to stop, turn left, and turn right, but continuing straight would not be an option); determining a first set of vehicle control policies according to the first combination of actions (Ramamoorthy: Para. 185; the AV planner selects one of the expanded paths determined to be most promising, and generates control signals for controlling the AV to execute the corresponding sequence of manoeuvres in the real-world driving scenario it has encountered); determining a second set of vehicle control policies according to the second combination of actions (Ramamoorthy: Para. 185; the AV planner selects one of the expanded paths determined to be most promising, and generates control signals for controlling the AV to execute the corresponding sequence of manoeuvres in the real-world driving scenario it has encountered); sampling a second set of measurements of the set of environmental objects in an environment of the vehicle (Ramamoorthy: Para. 119, 187; sensor data from which it is possible to extract detailed information about the surrounding environment and the state of the AV and other actors; when a new actor is detected during an ongoing MCTS procedure); based on the second set of measurements: performing a set of first forward simulations of the set of environmental objects according to the first set of vehicle control policies (Ramamoorthy: Para. 398; choosing a maneuver for the ego vehicle; collision checker is applied to check whether the ego vehicle collides with any of the other vehicles during the forward simulation); and ………. ; based on a comparison of the set of first forward simulations and the set of second forward simulations, selecting one of the first set of vehicle control policies and the second set of vehicle control policies to be executed to control the vehicle (Ramamoorthy: Para. 185-186; the AV planner A6 selects one of the expanded paths determined to be most promising; the most promising path may be the path having a maximum score); and controlling the vehicle according to the selected set of vehicle control policies (Ramamoorthy: Para. 186; generates control signals (E12) for controlling the AV to execute the corresponding sequence of manoeuvres in the real-world driving scenario it has encountered).
Ramamoorthy does not explicitly teach performing a set of second forward simulations of the set of environmental objects according to the second set of vehicle control policies.
However, Crossman, in the same field of endeavor, teaches performing a set of second forward simulations of the set of environmental objects according to the second set of vehicle control policies (Crossman: Para. 46; forward simulations, which examine future scenarios for the ego vehicle and objects in its environment, such as in an event that the ego vehicle performs a certain policy).
It would have been obvious to one having ordinary skill in the art to modify the forward simulation of potential trajectories (Ramamoorthy: Para. 398) to step forward in time while assessing a certain policy (Crossman: Para. 46), with a reasonable expectation of success, because the risk severity in conflict zones found through forward simulation helps select a safe vehicle trajectory (Crossman: Para. 58, 60).
Regarding claim 2, Ramamoorthy teaches the method of Claim 1, wherein determining the set of risks comprises aggregating metrics from multiple simulations of the set of environmental objects, wherein the multiple simulations are performed during a same policy election cycle (Ramamoorthy: Para. 131; determining a globally optimal sequence of manoeuvres to be taken in the encountered driving scenario to execute a defined goal (i.e. achieve a desired outcome, such as reaching a particular location on the map)).
Regarding claim 3, Ramamoorthy teaches the method of Claim 1, wherein the set of risks includes a first risk identified at a first timestep and a second risk identified at a second timestep distinct from the first timestep (Ramamoorthy: Para. 155, 239; certain nodes are terminating nodes; the vehicle crashing or otherwise failing for safety reasons; the trace of the other vehicle as actually observed over the time period is matched to the distribution of paths associated with the goal in question for that time period).
Regarding claim 4, Ramamoorthy teaches the method of Claim 1, wherein the plurality of actions are determined based on a predetermined mapping between each risk and a respective set of actions (Ramamoorthy: Para. 234; probabilistic risk of collision along a given trajectory is calculated, and used to rank order the candidate trajectories by safety).
Regarding claim 5, Ramamoorthy teaches the method of Claim 1, wherein determining the first combination of actions comprises amending the first combination of actions such that the actions of the first combination are compatible (Ramamoorthy: Para. 349; step 1008, for each goal G.sub.1, G.sub.2, the goal likelihood L(O|G) is computed in terms of the cost penalty, i.e. difference between cost of optimal plan computed at step 1004 and cost of best available plan computed at step 1006 for that goal).
Regarding claim 6, Ramamoorthy teaches a method for controlling a vehicle, comprising: determining a set of risks, each risk associated with at least one object of a set of objects in an environment of the vehicle (Ramamoorthy: Para. 47, 62; observed trace may be used to predict a current maneuver and/or a future maneuver of the external actor; determining a set of possible maneuvers for the external actor in the encountered driving scenario); based on the set of risks, selecting a plurality of actions from a predetermined set of actions (Ramamoorthy: Para. 34, 48; a sequence of multiple maneuvers may be determined for at least one goal); from the plurality of actions, selecting a first combination of actions and a second combination of actions, wherein the first combination of actions and the second combination of actions are different (Ramamoorthy: Para. 161; if the vehicle is at a particular location relative to a T-junction (corresponding to the parent state), there may be three possible manoeuvres, to stop, turn left, and turn right, but continuing straight would not be an option); based on the first combination of actions, constructing a first set of policies for controlling the vehicle (Ramamoorthy: Para. 185; the AV planner selects one of the expanded paths determined to be most promising, and generates control signals for controlling the AV to execute the corresponding sequence of manoeuvres in the real-world driving scenario it has encountered); simulating the first set of policies applied to the set of objects in the environment of the vehicle (Ramamoorthy: Para. 398; choosing a maneuver for the ego vehicle; collision checker is applied to check whether the ego vehicle collides with any of the other vehicles during the forward simulation); determining a first set of metrics associated with the first set of policies (Ramamoorthy: Para. 158; score assigned to each possible path through the game tree is simply the score assigned to its terminating node; indicates a desirability of the outcome it represents); based on the second combination of actions, constructing a second set of policies for controlling the vehicle (Ramamoorthy: Para. 185; the AV planner selects one of the expanded paths determined to be most promising, and generates control signals for controlling the AV to execute the corresponding sequence of manoeuvres in the real-world driving scenario it has encountered); ……….. ; determining a second set of metrics associated with the second set of policies (Ramamoorthy: Para. 158; score assigned to each possible path through the game tree is simply the score assigned to its terminating node; indicates a desirability of the outcome it represents); based on a comparison of the first set of metrics and second set of metrics, selecting one of the first set of policies and the second set of policies to be executed to control the vehicle (Ramamoorthy: Para. 185-186; the AV planner A6 selects one of the expanded paths determined to be most promising; the most promising path may be the path having a maximum score); and controlling the vehicle according to the selected set of policies (Ramamoorthy: Para. 186; generates control signals (E12) for controlling the AV to execute the corresponding sequence of manoeuvres in the real-world driving scenario it has encountered).
Ramamoorthy does not explicitly teach simulating the second set of policies applied to the set of objects in the environment of the vehicle.
However, Crossman, in the same field of endeavor, teaches simulating the second set of policies applied to the set of objects in the environment of the vehicle (Crossman: Para. 46; forward simulations, which examine future scenarios for the ego vehicle and objects in its environment, such as in an event that the ego vehicle performs a certain policy).
It would have been obvious to one having ordinary skill in the art to modify the forward simulation of potential trajectories (Ramamoorthy: Para. 398) to step forward in time while assessing a certain policy (Crossman: Para. 46), with a reasonable expectation of success, because the risk severity in conflict zones found through forward simulation helps select a safe vehicle trajectory (Crossman: Para. 58, 60).
Regarding claim 7, Ramamoorthy teaches the method of Claim 6, wherein the set of risks are determined using a forward simulation of the vehicle implementing a set of policies determined at a prior timestep (Ramamoorthy: Para. 398; after choosing a maneuver for the ego vehicle, the environment is simulated forward until the end of the maneuver; includes forward-simulating the trajectories of other cars to the same point in time as the ego vehicle; check whether the ego vehicle collides with any of the other vehicles during the forward simulation).
Regarding claim 8, Ramamoorthy teaches the method of Claim 7, wherein the set of risks are determined by aggregating metrics from a plurality of distinct forward simulations of sets of policies (Ramamoorthy: Para. 209; driving area is divided into grid cells, and occupation probabilities for grid cells and/or transition probabilities between grid cells are determined through long-term observation).
Regarding claim 9, Ramamoorthy teaches the method of Claim 8, wherein the first set of policies and second set of policies are determined at the prior timestep (Ramamoorthy: Para. 54, 398, 411; collision checker is applied to check whether the ego vehicle collides with any of the other vehicles during the forward simulation; cost factors can be computed iteratively by stepping forward in time through the trajectory).
Regarding claim 10, Ramamoorthy teaches the method of Claim 7, wherein the set of risks include risks identified at multiple different timesteps (Ramamoorthy: Para. 155, 239; certain nodes are terminating nodes; the vehicle crashing or otherwise failing for safety reasons; the trace of the other vehicle as actually observed over the time period is matched to the distribution of paths associated with the goal in question for that time period).
Regarding claim 11, Ramamoorthy teaches the method of Claim 6, wherein the first combination of actions comprises a set of semantic action identifiers with a predetermined mapping to types of risks included in the set of risks, wherein selecting the first combination of actions comprises using the predetermined mapping (Ramamoorthy: Para. 234; probabilistic risk of collision along a given trajectory is calculated, and used to rank order the candidate trajectories by safety).
Regarding claim 12, Ramamoorthy teaches the method of Claim 6, wherein the first combination of actions is determined independently of a subset of risks within the set of risks (Ramamoorthy: Para. 398; if there is a collision, that branch in the search tree is immediately “cut” i.e. no longer explored).
Regarding claim 13, Ramamoorthy teaches the method of Claim 6, wherein the first simulation comprises multiple iterations of simulation using the same first set of policies, wherein behavior of another agent in the environment differs between simulations of the multiple iterations of simulation (Ramamoorthy: Para. 47, 62, 82; observed trace may be used to predict a current maneuver and/or a future maneuver of the external actor; determining a set of possible maneuvers for the external actor in the encountered driving scenario; anticipated behaviour may be simulated by sampling at least one maneuver from the possible maneuvers based on a maneuver distribution determined for the external agent).
Regarding claim 14, Ramamoorthy teaches the method of Claim 6, wherein selecting the first combination of actions comprises determining a compatibility of actions within the first combination of actions (Ramamoorthy: Para. 349; step 1008, for each goal G.sub.1, G.sub.2, the goal likelihood L(O|G) is computed in terms of the cost penalty, i.e. difference between cost of optimal plan computed at step 1004 and cost of best available plan computed at step 1006 for that goal).
Regarding claim 15, Ramamoorthy teaches the method of Claim 6, wherein the first set of policies is determined independently of a subset of risks within the set of risks (Ramamoorthy: Para. 398; if there is a collision, that branch in the search tree is immediately “cut” i.e. no longer explored).
Regarding claim 16, Ramamoorthy teaches the method of Claim 6, wherein determining the set of risks comprises estimating a kinetic energy associated with avoiding a collision (Ramamoorthy: Para. 155, 398; a sequence of manoeuvres may be required to reach a terminating node; a point at which the defined goal is determined to have failed; collision checker is applied to check whether the ego vehicle collides with any of the other vehicles during the forward simulation).
Regarding claim 17, Ramamoorthy does not explicitly teach wherein a first risk of the set of risks is a risk associated with a conflict zone in the environment, wherein the first risk is associated with one or more of: a location of the conflict zone; a future time of reaching the conflict zone; an estimated probability of the risk; and an estimated severity of the risk.
However, Crossman, in the same field of endeavor, teaches wherein a first risk of the set of risks is a risk associated with a conflict zone in the environment, wherein the first risk is associated with one or more of: a location of the conflict zone; a future time of reaching the conflict zone; an estimated probability of the risk; and an estimated severity of the risk (Crossman: Para. 58, 60; checking for conflict zones includes checking to see if the ego vehicle is approaching and/or within an intersection (e.g., as shown in FIG. 3A, within the forward simulation time, at a time in which another object is predicted to be in the conflict zone, etc.); severity can be determined based on the distance of the ego vehicle from the conflict zone, the speed of the objects, the headings of the objects).
It would have been obvious to one having ordinary skill in the art to modify the forward simulation of potential trajectories (Ramamoorthy: Para. 398) to step forward in time while assessing a certain policy (Crossman: Para. 46), with a reasonable expectation of success, because the risk severity in conflict zones found through forward simulation helps select a safe vehicle trajectory (Crossman: Para. 58, 60).
Regarding claim 18, Ramamoorthy does not explicitly teach wherein selecting the first combination of actions comprises using information selected from a set consisting of: the location of the conflict zone; the future time of reaching the conflict zone; the estimated probability of the risk; and the estimated severity of the risk.
However, Crossman, in the same field of endeavor, teaches wherein selecting the first combination of actions comprises using information selected from a set consisting of: the location of the conflict zone; the future time of reaching the conflict zone; the estimated probability of the risk; and the estimated severity of the risk (Crossman: Para. 58, 60; checking for conflict zones includes checking to see if the ego vehicle is approaching and/or within an intersection (e.g., as shown in FIG. 3A, within the forward simulation time, at a time in which another object is predicted to be in the conflict zone, etc.); severity can be determined based on the distance of the ego vehicle from the conflict zone, the speed of the objects, the headings of the objects).
It would have been obvious to one having ordinary skill in the art to modify the forward simulation of potential trajectories (Ramamoorthy: Para. 398) to step forward in time while assessing a certain policy (Crossman: Para. 46), with a reasonable expectation of success, because the risk severity in conflict zones found through forward simulation helps select a safe vehicle trajectory (Crossman: Para. 58, 60).
Regarding claim 19, Ramamoorthy teaches the method of Claim 6, wherein the first set of policies comprises a first set of constraints distinct from a second set of constraints of the second set of policies (Ramamoorthy: Para. 398, 411; collision checker is applied to check whether the ego vehicle collides with any of the other vehicles during the forward simulation; cost factors can be computed iteratively by stepping forward in time through the trajectory).
Regarding claim 20, Ramamoorthy teaches the method of Claim 6, wherein the first set of policies comprises a vehicle controller (Ramamoorthy: Para. 186; generates control signals (E12) for controlling the AV to execute the corresponding sequence of manoeuvres in the real-world driving scenario it has encountered).
Response to Arguments
Applicant’s arguments, filed 11 February 2026, with respect to the rejection of claims 1-20 under 35 U.S.C. 103 have been fully considered, but they are not persuasive.
The applicant’s attorney argues that Ramamoorthy does not disclose a “set of risks encounterable by the vehicle associated with the set of environmental objects.”
In response to the applicant’s argument above, the applicant’s specification includes “determining a set of risks, wherein each risk is associated with at least one object of a set of objects in an environment surrounding the vehicle” (Specification: Para. 16). Ramamoorthy teaches that object tracking based on sensor inputs tracks at least one external actor in the encountered driving scenario over a time interval (Ramamoorthy: Para. 60). This is an example of collecting a first set of measurements over a first period of time. A risk associated with an environmental object is the predicted path of a vehicle or pedestrian around the vehicle. A set of risks comprises the predicted paths for more than one vehicle or pedestrian in the environmental surroundings. Ramamoorthy teaches object tracking that determines an observed trace of at least one external actor (Ramamoorthy: Para. 60), which is used to predict a future maneuver of the external actor (Ramamoorthy: Para. 47). The predicted path covers a second, later period of time and is estimated from the data obtained in the first period of time. External actors such as vehicles or pedestrians (Ramamoorthy: Para. 119) are a set of environmental objects. The predicted paths of at least one external vehicle or pedestrian are a set of risks.
The applicant next argues that Ramamoorthy does not disclose “based on a comparison of the set of first forward simulations and the set of second forward simulations, selecting one of the first set of vehicle control policies and the second set of vehicle control policies to be executed to control the vehicle; and controlling the vehicle according to the selected set of vehicle control policies.”
In response to the applicant’s argument above, Ramamoorthy teaches an AV planner that makes various high-level decisions and then increasingly lower-level decisions that are needed to implement the higher-level decisions (Ramamoorthy: Para. 129). The prior art teaches that a fully-constructed game tree data structure captures every possible outcome (Ramamoorthy: Para. 134). Ramamoorthy teaches a Monte Carlo Tree Search where not all possibilities are fully explored. This method allows for a convergence towards a sufficiently optimized solution given a reasonable amount of time and computational resources (Ramamoorthy: Para. 135). Ramamoorthy teaches that possible manoeuvres for the ego vehicle are hypothesized given its current state and the defined goal to be executed (Ramamoorthy: Para. 144), and that performance of the possible manoeuvres is simulated using an action policy (Ramamoorthy: Para. 149) to determine which possibilities successfully execute the defined goal and which fail, such as by being forced to abort, making insufficient progress, crashing, or failing for safety reasons. Each possibility is assigned a score based on success, and the failing possibilities are terminated (Ramamoorthy: Para. 155, 158). All of this uses forward simulation calculations (Ramamoorthy: Para. 398).
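The hypothesize, simulate, score, and terminate sequence summarized above can be illustrated with a minimal sketch. It is not Ramamoorthy's implementation: it enumerates every manoeuvre sequence exhaustively instead of sampling as a Monte Carlo Tree Search would, and the one-dimensional lane, the manoeuvre names, the `occupied` prediction table, and the scoring rule are all assumptions made only for illustration.

```python
from itertools import product

# Hypothetical toy world: the ego vehicle advances along a 1-D lane of cells.
# Each manoeuvre maps to the number of cells advanced in one timestep.
MANOEUVRES = {"stop": 0, "slow": 1, "fast": 2}

def forward_simulate(sequence, ego_start, occupied, goal):
    """Step one manoeuvre sequence forward in time.

    `occupied` maps a timestep to the cell another actor is predicted to
    occupy at that timestep (an assumed stand-in for a predicted path).
    Returns None if the branch is cut by a collision, otherwise a score.
    """
    ego = ego_start
    for step, name in enumerate(sequence, start=1):
        new_ego = ego + MANOEUVRES[name]
        swept = range(ego, new_ego + 1)      # cells touched during this step
        if occupied.get(step) in swept:      # collision check: cut the branch
            return None
        ego = new_ego
        if ego >= goal:                      # terminating node: goal reached
            return 1.0 / step                # earlier arrival scores higher
    return 0.0                               # terminating node: insufficient progress

def best_sequence(depth, **world):
    """Score every surviving sequence and return the highest-scoring one."""
    scored = {}
    for seq in product(MANOEUVRES, repeat=depth):
        score = forward_simulate(seq, **world)
        if score is not None:
            scored[seq] = score
    return max(scored, key=scored.get), scored

# Another actor is predicted to occupy cell 2 during timesteps 1 and 2.
best, scored = best_sequence(3, ego_start=0, occupied={1: 2, 2: 2}, goal=3)
print(best, scored[best], len(scored))
```

In this toy scenario 18 of the 27 sequences are cut by the collision check, and the selected sequence waits for the other actor to clear the cell before advancing. Two sequences tie for the best score, and `max` simply returns the first one encountered.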
For a first set of vehicle control policies, the model looks at available manoeuvres for the vehicle traversing the current environment and achieving the goal (Ramamoorthy: Para. 161-168). Evaluating all possibilities of the vehicle, with direction, speed, heading, timing, road layout, and road constraints, for success and safety would lead to a long and processor-intensive process. Some of these possibilities will be terminated in this step. Ramamoorthy teaches expanded paths of corresponding sequences of manoeuvres in the real-world driving scenario, a first set of all possible options (Ramamoorthy: Para. 185). The system only selects the realistically achievable manoeuvres that are hypothesized further for vehicle controls (Ramamoorthy: Para. 161) as a second set of realistic options, which tends not to include all of the first set.
Crossman teaches a set of forward simulations that examine the future scenarios for the ego vehicle and objects in its environment, such as in an event that the ego vehicle performs a certain behavior or action policy (Crossman: Para. 46).
Ramamoorthy selects the most promising path out of all the expanded paths. Ramamoorthy teaches that information other than score (Ramamoorthy: Para. 185-186), such as Crossman’s action policy (Crossman: Para. 46), may also be taken into account to determine the most promising path, for which the control signals are generated and executed (Ramamoorthy: Para. 185-186).
The broadest reasonable interpretation (BRI) of the applicant’s claim requires choosing a set of vehicle controls to be executed from all the options of a first set of forward simulations and a second set of forward simulations.
Ramamoorthy teaches forward simulations where options are terminated with each stage of additional information: can the car move that way; can the car move that way through the road with external trees, buildings, and road signs; can the car move that way through the road without crashing into an external vehicle or pedestrian; then comparing scores to select the path forward. Crossman teaches forward simulations where the risk of external objects is taken into account, but which include a certain behavior or action policy. Ramamoorthy takes all the options that did not fail in the simulations and compares them by maximum score, while also taking into account the action policy score.
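The staged termination and final comparison described above can be pictured with the following minimal sketch. It is an illustration under stated assumptions, not either reference's actual method: the `Candidate` fields, the three stage names, and the `policy_weight` used to combine the path score with an additional policy score are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    kinematically_feasible: bool    # can the car move that way?
    clear_of_static_objects: bool   # trees, buildings, road signs
    clear_of_actors: bool           # external vehicles and pedestrians
    path_score: float               # assumed score from forward simulation
    policy_score: float = 0.0       # assumed additional policy-level information

# Stages are applied in order; an option failing any stage is terminated.
STAGES = ("kinematically_feasible", "clear_of_static_objects", "clear_of_actors")

def select(candidates, policy_weight=0.5):
    """Terminate failing options stage by stage, then pick the best survivor."""
    survivors = list(candidates)
    for stage in STAGES:
        survivors = [c for c in survivors if getattr(c, stage)]
    if not survivors:
        return None
    # Hypothetical combination rule: path score plus weighted policy score.
    return max(survivors, key=lambda c: c.path_score + policy_weight * c.policy_score)

options = [
    Candidate("yield_then_turn", True, True, True, path_score=0.6, policy_score=0.8),
    Candidate("turn_now", True, True, True, path_score=0.9),
    Candidate("cut_across", True, True, False, path_score=1.0, policy_score=1.0),
]
print(select(options).name)
print(select(options, policy_weight=0.0).name)
```

With the assumed weight, the additional policy score changes which surviving option is selected; with the weight set to zero, selection falls back to the maximum path score alone.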
The applicant’s arguments have failed to point out the distinguishing characteristics of the amended claim language over the prior art. For the above reasons, Ramamoorthy’s forward simulated Monte Carlo Tree Search with Crossman’s forward simulated action policy reads on applicant’s system and method for risk-aware behavior policy selection by an autonomous agent. The rejection is maintained.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LAURA E LINHARDT whose telephone number is (571)272-8325. The examiner can normally be reached on M-TR, M-F: 8am-4pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Angela Ortiz, can be reached on (571) 272-1206. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/L.E.L./Examiner, Art Unit 3663
/ANGELA Y ORTIZ/Supervisory Patent Examiner, Art Unit 3663