Last updated: May 29, 2026
Application No. 18/495,071
DECISION MAKING METHOD AND APPARATUS, AND VEHICLE

Non-Final OA §101§103
Filed
Oct 26, 2023
Priority
Apr 26, 2021 — CN 202110454337.X +1 more
Examiner
ALGEHAIM, MOHAMED A
Art Unit
3668
Tech Center
3600 — Transportation & Electronic Commerce
Assignee
Shenzhen Yinwang Intelligent Technologies Co., Ltd.
OA Round
1 (Non-Final)
This examiner grants 59% of cases after interview (+21.7% interview lift). A telephonic interview to clarify the technical implementation could significantly improve the outcome.
Based on 216 resolved cases, 2023–2026
Examiner Intelligence

ALGEHAIM, MOHAMED A
Grants 59% of resolved cases
Career Allowance Rate
127 granted / 216 resolved
+6.8% vs TC avg
Interview lift: +21.7% (strong), comparing resolved cases with and without an interview
Typical timeline
3y 1m
Avg Prosecution
25 currently pending
Career history
248
Total Applications
across all art units
Statute-Specific Performance

§101: 1.5% (-38.5% vs TC avg)
§103: 93.0% (+53.0% vs TC avg)
§102: 1.8% (-38.2% vs TC avg)
§112: 2.3% (-37.7% vs TC avg)
Tech Center averages are estimates. Based on career data from 216 resolved cases.
Office Action

DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

A claim that recites an abstract idea, a law of nature, or a natural phenomenon is directed to a judicial exception.  Abstract ideas include the following groupings of subject matter, when recited as such in a claim limitation: (a) Mathematical concepts – mathematical relationships, mathematical formulas or equations, mathematical calculations; (b) Certain methods of organizing human activity – fundamental economic principles or practices (including hedging, insurance, mitigating risk); commercial or legal interactions (including agreements in the form of contracts; legal obligations; advertising, marketing or sales activities or behaviors; business relations); managing personal behavior or relationships or interactions between people (including social activities, teaching, and following rules or instructions); and (c) Mental processes – concepts performed in the human mind (including an observation, evaluation, judgment, opinion). See the 2019 Revised Patent Subject Matter Eligibility Guidance.
Even when a judicial exception is recited in the claim, an additional claim element(s) that integrates the judicial exception into a practical application of that exception renders the claim eligible under §101. A claim that integrates a judicial exception into a practical application will apply, rely on, or use the judicial exception in a manner that imposes a meaningful limit on the judicial exception, such that the claim is more than a drafting effort designed to monopolize the judicial exception. The following examples are indicative that an additional element or combination of elements may integrate the judicial exception into a practical application:
the additional element(s) reflects an improvement in the functioning of a computer, or an improvement to other technology or technical field; 
the additional element(s) that applies or uses a judicial exception to effect a particular treatment or prophylaxis for a disease or medical condition; 
the additional element(s) implements a judicial exception with, or uses a judicial exception in conjunction with, a particular machine or manufacture that is integral to the claim; 
the additional element(s) effects a transformation or reduction of a particular article to a different state or thing; and 
the additional element(s) applies or uses the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort designed to monopolize the exception.  
Examples in which the judicial exception has not been integrated into a practical application include:
the additional element(s) merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea;
the additional element(s) adds insignificant extra-solution activity to the judicial exception; and
the additional element does no more than generally link the use of a judicial exception to a particular technological environment or field of use.
See the 2019 Revised Patent Subject Matter Eligibility Guidance.
Claims 1, 9, & 17 recite obtaining a predicted moving track of an ego vehicle and predicted moving tracks of obstacles around the ego vehicle, determining a game object, wherein the game object is an obstacle that is in the obstacles around the ego vehicle and whose predicted moving track intersects the predicted moving track of the ego vehicle or whose distance from the ego vehicle is less than a specified threshold, wherein each sampling game space comprises one or more game policies, determining a decision making result of the ego vehicle, wherein the decision making result is a game policy with a smallest policy cost in a common sampling game space, the common sampling game space comprises at least one game policy, and each sampling game space comprises the at least one game policy in the common sampling game space. These limitations, as drafted, recite a device & process that, under its broadest reasonable interpretation, covers performance of the limitations in the mind but for the recitation of generic computer elements. The claims are practically able to be performed in the mind.
For example, but for the “A method comprising, constructing one sampling game space for each game object based on vehicle information of the ego vehicle, obstacle information of the game object, and road condition information that are collected by a sensor system, calculating a policy cost of each game policy, wherein the policy cost is a numerical value obtained by performing weighting on each factor weight of the policy cost, An apparatus, comprising: at least one processor; and at least one memory coupled to the at least one processor and storing programming instructions for execution by the at least one processor, to cause the apparatus to perform operations comprising, A computer program product comprising computer-executable instructions stored on a non-transitory computer-readable storage medium that, when executed by a processor, cause an apparatus to perform operations comprising” language, “obtaining a predicted moving track of an ego vehicle and predicted moving tracks of obstacles around the ego vehicle, determining a game object, wherein the game object is an obstacle that is in the obstacles around the ego vehicle and whose predicted moving track intersects the predicted moving track of the ego vehicle or whose distance from the ego vehicle is less than a specified threshold, wherein each sampling game space comprises one or more game policies, determining a decision making result of the ego vehicle, wherein the decision making result is a game policy with a smallest policy cost in a common sampling game space, the common sampling game space comprises at least one game policy, and each sampling game space comprises the at least one game policy in the common sampling game space” in the context of this claim encompasses the user discerning and calculating the route he will be taking in a vehicle. 
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements – using “A method comprising, constructing one sampling game space for each game object based on vehicle information of the ego vehicle, obstacle information of the game object, and road condition information that are collected by a sensor system, calculating a policy cost of each game policy, wherein the policy cost is a numerical value obtained by performing weighting on each factor weight of the policy cost, An apparatus, comprising: at least one processor; and at least one memory coupled to the at least one processor and storing programming instructions for execution by the at least one processor, to cause the apparatus to perform operations comprising, A computer program product comprising computer-executable instructions stored on a non-transitory computer-readable storage medium that, when executed by a processor, cause an apparatus to perform operations comprising”. The devices are recited at a high level of generality (i.e., device configured to determine a route for a vehicle) such that they amount to no more than mere instructions to apply the exception using generic computer components. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim(s) do not include additional elements that are sufficient to amount to significantly more than the judicial exception because, as discussed above with respect to integration of the abstract idea into a practical application, the additional elements of using “A method comprising, constructing one sampling game space for each game object based on vehicle information of the ego vehicle, obstacle information of the game object, and road condition information that are collected by a sensor system, calculating a policy cost of each game policy, wherein the policy cost is a numerical value obtained by performing weighting on each factor weight of the policy cost, An apparatus, comprising: at least one processor; and at least one memory coupled to the at least one processor and storing programming instructions for execution by the at least one processor, to cause the apparatus to perform operations comprising, A computer program product comprising computer-executable instructions stored on a non-transitory computer-readable storage medium that, when executed by a processor, cause an apparatus to perform operations comprising” amount to no more than mere instructions to apply the exception using generic computer components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept. The claim is not patent eligible.

Similarly, each of claims 2-8, 10-16, & 18-20 recites a device that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. The limitations, in the context of these claims, encompass the user concluding how the vehicle should maneuver. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements. The claim(s) do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The devices are recited at a high level of generality (i.e., device configured to determine a route for a vehicle) such that they amount to no more than mere instructions to apply the exception using generic computer components. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept. The claim is not patent eligible.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim(s) 1, 3-7, 9, 11-15, 17, & 19-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over US 2019/0107840A1 (“Green”), further in view of US 2021/0114617A1 (“Phillips”).
As per claim 1 Green discloses
A method, comprising:
obtaining a predicted moving track of an ego vehicle and predicted moving tracks of obstacles around the ego vehicle (see at least Green, para. [0070]: generate an appropriate motion path through such surrounding environment. The autonomy computing system 102 can control the one or more vehicle controls 107 to operate the autonomous vehicle 10 according to the motion path. & para. [0082]: The prediction system 104 can receive the state data from the perception system 103 and predict one or more future locations for each object based on such state data. For example, the prediction system 104 can predict where each object will be located within the next 5 seconds, 10 seconds, 20 seconds, etc. As one example, an object can be predicted to adhere to its current trajectory according to its current speed.);
determining a game object, wherein the game object is an obstacle that is in the obstacles around the ego vehicle and whose predicted moving track intersects the predicted moving track of the ego vehicle or whose distance from the ego vehicle is less than a specified threshold (see at least Green, para. [0097]: For example, the feature(s) can include a location of the object relative to a travel way (e.g., relative to the left or right lane markings), a location of the object relative to the autonomous vehicle (e.g., a distance between the current locations of the vehicle and the object), one or more characteristic(s) of the object relative to a travel route associated with the autonomous vehicle (e.g., whether the object is moving parallel, towards, or away from the vehicle's current/future travel route or a predicted point of intersection with the vehicle's travel route), etc.);
constructing one sampling game space for each game object based on vehicle information of the ego vehicle, obstacle information of the game object, and road condition information that are collected by a sensor system (see at least Green, para. [0078-0079] & para. [0092]: The world state generator 204 can receive information from the prediction system 104, the map data 126, and/or other information such as vehicle pose, a current route, or other information. The world state generator 204 can synthesize all received information to produce a world state that describes the state of all objects in and other aspects of the surrounding environment of the autonomous vehicle at each time step.), 
wherein each sampling game space comprises one or more game policies (see at least Green, para. [0054]: In some implementations, the features used to make yield decisions which prevent gridlock can include the current position, velocity, and/or acceleration of a relevant object (e.g., the next vehicle ahead of the autonomous vehicle within the current lane, which may be referred to as the “lead vehicle”) and/or the predicted position, velocity, and/or acceleration of the relevant object at the end of a certain time period (e.g., 10 seconds). Other example features include the current and/or predicted future values for the position, velocity, and/or acceleration of other objects such as, for example, the next vehicle in front of the lead vehicle. & para. [0058]: Once the machine learned yield model has provided one or more yield decisions for the autonomous vehicle relative to one or more objects (e.g., other vehicles, traffic signals, etc.), the autonomy computing system can plan the motion of the autonomous vehicle based at least in part on the determined yield decision. For example, the autonomy computing system can select and evaluate one or more cost functions indicative of a cost (e.g., over time) of controlling the motion of the autonomous vehicle (e.g., the trajectory, speed, or other controllable parameters of the autonomous vehicle) to perform a trajectory that executes or otherwise complies with the yield decision. & para. [0084-0086]);
calculating a policy cost of each game policy (see at least Green, para. [0084]: In particular, according to an aspect of the present disclosure, the motion planning system 105 can evaluate one or more cost functions for each of one or more candidate motion plans for the autonomous vehicle 10. For example, the cost function(s) can describe a cost (e.g., over time) of adhering to a particular candidate motion plan and/or describe a reward for adhering to the particular candidate motion plan. For example, the reward can be of opposite sign to the cost.); and
determining a decision making result of the ego vehicle, wherein the decision making result is a game policy with a smallest policy cost in a common sampling game space, the common sampling game space comprises at least one game policy, and each sampling game space comprises the at least one game policy in the common sampling game space (see at least Green, para. [0030-0034] & para. [0058]: For example, the autonomy computing system can select and evaluate one or more cost functions indicative of a cost (e.g., over time) of controlling the motion of the autonomous vehicle (e.g., the trajectory, speed, or other controllable parameters of the autonomous vehicle) to perform a trajectory that executes or otherwise complies with the yield decision. For example, the autonomy computing system can implement an optimization algorithm that considers the cost functions associated with the yield decision determined by the machine learned model as well as other cost functions (e.g., based on speed limits, traffic lights, etc.) to determine optimized variables that make up the motion plan. & para. [0094]: As examples, the scenario controllers 206 can include one or more of: a pass, ignore, queue controller that decides, for each object in the world, whether the autonomous vehicle should pass, ignore, or queue such object; a yield controller that decides, for each adjacent vehicle in the world, whether the autonomous vehicle should yield to such vehicle; a lane change controller that identifies whether and when to change lanes; and/or a speed regressor that determines an appropriate driving speed for each time step.).
However Green does not explicitly disclose
wherein the policy cost is a numerical value obtained by performing weighting on each factor weight of the policy cost.
Phillips teaches
wherein the policy cost is a numerical value obtained by performing weighting on each factor weight of the policy cost (see at least Phillips, para. [0058]: In some examples, each sub-cost value can be assigned a weight based on its particular importance. The final cost of a candidate trajectory can then be weighted towards the more important sub-cost values. For examples, the sub-cost value associated with a potential collision can be weighted more heavily than a sub-cost value associated with actor caution costs. In some examples, the weight of particular sub-cost value can depend on the current situation of the autonomous vehicle. Thus, overtaking buffer costs may be weighted more heavily on a single lane road than on a multi-lane highway. para. [0071]: Each cost function can generate a sub-cost value, representing one aspect of the cost of a particular trajectory. Each sub-cost value can be assigned a weight based on one or more factors including the current situation of the autonomous vehicle. The sub-cost values can then be combined to produce a total cost of a trajectory at least partially based on the weights assigned to each sub-cost value.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Green to incorporate the teaching of Phillips, wherein the policy cost is a numerical value obtained by performing weighting on each factor weight of the policy cost, with a reasonable expectation of success, in order to provide an improved motion planning framework for generating a trajectory that can be used to direct an autonomous vehicle from a first position to a second position (see at least Phillips, para. [0026]).
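For illustration only, the cost-related limitations of claim 1 can be sketched in Python. This is a minimal sketch under stated assumptions, not the applicant's or either reference's implementation: the factor names and weight values, the representation of a sampling game space as a mapping from a policy identifier to its factor costs, and the summing of a policy's cost across sampling game spaces are all hypothetical choices made here for the example.

```python
from typing import Dict, List

# Hypothetical factor weights (assumption): names loosely echo claim 6,
# numeric values are illustrative only.
WEIGHTS: Dict[str, float] = {"safety": 0.5, "comfort": 0.2, "efficiency": 0.3}


def policy_cost(factor_costs: Dict[str, float],
                weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of the per-factor costs of one game policy."""
    return sum(weights[name] * cost for name, cost in factor_costs.items())


def decide(spaces: List[Dict[str, Dict[str, float]]]) -> str:
    """Pick the lowest-cost policy among those present in every space.

    Each element of `spaces` is one sampling game space (one per game
    object), mapping a policy id to that policy's factor costs.
    """
    # Common sampling game space: policy ids that occur in all spaces.
    common = set(spaces[0])
    for space in spaces[1:]:
        common &= set(space)
    if not common:
        raise ValueError("no policy is shared by all sampling game spaces")
    # Assumption: a shared policy's cost is the sum over all spaces.
    totals = {pid: sum(policy_cost(s[pid]) for s in spaces) for pid in common}
    # Sorting first makes tie-breaking deterministic.
    return min(sorted(totals), key=totals.get)
```

In this sketch a policy that appears in only one sampling game space can never be selected, which mirrors the claim language that each sampling game space comprises the policies of the common sampling game space.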

As per claim 3 Green discloses
wherein the method further comprises: determining a non-game object, wherein the non-game object is an obstacle that is in the obstacles around the ego vehicle and whose predicted moving track does not intersect the predicted moving track of the ego vehicle or whose distance from the ego vehicle is not less than the specified threshold (see at least Green, para. [0097]: For example, the feature(s) can include a location of the object relative to a travel way (e.g., relative to the left or right lane markings), a location of the object relative to the autonomous vehicle (e.g., a distance between the current locations of the vehicle and the object), one or more characteristic(s) of the object relative to a travel route associated with the autonomous vehicle (e.g., whether the object is moving parallel, towards, or away from the vehicle's current/future travel route or a predicted point of intersection with the vehicle's travel route), etc.);
constructing a feasible region of the ego vehicle based on the vehicle information of the ego vehicle, obstacle information of the non-game object, and the road condition information that are collected by the sensor system, wherein the feasible region of the ego vehicle is at least one policy of using different decisions by the ego vehicle without colliding with the non-game object (see at least Green, para. [0078-0079] & para. [0092]: The world state generator 204 can receive information from the prediction system 104, the map data 126, and/or other information such as vehicle pose, a current route, or other information. The world state generator 204 can synthesize all received information to produce a world state that describes the state of all objects in and other aspects of the surrounding environment of the autonomous vehicle at each time step.); and
in response to detecting that the decision making result of the ego vehicle is within the feasible region of the ego vehicle, outputting the decision making result of the ego vehicle (see at least Green, para. [0030-0034] & para. [0058] & para. [0094]: As examples, the scenario controllers 206 can include one or more of: a pass, ignore, queue controller that decides, for each object in the world, whether the autonomous vehicle should pass, ignore, or queue such object; a yield controller that decides, for each adjacent vehicle in the world, whether the autonomous vehicle should yield to such vehicle; a lane change controller that identifies whether and when to change lanes; and/or a speed regressor that determines an appropriate driving speed for each time step.).
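As a hedged illustration of the game object test recited in claim 1 (intersecting predicted tracks, or a distance below a specified threshold), the following Python sketch classifies an obstacle. It rests on several assumptions not found in the record: predicted tracks are modeled as 2D polylines, the distance is measured between the first point of each track, and collinear overlaps are ignored for brevity. All function names are hypothetical.

```python
from math import hypot
from typing import Sequence, Tuple

Point = Tuple[float, float]


def _orient(a: Point, b: Point, c: Point) -> float:
    """Signed area test: >0 if c is left of a->b, <0 if right, 0 if collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _segments_cross(p1: Point, p2: Point, q1: Point, q2: Point) -> bool:
    """True when the two segments properly cross (collinear cases ignored)."""
    d1, d2 = _orient(q1, q2, p1), _orient(q1, q2, p2)
    d3, d4 = _orient(p1, p2, q1), _orient(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0


def tracks_intersect(a: Sequence[Point], b: Sequence[Point]) -> bool:
    """Check every pair of polyline segments for a crossing."""
    return any(
        _segments_cross(a[i], a[i + 1], b[j], b[j + 1])
        for i in range(len(a) - 1)
        for j in range(len(b) - 1)
    )


def is_game_object(ego: Sequence[Point], obstacle: Sequence[Point],
                   threshold: float) -> bool:
    """Game object: tracks intersect, or current gap is below the threshold."""
    # Assumption: the first point of each track is the current position.
    gap = hypot(ego[0][0] - obstacle[0][0], ego[0][1] - obstacle[0][1])
    return tracks_intersect(ego, obstacle) or gap < threshold
```

Under this sketch, any obstacle for which `is_game_object` returns false would be handled as a non-game object; treating the two categories as exact complements is itself an assumption of the example.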

As per claim 4 Green discloses
wherein the constructing one sampling game space for each game object based on vehicle information of the ego vehicle, obstacle information of the game object, and road condition information that are collected by a sensor system comprises (see at least Green, para. [0040-0041]: As another example, the feature data for an object can be descriptive of at least one of a required deceleration and a required acceleration associated with the first object. For example, the required deceleration can describe the amount of deceleration that will be required for the autonomous vehicle to yield to the object. In general, this can indicate how hard it will be to stop the autonomous vehicle (e.g., to avoid collision with the object or to comply with a traffic command provided or predicted to be provided by a traffic signal). As an example, the required deceleration for a traffic light can indicate the amount of deceleration required for the autonomous vehicle to come to a stop at or before a stop line associated with the traffic light.):
determining upper decision limits and lower decision limits of the ego vehicle and each obstacle in the game object based on the vehicle information of the ego vehicle, the obstacle information of the game object, and the road condition information (see at least Green, para. [0098-0099]: For example, the required acceleration can describe an amount of acceleration that will be required to complete an unprotected left turn through an intersection prior to an oncoming vehicle entering the intersection or, as another example, an amount of acceleration that will be required to continue through an intersection prior to a traffic light transitioning to a red light.);
obtaining decision making policies of the ego vehicle and each obstacle in the game object from the upper decision limits and lower decision limits of the ego vehicle and each obstacle in the game object according to a specified rule (see at least Green, para. [0099]: As a further example, the required acceleration associated with the first object can describe the amount of acceleration that will be required for the autonomous vehicle to not yield to the first object. For example, the required acceleration can describe an amount of acceleration that will be required to complete an unprotected left turn through an intersection prior to an oncoming vehicle entering the intersection or, as another example, an amount of acceleration that will be required to continue through an intersection prior to a traffic light transitioning to a red light. Similarly, the required acceleration can be compared to one or more jerk limit(s) to generate additional features.); and
combining a decision making policy of the ego vehicle and a decision making policy of each obstacle in the game object, to obtain at least one game policy between the ego vehicle and each obstacle in the game object (see at least Green, para. [0099]: As a further example, the required acceleration associated with the first object can describe the amount of acceleration that will be required for the autonomous vehicle to not yield to the first object. For example, the required acceleration can describe an amount of acceleration that will be required to complete an unprotected left turn through an intersection prior to an oncoming vehicle entering the intersection or, as another example, an amount of acceleration that will be required to continue through an intersection prior to a traffic light transitioning to a red light. Similarly, the required acceleration can be compared to one or more jerk limit(s) to generate additional features.).
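The three steps recited in claim 4 (decision limits, decision policies obtained by a specified rule, and their combination into game policies) can be pictured with a short Python sketch. It is offered only as one possible reading: modeling a decision as a single acceleration value, using evenly spaced sampling as the "specified rule", and the function names are all assumptions introduced for this example.

```python
from itertools import product
from typing import List, Tuple


def sample_decisions(lower: float, upper: float, count: int) -> List[float]:
    """Evenly spaced decisions between the limits, inclusive.

    Even spacing stands in for the claim's 'specified rule' (assumption).
    """
    if count < 2:
        return [lower]
    step = (upper - lower) / (count - 1)
    return [lower + i * step for i in range(count)]


def build_game_space(ego_limits: Tuple[float, float],
                     obstacle_limits: Tuple[float, float],
                     count: int = 3) -> List[Tuple[float, float]]:
    """Combine every ego decision with every obstacle decision.

    Each (ego_decision, obstacle_decision) pair is one game policy of the
    sampling game space for this game object.
    """
    ego = sample_decisions(ego_limits[0], ego_limits[1], count)
    obstacle = sample_decisions(obstacle_limits[0], obstacle_limits[1], count)
    return list(product(ego, obstacle))
```

With three samples per participant, the sketch yields nine game policies for one game object; a finer sampling rule would enlarge the space quadratically.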

As per claim 5 Green discloses
wherein the method further comprises: determining a behavior label of each game policy based on a distance between the ego vehicle and a conflict point, a distance between the game object and the conflict point, and the at least one game policy between the ego vehicle and each obstacle in the game object (see at least Green, para. [0094]: As examples, the scenario controllers 206 can include one or more of: a pass, ignore, queue controller that decides, for each object in the world, whether the autonomous vehicle should pass, ignore, or queue such object; a yield controller that decides, for each adjacent vehicle in the world, whether the autonomous vehicle should yield to such vehicle; a lane change controller that identifies whether and when to change lanes; and/or a speed regressor that determines an appropriate driving speed for each time step.),
wherein the conflict point is a location at which the predicted moving track of the ego vehicle and the predicted moving track of the obstacle intersect each other or a location at which a distance between the ego vehicle and the obstacle is less than the specified threshold (see at least Green, para. [0097]: For example, the feature(s) can include a location of the object relative to a travel way (e.g., relative to the left or right lane markings), a location of the object relative to the autonomous vehicle (e.g., a distance between the current locations of the vehicle and the object), one or more characteristic(s) of the object relative to a travel route associated with the autonomous vehicle (e.g., whether the object is moving parallel, towards, or away from the vehicle's current/future travel route or a predicted point of intersection with the vehicle's travel route), etc. In some implementations, the feature(s) determined for a particular object may depend at least in part on the class of that object.), and
the behavior label comprises at least one of yielding by the ego vehicle, overtaking by the ego vehicle, and yielding by both the ego vehicle and the obstacle (see at least Green, para. [0113]: The yield controller 250 can input data indicative of at least the feature(s) for one or more objects into the machine learned yield model and receive, as an output, data indicative of a recommended yield decision relative to the one or more objects. The yield decision can include yielding for the object by, for example, stopping a motion of the autonomous vehicle for the object. In some implementations, the yield decision can include not yielding for the object. For example, the output of the machine learned model can indicate that the vehicle is to maintain its current speed and/or trajectory, without adjusting for the object's presence.).

As per claim 6, Green does not explicitly disclose
wherein the calculating a policy cost of each game policy comprises:
determining factors of the policy cost, wherein the factors of the policy cost comprise at least one of safety, comfort, passing efficiency, right of way, a prior probability of an obstacle, or historical decision correlation;
calculating a factor cost of each of the factors of the policy cost; and
weighting the factor cost of each of the factors of the policy cost, to obtain the policy cost of each game policy.
Phillips teaches
wherein the calculating a policy cost of each game policy comprises: determining factors of the policy cost, wherein the factors of the policy cost comprise at least one of safety, comfort, passing efficiency, right of way, a prior probability of an obstacle, or historical decision correlation (see at least Phillips, para. [0122]: cost function can be encoded for one or more of: the avoidance of object collision, keeping the autonomous vehicle on the travel way/within lane boundaries, preferring gentle accelerations to harsh ones, etc. As further described herein, the cost function(s) can consider vehicle dynamics parameters (e.g., to keep the ride smooth, acceleration, jerk, etc.) and/or map parameters (e.g., speed limits, stops, travel way boundaries, etc.). The cost function(s) can also, or alternatively, take into account at least one of the following object cost(s): collision costs (e.g., cost of avoiding/experience potential collision, minimization of speed, etc.); overtaking buffer (e.g., give 4 ft of space with overtaking a bicycle, etc.); headway (e.g., preserve stopping distance when applying adaptive cruise control motion a moving object, etc.); actor caution (e.g., preserve the ability to stop for unlikely events, etc.); behavioral blocking (e.g., avoid overtaking backed-up traffic in the vehicle's lane, etc.); or other parameters.);
calculating a factor cost of each of the factors of the policy cost (see at least Phillips, para. [0071]: Each cost function can generate a sub-cost value, representing one aspect of the cost of a particular trajectory. Each sub-cost value can be assigned a weight based on one or more factors including the current situation of the autonomous vehicle. The sub-cost values can then be combined to produce a total cost of a trajectory at least partially based on the weights assigned to each sub-cost value.); and
weighting the factor cost of each of the factors of the policy cost, to obtain the policy cost of each game policy (see at least Phillips, para. [0071]: Each cost function can generate a sub-cost value, representing one aspect of the cost of a particular trajectory. Each sub-cost value can be assigned a weight based on one or more factors including the current situation of the autonomous vehicle. The sub-cost values can then be combined to produce a total cost of a trajectory at least partially based on the weights assigned to each sub-cost value.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Green to incorporate the teaching of Phillips, wherein the calculating a policy cost of each game policy comprises determining factors of the policy cost, wherein the factors of the policy cost comprise at least one of safety, comfort, passing efficiency, right of way, a prior probability of an obstacle, or historical decision correlation; calculating a factor cost of each of the factors of the policy cost; and weighting the factor cost of each of the factors of the policy cost, to obtain the policy cost of each game policy, with a reasonable expectation of success, in order to provide an improved motion planning framework for generating a trajectory that can be used to direct an autonomous vehicle from a first position to a second position (see at least Phillips, para. [0026]).
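For illustration only, the weighting step recited in claim 6 and described in Phillips, para. [0071], can be sketched as a weighted sum of per-factor costs. This is a minimal sketch, not the applicant's or Phillips's implementation: the factor names, the weight values, and the function name `policy_cost` are hypothetical.

```python
# Hypothetical factor weights, chosen only for illustration; neither the
# claims nor Phillips specify numeric values.
FACTOR_WEIGHTS = {"safety": 0.5, "comfort": 0.25, "passing_efficiency": 0.25}


def policy_cost(factor_costs, weights=FACTOR_WEIGHTS):
    """Return the weighted sum of the per-factor costs of one game policy."""
    return sum(weights[name] * cost for name, cost in factor_costs.items())


# One game policy with an assumed cost per factor.
example_costs = {"safety": 2.0, "comfort": 4.0, "passing_efficiency": 8.0}
total = policy_cost(example_costs)  # 0.5*2.0 + 0.25*4.0 + 0.25*8.0 = 4.0
```

Under these assumptions each factor contributes in proportion to its weight, which corresponds to the sub-cost values being "combined to produce a total cost" in the cited passage.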

As per claim 7, Green does not explicitly disclose
wherein after the calculating a policy cost of each game policy, the method further comprises: performing comparison to determine whether each of the factors of the policy cost is within a specified range; and
deleting a game policy corresponding to a policy cost comprising a factor that is not within the specified range.
Phillips teaches
wherein after the calculating a policy cost of each game policy, the method further comprises: performing comparison to determine whether each of the factors of the policy cost is within a specified range (see at least Phillips, para. [0158]: The actor caution cost function can take, as input, a list of candidate trajectories, a list of object trajectories likelihoods, and/or blocking interval information for each combination of candidate trajectory and likely object trajectory. In this context, the blocking intervals can represent the minimum and maximum odometer value (distance) that would result in a collision with an object. This blocking interval can be based on an along path speed associated with the autonomous vehicle. The actor caution cost function can determine a speed for each candidate trajectory that brings the autonomous vehicle closest to a particular object without causing a collision. Based on the determined speed, the actor caution cost function can output a one-dimensional vector including a cost for each candidate trajectory based on the comparison of the candidate trajectories, the speed limits they must stay under to avoid an unlikely future collision, and the object (obstacle) trajectory likelihoods.); and
deleting a game policy corresponding to a policy cost comprising a factor that is not within the specified range (see at least Phillips, para. [0138]: The cost value for a particular candidate trajectory can be based on an evaluation of the specific trajectory. In some examples, the trajectory generation and selection system 300 can determine whether the specific candidate trajectory will result in a collision with another object (e.g., a vehicle, a stationary object, and so on). In some examples, a trajectory generation and selection system 300 can remove candidate trajectories that result in a likely collision from consideration or give such candidate trajectories a very high cost, such that it will not be selected unless no other trajectories are possible.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Green to incorporate the teaching of Phillips, wherein after the calculating a policy cost of each game policy, the method further comprises: performing comparison to determine whether each of the factors of the policy cost is within a specified range; and deleting a game policy corresponding to a policy cost comprising a factor that is not within the specified range, with a reasonable expectation of success, in order to provide an improved motion planning framework for generating a trajectory that can be used to direct an autonomous vehicle from a first position to a second position (see at least Phillips, para. [0026]).
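For illustration only, the comparison and deletion steps of claim 7 can be sketched as a filter over candidate policies. This is a minimal sketch under stated assumptions: the dictionary layout, the factor names, the range values, and the function name `prune_policies` are hypothetical and are not taken from the application or from Phillips.

```python
def prune_policies(policies, limits):
    """Keep only policies whose every factor cost lies within its range.

    policies: list of dicts with a "factor_costs" mapping (assumed layout).
    limits:   mapping of factor name -> (low, high) inclusive range.
    """
    kept = []
    for policy in policies:
        in_range = all(
            limits[name][0] <= cost <= limits[name][1]
            for name, cost in policy["factor_costs"].items()
        )
        if in_range:
            kept.append(policy)
    return kept


# Hypothetical ranges and candidate policies.
limits = {"safety": (0.0, 5.0), "comfort": (0.0, 3.0)}
candidates = [
    {"name": "yield", "factor_costs": {"safety": 1.0, "comfort": 2.0}},
    {"name": "overtake", "factor_costs": {"safety": 6.0, "comfort": 1.0}},
]
remaining = prune_policies(candidates, limits)  # only "yield" survives
```

Phillips, para. [0138], also describes the alternative of assigning a very high cost instead of removing a trajectory; the sketch shows only the removal variant.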

As per claim 9, Green discloses
An apparatus, comprising: at least one processor; and at least one memory coupled to the at least one processor and storing programming instructions for execution by the at least one processor, to cause the apparatus to perform operations comprising (see at least Green, para. [0071]: The autonomy computing system 102 includes one or more processors 112 and a memory 114…The memory 114 can include one or more non-transitory computer-readable storage mediums, such as RAM, ROM, EEPROM, EPROM, flash memory devices, magnetic disks, etc., and combinations thereof. The memory 114 can store data 116 and instructions 118 which are executed by the processor 112 to cause autonomy computing system 102 to perform operations.):
obtaining a predicted moving track of an ego vehicle and predicted moving tracks of obstacles around the ego vehicle (see at least Green, para. [0070]: … generate an appropriate motion path through such surrounding environment. The autonomy computing system 102 can control the one or more vehicle controls 107 to operate the autonomous vehicle 10 according to the motion path. & para. [0082]: The prediction system 104 can receive the state data from the perception system 103 and predict one or more future locations for each object based on such state data. For example, the prediction system 104 can predict where each object will be located within the next 5 seconds, 10 seconds, 20 seconds, etc. As one example, an object can be predicted to adhere to its current trajectory according to its current speed.); and
determining a game object, wherein the game object is an obstacle that is in the obstacles around the ego vehicle and whose predicted moving track intersects the predicted moving track of the ego vehicle or whose distance from the ego vehicle is less than a specified threshold (see at least Green, para. [0097]: For example, the feature(s) can include a location of the object relative to a travel way (e.g., relative to the left or right lane markings), a location of the object relative to the autonomous vehicle (e.g., a distance between the current locations of the vehicle and the object), one or more characteristic(s) of the object relative to a travel route associated with the autonomous vehicle (e.g., whether the object is moving parallel, towards, or away from the vehicle's current/future travel route or a predicted point of intersection with the vehicle's travel route), etc.);
constructing one sampling game space for each game object based on vehicle information of the ego vehicle, obstacle information of the game object, and road condition information that are collected by a sensor system (see at least Green, para. [0078-0079] & para. [0092]: The world state generator 204 can receive information from the prediction system 104, the map data 126, and/or other information such as vehicle pose, a current route, or other information. The world state generator 204 can synthesize all received information to produce a world state that describes the state of all objects in and other aspects of the surrounding environment of the autonomous vehicle at each time step.), 
wherein each sampling game space comprises one or more game policies (see at least Green, para. [0054]: In some implementations, the features used to make yield decisions which prevent gridlock can include the current position, velocity, and/or acceleration of a relevant object (e.g., the next vehicle ahead of the autonomous vehicle within the current lane, which may be referred to as the “lead vehicle”) and/or the predicted position, velocity, and/or acceleration of the relevant object at the end of a certain time period (e.g., 10 seconds). Other example features include the current and/or predicted future values for the position, velocity, and/or acceleration of other objects such as, for example, the next vehicle in front of the lead vehicle. & para. [0058]: Once the machine learned yield model has provided one or more yield decisions for the autonomous vehicle relative to one or more objects (e.g., other vehicles, traffic signals, etc.), the autonomy computing system can plan the motion of the autonomous vehicle based at least in part on the determined yield decision. For example, the autonomy computing system can select and evaluate one or more cost functions indicative of a cost (e.g., over time) of controlling the motion of the autonomous vehicle (e.g., the trajectory, speed, or other controllable parameters of the autonomous vehicle) to perform a trajectory that executes or otherwise complies with the yield decision. & para. [0084-0086]);
calculating a policy cost of each game policy (see at least Green, para. [0084]: In particular, according to an aspect of the present disclosure, the motion planning system 105 can evaluate one or more cost functions for each of one or more candidate motion plans for the autonomous vehicle 10. For example, the cost function(s) can describe a cost (e.g., over time) of adhering to a particular candidate motion plan and/or describe a reward for adhering to the particular candidate motion plan. For example, the reward can be of opposite sign to the cost.), and
determining a decision making result of the ego vehicle, wherein the decision making result is a game policy with a smallest policy cost in a common sampling game space, the common sampling game space comprises at least one game policy, and each sampling game space comprises the at least one game policy in the common sampling game space (see at least Green, para. [0030-0034] & para. [0058]: For example, the autonomy computing system can select and evaluate one or more cost functions indicative of a cost (e.g., over time) of controlling the motion of the autonomous vehicle (e.g., the trajectory, speed, or other controllable parameters of the autonomous vehicle) to perform a trajectory that executes or otherwise complies with the yield decision. For example, the autonomy computing system can implement an optimization algorithm that considers the cost functions associated with the yield decision determined by the machine learned model as well as other cost functions (e.g., based on speed limits, traffic lights, etc.) to determine optimized variables that make up the motion plan. & para. [0094]: As examples, the scenario controllers 206 can include one or more of: a pass, ignore, queue controller that decides, for each object in the world, whether the autonomous vehicle should pass, ignore, or queue such object; a yield controller that decides, for each adjacent vehicle in the world, whether the autonomous vehicle should yield to such vehicle; a lane change controller that identifies whether and when to change lanes; and/or a speed regressor that determines an appropriate driving speed for each time step.).
However, Green does not explicitly disclose
wherein the policy cost is a numerical value obtained by performing weighting on each factor weight of the policy cost.
Phillips teaches
wherein the policy cost is a numerical value obtained by performing weighting on each factor weight of the policy cost (see at least Phillips, para. [0058]: In some examples, each sub-cost value can be assigned a weight based on its particular importance. The final cost of a candidate trajectory can then be weighted towards the more important sub-cost values. For examples, the sub-cost value associated with a potential collision can be weighted more heavily than a sub-cost value associated with actor caution costs. In some examples, the weight of particular sub-cost value can depend on the current situation of the autonomous vehicle. Thus, overtaking buffer costs may be weighted more heavily on a single lane road than on a multi-lane highway. & para. [0071]: Each cost function can generate a sub-cost value, representing one aspect of the cost of a particular trajectory. Each sub-cost value can be assigned a weight based on one or more factors including the current situation of the autonomous vehicle. The sub-cost values can then be combined to produce a total cost of a trajectory at least partially based on the weights assigned to each sub-cost value.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Green to incorporate the teaching of Phillips, wherein the policy cost is a numerical value obtained by performing weighting on each factor weight of the policy cost, with a reasonable expectation of success, in order to provide an improved motion planning framework for generating a trajectory that can be used to direct an autonomous vehicle from a first position to a second position (see at least Phillips, para. [0026]).
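For illustration only, the final limitation of claim 9 (selecting the game policy with the smallest policy cost in a common sampling game space) can be sketched as a set intersection followed by a minimum. This is a minimal sketch under stated assumptions: each sampling game space is modeled as a mapping from an ego-vehicle action to its cost against one game object, and costs are summed across game objects. The summing rule, the action names, and the function name `decide` are hypothetical and are not stated in the claim.

```python
def decide(spaces):
    """Pick the lowest-cost ego action shared by every sampling game space.

    spaces: list of dicts, one per game object, mapping ego action -> cost.
    Returns None when there are no spaces or no action is common to all.
    """
    if not spaces:
        return None
    # Common sampling game space: actions present in every per-object space.
    common = set(spaces[0])
    for space in spaces[1:]:
        common &= set(space)
    if not common:
        return None
    # Assumption: the cost of a shared action is the sum over game objects.
    return min(common, key=lambda action: sum(space[action] for space in spaces))


spaces = [
    {"yield": 3.0, "overtake": 1.0, "hold": 2.0},  # against game object 1
    {"yield": 1.0, "overtake": 4.0},               # against game object 2
]
result = decide(spaces)  # "yield": total 4.0 versus 5.0 for "overtake"
```

The action "hold" is excluded because it does not appear in the second space, which reflects the requirement that each sampling game space comprise the policies in the common space.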

As per claim 11, Green discloses
wherein the operations further comprise: determining a non-game object, wherein the non-game object is an obstacle that is in the obstacles around the ego vehicle and whose predicted moving track does not intersect the predicted moving track of the ego vehicle or whose distance from the ego vehicle is not less than the specified threshold (see at least Green, para. [0097]: For example, the feature(s) can include a location of the object relative to a travel way (e.g., relative to the left or right lane markings), a location of the object relative to the autonomous vehicle (e.g., a distance between the current locations of the vehicle and the object), one or more characteristic(s) of the object relative to a travel route associated with the autonomous vehicle (e.g., whether the object is moving parallel, towards, or away from the vehicle's current/future travel route or a predicted point of intersection with the vehicle's travel route), etc.);
constructing a feasible region of the ego vehicle based on the vehicle information of the ego vehicle, obstacle information of the non-game object, and the road condition information that are collected by the sensor system, wherein the feasible region of the ego vehicle is at least one policy of using different decisions by the ego vehicle without colliding with the non-game object (see at least Green, para. [0078-0079] & para. [0092]: The world state generator 204 can receive information from the prediction system 104, the map data 126, and/or other information such as vehicle pose, a current route, or other information. The world state generator 204 can synthesize all received information to produce a world state that describes the state of all objects in and other aspects of the surrounding environment of the autonomous vehicle at each time step.); and
in response to detecting that the decision making result of the ego vehicle is within the feasible region of the ego vehicle, outputting the decision making result of the ego vehicle (see at least Green, para. [0030-0034] & para. [0058] & para. [0094]: As examples, the scenario controllers 206 can include one or more of: a pass, ignore, queue controller that decides, for each object in the world, whether the autonomous vehicle should pass, ignore, or queue such object; a yield controller that decides, for each adjacent vehicle in the world, whether the autonomous vehicle should yield to such vehicle; a lane change controller that identifies whether and when to change lanes; and/or a speed regressor that determines an appropriate driving speed for each time step.).
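For illustration only, the distinction between a game object (claim 9) and a non-game object (claim 11) can be sketched as a geometric test on predicted tracks. This is a minimal sketch under stated assumptions: tracks are polylines of (x, y) points whose first point is the current position, intersection means a proper crossing of two segments (touching or collinear overlap is ignored), and an obstacle that is not a game object is treated as a non-game object. The threshold value and all function names are hypothetical.

```python
import math


def _ccw(a, b, c):
    """Signed area test: positive if a -> b -> c turns counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def segments_cross(p1, p2, q1, q2):
    """True if segment p1-p2 properly crosses segment q1-q2."""
    return (_ccw(q1, q2, p1) * _ccw(q1, q2, p2) < 0
            and _ccw(p1, p2, q1) * _ccw(p1, p2, q2) < 0)


def tracks_cross(track_a, track_b):
    """True if any segment of one polyline crosses any segment of the other."""
    return any(
        segments_cross(track_a[i], track_a[i + 1], track_b[j], track_b[j + 1])
        for i in range(len(track_a) - 1)
        for j in range(len(track_b) - 1)
    )


def is_game_object(ego_track, obstacle_track, threshold):
    """Game object: tracks cross, or current distance is below the threshold."""
    current_distance = math.dist(ego_track[0], obstacle_track[0])
    return tracks_cross(ego_track, obstacle_track) or current_distance < threshold


ego = [(0.0, 0.0), (10.0, 0.0)]
crossing = [(5.0, -5.0), (5.0, 5.0)]   # crosses the ego track
far_away = [(0.0, 50.0), (10.0, 50.0)] # parallel and distant
close_by = [(0.0, 2.0), (10.0, 2.0)]   # parallel but within the threshold
labels = [is_game_object(ego, t, threshold=3.0) for t in (crossing, far_away, close_by)]
```

With the assumed 3.0 threshold, the crossing and nearby obstacles are game objects and the distant parallel obstacle is a non-game object.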

As per claim 12, Green discloses
wherein the constructing one sampling game space for each game object based on vehicle information of the ego vehicle, obstacle information of the game object, and road condition information that are collected by a sensor system comprises (see at least Green, para. [0040-0041]: As another example, the feature data for an object can be descriptive of at least one of a required deceleration and a required acceleration associated with the first object. For example, the required deceleration can describe the amount of deceleration that will be required for the autonomous vehicle to yield to the object. In general, this can indicate how hard it will be to stop the autonomous vehicle (e.g., to avoid collision with the object or to comply with a traffic command provided or predicted to be provided by a traffic signal). As an example, the required deceleration for a traffic light can indicate the amount of deceleration required for the autonomous vehicle to come to a stop at or before a stop line associated with the traffic light.):
determining upper decision limits and lower decision limits of the ego vehicle and each obstacle in the game object based on the vehicle information of the ego vehicle, the obstacle information of the game object, and the road condition information (see at least Green, para. [0098-0099]: For example, the required acceleration can describe an amount of acceleration that will be required to complete an unprotected left turn through an intersection prior to an oncoming vehicle entering the intersection or, as another example, an amount of acceleration that will be required to continue through an intersection prior to a traffic light transitioning to a red light.);
obtaining decision making policies of the ego vehicle and each obstacle in the game object from the upper decision limits and lower decision limits of the ego vehicle and each obstacle in the game object according to a specified rule (see at least Green, para. [0099]: As a further example, the required acceleration associated with the first object can describe the amount of acceleration that will be required for the autonomous vehicle to not yield to the first object. For example, the required acceleration can describe an amount of acceleration that will be required to complete an unprotected left turn through an intersection prior to an oncoming vehicle entering the intersection or, as another example, an amount of acceleration that will be required to continue through an intersection prior to a traffic light transitioning to a red light. Similarly, the required acceleration can be compared to one or more jerk limit(s) to generate additional features.); and
combining a decision making policy of the ego vehicle and a decision making policy of each obstacle in the game object, to obtain at least one game policy between the ego vehicle and each obstacle in the game object (see at least Green, para. [0099]: As a further example, the required acceleration associated with the first object can describe the amount of acceleration that will be required for the autonomous vehicle to not yield to the first object. For example, the required acceleration can describe an amount of acceleration that will be required to complete an unprotected left turn through an intersection prior to an oncoming vehicle entering the intersection or, as another example, an amount of acceleration that will be required to continue through an intersection prior to a traffic light transitioning to a red light. Similarly, the required acceleration can be compared to one or more jerk limit(s) to generate additional features.).
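For illustration only, the three steps of claim 12 (decision limits, sampling according to a specified rule, and combining ego and obstacle policies) can be sketched as uniform sampling followed by a Cartesian product. This is a minimal sketch under stated assumptions: decisions are modeled as longitudinal accelerations, the "specified rule" is taken to be a fixed step size, and the limit values and function names are hypothetical. Neither the claim nor Green specifies these details.

```python
from itertools import product


def sample_decisions(lower, upper, step):
    """Uniformly sample decisions from the lower limit to the upper limit."""
    count = int(round((upper - lower) / step))
    return [lower + i * step for i in range(count + 1)]


def build_game_space(ego_limits, obstacle_limits, step):
    """Pair every sampled ego decision with every sampled obstacle decision."""
    ego_decisions = sample_decisions(*ego_limits, step)
    obstacle_decisions = sample_decisions(*obstacle_limits, step)
    return list(product(ego_decisions, obstacle_decisions))


# Assumed acceleration limits (lower, upper) in m/s^2 for illustration.
game_space = build_game_space(
    ego_limits=(-2.0, 2.0), obstacle_limits=(-1.0, 1.0), step=1.0
)
# 5 ego samples x 3 obstacle samples = 15 game policies
```

Each tuple in `game_space` stands for one game policy between the ego vehicle and one obstacle; a separate space would be built per game object.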

As per claim 13, Green discloses
wherein the operations further comprise: determining a behavior label of each game policy based on a distance between the ego vehicle and a conflict point, a distance between the game object and the conflict point, and the at least one game policy between the ego vehicle and each obstacle in the game object (see at least Green, para. [0094]: As examples, the scenario controllers 206 can include one or more of: a pass, ignore, queue controller that decides, for each object in the world, whether the autonomous vehicle should pass, ignore, or queue such object; a yield controller that decides, for each adjacent vehicle in the world, whether the autonomous vehicle should yield to such vehicle; a lane change controller that identifies whether and when to change lanes; and/or a speed regressor that determines an appropriate driving speed for each time step.),
wherein the conflict point is a location at which the predicted moving track of the ego vehicle and the predicted moving track of the obstacle intersect each other or a location at which a distance between the ego vehicle and the obstacle is less than the specified threshold (see at least Green, para. [0097]: For example, the feature(s) can include a location of the object relative to a travel way (e.g., relative to the left or right lane markings), a location of the object relative to the autonomous vehicle (e.g., a distance between the current locations of the vehicle and the object), one or more characteristic(s) of the object relative to a travel route associated with the autonomous vehicle (e.g., whether the object is moving parallel, towards, or away from the vehicle's current/future travel route or a predicted point of intersection with the vehicle's travel route), etc. In some implementations, the feature(s) determined for a particular object may depend at least in part on the class of that object.), and
the behavior label comprises at least one of yielding by the ego vehicle, overtaking by the ego vehicle, and yielding by both the ego vehicle and the obstacle (see at least Green, para. [0113]: The yield controller 250 can input data indicative of at least the feature(s) for one or more objects into the machine learned yield model and receive, as an output, data indicative of a recommended yield decision relative to the one or more objects. The yield decision can include yielding for the object by, for example, stopping a motion of the autonomous vehicle for the object. In some implementations, the yield decision can include not yielding for the object. For example, the output of the machine learned model can indicate that the vehicle is to maintain its current speed and/or trajectory, without adjusting for the object's presence.).

As per claim 14, Green does not explicitly disclose
wherein calculating a policy cost of each game policy comprises:
determining factors of the policy cost, wherein the factors of the policy cost comprise at least one of safety, comfort, passing efficiency, right of way, a prior probability of an obstacle, or historical decision correlation;
calculating a factor cost of each of the factors of the policy cost; and 
weighting the factor cost of each of the factors of the policy cost, to obtain the policy cost of each game policy.
Phillips teaches
wherein the calculating a policy cost of each game policy comprises: determining factors of the policy cost, wherein the factors of the policy cost comprise at least one of safety, comfort, passing efficiency, right of way, a prior probability of an obstacle, or historical decision correlation (see at least Phillips, para. [0122]: cost function can be encoded for one or more of: the avoidance of object collision, keeping the autonomous vehicle on the travel way/within lane boundaries, preferring gentle accelerations to harsh ones, etc. As further described herein, the cost function(s) can consider vehicle dynamics parameters (e.g., to keep the ride smooth, acceleration, jerk, etc.) and/or map parameters (e.g., speed limits, stops, travel way boundaries, etc.). The cost function(s) can also, or alternatively, take into account at least one of the following object cost(s): collision costs (e.g., cost of avoiding/experience potential collision, minimization of speed, etc.); overtaking buffer (e.g., give 4 ft of space with overtaking a bicycle, etc.); headway (e.g., preserve stopping distance when applying adaptive cruise control motion a moving object, etc.); actor caution (e.g., preserve the ability to stop for unlikely events, etc.); behavioral blocking (e.g., avoid overtaking backed-up traffic in the vehicle's lane, etc.); or other parameters.);
calculating a factor cost of each of the factors of the policy cost (see at least Phillips, para. [0071]: Each cost function can generate a sub-cost value, representing one aspect of the cost of a particular trajectory. Each sub-cost value can be assigned a weight based on one or more factors including the current situation of the autonomous vehicle. The sub-cost values can then be combined to produce a total cost of a trajectory at least partially based on the weights assigned to each sub-cost value.); and
weighting the factor cost of each of the factors of the policy cost, to obtain the policy cost of each game policy (see at least Phillips, para. [0071]: Each cost function can generate a sub-cost value, representing one aspect of the cost of a particular trajectory. Each sub-cost value can be assigned a weight based on one or more factors including the current situation of the autonomous vehicle. The sub-cost values can then be combined to produce a total cost of a trajectory at least partially based on the weights assigned to each sub-cost value.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Green to incorporate the teaching of Phillips, wherein the calculating a policy cost of each game policy comprises determining factors of the policy cost, wherein the factors of the policy cost comprise at least one of safety, comfort, passing efficiency, right of way, a prior probability of an obstacle, or historical decision correlation; calculating a factor cost of each of the factors of the policy cost; and weighting the factor cost of each of the factors of the policy cost, to obtain the policy cost of each game policy, with a reasonable expectation of success, in order to provide an improved motion planning framework for generating a trajectory that can be used to direct an autonomous vehicle from a first position to a second position (see at least Phillips, para. [0026]).

As per claim 15, Green does not explicitly disclose
wherein after the calculating a policy cost of each game policy, the operations further comprise: performing comparison to determine whether each of the factors of the policy cost is within a specified range; and 
deleting a game policy corresponding to a policy cost comprising a factor that is not within the specified range.
Phillips teaches
wherein after the calculating a policy cost of each game policy, the operations further comprise: performing comparison to determine whether each of the factors of the policy cost is within a specified range (see at least Phillips, para. [0158]: The actor caution cost function can take, as input, a list of candidate trajectories, a list of object trajectories likelihoods, and/or blocking interval information for each combination of candidate trajectory and likely object trajectory. In this context, the blocking intervals can represent the minimum and maximum odometer value (distance) that would result in a collision with an object. This blocking interval can be based on an along path speed associated with the autonomous vehicle. The actor caution cost function can determine a speed for each candidate trajectory that brings the autonomous vehicle closest to a particular object without causing a collision. Based on the determined speed, the actor caution cost function can output a one-dimensional vector including a cost for each candidate trajectory based on the comparison of the candidate trajectories, the speed limits they must stay under to avoid an unlikely future collision, and the object (obstacle) trajectory likelihoods.); and
deleting a game policy corresponding to a policy cost comprising a factor that is not within the specified range (see at least Phillips, para. [0138]: The cost value for a particular candidate trajectory can be based on an evaluation of the specific trajectory. In some examples, the trajectory generation and selection system 300 can determine whether the specific candidate trajectory will result in a collision with another object (e.g., a vehicle, a stationary object, and so on). In some examples, a trajectory generation and selection system 300 can remove candidate trajectories that result in a likely collision from consideration or give such candidate trajectories a very high cost, such that it will not be selected unless no other trajectories are possible.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Green to incorporate the teaching of Phillips, wherein after the calculating a policy cost of each game policy, the operations further comprise: performing comparison to determine whether each of the factors of the policy cost is within a specified range; and deleting a game policy corresponding to a policy cost comprising a factor that is not within the specified range, with a reasonable expectation of success, in order to provide an improved motion planning framework for generating a trajectory that can be used to direct an autonomous vehicle from a first position to a second position (see at least Phillips, para. [0026]).

As per claim 17 Green discloses
A computer program product comprising computer-executable instructions stored on a non-transitory computer-readable storage medium that, when executed by a processor, cause an apparatus to perform operations comprising (see at least Green, para. [0071]: The autonomy computing system 102 includes one or more processors 112 and a memory 114…The memory 114 can include one or more non-transitory computer-readable storage mediums, such as RAM, ROM, EEPROM, EPROM, flash memory devices, magnetic disks, etc., and combinations thereof. The memory 114 can store data 116 and instructions 118 which are executed by the processor 112 to cause autonomy computing system 102 to perform operations.):
obtaining a predicted moving track of an ego vehicle and predicted moving tracks of obstacles around the ego vehicle (see at least Green, para. [0070]: …generate an appropriate motion path through such surrounding environment. The autonomy computing system 102 can control the one or more vehicle controls 107 to operate the autonomous vehicle 10 according to the motion path. & para. [0082]: The prediction system 104 can receive the state data from the perception system 103 and predict one or more future locations for each object based on such state data. For example, the prediction system 104 can predict where each object will be located within the next 5 seconds, 10 seconds, 20 seconds, etc. As one example, an object can be predicted to adhere to its current trajectory according to its current speed.); and
determining a game object, wherein the game object is an obstacle that is in the obstacles around the ego vehicle and whose predicted moving track intersects the predicted moving track of the ego vehicle or whose distance from the ego vehicle is less than a specified threshold (see at least Green, para. [0097]: For example, the feature(s) can include a location of the object relative to a travel way (e.g., relative to the left or right lane markings), a location of the object relative to the autonomous vehicle (e.g., a distance between the current locations of the vehicle and the object), one or more characteristic(s) of the object relative to a travel route associated with the autonomous vehicle (e.g., whether the object is moving parallel, towards, or away from the vehicle's current/future travel route or a predicted point of intersection with the vehicle's travel route), etc.);
constructing one sampling game space for each game object based on vehicle information of the ego vehicle, obstacle information of the game object, and road condition information that are collected by a sensor system (see at least Green, para. [0078-0079] & para. [0092]: The world state generator 204 can receive information from the prediction system 104, the map data 126, and/or other information such as vehicle pose, a current route, or other information. The world state generator 204 can synthesize all received information to produce a world state that describes the state of all objects in and other aspects of the surrounding environment of the autonomous vehicle at each time step.), 
wherein each sampling game space comprises one or more game policies (see at least Green, para. [0054]: In some implementations, the features used to make yield decisions which prevent gridlock can include the current position, velocity, and/or acceleration of a relevant object (e.g., the next vehicle ahead of the autonomous vehicle within the current lane, which may be referred to as the “lead vehicle”) and/or the predicted position, velocity, and/or acceleration of the relevant object at the end of a certain time period (e.g., 10 seconds). Other example features include the current and/or predicted future values for the position, velocity, and/or acceleration of other objects such as, for example, the next vehicle in front of the lead vehicle. & para. [0058]: Once the machine learned yield model has provided one or more yield decisions for the autonomous vehicle relative to one or more objects (e.g., other vehicles, traffic signals, etc.), the autonomy computing system can plan the motion of the autonomous vehicle based at least in part on the determined yield decision. For example, the autonomy computing system can select and evaluate one or more cost functions indicative of a cost (e.g., over time) of controlling the motion of the autonomous vehicle (e.g., the trajectory, speed, or other controllable parameters of the autonomous vehicle) to perform a trajectory that executes or otherwise complies with the yield decision. & para. [0084-0086]);
calculating a policy cost of each game policy (see at least Green, para. [0084]: In particular, according to an aspect of the present disclosure, the motion planning system 105 can evaluate one or more cost functions for each of one or more candidate motion plans for the autonomous vehicle 10. For example, the cost function(s) can describe a cost (e.g., over time) of adhering to a particular candidate motion plan and/or describe a reward for adhering to the particular candidate motion plan. For example, the reward can be of opposite sign to the cost.), and
determining a decision making result of the ego vehicle, wherein the decision making result is a game policy with a smallest policy cost in a common sampling game space, the common sampling game space comprises at least one game policy, and each sampling game space comprises the at least one game policy in the common sampling game space (see at least Green, para. [0030-0034] & para. [0058]: For example, the autonomy computing system can select and evaluate one or more cost functions indicative of a cost (e.g., over time) of controlling the motion of the autonomous vehicle (e.g., the trajectory, speed, or other controllable parameters of the autonomous vehicle) to perform a trajectory that executes or otherwise complies with the yield decision. For example, the autonomy computing system can implement an optimization algorithm that considers the cost functions associated with the yield decision determined by the machine learned model as well as other cost functions (e.g., based on speed limits, traffic lights, etc.) to determine optimized variables that make up the motion plan. & para. [0094]: As examples, the scenario controllers 206 can include one or more of: a pass, ignore, queue controller that decides, for each object in the world, whether the autonomous vehicle should pass, ignore, or queue such object; a yield controller that decides, for each adjacent vehicle in the world, whether the autonomous vehicle should yield to such vehicle; a lane change controller that identifies whether and when to change lanes; and/or a speed regressor that determines an appropriate driving speed for each time step.).
However Green does not explicitly disclose
wherein the policy cost is a numerical value obtained by performing weighting on each factor weight of the policy cost.
Phillips teaches
wherein the policy cost is a numerical value obtained by performing weighting on each factor weight of the policy cost (see at least Phillips, para. [0058]: In some examples, each sub-cost value can be assigned a weight based on its particular importance. The final cost of a candidate trajectory can then be weighted towards the more important sub-cost values. For examples, the sub-cost value associated with a potential collision can be weighted more heavily than a sub-cost value associated with actor caution costs. In some examples, the weight of particular sub-cost value can depend on the current situation of the autonomous vehicle. Thus, overtaking buffer costs may be weighted more heavily on a single lane road than on a multi-lane highway. para. [0071]: Each cost function can generate a sub-cost value, representing one aspect of the cost of a particular trajectory. Each sub-cost value can be assigned a weight based on one or more factors including the current situation of the autonomous vehicle. The sub-cost values can then be combined to produce a total cost of a trajectory at least partially based on the weights assigned to each sub-cost value.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Green to incorporate the teaching of wherein the policy cost is a numerical value obtained by performing weighting on each factor weight of the policy cost of Phillips, with a reasonable expectation of success, in order to provide an improved motion planning framework for generating a trajectory that can be used to direct an autonomous vehicle from a first position to a second position (see at least Phillips, para. [0026]).
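
The weighting limitation, and the weighted sub-cost combination quoted from Phillips, can be illustrated with a simple weighted sum. This is a minimal sketch under stated assumptions: the factor names, the weight values, and the function `policy_cost` are hypothetical, and a plain weighted sum is only one way the weighting could be carried out.

```python
from typing import Dict


def policy_cost(factors: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-factor costs into one numerical value as a weighted sum."""
    return sum(weights[name] * value for name, value in factors.items())


# Assumed example: the safety factor is weighted far more heavily than the
# comfort or efficiency factors.
factors = {"safety": 2.0, "comfort": 1.0, "efficiency": 4.0}
weights = {"safety": 10.0, "comfort": 1.0, "efficiency": 0.5}
total = policy_cost(factors, weights)  # 10*2 + 1*1 + 0.5*4
```

With these assumed numbers the total is 23.0, of which 20.0 comes from the safety factor alone.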

As per claim 19 Green discloses
wherein the operations further comprise: determining a non-game object, wherein the non-game object is an obstacle that is in the obstacles around the ego vehicle and whose predicted moving track does not intersect the predicted moving track of the ego vehicle or whose distance from the ego vehicle is not less than the specified threshold (see at least Green, para. [0097]: For example, the feature(s) can include a location of the object relative to a travel way (e.g., relative to the left or right lane markings), a location of the object relative to the autonomous vehicle (e.g., a distance between the current locations of the vehicle and the object), one or more characteristic(s) of the object relative to a travel route associated with the autonomous vehicle (e.g., whether the object is moving parallel, towards, or away from the vehicle's current/future travel route or a predicted point of intersection with the vehicle's travel route), etc.);
constructing a feasible region of the ego vehicle based on the vehicle information of the ego vehicle, obstacle information of the non-game object, and the road condition information that are collected by the sensor system, wherein the feasible region of the ego vehicle is at least one policy of using different decisions by the ego vehicle without colliding with the non-game object (see at least Green, para. [0078-0079] & para. [0092]: The world state generator 204 can receive information from the prediction system 104, the map data 126, and/or other information such as vehicle pose, a current route, or other information. The world state generator 204 can synthesize all received information to produce a world state that describes the state of all objects in and other aspects of the surrounding environment of the autonomous vehicle at each time step.); and
in response to detecting that the decision making result of the ego vehicle is within the feasible region of the ego vehicle, outputting the decision making result of the ego vehicle (see at least Green, para. [0030-0034] & para. [0058] & para. [0094]: As examples, the scenario controllers 206 can include one or more of: a pass, ignore, queue controller that decides, for each object in the world, whether the autonomous vehicle should pass, ignore, or queue such object; a yield controller that decides, for each adjacent vehicle in the world, whether the autonomous vehicle should yield to such vehicle; a lane change controller that identifies whether and when to change lanes; and/or a speed regressor that determines an appropriate driving speed for each time step.).
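
The split between game objects (claim 17) and non-game objects (claim 19) can be sketched as below. This is an illustration under several assumptions that do not come from the record: tracks are lists of sampled (x, y) points, track intersection is approximated by any two sampled points lying within a tolerance of each other, distance is measured between the first point of each track, and a non-game object is treated as any obstacle that is not a game object. The names `tracks_intersect` and `split_obstacles` are hypothetical.

```python
from math import hypot
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def tracks_intersect(a: List[Point], b: List[Point], tol: float = 0.5) -> bool:
    """Crude stand-in for track intersection: any two sampled points within tol."""
    return any(hypot(ax - bx, ay - by) <= tol for ax, ay in a for bx, by in b)


def split_obstacles(ego_track: List[Point],
                    obstacle_tracks: Dict[str, List[Point]],
                    threshold: float) -> Tuple[List[str], List[str]]:
    """Return (game objects, non-game objects) by track intersection or distance."""
    game, non_game = [], []
    ex, ey = ego_track[0]  # assumed: first point is the current position
    for name, track in obstacle_tracks.items():
        ox, oy = track[0]
        near = hypot(ox - ex, oy - ey) < threshold
        if near or tracks_intersect(ego_track, track):
            game.append(name)
        else:
            non_game.append(name)
    return game, non_game


# Assumed example scene.
ego_track = [(0.0, 0.0), (0.0, 5.0), (0.0, 10.0)]
obstacle_tracks = {
    "crossing": [(10.0, 10.0), (5.0, 10.0), (0.0, 10.0)],  # reaches ego's track
    "parked": [(20.0, 0.0), (20.0, 0.0)],                  # far away, no overlap
    "tailgater": [(0.0, -2.0), (0.0, -1.0)],               # within the threshold
}
game, non_game = split_obstacles(ego_track, obstacle_tracks, threshold=3.0)
```

In this assumed scene `crossing` and `tailgater` are game objects and `parked` is a non-game object.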

As per claim 20 Green discloses
wherein the constructing one sampling game space for each game object based on vehicle information of the ego vehicle, obstacle information of the game object, and road condition information that are collected by a sensor system comprises (see at least Green, para. [0040-0041]: As another example, the feature data for an object can be descriptive of at least one of a required deceleration and a required acceleration associated with the first object. For example, the required deceleration can describe the amount of deceleration that will be required for the autonomous vehicle to yield to the object. In general, this can indicate how hard it will be to stop the autonomous vehicle (e. g., to avoid collision with the object or to comply with a traffic command provided or predicted to be provided by a traffic signal). As an example, the required deceleration for a traffic light can indicate the amount of deceleration required for the autonomous vehicle to come to a stop at or before a stop line associated with the traffic light.):
determining upper decision limits and lower decision limits of the ego vehicle and each obstacle in the game object based on the vehicle information of the ego vehicle, the obstacle information of the game object, and the road condition information (see at least Green, para. [0098-0099]: For example, the required acceleration can describe an amount of acceleration that will be required to complete an unprotected left turn through an intersection prior to an oncoming vehicle entering the intersection or, as another example, an amount of acceleration that will be required to continue through an intersection prior to a traffic light transitioning to a red light.);
obtaining decision making policies of the ego vehicle and each obstacle in the game object from the upper decision limits and lower decision limits of the ego vehicle and each obstacle in the game object according to a specified rule (see at least Green, para. [0099]: As a further example, the required acceleration associated with the first object can describe the amount of acceleration that will be required for the autonomous vehicle to not yield to the first object. For example, the required acceleration can describe an amount of acceleration that will be required to complete an unprotected left turn through an intersection prior to an oncoming vehicle entering the intersection or, as another example, an amount of acceleration that will be required to continue through an intersection prior to a traffic light transitioning to a red light. Similarly, the required acceleration can be compared to one or more jerk limit (s) to generate additional features.); and
combining a decision making policy of the ego vehicle and a decision making policy of each obstacle in the game object, to obtain at least one game policy between the ego vehicle and each obstacle in the game object (see at least Green, para. [0099]: As a further example, the required acceleration associated with the first object can describe the amount of acceleration that will be required for the autonomous vehicle to not yield to the first object. For example, the required acceleration can describe an amount of acceleration that will be required to complete an unprotected left turn through an intersection prior to an oncoming vehicle entering the intersection or, as another example, an amount of acceleration that will be required to continue through an intersection prior to a traffic light transitioning to a red light. Similarly, the required acceleration can be compared to one or more jerk limit (s) to generate additional features.).
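
The three steps of claim 20 (decision limits, policies obtained from those limits according to a rule, and combination into game policies) can be sketched as follows. This is a minimal illustration, assuming that each decision is a longitudinal acceleration, that the "specified rule" is even sampling between the lower and upper limits, and that combining means taking every ego/obstacle pair. None of these choices is stated in the claim or in Green, and the names `sample_policies` and `build_game_space` are hypothetical.

```python
from itertools import product
from typing import List, Tuple


def sample_policies(lower: float, upper: float, count: int) -> List[float]:
    """Evenly sample `count` values from lower to upper inclusive (assumed rule)."""
    if count < 2:
        return [lower]
    step = (upper - lower) / (count - 1)
    return [lower + i * step for i in range(count)]


def build_game_space(ego_limits: Tuple[float, float],
                     obstacle_limits: Tuple[float, float],
                     count: int = 3) -> List[Tuple[float, float]]:
    """Pair every sampled ego policy with every sampled obstacle policy."""
    ego = sample_policies(ego_limits[0], ego_limits[1], count)
    obstacle = sample_policies(obstacle_limits[0], obstacle_limits[1], count)
    return list(product(ego, obstacle))


# Assumed limits in m/s^2: ego may use -2..2, the obstacle -1..1.
space = build_game_space((-2.0, 2.0), (-1.0, 1.0))
```

With three samples per participant, the resulting sampling game space holds nine (ego, obstacle) game policies, from (-2.0, -1.0) through (2.0, 1.0).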

Claim(s) 2, 8, 10, 16, & 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Green, further in view of Phillips, in view of US 2022/0402485A1 (“Kobilarov”).
As per claim 2 Green does not disclose
wherein the determining a decision making result of the ego vehicle comprises:
constructing a feasible region of each sampling game space, wherein the feasible region of each sampling game space is at least one game policy corresponding to a policy cost that meets a specified requirement; and
determining a game policy with a smallest policy cost in same game policies from an intersection of the feasible region of each sampling game space.
Kobilarov teaches
wherein the determining a decision making result of the ego vehicle comprises: constructing a feasible region of each sampling game space, wherein the feasible region of each sampling game space is at least one game policy corresponding to a policy cost that meets a specified requirement (see at least Kobilarov, para. [0042]: In some examples, a tree search can determine potential interactions between object trajectories of different objects and/or object trajectories and vehicle trajectories, and the potential interactions determined by the tree search can be used in a simulation, as further described in FIG. 3 and elsewhere. In various examples, the tree search can identify potential interactions over time to reduce an amount of potential interactions for a later time (e.g., at each second or other time interval the tree search can determine a most likely interaction between object(s) and the vehicle). In some examples, vehicle actions can be explored as various branches of a tree search and tree branches can be pruned or ignored if a cost associated with an action meets or exceeds a threshold cost.); and
determining a game policy with a smallest policy cost in same game policies from an intersection of the feasible region of each sampling game space (see at least Kobilarov, para. [0041-0042]: In some examples, the active prediction component 104 can associate one or more of the intersection points 210, 212, 214, 216, 218, and 220 with an intersection probability, and output the intersection points associated with the object most likely to reach the intersection point first for further processing, such as by a simulation component as described in FIG. 3. In some examples, two objects (e.g., the vehicle 110 and the vehicle 112) may intersect with a same vehicle trajectory, and the active prediction component 104 can identify or determine which of the two objects (or object trajectories associated therewith) to process with the simulation component based at least in part on a control policy associated with rules of the road, right of way logic, physics, kinematics, dynamics, and the like. In this way, computational resources can be omitted with respect to, for example, processing data associated with scenarios in which the object reaches an intersection point with the vehicle 102 after another object. … In such an example, the tree may branch at such discrete points based on differing actions that the vehicle could take at those points and the methods described herein may be used in selecting between those branches when expanding the tree. For instance, branches may be explored having the lowest cost and/or in which there is no adverse event (e.g., collision, uncomfortable control, etc.).).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Green to incorporate the teaching of wherein the determining a decision making result of the ego vehicle comprises: constructing a feasible region of each sampling game space, wherein the feasible region of each sampling game space is at least one game policy corresponding to a policy cost that meets a specified requirement; and determining a game policy with a smallest policy cost in same game policies from an intersection of the feasible region of each sampling game space of Kobilarov, with a reasonable expectation of success, so that future states of the object(s) and vehicle can be accurately and efficiently processed and identified, thereby improving the overall safety of the vehicle (see at least Kobilarov, para. [0009]).
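
The two steps recited in claim 2 can be sketched as follows. This is an illustration under stated assumptions, not a description of the application or of Kobilarov: each sampling game space is reduced to a mapping from an ego action to its policy cost, the "specified requirement" is a maximum cost, and policies that appear in every feasible region are ranked by the sum of their costs across the spaces. The names `feasible_region` and `decide` are hypothetical.

```python
from typing import Dict, List, Optional

# Assumed representation: one mapping per game object, ego action -> policy cost.
Space = Dict[str, float]


def feasible_region(space: Space, max_cost: float) -> Space:
    """Keep only the policies whose cost meets the (assumed) max-cost requirement."""
    return {action: cost for action, cost in space.items() if cost <= max_cost}


def decide(spaces: List[Space], max_cost: float) -> Optional[str]:
    """Pick the lowest-cost policy shared by every feasible region, if any."""
    regions = [feasible_region(s, max_cost) for s in spaces]
    if not regions:
        return None
    common = set(regions[0]).intersection(*regions[1:])
    if not common:
        return None
    # Assumed aggregation: sum the cost of the same policy over all spaces.
    # Sorting first makes tie-breaking deterministic.
    return min(sorted(common), key=lambda a: sum(r[a] for r in regions))


# Assumed example with two game objects.
spaces = [
    {"yield": 3.0, "pass": 1.0, "hold": 2.0},
    {"yield": 1.0, "pass": 9.0, "hold": 2.5},  # "pass" is infeasible here
]
result = decide(spaces, max_cost=5.0)
```

Here `pass` is cheapest against the first object but is excluded by the second object's feasible region, so the shared policy with the smallest summed cost, `yield`, is selected.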

As per claim 8 Green discloses
wherein the method further comprises: in response to detecting that the decision making result of the ego vehicle is not within the feasible region of the ego vehicle, outputting a decision making result of yielding by the ego vehicle (see at least Green, para. [0113]: The yield controller 250 can input data indicative of at least the feature(s) for one or more objects into the machine learned yield model and receive, as an output, data indicative of a recommended yield decision relative to the one or more objects. The yield decision can include yielding for the object by, for example, stopping a motion of the autonomous vehicle for the object. & para. [0118]: The yield decision making component 504 can include a gap selection component 540 that selects a selected yield gap 550 based on the gap classifications 512, 522, and 532. As one example, the first gap (e. g., temporally speaking) to receive a do not yield classification can be selected as the selected yield gap 550. If none of the gaps received a do not yield decision, then a stop decision can be selected. As another example, the first gap (e. g., temporally speaking) to receive a score greater than a threshold value can be selected as the selected yield gap 550. If none of the gaps received a score greater than the threshold, then a stop decision can be selected.).
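
The fallback recited in this limitation (and its counterpart in claim 19, where the result is output when it lies inside the feasible region) can be sketched in a few lines. This is a minimal illustration, assuming that decisions are represented as strings and that the ego vehicle's feasible region is a collection of permitted decisions; the function `output_decision` is hypothetical.

```python
from typing import Collection


def output_decision(result: str,
                    ego_feasible_region: Collection[str],
                    fallback: str = "yield") -> str:
    """Output the game result if it is feasible; otherwise fall back to yielding."""
    return result if result in ego_feasible_region else fallback


# Assumed example: "pass" is feasible in the first case but not the second.
inside = output_decision("pass", {"pass", "yield"})
outside = output_decision("pass", {"hold", "yield"})
```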

As per claim 10 Green does not disclose
wherein the determining a decision making result of the ego vehicle comprises:
constructing a feasible region of each sampling game space, wherein the feasible region of each sampling game space is at least one game policy corresponding to a policy cost that meets a specified requirement; and
determining a game policy with a smallest policy cost in same game policies from an intersection of the feasible region of each sampling game space.
Kobilarov teaches
wherein the determining a decision making result of the ego vehicle comprises: constructing a feasible region of each sampling game space, wherein the feasible region of each sampling game space is at least one game policy corresponding to a policy cost that meets a specified requirement (see at least Kobilarov, para. [0042]: In some examples, a tree search can determine potential interactions between object trajectories of different objects and/or object trajectories and vehicle trajectories, and the potential interactions determined by the tree search can be used in a simulation, as further described in FIG. 3 and elsewhere. In various examples, the tree search can identify potential interactions over time to reduce an amount of potential interactions for a later time (e.g., at each second or other time interval the tree search can determine a most likely interaction between object(s) and the vehicle). In some examples, vehicle actions can be explored as various branches of a tree search and tree branches can be pruned or ignored if a cost associated with an action meets or exceeds a threshold cost.); and
determining a game policy with a smallest policy cost in same game policies from an intersection of the feasible region of each sampling game space (see at least Kobilarov, para. [0041-0042]: In some examples, the active prediction component 104 can associate one or more of the intersection points 210, 212, 214, 216, 218, and 220 with an intersection probability, and output the intersection points associated with the object most likely to reach the intersection point first for further processing, such as by a simulation component as described in FIG. 3. In some examples, two objects (e.g., the vehicle 110 and the vehicle 112) may intersect with a same vehicle trajectory, and the active prediction component 104 can identify or determine which of the two objects (or object trajectories associated therewith) to process with the simulation component based at least in part on a control policy associated with rules of the road, right of way logic, physics, kinematics, dynamics, and the like. In this way, computational resources can be omitted with respect to, for example, processing data associated with scenarios in which the object reaches an intersection point with the vehicle 102 after another object. … In such an example, the tree may branch at such discrete points based on differing actions that the vehicle could take at those points and the methods described herein may be used in selecting between those branches when expanding the tree. For instance, branches may be explored having the lowest cost and/or in which there is no adverse event (e.g., collision, uncomfortable control, etc.).).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Green to incorporate the teaching of wherein the determining a decision making result of the ego vehicle comprises: constructing a feasible region of each sampling game space, wherein the feasible region of each sampling game space is at least one game policy corresponding to a policy cost that meets a specified requirement; and determining a game policy with a smallest policy cost in same game policies from an intersection of the feasible region of each sampling game space of Kobilarov, with a reasonable expectation of success, so that future states of the object(s) and vehicle can be accurately and efficiently processed and identified, thereby improving the overall safety of the vehicle (see at least Kobilarov, para. [0009]).

As per claim 16 Green discloses
wherein the operations further comprise: in response to detecting that the decision making result of the ego vehicle is not within the feasible region of the ego vehicle, outputting a decision making result of yielding by the ego vehicle (see at least Green, para. [0113]: The yield controller 250 can input data indicative of at least the feature(s) for one or more objects into the machine learned yield model and receive, as an output, data indicative of a recommended yield decision relative to the one or more objects. The yield decision can include yielding for the object by, for example, stopping a motion of the autonomous vehicle for the object. & para. [0118]: The yield decision making component 504 can include a gap selection component 540 that selects a selected yield gap 550 based on the gap classifications 512, 522, and 532. As one example, the first gap (e. g., temporally speaking) to receive a do not yield classification can be selected as the selected yield gap 550. If none of the gaps received a do not yield decision, then a stop decision can be selected. As another example, the first gap (e. g., temporally speaking) to receive a score greater than a threshold value can be selected as the selected yield gap 550. If none of the gaps received a score greater than the threshold, then a stop decision can be selected.).

As per claim 18 Green does not explicitly disclose
wherein the determining a decision making result of the ego vehicle comprises:
constructing a feasible region of each sampling game space, wherein the feasible region of each sampling game space is at least one game policy corresponding to a policy cost that meets a specified requirement; and
determining a game policy with a smallest policy cost in same game policies from an intersection of the feasible region of each sampling game space.
Kobilarov teaches
wherein the determining a decision making result of the ego vehicle comprises: constructing a feasible region of each sampling game space, wherein the feasible region of each sampling game space is at least one game policy corresponding to a policy cost that meets a specified requirement (see at least Kobilarov, para. [0042]: In some examples, a tree search can determine potential interactions between object trajectories of different objects and/or object trajectories and vehicle trajectories, and the potential interactions determined by the tree search can be used in a simulation, as further described in FIG. 3 and elsewhere. In various examples, the tree search can identify potential interactions over time to reduce an amount of potential interactions for a later time (e.g., at each second or other time interval the tree search can determine a most likely interaction between object(s) and the vehicle). In some examples, vehicle actions can be explored as various branches of a tree search and tree branches can be pruned or ignored if a cost associated with an action meets or exceeds a threshold cost.); and
determining a game policy with a smallest policy cost in same game policies from an intersection of the feasible region of each sampling game space (see at least Kobilarov, para. [0041-0042]: In some examples, the active prediction component 104 can associate one or more of the intersection points 210, 212, 214, 216, 218, and 220 with an intersection probability, and output the intersection points associated with the object most likely to reach the intersection point first for further processing, such as by a simulation component as described in FIG. 3. In some examples, two objects (e.g., the vehicle 110 and the vehicle 112) may intersect with a same vehicle trajectory, and the active prediction component 104 can identify or determine which of the two objects (or object trajectories associated therewith) to process with the simulation component based at least in part on a control policy associated with rules of the road, right of way logic, physics, kinematics, dynamics, and the like. In this way, computational resources can be omitted with respect to, for example, processing data associated with scenarios in which the object reaches an intersection point with the vehicle 102 after another object. … In such an example, the tree may branch at such discrete points based on differing actions that the vehicle could take at those points and the methods described herein may be used in selecting between those branches when expanding the tree. For instance, branches may be explored having the lowest cost and/or in which there is no adverse event (e.g., collision, uncomfortable control, etc.).).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Green to incorporate the teaching of wherein the determining a decision making result of the ego vehicle comprises: constructing a feasible region of each sampling game space, wherein the feasible region of each sampling game space is at least one game policy corresponding to a policy cost that meets a specified requirement; and determining a game policy with a smallest policy cost in same game policies from an intersection of the feasible region of each sampling game space of Kobilarov, with a reasonable expectation of success, so that future states of the object(s) and vehicle can be accurately and efficiently processed and identified, thereby improving the overall safety of the vehicle (see at least Kobilarov, para. [0009]).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMED ABDO ALGEHAIM whose telephone number is (571)272-3628. The examiner can normally be reached Monday-Friday 8-5PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Fadey Jabr can be reached at 571-272-1516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MOHAMED ABDO ALGEHAIM/Primary Examiner, Art Unit 3668
Prosecution Timeline

Oct 26, 2023: Application Filed
Dec 06, 2023: Response after Non-Final Action
Apr 08, 2026: Non-Final Rejection mailed (§101, §103; current)
Precedent Cases

Applications granted by this same examiner with similar technology

18/156,703; Patent 12623687; METHOD AND APPARATUS FOR PLANNING OBSTACLE AVOIDANCE PATH OF TRAVELING APPARATUS; 3y 3m to grant; Granted May 12, 2026
17/605,910; Patent 12613096; Method Of Flight Plan Optimization Of A High Altitude Long Endurance Aircraft; 4y 6m to grant; Granted Apr 28, 2026
17/021,981; Patent 12594963; DETECTING AN UNKNOWN OBJECT BY A LEAD AUTONOMOUS VEHICLE (AV) AND UPDATING ROUTING PLANS FOR FOLLOWING AVs; 5y 6m to grant; Granted Apr 07, 2026
17/908,366; Patent 12597865; INVERTER; 3y 7m to grant; Granted Apr 07, 2026
17/664,256; Patent 12589978; TRUCK-TABLET INTERFACE; 3y 10m to grant; Granted Mar 31, 2026
Based on the 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 59%
With Interview (+21.7%): 80%
Median Time to Grant: 3y 1m (~6m remaining)
PTA Risk: Low
Based on 216 resolved cases by this examiner. Grant probability derived from career allowance rate.