Office Action Analysis: 18540354 — DISTILLATION-TRAINED MACHINE LEARNING MODELS FOR EFFICIENT TRAJECTORY PREDICTION

Office Action

§102 §103
Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
2.	The United States Patent & Trademark Office appreciates the application that is submitted by the inventor/assignee. The United States Patent & Trademark Office reviewed the following application and has made the following comments below.

Information Disclosure Statement
3.	The information disclosure statement (IDS) was submitted on 12/14/2023. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

4.	The information disclosure statement (IDS) submitted on 05/28/2025 was filed after the mailing date of the first information disclosure statement on 12/14/2023.  The submission is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 102
5.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

6.	Claims 1-8, 10, 12, and 13 are rejected under 35 U.S.C. 102(a)(2) as being unpatentable over Refaat et al (US Patent Pub. No. 2021/0133582 A1, hereafter referred to as Refaat).

7.	Regarding Claim 1, Refaat teaches a method comprising obtaining first training data (paragraph 70, Refaat teaches training examples that include a training input that includes data characterization. The Examiner interprets the training input that is required for the training examples, which also include data characterization as an initial means such as first training data by which the method claimed will be carried out.), wherein the training data comprises: a first training input representative of a driving environment of a vehicle (paragraph 79, Refaat teaches a training network input that includes data characterizing a scene in an environment in a vicinity of the vehicle. The Examiner interprets the training network input as the first training input and the data characterized as a scene in an environment in a vicinity of the vehicle as a driving environment of a vehicle.), the driving environment comprising one or more objects (paragraphs 31-33, Refaat teaches that the environment in a vicinity of the vehicle includes an agent, which is described as anything present in the vicinity of the vehicle including a pedestrian, bicyclists, or other vehicles. The Examiner interprets the vicinity of the vehicle as the driving environment and any agent in the vicinity as one or more objects in the driving environment.), and
one or more ground truth trajectories associated with a forecasted motion of the vehicle within the driving environment (paragraph 80, Refaat teaches a ground truth output that defines a ground truth future trajectory of the agent within the vicinity of the vehicle. The Examiner interprets the ground truth output that defines the ground truth future trajectories as a ground truth trajectory that is associated with a forecasted motion of the vehicle since ground truth trajectories are real-world data that serve as reference outputs to define the actual path within the driving environment.), wherein the one or more ground truth trajectories are generated by a teacher model using the first training input (paragraphs 4, 42, and 48, Refaat teaches obtaining a ground truth output defining a ground truth future trajectory of the agent and processing using the first sub neural network. The Examiner interprets the ground truth future trajectory as the ground truth trajectory and the first sub neural network, which is used to obtain the ground truth output and ground truth future trajectory, as the teacher model using the first training input.); and training, using the training data (paragraph 4, Refaat teaches using values of the parameters of the first sub neural network, which is known as the training network input, to generate a training intermediate representation. The Examiner interprets values from the first sub neural network being used to generate the training intermediate representation as the teacher model being trained using training data.), a student model to predict one or more trajectories (paragraph 4, Refaat teaches a second sub neural network to generate respective training confidence scores for each of the one or more candidate future trajectories. The Examiner interprets the second sub neural network as the student model and the confidence scores assigned to trajectories as a means to predict one or more trajectories.), wherein training the student model comprises reducing a difference between the one or more trajectories predicted by the student model (paragraphs 4, 10, 16, 58, and 60, Refaat teaches a difference between the training predicted future trajectory and the ground truth future trajectory that is used to compute gradients of first and second losses respective to the first and second sub neural networks to update the values of the parameters of the first and second sub neural networks. The Examiner interprets the difference between the training predicted future trajectory and the ground truth future trajectory as the difference between one or more trajectories predicted by the student model and one or more ground truth trajectories.) and the one or more ground truth trajectories (paragraph 4, Refaat teaches a difference between the training predicted future trajectory and the ground truth future trajectory that is used to compute gradients of first and second losses respective to the first and second sub neural networks to update the values of the parameters of the first and second sub neural networks. The Examiner interprets the difference between the training predicted future trajectory and the ground truth future trajectory as the difference between one or more trajectories predicted by the student model and one or more ground truth trajectories.).

8.	In regards to Claim 2, Refaat teaches wherein the first training input representative of the driving environment of the vehicle comprises: a history of motion of (i) the vehicle and (ii) the one or more objects (paragraphs 32 and 42, Refaat teaches the trajectory of an agent being made possible by data defining, which is known by the environment at the time point and characteristics of the motion of the agent at the time point, including velocity of the agent; as well as an input that includes data characterizing a scene in an environment in the vicinity of the vehicle, including data representing one or more candidate future trajectories. The Examiner interprets data defining the motion of the agent at time points as a history of the motion of one or more objects and data characterizing a scene in an environment in the vicinity of the vehicle, including information representing future trajectories, as a history of motion of the vehicle.).

9.	In regards to Claim 3, Refaat teaches wherein the first training input representative of the driving environment of the vehicle further comprises at least one of: a status of one or more traffic lights of the driving environment of the vehicle, a roadgraph information associated with the driving environment of the vehicle, or a representation of a target route of the vehicle through the driving environment of the vehicle (paragraph 4, Refaat teaches the training network input being used to generate a training intermediate representation of the environment in a vicinity of the vehicle. The Examiner interprets the training intermediate representation as a representation of a target route of the vehicle through the driving environment.).  

10.	In regards to Claim 4, Refaat teaches wherein the one or more trajectories predicted by the student model comprise a first set of probable trajectories of the vehicle (paragraph 81, Refaat teaches the generation of respective training confidence scores for each of the one or more candidate future trajectories using the second sub neural network. The Examiner interprets the confidence scores assigned to each of the one or more candidate future trajectories as a first set of probable trajectories of the vehicle, where the confidence scores indicate a predicted likelihood, and the second sub neural network as the student model.), wherein each probable trajectory of the first set of probable trajectories of the vehicle comprises one or more locations of the vehicle (paragraphs 81 and 82, Refaat teaches future trajectories of location defining a path for travel of the vehicle, which would be the probable set of trajectories of the vehicle. Refaat also teaches each confidence score corresponding to a predicted likelihood that the agent will follow the corresponding candidate future trajectory. The Examiner interprets the correspondence of a predicted likelihood of the agent, which includes different objects at different moments in the environment of the vicinity of the vehicle, as probable trajectories corresponding to one or more locations of the vehicles at one or more future times.) at corresponding one or more future times (paragraph 81, Refaat teaches each confidence score corresponding to a predicted likelihood that the agent will follow the corresponding candidate future trajectory. The Examiner interprets the correspondence of a predicted likelihood of the agent, which includes different objects at different moments in the environment of the vicinity of the vehicle, as probable trajectories corresponding to one or more locations of the vehicles at one or more future times.).

11.	In regards to Claim 5, Refaat teaches wherein each probable trajectory of the first set of probable trajectories of the vehicle further comprises one or more velocities of the vehicle at the corresponding one or more future times (paragraph 32, Refaat teaches the trajectories being predicted using characteristics of motion of an agent at specific time points such as location, velocities, acceleration, and headings or orientation of agents, which can be used to obtain the locations and velocities of the autonomous vehicle and agents in the driving environment. The Examiner interprets the characteristics of motion such as velocity at specific time points as velocity of the vehicle at the corresponding one or more future times.).

12.	In regards to Claim 6, Refaat teaches wherein the one or more trajectories predicted by the student model comprise a second set of probable trajectories of at least one object of the one or more objects of the driving environment of the vehicle (paragraphs 81 and 82, Refaat teaches the training predicted future trajectory, which contains data on the future path for an agent in the environment of the vehicle, that is predicted in combination of the second sub neural network and trajectory generation neural network. The Examiner interprets agents as one or more objects in the driving environment of the vehicle, the second sub neural network as the student model, and the training predicted future trajectory predicted by the second sub neural network as the second set of probable trajectories predicted by the student model.).

13.	In regards to Claim 7, Refaat teaches wherein each of the one or more predicted trajectories comprises a plurality of temporal segments (paragraph 31, Refaat teaches predicted trajectories with multiple time points and spatial position occupied by the agent in the environment at the time point and characteristics of the motion of the agent at a time point, an example of temporal segment, which is the dividing of image sequences into meaningful chunks based on time or content changes such as motion. The Examiner interprets this as predicted trajectories being comprised of a plurality of temporal segments predicted.), and wherein the plurality of temporal segments of a respective predicted trajectory are generated in parallel (paragraph 31, Refaat teaches predicted trajectories with multiple time points and spatial position occupied by the agent in the environment at the time point and characteristics of the motion of the agent at a time point, an example of temporal segment, which is the dividing of image sequences into meaningful chunks based on time or content changes such as motion. The Examiner interprets this as predicted trajectories being comprised of a plurality of temporal segments predicted in parallel.).

14.	In regards to Claim 8, Refaat teaches wherein each of the one or more ground truth trajectories comprises a plurality of temporal segments (paragraph 31, Refaat teaches predicted trajectories with multiple time points and spatial position occupied by the agent in the environment at the time point and characteristics of the motion of the agent at a time point, an example of temporal segment, which is the dividing of image sequences into meaningful chunks based on time or content changes such as motion. The Examiner interprets this as predicted trajectories being comprised of a plurality of temporal segments.), and wherein the plurality of temporal segments of a respective ground truth trajectory are generated iteratively (paragraph 31, Refaat teaches predicted trajectories with multiple time points and spatial position occupied by the agent in the environment at the time point and characteristics of the motion of the agent at a time point, an example of temporal segment, which is the dividing of image sequences into meaningful chunks based on time or content changes such as motion. The Examiner interprets this as predicted trajectories being comprised of a plurality of temporal segments predicted iteratively.), a later temporal segment of the respective ground truth trajectory being predicated on at least one earlier temporal segment of the respective ground truth trajectory (Fig. 5, paragraphs 33, 34, and 80, Refaat teaches series of data values in the time channel that correspond to spatial positions that define time point that the agent occupies to obtain ground truth outputs based on a ground truth future trajectory of the agent, showing that the actual path was predicated. The Examiner interprets this as a later temporal segment of the respective ground truth trajectory being predicated on an earlier temporal segment of the respective ground truth trajectory.).

15.	In regards to Claim 10, Refaat teaches wherein the teacher model is trained by: obtaining a second training data (paragraph 4, Refaat teaches obtaining ground truth output as a second training data. The Examiner interprets the ground truth output as second training data.), wherein the second training data comprises: a second training input associated with one or more driving missions (paragraphs 4, 15, 27, and 45, Refaat teaches the ground truth output serving as an objective standard used in training the vehicle’s planning system to adjust its own trajectory and avoid collisions by stopping, or turning right or left. The Examiner interprets avoiding collisions as a driving mission.), and one or more trajectories of the vehicle recorded during the one or more driving missions (paragraphs 27, 32, and 45, Refaat teaches during the avoidance of collisions, ground truth trajectories being recorded and utilized through environment data captured by the autonomous vehicle’s onboard system that contains sensors. The Examiner interprets the ground truth trajectories recorded during the avoidance of collisions as the trajectories of the vehicle recorded during driving missions.); and training, using the second training data, the teacher model to generate one or more training trajectories of the vehicle (paragraph 5, Refaat teaches backpropagation of the second sub neural network, where the input is the ground truth output, into the first sub neural network to update the parameter values of the first sub neural network used to generate training trajectories of the vehicle. The Examiner interprets the use of the ground truth output, known as the second training input, as the input into the second sub neural network that is backpropagated or fed into the first sub neural network, known as the teacher model, as using second training data to train the teacher model to generate one or more training trajectories.), wherein training the teacher model comprises reducing a difference between the one or more training trajectories of the vehicle and the one or more recorded trajectories of the vehicle (paragraph 4, Refaat teaches a difference between the training predicted future trajectory and the ground truth future trajectory that is used to compute gradients of first and second losses respective to the first and second sub neural networks to update the values of the parameters of the first and second sub neural networks. The Examiner interprets the difference between the training predicted future trajectory and the ground truth future trajectory as the difference between one or more trajectories predicted by the first model and one or more ground truth trajectories.).

16.	In regards to Claim 12, Refaat teaches further comprising: causing the student model to be deployed onboard an autonomous vehicle (paragraph 3, Refaat teaches a plurality of sub neural networks such as the first and second sub neural networks being implemented onboard an autonomous vehicle. The Examiner interprets the second sub neural network as the student model.).

17.	In regards to Claim 13, Refaat teaches further comprising: obtaining inference data representative of a new environment of the autonomous vehicle; applying the student model to the inference data to predict one or more trajectories of the autonomous vehicle in the new environment; and causing a driving path of the autonomous vehicle to be modified in view of the one or more predicted trajectories of the autonomous vehicle (paragraph 27, Refaat teaches that the onboard system can generate planning decisions like the future trajectory of the vehicle to assist in operating the vehicle safely, such as adjusting the future trajectory of the vehicle, including obtaining environment data in response to determining the trajectory of another vehicle that is likely to cross the trajectory of the vehicle. The Examiner interprets generating planning decisions using obtained environment data to modify the trajectory of the vehicle as obtaining inference data representative of a new environment of the autonomous vehicle, applying the student model to the data for the prediction of trajectories in the new environment and modifying the driving path in view of predicted trajectories.).

Claim Rejections - 35 USC § 103
18.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

19.	Claims 9, 11, and 14-20 are rejected under 35 U.S.C. 103 as being unpatentable over Refaat et al (US Patent Pub. No. 2021/0133582 A1, hereafter referred to as Refaat) in view of Yin et al (US Patent Pub. No. 2022/0180202 A1, hereafter referred to as Yin).

20.	Regarding Claim 9, Refaat in view of Yin teaches the method of Claim 1 that trains models to predict trajectories associated with a vehicle within a driving environment.
Refaat does not teach wherein each of the teacher model and the student model comprise one or more of: a self-attention block of artificial neurons, a cross-attention block of artificial neurons, or a multilayer perceptron block of artificial neurons.
Yin is in the same field of art of training models for predicting trajectories. Further Yin teaches wherein each of the teacher model and the student model comprise one or more of:
a self-attention block of artificial neurons (paragraph 210, Yin teaches an attention module mechanism, which uses self-attention blocks of artificial neurons, to calculate correlations between words to obtain context sensitive representations by increasing precision and quickly selecting high value information from the large datasets. The Examiner interprets the attention module mechanism as self-attention blocks of artificial neurons.),
a cross-attention block of artificial neurons (paragraph 139, Yin teaches the teacher and student model as transformer models, which utilize cross-attention blocks of artificial neurons to fit a scoring matrix from the teacher model to the student model. The Examiner interprets this as both the teacher and student model comprising cross-attention blocks of artificial neurons.), or a multilayer perceptron block of artificial neurons (paragraphs 139 and 210, Yin teaches a large transformer structure that includes a residual layer, linear normalization, and a forward network module that performs further transformation on the obtained word representation to produce the final output of the transformer layer. The Examiner interprets this as a multilayer perceptron block of artificial neurons.).
	Therefore it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Refaat by adding a self-attention block of artificial neurons, cross-attention block of artificial neurons, or a multilayer perceptron block of artificial neurons that is taught by Yin to make the invention quicker at obtaining sensitive representations and high value information from large data sets; thus one of ordinary skill in the art would be motivated to combine the references since they are both in the field of training models for predicting trajectories (paragraphs 139 and 210, Yin).
Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.

21.	Regarding Claim 11, Refaat in view of Yin teaches the method of Claim 10 that trains models to predict trajectories associated with a vehicle within a driving environment.
	Refaat does not teach wherein the second training input comprises the first training input.
Yin is in the same field of art of training models for predicting trajectories. Further Yin teaches wherein the second training input comprises the first training input (paragraph 198, Yin teaches the sample data from the training text, known as the initial input, being used as a second training input for loss. The Examiner interprets the sample data from the training text as the second training input.).
Therefore it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Refaat by adding the second training input as a part of the first training input that is taught by Yin to make the invention have stand-alone architecture aiding realistic functionality; thus one of ordinary skill in the art would be motivated to combine the references since they are both in the field of training models for predicting trajectories (paragraph 198, Yin).
Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.

22.	Regarding Claim 14, Refaat teaches a system comprising: a sensing system of a vehicle (paragraph 29, Refaat teaches a perception subsystem, part of the on-board system of the vehicle to visualize the environment in the vicinity of the vehicle, that includes one or more sensors. The Examiner interprets the perception subsystem on the vehicle that includes one or more sensors as the sensing system of a vehicle.), the sensing system configured to: acquire sensing data for a driving environment of the vehicle (paragraphs 29-30, Refaat teaches the perception sub-system repeatedly capturing raw sensor data which can indicate the directions, intensities, and distances travelled by reflected radiation from the environment in the vicinity of the vehicle. The Examiner interprets capturing raw sensing data from the environment in the vicinity of the vehicle as acquiring sensing data for a driving environment.); and a data processing system of the vehicle (paragraph 31, Refaat teaches the on-board system of the vehicle being able to use the raw sensor data that is continually generated by the perception subsystem to continually generate environment data. The Examiner interprets the on-board system of the vehicle being able to use the raw sensor data as a data processing system.), the data processing system configured to: generate, using the acquired sensing data, an inference data characterizing one or more objects in the environment of the vehicle (paragraph 31, Refaat teaches the on-board system of the vehicle being able to use the obtained raw sensor data that is continually generated by the perception subsystem to generate environment data. The Examiner interprets the on-board system as the data processing system and the environment data generated by the onboard system that is continually generated by the perception subsystem as an inference data characterizing one or more objects in the environment of the vehicle.); apply a first model to the inference data to predict one or more trajectories of the vehicle in the environment (paragraph 4, Refaat teaches obtaining a ground truth output defining a ground truth future trajectory of the agent and processing using the first sub neural network. The Examiner interprets the ground truth future trajectory as one or more trajectories and the first sub neural network as the first model applied to the inference data.), wherein the first model is trained using a second model (paragraph 7, Refaat teaches backpropagating the computed gradient of the second loss, which comes from the second sub neural network, through the trajectory generation neural network into the first sub neural network to update the parameter values of the first sub neural network. The Examiner interprets the first sub neural network as the first model, the second sub neural network as a second model, and the backpropagation of the second loss from the second sub neural network that is used to update the parameter values of the first sub neural network as the first model being trained using a second model.), and cause a driving path of the vehicle to be modified in view of the one or more predicted trajectories (paragraph 82, Refaat teaches the predicted future trajectory defining a path that, as being predicted by the system, will be followed by the agent in the driving environment, which is described as anything present in the vicinity of the vehicle including a pedestrian, bicyclists, or other vehicles. The Examiner interprets the path defined to be followed by the agent as a means to modify the driving path of the vehicle by sensing and avoiding the agent.).
	Refaat does not teach wherein the second model comprises an autoregressive model.
	Yin is in the same field of art of training models for predicting trajectories. Further Yin teaches wherein the second model comprises an autoregressive model (paragraph 162, Yin teaches a student model using a generalized autoregressive pretraining for language understanding. The Examiner interprets the student model as the second model and the generalized autoregressive pretraining for language understanding as an autoregressive model.).
Therefore it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Refaat by adding the second model that comprises an autoregressive model that is taught by Yin to make the invention use its past inputs to produce future ones, making the predictions more context-aware; thus one of ordinary skill in the art would be motivated to combine the references since they are both in the field of training models for predicting trajectories.
Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.

23.	In regards to Claim 15, Refaat in view of Yin teaches wherein the inference data comprises one or more (The Examiner finds that the claim states "one or more"; under the Broadest Reasonable Interpretation, the Examiner need only find one limitation.) of: a history of motion of (i) the vehicle and (ii) the one or more objects, a status of one or more traffic lights of the driving environment of the vehicle, a roadgraph information associated with the driving environment of the vehicle, or a representation of a target route of the vehicle through the driving environment of the vehicle (paragraph 31, Refaat teaches environment data that characterizes a scene of an environment in a vicinity of the vehicle at a current time point, particularly as data that describes any agents that are present in the vicinity of the vehicle. The Examiner interprets the environment data, which is previously unseen visual input from the environment in the vicinity of the vehicle, as inference data that is comprised of a representation of a target route of the vehicle through the driving environment of the vehicle.).

24.	In regards to Claim 16, Refaat in view of Yin teaches wherein each of the one or more (The Examiner finds that the claim states "one or more"; under the Broadest Reasonable Interpretation, the Examiner need only find one limitation.) trajectories predicted by the first model comprises (i) one or more locations of the vehicle at corresponding one or more future times; and (ii) one or more velocities of the vehicle at the corresponding one or more future times (paragraph 32, Refaat teaches the trajectories being predicted using data defining, which incorporates the characteristics of motion of an agent at specific time points such as location, velocities, acceleration, and headings or orientation of agents, which is expressed in angular data. These components can be used to obtain the locations and velocities of the autonomous vehicle and agents in the driving environment. The Examiner interprets the characteristics of motion of an agent at specific time points in the driving environment as part of the prediction by the first model to comprise locations and velocities of the vehicle at corresponding future times.).

25.	In regards to Claim 17, Refaat in view of Yin teaches wherein the first model is trained by: obtaining training data (paragraph 4, Refaat teaches the first sub neural network being trained by the obtained values of the parameters of the training network input. The Examiner interprets the first sub neural network as the first model and the parameters of the training network input as training data.), wherein the training data comprises: a training input representative of a training environment of a reference vehicle, the training environment comprising a plurality of objects (paragraph 4, Refaat teaches the training intermediate representation of the environment in a vicinity of the vehicle, including agents, from the training network input. The Examiner interprets the training intermediate representation as a training input representative of the training environment of a reference vehicle and the environment of the vicinity of the vehicle including agents as the training environment comprising a plurality of objects.), and one or more ground truth trajectories associated with a forecasted motion of the reference vehicle within the training environment (paragraph 80, Refaat teaches a ground truth output that defines a ground truth future trajectory of the agent within the vicinity of the vehicle used in the development of the method. The Examiner interprets the ground truth output that defines the ground truth future trajectories as a ground truth trajectory that is associated with a forecasted motion of the vehicle since ground truth trajectories are real-world data that serve as reference outputs to define the actual path within the training environment.), wherein the one or more ground truth trajectories are generated by the autoregressive model using the training input (paragraph 162, Yin teaches a relationship between objects being obtained using a generalized autoregressive training based on an input layer to measure the paths of the objects, which is known as ground truth trajectories. The Examiner interprets the autoregressive training based on the input layer as the autoregressive model using the training input.); and applying the first model to the training input to predict one or more training trajectories of the reference vehicle (figure 2, Refaat teaches the first sub neural network, Sub Neural Network A, being fed into the trajectory generation neural network for the training predicted future trajectory. The Examiner interprets the first sub neural network as the first model and the training predicted future trajectory as the training trajectory.); and modifying parameters of the first model to reduce a difference between the one or more training trajectories predicted by the first model and the one or more ground truth trajectories generated by the autoregressive model (paragraph 4, Refaat teaches a difference between the training predicted future trajectory and the ground truth future trajectory that is used to compute gradients of first and second losses respective to the first and second sub neural networks to update the values of the parameters of the first and second sub neural networks. The Examiner interprets the difference between the training predicted future trajectory and the ground truth future trajectory as the difference between one or more trajectories predicted by the first model and one or more ground truth trajectories.).

26.	In regards to Claim 18, Refaat in view of Yin teaches wherein each of the one or more predicted trajectories of the vehicle comprises a plurality of temporal segments (paragraph 31, Refaat teaches predicted trajectories with multiple time points and spatial position occupied by the agent in the environment at the time point and characteristics of the motion of the agent at a time point, an example of temporal segments, which is the dividing of image sequences into meaningful chunks based on time or content changes such as motion. The Examiner interprets this as predicted trajectories being comprised of a plurality of temporal segments predicted.) and wherein the plurality of temporal segments of a respective predicted trajectory are predicted in parallel (paragraph 31, Refaat teaches predicted trajectories with multiple time points and spatial position occupied by the agent in the environment at the time point and characteristics of the motion of the agent at a time point, an example of temporal segments, which is the dividing of image sequences into meaningful chunks based on time or content changes such as motion. The Examiner interprets this as predicted trajectories being comprised of a plurality of temporal segments predicted in parallel.).

27.	In regards to Claim 19, Refaat in view of Yin teaches wherein the first model comprises an encoder neural network and a decoder neural network (paragraph 26, Yin teaches a teacher model, known as the first model, being trained with initial training text, known as the initial input, which is combined with another input to add label information, an example of encoding, and decoded to produce sample data outputs; all of which is performed using neural network models. The Examiner interprets the combination of the initial input and an additional input as an encoder neural network and the decoding to produce sample data outputs as a decoder neural network.), wherein the encoder neural network comprises: one or more self-attention blocks of artificial neurons (paragraph 250, Yin teaches the teacher model being used to train the student model, while the teacher model is a pretrained language model such as BERT, Bidirectional Encoder Representations from Transformers, an encoder built on self-attention blocks of artificial neurons. The Examiner interprets BERT as an encoder neural network comprised of one or more self-attention blocks of artificial neurons.); and wherein the decoder neural network comprises: one or more cross-attention blocks of artificial neurons (paragraphs 147-148, Yin teaches a sequence tagging of the to-be-processed text in the form of translation, where the to-be-processed text is the input. The Examiner interprets this as an output, decoder neural network that generates output sequences such as translations using cross-attention blocks of artificial neurons.).

28.	Regarding Claim 20, Refaat teaches a non-transitory computer-readable memory storing instructions that, when executed by a processing device, cause the processing device to: obtain an inference data characterizing one or more objects in an environment of a vehicle (paragraph 31, Refaat teaches the on-board system of the vehicle being able to use the obtained raw sensor data that is continually generated by the perception subsystem to generate environment data. The Examiner interprets the on-board system as the data processing system and the environment data generated by the onboard system that is continually generated by the perception subsystem as an inference data characterizing one or more objects in the environment of the vehicle.); apply a first model to the inference data to predict one or more trajectories of the vehicle in the environment (paragraph 4, Refaat teaches obtaining a ground truth output defining a ground truth future trajectory of the agent and processing using the first sub neural network. The Examiner interprets the ground truth future trajectory as one or more trajectories and the first sub neural network as the first model applied to the inference data.), wherein the first model is trained using a second model (paragraph 7, Refaat teaches backpropagating the computed gradient of the second loss, which comes from the second sub neural network, through the trajectory generation neural network into the first sub neural network to update the parameter values of the first sub neural network. The Examiner interprets the first sub neural network as the first model, the second sub neural network as a second model, and the backpropagation of the second loss from the second sub neural network that is used to update the parameter values of the first sub neural network as the first model being trained using a second model.), and cause a driving path of the vehicle to be modified in view of the one or more predicted trajectories (paragraph 82, Refaat teaches the predicted future trajectory defining a path that, as being predicted by the system, will be followed by the agent in the driving environment, which is described as anything present in the vicinity of the vehicle including a pedestrian, bicyclists, or other vehicles. The Examiner interprets the path defined to be followed by the agent as a means to modify the driving path of the vehicle by sensing and avoiding the agent.).
Refaat does not teach wherein the second model comprises an autoregressive model.
Yin is in the same field of art of training models for predicting trajectories. Further Yin teaches wherein the second model comprises an autoregressive model (paragraph 162, Yin teaches the student model using a generalized autoregressive pretraining for language understanding. The Examiner interprets the student model as the second model and the generalized autoregressive pretraining for language understanding as an autoregressive model.).
Therefore it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Xiong by adding the second model that comprises an autoregressive model that is taught by Yin to make the invention use its past inputs to produce future ones, making the predictions more context-aware; thus one of ordinary skill in the art would be motivated to combine the reference since they are both in the field of training models for predicting trajectories.
Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.

Conclusion
29.	Any inquiry concerning this communication or earlier communications from the
examiner should be directed to LOUIS NWUHA whose telephone number is (571)272-0219.
The examiner can normally be reached Monday to Friday 9 am to 5 pm.

30.	Examiner interviews are available via telephone, in-person, and video conferencing using
a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is
encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

31.	If attempts to reach the examiner by telephone are unsuccessful, the examiner’s
supervisor, Oneal Mistry can be reached at 3134464912. The fax phone number for the
organization where this application or proceeding is assigned is 571-273-8300.

32.	Information regarding the status of published or unpublished applications may be
obtained from Patent Center. Unpublished application information in Patent Center is available
to registered users. To file and manage patent submissions in Patent Center, visit:
https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more
information about Patent Center and https://www.uspto.gov/patents/docx for information about
filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC)
at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service
Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/LOUIS NWUHA/Examiner, Art Unit 2674                                                                                                                                                                                                        
/ONEAL R MISTRY/Supervisory Patent Examiner, Art Unit 2674