DETAILED ACTION
This action is responsive to communications filed on June 15, 2023. This action is made Non-Final.
Claims 1-20 are pending in the case.
Claims 1, 6, and 15 are independent claims.
Claims 1-20 are rejected.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 06/15/2023 is in compliance with the provisions of 37 C.F.R. 1.97. Accordingly, the IDS is being considered by the examiner.
Claim Objections
Claim 14 is objected to because of the following informalities:
Claim 14 recites “the first stage” and “the second stage.” There is insufficient antecedent basis for “the first stage” and “the second stage” in claim 14.
Appropriate correction is required.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-4, 6-10, 13, 15, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Green et al., US Publication 2022/0066461 (“Green”), in view of Chen, Baixu, et al., "Debiased self-training for semi-supervised learning," Advances in Neural Information Processing Systems 35 (2022): 32424-32437 (“Chen”).
Claim 1:
Green teaches or suggests a computer-implemented method for training a first machine learning model using a debiased dataset, comprising:
obtaining a first dataset including a first plurality of data records (see para. 0060 - the machine-learned yield model can be trained based at least in part on log data annotated with yield labels. The log data can describe yield behaviors performed by vehicles (e.g., autonomous vehicles and/or humanly-operated vehicles; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels.);
processing the first dataset using a second machine learning model ... for each data record of the first plurality of records (see para. 0061 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example; para. 0157 - the machine learning computing system 130 and/or the autonomy computing system 102 can train the machine-learned models 110 and/or 140 through use of a model trainer 160. The model trainer 160 can train the machine-learned models 110 and/or 140 using one or more training or learning algorithms; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. scores provided for the yield behaviors are included as labels to train the yield model, in addition or alternatively to a simple positive or negative label);
comparing the respective pseudo-label for each data record of the plurality of data records against at least one of a first threshold or a second threshold (see para. 0061 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example.);
in response to comparing each respective pseudo-label against at least one of the first threshold or the second threshold, performing, for each data record of the first plurality of data records one of: determine that the respective pseudo-label exceeds the first threshold and assigning a positive label to the data record; determining that the respective pseudo-label is lower than the second threshold and assigning a negative label to the data record; or determining that the respective pseudo-label is between the first threshold and the second threshold and discarding the data record (see para. 0061 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. the first threshold and the second threshold can be different values and yield behaviors that receive scores between the first threshold and the second threshold can simply be discarded; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. the first threshold and the second threshold can be different values and yield behaviors that receive scores between the first threshold and the second threshold can simply be discarded.);
generating a second dataset including a second plurality of data records from the first plurality of data records based at least in part on the data records of the first plurality of records that were assigned positive labels, the data records of the first plurality of data records that were assigned negative labels, and the discarded data records of the first plurality of data records, such that the second plurality of data records includes the data records from the first plurality of data records having positive labels and data records from the first plurality of data records having negative labels (see para. 0060 - the machine learned yield model can be trained based at least in part on log data annotated with yield labels; para. 0061 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. the first threshold and the second threshold can be different values and yield behaviors that receive scores between the first threshold and the second threshold can simply be discarded; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. 
the first threshold and the second threshold can be different values and yield behaviors that receive scores between the first threshold and the second threshold can simply be discarded.); ... and
training the first machine learning model using the ... training dataset (see para. 0060 - the machine learned yield model can be trained based at least in part on log data annotated with yield labels; para. 0061 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. the first threshold and the second threshold can be different values and yield behaviors that receive scores between the first threshold and the second threshold can simply be discarded; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. the first threshold and the second threshold can be different values and yield behaviors that receive scores between the first threshold and the second threshold can simply be discarded.).
Green does not explicitly disclose generating a debiased training dataset from the second dataset, or using the debiased training dataset.
Chen teaches or suggests generating a respective pseudo-label, generating a debiased training dataset from the second dataset, and using the debiased training dataset (see §I – assign pseudo labels to unlabeled samples with the model’s predictions and then iteratively train the model with these pseudo labeled samples as if they were labeled examples. Training with biased and unreliable pseudo labels has the chance to accumulate errors and ultimately lead to performance fluctuations. And for those poorly-behaved categories, the bias of the pseudo labels gets worse and will be further enhanced as self-training progresses. present Debiased Self-Training (DST), a novel approach to decrease the undesirable bias in self-training. Specifically to reduce the training bias, the classifier head is only trained with clean labeled samples and no longer trained with unreliable pseudo-labeled samples.).
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method taught in Green to generate a respective pseudo-label, generate a debiased training dataset from the second dataset, and use the debiased training dataset, for the purpose of efficiently reducing bias in ML models by not training on unreliable pseudo-labeled samples, as taught by Chen (§I).
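For reference, the threshold-based labeling scheme that Green describes (paras. 0061, 0160) and that claim 1 recites can be sketched as follows. This is an illustrative sketch only; the function name, parameter names, and threshold values are assumptions of the undersigned and are not drawn from either reference.

```python
def label_records(records, score_fn, upper=0.8, lower=0.2):
    """Assign labels per the scheme of Green paras. 0061/0160 (sketch):
    scores above the first (upper) threshold become positive training
    examples, scores below the second (lower) threshold become negative
    training examples, and records whose scores fall between the two
    thresholds are discarded."""
    labeled = []
    for record in records:
        score = score_fn(record)  # pseudo-label produced by the second model
        if score > upper:
            labeled.append((record, 1))   # positive training example
        elif score < lower:
            labeled.append((record, 0))   # negative training example
        # scores between the thresholds are discarded
    return labeled
```

For example, `label_records([0.9, 0.5, 0.1], lambda x: x)` keeps the high-scoring record as a positive example and the low-scoring record as a negative example, and discards the mid-range record.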
Claim 2:
Green further teaches or suggests wherein the first machine learning model is employed as a first stage of a multi-stage content recommendation system (see Fig. 8; para. 0048 - can input data indicative of at least the feature(s) for one or more objects into the machine-learned yield model and receive, as an output, data indicative of a recommended yield decision relative to the one or more objects; para. 0061 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example; para. 0157 - the machine learning computing system 130 and/or the autonomy computing system 102 can train the machine-learned models 110 and/or 140 through use of a model trainer 160. The model trainer 160 can train the machine-learned models 110 and/or 140 using one or more training or learning algorithms; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. scores provided for the yield behaviors are included as labels to train the yield model, in addition or alternatively to a simple positive or negative label).
Claim 3:
Green further teaches or suggests wherein the second machine learning model is employed as a second stage of the multi-stage content recommendation system (see Fig. 8; para. 0048 - can input data indicative of at least the feature(s) for one or more objects into the machine-learned yield model and receive, as an output, data indicative of a recommended yield decision relative to the one or more objects; para. 0061 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example; para. 0157 - the machine learning computing system 130 and/or the autonomy computing system 102 can train the machine-learned models 110 and/or 140 through use of a model trainer 160. The model trainer 160 can train the machine-learned models 110 and/or 140 using one or more training or learning algorithms; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. scores provided for the yield behaviors are included as labels to train the yield model, in addition or alternatively to a simple positive or negative label).
Claim 4:
Green further teaches or suggests wherein the first dataset was previously served by the first stage of the multi-stage content recommendation system (see Fig. 8; para. 0060 - the machine-learned yield model can be trained based at least in part on log data annotated with yield labels. The log data can describe yield behaviors performed by vehicles (e.g., autonomous vehicles and/or humanly-operated vehicles) during previously conducted real-world driving sessions. As another example, the machine-learned yield model can be trained based at least in part on synthesized yield behaviors generated by playing forward or otherwise simulating certain scenarios that are described by log data; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels.).
Claim 6:
Green teaches or suggests a computer-implemented method for generating a debiased dataset for training a first machine learning model, comprising:
obtaining a first dataset that includes a plurality of data records and was previously served by the first machine learning model (see Fig. 8; para. 0060 - the machine-learned yield model can be trained based at least in part on log data annotated with yield labels. The log data can describe yield behaviors performed by vehicles (e.g., autonomous vehicles and/or humanly-operated vehicles) during previously conducted real-world driving sessions. As another example, the machine-learned yield model can be trained based at least in part on synthesized yield behaviors generated by playing forward or otherwise simulating certain scenarios that are described by log data; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels.);
generating, using a second machine learning model, ... corresponding to the first plurality of data records (see para. 0061 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example; para. 0157 - the machine learning computing system 130 and/or the autonomy computing system 102 can train the machine-learned models 110 and/or 140 through use of a model trainer 160. The model trainer 160 can train the machine-learned models 110 and/or 140 using one or more training or learning algorithms; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. scores provided for the yield behaviors are included as labels to train the yield model, in addition or alternatively to a simple positive or negative label);
determining an upper pseudo-label threshold (see para. 0061 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example.);
determining a lower pseudo-label threshold (see para. 0061 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example.);
comparing each of the plurality of corresponding pseudo-labels corresponding to the first plurality of data records against at least one of the upper pseudo-label threshold or the lower pseudo-label threshold (see para. 0061 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example.);
in response to comparing each of the plurality of corresponding pseudo-labels against at least one of the upper pseudo-label threshold or the lower pseudo-label threshold, performing, for each corresponding data record of the first plurality of data records, one of: determining that the pseudo-label exceeds the upper pseudo-label threshold and associating a positive label with the corresponding data record; determining that the pseudo-label is lower than the lower pseudo-label threshold and associating a negative label with the corresponding data record; or determining that the pseudo-label is between the first threshold and the second threshold and discarding the corresponding data record (see para. 0061 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. the first threshold and the second threshold can be different values and yield behaviors that receive scores between the first threshold and the second threshold can simply be discarded; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. 
the first threshold and the second threshold can be different values and yield behaviors that receive scores between the first threshold and the second threshold can simply be discarded.);
generating a second dataset including the data records from the first plurality of data records that are associated with the positive label or the negative label (see para. 0060 - the machine learned yield model can be trained based at least in part on log data annotated with yield labels; para. 0061 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. the first threshold and the second threshold can be different values and yield behaviors that receive scores between the first threshold and the second threshold can simply be discarded; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. the first threshold and the second threshold can be different values and yield behaviors that receive scores between the first threshold and the second threshold can simply be discarded.); and
Green does not explicitly disclose generating a debiased training dataset from the second dataset.
Chen teaches or suggests a plurality of corresponding pseudo-labels; generating a debiased training dataset from the second dataset (see §I – assign pseudo labels to unlabeled samples with the model’s predictions and then iteratively train the model with these pseudo labeled samples as if they were labeled examples. Training with biased and unreliable pseudo labels has the chance to accumulate errors and ultimately lead to performance fluctuations. And for those poorly-behaved categories, the bias of the pseudo labels gets worse and will be further enhanced as self-training progresses. present Debiased Self-Training (DST), a novel approach to decrease the undesirable bias in self-training. Specifically to reduce the training bias, the classifier head is only trained with clean labeled samples and no longer trained with unreliable pseudo-labeled samples.).
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method taught in Green to include a plurality of corresponding pseudo-labels and generating a debiased training dataset from the second dataset, for the purpose of efficiently reducing bias in ML models by not training on unreliable pseudo-labeled samples, as taught by Chen (§I).
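For reference, the debiasing principle cited from Chen (§I) — that the classifier head is trained only with clean labeled samples while pseudo-labeled samples are excluded from head training — can be sketched as a data partition. This is an illustrative sketch only; the function and key names are assumptions of the undersigned, and Chen's full method includes additional components (e.g., a separate pseudo head and worst-case bias estimation) not shown here.

```python
def build_debiased_split(clean, pseudo_labeled):
    """Partition training data in the spirit of Chen's Debiased
    Self-Training (sketch): the classifier head is updated only with
    clean labeled samples, while pseudo-labeled samples contribute only
    to training the feature representation."""
    return {
        # head sees only clean, human-labeled samples
        "head_batches": list(clean),
        # representation (backbone) sees clean plus pseudo-labeled samples
        "representation_batches": list(clean) + list(pseudo_labeled),
    }
```

Under this split, unreliable pseudo-labels never reach the classifier head, which is the mechanism Chen identifies for decreasing self-training bias.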
Claim 7:
Green further teaches or suggests training the first machine learning model using the ... training dataset, wherein the first machine learning model forms at least a portion of a multistage content recommendation system (see Fig. 8; para. 0048 - can input data indicative of at least the feature(s) for one or more objects into the machine-learned yield model and receive, as an output, data indicative of a recommended yield decision relative to the one or more objects; para. 0060 - the machine-learned yield model can be trained based at least in part on log data annotated with yield labels. The log data can describe yield behaviors performed by vehicles (e.g., autonomous vehicles and/or humanly-operated vehicles) during previously conducted real-world driving sessions. As another example, the machine-learned yield model can be trained based at least in part on synthesized yield behaviors generated by playing forward or otherwise simulating certain scenarios that are described by log data; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. the first threshold and the second threshold can be different values and yield behaviors that receive scores between the first threshold and the second threshold can simply be discarded.).
Chen further teaches or suggests using the debiased training dataset (see §I – assign pseudo labels to unlabeled samples with the model’s predictions and then iteratively train the model with these pseudo labeled samples as if they were labeled examples. Training with biased and unreliable pseudo labels has the chance to accumulate errors and ultimately lead to performance fluctuations. And for those poorly-behaved categories, the bias of the pseudo labels gets worse and will be further enhanced as self-training progresses. present Debiased Self-Training (DST), a novel approach to decrease the undesirable bias in self-training. Specifically to reduce the training bias, the classifier head is only trained with clean labeled samples and no longer trained with unreliable pseudo-labeled samples.).
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method taught in Green to include using the debiased training dataset, for the purpose of efficiently reducing bias in ML models by not training on unreliable pseudo-labeled samples, as taught by Chen (§I).
Claim 8:
Green further teaches or suggests receiving a request for content items; determining, using the multi-stage content recommendation system, at least one content item from a corpus of content items as recommended content in response to the request for content items (see para. 0023 - configured to receive and process feature data descriptive of objects perceived by the autonomous vehicle and/or the surrounding environment and, in response to receipt of the feature data, provide yield decisions for the autonomous vehicle relative to the objects; para. 0032 - searches (e.g., iteratively searches) over a motion planning space (e.g., a vehicle state space) to identify a motion plan that optimizes (e.g., locally optimizes) a total cost associated with the motion plan, as provided by one or more cost functions; para. 0048 - can input data indicative of at least the feature(s) for one or more objects into the machine-learned yield model and receive, as an output, data indicative of a recommended yield decision relative to the one or more objects. Yield decision can include yielding for the object by, for example, stopping a motion of the autonomous vehicle for the object. In some implementations, the yield decision can include not yielding for the object. For example, the output of the machine-learned model can indicate that the vehicle is to maintain its current speed and/or trajectory, without adjusting for the object's presence; para. 0113 - can input data indicative of at least the feature(s) for one or more objects into the machine-learned yield model and receive, as an output, data indicative of a recommended yield decision relative to the one or more objects; para. 0182 - input the first feature data into a machine-learned yield model. The machine-learned yield model can be configured to receive and process feature data descriptive of objects perceived by the autonomous vehicle and, in response, provide yield decisions.).
Claim 9:
Green further teaches or suggests wherein the recommended content is determined online in real-time (see Fig. 8; para. 0023 - configured to receive and process feature data descriptive of objects perceived by the autonomous vehicle and/or the surrounding environment and, in response to receipt of the feature data, provide yield decisions for the autonomous vehicle relative to the objects; para. 0048 - can input data indicative of at least the feature(s) for one or more objects into the machine-learned yield model and receive, as an output, data indicative of a recommended yield decision relative to the one or more objects. Yield decision can include yielding for the object by, for example, stopping a motion of the autonomous vehicle for the object. In some implementations, the yield decision can include not yielding for the object. For example, the output of the machine-learned model can indicate that the vehicle is to maintain its current speed and/or trajectory, without adjusting for the object's presence.).
Claim 10:
Green further teaches or suggests wherein the recommended content includes at least one of: images; videos; documents; or advertisements (see Fig. 8; para. 0023 - configured to receive and process feature data descriptive of objects perceived by the autonomous vehicle and/or the surrounding environment and, in response to receipt of the feature data, provide yield decisions for the autonomous vehicle relative to the objects; para. 0048 - can input data indicative of at least the feature(s) for one or more objects into the machine-learned yield model and receive, as an output, data indicative of a recommended yield decision relative to the one or more objects. Yield decision can include yielding for the object by, for example, stopping a motion of the autonomous vehicle for the object. In some implementations, the yield decision can include not yielding for the object. For example, the output of the machine-learned model can indicate that the vehicle is to maintain its current speed and/or trajectory, without adjusting for the object's presence.).
Claim 13:
Green further teaches or suggests the first machine learning model is employed as a first stage of a multi-stage content recommendation system; and the second machine learning model is employed as a second stage of the multi-stage content recommendation system (see Fig. 8; para. 0048 - can input data indicative of at least the feature(s) for one or more objects into the machine-learned yield model and receive, as an output, data indicative of a recommended yield decision relative to the one or more objects; para. 0061 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example; para. 0157 - the machine learning computing system 130 and/or the autonomy computing system 102 can train the machine-learned models 110 and/or 140 through use of a model trainer 160. The model trainer 160 can train the machine-learned models 110 and/or 140 using one or more training or learning algorithms; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. scores provided for the yield behaviors are included as labels to train the yield model, in addition or alternatively to a simple positive or negative label).
Claim 15:
Green teaches or suggests a computing system, comprising: one or more processors; a memory storing program instructions that, when executed by the one or more processors, cause the one or more processors to at least:
obtain a first dataset that includes a plurality of first data records and was previously served by a first machine learning model of a multi-stage content recommendation service (see Fig. 8; para. 0048 - can input data indicative of at least the feature(s) for one or more objects into the machine-learned yield model and receive, as an output, data indicative of a recommended yield decision relative to the one or more objects; para. 0060 - the machine-learned yield model can be trained based at least in part on log data annotated with yield labels. The log data can describe yield behaviors performed by vehicles (e.g., autonomous vehicles and/or humanly-operated vehicles) during previously conducted real-world driving sessions. As another example, the machine-learned yield model can be trained based at least in part on synthesized yield behaviors generated by playing forward or otherwise simulating certain scenarios that are described by log data; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels.);
generate, using a second machine learning model of the multi-stage content recommendation service, ... for each of the plurality of first data records (see para. 0061 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example; para. 0157 - the machine learning computing system 130 and/or the autonomy computing system 102 can train the machine-learned models 110 and/or 140 through use of a model trainer 160. The model trainer 160 can train the machine-learned models 110 and/or 140 using one or more training or learning algorithms; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. scores provided for the yield behaviors are included as labels to train the yield model, in addition or alternatively to a simple positive or negative label);
determine an upper pseudo-label threshold (see para. 0061 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example.);
determine a lower pseudo-label threshold (see para. 0061 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example.);
compare each respective pseudo-label against at least one of the upper pseudo-label threshold or the lower pseudo-label threshold (see para. 0061 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example.);
in response to the comparison of each respective pseudo-label against at least one of the upper pseudo-label threshold or the lower pseudo-label threshold, perform, for each corresponding data record of the first plurality of data records, one of: determine that the respective pseudo-label exceeds the upper pseudo-label threshold and associate a positive label with the corresponding data record; determine that the respective pseudo-label is lower than the lower pseudo-label threshold and associate a negative label with the corresponding data record; or determine that the respective pseudo-label is between the upper pseudo-label threshold and the lower pseudo-label threshold and discard the corresponding data record (see para. 0061 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. the first threshold and the second threshold can be different values and yield behaviors that receive scores between the first threshold and the second threshold can simply be discarded; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. the first threshold and the second threshold can be different values and yield behaviors that receive scores between the first threshold and the second threshold can simply be discarded.);
generate a second dataset including data records from the first plurality of data records that are associated with the positive label or the negative label (see para. 0060 - the machine learned yield model can be trained based at least in part on log data annotated with yield labels; para. 0061 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. the first threshold and the second threshold can be different values and yield behaviors that receive scores between the first threshold and the second threshold can simply be discarded; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. the first threshold and the second threshold can be different values and yield behaviors that receive scores between the first threshold and the second threshold can simply be discarded.);
train the first machine learning model using the ... training dataset (see para. 0060 - the machine learned yield model can be trained based at least in part on log data annotated with yield labels; para. 0061 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. the first threshold and the second threshold can be different values and yield behaviors that receive scores between the first threshold and the second threshold can simply be discarded; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. the first threshold and the second threshold can be different values and yield behaviors that receive scores between the first threshold and the second threshold can simply be discarded.);
receive a request for content items (see para. 0023 - configured to receive and process feature data descriptive of objects perceived by the autonomous vehicle and/or the surrounding environment and, in response to receipt of the feature data, provide yield decisions for the autonomous vehicle relative to the objects; para. 0048 - can input data indicative of at least the feature(s) for one or more objects into the machine-learned yield model and receive, as an output, data indicative of a recommended yield decision relative to the one or more objects; para. 0113 - can input data indicative of at least the feature(s) for one or more objects into the machine-learned yield model and receive, as an output, data indicative of a recommended yield decision relative to the one or more objects; para. 0182 - input the first feature data into a machine-learned yield model. The machine-learned yield model can be configured to receive and process feature data descriptive of objects perceived by the autonomous vehicle and, in response, provide yield decisions.);
determine, using the multi-stage content recommendation system, at least one content item from a corpus of content items as recommended content in response to the request for content items (see para. 0023 - configured to receive and process feature data descriptive of objects perceived by the autonomous vehicle and/or the surrounding environment and, in response to receipt of the feature data, provide yield decisions for the autonomous vehicle relative to the objects; para. 0032 - searches (e.g., iteratively searches) over a motion planning space (e.g., a vehicle state space) to identify a motion plan that optimizes (e.g., locally optimizes) a total cost associated with the motion plan, as provided by one or more cost functions; para. 0048 - can input data indicative of at least the feature(s) for one or more objects into the machine-learned yield model and receive, as an output, data indicative of a recommended yield decision relative to the one or more objects. Yield decision can include yielding for the object by, for example, stopping a motion of the autonomous vehicle for the object. In some implementations, the yield decision can include not yielding for the object. For example, the output of the machine-learned model can indicate that the vehicle is to maintain its current speed and/or trajectory, without adjusting for the object's presence; para. 0113 - can input data indicative of at least the feature(s) for one or more objects into the machine-learned yield model and receive, as an output, data indicative of a recommended yield decision relative to the one or more objects; para. 0182 - input the first feature data into a machine-learned yield model. The machine-learned yield model can be configured to receive and process feature data descriptive of objects perceived by the autonomous vehicle and, in response, provide yield decisions.).
Green does not explicitly disclose generating a debiased training dataset from the second dataset; using the debiased training dataset.
Chen teaches a respective pseudo-label; generating a debiased training dataset from the second dataset; using the debiased training dataset (see §I – assign pseudo labels to unlabeled samples with the model’s predictions and then iteratively train the model with these pseudo labeled samples as if they were labeled examples. Training with biased and unreliable pseudo labels has the chance to accumulate errors and ultimately lead to performance fluctuations. And for those poorly-behaved categories, the bias of the pseudo labels gets worse and will be further enhanced as self-training progresses. present Debiased Self-Training (DST), a novel approach to decrease the undesirable bias in self-training. Specifically to reduce the training bias, the classifier head is only trained with clean labeled samples and no longer trained with unreliable pseudo-labeled samples.).
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method taught in Green to include a respective pseudo-label; generating a debiased training dataset from the second dataset; and using the debiased training dataset, for the purpose of efficiently removing bias from ML models by not training on unreliable pseudo-labeled samples, thereby reducing model bias, as taught by Chen (§I).
Claim 20:
Green further teaches or suggests periodically generating an updated ... training dataset; and periodically updating the first machine learning model using the updated ... training dataset (see para. 0064 - machine-learned yield model can be more easily adjusted (e.g., via refinement training) than a rules-based system (e.g., requiring rewritten rules or manually tuned parameters) as the autonomy computing system is periodically updated to handle new scenarios. This can allow for more efficient upgrading of the autonomy computing system.).
Chen further teaches or suggests the recited debiased training dataset (see §I – assign pseudo labels to unlabeled samples with the model’s predictions and then iteratively train the model with these pseudo labeled samples as if they were labeled examples. Training with biased and unreliable pseudo labels has the chance to accumulate errors and ultimately lead to performance fluctuations. And for those poorly-behaved categories, the bias of the pseudo labels gets worse and will be further enhanced as self-training progresses. present Debiased Self-Training (DST), a novel approach to decrease the undesirable bias in self-training. Specifically to reduce the training bias, the classifier head is only trained with clean labeled samples and no longer trained with unreliable pseudo-labeled samples.).
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method taught in Green to include the debiased training dataset, for the purpose of efficiently removing bias from ML models by not training on unreliable pseudo-labeled samples, thereby reducing model bias, as taught by Chen (§I).
Claim(s) 5, 11, and 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Green, in view of Chen, and further in view of Sherif et al., US Publication 2024/0311683 (“Sherif”).
Claim 5:
As indicated above, Green teaches or suggests determining the first threshold and the second threshold.
Green does not explicitly disclose that the determining is based at least in part on changes in user interactions associated with the first plurality of data records.
Sherif teaches or suggests determining the thresholds based at least in part on changes in user interactions associated with the first plurality of data records (see para. 0061 - system may determine a threshold based on the rate of change of detected actions. For example, the system may determine a rate of change of detected actions and determine the first threshold based on the rate of change of detected actions over a number of periods. by determining a threshold based on the rate of change of detected actions, the system may adaptively react to changes in a user's behavior, thereby reducing the risk of inconveniencing the user.).
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method taught in Green to include determining the thresholds based at least in part on changes in user interactions associated with the first plurality of data records, for the purpose of tuning a model based on changes in user-associated actions and interactions, improving model performance and increasing user convenience, as taught by Sherif (para. 0061).
Claim 11:
As indicated above, Green teaches or suggests determining the lower pseudo-label threshold and determining the upper pseudo-label threshold.
Sherif teaches or suggests that the determining is based at least in part on changes in user interactions associated with the first plurality of data records (see para. 0061 - system may determine a threshold based on the rate of change of detected actions. For example, the system may determine a rate of change of detected actions and determine the first threshold based on the rate of change of detected actions over a number of periods. by determining a threshold based on the rate of change of detected actions, the system may adaptively react to changes in a user's behavior, thereby reducing the risk of inconveniencing the user.).
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method taught in Green to include determining the pseudo-label thresholds based at least in part on changes in user interactions associated with the first plurality of data records, for the purpose of tuning a model based on changes in user-associated actions and interactions, improving model performance and increasing user convenience, as taught by Sherif (para. 0061).
Claim 17:
As indicated above, Green teaches or suggests determination of the lower pseudo-label threshold and determination of the upper pseudo-label threshold.
Sherif further teaches or suggests that the determination is based at least in part on changes in user interactions associated with the first plurality of data records (see para. 0061 - system may determine a threshold based on the rate of change of detected actions. For example, the system may determine a rate of change of detected actions and determine the first threshold based on the rate of change of detected actions over a number of periods. by determining a threshold based on the rate of change of detected actions, the system may adaptively react to changes in a user's behavior, thereby reducing the risk of inconveniencing the user.).
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method taught in Green to include determining the pseudo-label thresholds based at least in part on changes in user interactions associated with the first plurality of data records, for the purpose of tuning a model based on changes in user-associated actions and interactions, improving model performance and increasing user convenience, as taught by Sherif (para. 0061).
Claim(s) 16 and 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Green, in view of Chen, and further in view of Balasubramanian et al., US Publication 2023/0135683 (“Balasubramanian”).
Claim 16:
Green further teaches or suggests wherein: the first machine learning model forms at least a portion of a content retrieval stage of the multi-stage content recommendation system; and the second machine learning model forms at least a portion of a ... stage of the multi-stage content recommendation system (see Fig. 8; para. 0048 - can input data indicative of at least the feature(s) for one or more objects into the machine-learned yield model and receive, as an output, data indicative of a recommended yield decision relative to the one or more objects; para. 0060 - the machine-learned yield model can be trained based at least in part on log data annotated with yield labels. The log data can describe yield behaviors performed by vehicles (e.g., autonomous vehicles and/or humanly-operated vehicles) during previously conducted real-world driving sessions. As another example, the machine-learned yield model can be trained based at least in part on synthesized yield behaviors generated by playing forward or otherwise simulating certain scenarios that are described by log data; para. 0061 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example; para. 0157 - the machine learning computing system 130 and/or the autonomy computing system 102 can train the machine-learned models 110 and/or 140 through use of a model trainer 160. The model trainer 160 can train the machine-learned models 110 and/or 140 using one or more training or learning algorithms; para. 0159 - model trainer 160 can train a machine-learned model 110 and/or 140 based on a set of training data 162. The training data 162 can include, for example log data annotated with yield labels; para. 0160 - if the yield behavior receives a score that is greater than a first threshold (or less than depending on scoring style) the yield behavior can be labelled as a positive training example; while if the yield behavior receives a score that is less than a second threshold (or greater than depending on scoring style) the yield behavior can be labelled as a negative training example. scores provided for the yield behaviors are included as labels to train the yield model, in addition or alternatively to a simple positive or negative label).
Balasubramanian teaches or suggests a content ranking stage (see para. 0005 - system may calculate a Hadamard product of search query embeddings for the search query and user embeddings for the user, wherein the user embeddings are learned using the machine learning embedding model. The online concierge system may approximate nearest neighbors from the Hadamard product to the item embeddings for the promoted items. The online concierge system may retrieve a set of candidate promoted items based on the approximating. The online concierge system may rank the set of candidate items using a machine learning click through rate model; para. 0006 - may calculate an inner product of the search query embeddings, the user embeddings, and the item embeddings; para. 0045 - calculates 530 the Hadamard product of the search query embeddings and the user embeddings; para. 0046 – approximates nearest neighbors from the product of the search query embeddings and the user embeddings to the item embeddings for each promoted item; para. 0047 - retrieve the promoted items that are approximated to have item embeddings which are nearest to the product of the search query embeddings and the user embeddings.).
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method taught in Green to include a content ranking stage, for the purpose of efficiently using similarity calculations to determine scores of matching or near-matching content, improving model performance and content retrieval, as taught by Balasubramanian (paras. 0005, 0006, and 0047).
Claim 19:
As indicated above, Green teaches or suggests the determination of the at least one content item from the corpus of content items as recommended content in response to the request for content items.
Balasubramanian further teaches or suggests determining a dot product of a user embedding representative of a user associated with the request for content items and a query embedding representative of the request for content items (see para. 0005 - system may calculate a Hadamard product of search query embeddings for the search query and user embeddings for the user, wherein the user embeddings are learned using the machine learning embedding model. The online concierge system may approximate nearest neighbors from the Hadamard product to the item embeddings for the promoted items. The online concierge system may retrieve a set of candidate promoted items based on the approximating. The online concierge system may rank the set of candidate items using a machine learning click through rate model; para. 0006 - may calculate an inner product of the search query embeddings, the user embeddings, and the item embeddings; para. 0045 - calculates 530 the Hadamard product of the search query embeddings and the user embeddings; para. 0046 – approximates nearest neighbors from the product of the search query embeddings and the user embeddings to the item embeddings for each promoted item; para. 0047 - retrieve the promoted items that are approximated to have item embeddings which are nearest to the product of the search query embeddings and the user embeddings.).
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method taught in Green to include determining a dot product of a user embedding representative of a user associated with the request for content items and a query embedding representative of the request for content items, for the purpose of efficiently using similarity calculations to identify matching or near-matching content, improving model performance and content retrieval, as taught by Balasubramanian (paras. 0005, 0006, and 0047).
Allowable Subject Matter
Claim(s) 12, 14, and 18 is/are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Andrew T McIntosh whose telephone number is (571)270-7790. The examiner can normally be reached M-Th 8:00am-5:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tamara Kyle, can be reached at 571-272-4241. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ANDREW T MCINTOSH/Primary Examiner, Art Unit 2144