Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 12/12/2025 has been entered.
Response to Amendment
The office action is responsive to the amendment filed on 12/12/2025. Claims 1, 2, 6, 8, 10-12, 16, and 18-20 are amended. Claim 21 is new. Claims 1-21 are pending for examination.
Response to Arguments
Regarding the 35 U.S.C. § 101 Rejection:
Applicant’s arguments, see pgs. 8-15, filed 12/12/2025, with respect to claims 1-21 have been fully considered and are persuasive. The rejection of claims 1-21 under 35 U.S.C. § 101 has been withdrawn.
Regarding the 35 U.S.C. § 103 Rejection:
APPLICANT ARGUMENT:
Applicant argues that claim 1 is amended and that none of the cited references teaches or suggests these limitations: “Therefore, no combination of the cited references can teach or suggest each and every limitation of amended claim 1... Additionally, each of amended independent claims 11, 19, and 20 recite features that are similar to the features of allowable claim 1, discussed above. Therefore, amended claims 11, 19, and 20, and all claims dependent thereon, respectively, are in condition for allowance for at least the reasons set forth herein”. Lastly, applicant argues that none of the cited references teaches or suggests the limitations of new claim 21.
EXAMINER RESPONSE: Applicant’s arguments with respect to claims 1-21 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
Independent claim 1 recites the limitation in lines 2-3 “training, by a feedback training engine, a prediction machine learning model using a first set of training data comprising one or more training examples,...”. However, the disclosure does not teach a feedback training engine or a prediction machine learning model; rather, the specification at [0025], [0060], and [0064] and Figure 2 teaches a training engine – element 122 – and a personalized prediction model – element 220. In addition, claim 1 recites the limitation in lines 4-7 “wherein the feedback training engine re-weights the one or more training examples using an inverse propensity weight that is based on a frequency of a content feedback” (emphasis added). Nevertheless, the disclosure does not teach how the “feedback training engine” re-weights the training examples using the inverse propensity weight. Paragraph [0042] provides a definition for inverse propensity weight, but it does not provide detail into how the inverse propensity weight is utilized by the feedback training engine to re-weight the one or more training examples.
Moreover, claim 1 recites the limitation in lines 10-12 “generating, by the trained version of the prediction machine learning model, a predicted enjoyment signal associated with playback of a digital content item...” (emphasis added). Yet, the specification does not teach or suggest that “the trained version of the prediction machine learning model” is used to generate the “predicted enjoyment signal” as suggested by amended claim 1. Rather, as can be seen in paragraph [0060], “Training engine 122 generates, using personalized prediction model 220, predicted enjoyment signal(s) 246 associated with playback of one or more digital content items 266”. Therefore, the “predicted enjoyment signal” is being generated by a personalized prediction model, not by a trained version of a “prediction machine learning model”.
Lastly, claim 1 recites the limitation in line 13 “...content ranking machine learning model”; however, the disclosure does not teach a “content ranking machine learning model”. By contrast, paragraph [0025] and Figure 2, element 250, teach a “personalized ranking model”.
Claim 2 recites the limitation “...wherein one or more parameters of the content ranking machine learning model is updated by the feedback training engine for the training”. However, the specification does not teach the parameters of a content ranking machine learning model being updated by the feedback training engine. Rather, [0073] of the specification teaches that a “training engine 122 updates the parameters of personalized ranking model 250 based on the loss function”.
Claim 20 recites the limitation in lines 2-3, “training, by a feedback training engine, a prediction machine learning model using a first training data set and a second training data set”; however, the specification does not teach or suggest training a prediction machine learning model using a first training data set and a second training data set. Rather, paragraph [0060] teaches “...Training engine 122 generates a first set of training data 241 for personalized prediction model 220. ... Training engine 122 trains personalized ranking model 250 based on the second set of training data 241”. Furthermore, claim 20 recites the limitation “wherein the feedback training engine re-weights the one or more training examples of the first training data set using an inverse propensity weight that is based on a frequency of a content feedback behavior” (emphasis added); however, the disclosure does not teach how the “feedback training engine” re-weights the training examples of the first training data set using an inverse propensity weight that is based on a frequency of a content feedback behavior.
Independent claims 11, 19, and 20 recite limitations similar to those of claim 1 and thus are rejected for the reasons set forth in the rejection of claim 1.
Claims 2-10, 12-18, and 21 are dependent on claims 11, 19, and 20 and thus are rejected for the reasons set forth in the rejection of claims 11, 19, and 20.
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claim 20 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 20 recites the following limitation “training, by a feedback training engine, a prediction machine learning model using a first training data set and a second training data set, ...and wherein the second training data set is based on the first training data set and a predicted enjoyment signal”. However, it is not clear how the prediction machine learning model is being trained with the first training data set and the second training data set. As stated in applicant's specification, in particular the abstract, the second training data set is generated based on the first set of training data and the predicted enjoyment signal generated by a personalized prediction model (prediction machine learning model).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 3, 5-7, and 9-20 are rejected under 35 U.S.C. 103 as being unpatentable over Burkhart et al. US 2020/0320382 A1 (hereinafter Burkhart) in view of Li et al. US 2021/0264106 A1 (hereinafter Li) in further view of Joachims et al. Unbiased Learning-to-Rank with Biased Feedback (hereinafter Joachims).
Regarding claim 1:
Burkhart teaches A computer-implemented method (Fig. 1, element 102, "Computing Device"),
the method comprising: training, a prediction machine learning model using a first set of training data comprising one or more training examples... (Note: the examiner is interpreting the second training data set in the Burkhart reference as the “first set of training data”. Burkhart [0060] teaches that the second training data set includes training data specific to the digital experience, such as samples (i.e., examples): “each sample of training data in the training data set also includes known ratings (the ground truths) for multiple movies that the ensemble deep learning model 124 is generating a prediction for”. Furthermore, Burkhart Fig. 3 and [0062] teach an estimator ensemble – element 202 (i.e., prediction machine learning model) being trained using the second training data set – element 308).
generating, by the trained version of the prediction machine learning model, a predicted enjoyment signal associated with playback of a digital content item; and (Burkhart [0016] “The digital experience generation system leverages an ensemble deep learning model that generates recommendations to enhance the digital experience”. In addition, Burkhart [0034] teaches the content items and [0020] the predicted enjoyment signal “the estimator output values generated by the estimators in the estimator ensemble from the second training data set”).
training, a content ranking machine learning model using a second set of training data (estimator output values), generates the second set of training data, based on the predicted enjoyment signal (Burkhart discloses in [0062] that a second training data set (i.e., first set of training data) is input to the estimators in the estimator ensemble to generate an estimator output value (i.e., a second set of training data) for each sample of training data; the estimator output values are input to the neural network (i.e., content ranking machine learning model), and for each sample of training data in the second training data set the neural network generates a “digital experience enhancement recommendation based on the estimator output values generated by the estimators in the estimator ensemble from the second training data set”. The digital experience enhancement recommendation can be viewed as the predicted enjoyment signal associated with the content items. Further, Burkhart teaches in [0063] that “the first stage 302 and the second stage 306 can be fed with samples from the first training data set 304 and the second training data set 308 in various batch sizes”. The examiner is interpreting the second training data set as the “first set of training data” and the estimator output values as the “second set of training data”, which in combination are used as input to the neural network. Therefore, this shows how the estimator output values (second set of training data) are based on the second training data set (first set of training data). In addition, Burkhart [0124] teaches how the estimator output values are generated from samples of training data, and [0059-0060] teaches how a training data set contains samples (training examples), such as data for multiple users, and training features such as “ratings that the user gave movies in the past”).
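For purposes of illustration only, the two-stage arrangement mapped above (a first model whose output values become the training inputs of a second model, trained against the same ground-truth ratings) can be sketched in miniature as follows. This is the examiner's illustrative sketch; the random data, the least-squares second stage, and all variable names are hypothetical stand-ins, not Burkhart's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: 100 samples with 5 features, plus known ratings
# (the ground truths), standing in for the second training data set.
X = rng.normal(size=(100, 5))
ratings = rng.integers(1, 6, size=100).astype(float)

# Stage 1: an "ensemble" of simple estimators; each produces one output
# value per sample (three fixed linear projections used as placeholders).
estimators = [rng.normal(size=5) for _ in range(3)]
estimator_outputs = np.stack([X @ w for w in estimators], axis=1)  # (100, 3)

# Stage 2: the estimator output values become the training inputs of a
# second model, fit against the same ground-truth ratings (least squares
# used here in place of a neural network).
theta, *_ = np.linalg.lstsq(estimator_outputs, ratings, rcond=None)
predictions = estimator_outputs @ theta
print(predictions.shape)
```

The sketch shows only the data flow relied upon in the mapping: the first stage's outputs, not the raw samples, are what the second stage is trained on.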
Burkhart does not disclose a feedback training engine training a prediction machine learning model, and wherein the feedback training engine re-weights the one or more training examples using an inverse propensity weight that is based on a frequency of a content feedback.
Nevertheless, Li teaches the following:
training, by a feedback training engine, a prediction machine learning model using a first set of training data comprising one or more training examples,... (Li Fig. 4 and [0048] teach the training data set – element 455 – being provided to the training mechanism – element 430 (i.e., training engine), which is then used to train the teacher models; since the training data set is received by the training mechanism, this inherently generates the first training data used to train each teacher model. Therefore, the training mechanism generates the first training data. In addition, Fig. 4 and [0041] teach a training mechanism used for training each teacher model (i.e., prediction machine learning model). See annotated Fig. 4 below.
[Annotated Fig. 4 of Li – media_image1.png]
).
training, by the feedback training engine, a content ranking machine learning model using a second set of training data, wherein the feedback training engine generates the second set of training data, based on the first set of training data and the predicted enjoyment signal (output) (Li [0020] teaches using machine learning models, which generally include various algorithms that automatically build and improve over time; “The foundation of these algorithms is generally built on mathematics and statistics”, which can be employed to predict, classify, and diagnose. Li Fig. 4 teaches the training mechanism (i.e., feedback training engine) generating the second set of training data. Further, Li Fig. 5 and [0047] teach using a teacher model to generate outputs – element 520 – which can be seen as the predicted enjoyment signal based on the provided training data set – element 510 (i.e., first set of training data), and teach generating second training data in that “The output may then be used by an inference mechanism to infer soft labeled training data” – element 525. Moreover, Li [0048] teaches that once all required data has been provided, the data may be used to train the student model (i.e., content ranking machine learning model)).
Li is also in the same field of endeavor as Burkhart (machine learning for content recommendation). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the functionality of generating second training data based on the first training data and the output of the prediction model, as disclosed and taught by Li, in the system taught by Burkhart, to yield the predictable result of improving the accuracy of recommendation models, in that using knowledge distillation techniques provides “an improved method of training ML models to increase accuracy and efficiency” (see Li [0018]).
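For purposes of illustration only, the teacher-to-student arrangement relied upon from Li (a teacher model's outputs serving as soft labels that train a student model) can be sketched as follows. This is the examiner's illustrative sketch: the linear teacher and student, the gradient-descent loop, and all names are hypothetical stand-ins for the knowledge-distillation idea, not Li's actual models.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    # Numerically stable row-wise softmax.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical first training data set: 50 samples, 4 features, 3 classes.
X = rng.normal(size=(50, 4))

# A pretrained "teacher" (random weights as a placeholder) produces
# outputs; those outputs are the soft labels of the second training data.
W_teacher = rng.normal(size=(4, 3))
soft_labels = softmax(X @ W_teacher)

# The student is fit to mimic the teacher's soft labels via gradient
# descent on a linear model with cross-entropy loss.
W_student = np.zeros((4, 3))
for _ in range(200):
    probs = softmax(X @ W_student)
    grad = X.T @ (probs - soft_labels) / len(X)
    W_student -= 0.5 * grad

print(float(np.abs(softmax(X @ W_student) - soft_labels).mean()))
```

The printed mean absolute gap between student and teacher outputs shrinks as training proceeds, which is the sense in which the second training data set is "based on" the first set and the teacher's predictions.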
Neither Burkhart nor Li specifically discloses ...wherein the feedback training engine re-weights the one or more training examples using an inverse propensity weight that is based on a frequency of a content feedback.
However, Joachims teaches the following:
...wherein the feedback training engine re-weights the one or more training examples using an inverse propensity weight that is based on a frequency of a content feedback; (Examiner notes Inverse Propensity Scoring (IPS) and Inverse Probability Weighting (IPW) are used herein interchangeably. Joachims Abstract teaches a counterfactual inference framework (i.e., training engine), and pg. 3, sec. 4 (Partial-Info Learning to Rank), left col., para. 4 & right col., para. 1, teaches the inverse propensity scoring being used to re-weight the training examples, in the following function:
[IPS re-weighting function of Joachims, sec. 4 – media_image2.png]
In particular, the inverse propensity scoring is used to re-weight training examples based on estimated probabilities that a particular feedback (i.e., a click) was observed. In addition, Joachims pg. 8, sec. 7.6 (Real-World Experiment), para. 2, teaches that a ratio of observed click-through rates is used to estimate the propensity, thus suggesting the inverse propensity weight is derived from observed feedback frequency).
Joachims is also in the same field of endeavor as Burkhart and Li (machine learning). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the functionality of the inverse propensity score, as disclosed and taught by Joachims, in the system taught by Burkhart and Li, to yield the predictable result of providing a learning method that is “highly effective in dealing with biases, that it is robust to noise and propensity model misspecification, and that it scales efficiently” (Joachims Abstract).
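For purposes of illustration only, the inverse propensity re-weighting relied upon from Joachims can be shown numerically as follows. The propensity values below are hypothetical; the sketch shows only the general principle that each training example's loss contribution is scaled by the reciprocal of the estimated probability that its feedback was observed, so rarely-observed feedback counts proportionally more.

```python
import numpy as np

# Hypothetical click log: each training example carries an estimated
# propensity, i.e., the probability that its feedback (a click) was
# observed. Frequent feedback gets a high propensity; rare feedback a low one.
propensities = np.array([0.8, 0.5, 0.1, 0.05])
losses = np.ones(4)  # per-example loss contributions (placeholder values)

# Inverse propensity weighting: re-weight each example by 1/propensity.
ips_weights = 1.0 / propensities
weighted_loss = float(np.sum(ips_weights * losses))
print(ips_weights, weighted_loss)
```

Under these hypothetical numbers the rarest feedback (propensity 0.05) is up-weighted twenty-fold relative to an unweighted objective, which is the de-biasing effect the reference describes.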
Regarding Claim 3:
Burkhart, Li, and Joachims teach the computer-implemented method of claim 1. Burkhart specifically teaches further comprising: generating, based on a second weight, a transformed predicted enjoyment signal (Burkhart [0121] “The neural network 402 includes an input layer 404, a hidden layer 406, and an output layer 408”, & [0020] “The neural network includes various filters or nodes with weights that are tuned (e.g., trained)”. Further, Burkhart teaches a transformed predicted enjoyment signal [0039] “The estimator ensemble 202 uses various different estimators to generate estimation values, illustrated as estimator output values 210. The neural network 204 uses the estimator output values 210 to generate the digital experience enhancement recommendation 206”, and Fig. 1, element 104, teaches the digital experience generation system with a computing device to process and transform content, which is then rendered in a user interface for output by the digital experience generation system).
Regarding Claim 5:
Burkhart, Li, and Joachims teach the computer-implemented method of claim 3. Burkhart specifically teaches wherein generating the second set of training data further comprises combining the transformed predicted enjoyment signal with a first ranking weight used to generate a second ranking weight (Burkhart [0020] teaches the predicted enjoyment signal and weights of the neural network “the estimator output values generated by the estimators in the estimator ensemble from the second training data set. The neural network includes various filters or nodes with weights that are tuned”; the weights of the neural network can be “a ranking weight”).
Regarding Claim 6:
Burkhart, Li, and Joachims teach the computer-implemented method of claim 5. Burkhart specifically teaches further comprising generating, using the content ranking machine learning model, one or more content recommendations based on the second ranking weight (Burkhart Fig. 2 teaches an “ensemble deep learning model” – element 124 – receiving an “enhancement request” – element 208 – to then produce a “digital experience recommendation” – element 206. Burkhart [0020] teaches “the neural network includes various filters or nodes with weights that are tuned”; thus the weights of the neural network include a “second ranking weight” that is used to provide content recommendations).
Regarding Claim 7:
Burkhart, Li, and Joachims teach the computer-implemented method of claim 1. Burkhart specifically teaches wherein the predicted enjoyment signal is associated with a probability that a user who did not provide user feedback enjoyed the playback of the digital content item (Burkhart [0082] teaches the “implicit preference a user gives to an item through the act of using the item (e.g., if the item is a movie, then watching and rating the movie)”, & [0107] teaches “A probability distribution can be used by the digital experience generation system” to provide the probability of a movie being rated).
Regarding Claim 9:
Burkhart, Li, and Joachims teach the computer-implemented method of claim 1. Burkhart specifically teaches further comprising: determining a loss function based on the second set of training data; and determining, based on the loss function, whether a threshold condition is achieved (Burkhart [0041] teaches that any of a variety of loss functions or algorithms can be used to train the machine learning systems).
Regarding claim 10:
Burkhart, Li, and Joachims teach The computer-implemented method of claim 9. Burkhart teaches further comprising: updating one or more parameters of the content ranking machine learning model to reduce at least one of: mean square error, mean absolute error, smooth mean absolute error, log-cosh loss, quantile loss associated with the loss function (Examiner would like to emphasize that the claim as presented recites updating one or more parameters of the content ranking machine learning model to reduce at least one of: mean square error... (emphasis added), for which Burkhart [0048] teaches updating the parameters of the neural network (i.e., content ranking machine learning model) using the Adam optimizer, and teaches that the Mean Square Error (MSE) is used as the objective function to minimize during training).
Regarding claim 11: is rejected under the same rationale as claim 1. Claim 11 only recites the additional element of One or more non-transitory computer-readable media storing instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of..., for which Burkhart Fig. 8, element 806, and [0143] teach the computer-readable storage media.
Regarding Claim 12:
Burkhart, Li, and Joachims teach The one or more non-transitory computer-readable media of claim 11, further storing instructions that, when executed by the one or more processors, cause the one or more processors to perform the steps of:... (and thus the rejection of claim 11 is incorporated).
Burkhart specifically teaches updating one or more parameters of the... (Burkhart [0084] teaches “For all parameter updates, the Adam optimizer is used”, & [0004] teaches “The estimators in an estimator ensemble are each trained, using the first training data set”).
Burkhart thus teaches how parameters can be updated using the Adam optimizer, though neither Burkhart nor Joachims specifically discloses the prediction machine learning model being updated based on the first set of training data.
However, Li teaches the following:
...updating the prediction machine learning model based on the first set of training data (Li [0022] teaches how the training data (first set of training data) can be continually updated, and how one or more of the ML models (such as the prediction machine learning model) used by the system can be regenerated (updated) to reflect the updates to the training data).
Regarding claim 13: is rejected under the same rationale as claim 3.
Regarding Claim 14:
Burkhart, Li and Joachims teach The one or more non-transitory computer-readable media of claim 13. Burkhart specifically teaches wherein the second weight is associated with a monotonic function configured to optimize a range of the predicted enjoyment signal (Burkhart [0020] teaches weights included with a neural network and Burkhart [0121] teaches “mapping and normalization layer 410, which maps the logits L to a mapped value using the function 1/(1 + e^(-L)). These mapped values are normalized to produce probabilities for one of multiple (e.g., 5) potential item values”).
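For purposes of illustration only, the mapping-and-normalization step quoted above can be sketched as follows. The logit values are hypothetical; the sketch shows only the general form of applying 1/(1 + e^(-L)) to logits and normalizing the mapped values into probabilities over the potential item values.

```python
import numpy as np

def logistic(L):
    # Maps each logit L to a value in (0, 1) via 1 / (1 + e^(-L)).
    return 1.0 / (1.0 + np.exp(-L))

# Hypothetical logits for 5 potential item values (e.g., star ratings 1-5).
logits = np.array([-1.0, 0.0, 0.5, 1.0, 2.0])
mapped = logistic(logits)

# Normalize the mapped values so they sum to 1, yielding a probability
# for each of the 5 potential item values.
probabilities = mapped / mapped.sum()
print(round(float(probabilities.sum()), 6))  # → 1.0
```

Because the logistic map is monotonic, the normalization preserves the ordering of the logits while constraining the outputs to a probability distribution.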
Regarding claim 15: is rejected under the same rationale as claim 5.
Regarding claim 16: is rejected under the same rationale as claim 6.
Regarding claim 17: is rejected under the same rationale as claim 7.
Regarding claim 18: is rejected under the same rationale as claim 8.
Regarding claim 19: is rejected under the same rationale as claim 1. Claim 19 only recites the additional element of A system, comprising: a memory storing one or more software applications; and a processor that, when executing the one or more software applications, is configured to perform the steps of..., for which Burkhart Fig. 8, element 804, and [0143] teach a processing system.
Regarding claim 20:
A computer-implemented method, the method comprising: (Burkhart Fig. 1, element 102, "Computing Device").
training, a prediction machine learning model using a first training data set and wherein the second training data set is based on the first training data set and a predicted enjoyment signal; (Burkhart Fig. 3 teaches an estimator ensemble – element 202 (i.e., prediction machine learning model) trained with Training Data Set 2 – element 308 (i.e., first training data set). In addition, Burkhart Fig. 3 teaches the estimator output values (i.e., second training data set) are based on Training Data Set 2, since Training Data Set 2 is used as input to the estimator ensemble. Furthermore, Burkhart [0060] teaches Training Data Set 2 includes data from multiple users, such as ratings that the user gave to movies. Therefore, Training Data Set 2 includes a “predicted enjoyment signal” (e.g., a rating), which is also used for training the estimator ensemble).
processing one or more attributes associated with a given user using a content ranking machine learning model to identify a set of content items, wherein the content ranking machine learning model is trained on the second training data set; and (Burkhart [0020] teaches “for each sample of training data in the second training data set, the neural network generates a digital experience enhancement recommendation”. The digital experience recommendation can be seen as the identified content items, which can be presented to the user. Further, Fig. 3 teaches the neural network – element 204 (i.e., content ranking machine learning model) is trained using the estimator output values – element 210 (i.e., second training data set)).
presenting at least a subset of the set of content items to the given user (Burkhart [0028] teaches a “digital experience” used to display different data (i.e., at least a subset of the set of content items) in different manners. Further, [0034] and Fig. 1, element 116, teach the use of a display device to display content items – element 106 – to the user, and such content “can take various forms, such as image content, video content, mixed media content, and so forth”).
Burkhart does not explicitly teach the prediction machine learning model being trained by the feedback training engine using the first and second training data sets, nor wherein the feedback training engine re-weights one or more training examples of the first training data set using an inverse propensity weight that is based on a frequency of a content feedback behavior.
Nevertheless, Li teaches the following:
training, by a feedback training engine, a prediction machine learning model using a first training data set and a second training data set, ...and wherein the second training data set is based on the first training data set and a predicted enjoyment signal; (Li Fig. 4 teaches models being trained by a training mechanism – element 430 (i.e., feedback training engine), and [0005] teaches utilizing the first training data set and the second training data set to train a machine learning model (i.e., a prediction machine learning model). Further, Li Fig. 5 and [0047] teach using a teacher model to generate outputs – element 520 – which can be seen as the predicted enjoyment signal based on the provided training data set – element 510 (i.e., first training data set), and further teach generating second training data in that “The output may then be used by an inference mechanism to infer soft labeled training data” – element 525. Therefore, the second training data set is based on the first training data set and a predicted enjoyment signal).
Neither Burkhart nor Li teaches ...wherein the feedback training engine re-weights one or more training examples of the first training data set using an inverse propensity weight that is based on a frequency of a content feedback behavior....
However, Joachims teaches the following:
...wherein the feedback training engine re-weights the one or more training examples of the first training data set using an inverse propensity weight that is based on a frequency of a content feedback behavior.... (Examiner notes Inverse Propensity Scoring (IPS) and Inverse Probability Weighting (IPW) are used herein interchangeably. Joachims Abstract teaches a counterfactual inference framework (i.e., training engine), and pg. 3, sec. 4 (Partial-Info Learning to Rank), left col., para. 4 & right col., para. 1, teaches the inverse propensity scoring being used to re-weight the training examples, in the following function:
[IPS re-weighting function of Joachims, sec. 4 – media_image2.png]
In particular, the inverse propensity scoring is used to re-weight training examples based on estimated probabilities that a particular feedback (i.e., a click) was observed. In addition, Joachims pg. 8, sec. 7.6 (Real-World Experiment), para. 2, teaches that a ratio of observed click-through rates is used to estimate the propensity, thus suggesting the inverse propensity weight is derived from observed feedback frequency).
Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Burkhart, Li, Joachims in further view of Fang et al. US 2019/0251446 A1 (hereinafter Fang).
Regarding claim 2:
Burkhart, Li, and Joachims teach The computer-implemented method of claim 1. Burkhart [0062] and Fig. 3 specifically teach a neural network (i.e., content ranking machine learning model), and similarly Li [0048] teaches training a student model (i.e., content ranking machine learning model).
None of Burkhart, Li, and Joachims discloses updating one or more parameters of the content ranking machine learning model by the feedback training engine for the training.
Nevertheless, Fang teaches the following:
...wherein one or more parameters of the content ranking machine learning model is updated by the feedback training engine for the training (Fang [0075] teaches a “personalized ranking model” (i.e., content ranking machine learning model) that is used to generate personalized ranking items for a user. Further, [0024] teaches a fashion recommendation system (i.e., feedback training engine) being implemented to train the personalized ranking model, and [0078] teaches the fashion recommendation system can train “the personalized ranking model 230 for one or more iterations”. As would be familiar to one skilled in the art, if a model such as the “personalized ranking model” is trained for one or more iterations, this inherently involves updating the model's internal parameters (such as weights and biases) with each iteration).
Fang is also in the same field of endeavor as Burkhart, Li, and Joachims (machine learning). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the functionality of updating the parameters of a ranking model using the training engine, as disclosed and taught by Fang, in the system taught by Burkhart, Li, and Joachims to yield the predictable results of “provide benefits and/or solve one or more of the foregoing or other problems in the art with systems, computer media, and methods for effectively providing personalized fashion recommendations to users using deep learning visually-aware techniques trained using implicit user feedback” (see Fang [0008]).
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Burkhart, Li, and Joachims, and further in view of Anatomise Biostats, Transforming Skewed Data: How to Choose the Right Transformation for Your Distribution (hereinafter Anatomise).
Regarding claim 4:
Burkhart, Li, and Joachims teach The computer-implemented method of claim 3. Burkhart specifically teaches wherein the second weight is associated with a monotonic function of a probability of positive feedback, and the monotonic function is configured to optimize a range of the predicted enjoyment signal (Burkhart [0020] teaches weights included with a neural network, and Burkhart [0121] teaches “mapping and normalization layer 410, which maps the logits L to a mapped value using the function 1/(1 + e^(-L)). These mapped values are normalized to produce probabilities for one of multiple (e.g., 5) potential item values”).
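The mapping-and-normalization step quoted from Burkhart can be sketched as follows (the five-way value head and the logit values are illustrative assumptions, not taken from Burkhart):

```python
import math

def map_and_normalize(logits):
    """Map each logit L to 1 / (1 + e^(-L)), then normalize the mapped
    values so they sum to 1, giving a probability per potential item value."""
    mapped = [1.0 / (1.0 + math.exp(-L)) for L in logits]
    total = sum(mapped)
    return [m / total for m in mapped]

# Hypothetical logits for five potential item values (e.g., a 1-5 rating).
probs = map_and_normalize([-2.0, -1.0, 0.0, 1.0, 2.0])
print(probs)       # five probabilities, increasing with the logit
print(sum(probs))  # ~1.0
```

Because the sigmoid mapping is monotonic, larger logits always yield larger normalized probabilities, consistent with the claimed monotonic function.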
None of Burkhart, Li, and Joachims teaches the monotonic function is configured to increase a spread of weights in an instance in which the probability of positive feedback is greater than a threshold value.
Nevertheless, Anatomise teaches the following:
the monotonic function is configured to... increase a spread of weights in an instance in which the probability of positive feedback is greater than a threshold value (Anatomise, pg. 1, para. 2, line 1 & pg. 2, para. 1, lines 1-3, teaches monotonic transformations such as the square root, natural log, log base 10, and inverse transformations, which are specific applications of a monotonic function, and which can increase the spread of weights for positively skewed data that is > 0).
Anatomise is also in the same field of endeavor as Burkhart, Li, and Joachims (artificial intelligence). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the functionality of monotonic transformations, as disclosed and taught by Anatomise, in the system taught by Burkhart, Li, and Joachims to yield the predictable results of “improve normality, homogeneity of variance or both” (Anatomise, pg. 1, para. 1, line 4).
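As a sketch of how a monotonic transformation can increase the spread of weights above a threshold (the logit transform and the example probabilities below are illustrative assumptions, not taken from Anatomise or the cited references):

```python
import math

def logit(p):
    """Monotonic transform log(p / (1 - p)); spreads out values near 1."""
    return math.log(p / (1.0 - p))

# Probabilities of positive feedback clustered above a 0.9 threshold.
probs = [0.90, 0.95, 0.99]
weights = [logit(p) for p in probs]

# The raw probabilities span only ~0.09, while the transformed weights
# span far more, so the monotonic transform increases the spread in the
# high-probability range while preserving the ordering of the inputs.
print(max(probs) - min(probs))      # ~0.09
print(max(weights) - min(weights))  # ~2.4
```

The same pattern holds for the transforms named by Anatomise (square root, log, inverse): each preserves ordering while reshaping how tightly values cluster.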
Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Burkhart, Li, and Joachims, and further in view of Wang et al., Learning to Rank with Selection Bias in Personal Search (hereinafter Wang).
Regarding claim 8:
Burkhart, Li, and Joachims teach The computer-implemented method of claim 1.
While Joachims teaches the inverse propensity weight, none of Burkhart, Li, and Joachims specifically teaches wherein the inverse propensity weight is generated based on a probability of one or more users providing user feedback associated with the playback of the digital content item.
Nonetheless, Wang teaches:
wherein the inverse propensity weight is generated based on a probability of one or more users providing user feedback associated with the playback of the digital content item (Wang, pg. 3, right col., sec. 3.3, Inverse Propensity Weighting, para. 2, teaches the inverse propensity weighting, where “P̂_Q is known as the propensity score of Q”, and specifically teaches that the inverse propensity weight w_Q = P_Q / P̂_Q is the ratio between the probability P_Q of a query Q appearing in the data set U and the probability P̂_Q that Q actually appears in the sample S, thus suggesting the inverse propensity weight is based on a probability that user feedback (e.g., clicks) is associated with the playback of the digital content).
Wang is also in the same field of endeavor as Burkhart, Li, and Joachims (machine learning). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the functionality of an inverse propensity weight based on a probability of user feedback interaction with content, as disclosed and taught by Wang, in the system taught by Burkhart, Li, and Joachims to yield the predictable results of providing a theoretical framework for eliminating selection bias in personal search and providing an extensive empirical evaluation using large-scale live experiments (Wang, pg. 2, left col., para. 1).
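The ratio taught by Wang, w_Q = P_Q / P̂_Q, can be sketched as follows (the counts below are hypothetical and only illustrate the ratio):

```python
def inverse_propensity_weight(count_full, total_full, count_sample, total_sample):
    """w_Q = P_Q / P_hat_Q: the ratio between the probability of a query Q
    appearing in the full data set U and the probability that Q actually
    appears in the logged sample S."""
    p_q = count_full / total_full          # P_Q, estimated from U
    p_q_hat = count_sample / total_sample  # P_hat_Q, estimated from S
    return p_q / p_q_hat

# Hypothetical counts: Q appears in 20 of 1000 queries in U but only in
# 5 of 1000 queries in S, so its training examples are up-weighted by ~4x.
w = inverse_propensity_weight(20, 1000, 5, 1000)
print(w)  # ~4.0
```

Queries under-represented in the logged sample relative to the full data set receive weights greater than 1, compensating for the selection bias Wang describes.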
Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over Burkhart, Li, and Joachims, and further in view of Jeong et al. (WO 2021/089863 A1) (hereinafter Jeong).
Regarding claim 21:
Burkhart, Li and Joachims teach The computer-implemented method of claim 1.
None of Burkhart, Li, and Joachims teaches wherein the inverse propensity weight is further based on one or more distribution differences between training data and inference data.
Nevertheless, Jeong teaches the following:
wherein the inverse propensity weight is further based on one or more distribution differences between training data and inference data (Jeong [0015] teaches using inverse propensity scoring (i.e., the inverse propensity weight) to correct for distribution imbalance between a baseline policy and a target policy. Thus, Jeong teaches the inverse propensity scoring is applied to address distribution differences between the baseline policy (i.e., training data) and the target policy (i.e., inference data)).
Jeong is also in the same field of endeavor as Burkhart, Li, and Joachims (machine learning). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the functionality of inverse propensity scoring for mitigating distribution differences, as disclosed and taught by Jeong, in the system taught by Burkhart, Li, and Joachims to yield the predictable results of improving the process for training and deploying a model (Jeong [0016]).
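The distribution-difference correction attributed to Jeong can be sketched as importance weighting, where each training example is weighted by the ratio of its probability under the inference (target) distribution to its probability under the training (baseline) distribution (the per-example probabilities below are hypothetical):

```python
def distribution_shift_weights(p_train, p_infer):
    """Weight each example by p_infer(x) / p_train(x): examples
    over-represented in training relative to inference are down-weighted,
    and under-represented examples are up-weighted."""
    return [pi / pt for pt, pi in zip(p_train, p_infer)]

# Hypothetical per-example probabilities under each distribution.
p_train = [0.5, 0.3, 0.2]    # baseline (training) distribution
p_infer = [0.25, 0.3, 0.45]  # target (inference) distribution
weights = distribution_shift_weights(p_train, p_infer)
print(weights)  # ~[0.5, 1.0, 2.25]
```

Under this weighting, the reweighted training distribution matches the inference distribution in expectation, which is the sense in which the weight is "based on one or more distribution differences between training data and inference data."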
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
McInerney et al. Counterfactual Evaluation of Slate Recommendations with Sequential Reward Interactions (2020).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GISEL G FACCENDA whose telephone number is (703)756-1919. The examiner can normally be reached Monday - Friday 8:00 am - 4:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Al Kawsar can be reached at (571) 270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/G.G.F./Examiner, Art Unit 2127
/JEREMY L STANLEY/Examiner, Art Unit 2127