Prosecution Insights
Last updated: April 19, 2026
Application No. 17/789,132

METHOD AND APPARATUS FOR TRAINING INFORMATION PREDICTION MODELS, METHOD AND APPARATUS FOR PREDICTING INFORMATION, AND STORAGE MEDIUM AND DEVICE THEREOF

Non-Final OA — §103, §112
Filed: Jun 24, 2022
Examiner: MAUNI, HUMAIRA ZAHIN
Art Unit: 2141
Tech Center: 2100 — Computer Architecture & Software
Assignee: BIGO TECHNOLOGY PTE. LTD.
OA Round: 3 (Non-Final)
Grant Probability: 38% (At Risk)
OA Rounds: 3-4
To Grant: 4y 6m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 38% (6 granted / 16 resolved; -17.5% vs TC avg) — grants only 38% of cases
Interview Lift: +66.7% among resolved cases with interview — a strong lift over cases resolved without an interview
Avg Prosecution: 4y 6m typical timeline; 39 applications currently pending
Total Applications: 55 across all art units (career history)
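
The headline figures above follow from simple ratios. A minimal sketch of the likely arithmetic: the allow-rate formula matches the displayed counts exactly, while the lift formula is an assumption, since the with-interview and without-interview subset counts are not shown on this page.

```python
# Career allow rate consistent with the displayed counts: 6 granted / 16 resolved.
granted, resolved = 6, 16
career_allow_rate = granted / resolved  # 0.375 -> displayed as 38%

def interview_lift(rate_with: float, rate_without: float) -> float:
    """Relative lift of the with-interview allowance rate over the without-interview
    rate (assumed formula; the subset counts behind +66.7% are not shown here)."""
    return (rate_with - rate_without) / rate_without

print(f"Career allow rate: {career_allow_rate:.1%}")  # Career allow rate: 37.5%
```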

Statute-Specific Performance

§101: 35.9% (-4.1% vs TC avg)
§103: 40.2% (+0.2% vs TC avg)
§102: 10.9% (-29.1% vs TC avg)
§112: 13.0% (-27.0% vs TC avg)
Deltas shown against the Tech Center average estimate • Based on career data from 16 resolved cases
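
Each delta is the examiner's statute-specific allowance rate minus the Tech Center average estimate; the four displayed deltas are all consistent with a baseline of roughly 40%, an inference from the numbers above rather than a figure stated on this page. A quick check:

```python
# Reproduce the displayed "vs TC avg" deltas from the per-statute rates,
# assuming a ~40.0% Tech Center average estimate (inferred, not stated).
examiner_rate = {"§101": 35.9, "§103": 40.2, "§102": 10.9, "§112": 13.0}
TC_AVG_ESTIMATE = 40.0

for statute, rate in examiner_rate.items():
    print(f"{statute}: {rate - TC_AVG_ESTIMATE:+.1f}% vs TC avg")
# §101: -4.1%, §103: +0.2%, §102: -29.1%, §112: -27.0% (matches the figures above)
```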

Office Action

§103, §112
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 12/09/2025 has been entered.

Claim Objections

Claims 1 and 15 are objected to because of the following informalities: "wherein the information item to be recommended is determined by the information prediction computer device based on prediction results corresponding to candidate information items in a case that the prediction results are generated by the information prediction computer device using a third information recommendation model;" should be "wherein the information item to be recommended is determined by the information prediction computer device based on prediction results corresponding to candidate information items, wherein the prediction results are generated by the information prediction computer device using a third information recommendation model;". Appropriate correction is required.

Claim 10 is objected to because of the following informalities: "determines an information item to be recommended in the candidate information items based on the prediction results" should be "determining an information item to be recommended in the candidate information items based on the prediction results". Appropriate correction is required.

Response to Amendment

Claims 1, 5, 7, 9-11, 14-17, and 21 remain pending in the application.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1, 5, 7, 9-11, 14-17, and 21 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

Claims 1, 10, and 15 recite the limitation "wherein the third information recommendation model is acquired by training by a computer device for training an information recommendation model and published to a server to enable the information prediction computer device to acquire;". It is unclear whether the information prediction computer device acquires "an information item", an "information recommendation model", or some other embodiment. There is insufficient antecedent basis for this limitation. For examination purposes, the examiner is interpreting "published to a server to enable the information prediction computer device to acquire;" to be "published to a server to enable the information prediction computer device;".
Dependent claims 5, 7, 9, 11, 14, 16, 17, and 21 inherit the deficiency and therefore are rejected on the same basis.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5, 7, 9-11, 14-17, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Bilenko et al. (Pub. No. US 2014/0337096 A1), hereafter Bilenko, in view of Zhang et al. ("A Deep Joint Network for Session-based News Recommendations with Contextual Augmentation"), hereafter Zhang.

Regarding claim 1, Bilenko discloses: A method for recommending information, executed by a recommendation system, the method comprising: acquiring an information item to be recommended from an information prediction computer device, and recommending the information item to a user (Bilenko, Fig. 6, ¶[0057], ¶[0099], and ¶[0106] teaches acquiring and recommending content items, such as ads, to a user by a recommendation system), wherein the information item to be recommended is determined by the information prediction computer device based on prediction results corresponding to candidate information items in a case that the prediction results are generated by the information prediction computer device using a third information recommendation model (Bilenko, Fig. 12, Fig. 5, Fig. 6 and ¶[0087], ¶[0094], and ¶[0106] teaches the recommended ad to be determined based on the prediction results corresponding to candidate information items, i.e., candidate ads, where the prediction results are generated by the system device using the final deployed model 518 as the third information prediction model), wherein the third information recommendation model is acquired by training by a computer device for training an information recommendation model and published to a server to enable the information prediction computer device to acquire (Bilenko, Fig. 12, Fig. 13, Fig. 6 and ¶[0086-0087] teaches acquiring the final deployed model by training an information recommendation model using the processing device and publishing the trained prediction model in the prediction module to a server), wherein the third information recommendation model is acquired by the training by the computer device for training the information recommendation model through the following processes: acquiring a set of training samples corresponding to a current training period (Bilenko, Fig. 1, ¶[0032] and ¶[0057] teaches Data Collection Process 108 acquiring a set of training samples corresponding to a current training period for training an information prediction model 106), wherein training samples in the set of training samples comprise feature items, feature attribute values corresponding to the feature items, and behavior data of a user for information items (Bilenko, Figs. 1 and 2, ¶[0002] and ¶[0033-0034] teaches feature vectors of feature items with corresponding attribute values, and user-related aspects as behavior data of a user for information items), the feature items comprising at least one of features of the user and features of the information items (Bilenko, Figs. 1 and 3, and ¶[0002] teaches the feature vectors to comprise user IDs and ad IDs as features of users and information items), acquiring current behavior statistics data by performing statistical collection on the behavior data in the set of training samples (Bilenko, Fig. 1 and ¶[0003] teaches statistical information corresponding to feature information as current behavior statistics data acquired by performing statistical collection on the behavior data in the set of training samples), acquiring a second information recommendation model by updating, based on the current behavior statistics data, first behavior statistics data in a first information recommendation model, wherein the first information recommendation model corresponds to a previous training period (Bilenko, Fig. 8 and ¶[0094] teaches a second training period where the model is trained based on the first training period and based on the current behavior statistics data, first behavior statistics data in a first information recommendation model), wherein the first information recommendation model comprises an information recommendation model based on deep neural networks (DNN) (Bilenko, ¶[0082] and ¶[0098] teaches the prediction model to be a deep neural network), acquiring a trained third information recommendation model by training the second information recommendation model based on the set of training samples (Bilenko, Fig. 5 and ¶[0094] teaches acquiring the final deployed model 518 as the trained third information prediction model by training the second information prediction model based on the set of training samples), wherein acquiring the current behavior statistics data by performing the statistical collection on the behavior data in the set of training samples comprises: acquiring first behavior statistics amounts in the first behavior statistics data corresponding to the feature attribute values present in the set of training samples (Bilenko, Fig. 7 and ¶[0090-0092] teaches updating the statistical information to acquire first behavior statistics amounts in the first behavior statistics data corresponding to the feature attribute values present in the set of training samples), acquiring current behavior statistics amounts corresponding to the feature attribute values by superimposing behavior data corresponding to the feature attribute values present in the set of training samples … (Bilenko, Figs. 7 and 10, Fig. 11, ¶[0102-0104] teaches acquiring current behavior statistics amounts corresponding to the feature attribute values by superimposing behavior data on the first behavior statistics amount through the prediction model providing plural instances of statistical information and using post-deployment data to perform further training), acquiring the current behavior statistics data by aggregating the current behavior statistics amounts corresponding to the feature attribute values (Bilenko, Fig. 11 and ¶[0104] teaches generating subsets of data via the aggregation module as aggregating the current behavior statistics amounts corresponding to the feature attribute values).

Bilenko teaches acquiring first behavior statistics amounts, but does not teach: calculating a product of the first … amounts and a predetermined time decay factor. Zhang teaches: calculating a product of the first … amounts and a predetermined time decay factor (Zhang, page 206, left column, paragraph 2, last 2 lines, "The decay rates are multiplied by the output values from LSTM RNN layer to form the final outputs", teaches calculating a product of the first amounts and a predetermined time decay factor).

Bilenko teaches acquiring current behavior statistics amounts corresponding to the feature attribute values by superimposing behavior data corresponding to the feature attribute values present in the set of training samples … but does not teach superimposing the data on the product. Zhang teaches: superimposing … data … on the product (Zhang, page 206, Equation 15 and 2 lines below Equation 15, "λ is the parameter that needs to be tuned during training, and controls the decay rate for the news."; page 202, right column, paragraph 2, last 3 lines, "we adopt time-decay function to reduce the weight of the historical news articles, and character-level encoding to alleviate sparsity problem."; and Fig. 1 teaches superimposing data on the product throughout training of the model recited in Figure 1).

Bilenko and Zhang are analogous art because they are from the same field of endeavor: feature engineering, recommendations, and machine learning models. It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Bilenko to include calculating a product of the first amounts and a predetermined time decay factor and superimposing data on the product, based on the teachings of Zhang. One of ordinary skill in the art would have been motivated to make this modification in order to improve accuracy while simplifying feature engineering steps, as suggested by Zhang (Zhang, page 203, left column, paragraph 1, last line).
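
The limitation at the center of this combination reduces to a simple recurrence: multiply the previous period's behavior statistics amounts by a predetermined time decay factor, then superimpose (add) the current period's behavior data on that product, and aggregate per feature attribute value. A minimal sketch of that reading of claims 1 and 10, with a hypothetical decay value and toy data; this illustrates the claim language as mapped in the rejection, not actual code from the application, Bilenko, or Zhang:

```python
# Claimed update, as the rejection reads claims 1/10:
#   current_amount[v] = first_amount[v] * decay + new_behavior[v]
# aggregated over feature attribute values v.
from collections import Counter

DECAY = 0.9  # "predetermined time decay factor" (hypothetical value)

def update_behavior_stats(first_stats: dict, period_behavior: Counter) -> dict:
    """Superimpose this period's behavior data on the decayed prior statistics."""
    keys = set(first_stats) | set(period_behavior)
    return {v: first_stats.get(v, 0.0) * DECAY + period_behavior.get(v, 0) for v in keys}

# Toy example: per-attribute-value click counts from the previous and current periods.
prev = {"user_id=123": 10.0, "ad_id=42": 4.0}
now = Counter({"user_id=123": 3, "ad_id=7": 1})
print(update_behavior_stats(prev, now))
# e.g. {'user_id=123': 12.0, 'ad_id=42': 3.6, 'ad_id=7': 1.0}
```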
Regarding claim 5, Bilenko, in view of Zhang, discloses the method according to claim 1. Bilenko further discloses: wherein the first information recommendation model comprises an embedding layer and a fully connected layer, the fully connected layer receiving the embedding layer and the first behavior statistics data (Bilenko, ¶[0098] teaches an input vector as an embedding layer received by the input layer, i.e., a fully connected layer of a neural network of the prediction model), acquiring the trained third information recommendation model by training the second information recommendation model based on the set of training samples comprises: acquiring the trained third information recommendation model by updating parameters of the embedding layer and the fully connected layer in the second information recommendation model by means of training the second information recommendation model based on the set of training samples (Bilenko, Fig. 5, ¶[0082] and ¶[0098] teaches generating a trained third model 518 by updating parameters of the embedding layer and the fully connected layer in the second information prediction model by means of training the second information prediction model during model training, based on the set of training samples in collected data 510).

Regarding claim 7, Bilenko, in view of Zhang, discloses the method according to claim 1. Bilenko further discloses: wherein the feature attribute values are represented by hash values (Bilenko, ¶[0051] teaches feature attribute values represented by hash values).

Regarding claim 9, Bilenko, in view of Zhang, discloses the method according to claim 1. Bilenko further discloses: wherein the first information recommendation model comprises an information recommendation model based on click-through rates (CTR) (Bilenko, ¶[0049] teaches an information recommendation model based on click-through rates).
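
Claims 5, 7, and 9 together describe a familiar CTR-model shape: hashed feature attribute values index an embedding layer, and a fully connected layer receives the embedding output alongside the behavior statistics. A minimal PyTorch sketch of that shape; the dimensions, names, and concatenation scheme are illustrative assumptions, not details taken from the application or Bilenko:

```python
import torch
import torch.nn as nn

class CTRModel(nn.Module):
    """Embedding layer plus a fully connected stack that also receives behavior statistics."""

    def __init__(self, num_hash_buckets: int, embed_dim: int, stats_dim: int, hidden: int = 64):
        super().__init__()
        # Claim 7: feature attribute values represented by hash values -> embedding indices.
        self.embedding = nn.EmbeddingBag(num_hash_buckets, embed_dim, mode="mean")
        # Claim 5: the fully connected layer receives the embedding output and the
        # first behavior statistics data (modeled here by concatenation).
        self.fc = nn.Sequential(
            nn.Linear(embed_dim + stats_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, hashed_values: torch.Tensor, behavior_stats: torch.Tensor) -> torch.Tensor:
        emb = self.embedding(hashed_values)          # (batch, embed_dim)
        x = torch.cat([emb, behavior_stats], dim=1)  # embedding + behavior statistics
        return torch.sigmoid(self.fc(x)).squeeze(1)  # claim 9: CTR-style probability
```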
Regarding claim 10, Bilenko discloses: A method for recommending information, executed by a recommendation system, the method comprising: acquiring an information item to be recommended from an information prediction computer device, and recommending the information item to a user (Bilenko, Fig. 6, ¶[0057], ¶[0099], and ¶[0106] teaches acquiring and recommending content items, such as ads, to a user by a recommendation system), wherein the information prediction computer device determines the information item to be recommended through the following processes: acquiring samples corresponding to candidate information items (Bilenko, ¶[0046] teaches acquiring samples corresponding to candidate information items for predicting information), acquiring a third information recommendation model from a server, wherein the third information recommendation model is acquired by training by a computer device for training an information recommendation model and published to a server to enable the information prediction computer device to acquire (Bilenko, Fig. 12, Fig. 13, Fig. 6 and ¶[0086-0087] teaches acquiring the final deployed model by training an information recommendation model using the processing device and publishing the trained prediction model in the prediction module to a server), inputting the samples into the third information recommendation model (Bilenko, Fig. 1 and ¶[0046] teaches inputting the samples into the information prediction model), determining, based on an output result of the third information recommendation model, prediction results corresponding to the candidate information items; and determines an information item to be recommended in the candidate information items based on the prediction results (Bilenko, Fig. 12, Fig. 5, Fig. 6 and ¶[0087], ¶[0094], and ¶[0106] teaches the recommended ad to be determined based on the prediction results corresponding to candidate information items, i.e., candidate ads, where the prediction results are generated by the system device using the final deployed model 518 as the third information prediction model), wherein the third information recommendation model is acquired by training by the computer device for training the information recommendation model through the following processes: acquiring a set of training samples corresponding to a current training period (Bilenko, Fig. 1, ¶[0032] and ¶[0057] teaches Data Collection Process 108 acquiring a set of training samples corresponding to a current training period for training an information prediction model 106), wherein training samples in the set of training samples comprise feature items, feature attribute values corresponding to the feature items, and behavior data of a user for information items (Bilenko, Figs. 1 and 2, ¶[0002] and ¶[0033-0034] teaches feature vectors of feature items with corresponding attribute values, and user-related aspects as behavior data of a user for information items), the feature items comprising at least one of features of the user and features of the information items (Bilenko, Figs. 1 and 3, and ¶[0002] teaches the feature vectors to comprise user IDs and ad IDs as features of users and information items), acquiring current behavior statistics data by performing statistical collection on the behavior data in the set of training samples (Bilenko, Fig. 1 and ¶[0003] teaches statistical information corresponding to feature information as current behavior statistics data acquired by performing statistical collection on the behavior data in the set of training samples), acquiring a second information recommendation model by updating, based on the current behavior statistics data, first behavior statistics data in a first information recommendation model, wherein the first information recommendation model corresponds to a previous training period (Bilenko, Fig. 8 and ¶[0094] teaches a second training period where the model is trained based on the first training period and based on the current behavior statistics data, first behavior statistics data in a first information recommendation model), wherein the first information recommendation model comprises an information recommendation model based on deep neural networks (DNN) (Bilenko, ¶[0082] and ¶[0098] teaches the prediction model to be a deep neural network), acquiring a trained third information recommendation model by training the second information recommendation model based on the set of training samples (Bilenko, Fig. 5 and ¶[0094] teaches acquiring the final deployed model 518 as the trained third information prediction model by training the second information prediction model based on the set of training samples), wherein acquiring the current behavior statistics data by performing the statistical collection on the behavior data in the set of training samples comprises: acquiring first behavior statistics amounts in the first behavior statistics data corresponding to the feature attribute values present in the set of training samples (Bilenko, Fig. 7 and ¶[0090-0092] teaches updating the statistical information to acquire first behavior statistics amounts in the first behavior statistics data corresponding to the feature attribute values present in the set of training samples), acquiring current behavior statistics amounts corresponding to the feature attribute values by superimposing behavior data corresponding to the feature attribute values present in the set of training samples … (Bilenko, Figs. 7 and 10, Fig. 11, ¶[0102-0104] teaches acquiring current behavior statistics amounts corresponding to the feature attribute values by superimposing behavior data on the first behavior statistics amount through the prediction model providing plural instances of statistical information and using post-deployment data to perform further training), acquiring the current behavior statistics data by aggregating the current behavior statistics amounts corresponding to the feature attribute values (Bilenko, Fig. 11 and ¶[0104] teaches generating subsets of data via the aggregation module as aggregating the current behavior statistics amounts corresponding to the feature attribute values).

Bilenko teaches acquiring first behavior statistics amounts, but does not teach: calculating a product of the first … amounts and a predetermined time decay factor. Zhang teaches: calculating a product of the first … amounts and a predetermined time decay factor (Zhang, page 206, left column, paragraph 2, last 2 lines, "The decay rates are multiplied by the output values from LSTM RNN layer to form the final outputs", teaches calculating a product of the first amounts and a predetermined time decay factor).

Bilenko teaches acquiring current behavior statistics amounts corresponding to the feature attribute values by superimposing behavior data corresponding to the feature attribute values present in the set of training samples … but does not teach superimposing the data on the product. Zhang teaches: superimposing … data … on the product (Zhang, page 206, Equation 15 and 2 lines below Equation 15, "λ is the parameter that needs to be tuned during training, and controls the decay rate for the news."; page 202, right column, paragraph 2, last 3 lines, "we adopt time-decay function to reduce the weight of the historical news articles, and character-level encoding to alleviate sparsity problem."; and Fig. 1 teaches superimposing data on the product throughout training of the model recited in Figure 1).

It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Bilenko to include calculating a product of the first amounts and a predetermined time decay factor and superimposing data on the product, based on the teachings of Zhang. One of ordinary skill in the art would have been motivated to make this modification in order to improve accuracy while simplifying feature engineering steps, as suggested by Zhang (Zhang, page 203, left column, paragraph 1, last line).
Regarding claim 11, Bilenko, in view of Zhang, discloses the method according to claim 10. Bilenko further discloses: wherein the information recommendation model comprises an information recommendation model based on click-through rates (CTR) (Bilenko, ¶[0049] teaches an information recommendation model based on click-through rates), determining, based on the output result of the information recommendation model, the prediction result corresponding to the candidate information items comprises: determining, based on the output result of the information recommendation model, a CTR prediction result corresponding to the candidate information items (Bilenko, ¶[0048-0049], ¶[0067-0069] teaches a training system to determine a CTR prediction result corresponding to candidate information, based on the output result of the prediction model, by forming clusters of user IDs that have similar click-through rates in a manner that minimizes the loss of the predictive accuracy), upon determining, based on the output result of the information recommendation model, the prediction results corresponding to the candidate information items, the method further comprises: determining … the candidate information items based on the CTR prediction result (Bilenko, ¶[0001] teaches serving one or more ads having high click probabilities based on the model output as determining the candidate information items based on the CTR prediction result), determining … an information item to be recommended in the candidate information items (Bilenko, ¶[0106] teaches determining which ads to display to users as determining an information item to be recommended in the candidate information items).

While Bilenko teaches determining … the candidate information items based on the CTR prediction result, and determining … an information item to be recommended in the candidate information items, it does not explicitly disclose determining the order of information items. Zhang teaches: determining an order of … information items (Zhang, page 203, paragraph 1, last 4 lines, "…the system is to predict…A recommendation … is an ordered list of recommended items, where we would want to see the next item as close to the top as possible.", teaches determining an ordered list of recommendations as determining an order of information items). It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Bilenko to include determining an order of … information items, based on the teachings of Zhang. One of ordinary skill in the art would have been motivated to make this modification in order to improve accuracy while simplifying feature engineering steps, as suggested by Zhang (Zhang, page 203, left column, paragraph 1, last line).

Regarding claim 14, Bilenko, in view of Zhang, discloses the method for recommending information as defined in claim 1. Bilenko further discloses: A non-transitory computer-readable storage medium, storing a computer program, wherein the computer program, when run by a processor, causes the processor to perform the method for recommending information as defined in claim 1 (Bilenko, Fig. 13 and ¶[0109-0114] teaches a computer-readable storage medium, storing a computer program, wherein the computer program, when run by a processor, causes the processor to perform the method for training information recommendation models as defined in claim 1).

Claim 15 is substantially similar to claim 1, and thus is rejected on the same basis as claim 1.
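
The claim 11 dispute reduces to ordering: Bilenko is cited for producing CTR predictions and selecting items, Zhang for returning an ordered list. The combined step is a sort over the prediction results; a minimal sketch with hypothetical item names and scores:

```python
def order_candidates(ctr_predictions: dict[str, float]) -> list[str]:
    """Determine an order of candidate information items from CTR prediction results."""
    return sorted(ctr_predictions, key=ctr_predictions.get, reverse=True)

ranked = order_candidates({"ad_a": 0.02, "ad_b": 0.07, "ad_c": 0.04})
item_to_recommend = ranked[0]  # "ad_b": the top-ordered candidate is recommended
```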
Regarding claim 16, Bilenko, in view of Zhang, discloses the method for recommending information as defined in claim 10. Bilenko further discloses: A computer device for predicting information, comprising: a memory, a processor, and a computer program that is stored in the memory and runnable in the processor, wherein the processor, when running the computer program, is caused to perform the method for recommending information as defined in claim 10 (Bilenko, Fig. 13 and ¶[0109-0114] teaches a computer device for predicting information, comprising: a memory, a processor, and a computer program that is stored in the memory and runnable in the processor, wherein the processor, when running the computer program, is caused to perform the method for recommending information as defined in claim 10).

Regarding claim 17, Bilenko, in view of Zhang, discloses the method for recommending information as defined in claim 10. Bilenko further discloses: A non-transitory computer-readable storage medium, storing a computer program, wherein the computer program, when run by a processor, causes the processor to perform the method for recommending information as defined in claim 10 (Bilenko, Fig. 13 and ¶[0109-0114] teaches a computer-readable storage medium, storing a computer program, wherein the computer program, when run by a processor, causes the processor to perform the method for predicting information as defined in claim 10).

Claim 21 is substantially similar to claim 5, and thus is rejected on the same basis as claim 5.

Response to Arguments

Applicant's arguments filed 12/09/2025 have been fully considered with regards to the 35 U.S.C. 101 rejection, and they are persuasive. The rejection is withdrawn.

Applicant's arguments filed 12/09/2025 have been fully considered with regards to the 35 U.S.C. 102/103 rejection, but they are not persuasive. The applicant asserts on page 23 of the remarks: "Therefore, the updating of the model in Bilenko is based on the master dataset, which includes the data acquired before and after the model deployment. In contrast, claim 1 recites updating the model only based on the data of the current training period. Thus, Bilenko does not teach the updating as recited in claim 1." In response to applicant's argument that the references fail to show certain features of the invention, it is noted that the features upon which applicant relies (i.e., updating a model without post-deployment data) are not recited in the rejected claim(s). Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993). The independent claims recite "acquiring a set of training samples corresponding to a current training period", the broadest reasonable interpretation (BRI) of which includes any data received during a current training cycle, and thus is taught by Bilenko (Fig. 1 and Fig. 7).
The applicant asserts on page 23 of the remarks that Bilenko does not teach statistical collection as disclosed in amended claim 1, and further asserts that "Zhang does not involve, disclose, imply, or teach the following contents in the training of an information recommendation model: embedding behavior statistics in the model to enable such statistics to be used synchronously with sample data for training the model; and using a time decay factor to improve the timeliness of the behavioral statistics embedded in the model in each training cycle, thereby enhancing the timeliness of the model's prediction results. Therefore, Zhang does not cure the failures of Bilenko to disclose or render obvious at least the acquiring the current behavior statistics based on a product of the first behavior statistics amounts and a predetermined time decay as recited in claim 1."

In response to applicant's arguments against the Bilenko and Zhang references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986). In particular, Bilenko discloses acquiring current behavior statistics data by performing statistical collection on the behavior data in the set of training samples (Fig. 1 and ¶[0003] teaches statistical information corresponding to feature information as current behavior statistics data acquired by performing statistical collection on the behavior data in the set of training samples), but does not teach calculating a product of the first … amounts and a predetermined time decay factor and superimposing the data on the product. Zhang teaches: calculating a product of the first … amounts and a predetermined time decay factor (page 206, left column, paragraph 2, last 2 lines, "The decay rates are multiplied by the output values from LSTM RNN layer to form the final outputs", teaches calculating a product of the first amounts and a predetermined time decay factor) and superimposing … data … on the product (Zhang, page 206, Equation 15 and 2 lines below Equation 15, "λ is the parameter that needs to be tuned during training, and controls the decay rate for the news."; page 202, right column, paragraph 2, last 3 lines, "we adopt time-decay function to reduce the weight of the historical news articles, and character-level encoding to alleviate sparsity problem."; and Fig. 1 teaches superimposing data on the product throughout training of the model recited in Figure 1).

Bilenko and Zhang are analogous art because they are from the same field of endeavor: feature engineering, recommendations, and machine learning models. It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Bilenko to include calculating a product of the first amounts and a predetermined time decay factor and superimposing data on the product, based on the teachings of Zhang. One of ordinary skill in the art would have been motivated to make this modification in order to improve accuracy while simplifying feature engineering steps, as suggested by Zhang (page 203, left column, paragraph 1, last line).
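
For reference, the Zhang passages quoted above describe multiplying model outputs by a tuned decay rate so that older items weigh less. A minimal sketch of that kind of weighting; the exponential form and the λ value are illustrative assumptions, and Zhang's Equation 15 is not reproduced here:

```python
import math

LAMBDA = 0.1  # λ, the tunable parameter controlling the decay rate (value assumed)

def time_decayed(output_value: float, age: float) -> float:
    """Multiply a model output by a time-decay rate so older items contribute less."""
    return output_value * math.exp(-LAMBDA * age)
```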
Furthermore, the examiner recognizes that obviousness may be established by combining or modifying the teachings of the prior art to produce the claimed invention where there is some teaching, suggestion, or motivation to do so found either in the references themselves or in the knowledge generally available to one of ordinary skill in the art. See In re Fine, 837 F.2d 1071, 5 USPQ2d 1596 (Fed. Cir. 1988); In re Jones, 958 F.2d 347, 21 USPQ2d 1941 (Fed. Cir. 1992); and KSR International Co. v. Teleflex, Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007).

Claims 10 and 15 are substantially similar to claim 1, and thus are rejected on the same basis. Claims dependent on the independent claims do not overcome the deficiencies of the rejected independent claims.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Pi et al. ("Practice on Long Sequential User Behavior Modeling for Click-Through Rate Prediction") teaches recommendation models and behavior data modeling.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to HUMAIRA ZAHIN MAUNI, whose telephone number is (703) 756-5654. The examiner can normally be reached Monday - Friday, 9 am - 5 pm (ET).

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, MATTHEW ELL, can be reached at (571) 270-3264. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/H.Z.M./
Examiner, Art Unit 2141

/MATTHEW ELL/
Supervisory Patent Examiner, Art Unit 2141

Prosecution Timeline

Jun 24, 2022 — Application Filed
Jun 02, 2025 — Non-Final Rejection (§103, §112)
Sep 04, 2025 — Response Filed
Oct 01, 2025 — Final Rejection (§103, §112)
Dec 09, 2025 — Response after Final Rejection (entered upon the RCE, per the current OA)
Jan 06, 2026 — Request for Continued Examination
Jan 14, 2026 — Response after Non-Final Action
Mar 16, 2026 — Non-Final Rejection (§103, §112) (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12585969 — GENERATING CONFIDENCE SCORES FOR MACHINE LEARNING MODEL PREDICTIONS
Granted Mar 24, 2026 (2y 5m to grant)
Study what changed to get past this examiner. Based on the 1 most recent grant.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 38%
With Interview: 99% (+66.7% lift)
Median Time to Grant: 4y 6m
PTA Risk: High
Based on 16 resolved cases by this examiner. Grant probability derived from career allow rate.

Free tier: 3 strategy analyses per month