DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is responsive to the Application filed on 6/20/2023. Claims 1-20 are pending in the case. Claims 1, 8, and 15 are independent claims.
Claim Interpretation
Each independent claim recites a method step of “obtaining a target sample, wherein the target sample comprises:” followed by one clause describing the contents of the target sample and several other clauses describing other method steps. The indentation of these following clauses implies that all are comprised by the “obtaining” step. However, logic dictates that the target sample cannot contain method steps; only the one clause describing the contents of the target sample is comprised by the “obtaining” step. The Examiner suggests correcting the indentation for clarity.
Claim Objections
Claims 9 and 16 are objected to because of the following informalities:
Claim 9 recites “comprising” where “comprising one or more instructions executable by a computer system to perform one or more operations, comprising” was apparently intended.
Claim 16 recites “comprising” where “comprising one or more instructions that, when executed by the one or more computers, perform one or more operations, comprising” was apparently intended.
Appropriate correction is required.
Claim Rejections - 35 U.S.C. § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. §§ 102 and 103 (or as subject to pre-AIA 35 U.S.C. §§ 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. § 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-6, 8-13, and 15-20 are rejected under 35 U.S.C. § 102(a)(1) as being anticipated by Hu et al. (CN 111523044 A, citations to attached machine translation, hereinafter Hu).
As to independent claim 1, Hu discloses a computer-implemented method for prediction model training, comprising:
obtaining a target sample (“generating input data based on the acquired user behavior information, target object information, and environmental information in a predetermined time interval,” page 1 last line to page 2 line 2), wherein the target sample comprises:
a sample feature, a first label, and a second label, wherein a user corresponds to the target sample, wherein the first label indicates whether a target object is clicked on by the user, and wherein the second label indicates whether the user implements a target behavior related to the target object (“The input data includes at least the user's click feature, browsing feature, purchase feature, and target object feature for the target object,” page 2 lines 2-4);
performing model processing on the sample feature by using a prediction model, wherein the prediction model comprises a first branch and a second branch, wherein the first branch outputs a first probability that the target object is clicked on by the user, and wherein the second branch outputs a second probability that the user implements the target behavior (“via the first neural network The network model extracts the features of the sub-input data to predict the user’s click probability. The first neural network model is optimized based on the first loss function; through the second neural network model, the features of the sub-input data are extracted to predict the target The conversion probability of the object, the second neural network model is optimized based on the second loss function, the second neural network model and the first neural network model at least share the embedding layer (embedding),” page 2 lines 7-13);
determining a first loss based on a first label value of the first label and the first probability (“The first neural network model is optimized based on the first loss function,” page 2 lines 8-9); and
when a predetermined condition is satisfied (“The input data includes at least the user's click feature,” page 2 lines 2-3):
determining a second loss based on a second label value of the second label and the second probability (“the second neural network model is optimized based on the second loss function,” page 2 lines 11-12); and
determining a predicted loss of the target sample based on the first loss and the second loss, wherein the predetermined condition comprises the first label value, which indicates that the target object is clicked on by the user (“The third loss function is determined based on the first loss function, the first predetermined weight, the second loss function, and the second predetermined weight,” page 2 lines 16-18); and
training the prediction model based on the predicted loss (“based on the user’s click probability, the information about the target object The conversion probability and the third loss function predict the recommended probability of the target object,” page 2 lines 13-15).
As to dependent claim 2, Hu further discloses a method comprising:
when the predetermined condition is not satisfied (“The input data includes at least the user's click feature,” page 2 lines 2-3), determining a third loss based on a first product of the first label value and the second label value and a second product of the first probability and the second probability (“Loss3=W1* Loss1+ W2* Loss2,” page 12 last line); and
determining the predicted loss based on the first loss and the third loss (“Loss3=W1* Loss1+ W2* Loss2,” page 12 last line).
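For context only, the weighted-sum loss quoted from Hu at page 12 (Loss3 = W1*Loss1 + W2*Loss2) may be sketched as follows. The sketch is purely illustrative and does not limit the interpretation of the reference: the binary cross-entropy form of Loss1 and Loss2, the weight values, and all numeric inputs are assumptions, not drawn from Hu or from the claims.

```python
import math

def bce(label, prob):
    """Binary cross-entropy for a single example (illustrative form only)."""
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))

# W1 and W2 are the predetermined weights in Hu's quoted formula
# Loss3 = W1*Loss1 + W2*Loss2 (page 12); values here are arbitrary.
W1, W2 = 0.5, 0.5
loss1 = bce(1, 0.8)   # click label vs. predicted click probability
loss2 = bce(0, 0.3)   # conversion label vs. predicted conversion probability
loss3 = W1 * loss1 + W2 * loss2
```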
As to dependent claim 3, Hu further discloses a method wherein the predetermined condition comprises: the second label value indicates that the user does not implement the target behavior (“The input data includes at least … purchase feature,” page 2 lines 2-3).
As to dependent claim 4, Hu further discloses a method wherein:
the prediction model comprises an embedding layer (“embedding layer (embedding),” page 2 line 13); and
the model processing comprises:
encoding the sample feature into an embedding vector by using the embedding layer (“mapping the click feature, the browsing feature and the purchase feature, the target object feature, and the environmental feature into vectors via the same embedding matrix,” page 3 lines 5-7); and
inputting the embedding vector separately into the first branch and the second branch (“the second neural network model and the first neural network model at least share the embedding layer (embedding),” page 2 lines 12-13).
As to dependent claim 5, Hu further discloses a method wherein:
the sample feature comprises:
a user feature of the user and an object feature of the target object (“The input data includes at least the user's click feature, browsing feature, purchase feature, and target object feature for the target object,” page 2 lines 2-4); and
encoding the sample feature into an embedding vector by using the embedding layer, comprises:
encoding the user feature into a first vector (“Using the same embedding matrix to provide mapping vectors for multiple fields,” page 16 lines 1-2);
encoding the object feature into a second vector (“Using the same embedding matrix to provide mapping vectors for multiple fields,” page 16 lines 1-2); and
aggregating the first vector and the second vector to obtain the embedding vector (“Using the same embedding matrix to provide mapping vectors for multiple fields,” page 16 lines 1-2).
As to dependent claim 6, Hu further discloses a method wherein:
the sample feature comprises:
an interaction feature of the user and the target object (“The input data includes at least the user's click feature, browsing feature, purchase feature, and target object feature for the target object,” page 2 lines 2-4); and
encoding the sample feature into an embedding vector by using the embedding layer comprises:
encoding the interaction feature into a third vector, wherein the embedding vector is obtained based on the third vector (“Using the same embedding matrix to provide mapping vectors for multiple fields,” page 16 lines 1-2).
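For context only, the shared-embedding encoding mapped to claims 4-6 (one embedding matrix providing vectors for multiple feature fields, which are aggregated into a single embedding vector) may be sketched as follows. All dimensions, identifiers, and the choice of summation as the aggregation are illustrative assumptions, not drawn from Hu or from the claims.

```python
import numpy as np

rng = np.random.default_rng(0)
# One shared embedding matrix providing vectors for multiple fields,
# as in Hu's "same embedding matrix" disclosure; sizes are arbitrary.
vocab_size, dim = 100, 8
embedding = rng.normal(size=(vocab_size, dim))

# Illustrative feature indices for the user feature, object feature,
# and interaction feature.
user_feature_id, object_feature_id, interaction_feature_id = 3, 42, 77
first_vector = embedding[user_feature_id]
second_vector = embedding[object_feature_id]
third_vector = embedding[interaction_feature_id]

# Aggregate (here: sum) to obtain the embedding vector that would be
# input separately into the first branch and the second branch.
embedding_vector = first_vector + second_vector + third_vector
```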
As to independent claim 8, Hu discloses a non-transitory, computer-readable medium storing one or more instructions executable by a computer system (“methods, computing devices, and computer storage media,” page 1 section “Technical field” line 2) to perform one or more operations, comprising:
obtaining a target sample (“generating input data based on the acquired user behavior information, target object information, and environmental information in a predetermined time interval,” page 1 last line to page 2 line 2), wherein the target sample comprises:
a sample feature, a first label, and a second label, wherein a user corresponds to the target sample, wherein the first label indicates whether a target object is clicked on by the user, and wherein the second label indicates whether the user implements a target behavior related to the target object (“The input data includes at least the user's click feature, browsing feature, purchase feature, and target object feature for the target object,” page 2 lines 2-4);
performing model processing on the sample feature by using a prediction model, wherein the prediction model comprises a first branch and a second branch, wherein the first branch outputs a first probability that the target object is clicked on by the user, and wherein the second branch outputs a second probability that the user implements the target behavior (“via the first neural network The network model extracts the features of the sub-input data to predict the user’s click probability. The first neural network model is optimized based on the first loss function; through the second neural network model, the features of the sub-input data are extracted to predict the target The conversion probability of the object, the second neural network model is optimized based on the second loss function, the second neural network model and the first neural network model at least share the embedding layer (embedding),” page 2 lines 7-13);
determining a first loss based on a first label value of the first label and the first probability (“The first neural network model is optimized based on the first loss function,” page 2 lines 8-9); and
when a predetermined condition is satisfied (“The input data includes at least the user's click feature,” page 2 lines 2-3):
determining a second loss based on a second label value of the second label and the second probability (“the second neural network model is optimized based on the second loss function,” page 2 lines 11-12); and
determining a predicted loss of the target sample based on the first loss and the second loss, wherein the predetermined condition comprises the first label value, which indicates that the target object is clicked on by the user (“The third loss function is determined based on the first loss function, the first predetermined weight, the second loss function, and the second predetermined weight,” page 2 lines 16-18); and
training the prediction model based on the predicted loss (“based on the user’s click probability, the information about the target object The conversion probability and the third loss function predict the recommended probability of the target object,” page 2 lines 13-15).
As to dependent claim 9, Hu further discloses a medium comprising:
when the predetermined condition is not satisfied (“The input data includes at least the user's click feature,” page 2 lines 2-3), determining a third loss based on a first product of the first label value and the second label value and a second product of the first probability and the second probability (“Loss3=W1* Loss1+ W2* Loss2,” page 12 last line); and
determining the predicted loss based on the first loss and the third loss (“Loss3=W1* Loss1+ W2* Loss2,” page 12 last line).
As to dependent claim 10, Hu further discloses a medium wherein the predetermined condition comprises: the second label value indicates that the user does not implement the target behavior (“The input data includes at least … purchase feature,” page 2 lines 2-3).
As to dependent claim 11, Hu further discloses a medium wherein:
the prediction model comprises an embedding layer (“embedding layer (embedding),” page 2 line 13); and
the model processing comprises:
encoding the sample feature into an embedding vector by using the embedding layer (“mapping the click feature, the browsing feature and the purchase feature, the target object feature, and the environmental feature into vectors via the same embedding matrix,” page 3 lines 5-7); and
inputting the embedding vector separately into the first branch and the second branch (“the second neural network model and the first neural network model at least share the embedding layer (embedding),” page 2 lines 12-13).
As to dependent claim 12, Hu further discloses a medium wherein:
the sample feature comprises:
a user feature of the user and an object feature of the target object (“The input data includes at least the user's click feature, browsing feature, purchase feature, and target object feature for the target object,” page 2 lines 2-4); and
encoding the sample feature into an embedding vector by using the embedding layer, comprises:
encoding the user feature into a first vector (“Using the same embedding matrix to provide mapping vectors for multiple fields,” page 16 lines 1-2);
encoding the object feature into a second vector (“Using the same embedding matrix to provide mapping vectors for multiple fields,” page 16 lines 1-2); and
aggregating the first vector and the second vector to obtain the embedding vector (“Using the same embedding matrix to provide mapping vectors for multiple fields,” page 16 lines 1-2).
As to dependent claim 13, Hu further discloses a medium wherein:
the sample feature comprises:
an interaction feature of the user and the target object (“The input data includes at least the user's click feature, browsing feature, purchase feature, and target object feature for the target object,” page 2 lines 2-4); and
encoding the sample feature into an embedding vector by using the embedding layer comprises:
encoding the interaction feature into a third vector, wherein the embedding vector is obtained based on the third vector (“Using the same embedding matrix to provide mapping vectors for multiple fields,” page 16 lines 1-2).
As to independent claim 15, Hu discloses a computer-implemented system, comprising:
one or more computers (“methods, computing devices, and computer storage media,” page 1 section “Technical field” line 2); and
one or more computer memory devices interoperably coupled with the one or more computers and having tangible, non-transitory, machine-readable media storing one or more instructions (“methods, computing devices, and computer storage media,” page 1 section “Technical field” line 2) that, when executed by the one or more computers, perform one or more operations, comprising:
obtaining a target sample (“generating input data based on the acquired user behavior information, target object information, and environmental information in a predetermined time interval,” page 1 last line to page 2 line 2), wherein the target sample comprises:
a sample feature, a first label, and a second label, wherein a user corresponds to the target sample, wherein the first label indicates whether a target object is clicked on by the user, and wherein the second label indicates whether the user implements a target behavior related to the target object (“The input data includes at least the user's click feature, browsing feature, purchase feature, and target object feature for the target object,” page 2 lines 2-4);
performing model processing on the sample feature by using a prediction model, wherein the prediction model comprises a first branch and a second branch, wherein the first branch outputs a first probability that the target object is clicked on by the user, and wherein the second branch outputs a second probability that the user implements the target behavior (“via the first neural network The network model extracts the features of the sub-input data to predict the user’s click probability. The first neural network model is optimized based on the first loss function; through the second neural network model, the features of the sub-input data are extracted to predict the target The conversion probability of the object, the second neural network model is optimized based on the second loss function, the second neural network model and the first neural network model at least share the embedding layer (embedding),” page 2 lines 7-13);
determining a first loss based on a first label value of the first label and the first probability (“The first neural network model is optimized based on the first loss function,” page 2 lines 8-9); and
when a predetermined condition is satisfied (“The input data includes at least the user's click feature,” page 2 lines 2-3):
determining a second loss based on a second label value of the second label and the second probability (“the second neural network model is optimized based on the second loss function,” page 2 lines 11-12); and
determining a predicted loss of the target sample based on the first loss and the second loss, wherein the predetermined condition comprises the first label value, which indicates that the target object is clicked on by the user (“The third loss function is determined based on the first loss function, the first predetermined weight, the second loss function, and the second predetermined weight,” page 2 lines 16-18); and
training the prediction model based on the predicted loss (“based on the user’s click probability, the information about the target object The conversion probability and the third loss function predict the recommended probability of the target object,” page 2 lines 13-15).
As to dependent claim 16, Hu further discloses a system comprising:
when the predetermined condition is not satisfied (“The input data includes at least the user's click feature,” page 2 lines 2-3), determining a third loss based on a first product of the first label value and the second label value and a second product of the first probability and the second probability (“Loss3=W1* Loss1+ W2* Loss2,” page 12 last line); and
determining the predicted loss based on the first loss and the third loss (“Loss3=W1* Loss1+ W2* Loss2,” page 12 last line).
As to dependent claim 17, Hu further discloses a system wherein the predetermined condition comprises: the second label value indicates that the user does not implement the target behavior (“The input data includes at least … purchase feature,” page 2 lines 2-3).
As to dependent claim 18, Hu further discloses a system wherein:
the prediction model comprises an embedding layer (“embedding layer (embedding),” page 2 line 13); and
the model processing comprises:
encoding the sample feature into an embedding vector by using the embedding layer (“mapping the click feature, the browsing feature and the purchase feature, the target object feature, and the environmental feature into vectors via the same embedding matrix,” page 3 lines 5-7); and
inputting the embedding vector separately into the first branch and the second branch (“the second neural network model and the first neural network model at least share the embedding layer (embedding),” page 2 lines 12-13).
As to dependent claim 19, Hu further discloses a system wherein:
the sample feature comprises:
a user feature of the user and an object feature of the target object (“The input data includes at least the user's click feature, browsing feature, purchase feature, and target object feature for the target object,” page 2 lines 2-4); and
encoding the sample feature into an embedding vector by using the embedding layer, comprises:
encoding the user feature into a first vector (“Using the same embedding matrix to provide mapping vectors for multiple fields,” page 16 lines 1-2);
encoding the object feature into a second vector (“Using the same embedding matrix to provide mapping vectors for multiple fields,” page 16 lines 1-2); and
aggregating the first vector and the second vector to obtain the embedding vector (“Using the same embedding matrix to provide mapping vectors for multiple fields,” page 16 lines 1-2).
As to dependent claim 20, Hu further discloses a system wherein:
the sample feature comprises:
an interaction feature of the user and the target object (“The input data includes at least the user's click feature, browsing feature, purchase feature, and target object feature for the target object,” page 2 lines 2-4); and
encoding the sample feature into an embedding vector by using the embedding layer comprises:
encoding the interaction feature into a third vector, wherein the embedding vector is obtained based on the third vector (“Using the same embedding matrix to provide mapping vectors for multiple fields,” page 16 lines 1-2).
Claim Rejections - 35 U.S.C. § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. §§ 102 and 103 (or as subject to pre-AIA 35 U.S.C. §§ 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 C.F.R. § 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. § 102(b)(2)(C) for any potential 35 U.S.C. § 102(a)(2) prior art against the later invention.
Claims 7 and 14 are rejected under 35 U.S.C. § 103 as being unpatentable over Hu in view of Lardeux et al. (US 2020/0134696 A1, hereinafter Lardeux).
As to dependent claim 7, the rejection of claim 2 is incorporated.
Hu does not appear to expressly teach a method wherein the prediction model comprises: a gating unit and a product calculation unit, wherein the gating unit blocks a target path when the predetermined condition is satisfied and conducts the target path when the predetermined condition is not satisfied, and wherein the target path is used to transmit the first probability and the second probability to the product calculation unit to calculate the second product.
Lardeux teaches a method wherein the prediction model comprises: a gating unit and a product calculation unit, wherein the gating unit blocks a target path when the predetermined condition is satisfied and conducts the target path when the predetermined condition is not satisfied, and wherein the target path is used to transmit the first probability and the second probability to the product calculation unit to calculate the second product (“embodiments of the invention thus employ two-level cascaded machine learning models, … The second-level machine learning model has the benefit of additional features that may be derived from the output of the first-level machine learning model, which may enable improved predictions of user behaviors, such as item selection (e.g. ‘click’) and list conversion (e.g. purchase) rates,” paragraph 0016 lines 1-2, 5-10).
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the model of Hu to comprise the cascade of Lardeux. (1) The Examiner finds that the prior art included each claim element listed above, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. (2) The Examiner finds that one of ordinary skill in the art could have combined the elements as claimed by known software development methods, and that in combination, each element merely performs the same function as it does separately. (3) The Examiner finds that one of ordinary skill in the art would have recognized that the results of the combination were predictable, namely calculating the product only when relevant (“embodiments of the invention thus employ two-level cascaded machine learning models, … The second-level machine learning model has the benefit of additional features that may be derived from the output of the first-level machine learning model, which may enable improved predictions of user behaviors, such as item selection (e.g. ‘click’) and list conversion (e.g. purchase) rates,” Lardeux paragraph 0016 lines 1-2, 5-10). Therefore, the rationale to support a conclusion that the claim would have been obvious is the combining of prior art elements according to known methods to yield predictable results to one of ordinary skill in the art. See MPEP § 2143(I)(A).
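For context only, the conditional loss computation recited in claims 2 and 7 (a gating unit that blocks the target path when the predetermined condition is satisfied and conducts it otherwise, so that the probabilities reach a product calculation unit only when needed) may be sketched as follows. The sketch illustrates the claim language, not the prior art; the squared-error form of the third loss and all names are assumptions, as the claims do not specify a functional form.

```python
def predicted_loss(first_label, second_label, first_prob, second_prob,
                   first_loss, second_loss):
    """Illustrative gating: the condition is that the first label value
    indicates the target object is clicked on by the user."""
    if first_label == 1:
        # Condition satisfied: target path blocked; predicted loss is
        # determined from the first loss and the second loss.
        return first_loss + second_loss
    # Condition not satisfied: target path conducts; the first and second
    # probabilities reach the product calculation unit. The third loss is
    # based on the product of the label values and the product of the
    # probabilities (squared error here is an assumed example).
    first_product = first_label * second_label
    second_product = first_prob * second_prob
    third_loss = (first_product - second_product) ** 2
    return first_loss + third_loss
```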
As to dependent claim 14, the rejection of claim 9 is incorporated.
Hu does not appear to expressly teach a medium wherein the prediction model comprises: a gating unit and a product calculation unit, wherein the gating unit blocks a target path when the predetermined condition is satisfied and conducts the target path when the predetermined condition is not satisfied, and wherein the target path is used to transmit the first probability and the second probability to the product calculation unit to calculate the second product.
Lardeux teaches a medium wherein the prediction model comprises: a gating unit and a product calculation unit, wherein the gating unit blocks a target path when the predetermined condition is satisfied and conducts the target path when the predetermined condition is not satisfied, and wherein the target path is used to transmit the first probability and the second probability to the product calculation unit to calculate the second product (“embodiments of the invention thus employ two-level cascaded machine learning models, … The second-level machine learning model has the benefit of additional features that may be derived from the output of the first-level machine learning model, which may enable improved predictions of user behaviors, such as item selection (e.g. ‘click’) and list conversion (e.g. purchase) rates,” paragraph 0016 lines 1-2, 5-10).
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the model of Hu to comprise the cascade of Lardeux. (1) The Examiner finds that the prior art included each claim element listed above, although not necessarily in a single prior art reference, with the only difference between the claimed invention and the prior art being the lack of actual combination of the elements in a single prior art reference. (2) The Examiner finds that one of ordinary skill in the art could have combined the elements as claimed by known software development methods, and that in combination, each element merely performs the same function as it does separately. (3) The Examiner finds that one of ordinary skill in the art would have recognized that the results of the combination were predictable, namely calculating the product only when relevant (“embodiments of the invention thus employ two-level cascaded machine learning models, … The second-level machine learning model has the benefit of additional features that may be derived from the output of the first-level machine learning model, which may enable improved predictions of user behaviors, such as item selection (e.g. ‘click’) and list conversion (e.g. purchase) rates,” Lardeux paragraph 0016 lines 1-2, 5-10). Therefore, the rationale to support a conclusion that the claim would have been obvious is the combining of prior art elements according to known methods to yield predictable results to one of ordinary skill in the art. See MPEP § 2143(I)(A).
Conclusion
The following prior art made of record and not relied upon is considered pertinent to Applicant’s disclosure:
US 2021/0264319 A1, disclosing a split machine learning module for predicting user behavior.
Applicant is required under 37 C.F.R. § 1.111(c) to consider these references fully when responding to this action.
It is noted that any citation to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. In re Heck, 699 F.2d 1331, 1332-33, 216 U.S.P.Q. 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 U.S.P.Q. 275, 277 (C.C.P.A. 1968)).
In the interests of compact prosecution, Applicant is invited to contact the examiner via electronic media pursuant to USPTO policy outlined in MPEP § 502.03. All electronic communication must be authorized in writing. Applicant may wish to file an Internet Communications Authorization Form PTO/SB/439. Applicant may wish to request an interview using the Interview Practice website: http://www.uspto.gov/patent/laws-and-regulations/interview-practice.
Applicant is reminded Internet e-mail may not be used for communication for matters under 35 U.S.C. § 132 or which otherwise require a signature. A reply to an Office action may NOT be communicated by Applicant to the USPTO via Internet e-mail. If such a reply is submitted by Applicant via Internet e-mail, a paper copy will be placed in the appropriate patent application file with an indication that the reply is NOT ENTERED. See MPEP § 502.03(II).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Ryan Barrett, whose telephone number is (571) 270-3311. The examiner can normally be reached 9:00 am to 5:30 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, Applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michelle Bechtold, can be reached at (571) 431-0762. The fax phone number for the organization where this application or proceeding is assigned is (571) 273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Ryan Barrett/
Primary Examiner, Art Unit 2148