DETAILED ACTION
In the response filed August 22, 2025, the Applicant amended claims 1, 9, and 17. Claims 1, 2, 4-10, and 12-22 are pending in the current application.
Notice of AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments for claims 1, 2, 4-10, and 12-22 with respect to the 35 U.S.C. 101 rejection have been considered. Under the current interpretation of the amended claims (see the claim interpretation below), the claims recite a combination of additional elements that together serve to integrate the abstract idea into a practical application.
Specifically, the additional elements recite a specific manner of training a reinforcement learning-based machine learning model for determining advertising campaign actions. Thus, the claims are eligible because each claim as a whole meaningfully integrates the method of organizing human activity into a practical application.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f):
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f). The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f). The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) because the claim limitations use a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitations are:
“providing, by a computing system, to a reinforcement learning-based machine learning model, observations received from an online advertisement environment;” recited in claim 1.
“receiving, by the computing system, from the reinforcement learning-based machine learning model, one or more budget allocation actions…” recited in claim 1.
“updating, by the computing system during said training…” recited in claim 1.
“receiving, by the computing system from said actor after said training…” recited in claim 1.
“performing, by the computing system based on the one or more budget allocation actions received…” recited in claim 1.
“providing, by a computing system, to a reinforcement learning-based machine learning model, observations received from an online advertisement environment;” recited in claim 9.
“receiving, by the computing system, from the reinforcement learning-based machine learning model, one or more bid update actions…” recited in claim 9.
“updating, by the computing system during said training…” recited in claim 9.
“receiving, by the computing system from said actor after said training…” recited in claim 9.
“automatically tendering, by the computing system based on the one or more bid update actions…” recited in claim 9.
“providing, by a computing system, to a reinforcement learning-based machine learning model, observations received from an online advertisement environment;” recited in claim 17.
“receiving, by the computing system, from the reinforcement learning-based machine learning model, one or more bid multiplier actions…” recited in claim 17.
“generating, by the computing system, using the critic, one or more of a cost per action error signal, a pacing error signal, or a spend rate error signal…” recited in claim 17.
“updating, by the computing system during said training…” recited in claim 17.
“receiving, by the computing system from said actor after said training…” recited in claim 17.
“automatically tendering, by the computing system based on the one or more bid multiplier actions…” recited in claim 17.
Because these claim limitations are being interpreted under 35 U.S.C. 112(f), they are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof. If applicant does not intend to have these limitations interpreted under 35 U.S.C. 112(f), Applicant may: (1) amend the claim limitations to avoid them being interpreted under 35 U.S.C. 112(f) (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitations recite sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f).
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1, 2, 4-10, and 12-22 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Claims 1, 9, and 17 recite the limitation "said training" in the claim language “updating, by the computing system during said training using the generated temporal difference error signal, one or more parameters of the actor, wherein said parameters include one or more neural network weights.” There is insufficient antecedent basis for this limitation in the claim. The claim as written does not recite a specific training step. The claim does recite a step of providing observations to a machine learning model in training and a step of receiving one or more budget allocation actions from the machine learning model in training. It is unclear whether the limitation “said training” refers to the providing step, the receiving step, or both. As such, the claim is indefinite for failing to distinctly claim the invention.
Claims 1, 9, and 17 recite the limitation "said trained actor" in lines 23, 24, and 25, respectively. There is insufficient antecedent basis for this limitation in the claim. It is unclear what the trained actor refers to. The claim as written does not recite a specific trained actor. The claim does recite “said actor after said training,” but it is unclear whether the limitation “said trained actor” refers to the limitation “wherein the actor learns during said training” or the limitation “receiving, by the computing system from said actor after said training.” As such, the claim is indefinite for failing to distinctly claim the invention.
Dependent claims 2, 4-6, 10, 12-14, and 18-20, which depend from claims 1, 9, and 17, inherit the deficiencies noted for claims 1, 9, and 17.
For purposes of examination, the claims are interpreted to read as follows:
1. A computer-implemented method, comprising:
training, by a computing system comprising one or more processors, a reinforcement learning-based machine learning model to seek a policy that minimizes a penalty reward issued by the online advertisement environment,
wherein the training of the reinforcement learning-based machine learning model includes:
generating a temporal difference error signal,
receiving, by an actor, the generated temporal difference error signal from a critic,
updating, by using the generated temporal difference error signal, one or more parameters of the actor, wherein said parameters include one or more neural network weights,
training the actor by learning one or more sequences of budget movement decisions, wherein the budget movement decisions redistribute a budget among at least one of ad campaigns, or ad sets of an ad campaign,
providing to the reinforcement learning-based machine learning model, observations received from an online advertisement environment, and
receiving from the reinforcement learning-based machine learning model, one or more budget allocation actions;
receiving, by the computing system from the actor after the training, one or more budget allocation actions; and
performing, by the computing system based on the one or more budget allocation actions received from said trained actor, one or more of:
automatically reducing budget at a first ad campaign and increasing budget at a second ad campaign, or
automatically reducing budget at a first ad set and increasing budget at a second ad set.
9. A computer-implemented method, comprising:
training, by a computing system comprising one or more processors, a reinforcement learning-based machine learning model to seek a policy that minimizes a penalty reward issued by the online advertisement environment,
wherein the training of the reinforcement learning-based machine learning model includes:
generating a temporal difference error signal,
receiving, by an actor, the generated temporal difference error signal from a critic,
updating, by using the generated temporal difference error signal, one or more parameters of the actor, wherein said parameters include one or more neural network weights; and
training the actor by learning one or more sequences of budget movement decisions, wherein the budget movement decisions redistribute a budget among at least one of ad campaigns, or ad sets of an ad campaign,
providing to the reinforcement learning-based machine learning model, observations received from an online advertisement environment, and
receiving from the reinforcement learning-based machine learning model, one or more budget allocation actions;
receiving, by the computing system from the actor after the training, one or more budget allocation actions; and
performing, by the computing system based on the one or more budget allocation actions received from said trained actor, one or more of:
automatically reducing budget at a first ad campaign and increasing budget at a second ad campaign, or
automatically reducing budget at a first ad set and increasing budget at a second ad set.
17. A computer-implemented method, comprising:
training, by a computing system comprising one or more processors, a reinforcement learning-based machine learning model to seek a policy that minimizes a penalty reward issued by the online advertisement environment,
wherein the training of the reinforcement learning-based machine learning model includes:
generating a temporal difference error signal,
receiving, by an actor, the generated temporal difference error signal from a critic,
updating, by using the generated temporal difference error signal, one or more parameters of the actor, wherein said parameters include one or more neural network weights; and
training the actor by learning one or more sequences of budget movement decisions, wherein the budget movement decisions redistribute a budget among at least one of ad campaigns, or ad sets of an ad campaign, and
providing to the reinforcement learning-based machine learning model, observations received from an online advertisement environment;
receiving from the reinforcement learning-based machine learning model, one or more budget allocation actions;
receiving, by the computing system from the actor after the training, one or more budget allocation actions; and
performing, by the computing system based on the one or more budget allocation actions received from said trained actor, one or more of:
automatically reducing budget at a first ad campaign and increasing budget at a second ad campaign, or
automatically reducing budget at a first ad set and increasing budget at a second ad set.
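For context only, the actor-critic pattern recited in the interpreted claims (a critic generating a temporal difference error signal that is then used to update the actor's parameters, with the actor learning budget movement decisions) can be sketched in general terms. The toy two-campaign environment, the assumed optimal split, and all hyperparameters below are hypothetical illustrations of the general technique and do not represent the Applicant's claimed implementation.

```python
import math
import random

# Hypothetical environment (illustrative only): the observation is the
# fraction of a shared budget allocated to a first ad campaign, an action
# shifts budget between two campaigns, and the environment issues a
# penalty reward based on distance from an assumed optimal split.
OPTIMAL_SPLIT = 0.7            # assumed optimum, unknown to the agent
ACTIONS = [-0.1, 0.0, 0.1]     # budget movement decisions

def step(split, shift):
    """Apply a budget movement decision; return (new state, penalty reward)."""
    new_split = min(1.0, max(0.0, split + shift))
    return new_split, -abs(new_split - OPTIMAL_SPLIT)

def bucket(split):
    """Discretize the observation into one of 11 states."""
    return min(10, int(split * 10))

def softmax(prefs):
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

random.seed(0)
critic = [0.0] * 11                                 # V(s): critic value estimates
actor = [[0.0] * len(ACTIONS) for _ in range(11)]   # actor parameters ("weights")
alpha_critic, alpha_actor, gamma = 0.1, 0.1, 0.9

for episode in range(2000):
    split = random.random()
    for _ in range(20):
        s = bucket(split)
        probs = softmax(actor[s])
        a = random.choices(range(len(ACTIONS)), weights=probs)[0]
        split, reward = step(split, ACTIONS[a])
        s_next = bucket(split)
        # The critic generates the temporal difference error signal.
        td_error = reward + gamma * critic[s_next] - critic[s]
        critic[s] += alpha_critic * td_error
        # The actor's parameters are updated using that TD error signal.
        for i in range(len(ACTIONS)):
            grad = (1.0 if i == a else 0.0) - probs[i]
            actor[s][i] += alpha_actor * td_error * grad
```

After training, acting greedily on the actor's preferences redistributes budget toward the assumed optimum; the sketch shows only the general actor-critic loop, not any specific claimed structure.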
Allowable Subject Matter
Claims 1, 2, 4-10, and 12-22 would constitute allowable subject matter if amended to overcome the rejections under 35 U.S.C. 112(b) as set forth in this Office action.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Patrick Kim whose telephone number is (571)272-8619. The examiner can normally be reached Monday - Friday, 9AM - 5PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Resha Desai can be reached at (571)270-7792. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Patrick Kim/Examiner, Art Unit 3628 /RESHA DESAI/Supervisory Patent Examiner, Art Unit 3628