Prosecution Insights
Last updated: April 18, 2026
Application No. 18/652,152

REINFORCED DATA TRAINING FOR GENERATIVE ARTIFICIAL INTELLIGENCE MODELS

Non-Final OA §101, §103
Filed
May 01, 2024
Examiner
IMMANUEL, ILSE I
Art Unit
3699
Tech Center
3600 — Transportation & Electronic Commerce
Assignee
Obrizum Group Ltd.
OA Round
1 (Non-Final)
Grant Probability: 23% (At Risk)
OA Rounds: 1-2
To Grant: 4y 7m
With Interview: 50%

Examiner Intelligence

Career Allow Rate: 23% (grants only 23% of cases; 68 granted / 293 resolved; -28.8% vs TC avg)
Interview Lift: +27.1% (strong; among resolved cases with interview)
Avg Prosecution: 4y 7m (typical timeline); 47 currently pending
Career History: 340 total applications across all art units

Statute-Specific Performance

§101: 26.7% (-13.3% vs TC avg)
§103: 35.4% (-4.6% vs TC avg)
§102: 6.0% (-34.0% vs TC avg)
§112: 30.0% (-10.0% vs TC avg)
Tech Center averages are estimates • Based on career data from 293 resolved cases
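Read each delta as the examiner's allow rate minus the Tech Center average; a quick check (assuming that convention, which the dashboard does not state explicitly) shows all four statutes imply the same ~40% TC baseline:

```python
# Implied Tech Center average per statute, assuming
# delta = examiner allow rate - TC average (a reading of the
# "vs TC avg" figures above, not a documented formula).
rows = [
    ("101", 26.7, -13.3),
    ("103", 35.4, -4.6),
    ("102", 6.0, -34.0),
    ("112", 30.0, -10.0),
]
for statute, rate, delta in rows:
    tc_avg = rate - delta
    print(f"Sec. {statute}: examiner {rate:.1f}%, implied TC avg {tc_avg:.1f}%")
```

Every row recovers a 40.0% implied average, which suggests the deltas were computed against a single blended TC figure rather than per-statute baselines.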

Office Action

§101 §103
DETAILED ACTION

Acknowledgements

This office action is in response to the claims filed 01/14/2026. Claims 1-7 are withdrawn. Claims 8-20 are elected. Claims 8-20 are pending. Claims 8-20 have been examined.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Restriction/Election Acknowledgement

The Applicant's election of claims 8-20 without traverse in the reply filed 01/14/2026 is acknowledged. Claims 1-7 are withdrawn from further consideration pursuant to 37 CFR 1.142(b) as being drawn to a nonelected group(s).

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 8-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Subject Matter Eligibility Standard

When considering subject matter eligibility under 35 U.S.C. § 101, it must be determined whether the claim is directed to one of the four statutory categories of invention, i.e., process, machine, manufacture, or composition of matter (101 Analysis: Step 1). Even if the claim does fall within one of the statutory categories, it must then be determined whether the claim is directed to a judicial exception (i.e., law of nature, natural phenomenon, or abstract idea) (101 Analysis: Step 2a (Prong 1)), and if so, identify whether there are any additional elements recited in the claim beyond the judicial exception(s), and evaluate those additional elements to determine whether they integrate the exception into a practical application of the exception (101 Analysis: Step 2a (Prong 2)).
If the additional elements do not integrate the exception into a practical application, the claim still requires an evaluation of whether it recites additional elements that amount to an inventive concept (i.e., "significantly more") than the recited judicial exception. If the claim as a whole amounts to significantly more than the exception itself (there is an inventive concept in the claim), the claim is eligible. If the claim as a whole does not amount to significantly more (there is no inventive concept in the claim), the claim is ineligible (101 Analysis: Step 2b). The 2019 PEG explains that the abstract idea exception includes the following groupings of subject matter: a) mathematical concepts, b) certain methods of organizing human activity, and c) mental processes.

Analysis

In the instant case, claim 8 is directed to a method, and claim 17 is directed to an article of manufacture.

Step 2a.1 – Identifying an Abstract Idea

The claims recite the steps of "receiving… options… providing, to a user device, a user interface that displays … receiving, … an allocation of credits … and performing reinforcement learning from human feedback …." The recited limitations fall within the certain methods of organizing human activity grouping of abstract ideas, specifically, the commercial behavior of receiving and using user feedback and reviews. Accordingly, the claims recite an abstract idea. See MPEP 2106.

Step 2a.2 – Identifying a Practical Application

The claim does currently recite an additional element, but the additional element does not integrate the judicial exception into a practical application.
The additional element is "based on the user feedback, performing reinforcement learning from human feedback training using a reward model to train the AI model." According to the disclosure (¶ 3, 16, 18, 34, 70), "Reinforcement learning from human feedback (RLHF) currently works by asking human labelers to select the single best completion to a given document, or the single best response to a model prompt to produce a product, such as artwork or freeform text… Reinforcement learning from human feedback is a means of applying additional policy on top of the intrinsic loss function used when training a machine learning model from a curated dataset without human feedback… Providing feedback for a system's current behavior using human feedback, also known as "Reinforcement Learning From Human Feedback" (RLHF), is an increasingly popular means of creating a reward structure to help shape desired behaviours for the AI models to learn. However, this approach is expensive in terms of both financial costs and time, often requiring hundreds or even thousands of person hours… The reward model provides the reinforcement learning aspect of RLHF."

First, the disclosure provides conflicting descriptions of what reinforcement learning from human feedback (RLHF) is; second, the receipt and use of human feedback is part of performing RLHF and having rewards. The disclosure does not set out what additional steps are included in "performing reinforcement learning from human feedback training using a reward model…" because this performance is part of the abstract idea of the commercial behavior of receiving and using user feedback and reviews. Giving rewards for feedback or reviews is likewise abstract and drawn to commercial behavior. Accordingly, even in combination, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
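For context, the RLHF loop the quoted disclosure describes (fit a reward model to human preference signals, then shift the generative policy toward higher-reward outputs) can be sketched roughly as follows; every name and number here is illustrative, drawn neither from the application nor from the cited references:

```python
# Toy RLHF sketch (hypothetical illustration, not the applicant's method):
# (1) fit a scalar reward per output option from human feedback,
# (2) nudge the policy's sampling weights toward higher-reward options.

def train_reward_model(preferences):
    """Average the human feedback given to each candidate output."""
    scores = {}
    for option, feedback in preferences:
        scores.setdefault(option, []).append(feedback)
    return {opt: sum(v) / len(v) for opt, v in scores.items()}

def policy_update(options, reward_model, lr=0.1):
    """Shift normalized sampling weights toward higher-reward options."""
    weights = {opt: 1.0 + lr * reward_model.get(opt, 0.0) for opt in options}
    total = sum(weights.values())
    return {opt: w / total for opt, w in weights.items()}

# Example: option "A" received stronger feedback than "B".
rm = train_reward_model([("A", 1.0), ("B", 0.2), ("A", 0.8)])
probs = policy_update(["A", "B"], rm)
```

Under this sketch, the policy ends up sampling "A" more often than "B", which is the behavior-shaping role the disclosure attributes to the reward model.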
Mere instructions to apply the exception using generic computer components, and limitations to a particular field of use or technological environment, do not amount to practical applications. The claim is directed to an abstract idea.

Step 2b

The claim limitations reciting "receiving… options… providing, to a user device, a user interface that displays … receiving, … an allocation of credits … and performing reinforcement learning from human feedback …." are not additional elements and amount to no more than mere instructions to apply the exception using a generic computer component. For the same reason, these elements are not sufficient to provide an inventive concept. This is also determined to be well-understood, routine and conventional activity in the field. The Symantec, TLI, and OIP Techs court decisions cited in MPEP 2106.05(d)(II) indicate that mere receipt or transmission of data over a network is a well-understood, routine and conventional function when it is claimed in a merely generic manner, as it is here. Therefore, when considering the additional elements alone and in combination, there is no inventive concept in the claim, and thus the claim is not eligible. Viewed as a whole, the method claims recite the concept of a commercial behavior as performed by a generic computer. The claims do not currently recite any additional elements, or combination of additional elements, that amount to significantly more than the judicial exception. The elements used to perform the claimed judicial exception amount to no more than mere instructions to implement the abstract idea in a network, merely use a network as a tool to perform an abstract idea, and/or generally link the use of the judicial exception to a particular environment. Dependent claims 9, 10, 12, 14, 16, and 18 provide descriptive language surrounding the abstract idea.
As such, these elements do not provide the significantly more to the underlying abstract idea necessary to render the invention patentable. Dependent claims 11, 13, 15, 19 and 20 describe in more detail the functions of the steps geared toward the abstract idea. As such, these elements likewise do not provide the significantly more to the underlying abstract idea necessary to render the invention patentable. The claims do not, for example, purport to improve the functioning of the computer itself. Nor do they effect an improvement in any other technology or technical field. Therefore, based on case law precedent, the claims are claiming subject matter similar to concepts already identified by the courts as dealing with abstract ideas. See Alice Corp. Pty. Ltd., 573 U.S. 208 (citing Bilski v. Kappos, 561 U.S. 593, 611 (2010)). The claims at issue amount to nothing significantly more than an instruction to apply the abstract idea using some unspecified, generic computer. See Alice Corp. Pty. Ltd., 573 U.S. 208. Mere instructions to apply the exception using a generic computer component, and limitations to a particular field of use or technological environment, cannot integrate a judicial exception into a practical application at Step 2A or provide an inventive concept at Step 2B. The use of a computer or processor to merely automate and/or implement the abstract idea cannot provide significantly more than the abstract idea itself (MPEP 2106.05(I)(A)(f) & (h)). Therefore, the claim is not patent eligible.

Conclusion

The claim as a whole does not amount to significantly more than the abstract idea itself. This is because the claim does not effect an improvement to another technology or technical field; the claim does not amount to an improvement to the functioning of a computer system itself; and the claim does not move beyond a general link of the use of an abstract idea to a particular technological environment.
Accordingly, the Examiner concludes that there are no meaningful limitations in the claim that transform the judicial exception into a patent-eligible application such that the claim amounts to significantly more than the judicial exception itself. The dependent claims do not resolve the deficiency of the independent claims and accordingly stand rejected under 35 USC 101 based on the same rationale. Dependent claims 9-16 and 18-20 are also rejected.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 8-20 are rejected under 35 U.S.C. 103 as being unpatentable over Aggarwal et al. (US 20200341976) ("Aggarwal"), and further in view of Heine et al. (US 20180129724) ("Heine").

Regarding claims 8 and 17, Aggarwal discloses receiving, from an artificial intelligence (AI) model, a plurality of output options (¶ 16, 44-50, 107-110). Aggarwal: In an example, the AI module 128 is trained using the Asynchronous Advantage Actor-Critic (A3C) algorithm, which uses RL. The AI module 128 transmits information associated with the agent action to the NLP engine 126, which translates it to a format that is presentable in the UI 300b. The NLP engine 126 then transmits a formatted response to the module 156, where the formatted response may be based on the selected action. In some cases, the formatted response may also include the requested search results.
(¶ 46); providing, to a user device, a user interface that displays at least the plurality of output options associated with a prompt to be apportioned to the plurality of output options (¶ 16, 65-95, 110-121). Aggarwal: This interaction takes place over multiple interaction cycles with the user, where in a given cycle the RL-based agent prompts, and the user responds (or the user provides an input, and the RL-based agent responds) (¶ 16); receiving, from the user device via the user interface, an allocation of credits to each output option in the plurality of output options, each allocation of credits ranging between a first amount of credits and a second amount of credits, and the allocations of credits comprising user feedback on the plurality of output options (¶ 65-82, 113-121, 141-170). Aggarwal: User actions may be categorized into two or more feedback categories, such as good, average, bad, etc. (or may be scaled on a scale of 1 to 5, with 5 being best or as intended by the agent 229, and 1 being worst). For example, if the agent 229 prompts the user 101 to refine a search query and the user does follow the prompt, then the agent 229 receives a relatively high extrinsic reward rextrinsic, e.g., because the user 101 played along with the agent 229. On the other hand, if the user 101 refuses to refine the search query, a relatively low (or zero, or even negative) extrinsic reward rextrinsic is awarded to the agent 229 (¶ 71); based on the user feedback, performing reinforcement learning from human feedback training using a reward model to train the AI model (¶ 22, 49, 65-80, 141-170). Aggarwal: the AI module 128 (e.g., the agent 229) is trained using the A3C algorithm, which uses RL. For example, RL is used to select an action of the agent 229, in response to input received from the user 101… Thus, the agent prompts the user to complete one or more auxiliary tasks, during interaction or conversation of the agent with the user.
Rewards awarded to the agent, based on the user performing one or more auxiliary tasks, are generally referred to herein as an auxiliary reward... wherein the artificial intelligence module is to: train the RL model using rewards, wherein rewards awarded during a search episode include a first reward for successful completion of the search episode, a second reward based on user response to an action selected by the RL model, and a third reward for completion of an auxiliary task identified by the RL model (¶ 22, 49, 170). Aggarwal does not disclose an amount of credits. Heine teaches providing, to a user device, a user interface that displays at least the plurality of output options associated with a prompt and an amount of credits to be apportioned to the plurality of output options (¶ 22-40, 42-47, 52-60, 64-66). Heine: the queries received from the requesting users include reward values assigned by the requesting user, wherein the reward values are offered to suggesting users in exchange for description suggestions… Ultimately, with enough entries (e.g., requests and suggestions), the crowd assisted query system may train an image search AI that would then be able to identify items based on a visual search alone (¶ 22, 28). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Aggarwal and Heine in order to assist and train AI in search queries (Heine; ¶ 1, 28).

Regarding claim 9, Aggarwal discloses wherein the first amount of credits is zero and the second amount of credits is a maximum amount of credits that can be allocated to a respective output option (¶ 65-75).
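As a rough illustration of the claimed credit-allocation feedback (a budget of credits split across the model's output options, bounded per option as in claim 9, then usable as preference weights), one might normalize allocations like this; the function name, bounds, and option labels are hypothetical, not taken from the application:

```python
# Hypothetical sketch of credit-allocation feedback: the user assigns
# each output option a credit amount between a minimum and a maximum,
# and the shares become normalized preference weights for training.

def allocate_feedback(allocations, min_credits=0, max_credits=100):
    """Validate per-option credit allocations and normalize to weights."""
    for option, credits in allocations.items():
        if not (min_credits <= credits <= max_credits):
            raise ValueError(f"{option}: {credits} outside "
                             f"[{min_credits}, {max_credits}]")
    total = sum(allocations.values())
    if total == 0:
        # No credits deployed: uniform, uninformative feedback.
        return {opt: 1 / len(allocations) for opt in allocations}
    return {opt: c / total for opt, c in allocations.items()}

# Example: 100 credits split 70/20/10 across three output options.
weights = allocate_feedback({"option_1": 70, "option_2": 20, "option_3": 10})
```

Compared with the single-best-choice labeling the disclosure criticizes, a graded split like this carries more information per labeling action, which is presumably the point of the credit mechanism.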
Regarding claims 10 and 18, Aggarwal discloses wherein the user interface further comprises at least one of: a credits deployed indicator indicating a total number of credits deployed; a time to completion indicator indicating an estimated time to completion; a points indicator indicating a total number of earned points; a conversion component configured to convert earned points into available credits; or an ablation indicator, the ablation indicator configured to indicate a portion of at least one output option for ablation (¶ 16, 65-95, 110-121).

Regarding claim 11, Aggarwal discloses awarding one or more points based on a determination that the user feedback reduces uncertainty of the AI model (¶ 19, 73, 101, 162).

Regarding claim 12, Aggarwal discloses wherein: the method further comprises recording at least one of: a time-to-answer for the allocation of credits; or a speed of a movement of an input device; and the time-to-answer or the speed of the movement are considered in the determination that the user feedback reduces uncertainty of the AI model (¶ 66-68, 81-85, 95).

Regarding claim 13, Aggarwal discloses subtracting one or more points based on a determination that the user feedback does not reduce uncertainty of the AI model (¶ 68-71).

Regarding claim 14, Aggarwal discloses wherein the AI model comprises a large language model (¶ 30, 34, 45, 46, 136; claim 14).

Regarding claim 15, Aggarwal discloses receiving, via the user interface, audio input or text input as additional feedback about one or more output options (¶ 14, 16, 45, 46, 64, 120).

Regarding claim 16, Aggarwal discloses wherein each output option comprises one or more of: natural language; an image; or a video (¶ 14, 16, 45, 46, 64, 120).
Regarding claim 19, Aggarwal discloses wherein the memory stores further instructions for: awarding one or more points based on the determination that the user feedback reduces the uncertainty of the AI model; and subtracting one or more points based on a determination that the user feedback does not reduce the uncertainty of the AI model (¶ 65-71).

Regarding claim 20, Aggarwal discloses wherein the memory stores further instructions for comparing the user feedback to confidence-weighted user feedback associated with a group of users prior to awarding the one or more points (¶ 90-106).

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Edwards et al. (US 20240354646) teaches user feedback in training AI. Bechtel et al. (US 8239228) teaches awarding users points for collaborating on results.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ILSE I IMMANUEL, whose telephone number is (469) 295-9094. The examiner can normally be reached Monday-Friday, 9:00 am to 5:00 pm. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, NEHA H PATEL, can be reached on (571) 270-1492. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov.
Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ILSE I IMMANUEL/
Primary Examiner, Art Unit 3699

Prosecution Timeline

May 01, 2024
Application Filed
Apr 03, 2026
Non-Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12586062
MULTI-BLOCKCHAIN TOKEN REBALANCER
2y 5m to grant Granted Mar 24, 2026
Patent 12555106
DIGITIZATION OF PAYMENT CARDS FOR WEB 3.0 AND METAVERSE TRANSACTIONS
2y 5m to grant Granted Feb 17, 2026
Patent 12555117
ARCHITECTURES, SYSTEMS, AND METHODS FOR CARD BASED TRANSACTIONS
2y 5m to grant Granted Feb 17, 2026
Patent 12443942
SYSTEMS AND METHODS OF BLOCKCHAIN TRANSACTION RECORDATION
2y 5m to grant Granted Oct 14, 2025
Patent 12430635
SYSTEMS AND METHODS FOR AN ACCOUNT ISSUER TO MANAGE A MOBILE WALLET
2y 5m to grant Granted Sep 30, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 23%
With Interview: 50% (+27.1%)
Median Time to Grant: 4y 7m
PTA Risk: Low
Based on 293 resolved cases by this examiner. Grant probability derived from career allow rate.
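The projection figures appear to combine by simple arithmetic (assuming, as this dashboard does not state outright, that the with-interview number is the career allow rate plus the interview lift, with rounding):

```python
# How the projection numbers above may relate (an assumption about
# the dashboard's arithmetic, not a documented formula).
career_allow = 68 / 293        # career allow rate: 68 granted of 293 resolved
interview_lift = 0.271         # +27.1 points reported with interview
with_interview = career_allow + interview_lift
print(f"base {career_allow:.0%}, with interview {with_interview:.0%}")
```

68/293 rounds to 23%, and adding the 27.1-point lift lands at roughly 50%, matching the two displayed figures.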
