Last updated: May 29, 2026

Application No. 18/341,398

REINFORCEMENT MACHINE LEARNING WITH MULTI-LEVEL AGENT SEARCH AND HYPERPARAMETER OPTIMIZATION

Non-Final OA §103

Filed

Jun 26, 2023

Examiner

BRAHMACHARI, MANDRITA

Art Unit

2144

Tech Center

2100 — Computer Architecture & Software

Assignee

International Business Machines Corporation

OA Round

1 (Non-Final)

Interview Optional

Examiner has a relatively high allowance rate (76%) and a +29.8% interview lift. A written response may suffice.

Based on 408 resolved cases, 2023–2026

Examiner Intelligence

BRAHMACHARI, MANDRITA

Grants 76% — above average

Career Allowance Rate

312 granted / 408 resolved

+21.5% vs TC avg

Strong +30% interview lift

+29.8%

Interview Lift (resolved cases with interview vs. without)

Typical timeline

2y 11m

Avg Prosecution

21 currently pending

Career history

436

Total Applications

across all art units

Statute-Specific Performance

§101

0.4%

-39.6% vs TC avg

§103

85.6%

+45.6% vs TC avg

§102

2.1%

-37.9% vs TC avg

§112

2.0%

-38.0% vs TC avg

Based on career data from 408 resolved cases

Office Action

§103

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This action is in response to the claims dated 6/26/2023.
Claims pending in the case: 1-20


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1, 3-8, 10-14, 16-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wu (US 20220366318) in view of Liu (Adaptive and Efficient GPU Time Sharing for Hyperparameter Tuning in Cloud – please refer to attached for claim mapping).

Regarding claim 1, Wu teaches a method of training machine learning models comprising:
identifying, via at least one processor, a plurality of configurations for the machine learning models, wherein each configuration indicates a machine learning model and a corresponding technique to determine parameters for the machine learning model (Wu: Fig. 2 [28-29]: identify hyperparameter combinations; learned parameters of the ML models are based on the hyperparameters (technique));
evaluating, via the at least one processor, the plurality of configurations by training the machine learning model of the plurality of configurations according to the parameters determined by the corresponding technique (Wu: Fig. 2 [28-29]: evaluate by training); 
monitoring, via the at least one processor, performance of the machine learning models of the plurality of configurations (Wu: Fig. 2 [38]: monitor model performance); and 
adjusting, via the at least one processor, resources used for evaluating at least one configuration based on the performance of the machine learning model for the at least one configuration relative to the priority of the machine learning models of others of the plurality of configurations (Wu: [37]: adjust resources based on priority);
However, Wu does not specifically teach adjusting the resources based on the performance of the machine learning model.
Liu teaches adjusting resources based on the performance of the machine learning model (Liu: Pg. 1 col 2 section 1 [2], Pg. 2 col 1 [1-3], Pg. 4 col 1 section 2.1 [2]: adjust resource based on accuracy/performance during hyperparameter tuning).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Wu and Liu because the combination would enable adjusting the resources used based not only on priority but also on performance. One of ordinary skill in the art would have been motivated to combine the teachings because the combination would provide an option to adjust resources adaptively according to accuracy at run time (see Liu Pg. 1 col 1 [1]).
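For orientation, the four steps recited in claim 1 (identify configurations, evaluate by training, monitor performance, adjust resources by relative performance) can be sketched as a small loop. This is an illustrative sketch only: it is not taken from the application, Wu, or Liu, and every name in it (Configuration, train_steps, adjust_resources, the budget unit, the simulated scores) is a hypothetical stand-in.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Configuration:
    """Hypothetical candidate: a model plus the technique that picks its parameters."""
    model: str
    technique: str
    budget: int = 4          # training steps granted per round (assumed unit)
    score: float = 0.0       # most recently monitored performance
    history: list = field(default_factory=list)


def train_steps(cfg: Configuration, steps: int, rng: random.Random) -> float:
    """Stand-in for real training: the score grows with the number of steps."""
    gain = sum(rng.uniform(0.0, 0.05) for _ in range(steps))
    cfg.score = min(1.0, cfg.score + gain)
    cfg.history.append(cfg.score)
    return cfg.score


def adjust_resources(configs: list) -> None:
    """Shift budget toward configurations at or above the group mean score."""
    mean = sum(c.score for c in configs) / len(configs)
    for c in configs:
        if c.score >= mean:
            c.budget += 2
        else:
            c.budget = max(1, c.budget - 2)


def run(rounds: int = 3, seed: int = 0) -> list:
    rng = random.Random(seed)
    configs = [  # step 1: identify configurations
        Configuration("mlp", "random_search"),
        Configuration("cnn", "grid_search"),
        Configuration("gbt", "bayesian"),
    ]
    for _ in range(rounds):
        for c in configs:
            train_steps(c, c.budget, rng)   # steps 2-3: evaluate and monitor
        adjust_resources(configs)           # step 4: adjust by relative performance
    return configs
```

The comparison against the group mean is one arbitrary choice of "relative performance"; a ranking or a fixed threshold would fit the same outline.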

Regarding claim 3, Wu and Liu teach the invention as claimed in claim 1 above and, wherein adjusting the resources for evaluating the at least one configuration comprises: terminating the evaluation of the at least one configuration based on the at least one configuration having a machine learning model with lesser performance relative to the performance of the machine learning models of others of the plurality of configurations (Liu: Pg. 2 col 2 section 1 [3], Pg. 4 col 2 [1]: low accuracy terminated).
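The termination step recited in claim 3 can be illustrated as a rank-and-prune operation, similar in spirit to successive halving. The sketch below is an assumption made for illustration, not the method of Liu or of the application; terminate_low_performers and keep_fraction are hypothetical names.

```python
def terminate_low_performers(scores: dict, keep_fraction: float = 0.5) -> tuple:
    """Split configurations into (survivors, terminated) by relative performance.

    scores maps a configuration name to its monitored performance.
    At least one configuration always survives.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:keep], ranked[keep:]
```

For example, with scores of 0.9, 0.4, 0.7 and 0.2 for configurations a through d, the default keep_fraction retains a and c and terminates b and d.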

Regarding claim 4, Wu and Liu teach the invention as claimed in claim 1 above and, wherein monitoring performance of the machine learning models comprises:
pausing evaluation of the plurality of configurations at an intermediate portion of the evaluation; and producing a report for the performance of the machine learning models of the plurality of configurations (Wu: [38]: generate performance report after training (pause)) (Liu: Pg. 3 col 1, Pg. 5 section 2.3 [1]: approaches using checkpoints pause at checkpoints to save report files).

Regarding claim 5, Wu and Liu teach the invention as claimed in claim 4 above and, wherein adjusting the resources for evaluating the at least one configuration comprises:
allocating additional resources to resume the evaluation of the at least one configuration based on the at least one configuration having a machine learning model with greater performance relative to the performance of the machine learning models of others of the plurality of configurations (Wu: [37]: adjust resources based on priority) (Liu: Pg. 3 col 1 [4], Pg. 2 col 1 [1-3], Pg. 4 col 1 section 2.1 [2]: adjust resource based on accuracy/performance during hyperparameter tuning).
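Claims 4 and 5 together describe pausing at an intermediate point, producing a performance report, and resuming the stronger configurations with additional resources. A minimal sketch of that sequence follows, under the assumption that a "report" is a ranked list and a "resource" is an integer budget; checkpoint_report, resume_budgets, base, bonus and top_k are hypothetical names, not terms from the claims or the cited art.

```python
def checkpoint_report(scores: dict, step: int) -> list:
    """Build a ranked report of all configurations at a paused intermediate step."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [
        {"step": step, "rank": i + 1, "config": name, "score": score}
        for i, (name, score) in enumerate(ranked)
    ]


def resume_budgets(report: list, base: int = 2, bonus: int = 4, top_k: int = 1) -> dict:
    """Grant the top_k ranked configurations extra budget when evaluation resumes."""
    return {
        row["config"]: base + (bonus if row["rank"] <= top_k else 0)
        for row in report
    }
```

With scores of 0.6, 0.8 and 0.3 for configurations a, b and c, the report ranks b first, and resume_budgets gives b a budget of 6 while a and c keep the base budget of 2.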

Regarding claim 6, Wu and Liu teach the invention as claimed in claim 1 above and, further comprising: 
identifying one or more configurations with a machine learning model providing greater performance relative to performance of machine learning models of others of the plurality of configurations (Wu: Fig. 4 [38, 40]: select based on performance) (Liu: Pg. 4 col 1 section 2.1 [1]: select the highest performer set).

Regarding claim 7, Wu and Liu teach the invention as claimed in claim 1 above and, further comprising:
controlling, via the at least one processor, the evaluation of a configuration based on defining a search space for the evaluation of the configuration comprising a set of machine learning hyperparameters and value ranges for the set of machine learning hyperparameters (Wu: [32, 37]: define search space) (Liu: Pg. 2 col section 1 [1]: search space).
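The search space recited in claim 7 (a set of hyperparameters with value ranges) can be pictured as a mapping from hyperparameter names to ranges, from which candidate settings are drawn. The hyperparameter names, ranges, and the sample helper below are assumptions chosen for illustration and do not come from the application, Wu, or Liu.

```python
import random

# Hypothetical search space: each entry is (kind, range or choices).
SEARCH_SPACE = {
    "learning_rate": ("float", 1e-4, 1e-1),
    "batch_size": ("choice", [16, 32, 64]),
    "num_layers": ("int", 1, 4),
}


def sample(space: dict, rng: random.Random) -> dict:
    """Draw one hyperparameter setting from within the declared value ranges."""
    setting = {}
    for name, spec in space.items():
        kind = spec[0]
        if kind == "float":
            setting[name] = rng.uniform(spec[1], spec[2])
        elif kind == "int":
            setting[name] = rng.randint(spec[1], spec[2])  # inclusive bounds
        elif kind == "choice":
            setting[name] = rng.choice(spec[1])
        else:
            raise ValueError(f"unknown hyperparameter kind: {kind}")
    return setting
```

Uniform random sampling is used here only because it is the simplest technique to show; the claim does not limit how the space is searched.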

Regarding Claim(s) 8, 10-13, this/these claim(s) is/are similar in scope as claim(s) 1, 4-7, respectively. Therefore, this/these claim(s) is/are rejected under the same rationale.

Regarding Claim(s) 14, 16-20, this/these claim(s) is/are similar in scope as claim(s) 1, 3-7, respectively. Therefore, this/these claim(s) is/are rejected under the same rationale.

Claim(s) 2, 9, 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wu (US 20220366318) and Liu (Adaptive and Efficient GPU Time Sharing for Hyperparameter Tuning in Cloud – please refer to attached for claim mapping) in view of Worth (US 20240289639).

Regarding claim 2, Wu and Liu teach the invention as claimed in claim 1 above and, wherein the machine learning model of one or more configurations includes a … learning agent and the corresponding technique includes a hyperparameter optimization technique, … (Wu: Fig. 2 [38]: monitor model performance; tune the hyperparameters (learning agent)) (Liu: Pg. 2 col 2 [3]: iterations of training);
However, Wu and Liu do not specifically teach, configurations includes a reinforcement learning agent;
and wherein the one or more configurations further indicate an environment for the reinforcement learning agent;
Worth teaches, configurations includes a reinforcement learning agent (Worth: Fig. 2a, [41-42, 72]: RL agent using performance);
and wherein the one or more configurations further indicate an environment for the reinforcement learning agent (Worth: Fig. 2a, [43, 47]: RL agent using information on environment);
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Wu, Liu and Worth because the combination would enable using an agent in the hyperparameter tuning process. One of ordinary skill in the art would have been motivated to combine the teachings because the combination would provide an option to “automatically adapt a behavior of a learning agent to optimize and/or improve performance in an environment” (see Worth [2-3]).
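To make the claim 2 limitation concrete, a configuration can be modeled as a triple of reinforcement learning agent, hyperparameter optimization technique, and environment. The sketch below is hypothetical: RLConfiguration, enumerate_configurations, and the example agent, technique and environment names are illustrative assumptions, not disclosures of the application or of Worth.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class RLConfiguration:
    """Hypothetical configuration: agent, tuning technique, and environment."""
    agent: str
    hpo_technique: str
    environment: str


def enumerate_configurations(agents, techniques, environments) -> list:
    """List every agent / technique / environment combination to evaluate."""
    return [
        RLConfiguration(agent, technique, environment)
        for agent, technique, environment in product(agents, techniques, environments)
    ]
```

Calling enumerate_configurations(["dqn", "ppo"], ["random_search", "bayesian"], ["CartPole"]) yields four configurations, each of which could then be trained, monitored, and resourced in the manner recited in claim 1.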

Regarding Claim(s) 9 and 15, this/these claim(s) is/are similar in scope as claim(s) 2. Therefore, this/these claim(s) is/are rejected under the same rationale.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure in attached 892.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MANDRITA BRAHMACHARI whose telephone number is (571)272-9735. The examiner can normally be reached Monday to Friday, 11 am to 8 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tamara Kyle, can be reached at 571 272 4241. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/Mandrita Brahmachari/Primary Examiner, Art Unit 2144

Prosecution Timeline

Jun 26, 2023

Application Filed

Apr 01, 2026

Non-Final Rejection mailed — §103

May 28, 2026

Interview Requested

Precedent Cases

Applications granted by this same examiner with similar technology

17/382,108

Patent 12632696

SYSTEM, CIRCUIT, DEVICE AND/OR PROCESSES FOR ACCUMULATING NEURAL NETWORK SIGNALS

4y 10m to grant Granted May 19, 2026

17/878,096

Patent 12626109

EVENT-DRIVEN ACCELERATOR SUPPORTING INHIBITORY SPIKING NEURAL NETWORK

3y 9m to grant Granted May 12, 2026

18/060,095

Patent 12614094

SYSTEM AND METHOD FOR MANAGING DATA PROCESSING SYSTEMS HOSTING DISTRIBUTED INFERENCE MODELS

3y 5m to grant Granted Apr 28, 2026

17/747,672

Patent 12608588

METHODS, SYSTEMS, AND MEDIA FOR CONTEXTUAL DISCRIMINATIVE EXPLANATION OF CONVOLUTIONAL NEURAL NETWORKS

3y 11m to grant Granted Apr 21, 2026

17/600,603

Patent 12596746

AUDIO PREVIEWING METHOD, APPARATUS AND STORAGE MEDIUM

4y 6m to grant Granted Apr 07, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

1-2

Expected OA Rounds

76%

Grant Probability

99%

With Interview (+29.8%)

2y 11m (~0m remaining)

Median Time to Grant

Low

PTA Risk

Based on 408 resolved cases by this examiner. Grant probability derived from career allowance rate.
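The projection figures can be checked against the career counts reported above. The sketch below assumes that the interview lift is added as percentage points and that the displayed value is capped at 99%; both are assumptions inferred from the displayed numbers, not documented behavior of the tool, and the function names are hypothetical.

```python
def allowance_rate(granted: int, resolved: int) -> float:
    """Career allowance rate as a percentage of resolved cases."""
    return 100.0 * granted / resolved


def with_interview(rate: float, lift_points: float, cap: float = 99.0) -> float:
    """Add the interview lift in percentage points, capped (assumed cap of 99)."""
    return min(cap, rate + lift_points)


rate = allowance_rate(312, 408)       # 312 granted out of 408 resolved
boosted = with_interview(rate, 29.8)  # uncapped sum would exceed 100
```

Rounded to the nearest whole percent, the rate matches the 76% grant probability shown, and the capped sum matches the 99% "With Interview" figure.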