Prosecution Insights
Last updated: May 29, 2026
Application No. 18/341,398

REINFORCEMENT MACHINE LEARNING WITH MULTI-LEVEL AGENT SEARCH AND HYPERPARAMETER OPTIMIZATION

Non-Final OA §103
Filed: Jun 26, 2023
Examiner: BRAHMACHARI, MANDRITA
Art Unit: 2144
Tech Center: 2100 (Computer Architecture & Software)
Assignee: International Business Machines Corporation
OA Round: 1 (Non-Final)
Grant Probability: 76% (Favorable)
OA Rounds: 1-2
Est. Remaining: 0m
With Interview: 99%

Examiner Intelligence

Grants 76%, above average
Career Allowance Rate: 76% (312 granted / 408 resolved; +21.5% vs TC avg)
Strong +30% interview lift
Interview Lift: +29.8% (resolved cases with interview)
Typical timeline
Avg Prosecution: 2y 11m
Currently pending: 21
Career history
Total Applications: 436 (across all art units)

Statute-Specific Performance

§101: 0.4% (-39.6% vs TC avg)
§103: 85.6% (+45.6% vs TC avg)
§102: 2.1% (-37.9% vs TC avg)
§112: 2.0% (-38.0% vs TC avg)
Based on career data from 408 resolved cases

Office Action

§103
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
The action is in response to claims dated 6/26/2023.
Claims pending in the case: 1-20
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 1, 3-8, 10-14, 16-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wu (US 20220366318) in view of Liu (Adaptive and Efficient GPU Time Sharing for Hyperparameter Tuning in Cloud – please refer to attached for claim mapping).
Regarding Claim 1, Wu teaches,
A method of training machine learning models comprising:
identifying, via at least one processor, a plurality of configurations for the machine learning models, wherein each configuration indicates a machine learning model and a corresponding technique to determine parameters for the machine learning model (Wu: Fig. 2 [28-29]: identify hyperparameter combinations; learned parameters of the ML models are based on the hyperparameters (technique));
evaluating, via the at least one processor, the plurality of configurations by training the machine learning model of the plurality of configurations according to the parameters determined by the corresponding technique (Wu: Fig. 2 [28-29]: evaluate by training);
monitoring, via the at least one processor, performance of the machine learning models of the plurality of configurations (Wu: Fig. 2 [38]: monitor model performance); and
adjusting, via the at least one processor, resources used for evaluating at least one configuration based on the performance of the machine learning model for the at least one configuration relative to the priority of the machine learning models of others of the plurality of configurations (Wu: [37]: adjust resources based on priority);
However, Wu does not specifically teach, based on the priority of performance of the machine learning model.
Liu teaches, adjusting resources based on the performance of the machine learning model (Liu: Pg. 1 col 2 section 1 [2], Pg. 2 col 1 [1-3], Pg. 4 col 1 section 2.1 [2]: adjust resource based on accuracy/performance during hyperparameter tuning);
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Wu and Liu because the combination would enable adjusting the resources used based not only on priority but also on performance. One of ordinary skill in the art would have been motivated to combine the teachings because the combination would provide an option to adjust resources adaptively according to accuracy at run time (see Liu Pg. 1 col 1 [1]).
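As a rough illustration of the claim 1 limitation at issue (adjusting evaluation resources based on each model's performance relative to the others), the following Python sketch splits a fixed budget across configurations in proportion to their monitored scores. It is a minimal sketch under stated assumptions: the names `Trial` and `adjust_resources`, the proportional rule, and the sample scores are all hypothetical and are not taken from the application, Wu, or Liu.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Trial:
    name: str     # hypothetical label for one model + tuning-technique configuration
    score: float  # latest monitored performance, higher is better
    budget: int   # resource units (e.g. GPU time slices) currently assigned


def adjust_resources(trials, total_budget):
    """Split total_budget across trials in proportion to relative score."""
    if not trials:
        return []
    total_score = sum(t.score for t in trials)
    if total_score <= 0:
        # No performance signal yet: fall back to an even split.
        share = total_budget // len(trials)
        return [replace(t, budget=share) for t in trials]
    return [
        replace(t, budget=round(total_budget * t.score / total_score))
        for t in trials
    ]


# Illustrative scores only; not data from any cited reference.
trials = [Trial("cfg-a", 0.50, 10), Trial("cfg-b", 0.25, 10), Trial("cfg-c", 0.25, 10)]
adjusted = adjust_resources(trials, total_budget=40)
print([(t.name, t.budget) for t in adjusted])
```

A proportional split is only one possible policy; the rejection does not say which allocation rule either reference uses.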
Regarding claim 3, Wu and Liu teach the invention as claimed in claim 1 above and, wherein adjusting the resources for evaluating the at least one configuration comprises: terminating the evaluation of the at least one configuration based on the at least one configuration having a machine learning model with lesser performance relative to the performance of the machine learning models of others of the plurality of configurations (Liu: Pg. 2 col 2 section 1 [3], Pg. 4 col 2 [1]: low accuracy terminated).
Regarding claim 4, Wu and Liu teach the invention as claimed in claim 1 above and, wherein monitoring performance of the machine learning models comprises: pausing evaluation of the plurality of configurations at an intermediate portion of the evaluation; and producing a report for the performance of the machine learning models of the plurality of configurations (Wu: [38]: generate performance report after training (pause)) (Liu: Pg. 3 col 1, Pg. 5 section 2.3 [1]: approaches using checkpoints pause at checkpoints to save report files).
Regarding claim 5, Wu and Liu teach the invention as claimed in claim 4 above and, wherein adjusting the resources for evaluating the at least one configuration comprises: allocating additional resources to resume the evaluation of the at least one configuration based on the at least one configuration having a machine learning model with greater performance relative to the performance of the machine learning models of others of the plurality of configurations (Wu: [37]: adjust resources based on priority) (Liu: Pg. 3 col 1 [4], Pg. 2 col 1 [1-3], Pg. 4 col 1 section 2.1 [2]: adjust resource based on accuracy/performance during hyperparameter tuning).
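The limitations of claims 3 through 5 (pause at an intermediate point, report performance, terminate lesser performers, give additional resources to better performers) can be sketched as a single checkpoint step. This is a hedged illustration that resembles one round of successive halving, a technique swapped in here for clarity; it is not asserted to be what the application, Wu, or Liu implement. The function name `checkpoint_round`, the `keep_fraction` parameter, and the sample values are hypothetical.

```python
def checkpoint_round(scores, budgets, keep_fraction=0.5):
    """One pause point: rank configurations, terminate the low performers,
    and hand their freed budget to the survivors before resuming."""
    # Rank configuration names from best to worst monitored score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    survivors, terminated = ranked[:n_keep], ranked[n_keep:]

    # Budget released by terminated configurations is shared evenly.
    freed = sum(budgets[name] for name in terminated)
    bonus = freed // n_keep
    new_budgets = {name: budgets[name] + bonus for name in survivors}

    # The "report" produced at the pause point.
    report = {"ranking": ranked, "terminated": terminated}
    return new_budgets, report


# Illustrative values only.
scores = {"a": 0.9, "b": 0.4, "c": 0.7, "d": 0.2}
budgets = {"a": 10, "b": 10, "c": 10, "d": 10}
new_budgets, report = checkpoint_round(scores, budgets)
print(new_budgets)
print(report["terminated"])
```

Here the two weaker configurations are stopped and the two stronger ones resume with a larger budget, which mirrors the terminate and allocate limitations mapped to Liu in the rejection.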
Regarding claim 6, Wu and Liu teach the invention as claimed in claim 1 above and, further comprising: identifying one or more configurations with a machine learning model providing greater performance relative to performance of machine learning models of others of the plurality of configurations (Wu: Fig. 4 [38, 40]: select based on performance) (Liu: Pg. 4 col 1 section 2.1 [1]: select the highest performer set).
Regarding claim 7, Wu and Liu teach the invention as claimed in claim 1 above and, further comprising: controlling, via the at least one processor, the evaluation of a configuration based on defining a search space for the evaluation of the configuration comprising a set of machine learning hyperparameters and value ranges for the set of machine learning hyperparameters (Wu: [32, 37]: define search space) (Liu: Pg. 2 col section 1 [1]: search space).
Regarding Claim(s) 8, 10-13, this/these claim(s) is/are similar in scope as claim(s) 1, 4-7, respectively. Therefore, this/these claim(s) is/are rejected under the same rationale.
Regarding Claim(s) 14, 16-20, this/these claim(s) is/are similar in scope as claim(s) 1, 3-7, respectively. Therefore, this/these claim(s) is/are rejected under the same rationale.
Claim(s) 2, 9, 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wu (US 20220366318) and Liu (Adaptive and Efficient GPU Time Sharing for Hyperparameter Tuning in Cloud – please refer to attached for claim mapping) in view of Worth (US 20240289639).
Regarding claim 2, Wu and Liu teach the invention as claimed in claim 1 above and, wherein the machine learning model of one or more configurations includes a … learning agent and the corresponding technique includes a hyperparameter optimization technique, … (Wu: Fig. 2 [38]: monitor model performance; tune the hyperparameters (learning agent)) (Liu: Pg. 2 col 2 [3]: iterations of training);
However, Wu and Liu do not specifically teach, configurations includes a reinforcement learning agent; and wherein the one or more configurations further indicate an environment for the reinforcement learning agent;
Worth teaches, configurations includes a reinforcement learning agent (Worth: Fig. 2a, [41-42, 72]: RL agent using performance); and wherein the one or more configurations further indicate an environment for the reinforcement learning agent (Worth: Fig. 2a, [43, 47]: RL agent using information on environment);
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Wu, Liu and Worth because the combination would enable using an agent in the hyperparameter tuning process. One of ordinary skill in the art would have been motivated to combine the teachings because the combination would provide an option to “automatically adapt a behavior of a learning agent to optimize and/or improve performance in an environment” (see Worth [2-3]).
Regarding Claim(s) 9 and 15, this/these claim(s) is/are similar in scope as claim(s) 2. Therefore, this/these claim(s) is/are rejected under the same rationale.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure in attached 892.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MANDRITA BRAHMACHARI whose telephone number is (571)272-9735. The examiner can normally be reached Monday to Friday, 11 am to 8 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tamara Kyle, can be reached at 571 272 4241. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Mandrita Brahmachari/
Primary Examiner, Art Unit 2144

Prosecution Timeline

Jun 26, 2023: Application Filed
Apr 01, 2026: Non-Final Rejection mailed (§103)
May 28, 2026: Interview Requested

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12632696
SYSTEM, CIRCUIT, DEVICE AND/OR PROCESSES FOR ACCUMULATING NEURAL NETWORK SIGNALS
4y 10m to grant; granted May 19, 2026
Patent 12626109
EVENT-DRIVEN ACCELERATOR SUPPORTING INHIBITORY SPIKING NEURAL NETWORK
3y 9m to grant; granted May 12, 2026
Patent 12614094
SYSTEM AND METHOD FOR MANAGING DATA PROCESSING SYSTEMS HOSTING DISTRIBUTED INFERENCE MODELS
3y 5m to grant; granted Apr 28, 2026
Patent 12608588
METHODS, SYSTEMS, AND MEDIA FOR CONTEXTUAL DISCRIMINATIVE EXPLANATION OF CONVOLUTIONAL NEURAL NETWORKS
3y 11m to grant; granted Apr 21, 2026
Patent 12596746
AUDIO PREVIEWING METHOD, APPARATUS AND STORAGE MEDIUM
4y 6m to grant; granted Apr 07, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 76%
With Interview: 99% (+29.8%)
Median Time to Grant: 2y 11m (~0m remaining)
PTA Risk: Low
Based on 408 resolved cases by this examiner. Grant probability derived from career allowance rate.
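
Since the grant probability is stated to derive from the career allowance rate, the arithmetic can be checked against the 312 granted and 408 resolved figures reported under Examiner Intelligence. The short Python sketch below does only that; the round-to-nearest-percent step is an assumption, as the page does not say how it rounds, and the variable names are illustrative.

```python
# Figures as reported on this page under Examiner Intelligence.
granted = 312
resolved = 408

# Career allowance rate = granted cases / resolved cases.
allowance_rate = granted / resolved

# Assumption: the displayed value is rounded to the nearest whole percent.
displayed_percent = round(allowance_rate * 100)

print(f"{allowance_rate:.4f}")
print(f"{displayed_percent}%")
```

Under that rounding assumption the result matches the 76% shown for both Career Allowance Rate and Grant Probability.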
