Prosecution Insights
Last updated: April 19, 2026
Application No. 17/401,002

SYSTEM AND METHOD FOR OPTIMIZING A MACHINE LEARNING MODEL

Non-Final OA §103
Filed: Aug 12, 2021
Examiner: NILSSON, ERIC
Art Unit: 2151
Tech Center: 2100 — Computer Architecture & Software
Assignee: VISA INTERNATIONAL SERVICE ASSOCIATION
OA Round: 3 (Non-Final)
Grant Probability: 83% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 3y 2m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 83% (above average; 408 granted / 494 resolved; +27.6% vs TC avg)
Interview Lift: +18.0% (strong; with vs. without interview, across resolved cases with interview)
Avg Prosecution: 3y 2m (typical timeline; 31 applications currently pending)
Total Applications: 525 (career history, across all art units)

Statute-Specific Performance

§101: 25.3% (-14.7% vs TC avg)
§103: 38.8% (-1.2% vs TC avg)
§102: 17.5% (-22.5% vs TC avg)
§112: 8.9% (-31.1% vs TC avg)
Tech Center averages are estimates. Based on career data from 494 resolved cases.
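Each per-statute figure is a simple frequency over the examiner's resolved cases, compared against a Tech Center estimate. A minimal sketch of the arithmetic (the per-statute counts below are hypothetical; only the 494-case denominator comes from the card above, and a flat 40% TC estimate is assumed because it reproduces every delta shown, e.g. 25.3% - 40.0% = -14.7%):

```python
# Per-statute rates for one examiner vs. a Tech Center average.
# Counts are hypothetical illustrations; denominator matches the card above.
RESOLVED = 494
TC_AVG = 0.40  # assumed flat Tech Center estimate

statute_counts = {"§101": 125, "§103": 192, "§102": 86, "§112": 44}

for statute, count in statute_counts.items():
    rate = count / RESOLVED
    # Delta is the examiner's rate minus the Tech Center estimate
    print(f"{statute}: {rate:.1%} ({rate - TC_AVG:+.1%} vs TC avg)")
```

With 125 hypothetical §101 rejections, 125/494 gives the 25.3% shown on the card.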

Office Action

§103
DETAILED ACTION

This action is in response to claims filed 28 May 2025 for application 17/401,002, filed 12 August 2021. Claims 1-12, 14-16, and 18-20 are currently pending.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 05 September 2025 has been entered.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-12, 14 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Jannink et al. (US 2020/0184273 A1) in view of Barratt et al. (Optimizing for Generalization in Machine Learning with Cross-Validation Gradients).

Regarding claim 1, Jannink discloses: A computing device [0094], comprising: a training platform at the computing device, the computing device including a processor to execute instructions associated with the training platform (Fig 5 and [0071] model training, [0094]); and an inference platform coupled to the training platform (Fig 5 processing), wherein based upon an updating of hyperparameters in the training platform (Fig 5 hyperparam search), an optimized inference model is configured to be deployed to the inference platform (Fig 5 Deploy).
Jannink does not explicitly disclose, but Barratt teaches: wherein: (i) a metric score and (ii) a gradient of the metric score of each hyperparameter of the hyperparameters are calculated and utilized as joint prediction criteria by the training platform to perform a cross-validation prediction to select the optimized inference model (Algorithm 1 shows a cross-validation gradient method for selecting optimal hyperparameters (optimized inference model); the hyperparameter metric α and the gradient of α are used for the joint prediction).

Jannink and Barratt are in the same field of endeavor of hyperparameter/ML optimization and are analogous. Jannink discloses a machine learning training and hyperparameter optimization platform for continually updating models. Barratt discloses a method of model optimization using hyperparameter metrics and gradients. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the known hyperparameter optimization as disclosed by Jannink with the known hyperparameter gradient optimization method as taught by Barratt to yield predictable results.

Regarding claim 2, Jannink discloses: The computing device of claim 1, wherein: the training platform updates the hyperparameters using the metric score generated in the training platform (“A further benefit of incorporating the model retraining and improvement into the overall system is that it becomes possible to continuously generate new subject models by varying model hyper parameters. Taking multiple candidate subject models in this way, training them with those new parameters, and validating them against existing models, enables a directed optimization search through the model parameter space. At the end of each training cycle we preserve only the best of the candidates, if they supersede the existing models in accuracy.” [0023]).
Regarding claim 3, Jannink discloses: The system of claim 2, wherein: the metric score is indicative of an accuracy of a candidate inference model (“A further benefit of incorporating the model retraining and improvement into the overall system is that it becomes possible to continuously generate new subject models by varying model hyper parameters. Taking multiple candidate subject models in this way, training them with those new parameters, and validating them against existing models, enables a directed optimization search through the model parameter space. At the end of each training cycle we preserve only the best of the candidates, if they supersede the existing models in accuracy.” [0023]).

Regarding claim 4, Jannink discloses: The computing device of claim 3, wherein: the optimized inference model is generated in the training platform by using a feedback of the updated hyperparameters (“This data collection provides an initial input to the feedback loop in which the first representations of the physical sensor environment are stored and deployed. Subsequently, the feedback enables improved versions of analysis algorithms and machine learning model hyperparameters to be selected for the system model. The refinement process is called the training cycle and a separate training pipeline performs all of the functions necessary to complete this function.” [0032]).

Regarding claim 5, Jannink discloses: The computing device of claim 4, wherein: the inference platform updates the optimized inference model to generate a second version of the optimized inference model (“Returning to FIG. 2, at 62, dataset annotation occurs. Thereafter, training sets 64 are created to be used in the creation of trained models to recognize the subjects of data capture. These training sets are used in model training 66 to produce different versions of the model of the environment 68. The repeated process of training and validating models, and the loops 68 of further dataset generation, and annotation needed to produce an updated model tend to make each successively trained model superior to the previous one.” [0038]).

Regarding claim 6, Jannink discloses: The computing device of claim 5, wherein: the inference platform updates the optimized inference model by using a client observation and a first prediction response (“Returning to FIG. 2, at 62, dataset annotation occurs. Thereafter, training sets 64 are created to be used in the creation of trained models to recognize the subjects of data capture. These training sets are used in model training 66 to produce different versions of the model of the environment 68. The repeated process of training and validating models, and the loops 68 of further dataset generation, and annotation needed to produce an updated model tend to make each successively trained model superior to the previous one.” [0038]).

Regarding claim 7, Jannink discloses: The computing device of claim 6, wherein: an observation difference is generated by the inference platform by calculating a difference between the client observation and the first prediction response ([0037], [0090-91] disclose difference analysis of collected data vs existing data).

Regarding claim 8, Jannink discloses: The computing device of claim 7, wherein: the second version of the optimized inference model is used to generate a second prediction response ([0042] discloses validating and release of optimized models).

Regarding claim 9, Jannink discloses: The computing device of claim 8, wherein: the second prediction response is generated in response to a second prediction request [0070].
Regarding claim 10, Jannink discloses: A training platform in a computing device [0094], comprising: a model training unit at the computing device, the computing device including a processor to execute instructions associated with the computing device (Fig 5); a model validation unit coupled to the model training unit at the computing device (Fig 5, [0070], [0094]); and a hyperparameter updating unit at the computing device coupled to the model validation unit (Fig 5), wherein based upon an updating of hyperparameters associated with an inference model generated in the model training unit (Fig 5 hyperparam search), an optimized inference model is output by the training platform (Fig 5 deploy).

Jannink does not explicitly disclose, but Barratt teaches: wherein (i) a metric score and (ii) a gradient of the metric score of each hyperparameter of the hyperparameters are calculated and utilized as joint prediction criteria by the training platform to perform a cross-validation prediction to select the optimized inference model (Algorithm 1 shows a cross-validation gradient method for selecting optimal hyperparameters (optimized inference model); the hyperparameter metric α and the gradient of α are used for the joint prediction).

Regarding claim 11, Jannink discloses: The training platform of claim 10, wherein: the hyperparameter updating unit uses the metric score to update the hyperparameters (“A further benefit of incorporating the model retraining and improvement into the overall system is that it becomes possible to continuously generate new subject models by varying model hyper parameters. Taking multiple candidate subject models in this way, training them with those new parameters, and validating them against existing models, enables a directed optimization search through the model parameter space. At the end of each training cycle we preserve only the best of the candidates, if they supersede the existing models in accuracy.” [0023]).
Regarding claim 12, Jannink discloses: The training platform of claim 11, wherein: the metric score used to update the hyperparameters is indicative of an accuracy of the inference model (“A further benefit of incorporating the model retraining and improvement into the overall system is that it becomes possible to continuously generate new subject models by varying model hyper parameters. Taking multiple candidate subject models in this way, training them with those new parameters, and validating them against existing models, enables a directed optimization search through the model parameter space. At the end of each training cycle we preserve only the best of the candidates, if they supersede the existing models in accuracy.” [0023]).

Regarding claim 14, Jannink discloses: The training platform of claim 12, further comprising: a separation unit coupled to the model training unit, wherein the separation unit provides a first data set to the model training unit and a second data set to the model validation unit (“In FIG. 2, the sandbox testing 70 of the output model prior to its release and replacement of existing models allows for validation. This testing may compare the output of the new model against a previously validated “golden data set” output of the existing models.” [0042]).

Regarding claim 15, Jannink discloses: The training platform of claim 14, wherein: the model validation unit uses a candidate inference model and the second data set to generate the metric score (“In FIG. 2, the sandbox testing 70 of the output model prior to its release and replacement of existing models allows for validation. This testing may compare the output of the new model against a previously validated “golden data set” output of the existing models. Assuming the new model is deemed acceptable in that it accurately reflects the instrumented environment (perhaps within predetermined tolerances), the updated model is released 72, and the now-deployed new ML model 74 assumes the place of the original model 52 for further data collection and analysis. As should be evident, the end of one training cycle is also the start of the next one.” [0042], see also [0023]).

Claims 16 and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Jannink et al. (US 2020/0184273 A1) in view of Rishe (US 2014/0280319 A1).

Regarding claim 16, Jannink discloses: A computer implemented method, comprising: generating, at an observation service unit of a computing device, an observation difference (“When the error rate exceeds a defined bound, a simple active learning process will be invoked that presents a subset images from the stream that are similar to the discovered errors in the validation process. Similarity is defined in one implementation as a linear combination of the metric of the model's domain and other simple metrics. Cosine distance between the source images is one such a simple metric.” [0037], see also [0090-91]); and updating, at a model updating service unit at the computing device, a first prediction model based upon the observation difference ([0037], “This data collection provides an initial input to the feedback loop in which the first representations of the physical sensor environment are stored and deployed. Subsequently, the feedback enables improved versions of analysis algorithms and machine learning model hyperparameters to be selected for the system model. The refinement process is called the training cycle and a separate training pipeline performs all of the functions necessary to complete this function.” [0032], “Returning to FIG. 2, at 62, dataset annotation occurs. Thereafter, training sets 64 are created to be used in the creation of trained models to recognize the subjects of data capture. These training sets are used in model training 66 to produce different versions of the model of the environment 68. The repeated process of training and validating models, and the loops 68 of further dataset generation, and annotation needed to produce an updated model tend to make each successively trained model superior to the previous one.” [0038]).

Jannink does not explicitly disclose, however, Rishe teaches: the observation difference being a function of a prediction response and a client observation (“In certain embodiments, the future location of a moving object is predicted for obtaining visual results of moving objects (positions) to optimize client-server bandwidth. Location prediction refers to statistical methods that derive patterns or mathematical formulas whose purpose is, given the recent trajectory of a moving object, to predict its future location. In one related embodiment, sensor data streams are queried, wherein each update from a sensor is associated with a function allowing prediction of future values of that sensor. The sensor commits to update its value whenever the difference between the observed value and the value estimated using the prediction function exceeds a certain threshold. Location prediction enables selective transfer of moving objects' data from the server to the client. More specifically, moving objects whose locations are predicted to be viewable will be transferred, whereas other moving objects' data will not, to optimize client-server bandwidth.” [0047]); wherein in order to generate the observation difference, the observation service unit measures an error between a client observation and a previous prediction response generated by the first prediction model [0047].

Jannink and Rishe are in the same field of endeavor of prediction and are analogous. Jannink discloses a machine learning training and hyperparameter optimization platform for continually updating models. Rishe discloses a method of comparing observations to predictions. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the hyperparameter optimization as disclosed by Jannink with the known comparison of predictions and observations to yield predictable results.

Regarding claim 18, Jannink discloses: The method of claim 16, wherein: a client provides a first prediction request and the client observation to the observation service unit to generate the observation difference ([0037] discloses validation and error catching in response to manual requests).

Regarding claim 19, Jannink discloses: The method of claim 18, further comprising: generating a second prediction model based upon the updating of the first prediction model (“Returning to FIG. 2, at 62, dataset annotation occurs. Thereafter, training sets 64 are created to be used in the creation of trained models to recognize the subjects of data capture. These training sets are used in model training 66 to produce different versions of the model of the environment 68. The repeated process of training and validating models, and the loops 68 of further dataset generation, and annotation needed to produce an updated model tend to make each successively trained model superior to the previous one.” [0038]).

Regarding claim 20, Jannink discloses: The method of claim 19, further comprising: using the second prediction model to generate a second prediction response ([0038], Fig 5).

Response to Arguments

Applicant’s arguments, see pp. 6-12, filed 28 May 2025, with respect to the 35 USC §101 rejection are persuasive. The rejection of claims 1-12, 14-16, and 18-20 under 35 USC §101 has been withdrawn.
Applicant’s arguments with respect to claims 1-12, 14-16, and 18-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. Please see the addition of new reference Barratt for how the amended claim language is taught.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIC NILSSON, whose telephone number is (571) 272-5246. The examiner can normally be reached M-F, 7-3. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, James Trujillo, can be reached at (571) 272-3677. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ERIC NILSSON/
Primary Examiner, Art Unit 2151
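For context on the newly cited Barratt reference: it selects hyperparameters by descending the gradient of a cross-validation metric rather than grid-searching. The sketch below illustrates that general idea only, not Barratt's Algorithm 1: a finite-difference gradient of k-fold validation error with respect to a ridge regularization weight, on synthetic data (all names and the data are illustrative assumptions).

```python
import numpy as np

def cv_error(lam, X, y, k=5):
    """k-fold cross-validation MSE of ridge regression at regularizer lam."""
    n = len(y)
    folds = np.array_split(np.arange(n), k)
    err = 0.0
    for val in folds:
        train = np.setdiff1d(np.arange(n), val)
        Xt, yt = X[train], y[train]
        # Closed-form ridge solution: (X'X + lam*I)^-1 X'y
        w = np.linalg.solve(Xt.T @ Xt + lam * np.eye(X.shape[1]), Xt.T @ yt)
        err += np.mean((X[val] @ w - y[val]) ** 2)
    return err / k

def cv_gradient(lam, X, y, eps=1e-4):
    """Finite-difference stand-in for an analytic cross-validation gradient."""
    return (cv_error(lam + eps, X, y) - cv_error(lam - eps, X, y)) / (2 * eps)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = X @ rng.normal(size=10) + 0.5 * rng.normal(size=100)

# The metric score and its gradient jointly drive the hyperparameter update,
# which is the "joint prediction criteria" framing in the claim language.
lam, lr = 1.0, 0.5
for _ in range(50):
    grad = cv_gradient(lam, X, y)
    lam = max(lam - lr * grad, 1e-6)  # gradient step, kept positive

print(f"selected lambda: {lam:.3f}, CV MSE: {cv_error(lam, X, y):.4f}")
```

Barratt's actual method derives the gradient analytically through the model-fitting procedure; the finite difference here is a simplification to keep the sketch self-contained.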

Prosecution Timeline

Aug 12, 2021: Application Filed
Jul 24, 2024: Non-Final Rejection — §103
Jan 29, 2025: Response Filed
Mar 24, 2025: Final Rejection — §103
May 21, 2025: Interview Requested
May 27, 2025: Examiner Interview Summary
May 27, 2025: Applicant Interview (Telephonic)
Sep 05, 2025: Request for Continued Examination
Oct 03, 2025: Response after Non-Final Action
Oct 20, 2025: Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602587: MULTI-TASK DEEP LEARNING NETWORK AND GENERATION METHOD THEREOF (granted Apr 14, 2026; 2y 5m to grant)
Patent 12602615: EVALUATION OF MACHINE LEARNING MODELS USING AGREEMENT SCORES (granted Apr 14, 2026; 2y 5m to grant)
Patent 12591762: METHOD, SYSTEM FOR ODOR VISUAL EXPRESSION BASED ON ELECTRONIC NOSE TECHNOLOGY, AND STORAGE MEDIUM (granted Mar 31, 2026; 2y 5m to grant)
Patent 12585942: METHOD AND SYSTEM FOR MACHINE LEARNING AND PREDICTIVE ANALYTICS OF FRACTURE DRIVEN INTERACTIONS (granted Mar 24, 2026; 2y 5m to grant)
Patent 12585953: RADIO SIGNAL IDENTIFICATION, IDENTIFICATION SYSTEM LEARNING, AND IDENTIFIER DEPLOYMENT (granted Mar 24, 2026; 2y 5m to grant)
Study what changed to get past this examiner, based on the 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 83%
With Interview: 99% (+18.0%)
Median Time to Grant: 3y 2m
PTA Risk: High
Based on 494 resolved cases by this examiner. Grant probability derived from career allow rate.
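The projection figures reduce to simple arithmetic over the career record: grant probability is granted divided by resolved, and the interview-adjusted figure adds the observed +18% lift, apparently capped at 99% (the cap is an assumption about how the page derives that number, since 83% + 18% would exceed 100%):

```python
# Career counts from the examiner card above
granted, resolved = 408, 494

allow_rate = granted / resolved                          # 0.826 -> shown as 83%
interview_lift = 0.18                                    # observed lift with interview
with_interview = min(allow_rate + interview_lift, 0.99)  # assumed 99% cap

print(f"{allow_rate:.0%} baseline, {with_interview:.0%} with interview")
# -> 83% baseline, 99% with interview
```

The same two numbers feed the header card and the projections panel, so any change in the examiner's resolved-case counts shifts both.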
