Prosecution Insights
Last updated: April 19, 2026
Application No. 18/184,700

ARTIFICIAL INTELLIGENCE-BASED GAMIFICATION FOR SERVICE BACKGROUND

Status: Non-Final OA (§103)
Filed: Mar 16, 2023
Examiner: GRUSZKA, DANIEL PATRICK
Art Unit: 2121
Tech Center: 2100 — Computer Architecture & Software
Assignee: Siemens Healthineers AG
OA Round: 1 (Non-Final)
Grant Probability: Favorable
Expected OA Rounds: 1-2
Estimated Time to Grant: 3y 3m

Examiner Intelligence

Career Allow Rate: 0% (0 granted / 0 resolved; -55.0% vs Tech Center average)
Interview Lift: +0.0% (minimal; based on resolved cases with interviews)
Avg Prosecution: 3y 3m (typical timeline)
Total Applications: 32 (all currently pending, across all art units)

Statute-Specific Performance

§101: 38.3% (-1.7% vs TC avg)
§103: 42.3% (+2.3% vs TC avg)
§102: 12.0% (-28.0% vs TC avg)
§112: 7.4% (-32.6% vs TC avg)

Note: Tech Center averages are estimates. Based on career data from 0 resolved cases.

Office Action

§103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 7, 11, and 16-19 are rejected under 35 U.S.C. 103 as being unpatentable over Meghani (US 2023/0096811 A1) in view of Zak (US 2025/0046436 A1, with PCT filed 12/8/2022).

Regarding claim 1, Meghani teaches:

A method for machine training an artificial intelligence to make recommendations in a service management system, the method comprising: ([0016] “In accordance with another embodiment of the invention, a method for generating an optimum task schedule for fulfilling a large-scale capital project using reinforcement learning”)

modeling the service management system, the modeling being a model including machines, …, service personnel, …, and service times; ([0105] “The reinforcement engine is trained using large amounts of data that is generated by a data simulation service 710 for different projects and different combinations of work packages, resource needs, timelines, resources/constraints available etc.” and [0096] “The projects may include building factories, hospitals, warehouses, shipyards, and the like. The four projects have various requirements of resources, personnel, time demands, and the like.”)

machine training, by a processor, the artificial intelligence with reinforcement learning, the artificial intelligence being trained to make the recommendations for service by the service personnel of the machines based on simulations from the modeling of the service system and based on rewards from a performance indicator from the service times; and ([0078] “The trained reinforcement engine 240 receives the data inputs 210, 220, and 230 and generates schedule alternatives based on a reward function which codifies which and how many constraints are violated versus how many are optimized. This may be set by the simulation environment where the constraints data 220 are given as above, and the agent within the trained reinforcement engine 240 could then explore the action space towards first a feasible and—in a second or more steps—even optimal solution to a schedule 250.”)

storing a policy of the artificial intelligence as trained by the machine training. ([0078] “The trained reinforcement engine 240 receives the data inputs 210, 220, and 230 and generates schedule alternatives based on a reward function which codifies which and how many constraints are violated versus how many are optimized.” This implies the AI policy was stored to be used later.)

Meghani does not teach modeling the service management system, the modeling being a model including locations of the machines and locations of the service personnel. However, Zak does: modeling the service management system, the modeling being a model including locations of the machines and locations of the service personnel. ([0009] “A method to optimize human resource management within emergency medical services (EMS) according to one approach may have the steps of: receiving inputs from at least one or more the data sources selected from the list comprising traffic conditions, weather, incident location of emergency or non-emergency call, call type, dispatch type, latitude and longitude of incident location, age, sex, chief complaint, incident date and time, holiday, day of the week, call classification, emergency department population status, incoming EMS service requests, available medical consumables, available medical non consumables, available staff, Cellular triangulation of staff, Cellular triangulation of ambulance or other mobile EMS equipment, Cellular triangulation of service base sites, GPS location of staff, GPS location of the service base sites, GPS location of ambulance or other mobile EMS equipment, identified location of needed services, dispatch requests, time of dispatch requests, latitude and longitude of dispatch requests, hospital census counts, duration of patient admittance in hospital, admitting diagnosis in hospital, discharge diagnosis in hospital, unit transition patterns of patients, unit transition date and time, unit admitting date and time, admitting unit in hospital, on-site manufacturing capabilities;”)

Meghani and Zak are considered analogous art to the claimed invention because they are in the same field of endeavor, reinforcement learning. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the reinforcement learning of Meghani with the location tracking of Zak. One would want to do this to create better simulations for the reinforcement learning.
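To make the cited reward formulation concrete, below is a minimal, hypothetical sketch of a reward function that “codifies which and how many constraints are violated versus how many are optimized,” in the style of Meghani [0078]. The function name, weights, and example values are illustrative assumptions, not taken from the reference.

```python
# Hypothetical reward in the style of Meghani [0078]: penalize violated
# constraints, credit optimized ones. Weights are assumed for illustration.
def schedule_reward(violated: int, optimized: int,
                    violation_penalty: float = 1.0,
                    optimization_bonus: float = 0.5) -> float:
    """Return a scalar reward for one candidate schedule."""
    return optimization_bonus * optimized - violation_penalty * violated

# A feasible schedule optimizing 4 constraints beats one violating 2 of them.
print(schedule_reward(violated=0, optimized=4))  # 2.0
print(schedule_reward(violated=2, optimized=2))  # -1.0
```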
Regarding claim 2, Meghani in view of Zak teaches claim 1 as outlined above. Meghani further teaches:

modeling comprises modeling with a distribution of the service times based, at least in part, on travel times, and wherein machine training comprises simulating using different samples from the distribution for the simulations and/or variance of the distribution. ([0071] “Each of the nodes 130, 135, and 140, as listed, for example in FIG. 1, will have resources that are required to perform the task, as well as a planned duration for how much time the resource will be required to complete each task.” and [0105] “The reinforcement engine is trained using large amounts of data that is generated by a data simulation service 710 for different projects and different combinations of work packages, resource needs, timelines, resources/constraints available etc.”)
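Claim 2's “different samples from the distribution” can be illustrated with a short sketch. This is an assumption-laden toy, not the applicant's method: the lognormal family and its parameters are invented purely to show per-episode sampling of service times.

```python
# Toy sketch: each simulated episode draws fresh service times from a
# distribution, so the learner experiences variance rather than fixed
# durations. The lognormal choice and parameters are assumptions.
import random

def sample_service_time(mu: float = 1.0, sigma: float = 0.3) -> float:
    """Draw one service duration (hours) from a lognormal distribution."""
    return random.lognormvariate(mu, sigma)

episode_service_times = [sample_service_time() for _ in range(5)]
print(episode_service_times)
```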
Regarding claim 7, Meghani in view of Zak teaches claim 1 as outlined above. Meghani further teaches:

machine training comprises estimating states, taking actions, and receiving the rewards based on the simulations. ([0114] “Referring to FIG. 8B, an agent 880 takes an action “a.sub.t” 884 in an environment 886 and receives a reward “r.sub.t” 888 in a state “s.sub.t” 890. Both the Q-learning method and the neural network architecture are trained to predict the Q-function Q(s,a)—e.g., the reward 888 of taking action “a.sub.t” 884 in state “s.sub.t” 890.”)
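The Q-learning formulation cited for claim 7 (an agent takes action a_t in state s_t, receives reward r_t, and learns to predict Q(s, a)) can be sketched in a few lines. The two-state environment and all hyperparameters below are invented for illustration; note that Meghani [0114] also contemplates a neural-network approximator rather than a table.

```python
# Minimal tabular Q-learning: learn Q(s, a), the expected return of taking
# action a in state s. Toy dynamics and hyperparameters are assumptions.
import random
from collections import defaultdict

ACTIONS = [0, 1]
Q = defaultdict(float)                 # Q[(state, action)] -> estimate
alpha, gamma, epsilon = 0.1, 0.9, 0.2  # learning rate, discount, exploration

def step(state: int, action: int):
    """Toy environment: action 1 in state 0 yields reward and advances."""
    if state == 0 and action == 1:
        return 1, 1.0                  # (next_state, reward)
    return 0, 0.0

state = 0
for _ in range(1000):
    if random.random() < epsilon:
        action = random.choice(ACTIONS)                     # explore
    else:
        action = max(ACTIONS, key=lambda a: Q[(state, a)])  # exploit
    next_state, reward = step(state, action)
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    state = next_state

print({k: round(v, 2) for k, v in Q.items()})
```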
Regarding claim 11, Meghani in view of Zak teaches claim 1 as outlined above. Meghani further teaches:

re-training the policy of the artificial intelligence based on review results for the recommendations by a service manager. ([0090] “The AI auto-scheduler 200 may be further trained by providing user feedback and industry feedback. For example, the AI auto-scheduler 200 may receive feedback relating to execution of the sequenced third level items in the work package and automatically update the sequence of uncompleted third level items and resources needed for completion of such items.”)

Regarding claim 16, Meghani teaches:

A system for machine-learned model service assistance, the system comprising: ([0008] “a system for generating task schedules using an electronic device includes: a processor, the processor comprising neural networks; a memory coupled to the processor”)

a memory configured to store a policy of the machine-learned model, the policy having been learned by reinforcement machine learning in a gamification using simulation of a service environment in combination with the reinforcement machine learning of the policy; ([0078] “The trained reinforcement engine 240 receives the data inputs 210, 220, and 230 and generates schedule alternatives based on a reward function which codifies which and how many constraints are violated versus how many are optimized. This may be set by the simulation environment where the constraints data 220 are given as above, and the agent within the trained reinforcement engine 240 could then explore the action space towards first a feasible and—in a second or more steps—even optimal solution to a schedule 250.” This implies the AI policy was stored to be used later.)

a processor configured to input measurements from the service environment to the policy and to output a recommendation from the policy in response to the input of the measurements; and ([0078] “The trained reinforcement engine 240 receives the data inputs 210, 220, and 230 and generates schedule alternatives based on a reward function which codifies which and how many constraints are violated versus how many are optimized.”)

Meghani does not teach: a display configured to display the recommendation from the policy. However, Zak does: a display configured to display the recommendation from the policy. ([0009] “displaying to a user the results of the comparison, including the one or more comparison outcomes, and one or more predicted assessment values that relate to the comparison of inputs or to the predicted evolution of data over a time interval using machine learning.”)

Meghani and Zak are considered analogous art to the claimed invention because they are in the same field of endeavor, reinforcement learning. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the reinforcement learning of Meghani with the display of Zak. One would want to do this to be able to display the recommendation/scheduling produced by Meghani.

Regarding claim 17, Meghani in view of Zak teaches claim 16 as outlined above. Meghani further teaches:

the policy was learned using rewards based on a key performance indicator of the service environment, ([0079] “Reinforcement learning works by exploring possible actions and receiving feedback for each action in the form of a reward, and by that implicitly learning the underlying logic and dynamics of a system to eventually outperform classical approaches. The agent selects a single data input and sorts it into possible schedules. After each individual action or after several individual actions, e.g. whenever an individual IWP is “completed” (i.e. when no more components get added to the same IWP and the IWP is ready for scheduling) this is sent into a simulation which gives a reward. The reward is a real scalar but can take several aspects into account. A simulation engine could check the output 250 by the agent on whether it is feasible, e.g. whether any physical constraints are violated. Furthermore, the simulation environment puts out a state to the agent. This state defines the remaining components, resources, constraints, etc. The agent then selects a new action based on the updated reward and state.”)

Meghani does not teach: wherein the display of the recommendation includes an expected value of the key performance indicator given the recommendation and a period for the expected value. However, Zak does. ([0408] “In parallel the patient condition profile is continuously updated during triage and displayed to the dispatcher to aid in determining the right response based on characteristics of the patient condition profile. If it is determined that an EMS unit is needed, the closest available unit is then searched and once a unit become available, it is dispatched to the location of the 911 call. This method performs continuous intelligent analytics to adapt to changes in the EMS unit and receiving entity profile with relation to changing patient condition throughout the patient trajectory. Some inputs to this method may include patient assessment findings, complaints, point of care diagnostics, and the like.”)

Regarding claim 18, Meghani in view of Zak teaches claim 16 as outlined above. Meghani further teaches: the policy was learned using rewards based on a key performance indicator of the service environment ([0079], quoted above for claim 17). Meghani does not teach: wherein the display of the recommendation includes display of a value of the key performance indicator with no change and a value of the key performance indicator when the recommendation is followed. However, Zak does ([0408], quoted above for claim 17).

Regarding claim 19, Meghani in view of Zak teaches claim 16 as outlined above. Zak further teaches:

the processor is configured to adapt the display of the recommendation with a priority based on frequency of assessment of results by a service manager. ([0178] “Suggestions are presented in the map such as optimal ambulance location, specific locations of higher priority call types, and the like.”)

Claims 12 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Meghani in view of Nag (US 2020/0065703 A1).

Regarding claim 12, Meghani teaches:

A method for machine training an artificial intelligence to make recommendations in a service management system, the method comprising: ([0016] “In accordance with another embodiment of the invention, a method for generating an optimum task schedule for fulfilling a large-scale capital project using reinforcement learning”)

storing the policy as trained by the machine training. ([0078] “The trained reinforcement engine 240 receives the data inputs 210, 220, and 230 and generates schedule alternatives based on a reward function which codifies which and how many constraints are violated versus how many are optimized.” This implies the AI policy was stored to be used later.)

Meghani does not teach: modeling the service management system, the modeling using a model with state parameters and state transition parameters for the service management system; machine training, by a processor, a policy with reinforcement learning, the policy being trained to make the recommendations based on simulations using the model, the simulations perturbing sampling of distributions and/or selection of distributions for the state parameters and/or the state transition parameters. However, Nag does:

modeling the service management system, the modeling using a model with state parameters and state transition parameters for the service management system; ([0076] “First, reinforcement learning is used to train an environment simulator 2002 by one or both of operating the simulator against a live-distributed-system environment 2004 or against a simulated distributed-system environment that replays archived data generated by a live distributed system to the simulator 2006.”)

machine training, by a processor, a policy with reinforcement learning, the policy being trained to make the recommendations based on simulations using the model, the simulations perturbing sampling of distributions and/or selection of distributions for the state parameters and/or the state transition parameters; ([0067] “In the reinforcement-learning approach, the environment is considered to inhabit a particular state at each point in time. The state may be represented by one or more numeric values or character-string values, but generally is a function of hundreds, thousands, millions, or more different variables. The observations generated by the environment and transmitted to the manager reflect the state of the environment at the time that the observations are made. The possible state transitions can be described by a state-transition diagram for the environment. FIG. 14A illustrates a portion of a state-transition diagram.”)

Meghani and Nag are considered analogous art to the claimed invention because they are in the same field of endeavor, reinforcement learning. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the reinforcement learning of Meghani with the state transition and simulation perturbing of Nag. One would want to do this for better simulations that can be used in reinforcement learning.

Regarding claim 14, Meghani in view of Nag teaches claim 12 as outlined above. Nag further teaches:

machine training comprises machine training with an adversarial machine-learned agent configured by past training to perturb values of the state parameters and/or the state transition parameters of the model in the simulations such that an adverse reward is received where the policy fails to improve rewards of the reinforcement learning. ([0095] “In this approach, during training, the automated reinforcement-learning-based application manager is directed to select non-optimal, potentially disadvantageous actions at various points in time in order to push the control trajectories into otherwise unexplored regions of the system-state space. In essence, this approach uses the disadvantages suffered by a conventionally trained automated reinforcement-learning-based application manager during live control of a computing environment as advantages during adversarial training. As a result of the disadvantageous actions taken during adversarial training, the automated reinforcement-learning-based application manager is forced to visit a much larger subset of the system states within the system-state space and therefore gain much broader experience, which, in turn, guarantees that the control policy learned during adversarial training is significantly more robust and complete then control policies learned during conventional training.”)
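A toy sketch of the adversarial arrangement claim 14 recites may help: a perturbing agent alters a simulation parameter and collects an “adverse reward” whenever the learner's return fails to improve, pushing training into otherwise unexplored conditions. Note this sketches the claimed arrangement, not Nag's exact mechanism (Nag [0095] has the manager itself select non-optimal actions); all names, dynamics, and numbers here are invented.

```python
# Toy adversarial loop: the adversary perturbs a simulation parameter
# (travel_scale) and keeps perturbations that hurt the policy's return;
# the learner adapts in response. Everything is an illustrative assumption.
import random

def episode_return(policy_strength: float, travel_scale: float) -> float:
    """Toy return: longer travel times (larger scale) depress the return."""
    return policy_strength / travel_scale + random.gauss(0.0, 0.01)

policy_strength, travel_scale = 1.0, 1.0
prev_return = episode_return(policy_strength, travel_scale)

for _ in range(100):
    candidate = travel_scale * random.uniform(0.9, 1.3)  # adversary's proposal
    ret = episode_return(policy_strength, candidate)
    adversary_reward = prev_return - ret   # positive when the policy got worse
    if adversary_reward > 0:
        travel_scale = candidate           # keep perturbations that hurt
    policy_strength += 0.05 * max(0.0, prev_return - ret)  # learner compensates
    prev_return = ret

print(f"final travel_scale={travel_scale:.2f}, policy_strength={policy_strength:.2f}")
```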
Claims 3-6, 8, 13, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Meghani in view of Zak and Nag.

Regarding claim 3, Meghani in view of Zak teaches claim 1 as outlined above. Neither Meghani nor Zak teaches: machine training comprises machine training with an adversarial machine-learned agent configured by past training to perturb values of parameters of the model in the simulations such that an adverse reward is received for the adversarial machine-learned agent where the artificial intelligence fails to improve the rewards for the artificial intelligence. However, Nag does ([0095], quoted above for claim 14).

Meghani, Zak, and Nag are considered analogous art to the claimed invention because they are in the same field of endeavor, reinforcement learning. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the reinforcement learning of Meghani with the location tracking of Zak and the state transition and simulation perturbing of Nag. One would want to do this for better simulations that can be used in reinforcement learning.

Regarding claim 4, Meghani in view of Zak teaches claim 1 as outlined above. Nag further teaches:

modeling comprises representing the service management system as a random process defined over states of the machines, locations of the machines, service personnel, locations of the service personnel, the service times, service personnel shifts, and service agreement information with state transition functions defining probabilities of change in the states. ([0068] “For example, a transition from state 1420 to state 1422 as a result of action 1424 produces observation 1426, while transition from state 1420 to state 1421 via action 1424 produces observation 1428. A second additional detail is that each state transition is associated with a probability.” Meghani and Zak teach the machines, locations of the machines, service personnel, locations of the service personnel, the service times, service personnel shifts, and service agreement information, as mentioned above.)

Regarding claim 5, Meghani in view of Zak and Nag teaches claim 4 as outlined above. Nag further teaches:

modeling comprises refining the states and the state transition functions based on matching observations from the modeling of the service management system to observations from the service management system. ([0068] “As indicated by expressions 1434, the function O returns the probability that a particular observation o is returned by the environment given a particular action and the state to which the environment transitions following execution of the action. In other words, in general, there are many possible observations o that might be generated by the environment following transition to a particular state through a particular action, and each possible observation is associated with a probability of occurrence of the observation given a particular state transition through a particular action.”)
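The probabilistic model Nag [0068] describes for claims 4 and 5 reduces to two tables: each (state, action) pair maps to a distribution over next states, and each resulting state emits an observation with some probability (the function O). The sketch below is illustrative only; states, actions, observations, and probabilities are all invented.

```python
# Minimal sketch of a probabilistic state-transition model with an
# observation function O, per Nag [0068]. All values are assumptions.
import random

# T[(state, action)] -> list of (next_state, probability)
T = {
    ("idle", "dispatch"): [("en_route", 0.9), ("idle", 0.1)],
    ("en_route", "arrive"): [("servicing", 0.8), ("en_route", 0.2)],
}

# O[next_state] -> list of (observation, probability)
O = {
    "idle": [("no_update", 1.0)],
    "en_route": [("gps_ping", 0.7), ("no_update", 0.3)],
    "servicing": [("job_started", 0.9), ("no_update", 0.1)],
}

def step(state: str, action: str):
    """Sample a next state, then an observation conditioned on it."""
    states, probs = zip(*T[(state, action)])
    next_state = random.choices(states, weights=probs)[0]
    observations, obs_probs = zip(*O[next_state])
    return next_state, random.choices(observations, weights=obs_probs)[0]

print(step("idle", "dispatch"))
```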
Regarding claim 6, Meghani in view of Zak and Nag teaches claim 5 as outlined above. Meghani further teaches:

refining comprises refining based on actions and resulting values of the performance indicator. ([0079] “A simulation engine could check the output 250 by the agent on whether it is feasible, e.g. whether any physical constraints are violated. Furthermore, the simulation environment puts out a state to the agent. This state defines the remaining components, resources, constraints, etc. The agent then selects a new action based on the updated reward and state.”)

Regarding claim 8, Meghani in view of Zak teaches claim 1 as outlined above. Nag further teaches: machine training comprises the reinforcement learning using perturbation of the modeling in the simulations, the perturbations being for different initial conditions and/or state transitions ([0095], quoted above for claim 14).

Regarding claim 13, Meghani in view of Nag teaches claim 12 as outlined above. Meghani further teaches: modeling comprises modeling with the state parameters comprising including machines, …, service personnel, … and the state transition parameters comprising service times and travel times, and wherein machine training comprises the reinforcement learning using rewards from performance indicators for the service times and the travel times ([0105] and [0096], quoted above for claim 1). Meghani does not teach modeling the service management system, the modeling being a model including locations of the machines and locations of the service personnel. However, Zak does ([0009], quoted above for claim 1).

Meghani, Zak, and Nag are considered analogous art to the claimed invention because they are in the same field of endeavor, reinforcement learning. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the reinforcement learning of Meghani with the location tracking of Zak and the state transition and simulation perturbing of Nag. One would want to do this for better simulations that can be used in reinforcement learning.

Regarding claim 20, Nag teaches: the gamification comprised use of the simulation with a model fit to the service environment, the simulations having used perturbation of distributions and/or sampling of parameters of the model as fit to the service environment and resulting changes in performance indicators as rewards in the reinforcement machine learning ([0067], quoted above for claim 12). Meghani, Zak, and Nag are considered analogous art to the claimed invention because they are in the same field of endeavor, reinforcement learning. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the reinforcement learning of Meghani with the location tracking of Zak and the state transition and simulation perturbing of Nag. One would want to do this for better simulations that can be used in reinforcement learning.

Claims 9 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Meghani in view of Zak and Gopalan (US 2018/0285772 A1).

Regarding claim 9, Meghani in view of Zak teaches claim 1 as outlined above. Neither Meghani nor Zak teaches: comprising updating the model with statistical testing of the service times and/or other parameters of the model. However, Gopalan does. ([0013] “In addition, examples of the present disclosure may include a root cause analysis (RCA)—which identifies what features are most responsible for the accuracy of the machine learning model with respect to the service (e.g., classification, clustering, mapping, etc.). RCA includes a quantitative statistical analysis on historical data to determine relevance scores on the impact of the features/variables to the accuracy of the machine learning model.”)

Meghani, Zak, and Gopalan are considered analogous art to the claimed invention because they are in the same field of endeavor, reinforcement learning. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the reinforcement learning of Meghani with the location tracking of Zak and the retraining of Gopalan. One would want to do this to provide another method of training the model.

Regarding claim 10, Meghani in view of Zak teaches claim 1 as outlined above. Gopalan further teaches:

comprising re-training the policy of the artificial intelligence when an actual distribution of a parameter of the model is a threshold difference from a distribution or distributions used in the machine training. ([0016] “retraining a machine learning model when a counter for a likelihood of new data based upon a training data set being less than a first threshold exceeds a second threshold may operate in accordance with the present disclosure.”)
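Claim 10's re-training trigger can be sketched simply: re-train when the actual distribution of a model parameter (service times, here) drifts a threshold amount from the training distribution. The mean-shift test and threshold below are assumptions for illustration; Gopalan [0016] instead counts low-likelihood new data against two thresholds.

```python
# Illustrative drift-triggered retraining check; the test and threshold
# are assumptions, not Gopalan's exact mechanism.
from statistics import mean

def needs_retraining(training_times, observed_times, threshold=0.5):
    """True when the observed mean drifts beyond the threshold."""
    return abs(mean(observed_times) - mean(training_times)) > threshold

training = [1.0, 1.2, 0.9, 1.1]   # service times used in machine training
observed = [1.8, 2.1, 1.9, 2.0]   # actual service times have drifted upward
print(needs_retraining(training, observed))  # True -> re-train the policy
```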
Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Meghani in view of Nag and Gopalan.

Regarding claim 15, Meghani in view of Nag teaches claim 12 as outlined above. Neither Meghani nor Nag teaches: comprising updating the model with statistical testing the state transition parameters of the model and/or with replacement of values of the state parameters. However, Gopalan does ([0013], quoted above for claim 9).

Meghani, Nag, and Gopalan are considered analogous art to the claimed invention because they are in the same field of endeavor, reinforcement learning. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the reinforcement learning of Meghani with the state transition and simulation perturbing of Nag and the retraining of Gopalan. One would want to do this to provide another method of training the model.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DANIEL PATRICK GRUSZKA, whose telephone number is (571) 272-5259. The examiner can normally be reached M-F, 9:00 AM - 6:00 PM ET.

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Li Zhen, can be reached at (571) 272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DANIEL GRUSZKA/
Examiner, Art Unit 2121

/Li B. Zhen/
Supervisory Patent Examiner, Art Unit 2121

Prosecution Timeline

Mar 16, 2023: Application Filed
Jan 14, 2026: Non-Final Rejection, §103 (current)


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: Favorable
Median Time to Grant: 3y 3m
PTA Risk: Low

Based on 0 resolved cases by this examiner. Grant probability derived from career allow rate.
