Office Action Analysis: 18362125 — Adversarial Cooperative Imitation Learning for Dynamic Treatment

Office Action

Rejection grounds raised: § 101, § 103, § 112, nonstatutory double patenting
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claims 1-10 are presented for examination.

Specification

The disclosure is objected to because of the following informalities:

[0001]: "Untied States" should read "United States," and “both of which are incorporated by reference in their entireties., incorporated herein by reference in its entirety” should read “both of which are incorporated by reference in their entireties”
[0020]: "body temperature, blood sugar levels" should read "body temperature, and blood sugar levels"
[0029]: "output a decoder network" should read "output of decoder network"
[0031]: “In general, the differences between two policies (…) by comparing the trajectories they generate” is an incomplete sentence
[0033]: "repsents" should read "represents"
[0035]: "maximize the probability of the data that is generated by πθ is positive" should read "maximize the probability that the data that is generated by πθ is positive"

Appropriate correction is required.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.--The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-10 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 1 and 6 recite “similar” and “dissimilar,” which are relative terms that render the claims indefinite. The terms “similar” and “dissimilar” are not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.

Claim 6 recites “the patient,” which has insufficient antecedent basis in the claim.

The remainder of the claims are rejected due to dependency on a rejected base claim.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 6-10 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claims do not fall within at least one of the four categories of patent eligible subject matter because they are directed to a system comprising a machine learning model, a model trainer, and a response interface, which encompasses software per se, as all of these components could be implemented with software.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:

1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1, 5, 6, and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Komorowski et al. (NPL: “The Artificial Intelligence Clinician learns optimal treatment strategies for sepsis in intensive care”) (“Komorowski”) in view of Nguyen et al. (NPL: “Dual Discriminator Generative Adversarial Nets”) (“Nguyen”), and Mannu et al. (US20210299353).

Regarding claim 1, Komorowski discloses “A method for responding to changing conditions, comprising: training a model on historical treatment trajectories, using a processor, including trajectories that resulted in a positive health outcome and trajectories that resulted in a negative health outcome (Komorowski, page 1716, paragraph 5: “A Markov decision process (MDP) was used to model the patient environment and trajectories. The various elements of the model were defined using patient data time series from the training set (a random sample of 80% of MIMIC-III; Fig. 1)” and page 1716, paragraph 4: “In both datasets, we extracted a set of 48 variables, including demographics, Elixhauser premorbid status, vital signs, laboratory values, fluids and vasopressors received (Supplementary Table 2). Patients’ data were coded as multidimensional discrete time series with 4-h time steps, and for each patient, we included up to 72 h of measurements taken around the estimated time of onset of sepsis. The total volume of intravenous fluids and maximum dose of vasopressors administered over each 4-h period defined the medical treatments of interest. The model aims at optimizing patient mortality, so a reward was associated to survival and a penalty to death”; the examiner notes that the patient time series data that led to survival corresponds to “trajectories that resulted in a positive health outcome” and the patient time series data that led to death corresponds to “trajectories that resulted in a negative health outcome”), by using … [reinforcement learning] to train the model to generate trajectories that are similar to historical treatment trajectories that resulted in a positive health outcome (Komorowski, Methods, Building the computational model, paragraph 5: “The sequences of successive states and actions are referred to as patients’ trajectories. In our models, we used either hospital mortality or 90-d mortality as the sole defining factor for the system-defined penalty and reward. When a patient survived, a positive reward was released at the end of each patient’s trajectory (a ‘reward’ of +100)” and paragraph 6: “As such, the resulting AI policy suggests the best possible treatment among all the options chosen (relatively frequently) by clinicians”; the examiner notes that the patient treatment trajectories generated by the trained AI policy would be similar to historical treatment trajectories that resulted in survival, as trajectories leading to survival were rewarded during the training process), and by using … [reinforcement learning] to train the model to generate trajectories that are dissimilar to historical treatment trajectories that resulted in a negative health outcome (Komorowski, Methods, Building the computational model, paragraph 5: “…a negative reward (a ‘penalty’ of –100) was issued if the patient died” and paragraph 6: “As such, the resulting AI policy suggests the best possible treatment among all the options chosen (relatively frequently) by clinicians”; the examiner notes that the patient treatment trajectories generated by the trained AI policy would be dissimilar to historical treatment trajectories that resulted in death, as trajectories leading to death were penalized during the training process), and… generating the dynamic treatment regime for a patient using the trained model and patient information…” (Komorowski, page 1716, paragraph 2: “We developed the AI Clinician, a computational model using reinforcement learning, which is able to dynamically suggest optimal treatments for adult patients with sepsis in the intensive care unit (ICU),” and page 1716, paragraph 5: “We deployed the AI Clinician to solve the MDP and predict outcomes of treatment strategies” and page 1719, paragraph 5: “We envision that this system would be used in real-time, with patient data obtained from different streams being fed into electronic health record software fitted with our algorithm, which would suggest a course of action”; the examiner notes that the AI Clinician corresponds to the “dynamic treatment regime”).

Komorowski does not appear to explicitly disclose the further limitations of the claim. However, Nguyen discloses an “adversarial discriminator” and a “cooperative discriminator” (Nguyen, Section 3: “Our intuition is based on GAN, but we formulate a three-player game that consists of two different discriminators D1 and D2, and one generator G. Given a sample x in data space, D1(x) rewards a high score if x is drawn from the data distribution P_data, and gives a low score if generated from the model distribution P_G. In contrast, D2(x) returns a high score for x generated from P_G whilst giving a low score for a sample drawn from P_data”; the examiner notes that discriminator D1(x) is “adversarial” to the generator because it gives a high score to the real data, and discriminator D2(x) is “cooperative” with the generator because it gives a high score to the generated data), and “iteratively training the adversarial discriminator, the cooperative discriminator, and… [a generator] using a three-party optimization” (Nguyen, Section 3: “More formally, D1, D2 and G now play the following three-player minimax optimization game: [see eq (1)]”). Nguyen and the instant application both relate to machine learning and are analogous.
It would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention, to have modified Komorowski with the teachings of Nguyen to include using an adversarial discriminator to train the model to generate trajectories that are similar to historical treatment trajectories that resulted in a positive outcome, using a cooperative discriminator to train the model to generate trajectories that are dissimilar to historical treatment trajectories that resulted in a negative outcome, and including iteratively training the adversarial discriminator, the cooperative discriminator, and a dynamic response regime using a three-party optimization, and one would have been motivated to do so for the purpose of avoiding mode collapse and efficiently scaling up to very large datasets (see Nguyen, Section 1, paragraph 5).

Neither Komorowski nor Nguyen appears to explicitly disclose the further limitations of the claim. However, Mannu discloses “treating … [a] patient in accordance with … [a] dynamic treatment regime, in a manner that is responsive to changing patient conditions, by triggering one or more medical devices to administer a treatment to the patient” (Mannu, [0036]: “The system may be further configured to determine a trained model from the application of machine learning techniques to historical data, wherein the trained model may be used to infer a change in the patient's condition”; [0042]: “The system may be configured to determine at least one rule from the trained model”; [0043]: “The rule may define a danger zone on at least one parameter, wherein the danger zone defines a threshold on the at least one parameter to indicate when a measurement for that parameter indicates deterioration in the patient”; [0044]: “The system may be configured to infer a change in the patient's condition responsive to a determination that the at least one parameter is indicating deterioration in the condition of the patient”; [0045]: “The system may, responsive to inferring a change in the patient's condition, be configured to initiate a response action designated by the rule”; [0050]: “The response action may be a request to a fluid bolus mechanism to increase or decrease the amount of fluid delivered to the patient”; [0052]: “The fluid bolus mechanism may, responsive to receiving the request, implement the request and delivers the requested amount or rate to the patient”; the examiner notes that the trained model corresponds to the dynamic treatment regime, and the fluid bolus mechanism delivering the requested amount of fluid to the patient corresponds to triggering a medical device to treat the patient in accordance with the dynamic treatment regime in a manner that is responsive to changing patient conditions). Mannu and the instant application both relate to machine learning and are analogous. It would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention to have modified the combination of Komorowski and Nguyen to include “treating the patient in accordance with the dynamic treatment regime, in a manner that is responsive to changing patient conditions, by triggering one or more medical devices to administer a treatment to the patient” as disclosed by Mannu, and one would have been motivated to do so for the purpose of improving patient outcomes by reducing human error and delay in providing the optimal treatment for a negative condition.

Regarding claim 5, the rejection of claim 1 is incorporated.
Mannu further discloses “wherein responding to changing patient conditions comprises automatically performing a responsive action to correct a negative condition” (Mannu, [0044]: “The system may be configured to infer a change in the patient's condition responsive to a determination that the at least one parameter is indicating deterioration in the condition of the patient”; [0045]: “The system may, responsive to inferring a change in the patient's condition, be configured to initiate a response action designated by the rule”; [0050]: “The response action may be a request to a fluid bolus mechanism to increase or decrease the amount of fluid delivered to the patient”; [0052]: “The fluid bolus mechanism may, responsive to receiving the request, implement the request and delivers the requested amount or rate to the patient”). Mannu and the instant application both relate to machine learning and are analogous. It would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention to have modified the combination of Komorowski and Nguyen to include automatically performing a responsive action to correct a negative condition, as disclosed by Mannu, and one would have been motivated to do so for the purpose of improving patient outcomes by reducing human error and delay in providing the optimal treatment for a negative condition.
Regarding claim 6, Komorowski discloses “A system for responding to changing conditions, comprising: a machine learning model, configured to generate a dynamic treatment regime for using patient information (Komorowski, page 1716, paragraph 2: “We developed the AI Clinician, a computational model using reinforcement learning, which is able to dynamically suggest optimal treatments for adult patients with sepsis in the intensive care unit (ICU)” and paragraph 4: “We deployed the AI Clinician to solve the MDP and predict outcomes of treatment strategies” and page 1719, paragraph 5: “We envision that this system would be used in real-time, with patient data obtained from different streams being fed into electronic health record software fitted with our algorithm, which would suggest a course of action”; the examiner notes that the MDP corresponds to “a machine learning model” and the AI Clinician corresponds to a “dynamic treatment regime”); a model trainer, configured to train the machine learning model, including trajectories that resulted in a positive health outcome and trajectories that resulted in a negative health outcome (Komorowski, page 1716, paragraph 5: “A Markov decision process (MDP) was used to model the patient environment and trajectories. The various elements of the model were defined using patient data time series from the training set (a random sample of 80% of MIMIC-III; Fig. 1)” and page 1716, paragraph 4: “In both datasets, we extracted a set of 48 variables, including demographics, Elixhauser premorbid status, vital signs, laboratory values, fluids and vasopressors received (Supplementary Table 2). Patients’ data were coded as multidimensional discrete time series with 4-h time steps, and for each patient, we included up to 72 h of measurements taken around the estimated time of onset of sepsis. The total volume of intravenous fluids and maximum dose of vasopressors administered over each 4-h period defined the medical treatments of interest. The model aims at optimizing patient mortality, so a reward was associated to survival and a penalty to death”; the examiner notes that the patient time series data that led to survival corresponds to “trajectories that resulted in a positive health outcome” and the patient time series data that led to death corresponds to “trajectories that resulted in a negative health outcome”), by using… [reinforcement learning] to train the machine learning model to generate trajectories that are similar to historical treatment trajectories that resulted in a positive health outcome (Komorowski, Methods, Building the computational model, paragraph 5: “The sequences of successive states and actions are referred to as patients’ trajectories. In our models, we used either hospital mortality or 90-d mortality as the sole defining factor for the system-defined penalty and reward. When a patient survived, a positive reward was released at the end of each patient’s trajectory (a ‘reward’ of +100)” and paragraph 6: “As such, the resulting AI policy suggests the best possible treatment among all the options chosen (relatively frequently) by clinicians”; the examiner notes that the patient treatment trajectories generated by the trained AI policy would be similar to historical treatment trajectories that resulted in survival, as trajectories leading to survival were rewarded during the training process), and by using… [reinforcement learning] to train the model to generate trajectories that are dissimilar to historical treatment trajectories that resulted in a negative health outcome…” (Komorowski, Methods, Building the computational model, paragraph 5: “…a negative reward (a ‘penalty’ of –100) was issued if the patient died” and paragraph 6: “As such, the resulting AI policy suggests the best possible treatment among all the options chosen (relatively frequently) by clinicians”; the examiner notes that the patient treatment trajectories generated by the trained AI policy would be dissimilar to historical treatment trajectories that resulted in death, as trajectories leading to death were penalized during the training process).

Komorowski does not appear to explicitly disclose the further limitations of the claim. However, Nguyen discloses an “adversarial discriminator” and a “cooperative discriminator” (Nguyen, Section 3: “Our intuition is based on GAN, but we formulate a three-player game that consists of two different discriminators D1 and D2, and one generator G. Given a sample x in data space, D1(x) rewards a high score if x is drawn from the data distribution P_data, and gives a low score if generated from the model distribution P_G. In contrast, D2(x) returns a high score for x generated from P_G whilst giving a low score for a sample drawn from P_data”; the examiner notes that discriminator D1(x) is “adversarial” to the generator because it gives a high score to the real data, and discriminator D2(x) is “cooperative” with the generator because it gives a high score to the generated data), and “iteratively training the adversarial discriminator, the cooperative discriminator, and… [a generator] using a three-party optimization” (Nguyen, Section 3: “More formally, D1, D2 and G now play the following three-player minimax optimization game: [see eq (1)]”). Nguyen and the instant application both relate to machine learning and are analogous. It would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention, to have modified Komorowski with the teachings of Nguyen to include using an adversarial discriminator to train the model to generate trajectories that are similar to historical treatment trajectories that resulted in a positive outcome, using a cooperative discriminator to train the model to generate trajectories that are dissimilar to historical treatment trajectories that resulted in a negative outcome, and to iteratively train the adversarial discriminator, the cooperative discriminator, and a dynamic treatment regime using a three-party optimization, and one would have been motivated to do so for the purpose of avoiding mode collapse and efficiently scaling up to very large datasets (see Nguyen, Section 1, paragraph 5).

Neither Komorowski nor Nguyen appears to explicitly disclose the further limitations of the claim.
However, Mannu discloses “a response interface, configured to trigger one or more medical devices to treat the patient in accordance with the dynamic treatment regime in a manner that is responsive to changing patient conditions” (Mannu, [0036]: “The system may be further configured to determine a trained model from the application of machine learning techniques to historical data, wherein the trained model may be used to infer a change in the patient's condition”; [0042]: “The system may be configured to determine at least one rule from the trained model”; [0043]: “The rule may define a danger zone on at least one parameter, wherein the danger zone defines a threshold on the at least one parameter to indicate when a measurement for that parameter indicates deterioration in the patient”; [0044]: “The system may be configured to infer a change in the patient's condition responsive to a determination that the at least one parameter is indicating deterioration in the condition of the patient”; [0045]: “The system may, responsive to inferring a change in the patient's condition, be configured to initiate a response action designated by the rule”; [0050]: “The response action may be a request to a fluid bolus mechanism to increase or decrease the amount of fluid delivered to the patient”; [0052]: “The fluid bolus mechanism may, responsive to receiving the request, implement the request and delivers the requested amount or rate to the patient”; [0217]: “The communication interface 912 may be built into the container 902 where corresponding electronics 914 may be situated to receive the request message and convert the information in the request message into an action which correspondingly adjusts the amount of fluid bolus delivered to the patient through the cannula 910 by controlling the amount of fluid bolus which is conveyed through syringe 908”; the examiner notes that the communication interface 912 corresponds to a response interface, the trained model corresponds to the dynamic treatment regime, and the fluid bolus mechanism delivering the requested amount of fluid to the patient corresponds to triggering a medical device to treat the patient in accordance with the dynamic treatment regime in a manner that is responsive to changing patient conditions). Mannu and the instant application both relate to machine learning and are analogous. It would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention to have modified the combination of Komorowski and Nguyen to include “a response interface, configured to trigger one or more medical devices to treat the patient in accordance with the dynamic treatment regime in a manner that is responsive to changing patient conditions” as disclosed by Mannu, and one would have been motivated to do so for the purpose of improving patient outcomes by reducing human error and delay in providing the optimal treatment for a negative condition.

Regarding claim 10, the rejection of claim 6 is incorporated. Claim 10 is a system claim corresponding to method claim 5, and the rejection follows the same rationale as that of claim 5 above.

Claims 2 and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Komorowski in view of Nguyen, Mannu, and Chatzimichail et al. (NPL: “Predicting Asthma Outcome Using Partial Least Square Regression and Artificial Neural Networks”) (“Chatzimichail”).

Regarding claim 2, the rejection of claim 1 is incorporated. Komorowski as modified by Nguyen and Mannu discloses “the adversarial discriminator,” “the cooperative discriminator” and “the dynamic treatment regime,” but does not appear to explicitly disclose the further limitations of the claim.
However, Chatzimichail discloses implementing an asthma outcome prediction model with a multiple-layer perceptron (Chatzimichail, Section 3.1: “The prediction algorithm which has been employed in this study consists of two stages: the feature reduction through partial least square regression and the classification stage by MLP [multilayer perceptron] and PNN classifiers”). Chatzimichail and the instant application both relate to machine learning and are analogous. It would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention, to have modified the combination of Komorowski, Nguyen, and Mannu with the teachings of Chatzimichail to implement the adversarial discriminator, the cooperative discriminator, and the dynamic treatment regime as multiple-layer perceptrons, and one would have been motivated to do so for the purpose of achieving accurate modeling of complex data patterns, and because MLPs have a clear architecture and simple algorithm compared to other types of neural networks (see Chatzimichail, Section 1).

Regarding claim 7, the rejection of claim 6 is incorporated. Komorowski as modified by Nguyen and Mannu discloses “the adversarial discriminator,” “the cooperative discriminator,” “the dynamic treatment regime,” and “the machine learning model” but does not appear to explicitly disclose the further limitations of the claim. However, Chatzimichail discloses implementing an asthma outcome prediction model with a multiple-layer perceptron (Chatzimichail, Section 3.1: “The prediction algorithm which has been employed in this study consists of two stages: the feature reduction through partial least square regression and the classification stage by MLP [multilayer perceptron] and PNN classifiers”). Chatzimichail and the instant application both relate to machine learning and are analogous.
It would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention, to have modified the combination of Komorowski, Nguyen, and Mannu with the teachings of Chatzimichail to implement the adversarial discriminator, the cooperative discriminator, and the dynamic response regime as multiple-layer perceptrons in the machine learning model, and one would have been motivated to do so for the purpose of achieving accurate modeling of complex data patterns, and because MLPs have a clear architecture and simple algorithm compared to other types of neural networks (see Chatzimichail, Section 1).

Claims 3-4 and 8-9 are rejected under 35 U.S.C. 103 as being unpatentable over Komorowski in view of Nguyen, Mannu, and Rampasek et al. (NPL: “Dr. VAE: Drug Response Variational Autoencoder”) (“Rampasek”).

Regarding claim 3, the rejection of claim 1 is incorporated. Komorowski, Nguyen, and Mannu do not appear to explicitly disclose the further limitations of the claim. However, Rampasek discloses “training an environment model that encodes… patient information as a vector in a latent space” (Rampasek, 2.1: “Perturbation Variational Autoencoder (PertVAE) is an unsupervised model for drug-induced gene expression perturbations, that embeds the data space (gene expression) in a lower dimensional latent space. In the latent space we model the drug-induced effect as a linear function, which is trained jointly with the embedding encoder and decoder. We fit PertVAE on “perturbation pairs” [x1; x2] of pre-treatment and post-treatment gene expression with shared stochastic embedding encoder q and decoder p. The original dimension of each vector x is 903 genes. Additionally we use unpaired pre-treatment data (with no know post-treatment state) to improve learning of the latent representation”; the examiner notes that “PertVAE” corresponds to an “environment model” and “pre-treatment and post-treatment gene expression” corresponds to “patient information”). Rampasek and the instant application both relate to machine learning and are analogous. It would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention, to have modified the combination of Komorowski, Nguyen, and Mannu with the teachings of Rampasek to include training an environment model that encodes the patient information as a vector in a latent space, and one would have been motivated to do so for the purpose of capturing the essence of the observed environment information that is most useful for prediction (see Rampasek, Section 5).

Regarding claim 4, the rejection of claim 3 is incorporated. Rampasek further discloses “wherein the environment model is implemented as a variational auto-encoder network” (Rampasek, 2.1: “Perturbation Variational Autoencoder (PertVAE) is an unsupervised model for drug-induced gene expression perturbations, that embeds the data space (gene expression) in a lower dimensional latent space”). Rampasek and the instant application both relate to machine learning and are analogous. It would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention, to have modified the combination of Komorowski, Nguyen and Mannu to include an environment model implemented as a variational auto-encoder network, as disclosed by Rampasek, and one would have been motivated to do so for the purpose of capturing the essence of the observed patient information that is most useful for prediction (see Rampasek, Section 5).

Regarding claim 8, the rejection of claim 6 is incorporated.
Claim 8 is a system claim corresponding to method claim 3 and is rejected using the same rationale as claim 3 above.

Regarding claim 9, the rejection of claim 8 is incorporated. Komorowski as modified by Nguyen and Mannu discloses “the machine learning model” but does not appear to explicitly disclose the further limitations of the claim. Rampasek further discloses “wherein the environment model is implemented as a variational auto-encoder network…” (Rampasek, 2.1: “Perturbation Variational Autoencoder (PertVAE) is an unsupervised model for drug-induced gene expression perturbations, that embeds the data space (gene expression) in a lower dimensional latent space”). Rampasek and the instant application both relate to machine learning and are analogous. It would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention, to have modified the combination of Komorowski, Nguyen and Mannu with the teachings of Rampasek to implement the environment model as a variational auto-encoder network in the machine learning model, and one would have been motivated to do so for the purpose of capturing the essence of the observed patient information that is most useful for prediction (see Rampasek, Section 5).

Double Patenting

The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed.
Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).

A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).

The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to final Office action, see 37 CFR 1.113(c). A request for reconsideration while not provided for in 37 CFR 1.113(c) may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.

The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used.
A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.

Claims 1 and 6 are rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claims 3 and 11 of U.S. Patent No. 11783189 (“reference”) in view of Mannu. Reference claims 3 and 11 recite all of the limitations of instant claims 1 and 6, respectively, except insofar as the instant claims recite a “dynamic treatment regime” generated “for a patient,” rather than a dynamic response regime, “patient information” and “patient conditions” rather than environment information and environment conditions, and “treating the patient… by triggering one or more medical devices to administer a treatment to the patient.” Mannu discloses a dynamic treatment regime for a patient, and treating the patient by triggering one or more medical devices to administer a treatment to the patient, responsive to changing patient conditions (Mannu, [0036]: “The system may be further configured to determine a trained model from the application of machine learning techniques to historical data, wherein the trained model may be used to infer a change in the patient's condition”; [0042]: “The system may be configured to determine at least one rule from the trained model”; [0043]: “The rule may define a danger zone on at least one parameter, wherein the danger zone defines a threshold on the at least one parameter to indicate when a measurement for that parameter indicates deterioration in the patient”; [0044]: “The system may be configured to infer a change in the patient's condition responsive to a determination that the at least one parameter is indicating deterioration in the condition of the patient”; [0045]: “The
system may, responsive to inferring a change in the patient's condition, be configured to initiate a response action designated by the rule”; [0050]: “The response action may be a request to a fluid bolus mechanism to increase or decrease the amount of fluid delivered to the patient”; [0052]: “The fluid bolus mechanism may, responsive to receiving the request, implement the request and delivers the requested amount or rate to the patient”). Mannu additionally discloses generating a dynamic treatment regime using patient information (Mannu, [0062]: “The trained model may be based on the demographic status of the patient or on the medical history of the patient”). It would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention, to have modified the reference claims such that the dynamic response regime is a dynamic treatment regime for a patient, the environment information and environment conditions are patient information and patient conditions, and to include treating the patient by triggering one or more medical devices to administer a treatment to the patient, as disclosed by Mannu, and one would have been motivated to do so for the purpose of improving patient outcomes by reducing human error and delay in providing the optimal treatment for a negative condition. A claim comparison chart is provided below.

Instant Application | US Patent No. 11783189

1.
A method for responding to changing conditions, comprising: training a model on historical treatment trajectories, using a processor, including trajectories that resulted in a positive health outcome and trajectories that resulted in a negative health outcome, by using an adversarial discriminator to train the model to generate trajectories that are similar to historical treatment trajectories that resulted in a positive health outcome, and by using a cooperative discriminator to train the model to generate trajectories that are dissimilar to historical treatment trajectories that resulted in a negative health outcome, and including iteratively training the adversarial discriminator, the cooperative discriminator, and a dynamic treatment regime using a three-party optimization; generating the dynamic treatment regime for a patient using the trained model and patient information; and treating the patient in accordance with the dynamic treatment regime, in a manner that is responsive to changing patient conditions, by triggering one or more medical devices to administer a treatment to the patient.

1.
A method for responding to changing conditions, comprising: training a model, using a processor, including trajectories that resulted in a positive outcome and trajectories that resulted in a negative outcome, by using an adversarial discriminator to train the model to generate trajectories that are similar to historical trajectories that resulted in a positive outcome, and by using a cooperative discriminator to train the model to generate trajectories that are dissimilar to historical trajectories that resulted in a negative outcome, and including iteratively training the adversarial discriminator, the cooperative discriminator, and the dynamic response regime using a three-party optimization; generating a dynamic response regime using the trained model and environment information; and responding to changing environment conditions in accordance with the dynamic response regime.

2. The method of claim 1, wherein the historical trajectories that resulted in a positive outcome and the historical trajectories that resulted in a negative outcome include patient treatment trajectories.

3. The method of claim 2, wherein the positive outcomes are positive patient health outcomes, and the negative outcomes are negative patient health outcomes.

6.
A system for responding to changing conditions, comprising: a machine learning model, configured to generate a dynamic treatment regime for using patient information; a model trainer, configured to train the machine learning model, including trajectories that resulted in a positive health outcome and trajectories that resulted in a negative health outcome, by using an adversarial discriminator to train the machine learning model to generate trajectories that are similar to historical treatment trajectories that resulted in a positive health outcome, and by using a cooperative discriminator to train the model to generate trajectories that are dissimilar to historical treatment trajectories that resulted in a negative health outcome, and to iteratively train the adversarial discriminator, the cooperative discriminator, and a dynamic treatment regime using a three-party optimization; and a response interface, configured to trigger one or more medical devices to treat the patient in accordance with the dynamic treatment regime in a manner that is responsive to changing patient conditions. 9. 
A system for responding to changing conditions, comprising: a machine learning model, configured to generate a dynamic response regime for using environment information; a model trainer, configured to train the machine learning model, including trajectories that resulted in a positive outcome and trajectories that resulted in a negative outcome, by using an adversarial discriminator to train the machine learning model to generate trajectories that are similar to historical trajectories that resulted in a positive outcome, and by using a cooperative discriminator to train the model to generate trajectories that are dissimilar to historical trajectories that resulted in a negative outcome, and to iteratively train the adversarial discriminator, the cooperative discriminator, and the dynamic response regime using a three-party optimization; and a response interface, configured to trigger a response to changing environment conditions in accordance with the dynamic response regime.

10. The system of claim 9, wherein the historical trajectories that resulted in a positive outcome and the historical trajectories that resulted in a negative outcome include patient treatment trajectories.

11. The system of claim 10, wherein the positive outcomes are positive patient health outcomes, and the negative outcomes are negative patient health outcomes.

Claims 1 and 6 are provisionally rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claims 3 and 11 of copending Application No. 18/362,193 (“reference”) in view of Mannu.
Reference claims 3 and 11 recite all of the limitations of instant claims 1 and 6, respectively, except insofar as the instant claims recite a “dynamic treatment regime” generated “for a patient,” rather than a dynamic response regime, “patient information” and “patient conditions” rather than environment information and environment conditions, and “treating the patient… by triggering one or more medical devices to administer a treatment to the patient.” Mannu discloses a dynamic treatment regime for a patient, and treating the patient by triggering one or more medical devices to administer a treatment to the patient, responsive to changing patient conditions (Mannu, [0036]: “The system may be further configured to determine a trained model from the application of machine learning techniques to historical data, wherein the trained model may be used to infer a change in the patient's condition”; [0042]: “The system may be configured to determine at least one rule from the trained model”; [0043]: “The rule may define a danger zone on at least one parameter, wherein the danger zone defines a threshold on the at least one parameter to indicate when a measurement for that parameter indicates deterioration in the patient”; [0044]: “The system may be configured to infer a change in the patient's condition responsive to a determination that the at least one parameter is indicating deterioration in the condition of the patient”; [0045]: “The system may, responsive to inferring a change in the patient's condition, be configured to initiate a response action designated by the rule”; [0050]: “The response action may be a request to a fluid bolus mechanism to increase or decrease the amount of fluid delivered to the patient”; [0052]: “The fluid bolus mechanism may, responsive to receiving the request, implement the request and delivers the requested amount or rate to the patient”).
Mannu additionally discloses generating a dynamic treatment regime using patient information (Mannu, [0062]: “The trained model may be based on the demographic status of the patient or on the medical history of the patient”). It would have been obvious to one of ordinary skill in the art, prior to the effective filing date of the claimed invention, to have modified the reference claims such that the dynamic response regime is a dynamic treatment regime for a patient, the environment information and environment conditions are patient information and patient conditions, and to include treating the patient by triggering one or more medical devices to administer a treatment to the patient, as disclosed by Mannu, and one would have been motivated to do so for the purpose of improving patient outcomes by reducing human error and delay in providing the optimal treatment for a negative condition. A claim comparison chart is provided below.

Instant Application | Reference Application 18/362,193

1.
A method for responding to changing conditions, comprising: training a model on historical treatment trajectories, using a processor, including trajectories that resulted in a positive health outcome and trajectories that resulted in a negative health outcome, by using an adversarial discriminator to train the model to generate trajectories that are similar to historical treatment trajectories that resulted in a positive health outcome, and by using a cooperative discriminator to train the model to generate trajectories that are dissimilar to historical treatment trajectories that resulted in a negative health outcome, and including iteratively training the adversarial discriminator, the cooperative discriminator, and a dynamic treatment regime using a three-party optimization; generating the dynamic treatment regime for a patient using the trained model and patient information; and treating the patient in accordance with the dynamic treatment regime, in a manner that is responsive to changing patient conditions, by triggering one or more medical devices to administer a treatment to the patient. 1. 
A method for responding to changing conditions, comprising: training a model, using a processor, including trajectories that resulted in a positive outcome and trajectories that resulted in a negative outcome, by using an adversarial discriminator to train the model to generate trajectories that are similar to historical trajectories that resulted in a positive outcome, and by using a cooperative discriminator to train the model to generate trajectories that are dissimilar to historical trajectories that resulted in a negative outcome, and including iteratively training the adversarial discriminator, the cooperative discriminator, and the dynamic response regime using a three-party optimization until a predetermined number of iterations has been reached; generating a dynamic response regime using the trained model and environment information; and responding to changing environment conditions in accordance with the dynamic response regime.

2. The method of claim 1, wherein the historical trajectories include patient treatment trajectories.

3. The method of claim 2, wherein the positive outcomes are positive patient health outcomes, and the negative outcomes are negative patient health outcomes.

6. A system for responding to changing conditions, comprising: a machine learning model, configured to generate a dynamic treatment regime for using patient information; a model trainer, configured to train the machine learning model, including trajectories that resulted in a positive health outcome and trajectories that re