Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 6/27/2022 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement has been considered by the examiner.
Status of Claims
Claims 1-20 are pending.
Claims 1-20 are rejected.
Drawings
The Drawings filed on 6/27/2022 were considered.
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119(a)-(d). The instant application, filed on 08/03/2022, claims the benefit of a foreign application filed on 6/29/2021. Therefore, the effective filing date of the instant application is 6/29/2021.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims recite: (a) mathematical concepts, (e.g., mathematical relationships, formulas or equations, mathematical calculations); and (b) mental processes, i.e., concepts performed in the human mind, (e.g., observation, evaluation, judgement, opinion).
Subject matter eligibility evaluation in accordance with MPEP 2106:
Eligibility Step 1: Claims 1-14 and 18-20 are directed to a method for estimating solubility. Claims 15-17 are directed towards a system for estimating solubility.
Independent claim 1 recites the following steps which fall within the mental processes and/or mathematical concepts groupings of abstract ideas:
generating at least one descriptor based on the input data (mental process, mathematical process)
obtaining at least one solubility parameter by providing the at least one descriptor to a machine learning model trained based on chemical structures and sample solubility parameters of sample materials (mathematical process)
calculating the solubility based on the at least one solubility parameter, wherein the at least one descriptor includes at least one of a zero-dimensional descriptor, a one-dimensional descriptor, a two-dimensional descriptor, or a three-dimensional descriptor, each representing the chemical structure of the target material (mathematical process)
Dependent claim 2 recites the following steps which fall within the mental processes and/or mathematical concepts groupings of abstract ideas:
wherein the zero-dimensional descriptor includes at least one of an atom count, a bond count, atomic charges, atom-centered fragment charges, a total positive charge, a total negative charge, a number of atomic positive charges, a number of atomic negative charges, electronegativity, and ionization potential (mathematical process)
This dependent claim is just limiting the parameters for mathematical calculations.
Dependent claim 3 recites the following steps which fall within the mental processes and/or mathematical concepts groupings of abstract ideas:
wherein the one-dimensional descriptor includes at least one of a fragment count, a hydrogen bond acceptor, a hydrogen bond donor, atom-centered fragment charges, and a number of disconnected fragments (mathematical process)
This dependent claim is just limiting the parameters for mathematical calculations.
Dependent claim 4 recites the following steps which fall within the mental processes and/or mathematical concepts groupings of abstract ideas:
wherein the two-dimensional descriptor includes at least one of graph invariants, a number of fragment positive charges, a number of fragment negative charges, topological charge indices corresponding to charge transfers between pairs of atoms, and connectivity indices (mathematical process)
This dependent claim is just limiting the parameters for mathematical calculations.
Dependent claim 5 recites the following steps which fall within the mental processes and/or mathematical concepts groupings of abstract ideas:
wherein the three-dimensional descriptor includes at least one of a size, a surface, and a volume of the target material (mathematical process)
This dependent claim is just limiting the parameters for mathematical calculations.
Dependent claim 6 recites the following steps which fall within the mental processes and/or mathematical concepts groupings of abstract ideas:
wherein the target material includes an ionic compound, and the at least one descriptor includes a descriptor including information about a charge of the ionic compound (mathematical process)
This dependent claim is just limiting the parameters for mathematical calculations.
Dependent claim 7 recites the following steps which fall within the mental processes and/or mathematical concepts groupings of abstract ideas:
wherein the input data corresponds to a string including a series of characters defining the chemical structure of the target material, and the at least one descriptor is constituted of at least one number (mathematical process)
This dependent claim is just limiting the parameters for mathematical calculations.
Dependent claim 8 recites the following steps which fall within the mental processes and/or mathematical concepts groupings of abstract ideas:
wherein the at least one solubility parameter includes a dispersion force parameter, a polar force parameter, and a hydrogen bond force parameter as Hansen solubility parameters (mathematical process)
This dependent claim is just limiting the parameters for mathematical calculations.
Dependent claim 9 recites the following steps which fall within the mental processes and/or mathematical concepts groupings of abstract ideas:
wherein the calculating of the solubility includes calculating solubility of a solute in a solvent based on at least one first solubility parameter corresponding to the solute and at least one second solubility parameter corresponding to the solvent (mathematical process)
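For illustration only, one common way to relate a solute's and a solvent's Hansen solubility parameters is the Hansen distance Ra, where a smaller distance suggests better compatibility. The sketch below assumes that formulation; the claim does not specify which calculation is used, and the function name and numeric values are hypothetical.

```python
import math

def hansen_distance(solute, solvent):
    """Hansen distance Ra between two (dispersion, polar, hydrogen-bond) triples."""
    d1, p1, h1 = solute
    d2, p2, h2 = solvent
    # Standard Hansen form: the dispersion term carries a factor of 4.
    return math.sqrt(4.0 * (d1 - d2) ** 2 + (p1 - p2) ** 2 + (h1 - h2) ** 2)

# Hypothetical parameter triples in MPa^0.5, not taken from any reference.
solute = (18.0, 5.0, 7.0)
solvent = (16.0, 8.0, 3.0)
print(round(hansen_distance(solute, solvent), 3))
```

The first triple plays the role of the "first solubility parameter" (solute) and the second triple the "second solubility parameter" (solvent).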
Dependent claim 10 recites the following steps which fall within the mental processes and/or mathematical concepts groupings of abstract ideas:
when the solute is a composite of at least two materials, the obtaining of the at least one solubility parameter includes calculating the at least one first solubility parameter based on a weighted sum of at least two solubility parameters respectively corresponding to the at least two materials, wherein a weight of the weighted sum corresponds to a proportion of a mass or a volume of each of the at least two materials in the composite (mathematical process)
This dependent claim is just limiting the parameters for mathematical calculations.
Dependent claim 11 recites the following steps which fall within the mental processes and/or mathematical concepts groupings of abstract ideas:
when the solvent is a mixture of at least two solvents (mathematical process)
the obtaining of the at least one solubility parameter includes calculating the at least one second solubility parameter based on a weighted sum of at least two solubility parameters respectively corresponding to the at least two solvents (mathematical process)
wherein a weight of the weighted sum corresponds to a proportion of a mass or a volume of each of the at least two solvents in the mixture (mathematical process)
This dependent claim is just limiting the parameters for mathematical calculations.
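The weighted sum recited in claims 10 and 11 can be sketched as follows. This is a minimal illustration, assuming each material is described by a three-component parameter triple and that the weights are the normalized mass or volume proportions; the function name and the numeric values are hypothetical and do not come from the application.

```python
def mix_parameters(components):
    """Weighted sum of parameter triples; weights are mass or volume proportions."""
    total = sum(amount for _, amount in components)
    if total <= 0:
        raise ValueError("total amount must be positive")
    mixed = [0.0, 0.0, 0.0]
    for params, amount in components:
        weight = amount / total  # proportion of this material in the mixture
        for i, value in enumerate(params):
            mixed[i] += weight * value
    return tuple(mixed)

# Hypothetical two-solvent mixture at a 1:3 volume ratio.
blend = mix_parameters([((16.0, 8.0, 4.0), 1.0), ((18.0, 2.0, 10.0), 3.0)])
print(blend)  # (17.5, 3.5, 8.5)
```

The same function applies to a composite solute (claim 10) and to a solvent mixture (claim 11), since both recite the same weighted-sum form.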
Dependent claim 12 recites the following steps which fall within the mental processes and/or mathematical concepts groupings of abstract ideas:
generating a plurality of sample descriptors based on the training data (mental process, mathematical process)
extracting at least one sample solubility parameter of the sample material from the training data (mathematical process)
training the machine learning model based on the plurality of sample descriptors and the at least one sample solubility parameter (mathematical process)
Dependent claim 13 recites the following steps which fall within the mental processes and/or mathematical concepts groupings of abstract ideas:
identifying importance levels of the plurality of sample descriptors based on the trained machine learning model (mental process, mathematical process)
setting a descriptor feature group based on the importance levels of the plurality of sample descriptors (mental process, mathematical process)
the at least one descriptor is included in the descriptor feature group (mental process, mathematical process)
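The selection recited in claim 13 can be pictured with a short sketch: given importance levels for each sample descriptor, keep the highest-ranked descriptors as the feature group. The ranking rule (top-k), the function name, and the importance values below are assumptions for illustration; the claim does not state how the group is set.

```python
def select_feature_group(importances, top_k):
    """Return the top_k descriptor names ranked by importance (highest first)."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:top_k]]

# Hypothetical importance levels, e.g. as reported by a trained tree ensemble.
importances = {
    "atom_count": 0.10,
    "molecular_volume": 0.35,
    "h_bond_donors": 0.05,
    "total_charge": 0.50,
}
print(select_feature_group(importances, 2))  # ['total_charge', 'molecular_volume']
```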
Dependent claim 14 recites the following steps which fall within the mental processes and/or mathematical concepts groupings of abstract ideas:
wherein the training of the machine learning model is based on regression learning using at least one of a random forest and a Gaussian process (mathematical process)
Independent claim 15 recites the following steps which fall within the mental processes and/or mathematical concepts groupings of abstract ideas:
generating at least one descriptor based on the input data (mathematical process)
obtaining at least one solubility parameter by providing the at least one descriptor to a machine learning model trained based on chemical structures and sample solubility parameters of sample materials (mathematical process)
calculating the solubility based on the at least one solubility parameter, wherein the at least one descriptor includes at least one of a zero-dimensional descriptor, a one-dimensional descriptor, a two-dimensional descriptor, or a three-dimensional descriptor, each representing the chemical structure of the target material (mathematical process)
Dependent claim 16 recites the following steps which fall within the mental processes and/or mathematical concepts groupings of abstract ideas:
wherein the target material includes an ionic compound, and the at least one descriptor includes a descriptor including information about a charge of the ionic compound (mathematical process)
Dependent claim 17 recites the following steps which fall within the mental processes and/or mathematical concepts groupings of abstract ideas:
wherein the input data corresponds to a string including a series of characters, and the at least one descriptor is constituted of at least one number (mathematical process)
This dependent claim is just limiting the parameters for mathematical calculations.
Independent claim 18 recites the following steps which fall within the mental processes and/or mathematical concepts groupings of abstract ideas:
generating a machine learning model trained to derive at least one solubility parameter from at least one descriptor defining a chemical structure of a material (mathematical process)
generating a plurality of sample descriptors based on the training data (mental process, mathematical process)
extracting at least one sample solubility parameter of the sample material from the training data (mathematical process)
training the machine learning model based on the plurality of sample descriptors and the at least one sample solubility parameter, wherein the plurality of sample descriptors include at least one of a zero-dimensional descriptor, a one-dimensional descriptor, a two-dimensional descriptor, or a three-dimensional descriptor, each representing a chemical structure of the sample material (mathematical process)
Dependent claim 19 recites the following steps which fall within the mental processes and/or mathematical concepts groupings of abstract ideas:
wherein the sample material includes an ionic compound, and the plurality of sample descriptors include a descriptor including information about a charge of the ionic compound (mathematical process)
This dependent claim is just limiting the parameters for mathematical calculations.
Dependent claim 20 recites the following steps which fall within the mental processes and/or mathematical concepts groupings of abstract ideas:
wherein the training of the machine learning model is based on regression learning using at least one of a random forest and a Gaussian process (mathematical process)
This dependent claim is just limiting the parameters for mathematical calculations.
The abstract ideas recited in the claims are evaluated under the broadest reasonable interpretation (BRI) of the claim limitations when read in light of and consistent with the specification. As noted in the foregoing section, the claims are determined to contain limitations that can practically be performed in the human mind with the aid of a pencil and paper, and therefore recite judicial exceptions from the mental process grouping of abstract ideas. Additionally, the recited limitations that are identified as judicial exceptions from the mathematical concepts grouping of abstract ideas are abstract ideas irrespective of whether or not the limitations are practical to perform in the human mind.
Therefore, claims 1-20 recite an abstract idea as the dependent claims will inherit the abstract ideas from the independent claims.
[Step 2A Prong One: YES]
Eligibility Step 2A Prong Two: In determining whether a claim is directed to a judicial exception, further examination is performed that analyzes whether the claim recites additional elements that, when examined as a whole, integrate the judicial exception(s) into a practical application (MPEP 2106.04(d)). A claim that integrates a judicial exception into a practical application will apply, rely on, or use the judicial exception in a manner that imposes a meaningful limit on the judicial exception. The claimed additional elements are analyzed to determine if the abstract idea is integrated into a practical application (MPEP 2106.04(d)(I); MPEP 2106.05(a-h)). If the claim contains no additional elements beyond the abstract idea, the claim fails to integrate the abstract idea into a practical application (MPEP 2106.04(d)(III)).
The judicial exceptions identified in Eligibility Step 2A Prong One are not integrated into a practical application because of the reasons noted below.
The additional element in independent claim 1 includes:
obtaining input data representing a chemical structure of a target material
The additional element in dependent claim 12 includes:
obtaining training data with respect to an attribute of a sample material
The additional element in independent claim 15 includes:
obtaining input data representing a chemical structure of a target material
The additional element in independent claim 18 includes:
obtaining training data with respect to an attribute of a sample material
The additional elements of obtaining input data representing a chemical structure of a target material (claims 1 and 15) and obtaining training data with respect to an attribute of a sample material (claims 12 and 18) are insignificant extra-solution activities that are part of the data gathering process used in the recited judicial exceptions (see MPEP 2106.05(g)).
Claims 2-11, 13, 14, 16, 17, 19, and 20 do not themselves recite any elements in addition to the judicial exception, and thus their limitations are part of the judicial exception.
Thus, the additionally recited elements merely invoke a computer as a tool, and/or amount to insignificant extra-solution data gathering activity, and as such, when all limitations in claims 1, 12, 15, 18 have been considered as a whole, the claims are deemed to not recite any additional elements that would integrate a judicial exception into a practical application, and therefore claims 1-20 are directed to an abstract idea (MPEP 2106.04(d)).
[Step 2A Prong Two: NO]
Eligibility Step 2B: Because the claims recite an abstract idea, and do not integrate that abstract idea into a practical application, the claims are probed for a specific inventive concept. The judicial exception alone cannot provide that inventive concept or practical application (MPEP 2106.05). Identifying whether the additional elements beyond the abstract idea amount to such an inventive concept requires considering the additional elements individually and in combination to determine if they amount to significantly more than the judicial exception (MPEP 2106.05A i-vi).
The claims do not include any additional elements that are sufficient to amount to significantly more than the judicial exception(s) because of the reasons noted below.
The additional elements recited in claims 1, 12, 15, 18 are identified above, and carried over from Step 2A: Prong Two along with their conclusions for analysis at Step 2B. Any additional element or combination of elements that was considered to be insignificant extra-solution activity at Step 2A: Prong Two was re-evaluated at Step 2B, because if such re-evaluation finds that the element is unconventional or otherwise more than what is well-understood, routine, conventional activity in the field, this finding may indicate that the additional element is no longer considered to be insignificant; and all additional elements and combination of elements were evaluated to determine whether any additional elements or combination of elements are other than what is well-understood, routine, conventional activity in the field, or simply append well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception, per MPEP 2106.05(d).
The additional elements of
obtaining input data representing a chemical structure of a target material (Claim 1);
obtaining training data with respect to an attribute of a sample material (Claim 12);
obtaining input data representing a chemical structure of a target material (Claim 15);
obtaining training data with respect to an attribute of a sample material (Claim 18);
are conventional and part of the data gathering process used in the recited judicial exceptions (see MPEP 2106.05(g)). Evidence for conventionality is shown by Boobier et al. (Boobier, S.; Hose, D. R. J.; Blacker, A. J.; Nguyen, B. N. Machine Learning with Physicochemical Relationships: Solubility Prediction in Organic Solvents and Water. Nature Communications 2020, 11 (1).), which obtains data regarding both attributes and chemical structure for machine learning for solubility; Sanchez-Lengeling et al. (Sanchez‐Lengeling, B.; Roch, L. M.; Perea, J. D.; Langner, S.; Brabec, C. J.; Aspuru‐Guzik, A. A Bayesian Approach to Predict Solubility Parameters. Advanced Theory and Simulations 2018, 2 (1), 1800069), which obtains data regarding both attributes and chemical structure for machine learning for solubility parameters; and Manzanilla-Granados et al. (Héctor Manuel Manzanilla-Granados; Saint-Martin, H.; Raúl Fuentes-Azcatl; Alejandre, J. Direct Coexistence Methods to Determine the Solubility of Salts in Water from Numerical Simulations. Test Case NaCl. 2015, 119 (26), 8389–8396.), which obtains data regarding both attributes and chemical structure for machine learning for solubility.
Claims 2-11, 13, 14, 16, 17, 19, and 20 do not themselves recite any elements in addition to the judicial exception.
Therefore, when taken alone, all additional elements in claims 1-20 do not amount to significantly more than the above-identified judicial exception(s). Even when evaluated as a combination, the additional elements fail to transform the exception(s) into a patent-eligible application of that exception. Thus, claims 1-20 are deemed to not contribute an inventive concept, i.e., amount to significantly more than the judicial exception(s) (MPEP 2106.05(II)).
[Step 2B: NO]
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-5, 7, 8, 12-15, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Boobier et al. (Boobier, S.; Hose, D. R. J.; Blacker, A. J.; Nguyen, B. N. Machine Learning with Physicochemical Relationships: Solubility Prediction in Organic Solvents and Water. Nature Communications 2020, 11 (1).) in view of Sanchez-Lengeling et al. (Sanchez‐Lengeling, B.; Roch, L. M.; Perea, J. D.; Langner, S.; Brabec, C. J.; Aspuru‐Guzik, A. A Bayesian Approach to Predict Solubility Parameters. Advanced Theory and Simulations 2018, 2 (1), 1800069).
Regarding the limitations of independent claim 1,
obtaining input data representing a chemical structure of a target material
Boobier et al. teaches that each compound was identified by its InChIKey and analyzed using SMILES code (pg. 9, col. 1, Methods, paragraph 1). Initial 3D coordinates were generated with CIRpy. Molecules were optimized in the gas phase with the B3LYP/6-31+G(d) method using Gaussian 09. The solution phase optimization was carried out with an implicit polarizable continuum solvent model (IEFPCM) or a solvation model based on electron density (SMD), pre-parametrized for each solvent (pg. 9, col. 1, Methods, paragraph 1).
generating at least one descriptor based on the input data
Boobier et al. teaches that, in order to develop interpretable predictive models for solubility in different solvents, a small set of molecular descriptors, which represent solute-solute and solute-solvent interactions, was selected. This small set of descriptors will also benefit the statistical robustness of the models given the relatively small size of the datasets. The descriptors include the sum of thermal and electronic energies of the solute molecule, solvation energy, orbital interaction between solute and solvent, dipole moment and charge distribution in the solute molecule, molecular volume, Solvent Accessible Surface Area, molecular weight, and the number of atoms of the solute (pg. 2, col. 2, paragraph 4).
calculating the solubility based on the at least one solubility parameter, wherein the at least one descriptor includes at least one of a zero-dimensional descriptor, a one-dimensional descriptor, a two-dimensional descriptor, or a three-dimensional descriptor, each representing the chemical structure of the target material.
Boobier et al. teaches that, in order to develop interpretable predictive models for solubility in different solvents, a small set of molecular descriptors, which represent solute-solute and solute-solvent interactions, was selected. This small set of descriptors will also benefit the statistical robustness of the models given the relatively small size of the datasets. The descriptors include the sum of thermal and electronic energies of the solute molecule, solvation energy, orbital interaction between solute and solvent, dipole moment and charge distribution in the solute molecule, molecular volume, Solvent Accessible Surface Area, molecular weight, and the number of atoms of the solute (pg. 2, col. 2, paragraph 4).
Regarding the limitations of dependent claim 2,
wherein the zero-dimensional descriptor includes at least one of an atom count, a bond count, atomic charges, atom-centered fragment charges, a total positive charge, a total negative charge, a number of atomic positive charges, a number of atomic negative charges, electronegativity, and ionization potential
Boobier et al. teaches that, in order to develop interpretable predictive models for solubility in different solvents, a small set of molecular descriptors, which represent solute-solute and solute-solvent interactions, was selected. This small set of descriptors will also benefit the statistical robustness of the models given the relatively small size of the datasets. The descriptors include the sum of thermal and electronic energies of the solute molecule, solvation energy, orbital interaction between solute and solvent, dipole moment and charge distribution in the solute molecule, molecular volume, Solvent Accessible Surface Area, molecular weight, and the number of atoms of the solute (pg. 2, col. 2, paragraph 4).
Regarding the limitations of dependent claim 3,
wherein the one-dimensional descriptor includes at least one of a fragment count, a hydrogen bond acceptor, a hydrogen bond donor, atom-centered fragment charges, and a number of disconnected fragments.
Boobier et al. teaches that, in order to develop interpretable predictive models for solubility in different solvents, a small set of molecular descriptors, which represent solute-solute and solute-solvent interactions, was selected. This small set of descriptors will also benefit the statistical robustness of the models given the relatively small size of the datasets. The descriptors include the sum of thermal and electronic energies of the solute molecule, solvation energy, orbital interaction between solute and solvent, dipole moment and charge distribution in the solute molecule, molecular volume, Solvent Accessible Surface Area, molecular weight, and the number of atoms of the solute (pg. 2, col. 2, paragraph 4).
Regarding the limitations of dependent claim 4,
wherein the two-dimensional descriptor includes at least one of graph invariants, a number of fragment positive charges, a number of fragment negative charges, topological charge indices corresponding to charge transfers between pairs of atoms, and connectivity indices.
Boobier et al. teaches that, in order to develop interpretable predictive models for solubility in different solvents, a small set of molecular descriptors, which represent solute-solute and solute-solvent interactions, was selected. This small set of descriptors will also benefit the statistical robustness of the models given the relatively small size of the datasets. The descriptors include the sum of thermal and electronic energies of the solute molecule, solvation energy, orbital interaction between solute and solvent, dipole moment and charge distribution in the solute molecule, molecular volume, Solvent Accessible Surface Area, molecular weight, and the number of atoms of the solute (pg. 2, col. 2, paragraph 4).
Regarding the limitations of dependent claim 5,
wherein the three-dimensional descriptor includes at least one of a size, a surface, and a volume of the target material.
Boobier et al. teaches that, in order to develop interpretable predictive models for solubility in different solvents, a small set of molecular descriptors, which represent solute-solute and solute-solvent interactions, was selected. This small set of descriptors will also benefit the statistical robustness of the models given the relatively small size of the datasets. The descriptors include the sum of thermal and electronic energies of the solute molecule, solvation energy, orbital interaction between solute and solvent, dipole moment and charge distribution in the solute molecule, molecular volume, Solvent Accessible Surface Area, molecular weight, and the number of atoms of the solute (pg. 2, col. 2, paragraph 4).
Regarding the limitations of dependent claim 7,
wherein the input data corresponds to a string including a series of characters defining the chemical structure of the target material, and the at least one descriptor is constituted of at least one number.
Boobier et al. teaches that each compound was identified by its InChIKey and analyzed using SMILES code (pg. 9, col. 1 Methods paragraph 1). A SMILES is a string that represents chemical structure.
Boobier et al. teaches that, in order to develop interpretable predictive models for solubility in different solvents, a small set of molecular descriptors, which represent solute-solute and solute-solvent interactions, was selected. This small set of descriptors will also benefit the statistical robustness of the models given the relatively small size of the datasets. The descriptors include the sum of thermal and electronic energies of the solute molecule, solvation energy, orbital interaction between solute and solvent, dipole moment and charge distribution in the solute molecule, molecular volume, Solvent Accessible Surface Area, molecular weight, and the number of atoms of the solute (pg. 2, col. 2, paragraph 4). The number of atoms is at least one number.
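As a rough illustration of how a character string defining a chemical structure can yield a numeric descriptor, the sketch below counts atom tokens in a SMILES string. It is not a full SMILES parser and is not the method of Boobier et al. or of the application; the regular expression and function name are assumptions, and a cheminformatics toolkit would normally be used instead.

```python
import re

# Bracket atoms, two-letter halogens, then one-letter aliphatic and aromatic atoms.
_ATOM = re.compile(r"\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]")

def count_atom_tokens(smiles):
    """Count atom tokens in a SMILES string.

    Implicit hydrogens are not counted; each bracket atom counts as one token.
    """
    return len(_ATOM.findall(smiles))

print(count_atom_tokens("CCO"))       # ethanol -> 3
print(count_atom_tokens("c1ccccc1"))  # benzene -> 6
```

The result is a single integer, which is the simplest case of a descriptor "constituted of at least one number".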
Regarding the limitations of dependent claim 12,
further comprising generating the trained machine learning model, wherein the generating of the trained machine learning model includes: obtaining training data with respect to an attribute of a sample material
Boobier et al. teaches that solubility data were collected from the Open Notebook Science Challenge aqueous solubility dataset and the Reaxys database. For this study, only solubility data of neutral solutes in single component solvents were collected (pg. 2, col. 1, Data Curation, 1st paragraph). The data were used to develop interpretable predictive models for solubility in different solvents, for which a small set of molecular descriptors, which represent solute-solute and solute-solvent interactions, was selected. This small set of descriptors will also benefit the statistical robustness of the models given the relatively small size of the datasets. The descriptors include the sum of thermal and electronic energies of the solute molecule, solvation energy, orbital interaction between solute and solvent, dipole moment and charge distribution in the solute molecule, molecular volume, Solvent Accessible Surface Area, molecular weight, and the number of atoms of the solute (pg. 2, col. 2, paragraph 4).
generating a plurality of sample descriptors based on the training data
Boobier et al. teaches that, in order to develop interpretable predictive models for solubility in different solvents, a small set of molecular descriptors representing solute-solute and solute-solvent interactions was selected. This small set of descriptors also benefits the statistical robustness of the models given the relatively small size of the datasets. Some of the descriptors cover the sum of thermal and electronic energies of the solute molecule, solvation energy, orbital interaction between solute and solvent, dipole moment and charge distribution in the solute molecule, molecular volume, solvent accessible surface area, molecular weight, and the number of atoms of the solute (pg. 2, col. 2, paragraph 4).
training the machine learning model based on the plurality of sample descriptors and the at least one sample solubility parameter.
Boobier et al. teaches that eight machine learning methods, i.e. MLR (Multiple Linear Regression), PLS (Partial Least Square), ANN (Artificial Neural Network), SVM (Support Vector Machine), GP (Gaussian Process), RF (Random Forest), ET (Extra Trees) and Bag (Bagging), were applied to all 5 datasets (pg. 3, col. 1 paragraph 2).
Regarding the limitations of dependent claim 13,
identifying importance levels of the plurality of sample descriptors based on the trained machine learning model
Boobier et al. teaches that feature importance plots of the 5 ET models showed very high dependence of the models for Ethanol_set, Benzene_set, and Acetone_set on melting point. The models for Water_set_wide and Water_set_narrow showed a more even distribution of importance across all the descriptors. In solvents other than benzene, MW, molar volume, SASA, and charges on heteroatoms, which are linked to solvent-solute interactions, were also given high importance. These analyses provided crucial insights into the factors controlling solubility in the four solvents in the study.
setting a descriptor feature group based on the importance levels of the plurality of sample descriptors, and the at least one descriptor is included in the descriptor feature group.
Boobier et al. teaches that, using an acceptable correlation threshold of R² ≤ 0.9, the descriptors N_atoms, E0_gas, E0_solv, DeltaE0_sol, G_gas, gas_dip, HOMO and LUMO were removed. Consequently, the trimmed-down set of 14 descriptors was taken forward for the solubility prediction models.
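For illustration only, the correlation-based trimming of descriptors described above can be sketched in Python as follows. The function names, the greedy keep-first ordering, and the toy descriptor values are assumptions made for this sketch and are not taken from Boobier et al.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / sqrt(var_x * var_y)


def prune_correlated(columns, r2_max=0.9):
    """Keep a descriptor only if its R^2 with every kept descriptor is <= r2_max.

    `columns` maps a descriptor name to its list of values (hypothetical layout).
    Descriptors are visited in insertion order, so earlier ones win ties.
    """
    kept = []
    for name in columns:
        if all(pearson_r(columns[name], columns[k]) ** 2 <= r2_max for k in kept):
            kept.append(name)
    return kept


# Toy values, not data from the reference.
toy = {
    "MW": [1.0, 2.0, 3.0, 4.0],
    "N_atoms": [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with MW
    "dipole": [1.0, 0.0, 1.0, 0.0],
}
print(prune_correlated(toy))
```

In this sketch the second descriptor is dropped because it is perfectly correlated with the first, while the third is retained.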
Regarding the limitations of dependent claim 14,
wherein the training of the machine learning model is based on regression learning using at least one of a random forest and a Gaussian process.
Boobier et al. teaches that eight machine learning methods, i.e. MLR (Multiple Linear Regression), PLS (Partial Least Square), ANN (Artificial Neural Network), SVM (Support Vector Machine), GP (Gaussian Process), RF (Random Forest), ET (Extra Trees) and Bag (Bagging), were applied to all 5 datasets (pg. 3, col. 1 paragraph 2).
Regarding the limitations of independent claim 15,
an operation of obtaining input data representing a chemical structure of a target material
Boobier et al. teaches that each compound was identified by its InChIKey and analyzed using SMILES code (pg. 9, col. 1 Methods paragraph 1). Initial 3D coordinates were generated with CIRpy. Molecules were optimized in gas phase with B3LYP/6-31 + G(d) method using Gaussian 09. The solution phase optimization was carried out with an implicit polarizable continuum solvent model (IEFPCM) or solvation model based on electron density (SMD), pre-parametrized for each solvent (pg. 9, col. 1 Methods paragraph 1).
an operation of generating at least one descriptor based on the input data
Boobier et al. teaches that, in order to develop interpretable predictive models for solubility in different solvents, a small set of molecular descriptors representing solute-solute and solute-solvent interactions was selected. This small set of descriptors also benefits the statistical robustness of the models given the relatively small size of the datasets. Some of the descriptors cover the sum of thermal and electronic energies of the solute molecule, solvation energy, orbital interaction between solute and solvent, dipole moment and charge distribution in the solute molecule, molecular volume, solvent accessible surface area, molecular weight, and the number of atoms of the solute (pg. 2, col. 2, paragraph 4).
an operation of calculating the solubility based on the at least one solubility parameter, wherein the at least one descriptor includes at least one of a zero-dimensional descriptor, a one-dimensional descriptor, a two-dimensional descriptor, or a three-dimensional descriptor, each representing the chemical structure of the target material.
Boobier et al. teaches that, in order to develop interpretable predictive models for solubility in different solvents, a small set of molecular descriptors representing solute-solute and solute-solvent interactions was selected. This small set of descriptors also benefits the statistical robustness of the models given the relatively small size of the datasets. Some of the descriptors cover the sum of thermal and electronic energies of the solute molecule, solvation energy, orbital interaction between solute and solvent, dipole moment and charge distribution in the solute molecule, molecular volume, solvent accessible surface area, molecular weight, and the number of atoms of the solute (pg. 2, col. 2, paragraph 4). Boobier et al. further teaches that eight machine learning methods, i.e. MLR (Multiple Linear Regression), PLS (Partial Least Square), ANN (Artificial Neural Network), SVM (Support Vector Machine), GP (Gaussian Process), RF (Random Forest), ET (Extra Trees) and Bag (Bagging), were applied to all 5 datasets (pg. 3, col. 1 paragraph 2).
Regarding the limitations of dependent claim 17,
wherein the input data corresponds to a string including a series of characters, and the at least one descriptor is constituted of at least one number.
Boobier et al. teaches that each compound was identified by its InChIKey and analyzed using SMILES code (pg. 9, col. 1 Methods paragraph 1). A SMILES is a string that represents chemical structure.
Boobier et al. also teaches that, in order to develop interpretable predictive models for solubility in different solvents, a small set of molecular descriptors representing solute-solute and solute-solvent interactions was selected. This small set of descriptors also benefits the statistical robustness of the models given the relatively small size of the datasets. Some of the descriptors cover the sum of thermal and electronic energies of the solute molecule, solvation energy, orbital interaction between solute and solvent, dipole moment and charge distribution in the solute molecule, molecular volume, solvent accessible surface area, molecular weight, and the number of atoms of the solute (pg. 2, col. 2, paragraph 4). The number of atoms of the solute is a descriptor constituted of at least one number.
Regarding the limitations of independent claim 18,
obtaining training data with respect to an attribute of a sample material
Boobier et al. teaches that solubility data were collected from the Open Notebook Science Challenge aqueous solubility dataset and the Reaxys database, and that only solubility data of neutral solutes in single-component solvents were collected (pg. 2, col. 1, Data Curation, 1st paragraph). The data were used to develop interpretable predictive models for solubility in different solvents, for which a small set of molecular descriptors representing solute-solute and solute-solvent interactions was selected. This small set of descriptors also benefits the statistical robustness of the models given the relatively small size of the datasets. Some of the descriptors cover the sum of thermal and electronic energies of the solute molecule, solvation energy, orbital interaction between solute and solvent, dipole moment and charge distribution in the solute molecule, molecular volume, solvent accessible surface area, molecular weight, and the number of atoms of the solute (pg. 2, col. 2, paragraph 4).
generating a plurality of sample descriptors based on the training data
Boobier et al. teaches that, in order to develop interpretable predictive models for solubility in different solvents, a small set of molecular descriptors representing solute-solute and solute-solvent interactions was selected. This small set of descriptors also benefits the statistical robustness of the models given the relatively small size of the datasets. Some of the descriptors cover the sum of thermal and electronic energies of the solute molecule, solvation energy, orbital interaction between solute and solvent, dipole moment and charge distribution in the solute molecule, molecular volume, solvent accessible surface area, molecular weight, and the number of atoms of the solute (pg. 2, col. 2, paragraph 4).
training the machine learning model based on the plurality of sample descriptors and the at least one sample solubility parameter, wherein the plurality of sample descriptors include at least one of a zero-dimensional descriptor, a one-dimensional descriptor, a two-dimensional descriptor, or a three-dimensional descriptor, each representing a chemical structure of the sample material.
Boobier et al. teaches that eight machine learning methods, i.e. MLR (Multiple Linear Regression), PLS (Partial Least Square), ANN (Artificial Neural Network), SVM (Support Vector Machine), GP (Gaussian Process), RF (Random Forest), ET (Extra Trees) and Bag (Bagging), were applied to all 5 datasets (pg. 3, col. 1 paragraph 2). Boobier et al. further teaches that, in order to develop interpretable predictive models for solubility in different solvents, a small set of molecular descriptors representing solute-solute and solute-solvent interactions was selected. This small set of descriptors also benefits the statistical robustness of the models given the relatively small size of the datasets. Some of the descriptors cover the sum of thermal and electronic energies of the solute molecule, solvation energy, orbital interaction between solute and solvent, dipole moment and charge distribution in the solute molecule, molecular volume, solvent accessible surface area, molecular weight, and the number of atoms of the solute (pg. 2, col. 2, paragraph 4).
Regarding the limitations of dependent claim 20,
wherein the training of the machine learning model is based on regression learning using at least one of a random forest and a Gaussian process.
Boobier et al. teaches that eight machine learning methods, i.e. MLR (Multiple Linear Regression), PLS (Partial Least Square), ANN (Artificial Neural Network), SVM (Support Vector Machine), GP (Gaussian Process), RF (Random Forest), ET (Extra Trees) and Bag (Bagging), were applied to all 5 datasets (pg. 3, col. 1 paragraph 2).
Boobier et al. does not explicitly teach:
obtaining at least one solubility parameter by providing the at least one descriptor to a machine learning model trained based on chemical structures and sample solubility parameters of sample materials (Claim 1)
wherein the at least one solubility parameter includes a dispersion force parameter, a polar force parameter, and a hydrogen bond force parameter as Hansen solubility parameters (Claim 8)
extracting at least one sample solubility parameter of the sample material from the training data (Claim 12)
an operation of obtaining at least one solubility parameter by providing the at least one descriptor to a machine learning model trained based on chemical structures and sample solubility parameters of sample materials (Claim 15)
extracting at least one sample solubility parameter of the sample material from the training data (Claim 18)
Regarding the limitations of independent claim 1,
obtaining at least one solubility parameter by providing the at least one descriptor to a machine learning model trained based on chemical structures and sample solubility parameters of sample materials
Sanchez-Lengeling et al. teaches a machine learning model named gpHSP, a probabilistic and interpretable predictive model for Hansen solubility parameters. The model is trained on experimental and theoretical data and is validated with regression metrics. The work demonstrates higher predictive power than several baseline models and can leverage several types of information. When only topological information is used, prediction takes less than a second, significantly reducing the timeline of trial-and-error approaches to material synthesis and device fabrication of organic blends. Additionally, the model avoids over-fitting and requires a low number of hyperparameters (pg. 8, col. 2, Conclusion paragraph 1).
Regarding the limitations of dependent claim 8,
wherein the at least one solubility parameter includes a dispersion force parameter, a polar force parameter, and a hydrogen bond force parameter as Hansen solubility parameters
Sanchez-Lengeling et al. teaches a machine learning model named gpHSP, a probabilistic and interpretable predictive model for Hansen solubility parameters. The model is trained on experimental and theoretical data and is validated with regression metrics. The work demonstrates higher predictive power than several baseline models and can leverage several types of information. When only topological information is used, prediction takes less than a second, significantly reducing the timeline of trial-and-error approaches to material synthesis and device fabrication of organic blends. Additionally, the model avoids over-fitting and requires a low number of hyperparameters (pg. 8, col. 2, Conclusion paragraph 1). Figure 2 also shows the Hansen solubility parameters, namely a dispersion force parameter, a polar force parameter, and a hydrogen bond force parameter, being calculated (Figure 2, pg. 3).
Regarding the limitations of dependent claim 12,
extracting at least one sample solubility parameter of the sample material from the training data
Sanchez-Lengeling et al. teaches a machine learning model named gpHSP, a probabilistic and interpretable predictive model for Hansen solubility parameters. The model is trained on experimental and theoretical data and is validated with regression metrics. The work demonstrates higher predictive power than several baseline models and can leverage several types of information. When only topological information is used, prediction takes less than a second, significantly reducing the timeline of trial-and-error approaches to material synthesis and device fabrication of organic blends. Additionally, the model avoids over-fitting and requires a low number of hyperparameters (pg. 8, col. 2, Conclusion paragraph 1). Figure 2 also shows the Hansen solubility parameters, namely a dispersion force parameter, a polar force parameter, and a hydrogen bond force parameter, being calculated (Figure 2, pg. 3).
Regarding the limitations of independent claim 15,
an operation of obtaining at least one solubility parameter by providing the at least one descriptor to a machine learning model trained based on chemical structures and sample solubility parameters of sample materials
Sanchez-Lengeling et al. teaches a machine learning model named gpHSP, a probabilistic and interpretable predictive model for Hansen solubility parameters. The model is trained on experimental and theoretical data and is validated with regression metrics. The work demonstrates higher predictive power than several baseline models and can leverage several types of information. When only topological information is used, prediction takes less than a second, significantly reducing the timeline of trial-and-error approaches to material synthesis and device fabrication of organic blends. Additionally, the model avoids over-fitting and requires a low number of hyperparameters (pg. 8, col. 2, Conclusion paragraph 1).
Regarding the limitations of independent claim 18,
extracting at least one sample solubility parameter of the sample material from the training data
Sanchez-Lengeling et al. teaches a machine learning model named gpHSP, a probabilistic and interpretable predictive model for Hansen solubility parameters. The model is trained on experimental and theoretical data and is validated with regression metrics. The work demonstrates higher predictive power than several baseline models and can leverage several types of information. When only topological information is used, prediction takes less than a second, significantly reducing the timeline of trial-and-error approaches to material synthesis and device fabrication of organic blends. Additionally, the model avoids over-fitting and requires a low number of hyperparameters (pg. 8, col. 2, Conclusion paragraph 1).
It would be obvious to combine the method of determining solubility of Boobier et al. with the solubility parameters calculated by Sanchez-Lengeling et al. because both works are in the same field of endeavor and Hansen solubility parameters are commonly used to determine the solubility of a sample. Therefore, a person having ordinary skill in the art would be motivated to add the calculation of Hansen solubility parameters to the machine learning model of Boobier et al. There is a reasonable expectation of success because adding an additional parameter for solubility prediction does not change the method of Boobier et al.; it merely adds additional descriptors for the machine learning model.
Claims 6, 16, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Boobier et al. in view of Sanchez-Lengeling et al. as applied to Claims 1-5, 7, 8, 12-15, 18, 20 above, and further in view of Manzanilla-Granados et al. (Manzanilla-Granados, H. M.; Saint-Martin, H.; Fuentes-Azcatl, R.; Alejandre, J. Direct Coexistence Methods to Determine the Solubility of Salts in Water from Numerical Simulations. Test Case NaCl. 2015, 119 (26), 8389–8396.)
As applied to Claims 1-5, 7, 8, 12-15, 18, 20 (detailed above), Boobier et al. in view of Sanchez-Lengeling et al. teaches a system and method for estimating solubility.
Boobier et al. in view of Sanchez-Lengeling et al. does not explicitly teach:
the at least one descriptor includes a descriptor including information about a charge of the ionic compound (Claim 6).
the at least one descriptor includes a descriptor including information about a charge of the ionic compound (Claim 16).
the at least one descriptor includes a descriptor including information about a charge of the ionic compound (Claim 19).
Regarding the limitations of dependent claim 6,
the at least one descriptor includes a descriptor including information about a charge of the ionic compound.
Manzanilla-Granados et al. teaches an appraisal of direct coexistence methods to determine the solubility of NaCl in water from MD simulations. The results show that it is possible to use relatively small systems to study salt solubility in water, but that a large simulation time, several microseconds, are required to attain equilibrium, and that the best starting configuration to do it is a nanocrystal immersed in supersaturated solution (pg. 8395, col. 1, Conclusion paragraph 1)
Regarding the limitations of dependent claim 16,
the at least one descriptor includes a descriptor including information about a charge of the ionic compound.
Manzanilla-Granados et al. teaches an appraisal of direct coexistence methods to determine the solubility of NaCl in water from MD simulations. The results show that it is possible to use relatively small systems to study salt solubility in water, but that a large simulation time, several microseconds, are required to attain equilibrium, and that the best starting configuration to do it is a nanocrystal immersed in supersaturated solution (pg. 8395, col. 1, Conclusion paragraph 1).
Regarding the limitations of dependent claim 19,
the at least one descriptor includes a descriptor including information about a charge of the ionic compound.
Manzanilla-Granados et al. teaches an appraisal of direct coexistence methods to determine the solubility of NaCl in water from MD simulations. The results show that it is possible to use relatively small systems to study salt solubility in water, but that a large simulation time, several microseconds, are required to attain equilibrium, and that the best starting configuration to do it is a nanocrystal immersed in supersaturated solution (pg. 8395, col. 1, Conclusion paragraph 1).
It would be obvious to modify the method of determining solubility of Boobier et al. in view of Sanchez-Lengeling et al. in order to predict the solubility of ionic compounds taught by Manzanilla-Granados et al. because the works are in the same field of endeavor and the only difference is that the solubility of an ionic compound is being predicted. A person having ordinary skill in the art would be motivated to adjust the method to optimize for the ionic compounds taught by Manzanilla-Granados et al. There is a reasonable expectation of success because the method of determining solubility of Boobier et al. in view of Sanchez-Lengeling et al. is being modified with additional descriptors for the machine learning model taught by Manzanilla-Granados et al.
Claims 9 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Boobier et al. in view of Sanchez-Lengeling et al. as applied to Claims 1-5, 7, 8, 12-15, 18, 20 above, and further in view of Shakeel et al. (Shakeel, F.; Haq, N.; Alsarra, I. A.; Alshehri, S. Solubility, Hansen Solubility Parameters and Thermodynamic Behavior of Emtricitabine in Various (Polyethylene Glycol-400 + Water) Mixtures: Computational Modeling and Thermodynamics. Molecules 2020, 25 (7), 1559.)
As applied to Claims 1-5, 7, 8, 12-15, 18, 20 (detailed above), Boobier et al. in view of Sanchez-Lengeling et al. teaches a system and method for estimating solubility.
Boobier et al. in view of Sanchez-Lengeling et al. does not explicitly teach:
wherein the calculating of the solubility includes calculating solubility of a solute in a solvent based on at least one first solubility parameter corresponding to the solute and at least one second solubility parameter corresponding to the solvent. (Claim 9)
wherein, when the solute is a composite of at least two materials, the obtaining of the at least one solubility parameter includes calculating the at least one first solubility parameter based on a weighted sum of at least two solubility parameters respectively corresponding to the at least two materials, wherein a weight of the weighted sum corresponds to a proportion of a mass or a volume of each of the at least two materials in the composite. (Claim 10)
wherein, when the solvent is a mixture of at least two solvents, the obtaining of the at least one solubility parameter includes calculating the at least one second solubility parameter based on a weighted sum of at least two solubility parameters respectively corresponding to the at least two solvents, wherein a weight of the weighted sum corresponds to a proportion of a mass or a volume of each of the at least two solvents in the mixture. (Claim 11)
Regarding the limitations of dependent claim 9,
wherein the calculating of the solubility includes calculating solubility of a solute in a solvent based on at least one first solubility parameter corresponding to the solute and at least one second solubility parameter corresponding to the solvent.
Shakeel et al. teaches that the HSP of a solute is associated with its solubility in a pure solvent or in cosolvent–water mixtures. If the HSP of the solute is close to that of the pure solvent or cosolvent–water mixture, the solubility of the solute will be higher in that pure solvent or cosolvent–water mixture (pg. 11, paragraph 2).
Shakeel et al. teaches the RMSD values for ECT in various “PEG-400 + water” mixtures and pure solvents were estimated as 0.19% to 0.74% with an overall RMSD of 0.46%. (pg 8, paragraph 2)
Shakeel et al. teaches calculating HSP parameters for solvent and solute and determining solubility based on how close these values are.
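As a hedged illustration of how the closeness of solute and solvent HSP values might be quantified, the following Python sketch uses the standard Hansen distance Ra, where Ra² = 4(δD1 − δD2)² + (δP1 − δP2)² + (δH1 − δH2)². Whether Shakeel et al. uses this exact measure is not stated in the passage cited above; the function names and the numerical values are placeholders assumed for the example.

```python
from math import sqrt


def hansen_distance(solute, solvent):
    """Hansen distance Ra between two (dispersion, polar, hydrogen-bond) triples."""
    d1, p1, h1 = solute
    d2, p2, h2 = solvent
    return sqrt(4.0 * (d1 - d2) ** 2 + (p1 - p2) ** 2 + (h1 - h2) ** 2)


def rank_solvents(solute, solvents):
    """Return solvent names ordered from closest to farthest in HSP space.

    `solvents` maps a name to its HSP triple (hypothetical layout). A smaller
    distance is read here as an indication of higher expected solubility.
    """
    return sorted(solvents, key=lambda name: hansen_distance(solute, solvents[name]))


# Placeholder values for illustration only, not data from the reference.
solute_hsp = (18.0, 10.0, 8.0)
candidates = {
    "solvent_A": (18.0, 10.0, 9.0),
    "solvent_B": (15.0, 16.0, 40.0),
}
print(rank_solvents(solute_hsp, candidates))
```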
Regarding the limitations of dependent claim 11,
wherein, when the solvent is a mixture of at least two solvents, the obtaining of the at least one solubility parameter includes calculating the at least one second solubility parameter based on a weighted sum of at least two solubility parameters respectively corresponding to the at least two solvents, wherein a weight of the weighted sum corresponds to a proportion of a mass or a volume of each of the at least two solvents in the mixture
Shakeel et al. teaches the following equation: δmix = αδ1 + (1 − α)δ2, where α = volume fraction of PEG-400 in “PEG-400 + water” mixtures; δ1 = HSP of pure PEG-400; and δ2 = HSP of pure water (Equation 4, pg. 11).
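The volume-fraction weighted sum in the equation above can be illustrated with a short Python sketch. The function name and the input values are assumptions chosen for the example and are not values reported by Shakeel et al.

```python
def mixture_hsp(alpha, delta_1, delta_2):
    """Solubility parameter of a binary mixture by volume-fraction weighting.

    alpha   : volume fraction of component 1 (between 0 and 1)
    delta_1 : solubility parameter of pure component 1
    delta_2 : solubility parameter of pure component 2
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be a volume fraction between 0 and 1")
    return alpha * delta_1 + (1.0 - alpha) * delta_2


# Illustrative placeholder parameters for two pure solvents.
for alpha in (0.0, 0.5, 1.0):
    print(alpha, mixture_hsp(alpha, 20.0, 40.0))
```

At α = 0 the sketch returns the parameter of the second pure solvent, and at α = 1 it returns that of the first, matching the limiting cases of the equation.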
It would be obvious to modify the method of determining solubility of Boobier et al. in view of Sanchez-Lengeling et al. with the solubility parameters calculated by Shakeel et al. because the works are in the same field of endeavor and Hansen solubility parameters are commonly used to determine the solubility of a sample. Therefore, a person having ordinary skill in the art would be motivated to add the calculation of Hansen solubility parameters to the machine learning model of Boobier et al. in view of Sanchez-Lengeling et al. There is a reasonable expectation of success because adding an additional parameter for solubility prediction does not change the method of Boobier et al. in view of Sanchez-Lengeling et al.; it merely adds additional descriptors for the machine learning model from Shakeel et al.
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Boobier et al. in view of Sanchez-Lengeling et al. as applied to Claims 1-5, 7, 8, 12-15, 18, 20 above, and further in view of DiPaola-Baranyi et al. (DiPaola-Baranyi, G. Estimation of Polymer Solubility Parameters by Inverse Gas Chromatography. Macromolecules 1982, 15 (2), 622–624.)
As applied to Claims 1-5, 7, 8, 12-15, 18, 20 (detailed above), Boobier et al. in view of Sanchez-Lengeling et al. teaches a system and method for estimating solubility.
Boobier et al. in view of Sanchez-Lengeling et al. does not explicitly teach:
wherein, when the solute is a composite of at least two materials, the obtaining of the at least one solubility parameter includes calculating the at least one first solubility parameter based on a weighted sum of at least two solubility parameters respectively corresponding to the at least two materials, wherein a weight of the weighted sum corresponds to a proportion of a mass or a volume of each of the at least two materials in the composite. (Claim 10)
Regarding the limitations of dependent claim 10,
wherein, when the solute is a composite of at least two materials, the obtaining of the at least one solubility parameter includes calculating the at least one first solubility parameter based on a weighted sum of at least two solubility parameters respectively corresponding to the at least two materials, wherein a weight of the weighted sum corresponds to a proportion of a mass or a volume of each of the at least two materials in the composite.
DiPaola-Baranyi et al. teaches the solubility parameter determined for each copolymer is similar to that obtained by a linear interpolation (in weight fraction) of the corresponding homopolymer values (pg. 623, col. 1, paragraph 4)
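A linear interpolation in weight fraction of the kind described above can be sketched in Python as follows. The function name, the data layout, and the numbers are hypothetical and serve only to show the arithmetic; they are not taken from DiPaola-Baranyi et al.

```python
def composite_parameters(components):
    """Weight-fraction weighted sum of solubility parameters.

    `components` is a list of (mass, parameters) pairs, where `parameters` is a
    tuple of one or more solubility parameters for that material. Masses are
    normalised to weight fractions before the weighted sum is taken.
    """
    total_mass = sum(mass for mass, _ in components)
    if total_mass <= 0:
        raise ValueError("total mass must be positive")
    n_params = len(components[0][1])
    mixed = [0.0] * n_params
    for mass, params in components:
        weight = mass / total_mass  # weight fraction of this material
        for i, value in enumerate(params):
            mixed[i] += weight * value
    return tuple(mixed)


# Placeholder composite: 1 part of material X, 3 parts of material Y.
print(composite_parameters([(1.0, (10.0, 0.0, 2.0)), (3.0, (20.0, 4.0, 6.0))]))
```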
It would be obvious to modify the method of determining solubility of Boobier et al. in view of Sanchez-Lengeling et al. with the solubility parameters calculated by DiPaola-Baranyi et al. because the works are in the same field of endeavor and Hansen solubility parameters are commonly used to determine the solubility of a sample. Therefore, a person having ordinary skill in the art would be motivated to add the calculation of Hansen solubility parameters to the machine learning model of Boobier et al. in view of Sanchez-Lengeling et al. There is a reasonable expectation of success because adding an additional parameter for solubility prediction does not change the method of Boobier et al. in view of Sanchez-Lengeling et al.; it merely adds additional descriptors for the machine learning model from DiPaola-Baranyi et al.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Connor Beveridge whose telephone number is 571-272-2099. The examiner can normally be reached Monday - Thursday 9 am - 5 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Karlheinz Skowronek, can be reached at 571-272-9047. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/C.H.B./Examiner, Art Unit 1687
/Karlheinz R. Skowronek/Supervisory Patent Examiner, Art Unit 1687