DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119 (a)-(d). The certified copy has been filed in parent Application No. 202121016622, filed on 04/08/2021.
Status of Claims
Claims 1-18 are currently pending and examined on the merits.
Information Disclosure Statement
The information disclosure statements filed 03/10/2022 are acknowledged. A signed copy of the corresponding 1449 form has been included with this Office action.
Specification
The title of the invention is not descriptive. A new title is required that is clearly indicative of the invention to which the claims are directed. The examiner suggests: “SYSTEMS AND METHODS FOR COMPUTATIONAL DESIGN OF CRYSTAL STRUCTURES AND FUNCTIONAL MATERIALS”.
Applicant is reminded of the proper language and format for an abstract of the disclosure.
The abstract should be in narrative form and generally limited to a single paragraph on a separate sheet within the range of 50 to 150 words in length. The abstract should describe the disclosure sufficiently to assist readers in deciding whether there is a need for consulting the full patent text for details.
The language should be clear and concise and should not repeat information given in the title. It should avoid using phrases which can be implied, such as, “The disclosure concerns,” “The disclosure defined by this invention,” “The disclosure describes,” etc. In addition, the form and legal phraseology often used in patent claims, such as “means” and “said,” should be avoided.
Claim Objections
Claims 1 and 13 are objected to because of the following informalities:
The phrase “passing. via” contains a typographical error: a period appears where a comma should be. Appropriate correction is required.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 2, 4, 7, 8, 10, 13, 14, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Court et al. (Journal of Chemical Information and Modeling 2020 60 (10), 4518-4535) in view of Zekun Ren et al. (arXiv preprint, pages 1-9 (Year: 2020, version 1)).
The instant claims are drawn to a method and apparatus for:
1) obtaining the crystal structure data from a given training data set from a plurality of materials using a processor,
2) converting the crystal structure data into a three-dimensional (“3D” or “3-D”) crystal structure model comprised of a cell and basis image using gaussian functions,
3) creating a 3D elements matrix representing the location of one or more elements in the 3D basis image for each material,
4) training a basis autoencoder using the 3D basis image of each material and obtaining a set of reconstructed basis images,
5) training a segmentation network using reconstructed basis images to identify the location and types of a set of elements as atomic clusters,
6) ensuring the segmentation network is trained by using a species matrix for each material set as ground truth (the species matrix determined using the 3D elements matrix for the material),
7) training a cell autoencoder using the 3D cell image of each material and obtaining a set of reconstructed cell images,
8) training a generative model to obtain a continuous latent space,
9) sampling the continuous latent space of the generative model to obtain a set of cell encodings and a set of basis encodings for new materials (depending on model query),
10) sampling the latent space using random sampling and interpolating between latent vectors of one or more materials from amongst the plurality of known materials using one of a spherical and linear interpolation (SLERP) techniques,
11) passing the set of cell encodings through the cell autoencoder to obtain a set of sampled cell images and the set of basis encodings through the basis autoencoder to obtain a set of sampled basis images,
12) inverting the set of sampled cell images to obtain a set of lattice vectors for the one or more new materials, and
13) passing the set of sampled basis images through a segmentation network to obtain a set of atomic clusters indicative of atomic positions and elements types at said positions (in which the coordinates of the atoms are combined with the set of lattice vectors to constitute a crystal structure of one or more materials).
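For illustration only, the final combination in step 13 (atomic coordinates combined with lattice vectors to constitute a crystal structure) can be sketched as a fractional-to-Cartesian conversion. The function name `frac_to_cart`, the row-vector lattice convention, and the example cell are assumptions made for this sketch and are not taken from the claims or the cited references.

```python
def frac_to_cart(lattice, frac_coords):
    """Combine fractional atomic coordinates with lattice vectors.

    lattice: three row vectors (a, b, c), each a 3-tuple in angstroms.
    frac_coords: iterable of (u, v, w) fractional coordinates.
    Returns a list of Cartesian 3-tuples r = u*a + v*b + w*c.
    """
    cart = []
    for u, v, w in frac_coords:
        cart.append(tuple(
            u * lattice[0][k] + v * lattice[1][k] + w * lattice[2][k]
            for k in range(3)
        ))
    return cart


# Hypothetical cubic cell with a 4 angstrom edge and two sites.
lattice = [(4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 4.0)]
positions = frac_to_cart(lattice, [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)])
```

In this sketch the lattice vectors would come from inverting the sampled cell image and the fractional coordinates from the segmented atomic clusters; how each is actually obtained is outside the scope of the example.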
Court et al. is drawn to a computationally-based crystal structure prediction pipeline, Inorganic Crystal Structure Generation in 3D (termed ICSG3D in the Title, pg. 4518, and Notes section, pg. 4533, ln. 4-5), composed of 1) a Conditional Deep Feature Consistent Variational Autoencoder (Cond-DFC-VAE), 2) a 3D Unet segmentation architecture, and 3) a Crystal Graph Convolutional Neural Network (CGCNN) (pg. 4530, "Model Architecture", para. 1, in its entirety).
Regarding claims 1, 7, and 13, Court et al. discloses a processor-based method of 1) obtaining crystal structures of each of a plurality of materials (Results, pg. 4522, para. 2, ln. 1-5), 2) converting crystal structure materials into 3D cell images using training data and gaussian functions (Conditional Deep-Feature-Consistent Variational Autoencoder for 3-D Crystal Structures, pg. 4530, para. 1, ln. 3-9; pg. 4533, para. 1, ln. 17-18, "Gaussian smearing…"), 3) creating a 3D matrix of atomic positions representing the location of one or more elements (pg. 4524, para. 2, ln. 8-16, "each voxel…is concatenated with its 3-D Cartesian coordinates…"), 4) generating novel crystal structures by training an autoencoder from crystallographic information files (CIFs) (pg. 4524, para. "Generation of New Crystal Structures", ln. 7-9, "…a conditional variational autoencoder architecture…"; pg. 4522, “We trained the VAE and UNet models independently on crystallographic information files (CIFs)…”), 5) training a basis (variational) autoencoder (VAE) using 3D electron-density maps derived from crystal structure data (pg. 4530, "Conditional Deep-Feature-Consistent Variational Autoencoder for 3-D Crystal Structures.", para. 1, ln. 6-9), 6) training a segmentation network using the electron density maps to identify the location and types of a set of elements (pg. 4531, "3-D Multiclass Atom Segmentation", para. 1, ln. 1-3; Fig. 1b, "UNet converts…to segmented species matrices"), 7) training a generative model on electron-density map input ("Scope of this Work", pg. 4521, para. 1, ln. 1-15, "…using a conditional autoencoder, that encodes the electron-density maps…per atom of the associated crystals"), 8) sampling the latent space of the generative model (pg. 4525, para. 2, ln. 1-2, "…via pure random sampling of the VAE latent space."; para. 7, ln. 6-8, "The latent space…was sampled."), and 9) passing the samplings through a segmentation network (pg. 4530, "Model Architecture", para. 2, ln. 4-10, "The sampled latent vector…were decoded… to produce an electron-density map. Subsequently, the UNet was employed to convert this electron-density map into an atom segmentation map, from which the Cartesian coordinates of the atoms were obtained via morphological transformations.").
Even so, Court et al. does not explicitly utilize basis images and species matrices during the training of an autoencoder, does not explicitly train a segmentation network using reconstructed basis images to identify the location and types of elements, and does not teach applying spherical and linear interpolation (SLERP) to sample the latent space of a generative model for crystal structures. Court et al. uses electron density maps to train a segmentation network, which derives atomic coordinates from the electron density map for later use in their trained generative model, while in claim 1 the basis autoencoder is trained with 3D basis images and element matrices.
Ren et al. is directed to a generalized invertible representation that encodes crystallographic information into descriptors in both real space and reciprocal space, combined with a generative variational autoencoder (VAE), such that a wide range of crystallographic structures and chemistries with desired properties can be inverse designed and/or predicted using the VAE. Ren et al. teaches a VAE model capable of predicting novel crystal structures that do not exist in the training and test database, and teaches validating those predicted crystals by first-principles calculations (Abstract, pg. 1).
With respect to claims 1, 7, and 13, Ren et al. discloses, under “Real space representation”, a 3D crystal structure representation using a cell matrix (“the length and angle of three translation lattice vectors”), a basis matrix (“all atom coordinates in the unit cell, with size N sites”), and an elements matrix (“bin and one-hot encode each element in the unit cell as a vector Z which has K features which are the number of atomic properties such as group number, electronegativity, first ionization energy, etc.”) (page 3, section 2.1). Ren et al.’s generalized invertible representation encodes the crystallographic information into descriptors in both real space and reciprocal space, and Ren et al. uses three sampling strategies, one being SLERP (Table 1, pg. 5; Sec. 4.2; para. 2, ln. 1-6).
In KSR Int'l v. Teleflex, the Supreme Court, in rejecting the rigid application of the teaching, suggestion, and motivation test by the Federal Circuit, indicated that “The principles underlying [earlier] cases are instructive when the question is whether a patent claiming the combination of elements of prior art is obvious. When a work is available in one field of endeavor, design incentives and other market forces can prompt variations of it, either in the same field or a different one. If a person of ordinary skill can implement a predictable variation, § 103 likely bars its patentability.” KSR Int'l v. Teleflex Inc., 127 S. Ct. 1727, 1740 (2007).
Applying the KSR standard to Court et al. and Ren et al., the examiner concludes that the combination of Court et al. and Ren et al. represents combining prior art elements according to known methods to yield predictable results. Both Court et al. and Ren et al. disclose methods to predict novel crystal structures given crystal structure data as input to their model; they differ in the manner in which they solve for novel crystal structures.
Court et al. accepts CIFs (such as from the Materials Project with unit cell, atomic coordinate and element identifying data), applies gaussian rendering to the CIF data to produce electron density maps, and then uses the maps (with coordinate convolution, pg. 4524, “Before input into the VAE, each voxel of the electron-density map is concatenated with its 3-D Cartesian coordinates in the original crystal geometry.”) to train a VAE convolutional neural network (CNN). Then, the VAE samples the latent space of the input electron density maps to produce novel electron-density maps (decoding), which are run through a segmentation network to derive atomic structures and positions, which are in turn validated as optimal by an externally applied crystal-graph convolutional neural network (CGCNN). In this manner, Court et al.’s teaching is limited; while it does disclose atomic density, it does not disclose detailed crystallographic structure (lattice vectors, basis vectors through atomic coordinates, elements matrices), instead relying on later validation.
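As a minimal sketch of what rendering atomic positions with gaussian functions onto a voxel grid can look like, the following example places one unit-height Gaussian per atom on a periodic grid. It is not Court et al.'s implementation: the function name `gaussian_voxel_grid`, the grid size, the width `sigma`, the use of fractional coordinates, and the omission of any element-dependent weighting are all assumptions made for illustration.

```python
import math


def gaussian_voxel_grid(frac_coords, n=8, sigma=0.1):
    """Render atoms as Gaussians on an n x n x n periodic voxel grid.

    frac_coords: iterable of (u, v, w) fractional coordinates in [0, 1).
    sigma: Gaussian width in fractional units (assumed, not from the source).
    Returns a nested list grid[i][j][k] of summed Gaussian values.
    """
    grid = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                point = (i / n, j / n, k / n)
                total = 0.0
                for atom in frac_coords:
                    d2 = 0.0
                    for x, a in zip(point, atom):
                        d = abs(x - a) % 1.0
                        d = min(d, 1.0 - d)  # nearest periodic image
                        d2 += d * d
                    total += math.exp(-d2 / (2.0 * sigma ** 2))
                grid[i][j][k] = total
    return grid


# One hypothetical atom at the cell centre.
density = gaussian_voxel_grid([(0.5, 0.5, 0.5)], n=8, sigma=0.1)
```

The voxel nearest an atom carries the largest value, which is the property a downstream segmentation network would rely on to recover atomic positions.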
Ren et al. complements Court et al.’s limitation as it extracts cell matrices, basis matrices and element matrices from CIFs, uses these crystallographic descriptors to train a VAE (Fig. 2, pg. 4; Sec. 3.5, para. 1, “…the latent space of the VAE is mapped to material properties using the regression model, the latent space or reduced material space becomes an organized and continuous crystal representation with different material properties…”), and outputs atomic coordinates, having the VAE sample the latent space using SLERP (sec. 4.2., para. 1-2, “…we investigate the generation by sampling out-of-the-distribution latent space…Three different sampling methods: local perturbation (Lp), spherical linear interpolation (Slerp) and random sampling (Random) are implemented to sample points that are different from training dataset in the latent space… As our crystal representation contains both elemental and structural information, we invert those two parts separately to recover the full 3D crystal.”). Therefore, Ren et al. excels at modeling explicit coordinates (cell encodings, basis images, and element matrices) into 3D representations of crystal structures; however, Ren et al. does not explicitly disclose use of a segmentation network.
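For reference, the spherical linear interpolation (Slerp) sampling named in the passage quoted above is commonly computed as follows. This is a generic sketch of the standard formula and not code from Ren et al.; the function name `slerp`, the plain-list vector type, and the linear fallback for nearly parallel or antiparallel vectors are assumptions.

```python
import math


def slerp(z0, z1, t):
    """Spherical linear interpolation between latent vectors z0 and z1.

    t = 0 returns z0 and t = 1 returns z1. Both vectors are assumed non-zero.
    """
    dot = sum(a * b for a, b in zip(z0, z1))
    norm0 = math.sqrt(sum(a * a for a in z0))
    norm1 = math.sqrt(sum(b * b for b in z1))
    cos_omega = max(-1.0, min(1.0, dot / (norm0 * norm1)))
    omega = math.acos(cos_omega)  # angle between the two vectors
    sin_omega = math.sin(omega)
    if abs(sin_omega) < 1e-8:
        # Nearly parallel or antiparallel: fall back to linear interpolation.
        return [(1.0 - t) * a + t * b for a, b in zip(z0, z1)]
    w0 = math.sin((1.0 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(z0, z1)]


# Midpoint between two hypothetical 2-D latent vectors.
midpoint = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
```

Unlike straight-line interpolation, the midpoint here keeps unit length, so interpolated samples stay on the same shell of the latent space as the two endpoints.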
One of ordinary skill in the art of 3D crystal structure modeling would be motivated to combine Court et al.’s 3D crystal structure representation method with Ren et al.’s 3D crystal structure representation method, as both references generally seek to identify novel 3D crystal structures but differ in their approach. Court et al. teaches segmenting density measurements derived from CIFs into accurate, novel 3D crystal structure representations, while Ren et al. teaches extracting precise coordinate data from unit cell, basis images, and element matrices to generate novel coordinate-forward 3D crystal structure representations. One would have been motivated to modify the 3D crystal structure modeling pipeline of Court et al. to include the pipeline of Ren et al. to 1) obtain crystal structures from a training data set, 2) convert the crystal structures into a 3D cell image using gaussian functions, 3) create a training set from each material and element matrix, and 4) establish crystal lattice parameters for the 3D voxels and train an autoencoder on said parameters, in combination with Ren’s teaching of explicit coordinate extraction using a VAE with a generalized invertible representation and a sampling strategy (random sampling and/or SLERP) on the compressed latent space of the generative model to obtain a sample of cell encodings and basis encodings from a variety of materials in the model, because combining prior art elements according to known methods is likely to yield the predictable result of generating crystal structure latent space samplings. Furthermore, for the same reason, one of ordinary skill in the art would be motivated to pass the set of cell encodings and basis encodings through their respective encoders to obtain cell and basis images, invert the sampled images, obtain novel lattice vectors, and subsequently pass the new materials through a segmentation network using Court’s teachings to get atomic clusters of new crystal structures.
One of ordinary skill in the art before the effective filing date of the claimed invention would have had a reasonable expectation of success because the teachings of Court et al. and Ren et al. are similar and interchangeable—both accept CIF input data, have used data from the same source (The Materials Project) for model training, and both utilize VAEs. Combining both arts would have been expected to have provided more-detailed novel crystal structures obtained through gaussian filtering, segmentation networks, and with detailed atomic coordinates. Therefore, the invention would have been prima facie obvious to one of skill in the art at the time of filing of the application, absent evidence to the contrary.
Regarding claims 2, 8 and 14, Court et al. teaches the use of data from the Materials Project (Methods, "Data Preparation and Formatting"; para. 1, ln. 1-7). This correlates to claim 2 of the invention, which recites using training data comprising first principles computed structures and properties of the plurality of materials (clm. 2, ln 1-2). As the Materials Project contains first principles computed structures data, Court et al. teaches the limitations of claim 2.
Regarding claims 4, 10 and 16, Court et al.’s teaching correlates to each limitation in claim 4 as Court teaches that once training has completed, and existing crystal structures are encoded into latent space, the model can generate novel structures by sampling from its own mapping given user-defined properties (pg. 4524, "Generation of New Crystal Structures", para. 1, ln. 1-17). This is relevant to instant claim 4, which recites predicting target properties of the one or more new materials based on the crystal structure of the one or more new materials (clm. 4, ln 1-2). This feature of Court et al.'s teaching addresses the limitations of claim 4.
Claims 3, 9 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Court et al. in view of Ren et al. as applied to claims 1, 2, 4, 7, 8, 10, 13, 14, and 16 above, further in view of Kim et al. (ACS Central Science 2020 6 (8), 1412-1420).
The instant claims are drawn to a method and apparatus for obtaining crystal structure data from a plurality of materials in a training data set for use in the generation of novel 3D crystal structure models.
Court et al. in view of Ren et al. teaches claims 1, 2, 4, 7, 8, 10, 13, 14, and 16 above.
Court et al. in view of Ren et al. does not teach training data preprocessing by creating supercells.
Kim et al. is directed to a method of predicting crystal structures using generative adversarial networks (GANs).
Regarding claims 3, 9, and 15, Kim et al. teaches a method of augmenting input training data, including supercell data, showing that this augmentation improves the model’s ability to recognize the same materials represented in different input features (translated, rotated, or supercell repeated) as identical (Training Data Set and Data Preprocessing; pg. 1414; Supporting Info; Fig. S3). This is relevant to claim 3 which claims a preprocessed dataset augmented with created supercells (clm. 8, ln. 4-8).
In KSR Int'l v. Teleflex, the Supreme Court, in rejecting the rigid application of the teaching, suggestion, and motivation test by the Federal Circuit, indicated that “The principles underlying [earlier] cases are instructive when the question is whether a patent claiming the combination of elements of prior art is obvious. When a work is available in one field of endeavor, design incentives and other market forces can prompt variations of it, either in the same field or a different one. If a person of ordinary skill can implement a predictable variation, § 103 likely bars its patentability.” KSR Int'l v. Teleflex Inc., 127 S. Ct. 1727, 1740 (2007).
Applying the KSR standard to Court et al., Ren et al., and Kim et al., the examiner concludes that the combination of Court et al., Ren et al., and Kim et al. represents the use of known techniques to improve similar methods.
Both Court et al. and Ren et al. produce 3D crystal structure representations generated from training data, but lack training data accounting for spatial manipulations (such as translations or rotations along an axis) or for repetitions of unit cells associated with compositions of matter (supercells). In the same field of research, Kim et al. addresses this limitation and produces 3D representations derived from a generative adversarial network (GAN) trained on data augmented with supercells and with translated and rotated structures. Although the number of unique structures in Kim et al.’s training set is lower and the data contains imbalances in comparison to Ren et al. and Court et al. (pg. 1414, “…we retain a total of 1240 unique structures with 112 compositions in the initial training set. We note that this data set has the data imbalance in the composition and affine invariance issues such as supercell, translation, and rotation.”), Kim et al. trains their GAN to account for the spatial fluctuations of compounds occurring in 3D while Court et al. and Ren et al. do not.
One skilled in the art of computational chemistry would have been motivated to apply the supercell augmentation of training data taught in Kim et al. to the training data of Court et al. and Ren et al. to train a generative model on a wide variety of compounds in a diverse number of configurations. One of skill in the art before the effective filing date of the claimed invention would have had a reasonable expectation of success at performing the training data setup as taught by Kim et al. with training data from Court et al. and Ren et al. because all three sources utilize crystal structure data from the Materials Project (Kim et al., pg. 1414, “The training set… …was constructed using…the Materials Project (MP) database.”). Furthermore, modifying the technique of Court in view of Ren with the technique of Kim would likely be successful, as Kim explicitly applies supercell variations to address data imbalance issues in their crystal structure training data set (pg. 1414, “…we used data augmentation, which is a commonly used technique in the machine-learning field to alleviate such a data imbalance and invariance problem. Specifically, we added the supercell structures as well as the structures in which translational and rotational (i.e., swapping the axes of the unit cell) operations are applied until these augmentations yield 1000 structures for each composition”). In doing so, Kim clearly demonstrates the application of supercell data (and translations and rotations associated with it) in preparing training data, addressing the limitations of claim 3. This combination would have been expected to have provided a training dataset of diverse materials and structures at the unit cell level individually and also a range of supercell clusters with consideration to rotational changes and spatial geometry.
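The three augmentation operations named in the Kim et al. passage quoted above (supercell repetition, translation, and swapping the axes of the unit cell) can be sketched as follows. This is an illustrative sketch and not Kim et al.'s code; the function names, the row-vector lattice convention, and the omission of element labels are assumptions.

```python
def make_supercell(lattice, frac_coords, reps=(2, 1, 1)):
    """Repeat the unit cell reps[i] times along lattice vector i."""
    na, nb, nc = reps
    new_lattice = [tuple(n * x for x in vec) for n, vec in zip(reps, lattice)]
    new_coords = []
    for u, v, w in frac_coords:
        for i in range(na):
            for j in range(nb):
                for k in range(nc):
                    # Rescale so coordinates are fractional in the larger cell.
                    new_coords.append(((u + i) / na, (v + j) / nb, (w + k) / nc))
    return new_lattice, new_coords


def translate(frac_coords, shift):
    """Shift every site by a fractional vector, wrapping back into [0, 1)."""
    return [tuple((x + s) % 1.0 for x, s in zip(site, shift))
            for site in frac_coords]


def swap_axes(lattice, frac_coords, order=(1, 0, 2)):
    """Permute the lattice vectors and the matching coordinate components."""
    new_lattice = [lattice[i] for i in order]
    new_coords = [tuple(site[i] for i in order) for site in frac_coords]
    return new_lattice, new_coords


# Hypothetical cubic cell with one site, doubled along the first axis.
cell = [(4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 4.0)]
super_lattice, super_coords = make_supercell(cell, [(0.0, 0.0, 0.0)], (2, 1, 1))
```

Each operation yields a different numerical description of the same physical material, which is why such variants can serve as augmented training examples.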
Therefore, the invention would have been prima facie obvious to one of skill in the art at the time of filing of the application, absent evidence to the contrary.
Claims 5, 6, 11, 12, 17, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Court et al. in view of Ren et al. and Kim et al. as applied to claims 1-4, 7, 8, 10, 13, 14, and 16 above, and further in view of Qi et al. (IEEE Signal Processing Letters, vol. 27, pp. 1485-1489, 2020).
The instant claims are drawn to a method and apparatus for obtaining crystal structure data from a plurality of materials in a training data set for use in the generation of novel 3D crystal structure models, including training a regression model for the prediction of the target properties of the one or more new materials, wherein the latent space of the generative model comprises the features for the regression model.
Court et al. in view of Ren et al. and Kim et al. teaches claims 1-4, 7, 8, 10, 13, 14, and 16 above.
Court et al. in view of Ren et al. teaches training a regression model for the prediction of chemical properties.
Qi et al. teaches regression model validation.
Applying the KSR standard to Court et al. and Qi et al., the examiner concludes that some teaching, suggestion, or motivation in the prior art would have led one of ordinary skill to modify the prior art reference or to combine prior art reference teachings to arrive at the claimed invention. As Qi et al. validates regression models and Court et al. employs regression model training in their model, one of skill in the art of computational chemistry would have been motivated to apply Court et al.’s training in view of Qi et al.’s validation evidence. One of skill in the art before the effective filing date of the claimed invention would have had a reasonable expectation of success in performing the process of training a regression model for the prediction of chemical properties because Court et al. applied their teaching to crystal structure data and provided the necessary code and instructions detailed in their publication. Therefore, the invention would have been prima facie obvious to one of skill in the art at the time of filing of the application, absent evidence to the contrary.
Regarding claims 5, 11, and 17, Court et al. teaches using a VAE trained using a regression model. Specifically, Court et al. teaches evaluating a CGCNN using Mean Absolute Error (MAE) ("Evaluation of Property Predictions", para. 1, ln. 11; "Evaluation of Predicted Unit-Cell Parameters and Atomic Positions.", para. 2, ln. 1-4). This evaluation is known in the art to be used to validate regression models (Qi et al. pg. 1485, Introduction, para. 1 ln. 1-2) and correlates to claim 5, lines 1-2 of the invention, which claims training a regression model for the prediction of the target properties.
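For clarity, Mean Absolute Error is the average magnitude of the difference between predicted and reference values. A minimal sketch follows; the function name and the example property values are hypothetical and are not data from Court et al. or Qi et al.

```python
def mean_absolute_error(y_true, y_pred):
    """Return the mean of |y_true[i] - y_pred[i]| over all samples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical reference and predicted property values.
mae = mean_absolute_error([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
```

A lower value indicates that the regression model's predictions lie closer, on average, to the reference values.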
Regarding claims 6, 12 and 18, Court et al.’s teaching correlates to each limitation as Court et al. teaches passing one or more decoded latent vectors through the generative model ("Model Architecture" pg. 4530, para. 6, ln. 13-18). This correlates to claim 6 of the invention, which recites the same passing step (clm. 6, ln. 1-2). Therefore, Court et al.'s teaching addresses the limitation of claims 6, 12, and 18.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHN T STUBBS whose telephone number is (571)272-0340. The examiner can normally be reached M-F 8-5 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Larry Riggs, can be reached at 571-270-3062. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/J.T.S./Examiner, Art Unit 1686
/LARRY D RIGGS II/Supervisory Patent Examiner, Art Unit 1686