Prosecution Insights
Last updated: April 19, 2026
Application No. 17/792,639

MOLECULE DESIGN

Non-Final Office Action (§101, §103, §112)
Filed: Jul 13, 2022
Examiner: SANFORD, DIANA PATRICIA
Art Unit: 1687
Tech Center: 1600 — Biotechnology & Organic Chemistry
Assignee: Flagship Pioneering Innovations VI, LLC
OA Round: 1 (Non-Final)
Grant Probability: 83% (Favorable)
Predicted OA Rounds: 1-2
Predicted Time to Grant: 4y 8m
Grant Probability with Interview: 99%

Examiner Intelligence

Career Allow Rate: 83% (5 granted / 6 resolved), above average (+23.3% vs TC avg)
Interview Lift: +25.0% among resolved cases with an interview
Typical Timeline: 4y 8m average prosecution; 40 applications currently pending
Career History: 46 total applications across all art units

Statute-Specific Performance

§101: 31.6% (-8.4% vs TC avg)
§103: 29.9% (-10.1% vs TC avg)
§102: 9.9% (-30.1% vs TC avg)
§112: 25.8% (-14.2% vs TC avg)

Baseline is the Tech Center average estimate. Based on career data from 6 resolved cases.

Office Action

Rejections under §101, §103, and §112
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of the Claims

Claims 1-11, 13, 15, 17-19, and 23-29 are pending and under consideration in this action. Claims 12, 14, 16, 20-22, and 30-39 were canceled in the amendment filed 2/15/2023.

Priority

The instant application is a 371 of PCT/US21/13451, filed 1/14/2021, which claims priority to U.S. Provisional Application No. 62/961,112, filed 1/14/2020, as reflected in the filing receipt mailed 9/14/2023. The claim for domestic benefit for claims 1-11, 13, 15, 17-19, and 23-29 is acknowledged. As such, the effective filing date of claims 1-11, 13, 15, 17-19, and 23-29 is 1/14/2020.

Specification

The title is objected to because the title of the invention is not descriptive. A new title is required that is clearly indicative of the invention to which the claims are directed. The following title is suggested: “Molecular design of compounds with targeted biological properties using machine learning”.

The disclosure is objected to because it contains an embedded hyperlink and/or other form of browser-executable code: www.rdkit.org in Para. [00251]. Applicant is required to delete the embedded hyperlink and/or other form of browser-executable code; references to websites should be limited to the top-level domain name without any prefix such as http:// or other browser-executable code. See MPEP § 608.01.

The listing of references in the specification is not a proper information disclosure statement (see at least Para. [0056], [0094]-[0095], [00103], [00111], [00119], [00127]-[00128], [00133], [00142]-[00147], [00150], [00152], [00155]-[00156], [00162], [00188], [00190], [00221], [00223], [00239], [00243], and [00255]).
37 CFR 1.98(b) requires a list of all patents, publications, or other information submitted for consideration by the Office, and MPEP § 609.04(a) states, "the list may not be incorporated into the specification but must be submitted in a separate paper." Therefore, unless the references have been cited by the examiner on form PTO-892, they have not been considered.

Claim Objections

Claims 18-19 and 25 are objected to because of the following informalities:

Claim 18 recites the abbreviations “EC50”, “IC50”, “ED50”, “LD50”, and “TD50” in lines 4-6 of the claim. These abbreviations should be corrected to “half maximal effective concentration (EC50)”, “half maximal inhibitory concentration (IC50)”, “median effective dose (ED50)”, “median lethal dose (LD50)”, and “median toxic dose (TD50)”, as this is the first recitation of each of the abbreviations in the claims.

Claim 19 ends with a comma, which should be corrected to a period. Claim 19 is also missing a colon after “characterized by” in line 1 of the claim.

Claim 25 recites the abbreviations “scTag-seq”, “CyTOF/SCoP”, “E-MS/Abseq”, “miRNA-seq”, and “CITE-seq” in lines 2-4 of the claim. These abbreviations should be corrected to write out the full phrase before the abbreviation, as this is the first recitation of each of the abbreviations in the claims.

Appropriate correction is required.

Claim Rejections - 35 USC § 112(b)

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 3-5, 19, 25, and 29 are rejected under 35 U.S.C.
112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

Claim 3 recites the limitation “wherein the first and second compound have the first molecular property” in lines 3-4 of the claim. There is insufficient antecedent basis for this limitation in the claim, since there is no prior mention of “the first molecular property” in claim 1, from which this claim depends. This rejection can be overcome by amendment to recite “wherein the first and second compound have the first biological property”, as “the first biological property” was recited in claim 1. Claims 4-5 are also rejected due to their dependency from claim 3.

Claim 19 recites the limitation “wherein the cell state is characterized by (i) an up-regulation …, (ii) a diseased state, (iii) an upregulation …, (iv) an upregulation…” in lines 1-7 of the claim. The metes and bounds of the claim are rendered indefinite due to the lack of clarity. The claim is missing an “or” or an “and” between limitations (iii) and (iv). It is unclear whether all four limitations are required for characterization of the cell state, or if only one of the four limitations is required. This rejection can be overcome by amendment of claim 19 to include “or” or “and” between limitations (iii) and (iv).

Regarding claim 25, the phrase "such as" renders the claim indefinite because it is unclear whether the limitations following the phrase are part of the claimed invention. See MPEP § 2173.05(d).

Claim 29 recites the limitation “using the first projection to obtain one or more candidate projections” in line 10 of the claim. There is insufficient antecedent basis for this limitation in the claim, since there is no prior mention of “the first projection” earlier in the claim. This rejection can be overcome by amendment of claim 29 to recite “the first projected representation”, as the “first projected representation” was recited earlier in the claim.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-11, 13, 15, 17-19, and 23-29 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims recite both (1) mathematical concepts (mathematical relationships, formulas or equations, or mathematical calculations) and (2) mental processes, i.e., concepts performed in the human mind (including observations, evaluations, judgments or opinions) (see MPEP § 2106.04(a)).

Step 1: In the instant application, claims 1-11, 13, 15, 17-19, 23-26, and 29 are directed towards a method, claim 27 is directed towards a system, and claim 28 is directed towards a manufacture, each of which falls into one of the categories of statutory subject matter (Step 1: YES).

Step 2A, Prong One: In accordance with MPEP § 2106, claims found to recite statutory subject matter (Step 1: YES) are then analyzed to determine if the claims recite any concepts that equate to an abstract idea, law of nature, or natural phenomenon (Step 2A, Prong One).
The following instant claims recite limitations that equate to one or more categories of judicial exceptions:

Claims 1, 27, and 28 recite a mathematical concept (i.e., projection of data into a latent representation space) in “for each respective compound in the first plurality of compounds, (a) projecting the information regarding the chemical structure of the respective compound into a latent representation space in accordance with a first plurality of weights associated with the untrained or partially untrained neural network encoder, wherein the first plurality of weights comprises 1000 weights, to obtain a corresponding projected representation of the respective compound” and “for each respective compound in the second plurality of compounds, (a) projecting the information regarding the chemical structure of the respective compound into a latent representation space in accordance with the first plurality of weights associated with the trained neural network encoder to obtain a corresponding projected representation of the respective compound”; a mathematical concept (i.e., using the classifier to determine the properties of the compound; the classifier can be a logistic regression classifier, a k-nearest neighbor classifier, a decision tree classifier, etc., as disclosed in specification Para. [00141]) in “inputting the corresponding projected representation of the respective compound into the untrained or partially untrained classifier to obtain a classification of the respective compound in accordance with a second plurality of weights associated with the untrained or partially trained classifier”; a mathematical concept (i.e., the comparison and updating is performed via back-propagation, see specification Para. [0081]) in “updating the first plurality of weights and the second plurality of weights by comparing the classification of each respective compound in the first plurality of compounds to the one or more biological properties of the respective compound in the first training dataset thereby obtaining a trained neural network encoder and a trained classifier”; a mathematical concept (i.e., the comparison and updating is performed via back-propagation, see specification Para. [0081]) in “updating the third plurality of weights by comparing the chemical structure of each respective compound outputted by the untrained or partially untrained decoder to the actual chemical structure of the respective compound from the second training dataset thereby obtaining a trained decoder”; and a mental process (i.e., an evaluation of the training sets for the presence of a compound) in “wherein the test compound is not present in the first and second training set”.

Claim 2 further recites a mental process (i.e., an evaluation of the information regarding the compound) in “wherein the information regarding a chemical structure of the respective compound in the first plurality of compounds is a chemical structure of the respective compound or a high dimensional vector representation based upon a chemical structure of the respective compound”.

Claim 3 further recites a mathematical concept (i.e., generating a projected representation through interpolation; for example, linear interpolation, see specification Para. [0177]) in “interpolating a projected representation of a first compound and a projected representation of a second compound, produced by the trained neural network encoder, wherein the first and second compound have the first molecular property thereby obtaining an interpolated projection”; and a mathematical concept (i.e., using a classifier to determine if the compound has the targeted property; the classifier can be a logistic regression classifier, a k-nearest neighbor classifier, or a decision tree classifier, as disclosed in specification Para. [00141]) in “obtaining a classification of the respective candidate compound by inputting the corresponding projected representation of the respective candidate compound into the trained classifier, wherein, when the trained classifier indicates that the corresponding projected representation of the respective candidate compound has the first biological property, the respective candidate compound is deemed to have the first biological property”.

Claim 6 further recites a mathematical concept (i.e., using the classifier to determine if the compound has the targeted property; the classifier can be a logistic regression classifier, a k-nearest neighbor classifier, or a decision tree classifier, as disclosed in specification Para. [00141]) in “inputting the projected representation of the first compound into the trained classifier to verify that the trained classifier identifies the first compound as having the first biological property”.

Claim 7 further recites a mental process (i.e., an evaluation of the information regarding the chemical structure) in “the information regarding the chemical structure of the respective compound is a molecular structure of the respective compound”; a mathematical concept (i.e., converting molecules into tensors, specification Para. [00126]) in “forming a featurization of the chemical structure”; and a mathematical concept (i.e., incorporating into a vector space) in “incorporating the featurization of the chemical structure into a multi-dimensional vector space”.

Claim 8 further recites a mathematical concept (i.e., formation of a tensor) in “wherein the featurization of the chemical structure is a tensor”.

Claim 9 further recites a mathematical concept (i.e., formation of a tensor) in “wherein the tensor is a one-dimensional vector or a two-dimensional matrix”.

Claim 10 further recites a mathematical concept (i.e., calculation of the extended circular fingerprint or one-hot-encoded vectors; see specification Para. [00127]) in “wherein the featurization of the chemical structure is an extended circular fingerprint, or a molecular graph of a plurality of one-hot-encoded vectors”.

Claim 11 further recites a mathematical concept (i.e., vector formation) in “wherein the multi-dimensional vector space is an N-dimensional space, wherein N is an integer between 20 and 80”.

Claim 17 further recites a mathematical concept (i.e., matrix formation) in “converting the SMILES string into a molecular graph representation that comprises an adjacency matrix and a feature matrix”.

Claim 29 recites a mathematical concept (i.e., a projected representation) in “wherein the first projected representation has N dimensions”, “wherein N is an integer between 20 and 80”, and “wherein the corresponding projected representation has N dimensions”; a mathematical concept (i.e., sampling vectors from the projected representation; see specification Para. [00171]) in “using the first projection to obtain one or more candidate projections”; a mental process (i.e., an evaluation of the candidate compounds for the presence of a compound) in “wherein the first compound is not present in the plurality of candidate compounds”; and a mathematical concept (i.e., using the classifier to determine if the compound has the targeted property; the classifier can be a logistic regression classifier, a k-nearest neighbor classifier, or a decision tree classifier, as disclosed in specification Para. [00141]) in “obtaining a classification of the respective candidate compound by inputting the corresponding projected representation of the respective candidate compound into the trained classifier, wherein, when the trained classifier indicates that the corresponding projected representation of the respective candidate compound has the first biological property, the respective candidate compound is deemed to have the first biological property”.

These recitations are similar to the concepts of collecting information and displaying certain results of the collection and analysis in Electric Power Group, LLC v. Alstom (830 F.3d 1350, 119 USPQ2d 1739 (Fed. Cir. 2016)); comparing information regarding a sample or test to a control or target data in Univ. of Utah Research Found. v. Ambry Genetics Corp. (774 F.3d 755, 113 U.S.P.Q.2d 1241 (Fed. Cir. 2014)) and Association for Molecular Pathology v. USPTO (689 F.3d 1303, 103 U.S.P.Q.2d 1681 (Fed. Cir. 2012)); and organizing and manipulating information through mathematical correlations in Digitech Image Techs., LLC v. Electronics for Imaging, Inc. (758 F.3d 1344, 111 U.S.P.Q.2d 1717 (Fed. Cir. 2014)), which the courts have identified as concepts that can be practically performed in the human mind or as mathematical relationships.
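For readers less familiar with the featurization the examiner is characterizing here, the molecular-graph representation recited in claim 17 (an adjacency matrix plus a feature matrix of one-hot-encoded atom vectors) can be sketched in a few lines. This is an illustrative toy only, not the applicant's actual implementation: the atom vocabulary, the restriction to unbranched single-letter SMILES chains, and the function name are all assumptions made for the sketch; a real pipeline would use a cheminformatics toolkit such as RDKit.

```python
import numpy as np

# Hypothetical minimal atom vocabulary for the sketch.
ATOM_VOCAB = ["C", "N", "O", "S"]

def smiles_to_graph(smiles: str):
    """Toy SMILES-to-graph conversion: handles only unbranched chains
    of single-letter atoms (e.g. "CCO" for ethanol)."""
    atoms = [ch for ch in smiles if ch in ATOM_VOCAB]
    n = len(atoms)
    # Feature matrix: one one-hot row per atom.
    features = np.zeros((n, len(ATOM_VOCAB)))
    for i, atom in enumerate(atoms):
        features[i, ATOM_VOCAB.index(atom)] = 1.0
    # Adjacency matrix: consecutive atoms in the chain are bonded.
    adjacency = np.zeros((n, n))
    for i in range(n - 1):
        adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0
    return adjacency, features

adj, feat = smiles_to_graph("CCO")  # 3 atoms: C-C-O
```

The adjacency matrix is symmetric (bonds are undirected) and the feature matrix has exactly one nonzero entry per atom row, which is the structure the claim language describes.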
The abstract ideas recited in the claims are evaluated under the broadest reasonable interpretation (BRI) of the claim limitations when read in light of and consistent with the specification, and are determined to be directed to mental processes that, in the simplest embodiments, are not too complex to practically perform in the human mind. Additionally, the recited limitations that are identified as judicial exceptions from the mathematical concepts grouping of abstract ideas are abstract ideas irrespective of whether or not the limitations are practical to perform in the human mind. The instant claims must therefore be examined further to determine whether they integrate the abstract idea into a practical application (Step 2A, Prong One: YES).

Step 2A, Prong Two: In determining whether a claim is directed to a judicial exception, further examination is performed that analyzes whether the claim recites additional elements that, when examined as a whole, integrate the judicial exception(s) into a practical application (MPEP § 2106.04(d)). A claim that integrates a judicial exception into a practical application will apply, rely on, or use the judicial exception in a manner that imposes a meaningful limit on the judicial exception. The claimed additional elements are analyzed to determine if the abstract idea is integrated into a practical application (MPEP § 2106.04(d)(I)). If the claim contains no additional elements beyond the abstract idea, the claim fails to integrate the abstract idea into a practical application (MPEP § 2106.04(d)(III)).
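The "mathematical concept" limitations at issue — projecting structures into a latent space via encoder weights, classifying the projections, updating weights by comparing classifications to known properties, and interpolating between two projections — can be made concrete with a toy numerical sketch. Everything below is hypothetical (the dimensions, the random stand-in data, the logistic-regression head, and the single-layer linear encoder are simplifications of what the claims describe; the claimed 1000-weight encoder and real back-propagation are not reproduced):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_latent, n = 16, 4, 100        # hypothetical sizes
X = rng.normal(size=(n, d_in))        # stand-in structure featurizations
y = rng.integers(0, 2, size=n)        # stand-in biological-property labels

W_enc = rng.normal(size=(d_in, d_latent)) * 0.1  # "first plurality of weights"
w_clf = rng.normal(size=d_latent) * 0.1          # "second plurality of weights"

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(200):
    Z = X @ W_enc              # project into the latent representation space
    p = sigmoid(Z @ w_clf)     # classify each projected representation
    err = p - y                # compare classification to known properties
    grad_clf = Z.T @ err / n
    grad_enc = X.T @ np.outer(err, w_clf) / n
    w_clf -= 0.1 * grad_clf    # update classifier weights
    W_enc -= 0.1 * grad_enc    # update encoder weights

# Claim 3's interpolation step: a point midway between two
# projected representations in the latent space.
z_mid = 0.5 * (X[0] @ W_enc) + 0.5 * (X[1] @ W_enc)
```

The gradient steps stand in for the back-propagation the specification's Para. [0081] is cited for; the point of the sketch is only that each claimed step reduces to matrix arithmetic, which is the basis of the examiner's mathematical-concept characterization.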
The following independent claims recite limitations that equate to additional elements: Claim 1 recites “at a computer system comprising at least one processor and a memory storing at least one program for execution by the at least one processor, the at least one program comprising instructions”; “obtaining a first training dataset, in electronic form, wherein: the first training dataset comprises, for each respective compound in a first plurality of compounds, (i) information regarding a chemical structure of the respective compound and (ii) one or more biological properties, in a plurality of biological properties, of the respective compound, the first plurality of compounds comprises 100 or more compounds, and the plurality of biological properties includes the first biological property”; “obtaining a second training dataset, in electronic form, wherein the second training dataset comprises, for each respective compound in a second plurality of compounds, information regarding a chemical structure of the respective compound and wherein the second plurality of compounds comprises 100 or more compounds”; “inputting the corresponding projected representation of the respective compound into the untrained or partially untrained decoder to obtain a chemical structure of the respective compound in accordance with a third plurality of weights associated with the untrained or partially untrained decoder”; and “using the trained neural network encoder, trained classifier, and trained decoder to identify a test compound that has the first biological property”. 
Claim 27 recites “a computer system, comprising one or more processors and memory, the memory storing instructions for performing a method, using the one or more processors”; “obtaining a first training dataset, in electronic form, wherein: the first training dataset comprises, for each respective compound in a first plurality of compounds, (i) information regarding a chemical structure of the respective compound and (ii) one or more biological properties, in a plurality of biological properties, of the respective compound, the first plurality of compounds comprises 100 or more compounds, and the plurality of biological properties includes the first biological property”; “obtaining a second training dataset, in electronic form, wherein the second training dataset comprises, for each respective compound in a second plurality of compounds, information regarding a chemical structure of the respective compound and wherein the second plurality of compounds comprises 100 or more compounds”; “inputting the corresponding projected representation of the respective compound into the untrained or partially untrained decoder to obtain a chemical structure of the respective compound in accordance with a third plurality of weights associated with the untrained or partially untrained decoder”; and “using the trained neural network encoder, trained classifier, and trained decoder to identify a test compound that has the first biological property”. 
Claim 28 recites “a non-transitory computer-readable medium storing one or more computer programs, executable by a computer, the computer comprising one or more processors and a memory, the one or more computer programs collective encoding computer executable instructions that perform a method”; “obtaining a first training dataset, in electronic form, wherein: the first training dataset comprises, for each respective compound in a first plurality of compounds, (i) information regarding a chemical structure of the respective compound and (ii) one or more biological properties, in a plurality of biological properties, of the respective compound, the first plurality of compounds comprises 100 or more compounds, and the plurality of biological properties includes the first biological property”; “obtaining a second training dataset, in electronic form, wherein the second training dataset comprises, for each respective compound in a second plurality of compounds, information regarding a chemical structure of the respective compound and wherein the second plurality of compounds comprises 100 or more compounds”; “inputting the corresponding projected representation of the respective compound into the untrained or partially untrained decoder to obtain a chemical structure of the respective compound in accordance with a third plurality of weights associated with the untrained or partially untrained decoder”; and “using the trained neural network encoder, trained classifier, and trained decoder to identify a test compound that has the first biological property”. 
Claim 29 recites “at a computer system comprising at least one processor and a memory storing at least one program for execution by the at least one processor, the at least one program comprising instructions”; “obtaining a first projected representation of a first compound that is assigned the first biological property by inputting a chemical structure of the first compound into a trained neural network encoder”; “inputting each candidate projection in the one or more candidate projections into a trained decoder thereby obtaining a plurality of candidate compounds”; and “obtaining a corresponding projected representation for the respective candidate compound by inputting a chemical structure of the candidate compound into the trained neural network encoder”.

Regarding the above cited limitations in claims 1 and 27-29 of (i) at a computer system comprising at least one processor and a memory storing at least one program for execution by the at least one processor, the at least one program comprising instructions (claims 1, 27, and 29); and (ii) a non-transitory computer-readable medium storing one or more computer programs, executable by a computer, the computer comprising one or more processors and a memory, the one or more computer programs collective encoding computer executable instructions that perform a method (claim 28): these limitations require only a generic computer component, which does not improve computer technology. Therefore, these limitations equate to mere instructions to implement an abstract idea on a generic computer, which the courts have established does not render an abstract idea eligible in Alice Corp., 573 U.S. at 223, 110 USPQ2d at 1983.
Regarding the above cited limitations in claims 1 and 27-29 of (iii) obtaining a first training dataset, in electronic form, wherein: the first training dataset comprises, for each respective compound in a first plurality of compounds, (i) information regarding a chemical structure of the respective compound and (ii) one or more biological properties, in a plurality of biological properties, of the respective compound, the first plurality of compounds comprises 100 or more compounds, and the plurality of biological properties includes the first biological property (claims 1 and 27-28); (iv) obtaining a second training dataset, in electronic form, wherein the second training dataset comprises, for each respective compound in a second plurality of compounds, information regarding a chemical structure of the respective compound and wherein the second plurality of compounds comprises 100 or more compounds (claims 1 and 27-28); (v) inputting the corresponding projected representation of the respective compound into the untrained or partially untrained decoder to obtain a chemical structure of the respective compound in accordance with a third plurality of weights associated with the untrained or partially untrained decoder (claims 1 and 27-28); (vi) using the trained neural network encoder, trained classifier, and trained decoder to identify a test compound that has the first biological property (claims 1 and 27-28); (vii) obtaining a first projected representation of a first compound that is assigned the first biological property by inputting a chemical structure of the first compound into a trained neural network encoder (claim 29); (viii) inputting each candidate projection in the one or more candidate projections into a trained decoder thereby obtaining a plurality of candidate compounds (claim 29); and (ix) obtaining a corresponding projected representation for the respective candidate compound by inputting a chemical structure of the candidate compound into the 
trained neural network encoder (claim 29). These limitations equate to insignificant extra-solution activity of mere data gathering, because these limitations gather data before or after the recited judicial exceptions of generating a projected representation, generating a compound classification, and updating the corresponding weights (see MPEP § 2106.04(d)).

Additionally, none of the recited dependent claims recite additional elements which would integrate the judicial exception into a practical application. Specifically, claims 3 and 6-7 further limit the use of the trained encoder or decoder; claims 4 and 5 recite extra-solution activities; claims 13, 15, and 17 further limit the featurization of the chemical structure; and claims 18-19 and 23-26 further limit the biological property or cell state. As such, claims 1-11, 13, 15, 17-19, and 23-29 are directed to an abstract idea (Step 2A, Prong Two: NO).

Step 2B: Claims found to be directed to a judicial exception are then further evaluated to determine if the claims recite an inventive concept that provides significantly more than the judicial exception itself (Step 2B). The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the claims recite additional elements that equate to well-understood, routine, and conventional (WURC) limitations (MPEP § 2106.05(d)). The instant independent claims recite the same additional elements described in Step 2A, Prong Two above.
Regarding the above cited limitations in claims 1 and 27-29 of (i) at a computer system comprising at least one processor and a memory storing at least one program for execution by the at least one processor, the at least one program comprising instructions (claims 1, 27, and 29); and (ii) a non-transitory computer-readable medium storing one or more computer programs, executable by a computer, the computer comprising one or more processors and a memory, the one or more computer programs collective encoding computer executable instructions that perform a method (claim 28): these limitations equate to instructions to implement an abstract idea in a generic computing environment, which the courts have established does not provide an inventive concept (see MPEP § 2106.05(d) and MPEP § 2106.05(f)).

Regarding the above cited limitations in claims 1 and 27-28 of (iii) obtaining a first training dataset, in electronic form, wherein: the first training dataset comprises, for each respective compound in a first plurality of compounds, (i) information regarding a chemical structure of the respective compound and (ii) one or more biological properties, in a plurality of biological properties, of the respective compound, the first plurality of compounds comprises 100 or more compounds, and the plurality of biological properties includes the first biological property (claims 1 and 27-28); and (iv) obtaining a second training dataset, in electronic form, wherein the second training dataset comprises, for each respective compound in a second plurality of compounds, information regarding a chemical structure of the respective compound and wherein the second plurality of compounds comprises 100 or more compounds (claims 1 and 27-28): these limitations equate to receiving/transmitting data over a network, which the courts have established as a WURC limitation of a generic computer in buySAFE, Inc. v. Google, Inc., 765 F.3d 1350, 1355, 112 USPQ2d 1093, 1096 (Fed. Cir. 2014).
Regarding the above cited limitations in claims 1 and 27-29 of (v) inputting the corresponding projected representation of the respective compound into the untrained or partially untrained decoder to obtain a chemical structure of the respective compound in accordance with a third plurality of weights associated with the untrained or partially untrained decoder (claims 1 and 27-28); (vi) using the trained neural network encoder, trained classifier, and trained decoder to identify a test compound that has the first biological property (claims 1 and 27-28); (vii) obtaining a first projected representation of a first compound that is assigned the first biological property by inputting a chemical structure of the first compound into a trained neural network encoder (claim 29); (viii) inputting each candidate projection in the one or more candidate projections into a trained decoder thereby obtaining a plurality of candidate compounds (claim 29); and (ix) obtaining a corresponding projected representation for the respective candidate compound by inputting a chemical structure of the candidate compound into the trained neural network encoder (claim 29): these limitations, when viewed individually and in combination, are WURC limitations as taught by Oono et al. (U.S. Patent Application Publication US 2017/0161635 A1).

Oono et al. discloses a generative model used to generate chemical compounds that have desired characteristics, e.g., activity against a selected target (Abstract). Oono et al. further discloses the generation of latent representations of compounds, which are passed through one or more layers of an encoder (limitations (vii) and (ix)) (Para. [0058], [0078], and [0110]). Oono et al. further discloses inputting a latent representation of a compound into the decoder and that, during training, the decoder may learn to regenerate original compound representations from latent representations (limitations (v) and (viii)) (Para. [0078] and [0136]). Oono et al.
further discloses that after the generative model has been trained it may be used to generate a representation of a chemical compound that has a high likelihood of having a specific label/property, and that this compound is not part of the training set (limitation (vi)) (Para. [0009] and [0050]). These additional elements do not comprise an inventive concept when considered individually or as an ordered combination that transforms the claimed judicial exception into a patent-eligible application of the judicial exception. Therefore, the instant claims do not amount to significantly more than the judicial exception itself (Step 2B: NO). As such, claims 1-11, 13, 15, 17-19, and 23-29 are not patent eligible. Claim Rejections - 35 USC § 103 In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows: 1.
Determining the scope and contents of the prior art. 2. Ascertaining the differences between the prior art and the claims at issue. 3. Resolving the level of ordinary skill in the pertinent art. 4. Considering objective evidence present in the application indicating obviousness or nonobviousness. This application currently names joint inventors. In considering patentability of the claims, the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention. Claims 1-9 and 27-28 are rejected under 35 U.S.C. 103 as being unpatentable over Oono et al. (U.S. Patent Application Publication US 2017/0161635 A1; published 6/8/2017) in view of Hernandez et al. (U.S. Patent Application Publication US 2019/0371476 A1; published 12/5/2019). Regarding claim 1, Oono et al. teaches a generative model used to generate chemical compounds that have desired characteristics, e.g., activity against a selected target. The models may be used to generate chemical compounds that satisfy multiple requirements (i.e., a method of discovering a test compound that has a first biological property) (Abstract). Oono et al. further teaches that the invention may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. The computer comprises memory and a processor (i.e., at a computer system comprising at least one processor and a memory storing at least one program for execution by the at least one processor) (Para. [0143]-[0145]). Oono et al.
further teaches that the training data may be compiled from information of chemical compounds and associated labels from databases such as PubChem. The data may also be obtained from drug screening libraries, combinatorial synthesis libraries, etc. (i.e., obtaining a first training dataset, in electronic form) (Para. [0082]). Oono et al. further teaches that the model is trained with a training data set comprising chemical compound representations such as fingerprint data. The training data set further comprises labels associated with at least a subset of the chemical compounds in the training data set. The labels may have label elements such as one or more of compound activities and properties such as bioassay results, toxicity, cross-reactivity, pharmacokinetics, pharmacodynamics, bioavailability, solubility, etc. (i.e., the first training dataset comprises, for each respective compound in a first plurality of compounds, (i) information regarding a chemical structure of the respective compound and (ii) one or more biological properties, in a plurality of biological properties, of the respective compound) (Para. [0043]). Oono et al. further teaches an example where the training data set comprises 2,500 FDA-approved drugs, all of which have the class label Drug, and a large set of other non-drug compounds, all of which have the label Not Drug (i.e., the first plurality of compounds comprises 100 or more compounds and the plurality of biological properties includes the first biological property) (Para. [0182]). Oono et al. further teaches that a trained autoencoder, such as a trained probabilistic or variational autoencoder, may be used to generate or simulate observable-data values by sampling from the modeled joint probability distribution to generate a latent representation and by decoding this latent representation to reconstruct an input data point.
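The encode-decode-reconstruct behavior described above, with weights adjusted by an optimization method, can be sketched as a toy one-dimensional linear autoencoder. This is purely illustrative, assuming simple scalar "structures" and plain gradient descent; Oono's model is a probabilistic/variational autoencoder over fingerprints, not this toy.

```python
import random

# Toy sketch: a 1-D linear "autoencoder" whose decoder learns to regenerate
# the original input from its latent projection. Weights are updated by
# gradient descent on the reconstruction error.
random.seed(0)
data = [random.uniform(-1, 1) for _ in range(100)]  # stand-in "structures"
w_enc, w_dec, lr = 0.5, 0.5, 0.05

def loss(we, wd):
    # Mean squared reconstruction error over the dataset.
    return sum((x - wd * we * x) ** 2 for x in data) / len(data)

initial = loss(w_enc, w_dec)
for _ in range(200):
    # Analytic gradients of the reconstruction loss w.r.t. each weight.
    grad_e = sum(-2 * (x - w_dec * w_enc * x) * w_dec * x for x in data) / len(data)
    grad_d = sum(-2 * (x - w_dec * w_enc * x) * w_enc * x for x in data) / len(data)
    w_enc -= lr * grad_e
    w_dec -= lr * grad_d

final = loss(w_enc, w_dec)
```

After training, the reconstruction error shrinks, which is the sense in which the decoder "learns to regenerate original compound representations from latent representations."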
The weights of the autoencoder are adjusted during training by an optimization method (i.e., training an untrained or partially untrained neural network encoder) (Para. [0054]). Oono et al. further teaches that the classifier is trained with supervised learning (i.e., training an untrained or partially untrained classifier) (Para. [0110]). Oono et al. further teaches that an autoencoder is trained on a large set of chemical compound representations. A latent representation generator (LRG) may form the first part of the autoencoder, in a similar position as an encoder. The LRG can be used to generate latent representations of compounds. The weights of the autoencoder are adjusted during training by an optimization method (i.e., for each respective compound in the first/second plurality of compounds, (a) projecting the information regarding the chemical structure of the respective compound into a latent representation space in accordance with a first plurality of weights associated with the untrained or partially untrained neural network encoder, to obtain a corresponding projected representation of the respective compound) (Para. [0054] and [0110]). Oono et al. further teaches that in the example of FIG. 6, the encoder maps each of the embedding vectors from a dimensionality of 1x10 to a dimensionality of 1x95 in the common latent space (e.g., there are 10x95 = 950 weights). It should be appreciated, however, that the output dimensionality of the encoder may take any suitable value (e.g., 100 instead of 95; 10x100 = 1000 weights) (i.e., wherein the first plurality of weights comprises 1000 weights) (Para. [0073]). Oono et al. further teaches that the encoder outputs a latent variable Z. From the latent variable Z, the sampling module can draw a sample to create a latent representation of the seed compound and its label information.
This latent representation and the desired label may be input to the decoder, which can decode them to generate a random variable defined over the space of possible fingerprint values. The latent representations are input to a classifier. The classifier may be trained with supervised learning. The training data set of the classifier may comprise labeled drug and non-drug compounds. The classifier may be trained to output a continuous score that represents the compound's drug likeness (i.e., inputting the corresponding projected representation of the respective compound into the untrained or partially untrained classifier to obtain a classification of the respective compound in accordance with a second plurality of weights associated with the untrained or partially untrained classifier) (Para. [0078] and [0110]). Oono et al. further teaches a second training data set comprising chemical fingerprint data and an associated set of labels having a second label element. The labels having a first label element and the labels having a second label element are introduced into different portions of the generative model during training, for example into the encoder and decoder, respectively. The second label element represents the activity of a chemical compound associated with a chemical fingerprint in a second bioassay. In one example, the training data comprises approximately 2,500 FDA-approved drugs (i.e., obtaining a second training dataset, in electronic form, wherein the second training dataset comprises, for each respective compound in a second plurality of compounds, information regarding a chemical structure of the respective compound and wherein the second plurality of compounds comprises 100 or more compounds) (Para. [0047] and [0182]). Oono et al. 
further teaches training the probabilistic decoder to decode the latent representation as a probabilistic reconstruction of the chemical compound fingerprint (i.e., training an untrained or partially untrained decoder) (Para. [0009]). Oono et al. further teaches that from the latent variable Z, the sampling module can draw a sample to create a latent representation of the seed compound and its label information. This latent representation and the desired label may be input to the decoder, which can decode them to generate a random variable defined over the space of possible fingerprint values. That is, during the training of the autoencoder, the decoder may learn to regenerate original compound representations from latent representations (i.e., inputting the corresponding projected representation of the respective compound into the untrained or partially untrained decoder to obtain a chemical structure of the respective compound in accordance with a third plurality of weights associated with the untrained or partially untrained decoder) (Para. [0078] and [0136]). Oono et al. further teaches that after the generative model has been trained, it may be used to generate a representation of a chemical compound that has a high likelihood of meeting the requirements of a specific label value (i.e., using the trained neural network encoder, trained classifier, and trained decoder to identify a test compound that has the first biological property) (Para. [0050]). Oono et al. further teaches that the model generates an output comprising identifying information for a chemical compound not represented in the training set (i.e., wherein the test compound is not present in the first and second training set) (Para. [0009]). Regarding claim 2, Oono et al.
teaches that compound information is typically characterized by a set of molecular descriptors, representing chemical information, such as a chemical formula, chemical structure, electron density or other chemical characteristics. Compound information may comprise fingerprint representations of each compound (i.e., wherein the information regarding a chemical structure of the respective compound in the first plurality of compounds is a chemical structure of the respective compound) (Para. [0035]). Regarding claim 3, Oono et al. teaches that the encoder may be used to encode representations of chemical compounds, e.g., fingerprints, as an output of a different form, e.g., a latent variable. During training, the encoder must learn an encoding model that specifies a non-linear mapping of input x to latent variable Z (i.e., interpolating a projected representation of a first compound and a projected representation of a second compound, produced by the trained neural network encoder, wherein the first and second compound have the first molecular property thereby obtaining an interpolated projection) (Para. [0056]). Oono et al. further teaches that the latent representation and the desired label may both be input to the decoder. From these inputs, the decoder may generate a random variable over a distribution of molecular descriptors (e.g., fingerprints) likely to meet the requirements of the desired label (i.e., inputting the interpolated projection into the trained decoder thereby obtaining a plurality of candidate compounds) (Para. [0076]). Oono et al. further teaches that the encoder is used to encode representations of chemical compounds, e.g., fingerprints, as an output of a different form, e.g., a latent variable. During training, the encoder must learn an encoding model that specifies a non-linear mapping of input x to latent variable Z.
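The weight arithmetic cited earlier from Oono's FIG. 6 example (a 1x10 to 1x95 mapping giving 10x95 = 950 weights, or 1000 weights at an output dimensionality of 100) is just the product of the input and output dimensionalities of a fully connected layer, biases ignored. A trivial check:

```python
# Weight count of a single fully connected encoder layer mapping an
# embedding of dimension d_in to a latent dimension d_out (no bias terms,
# matching the Office Action's arithmetic).
def encoder_weight_count(d_in: int, d_out: int) -> int:
    return d_in * d_out

fig6_weights = encoder_weight_count(10, 95)       # FIG. 6 example
claimed_weights = encoder_weight_count(10, 100)   # "1000 weights" variant
```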
Chemical compound representations may be passed through one or more layers of an encoder and labels associated with each chemical compound representation may be input at a later layer of the encoder (i.e., for each respective candidate compound in all or a portion of the plurality of candidate compounds: obtaining a corresponding projected representation for the respective candidate compound by inputting a chemical structure of the candidate compound into the trained neural network encoder) (Para. [0056] and [0058]). Oono et al. further teaches that the latent representations may be input to a classifier. The classifier may be trained with supervised learning. The training data set of the classifier may comprise labeled drug and non-drug compounds. The classifier may be trained to output a continuous score that represents the compound's drug likeness (i.e., obtaining a classification of the respective candidate compound by inputting the corresponding projected representation of the respective candidate compound into the trained classifier). Oono et al. further teaches that the model trained on a training set of fingerprints and labels may generate a representation of a chemical compound that has a high likelihood of meeting the requirements of a specified label value (i.e., when the trained classifier indicates that the corresponding projected representation of the respective candidate compound has the first biological property, the respective candidate compound is deemed to have the first biological property) (Para. [0050] and [0110]). Regarding claims 4 and 5, Oono et al. teaches that subsequent activities, such as synthesis, in vivo and in vitro testing, and clinical trials with a chemical compound are understood to follow in certain embodiments of the invention (i.e., subjecting the first compound to a wet lab assay that verifies that the respective candidate compound has the first biological property and synthesizing the first compound) (Para. [0043]).
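The supervised classifier training described above — updating weights by comparing each compound's classification to its labeled property — can be sketched with a perceptron rule. This is a stand-in for the backpropagation-based training the references actually use; the feature vectors and labels below are invented for illustration.

```python
# Hedged sketch: a perceptron trained so its classification of each
# "compound" matches the labeled biological property. Features and labels
# are invented; real latent representations are high-dimensional.
training = [([1, 0], 1), ([0, 1], 0), ([1, 1], 1), ([0, 0], 0)]
weights, bias, lr = [0.0, 0.0], 0.0, 0.1

def classify(x):
    # 1 = has the property, 0 = does not.
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

for _ in range(20):  # a few epochs over the labeled set
    for x, label in training:
        err = label - classify(x)  # compare classification to label
        weights = [w + lr * err * xi for w, xi in zip(weights, x)]
        bias += lr * err
```

After training, every labeled compound is classified consistently with its recorded property, which is the sense of "obtaining a trained classifier" in the claim language.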
Regarding claim 6, Oono et al. teaches an ab initio generation process, where a latent representation is drawn from a standard normal distribution N(0,1) by the sampling module. A single desired label is used. For each chemical compound representation to be generated by the model, a separate latent representation is drawn from N(0,1). For example, if the user wishes to generate two chemical compound representations, two separate latent representations are drawn from N(0,1) (i.e., obtaining a first compound, not present in the first or second training dataset, that has the first biological property and has a known chemical structure) (Para. [0158]). Oono et al. further teaches the trained encoder as described for claim 1 above. The data is provided to the encoder as pairs comprising a chemical compound representation and the label associated with the represented compound (i.e., obtaining a projected representation for the first compound by inputting a chemical structure of the first compound into the trained neural network encoder) (Para. [0148]). Oono et al. further teaches that the latent representation is input to the classifier. The classifier outputs a continuous score that represents the compound's drug likeness. To apply the ranking module, members of the unranked set of generated compound fingerprints are input to the latent representation generator and the generated latent representations are then input to the classifier. Each compound receives a drug likeness score from the classifier. The compounds are then ordered from highest score to lowest score (i.e., inputting the projected representation of the first compound into the trained classifier to verify that the trained classifier identifies the first compound as having the first biological property) (Para. [0182]). Oono et al. further teaches that the latent representation is input into the decoder.
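The ab initio generation step just described — one independent N(0,1) draw per compound to be generated, each decoded under a desired label — can be sketched as follows. The decode() function is a hypothetical stand-in; a real trained decoder maps the latent draw and label to a fingerprint.

```python
import random

# Sketch of ab initio generation: each compound representation gets its own
# latent draw from N(0, 1). decode() is a hypothetical placeholder for the
# trained decoder.
random.seed(42)

def decode(z: float, desired_label: str) -> dict:
    # Placeholder: a real decoder maps (latent, label) to a fingerprint.
    return {"latent": z, "label": desired_label}

n_to_generate = 2
latents = [random.gauss(0.0, 1.0) for _ in range(n_to_generate)]  # separate draws
compounds = [decode(z, "active") for z in latents]
```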
The output of the decoder is used to generate chemical compound representations (i.e., inputting the projected representation of the first compound into the trained decoder to verify that the trained decoder reconstructs the chemical structure of the first compound) (Para. [0159] and [0170]). Regarding claim 7, Oono et al. teaches that the compound information is characterized by a set of molecular descriptors representing chemical information, such as a chemical formula, chemical structure, electron density or other chemical characteristics (i.e., the information regarding the chemical structure of the respective compound is a molecular structure of the respective compound) (Para. [0035]). Oono et al. further teaches an example where data is provided to the encoder as pairs comprising a chemical compound representation (x^D), such as a fingerprint comprising a feature vector of molecular descriptors, and the label (y^D) associated with the represented compound (i.e., forming a featurization of the chemical structure) (Para. [0148]). Oono et al. further teaches that the training set comprises pairs of a vector of values of molecular descriptors and a vector of label element values. The latent representation of the compound's fingerprint represents high-level abstractions and non-linear combinations of features that may provide a more accurate explanation of the behavior of the compound than standard drug likeness properties are able to provide (i.e., incorporating the featurization of the chemical structure into a multi-dimensional vector space) (Para. [0035] and [0109]). Oono et al. further teaches the projection of the information regarding the chemical structure of the respective compound into the latent representation space in accordance with the first plurality of weights associated with the untrained or partially untrained neural network encoder as described for claim 1 above. Oono et al.
further teaches that the pair input to the encoder may be described as I_E = (x_i^D, y_i^D), wherein x_i^D is a real-valued vector with dimensionality dim(x_i^D) and wherein y_i^D denotes label data for the corresponding x_i^D. The dimensionality of x_i^D, dim(x_i^D), may be fixed throughout a training data set. Elements of y^D may be scalars or vectors optionally having arbitrary dimensions. Label element values in y^D may be continuous or binary (i.e., inputting the multi-dimensional vector space of the chemical structure into the untrained or partially untrained neural network encoder) (Para. [0148]). Regarding claims 8 and 9, Oono et al. teaches that training comprises training the probabilistic encoder to encode a chemical compound fingerprint as a vector of means and a vector of standard deviations defining a latent variable (i.e., wherein the featurization of the chemical structure is a tensor and wherein the tensor is a one-dimensional vector) (Para. [0009]). Regarding claim 27, Oono et al.
teaches the limitations of a computer system, comprising one or more processors and memory, the memory storing instructions for performing a method, using the one or more processors; obtaining a first training dataset, in electronic form; the first training dataset comprises, for each respective compound in a first plurality of compounds, (i) information regarding a chemical structure of the respective compound and (ii) one or more biological properties, in a plurality of biological properties, of the respective compound, the first plurality of compounds comprises 100 or more compounds, the plurality of biological properties includes the first biological property; training an untrained or partially untrained neural network encoder and an untrained or partially untrained classifier by performing a first procedure that comprises: (i) for each respective compound in the first plurality of compounds, (a) projecting the information regarding the chemical structure of the respective compound into a latent representation space in accordance with a first plurality of weights associated with the untrained or partially untrained neural network encoder to obtain a corresponding projected representation of the respective compound, and (b) inputting the corresponding projected representation of the respective compound into the untrained or partially untrained classifier to obtain a classification of the respective compound in accordance with a second plurality of weights associated with the untrained or partially untrained classifier; obtaining a second training dataset, in electronic form, wherein the second training dataset comprises, for each respective compound in a second plurality of compounds, information regarding a chemical structure of the respective compound and wherein the second plurality of compounds comprises 100 or more compounds; training an untrained or partially untrained decoder by performing a second procedure that comprises: (i) for each respective 
compound in the second plurality of compounds, (a) projecting the information regarding the chemical structure of the respective compound into a latent representation space in accordance with the first plurality of weights associated with the trained neural network encoder to obtain a corresponding projected representation of the respective compound, and (b) inputting the corresponding projected representation of the respective compound into the untrained or partially untrained decoder to obtain a chemical structure of the respective compound in accordance with a third plurality of weights associated with the untrained or partially untrained decoder; and using the trained neural network encoder, trained classifier, and trained decoder to identify a test compound that has the first biological property, wherein the test compound is not present in the first and second training set as described for claim 1 above. Regarding claim 28, Oono et al. teaches that the present invention also relates to apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium. The computer comprises memory and a processor (i.e., a non-transitory computer-readable medium storing one or more computer programs, executable by a computer, for performing a method, the computer comprising one or more processors and a memory, the one or more computer programs collectively encoding computer executable instructions that perform a method) (Para. [0143]). Oono et al. 
further teaches the limitations of obtaining a first training dataset, in electronic form; the first training dataset comprises, for each respective compound in a first plurality of compounds, (i) information regarding a chemical structure of the respective compound and (ii) one or more biological properties, in a plurality of biological properties, of the respective compound, the first plurality of compounds comprises 100 or more compounds, the plurality of biological properties includes the first biological property; training an untrained or partially untrained neural network encoder and an untrained or partially untrained classifier by performing a first procedure that comprises: (i) for each respective compound in the first plurality of compounds, (a) projecting the information regarding the chemical structure of the respective compound into a latent representation space in accordance with a first plurality of weights associated with the untrained or partially untrained neural network encoder to obtain a corresponding projected representation of the respective compound, and (b) inputting the corresponding projected representation of the respective compound into the untrained or partially untrained classifier to obtain a classification of the respective compound in accordance with a second plurality of weights associated with the untrained or partially untrained classifier; obtaining a second training dataset, in electronic form, wherein the second training dataset comprises, for each respective compound in a second plurality of compounds, information regarding a chemical structure of the respective compound and wherein the second plurality of compounds comprises 100 or more compounds; training an untrained or partially untrained decoder by performing a second procedure that comprises: (i) for each respective compound in the second plurality of compounds, (a) projecting the information regarding the chemical structure of the respective compound into a latent 
representation space in accordance with the first plurality of weights associated with the trained neural network encoder to obtain a corresponding projected representation of the respective compound, and (b) inputting the corresponding projected representation of the respective compound into the untrained or partially untrained decoder to obtain a chemical structure of the respective compound in accordance with a third plurality of weights associated with the untrained or partially untrained decoder; and using the trained neural network encoder, trained classifier, and trained decoder to identify a test compound that has the first biological property, wherein the test compound is not present in the first and second training set as described for claim 1 above. Oono et al. does not teach updating the first plurality of weights and the second plurality of weights by comparing the classification of each respective compound in the first plurality of compounds to the one or more biological properties of the respective compound in the first training dataset thereby obtaining a trained neural network encoder and a trained classifier; and updating the third plurality of weights by comparing the chemical structure of each respective compound outputted by the untrained or partially untrained decoder to the actual chemical structure of the respective compound from the second training dataset thereby obtaining a trained decoder. Regarding claims 1, 27, and 28, Hernandez et al. teaches a method for predicting an association between two sets of input data, including drug encoding and decoding algorithms (Abstract and Para. [0011]). Hernandez et al. further teaches that the weights in the encoders and decoders are updated based, at least in part, on a comparison of the decoded output vector and the embedded vector provided as input to the modality-specific encoder. 
For example, a self-supervised learning technique is used to update values of parameters (e.g., weights) in the encoder and decoder during training (i.e., updating the first plurality of weights and the second plurality of weights by comparing the classification of each respective compound in the first plurality of compounds to the one or more biological properties of the respective compound in the first training dataset thereby obtaining a trained neural network encoder and a trained classifier) (Para. [0069]). Hernandez et al. further teaches that a deviation between the decoded vectors output from the decoders and the embedding input vectors provided as input to the encoders is measured and used to update the weights in the statistical model such that the model learns the associations between the data in a self-supervised way. In some embodiments, the self-supervised learning technique is implemented using a negative sampling loss function, and the error determined from the negative sampling loss function is backpropagated through the encoders and decoders (and optionally the embedding matrices used for data embedding) to update the estimates of the parameters (e.g., weights) for each of these components of the model (i.e., updating the third plurality of weights by comparing the chemical structure of each respective compound outputted by the untrained or partially untrained decoder to the actual chemical structure of the respective compound from the second training dataset thereby obtaining a trained decoder) (Para. [0092]). An invention would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention if some teaching, suggestion, or motivation in the prior art would have led that person to combine the prior art teachings to arrive at the claimed invention. Oono et al.
discloses a generative model used to generate chemical compounds that have desired characteristics, e.g., activity against a selected target, including an encoder, decoder, and predictor/classifier (Abstract and Para. [0006]). Hernandez et al. discloses a method for predicting an association between two sets of input data, including drug encoding and decoding algorithms (Abstract and Para. [0011]). Therefore, one of ordinary skill in the art would have been motivated to combine the model used to generate chemical compounds that have desired characteristics shown by Oono et al. with the weight adjustment of Hernandez et al. because the updated weights adapt the network such that the embedding vectors for connected nodes will be closer in embedded representation space than non-connected nodes (Hernandez et al., Para. [0072]). One of ordinary skill in the art would be able to combine the teachings of Oono et al. and Hernandez et al. with reasonable expectation of success due to the same nature of the problem to be solved, since both incorporate a method for embedding and decoding compound information. Therefore, regarding claims 1-9 and 27-28, the instant invention is prima facie obvious (MPEP § 2142). Claims 10-11, 13, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Oono et al. in view of Hernandez et al. as applied to claims 1-9 and 27-28 above, and further in view of Torng et al. (Graph Convolutional Neural Networks for Predicting Drug-Target Interactions. J. Chem. Inf. Model. 59(10): 4131-4139 (2019); published 10/3/2019). Oono et al.
in view of Hernandez et al., as applied to claims 1-9 and 27-28 above, does not teach wherein the featurization of the chemical structure is an extended circular fingerprint, or a molecular graph of a plurality of one-hot-encoded vectors; wherein the multi-dimensional vector space is an N-dimensional space, wherein N is an integer between 20 and 80; wherein the incorporating the featurization of the chemical structure into the multi-dimensional vector space for the chemical structure comprises inputting the featurization of the chemical structure into a spatial graph convolutional network (GCN); converting the chemical structure to a simplified molecular-input line-entry system (SMILES) string; and converting the SMILES string into a molecular graph representation that comprises an adjacency matrix and a feature matrix. Regarding claims 10 and 11, Torng et al. teaches that small molecules can be naturally represented as 2D molecular graphs. Each atom node is associated with simple atom descriptors of size 62, including one-hot encoding of the atom’s element, the degree of the atom, the number of attached hydrogen atoms, the implicit valence, and an aromaticity indicator (i.e., wherein the featurization of the chemical structure is a molecular graph of a plurality of one-hot-encoded vectors and wherein the multi-dimensional vector space is an N-dimensional space, wherein N is an integer between 20 and 80) (Pg. 4135, Col. 1, Para. 2). Regarding claim 13, Torng et al. teaches the ligand graph-convolutional neural network to extract features from the 2D ligand graphs, which takes input from the featurized molecular graph of the ligand. 
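The molecular-graph representation Torng et al. is cited for — a per-atom feature matrix of one-hot descriptors plus connectivity — can be illustrated with a tiny hand-coded example. This is a sketch only: the molecule (ethanol's heavy atoms) is hard-coded rather than parsed from SMILES with RDKit, and the element vocabulary is trimmed far below Torng's 62 descriptors.

```python
# Minimal sketch of a 2D molecular graph: an adjacency matrix plus a
# feature matrix of one-hot element encodings. Vocabulary and molecule
# are simplified for illustration.
ELEMENTS = ["C", "N", "O"]          # trimmed vocabulary (assumption)
atoms = ["C", "C", "O"]             # heavy atoms of ethanol, CCO
bonds = [(0, 1), (1, 2)]            # single bonds C-C and C-O

n = len(atoms)
adjacency = [[0] * n for _ in range(n)]
for i, j in bonds:
    adjacency[i][j] = adjacency[j][i] = 1  # undirected graph

# One row per atom; one-hot over the element vocabulary.
features = [[1 if atom == e else 0 for e in ELEMENTS] for atom in atoms]
```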
Graph convolution networks employ similar concepts of local spatial filters, but operate on graphs, and therefore have been applied to 2D molecular graphs to learn small molecule representations (i.e., wherein the incorporating the featurization of the chemical structure into the multi-dimensional vector space for the chemical structure comprises inputting the featurization of the chemical structure into a spatial graph convolutional network (GCN)) (Pg. 4132, Fig. 1 and Pg. 4132, Col. 2, Para. 2). Regarding claim 17, Torng et al. teaches that small molecules can be naturally represented as 2D molecular graphs, with nodes representing individual atoms and edges representing bonds. The SMILES string encoding of each molecule is converted into a 2D molecular graph using RDKit. An example molecular structure is shown in Fig. 1 (i.e., converting the chemical structure to a simplified molecular-input line-entry system (SMILES) string) (Pg. 4135, Col. 1, Para. 2 and Pg. 4132, Fig. 1). Torng et al. further teaches that each atom node is associated with simple atom descriptors of size 62, including one-hot encoding of the atom’s element, the degree of the atom, the number of attached hydrogen atoms, the implicit valence, and an aromaticity indicator (i.e., a feature matrix). The edges are associated with bond features of size 6, including the bond type (single, double, triple, or aromatic), whether the bond was conjugated, and whether the bond was part of a ring (i.e., an adjacency matrix) (i.e., converting the SMILES string into a molecular graph representation that comprises an adjacency matrix and a feature matrix) (Pg. 4135, Col. 1, Para. 2). Therefore, regarding claims 10-11, 13, and 17, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the model used to generate chemical compounds that have desired characteristics of Oono et al. in view of Hernandez et al.
with the featurization of Torng et al. because the representation of Torng et al. incorporates the atom descriptors including aromaticity, bond type, conjugation, and connectivity (Torng et al., Pg. 4135, Col. 1, Para. 2) highlighted by Oono et al. as important molecular descriptors for compound characterization (Oono et al., Para. [0075]). One of ordinary skill in the art would be able to combine the teachings of Oono et al. in view of Hernandez et al. with Torng et al. with reasonable expectation of success due to the same nature of the problem to be solved, since both incorporate a method for encoding chemical structures. Therefore, regarding claims 10-11, 13, and 17, the instant invention is prima facie obvious (MPEP § 2142).

Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Oono et al. in view of Hernandez et al. as applied to claims 1-9 and 27-28 above, and further in view of Knyazev et al. (Spectral Multigraph Networks for Discovering and Fusing Relationships in Molecules. arXiv:1811.09595 (2018). DOI: https://doi.org/10.48550/arXiv.1811.09595; published 11/23/2018). Oono et al. in view of Hernandez et al., as applied to claims 1-9 and 27-28 above, does not teach wherein the incorporating the featurization of the molecular structure into the multi-dimensional vector space for the chemical structure comprises an application of a spectral graph convolution (SGC) to the featurization of the chemical structure. Regarding claim 15, Knyazev et al. teaches an application of spectral graph convolutional networks (GCNs) (Abstract). Knyazev et al. further teaches that the model was evaluated using five chemical graph classification datasets. Each graph represents some chemical compound labeled according to its functional properties. Node features are discrete in the datasets, and represented as one-hot vectors.
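Knyazev et al.'s spectral multigraph networks are considerably more involved, but the basic spectral filtering step they build on can be sketched in a few lines: project one-hot node features into the eigenbasis of the normalized graph Laplacian, scale each frequency component by a spectral filter, and project back. The toy graph and the exponential filter below are illustrative assumptions, not the reference's actual architecture.

```python
import numpy as np

# Toy 4-node path graph standing in for a molecular graph (illustrative only).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.eye(4)  # one-hot node features, as in the datasets described above

# Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
d = A.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L = np.eye(4) - D_inv_sqrt @ A @ D_inv_sqrt

# Spectral filtering: move to the Laplacian eigenbasis, scale each frequency
# component by a filter g(lambda), then transform back to the node domain.
eigvals, U = np.linalg.eigh(L)
g = np.exp(-eigvals)                 # example low-pass filter (assumption)
X_filtered = U @ np.diag(g) @ U.T @ X
```

In a learned spectral GCN the filter values would be trainable parameters rather than the fixed exponential used here.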
Additional node or edge attributes were not considered in the model (i.e., wherein the incorporating the featurization of the molecular structure into the multi-dimensional vector space for the chemical structure comprises an application of a spectral graph convolution (SGC) to the featurization of the chemical structure) (Pg. 5, Para. 3-6). Therefore, regarding claim 15, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the model used to generate chemical compounds that have desired characteristics of Oono et al. in view of Hernandez et al. with the molecular structure featurization of Knyazev et al. because the method of Knyazev et al. is shown to learn from graphs of arbitrary size and structure as well as improve the accuracy of the model (Knyazev et al., Pg. 8, Para. 4). One of ordinary skill in the art would be able to combine the teachings of Oono et al. in view of Hernandez et al. with Knyazev et al. with reasonable expectation of success due to the same nature of the problem to be solved, since both incorporate a method for featurization of compounds. Therefore, regarding claim 15, the instant invention is prima facie obvious (MPEP § 2142).

Claims 18-19 and 23-26 are rejected under 35 U.S.C. 103 as being unpatentable over Oono et al. in view of Hernandez et al. as applied to claims 1-9 and 27-28 above, and further in view of Costello et al. (A community effort to assess and improve drug sensitivity prediction algorithms. Nat Biotechnol 32, 1202–1212 (2014); published 6/1/2014). Oono et al.
in view of Hernandez et al., as applied to claims 1-9 and 27-28 above, does not teach wherein the first biological property is selected from the group consisting of: an indication as to whether a compound activates a cell state, an indication as to whether a compound inhibits a cell state, an affinity for a biological target, an EC50 of the compound for inhibiting a biological state, an IC50 of the compound for inhibiting a biological state, an ED50 of the compound for inhibiting a biological state, an LD50 of the compound for inhibiting a biological state, and a TD50 of the compound for inhibiting a biological state; wherein the cell state is characterized by (i) an up-regulation or down-regulation of one or more respective genes in a plurality of genes associated with the cell state, (ii) a diseased state, (iii) an upregulation or a down-regulation of one or more biological pathways, (iv) an upregulation or a down-regulation of one or more biological pathways in a plurality of biological pathways; wherein the cell state is characterized by an upregulation or a down-regulation of one or more of cellular-components; wherein the one or more cellular-components comprises a plurality of genes; wherein the one or more cellular-components are quantified using single-cell ribonucleic acid (RNA) sequencing (scRNA-seq), scTag-seq, single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq), CyTOF/SCoP, E-MS/Abseq, miRNA-seq, CITE-seq, or any combinations thereof, or summaries of the same, including combinations, such as linear combinations, representing activated pathways in the single-cell cellular-component expression dataset; and wherein the one or more cellular-components comprises a plurality of proteins. Regarding claim 18, Costello et al. teaches an analysis of 44 drug sensitivity prediction algorithms. The top-performing approaches modeled nonlinear relationships and incorporated biological pathway information (Abstract). Costello et al. 
further teaches that six genomic, epigenomic, and proteomic profiling data sets were generated for 53 breast cancer cell lines. Drug responses as measured by growth inhibition were assessed after treating the 53 cell lines with 28 drugs (i.e., an indication as to whether a compound inhibits a cell state) (Pg. 1204, Fig. 1). Regarding claim 19, Costello et al. teaches that the profiling data sets include gene expression data (i.e., an up-regulation or down-regulation of one or more respective genes in a plurality of genes associated with the cell state) (Pg. 1204, Fig. 1). Costello et al. further teaches that the data was generated for breast cancer cell lines (i.e., a diseased state) (Pg. 1204, Fig. 1). Costello et al. further teaches the inclusion of additional variables in the form of annotated biological pathways (i.e., an upregulation or a down-regulation of one or more biological pathways and an upregulation or a down-regulation of one or more biological pathways in a plurality of biological pathways) (Pg. 1204, Col. 2, Para. 1). Regarding claim 23, Costello et al. teaches that the profiling data sets include RPPA, an antibody-based method to quantitatively measure protein abundance (i.e., wherein the cell state is characterized by an upregulation or a down-regulation of one or more cellular components) (Pg. 1204, Fig. 1, and Online Methods, Pg. 1, Col. 1, Para. 8). Regarding claim 24, Costello et al. teaches that the profiling data sets include gene expression data (i.e., wherein the one or more cellular-components comprise a plurality of genes) (Pg. 1204, Fig. 1). Regarding claim 25, Costello et al. teaches that the profiling data sets include RNA-seq data. Although not explicitly disclosed by Costello et al., it would have been obvious to one of ordinary skill in the art to use the RNA sequencing data from single cells instead of bulk, as described by Costello et al.
(i.e., wherein the one or more cellular-components are quantified using single-cell ribonucleic acid (RNA) sequencing (scRNA-seq)) (Pg. 1204, Fig. 1, and Online Methods, Pg. 1, Col. 1, Para. 6). Regarding claim 26, Costello et al. teaches that the profiling data sets include RPPA, an antibody-based method to quantitatively measure protein abundance (i.e., wherein the one or more cellular-components comprises a plurality of proteins) (Pg. 1204, Fig. 1, and Online Methods, Pg. 1, Col. 1, Para. 8). Therefore, regarding claims 18-19 and 23-26, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the model used to generate chemical compounds that have desired characteristics of Oono et al. in view of Hernandez et al. with the biological properties of Costello et al. because incorporation of gene expression data, as well as other large profiling datasets, significantly enhances prediction performance (Costello et al., Pg. 1210, Col. 2, Para. 2). One of ordinary skill in the art would be able to combine the teachings of Oono et al. in view of Hernandez et al. with Costello et al. with reasonable expectation of success due to the same nature of the problem to be solved, since both include a method of incorporating biological data into the model. Therefore, regarding claims 18-19 and 23-26, the instant invention is prima facie obvious (MPEP § 2142).

Claim 29 is rejected under 35 U.S.C. 103 as being unpatentable over Oono et al. (U.S. Patent Application Publication US 2017/0161635 A1; published 6/8/2017) in view of Torng et al. (Graph Convolutional Neural Networks for Predicting Drug-Target Interactions. J. Chem. Inf. Model. 59(10): 4131-4139 (2019); published 10/3/2019). Regarding claim 29, Oono et al. teaches a generative model used to generate chemical compounds that have desired characteristics, e.g., activity against a selected target.
The models may be used to generate chemical compounds that satisfy multiple requirements (i.e., a method of discovering a candidate compound that has a first biological property) (Abstract). Oono et al. further teaches the limitation of a computer system comprising at least one processor and a memory storing at least one program for execution by the at least one processor as described for claim 1 above. Oono et al. further teaches that an autoencoder is trained on a large set of chemical compound representations. A latent representation generator (LRG) may form the first part of the autoencoder, in a similar position as an encoder. The LRG can be used to generate latent representations of compounds. The generated representation, e.g., a fingerprint, satisfies a certain set of assay results (i.e., obtaining a first projected representation of a first compound that is assigned the first biological property by inputting a chemical structure of the first compound into a trained neural network encoder) (Para. [0110] and [0112]). Oono et al. further teaches that when the fingerprint of the seed compound and its associated label are input to the encoder, the encoder can output a latent variable Z. From the latent variable Z, the sampling module can draw a sample to create a latent representation of the seed compound and its label information (i.e., using the first projection to obtain one or more candidate projections) (Para. [0078]). Oono et al. further teaches that the latent representation and the desired label may be input to the decoder, which can decode them to generate a random variable defined over the space of possible fingerprint values (i.e., inputting each candidate projection in the one or more candidate projections into a trained decoder thereby obtaining a plurality of candidate compounds) (Para. [0078]). Oono et al.
further teaches that the model generates an output comprising identifying information for a chemical compound not represented in the training set (i.e., wherein the first compound is not present in the plurality of candidate compounds) (Para. [0009]). Oono et al. further teaches that in the ab initio case, the generation of candidate compounds is constrained only by the desired label. Accordingly, ab initio generation may be used when there are no restrictions on the physical structure of the candidate compounds. Because the generated compounds are restricted only by the desired label, ab initio generation may be more likely to generate novel compounds that may not yet exist in a chemical compound database (Para. [0076]). The compounds are then input into the trained encoder (i.e., obtaining a corresponding projected representation for the respective candidate compound by inputting a chemical structure of the candidate compound into the trained neural network encoder) (Para. [0078]). Oono et al. further teaches that the seed compound may also be a compound that has been physically tested to possess a subset of desired label outcomes, but for which an improvement in certain other label outcomes, such as decreased toxicity, improved solubility, and/or improved ease of synthesis, is desired (Para. [0077]). The classifier may be used to classify a compound fingerprint by assigning a drug likeness score (Para. [0099]).
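The encode, sample, decode cycle described in the preceding paragraphs can be sketched at a high level. The linear "networks", dimensions, and bit-thresholding below are placeholder assumptions for illustration, not Oono et al.'s actual architecture, which is trained end to end on fingerprints and labels.

```python
import numpy as np

rng = np.random.default_rng(0)

FP_DIM, LATENT_DIM = 16, 4   # toy fingerprint / latent sizes (assumptions)
W_enc = rng.standard_normal((LATENT_DIM, FP_DIM)) * 0.1   # stand-in encoder
W_dec = rng.standard_normal((FP_DIM, LATENT_DIM)) * 0.1   # stand-in decoder

def encode(fingerprint):
    """Map a compound fingerprint to a latent mean/log-variance (toy linear encoder)."""
    mu = W_enc @ fingerprint
    log_var = np.zeros(LATENT_DIM)   # unit variance for simplicity
    return mu, log_var

def sample(mu, log_var):
    """Draw a latent representation around the seed compound's encoding."""
    return mu + np.exp(0.5 * log_var) * rng.standard_normal(LATENT_DIM)

def decode(z):
    """Map a latent point to probabilities over fingerprint bits (toy decoder)."""
    return 1.0 / (1.0 + np.exp(-(W_dec @ z)))

# Seed compound -> latent Z -> sampled neighborhood -> candidate fingerprints
seed_fp = rng.integers(0, 2, FP_DIM).astype(float)
mu, log_var = encode(seed_fp)
candidates = [(decode(sample(mu, log_var)) > 0.5).astype(float) for _ in range(5)]
```

Each candidate is a binary fingerprint decoded from a latent sample near the seed compound, mirroring the seed-compound workflow of Para. [0078].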
A generative model trained on a training set of fingerprints and labels may generate a representation of a chemical compound that has a high likelihood of meeting the requirements of a specified label value (i.e., the first biological property) (i.e., obtaining a classification of the respective candidate compound by inputting the corresponding projected representation of the respective candidate compound into the trained classifier, wherein, when the trained classifier indicates that the corresponding projected representation of the respective candidate compound has the first biological property, the respective candidate compound is deemed to have the first biological property) (Para. [0050]). Oono et al. does not teach wherein the first projected representation has N dimensions, and wherein N is an integer between 20 and 80. Regarding claim 29, Torng et al. teaches that small molecules can be naturally represented as 2D molecular graphs, with nodes representing individual atoms and edges representing bonds. SMILES string encoding of each molecule is converted into a 2D molecular graph using RDKit. Each atom node is associated with simple atom descriptors of size 62, including one-hot encoding of the atom’s element, the degree of the atom, the number of attached hydrogen atoms, the implicit valence, and an aromaticity indicator (i.e., wherein the first/corresponding projected representation has N dimensions, and wherein N is an integer between 20 and 80) (Pg. 4135, Col. 1, Para. 2). Therefore, regarding claim 29, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the model used to generate chemical compounds that have desired characteristics of Oono et al. with the dimensional representation of Torng et al. because the representation of Torng et al. incorporates the atom descriptors including aromaticity, bond type, conjugation, and connectivity (Torng et al., Pg. 4135, Col. 1, Para. 
2) highlighted by Oono et al. as important molecular descriptors for compound characterization (Oono et al., Para. [0075]). One of ordinary skill in the art would be able to combine the teachings of Oono et al. with Torng et al. with reasonable expectation of success due to the same nature of the problem to be solved, since both incorporate a method of encoding chemical structures. Therefore, regarding claim 29, the instant invention is prima facie obvious (MPEP § 2142).

Conclusion

No claims allowed.

Inquiries

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DIANA P SANFORD whose telephone number is (571)272-6504. The examiner can normally be reached Mon-Fri 8am-5pm EST. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Karlheinz Skowronek, can be reached at (571)272-9047. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/D.P.S./
Examiner, Art Unit 1687

/Lori A. Clow/
Primary Examiner, Art Unit 1687

Prosecution Timeline

Jul 13, 2022: Application Filed
Jan 30, 2026: Non-Final Rejection — §101, §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12603153: DE NOVO GENERATION OF HIGH DIVERSITY PROTEINS IN SILICO WITH SELECTIVE AFFINITY AND CROSS-REACTIVITY MINIMIZATION
Granted Apr 14, 2026 (2y 5m to grant)

Patent 12592826: GEOSPATIAL-TEMPORAL PATHOGEN TRACING
Granted Mar 31, 2026 (2y 5m to grant)

Patent 12565673: METHODS FOR THE DESIGN OF NONALLOSTERIC SIRTUIN ACTIVATING COMPOUNDS
Granted Mar 03, 2026 (2y 5m to grant)

Patent 12547889: METHOD AND APPARATUS FOR SYNTHESIZING TARGET PRODUCTS BY USING NEURAL NETWORKS
Granted Feb 10, 2026 (2y 5m to grant)
Based on the examiner's 4 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 83%
With Interview (+25.0%): 99%
Median Time to Grant: 4y 8m
PTA Risk: Low
Based on 6 resolved cases by this examiner. Grant probability derived from career allow rate.
